Octave also supports linear least squares minimization.  That is,
Octave can find the parameter b such that the model
y = x*b
fits data (x,y) as well as possible, assuming zero-mean
Gaussian noise.  If the noise is assumed to be isotropic the problem
can be solved using the ‘\’ or ‘/’ operators, or the ols
function.  In the general case, where the noise is assumed to be anisotropic,
the gls function is needed.
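As a minimal sketch of the isotropic case, the following fits a small, noise-free data set three ways. The data and the name b_true are made up for illustration only.

```octave
## Illustrative data: a straight line with intercept 2 and slope 0.5.
x = [1 1; 1 2; 1 3; 1 4];   # t-by-k design matrix (intercept column, one regressor)
b_true = [2; 0.5];          # hypothetical parameters chosen for this example
y = x * b_true;             # noise-free observations

b_left  = x \ y;            # left division solves x*b = y in the least squares sense
b_right = (y' / x')';       # right division solves b'*x' = y', then transpose back
b_ols   = ols (y, x);       # ols should return the same estimate

disp ([b_left, b_right, b_ols]);
```

Because the data contain no noise, all three columns should agree with b_true up to rounding error.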
[beta, sigma, r] = ols (y, x)
Ordinary least squares (OLS) estimation.
OLS applies to the multivariate model y = x*b + e where y is a t-by-p matrix, x is a t-by-k matrix, b is a k-by-p matrix, and e is a t-by-p matrix.
Each row of y is a p-variate observation in which each column represents a variable. Likewise, the rows of x represent k-variate observations or possibly designed values. Furthermore, the collection of observations x must be of adequate rank, k, otherwise b cannot be uniquely estimated.
The observation errors, e, are assumed to originate from an
underlying p-variate distribution with zero mean and
p-by-p covariance matrix s, both constant conditioned
on x.  Furthermore, the matrix s is constant with respect to
each observation such that
mean (e) = 0 and
cov (vec (e)) = kron (s, I).
(For cases
that don’t meet these criteria, such as autocorrelated errors, see
generalized least squares, gls, for more efficient estimations.)
The return values beta, sigma, and r are defined as follows.
beta: The OLS estimator for matrix b.
beta is calculated directly via
inv (x'*x) * x' * y if the matrix
x'*x is of full rank.
Otherwise, beta = pinv (x) * y where
pinv (x) denotes the pseudoinverse of x.
sigma: The OLS estimator for the matrix s,
sigma = (y-x*beta)' * (y-x*beta) / (t-rank(x))
r: The matrix of OLS residuals, r = y - x*beta.
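The definitions above can be checked numerically. The sketch below uses made-up data with t = 5, k = 2, and p = 2, and compares the outputs of ols with the formulas evaluated by hand. The names ending in _direct are introduced here for illustration.

```octave
## Illustrative data: 5 observations, 2 regressors, 2 response variables.
x = [ones(5, 1), (1:5)'];
y = [1.1 2.0; 1.9 2.9; 3.2 4.1; 3.9 5.2; 5.1 5.8];

[beta, sigma, r] = ols (y, x);

## Evaluate the documented formulas directly (x'*x has full rank here).
t = rows (x);
beta_direct  = inv (x'*x) * x' * y;
sigma_direct = (y - x*beta)' * (y - x*beta) / (t - rank (x));
r_direct     = y - x*beta;

disp (size (beta));    # k-by-p
disp (size (sigma));   # p-by-p
disp (size (r));       # t-by-p
```

Using inv here only mirrors the formula in the text. In practice the backslash operator is usually the better-conditioned choice.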
[beta, v, r] = gls (y, x, o)
Generalized least squares (GLS) model.
Perform a generalized least squares estimation for the multivariate model y = x*b + e where y is a t-by-p matrix, x is a t-by-k matrix, b is a k-by-p matrix, and e is a t-by-p matrix.
Each row of y is a p-variate observation in which each column represents a variable. Likewise, the rows of x represent k-variate observations or possibly designed values. Furthermore, the collection of observations x must be of adequate rank, k, otherwise b cannot be uniquely estimated.
The observation errors, e, are assumed to originate from an
underlying p-variate distribution with zero mean but possibly
heteroscedastic observations.  That is, in general,
mean (e) = 0 and
cov (vec (e)) = (s^2)*o
in which s is a scalar and o is a t*p-by-t*p matrix.
The return values beta, v, and r are defined as follows.
beta: The GLS estimator for matrix b.
v: The GLS estimator for scalar s^2.
r: The matrix of GLS residuals, r = y - x*beta.
See also: ols.
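A minimal sketch of a heteroscedastic fit with p = 1 follows. The data, the variance pattern w, and the name beta_direct are assumptions made for this example. For a diagonal o the GLS estimate should coincide with the weighted least squares solution, which is computed by hand for comparison.

```octave
## Illustrative data: 5 scalar observations with unequal noise variances.
x = [ones(5, 1), (1:5)'];
y = [1.1; 1.9; 3.2; 3.9; 5.1];
w = [1; 1; 4; 4; 9];          # hypothetical relative variances of the errors
o = full (diag (w));          # t*p-by-t*p matrix, here 5-by-5

[beta, v, r] = gls (y, x, o);

## Weighted least squares solution evaluated directly for comparison.
beta_direct = (x' * (o \ x)) \ (x' * (o \ y));

disp ([beta, beta_direct]);
```

Observations with larger entries in o are treated as noisier and so pull less on the fitted coefficients.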
x = lsqnonneg (c, d)
x = lsqnonneg (c, d, x0)
x = lsqnonneg (c, d, x0, options)
[x, resnorm] = lsqnonneg (…)
[x, resnorm, residual] = lsqnonneg (…)
[x, resnorm, residual, exitflag] = lsqnonneg (…)
[x, resnorm, residual, exitflag, output] = lsqnonneg (…)
[x, resnorm, residual, exitflag, output, lambda] = lsqnonneg (…)
Minimize norm (c*x - d) subject to x >= 0.
c and d must be real matrices.
x0 is an optional initial guess for the solution x.
options is an options structure to change the behavior of the
algorithm (see optimset).  lsqnonneg
recognizes these options: "MaxIter", "TolX".
Outputs:
resnorm: The squared 2-norm of the residual: norm (c*x-d)^2
residual: The residual: d-c*x
exitflag: An indicator of convergence. 0 indicates that the iteration count was exceeded, and therefore convergence was not reached; >0 indicates that the algorithm converged. (The algorithm is stable and will converge given enough iterations.)
output: A structure with two fields:
"algorithm": The algorithm used ("nnls")
"iterations": The number of iterations taken.
lambda: Lagrange multipliers.  If these are nonzero, the corresponding x
values should be zero, indicating the solution is pressed up against a
coordinate plane.  The magnitude indicates how much the residual would
improve if the x >= 0 constraints were relaxed in that
direction.
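The following sketch shows a case where the constraint is active. The matrices are made up for illustration: the unconstrained least squares solution of this problem is [2; -1], so the nonnegative solution should pin the second component at zero.

```octave
## Illustrative problem whose unconstrained solution has a negative entry.
c = [1 0; 0 1; 1 1];
d = [2; -1; 1];

x_free = c \ d;                                    # unconstrained solution

opts = optimset ("MaxIter", 100, "TolX", 1e-10);   # the two recognized options
x0 = zeros (2, 1);                                 # explicit initial guess
[x, resnorm, residual, exitflag, output, lambda] = lsqnonneg (c, d, x0, opts);

disp (x_free');
disp (x');
printf ("resnorm = %g, exitflag > 0: %d\n", resnorm, exitflag > 0);
```

Working the problem by hand gives x = [1.5; 0] with a squared residual norm of 1.5, which is what the call should report up to rounding.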
x = lscov (A, b)
x = lscov (A, b, V)
x = lscov (A, b, V, alg)
[x, stdx, mse, S] = lscov (…)
Compute a generalized linear least squares fit.
Estimate x under the model b = A*x + w, where the noise w is assumed to follow a normal distribution with covariance matrix sigma^2 * V.
If the size of the coefficient matrix A is n-by-p, the size of the vector/array of constant terms b must be n-by-k.
The optional input argument V may be an n-element vector of positive weights (inverse variances), or an n-by-n symmetric positive semi-definite matrix representing the covariance of b. If V is not supplied, the ordinary least squares solution is returned.
The alg input argument, which gives guidance on the solution method to use, is currently ignored.
Besides the least-squares estimate matrix x (p-by-k), the function also returns stdx (p-by-k), the error standard deviation of estimated x; mse (k-by-1), the estimated data error covariance scale factors (sigma^2); and S (p-by-p, or p-by-p-by-k if k > 1), the error covariance of x.
Reference: Golub and Van Loan (1996), Matrix Computations (3rd Ed.), Johns Hopkins, Section 5.6.3
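As a minimal sketch, the example below calls lscov without V and with V given as a vector of weights. The data, the weights w, and the name x_direct are assumptions made for illustration. The weighted result is compared with the weighted normal equations solved by hand.

```octave
## Illustrative data: n = 5 observations, p = 2 coefficients, k = 1 right-hand side.
A = [ones(5, 1), (1:5)'];
b = [1.1; 1.9; 3.2; 3.9; 5.1];
w = [1; 1; 0.25; 0.25; 1/9];          # hypothetical weights (inverse variances)

x_ols = lscov (A, b);                 # no V: ordinary least squares
[x, stdx, mse, S] = lscov (A, b, w);  # V as an n-element weight vector

## Weighted normal equations evaluated directly for comparison.
x_direct = (A' * diag (w) * A) \ (A' * diag (w) * b);

disp ([x_ols, x, x_direct]);
disp (size (S));                      # p-by-p because k = 1
```

Passing the weights as a column vector keeps the call independent of how a particular version validates the shape of V.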
optimset ()
options = optimset ()
options = optimset (par, val, …)
options = optimset (old, par, val, …)
options = optimset (old, new)
Create options structure for optimization functions.
When called without any input or output arguments, optimset prints
a list of all valid optimization parameters.
When called with one output and no inputs, return an options structure with
all valid option parameters initialized to [].
When called with a list of parameter/value pairs, return an options structure with only the named parameters initialized.
When the first input is an existing options structure old, the values are updated from either the par/val list or from the options structure new.
Valid parameters are:
Display: Request verbose display of results from optimizations. Values are:
"off" [default]: No display.
"iter": Display intermediate results for every loop iteration.
"final": Display the result of the final loop iteration.
"notify": Display the result of the final loop iteration if the function has failed to converge.
FunValCheck: When enabled, display an error if the objective function returns an invalid
value (a complex number, NaN, or Inf).  Must be set to "on" or
"off" [default].  Note: the functions fzero and
fminbnd correctly handle Inf values and only complex values or NaN
will cause an error in this case.
GradObj: When set to "on", the function to be minimized must return a
second argument which is the gradient, or first derivative, of the
function at the point x.  If set to "off" [default], the
gradient is computed via finite differences.
Jacobian: When set to "on", the function to be minimized must return a
second argument which is the Jacobian, or first derivative, of the
function at the point x.  If set to "off" [default], the
Jacobian is computed via finite differences.
MaxFunEvals: Maximum number of function evaluations before optimization stops. Must be a positive integer.
MaxIter: Maximum number of algorithm iterations before optimization stops. Must be a positive integer.
OutputFcn: A user-defined function executed once per algorithm iteration.
TolFun: Termination criterion for the function output.  If the difference in the
calculated objective function between one algorithm iteration and the next
is less than TolFun the optimization stops.  Must be a positive
scalar.
TolX: Termination criterion for the function input.  If the difference in x,
the current search point, between one algorithm iteration and the next is
less than TolX the optimization stops.  Must be a positive scalar.
See also: optimget.
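The calling forms described above can be sketched as follows. The option values are arbitrary and chosen only for illustration.

```octave
## Create a structure with only the named parameters initialized.
opts = optimset ("TolX", 1e-8, "MaxIter", 200);

## Update an existing structure from a parameter/value list.
opts2 = optimset (opts, "Display", "final");

## Update an existing structure from another options structure.
new = optimset ("TolX", 1e-4);
merged = optimset (opts, new);

printf ("TolX: %g -> %g, MaxIter kept at %d\n", ...
        opts.TolX, merged.TolX, merged.MaxIter);
```

In the merge, only the fields present in new overwrite those in the old structure, so MaxIter should keep its earlier value.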
optval = optimget (options, optname)
optval = optimget (options, optname, default)
Return the specific option optname from the optimization options
structure options created by optimset.
If optname is not defined then return default if supplied, otherwise return an empty matrix.
See also: optimset.
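A short sketch of the three outcomes: a defined option, an undefined option with a default, and an undefined option without one. The values are arbitrary and used only for illustration.

```octave
opts = optimset ("TolX", 1e-6);               # only TolX is defined

tolx    = optimget (opts, "TolX");            # defined: returns the stored value
maxiter = optimget (opts, "MaxIter", 500);    # undefined: falls back to the default
missing = optimget (opts, "MaxIter");         # undefined, no default: empty matrix

printf ("TolX = %g, MaxIter = %d, isempty (missing) = %d\n", ...
        tolx, maxiter, isempty (missing));
```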