Least squares
Introduction
The method of least squares is a standard approach in regression analysis for estimating the parameters of a model by minimizing the sum of the squared differences between observed values and the values the model predicts. It is a fundamental technique in statistics and is widely used in data analysis, econometrics, engineering, and machine learning.
Historical Background
The least squares method was developed independently by Carl Friedrich Gauss and Adrien-Marie Legendre. Legendre was the first to publish it, in 1805, while Gauss claimed to have used it since 1795. The method gained widespread recognition after Gauss used it in 1801 to predict the position of the asteroid Ceres after it had been lost from view.
Mathematical Formulation
The least squares method can be formulated as follows: Given a set of observations \((x_i, y_i)\) for \(i = 1, 2, \ldots, n\), where \(x_i\) are the independent variables and \(y_i\) are the dependent variables, the goal is to find the parameters \(\beta\) of the model \(y = f(x, \beta)\) that minimize the sum of the squared residuals:
\[ S(\beta) = \sum_{i=1}^{n} [y_i - f(x_i, \beta)]^2 \]
In the simplest case of linear regression, the model is \(y = \beta_0 + \beta_1 x\), and the objective is to find \(\beta_0\) and \(\beta_1\) that minimize:
\[ S(\beta_0, \beta_1) = \sum_{i=1}^{n} [y_i - (\beta_0 + \beta_1 x_i)]^2 \]
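Setting the partial derivatives of \(S\) with respect to \(\beta_0\) and \(\beta_1\) to zero and solving yields the closed-form estimates
\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]
where \(\bar{x}\) and \(\bar{y}\) denote the sample means of the \(x_i\) and \(y_i\).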
Linear Least Squares
Ordinary Least Squares (OLS)
Ordinary Least Squares (OLS) is the most common form of linear least squares. It applies whenever the model is linear in the parameters, as in \(y = \beta_0 + \beta_1 x\). The OLS estimator is obtained by solving the normal equations:
\[ X^T X \beta = X^T y \]
where \(X\) is the design matrix, \(y\) is the vector of observed values, and \(\beta\) is the vector of parameters. When \(X\) has full column rank, \(X^T X\) is invertible and the unique solution is \(\hat{\beta} = (X^T X)^{-1} X^T y\).
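As a minimal sketch in Python (with NumPy and made-up data), the normal equations can be solved directly; in practice, `numpy.linalg.lstsq`, which is SVD-based, is usually preferred for numerical stability:

```python
import numpy as np

# Made-up data: y is roughly 2 + 3x plus Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=x.size)

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones_like(x), x])

# Solve the normal equations X^T X beta = X^T y.
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Equivalent, more stable route that avoids forming X^T X.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_normal)  # both are approximately [2.0, 3.0]
print(beta_lstsq)
```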
Properties of OLS Estimators
Under the classical Gauss-Markov assumptions (errors with zero mean, constant variance, and no correlation), the OLS estimators have several important properties:
- **Unbiasedness**: The expected value of the estimators equals the true parameter values (illustrated by the simulation sketch after this list).
- **Efficiency**: Among all linear unbiased estimators, the OLS estimators have the smallest variance; this is the Gauss-Markov theorem, which makes OLS the best linear unbiased estimator (BLUE).
- **Consistency**: As the sample size increases, the OLS estimators converge in probability to the true parameter values.
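To make unbiasedness concrete, here is a small simulation sketch under an assumed true model \(y = 1 + 2x + \varepsilon\) with Gaussian noise; the average of the OLS estimates over many synthetic samples should be close to the true parameters:

```python
import numpy as np

# Assumed setup for illustration: true model y = 1 + 2x + noise.
rng = np.random.default_rng(1)
true_beta = np.array([1.0, 2.0])  # [intercept, slope]
x = np.linspace(0, 5, 30)
X = np.column_stack([np.ones_like(x), x])

estimates = []
for _ in range(2000):
    y = X @ true_beta + rng.normal(scale=0.5, size=x.size)
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    estimates.append(beta_hat)

# Averaging the estimates recovers roughly [1.0, 2.0].
print(np.mean(estimates, axis=0))
```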
Nonlinear Least Squares
Nonlinear least squares is used when the model \(y = f(x, \beta)\) is nonlinear in the parameters \(\beta\) (a model that is nonlinear in \(x\) but linear in \(\beta\), such as a polynomial, can still be fit by linear least squares). The objective is again to minimize the sum of the squared residuals, but the minimizer generally has no closed form, so iterative optimization techniques such as the Gauss-Newton algorithm, the Levenberg-Marquardt algorithm, or gradient descent are required.
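As a sketch of the iterative approach, SciPy's `scipy.optimize.least_squares` can minimize the squared residuals of an assumed exponential model \(y = a e^{bx}\) fitted to synthetic data (passing `method="lm"` selects a Levenberg-Marquardt implementation):

```python
import numpy as np
from scipy.optimize import least_squares

# Synthetic data from an assumed model y = 1.5 * exp(0.8 x) + noise.
rng = np.random.default_rng(2)
x = np.linspace(0, 2, 40)
y = 1.5 * np.exp(0.8 * x) + rng.normal(scale=0.05, size=x.size)

# Residual function r_i(beta) = y_i - f(x_i, beta).
def residuals(beta):
    a, b = beta
    return y - a * np.exp(b * x)

# Iteratively minimize the sum of squared residuals from an initial guess.
result = least_squares(residuals, x0=[1.0, 1.0], method="lm")
print(result.x)  # approximately [1.5, 0.8]
```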
Weighted Least Squares
Weighted Least Squares (WLS) is a generalization of ordinary least squares that accommodates heteroscedasticity, where the variance of the errors is not constant across observations. Each observation is assigned a weight \(w_i\), commonly chosen as \(w_i = 1/\sigma_i^2\), the inverse of the error variance, and the objective is to minimize the weighted sum of squared residuals:
\[ S(\beta) = \sum_{i=1}^{n} w_i [y_i - f(x_i, \beta)]^2 \]
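A standard way to compute a WLS fit is to scale each row of the design matrix and response by \(\sqrt{w_i}\), which reduces the problem to an ordinary least squares solve. A minimal sketch with assumed inverse-variance weights:

```python
import numpy as np

# Synthetic heteroscedastic data: the noise level grows with x.
rng = np.random.default_rng(3)
x = np.linspace(0, 10, 50)
sigma = 0.5 + 0.3 * x
y = 2.0 + 3.0 * x + rng.normal(scale=sigma)

X = np.column_stack([np.ones_like(x), x])
w = 1.0 / sigma**2  # assumed weights: inverse of the error variance

# Scaling rows by sqrt(w_i) turns the WLS problem into OLS.
sw = np.sqrt(w)
beta_wls, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
print(beta_wls)  # approximately [2.0, 3.0]
```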
Applications
The least squares method is widely used in various fields:
- **Econometrics**: For estimating economic models and forecasting.
- **Engineering**: In control systems, signal processing, and curve fitting.
- **Machine Learning**: In training models such as linear regression, and as the squared-error loss minimized by many learning algorithms.
- **Physics**: For data fitting and parameter estimation in experimental data.
Computational Aspects
The computational efficiency and numerical stability of least squares methods are crucial, especially for large datasets. Cholesky decomposition can be used to solve the normal equations directly, while QR decomposition and singular value decomposition (SVD) solve the least squares problem without forming \(X^T X\), which avoids squaring the condition number and is more numerically stable.
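A sketch of the QR route: factor \(X = QR\) with \(R\) upper triangular, then solve the triangular system \(R\beta = Q^T y\), which sidesteps \(X^T X\) entirely (data as in the earlier OLS sketch):

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(4)
x = np.linspace(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=x.size)
X = np.column_stack([np.ones_like(x), x])

# Reduced QR factorization: X = Q R, Q has orthonormal columns.
Q, R = np.linalg.qr(X)

# Back-substitution on the upper-triangular system R beta = Q^T y.
beta = solve_triangular(R, Q.T @ y)
print(beta)  # approximately [2.0, 3.0]
```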
Robustness and Alternatives
While the least squares method is powerful, it is sensitive to outliers because squaring the residuals amplifies the influence of large errors. Robust regression techniques, such as least absolute deviations (LAD) and RANSAC, provide alternatives that are less affected by outliers.
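To illustrate the contrast, a sketch using scikit-learn's `RANSACRegressor` on synthetic data contaminated with a few gross outliers, compared against a plain least squares fit:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, RANSACRegressor

# Synthetic line y = 2 + 3x with a handful of injected outliers.
rng = np.random.default_rng(5)
x = np.linspace(0, 10, 60)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=x.size)
y[:5] += 40.0  # gross outliers

X = x.reshape(-1, 1)
ols = LinearRegression().fit(X, y)
ransac = RANSACRegressor().fit(X, y)

# The OLS fit is pulled toward the outliers; RANSAC largely ignores them.
print(ols.intercept_, ols.coef_)
print(ransac.estimator_.intercept_, ransac.estimator_.coef_)
```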
Conclusion
The least squares method is a cornerstone of statistical analysis and modeling. Its versatility and effectiveness make it an essential tool in various scientific and engineering disciplines. Understanding its mathematical foundations, properties, and applications is crucial for anyone involved in data analysis and modeling.