The OLS Formula
If you add a trend line in Excel, or fit one in Python, SAS, or R, the program is essentially plotting OLS-derived coordinates. In this section, I explain in detail what the computer does when it looks for the line that best fits the data points.
The basic idea is very simple. The computer decomposes the actual value of an observation ($Y$) into a fitted value ($\hat{Y}$) and a residual ($e$), so that $Y = \hat{Y} + e$. The best-fitting line is the one that minimizes the overall distance between the actual and the predicted values.
Because negative and positive residuals cancel each other out, we cannot simply add the residuals up. One way to resolve this problem and find the "best fit" at the same time is to minimize the sum of squared vertical distances between the data points and the line. Hence the name of the method: ordinary least squares (OLS).
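To see why the plain sum of residuals is a poor criterion, here is a small Python sketch. The data are made up and the helper name `residuals` is only illustrative. Any line forced through the point of means has residuals that sum to zero, however bad its slope, whereas the sum of squared residuals tells the two lines apart.

```python
import numpy as np

# Made-up data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.0])

def residuals(alpha, beta):
    """Vertical distances between the actual y and the line alpha + beta * x."""
    return y - (alpha + beta * x)

x_bar, y_bar = x.mean(), y.mean()

# Two candidate lines, both forced through the point of means (x_bar, y_bar)
for beta in (2.0, -5.0):
    alpha = y_bar - beta * x_bar
    e = residuals(alpha, beta)
    print(f"slope={beta:5.1f}  sum(e)={e.sum():.6f}  sum(e^2)={(e ** 2).sum():.3f}")
```

Both lines report a residual sum of (numerically) zero, but only the first has a small sum of squares, which is why the squared criterion is the one being minimized.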
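Since we write $y_i = \hat{\alpha} + \hat{\beta} x_i + e_i$ and want to find the $\hat{\alpha}$ and $\hat{\beta}$ that minimize $\sum^N_{i=1} e^2_i$, we have:
$$\min_{\hat{\alpha}, \hat{\beta}} \sum^N_{i=1} e^2_i = \sum^N_{i=1} (y_i - \hat{\alpha} - \hat{\beta} x_i)^2$$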
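The first order conditions (FOC) for this minimization are:
$$\frac{\partial}{\partial \hat{\alpha}} \sum^N_{i=1} e^2_i = -2 \sum^N_{i=1} (y_i - \hat{\alpha} - \hat{\beta} x_i) = 0 \quad \Rightarrow \quad \bar{Y} - \hat{\alpha} - \hat{\beta} \bar{X} = 0$$
$$\frac{\partial}{\partial \hat{\beta}} \sum^N_{i=1} e^2_i = -2 \sum^N_{i=1} x_i (y_i - \hat{\alpha} - \hat{\beta} x_i) = 0$$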
where $\bar{X} = \frac{\sum^N_{i=1} x_i}{N}$ is the sample average of $x$ (the sample counterpart of $E(X)$), and $\bar{Y}$ is defined analogously.
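As a numerical sanity check on the first order conditions, the sketch below fits a line with `np.polyfit` (used here only as a stand-in for the minimization, on made-up data) and evaluates the two partial derivatives of the sum of squared residuals at the fitted coefficients. Both should be zero up to floating-point error.

```python
import numpy as np

# Made-up data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.0])

# Least-squares line; np.polyfit returns the highest-degree coefficient first
beta_hat, alpha_hat = np.polyfit(x, y, deg=1)
e = y - (alpha_hat + beta_hat * x)

# Partial derivatives of sum(e_i^2) with respect to alpha_hat and beta_hat
d_alpha = -2.0 * e.sum()
d_beta = -2.0 * (x * e).sum()

print(f"dS/d(alpha) = {d_alpha:.2e}")
print(f"dS/d(beta)  = {d_beta:.2e}")
```

In words, the two conditions say that the residuals sum to zero and that they are uncorrelated with $x$ in the sample.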
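Solving this system of equations, we obtain:
$$\hat{\beta} = \frac{\sum^N_{i=1} (x_i - \bar{X})(y_i - \bar{Y})}{\sum^N_{i=1} (x_i - \bar{X})^2}, \qquad \hat{\alpha} = \bar{Y} - \hat{\beta} \bar{X}$$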
Note: This simplified formula applies only to simple regression with a single regressor; it does not carry over to multivariate regression.
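For a quick check by hand, here is a minimal Python sketch of the standard closed-form solution for the one-regressor case: the slope is the sum of cross-deviations of $x$ and $y$ divided by the sum of squared deviations of $x$, and the intercept makes the line pass through the point of means. The function name `ols_simple` and the data are assumptions made for illustration; `np.polyfit` is used only as an independent cross-check.

```python
import numpy as np

def ols_simple(x, y):
    """Return (alpha_hat, beta_hat) for y = alpha + beta * x via the closed form."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    # Slope: sum of cross-deviations over sum of squared deviations of x
    beta_hat = ((x - x_bar) * (y - y_bar)).sum() / ((x - x_bar) ** 2).sum()
    # Intercept: forces the line through the point of means
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat

# Made-up data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

alpha_hat, beta_hat = ols_simple(x, y)
slope_ref, intercept_ref = np.polyfit(x, y, deg=1)  # highest degree first

print("closed form:", alpha_hat, beta_hat)
print("np.polyfit: ", intercept_ref, slope_ref)
```

The two printed pairs should agree up to floating-point error, which is what a trend-line feature is computing behind the scenes.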