The Fitted Line
Estimation of $\beta$
It is important to note at this point that the values of our estimates depend on the particular sample we drew from the population: if two samples differ noticeably from each other, so will the estimates computed from them. Using another sample of $P$ points would have resulted in different estimates.
To make this point clear, let's consider an example. Suppose that we have two variables, years of schooling ($X$) and hourly wages ($Y$), that are related as:
$$Y=2+0.5X+u.$$
This is our DGP (Data Generating Process). Now we do the following:
- For every value of $X = 1, \dots, 20$, generate one random value of $u$. (Here, $u \sim N(0,1)$, i.e., $u$ is drawn from a normal distribution with mean $\mu = 0$ and variance $\sigma^2 = 1$.)
- Compute the corresponding values of $Y$, thus obtaining a sample of 20 observations (one for each value of $X$ from 1 through 20).
- From this newly generated sample, compute the estimates $\hat{\alpha}$ and $\hat{\beta}$.
Schematically, it can be summarized as:
| Coefficient | Formula |
|---|---|
| Slope | $\hat{\beta}=\frac{cov(X,Y)}{\sigma_x^2}$ |
| Intercept | $\hat{\alpha}=\bar{Y} -\hat{\beta}\bar{X}$ |
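The steps and formulas above can be sketched in Python for a single sample. This is only an illustration, assuming NumPy is available; the seed value and variable names are arbitrary choices, not part of the original example. The sample covariance and sample variance use the same normalization (`ddof=1`), so the ratio matches the least-squares slope.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, only for reproducibility

# DGP from the text: Y = 2 + 0.5 X + u, with u ~ N(0, 1)
X = np.arange(1, 21, dtype=float)                # X = 1, ..., 20
u = rng.normal(loc=0.0, scale=1.0, size=X.size)  # one disturbance per X
Y = 2.0 + 0.5 * X + u

# Slope: sample covariance of (X, Y) divided by sample variance of X
beta_hat = np.cov(X, Y, ddof=1)[0, 1] / np.var(X, ddof=1)
# Intercept: the fitted line passes through the point of means
alpha_hat = Y.mean() - beta_hat * X.mean()

print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
```

Changing the seed gives a different sample and therefore different estimates, which is exactly the point of the experiment.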
Now, repeat this experiment several times. Each time, a new sample of 20 observations is generated by drawing fresh values of $u$ from the normal distribution, and the estimates $\hat\beta$ and $\hat\alpha$ are computed again. The results of each iteration are depicted below.
Dotted line: non-random part of the true model
Solid line: fitted line using a given sample
If we repeat this experiment 100 times and then construct a histogram of the resulting $\hat\beta$ values, we obtain:
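A repeated experiment of this kind can be sketched as follows. It is a minimal illustration, assuming NumPy; the function name `simulate_beta_hats`, the seed, and the number of bins are hypothetical choices, and the histogram is printed as text rather than plotted.

```python
import numpy as np

def simulate_beta_hats(n_reps=100, seed=0):
    """Draw n_reps samples from Y = 2 + 0.5 X + u and return the slope estimates."""
    rng = np.random.default_rng(seed)
    X = np.arange(1, 21, dtype=float)  # X is held fixed across replications
    x_dev = X - X.mean()
    beta_hats = np.empty(n_reps)
    for r in range(n_reps):
        u = rng.normal(0.0, 1.0, size=X.size)  # fresh disturbances each time
        Y = 2.0 + 0.5 * X + u
        # cov(X, Y) / var(X); the normalizing constants cancel in the ratio
        beta_hats[r] = np.sum(x_dev * (Y - Y.mean())) / np.sum(x_dev ** 2)
    return beta_hats

beta_hats = simulate_beta_hats()

# Text histogram of the 100 slope estimates
counts, edges = np.histogram(beta_hats, bins=10)
for c, lo, hi in zip(counts, edges[:-1], edges[1:]):
    print(f"[{lo:.3f}, {hi:.3f}) {'#' * int(c)}")

print(f"mean = {beta_hats.mean():.3f}, std = {beta_hats.std(ddof=1):.3f}")
```

The exact bar heights depend on the seed, but the estimates should cluster around the true slope of 0.5.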
This experiment shows that our regression estimates ($\hat\alpha$ and $\hat\beta$) are random variables that, under certain conditions (in this example, the disturbance term is drawn from a normal distribution with mean zero, independently of $X$), are distributed around the true values of the parameters. In practice, however, we usually observe just one estimate $\hat\beta$, not the true value of $\beta$: however carefully the data are prepared, a sample remains a sample unless the entire population is used in the calculation. The remaining task is to make inferences about plausible values of $\beta$ on the basis of our estimate $\hat\beta$. This is what we do through hypothesis testing.
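To see why the estimates center on the true value in this example, it may help to substitute the DGP into the slope formula. The sketch below treats $X$ as fixed and assumes the $u_i$ are independent with mean zero and variance $\sigma^2$, as in the simulation; the index $i$ runs over the 20 observations.

$$\hat\beta=\frac{\sum_i (X_i-\bar X)(Y_i-\bar Y)}{\sum_i (X_i-\bar X)^2}=\beta+\frac{\sum_i (X_i-\bar X)\,u_i}{\sum_i (X_i-\bar X)^2}$$

Taking expectations, the second term vanishes because $E[u_i]=0$, so $E[\hat\beta]=\beta$. Its variance is

$$Var(\hat\beta)=\frac{\sigma^2}{\sum_i (X_i-\bar X)^2}.$$

For $X = 1, \dots, 20$ we have $\sum_i (X_i-\bar X)^2 = 665$, and with $\sigma^2=1$ the standard deviation of $\hat\beta$ is $1/\sqrt{665}\approx 0.039$. Under these assumptions, the estimates should therefore scatter around $0.5$ with roughly that spread.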
To be Continued.