
 

Data Generating Process

 

Suppose that a variable $Y$ is a linear function of another variable $X$, with unknown parameters $\beta_1$ and $\beta_2$ that we wish to estimate, as displayed in Fig. 1.

 

 

 

Figure 1

Suppose that we have four observations with $X$ values as shown. If the relationship were exact, the observations would lie on the straight line and we would have no trouble obtaining accurate estimates of $\beta_1$ and $\beta_2$.

 

Figure 2
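
To see why an exact relationship would make estimation trivial, here is a short sketch. The indices $i$ and $j$ are notation introduced here for any two of the observations with $X_i \neq X_j$; they do not appear in the figures. If every point satisfied $Y = \beta_1+\beta_{2}X$ exactly, two points would pin down the line:

$$\beta_2 = \frac{Y_j - Y_i}{X_j - X_i}, \qquad \beta_1 = Y_i - \beta_{2}X_i$$

Every pair of observations would give the same two values, so the parameters would be recovered without error.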

 

In practice, however, most economic relationships are not exact and the actual values of $Y$ are different from those corresponding to the straight line. The points $P_1$-$P_4$ displayed in Fig. 3 correspond to such a situation.

 

Figure 3

 

 

Non-Random ($\beta_1+\beta_{2}X$) and random ($u$) parts

To allow for such divergences, we write the model as: $Y = \beta_1+\beta_{2}X + u$, where $u$ is a disturbance term. In other words, we are assuming that the data for $Y$ is generated by the equation $Y = \beta_1+\beta_{2}X + u$. This is our Data Generating Process (DGP).

 

 

Each value of $Y$ thus has a nonrandom component and a random component. 
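
A small simulation can make the two components concrete. The sketch below is only an illustration under stated assumptions: the parameter values, the four $X$ values, and the normal distribution for $u$ are all chosen here for the example and are not given in the post. In practice $\beta_1$ and $\beta_2$ are unknown.

```python
import random

# Hypothetical parameter values, chosen only for illustration.
# In a real problem these are unknown and must be estimated.
BETA_1 = 2.0
BETA_2 = 0.5


def generate_y(x_values, sigma=1.0, seed=0):
    """Generate Y = beta_1 + beta_2 * X + u for each X.

    Returns a list of (nonrandom, u, y) tuples, one per observation.
    The disturbance u is drawn from a normal distribution with mean 0
    and standard deviation sigma; normality is an assumption of this
    sketch, not something the model above requires.
    """
    rng = random.Random(seed)
    rows = []
    for x in x_values:
        nonrandom = BETA_1 + BETA_2 * x   # nonrandom component
        u = rng.gauss(0.0, sigma)         # random component (disturbance)
        rows.append((nonrandom, u, nonrandom + u))
    return rows


# Four observations, mirroring the four points in the figures
# (the X values themselves are made up).
x_values = [1.0, 2.0, 3.0, 4.0]
for x, (nonrandom, u, y) in zip(x_values, generate_y(x_values)):
    print(f"X={x:.1f}  nonrandom={nonrandom:.2f}  u={u:+.2f}  Y={y:.2f}")
```

Setting `sigma=0.0` removes the disturbance, so every generated $Y$ falls exactly on the line, which corresponds to the exact relationship described earlier.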

 

Let's take a look at the first observation, where $X = X_1$. Its $Y$ value, $Y_1$, is decomposed into a nonrandom component, $\beta_1+\beta_{2}X_1$, and a random component, $u_1$ (Fig. 4).

 

Figure 4

 

To be continued in the next post.

