Maximum likelihood estimation with flow data

From HandWiki

Maximum likelihood estimation with flow data is a parametric approach to dealing with flow sampling data.

Description

Assume that for each individual i we observe [math]\displaystyle{ a_i }[/math], the time at which the person enters the state of interest, together with some observables [math]\displaystyle{ x_i }[/math], and that the flow data are right-censored in a particular way: [math]\displaystyle{ t_i = \min(t_i^U,L) }[/math], where [math]\displaystyle{ t_i }[/math] is the observed duration outcome, [math]\displaystyle{ t_i^U }[/math] is the underlying continuous variable and L is the censoring threshold.[1] For instance, when thinking about unemployment spells, [math]\displaystyle{ a_i }[/math] is the date of entering unemployment, [math]\displaystyle{ x_i }[/math] is a vector of worker characteristics, and [math]\displaystyle{ t_i }[/math] is the observed unemployment duration. If workers are followed only for a limited period of time, the observed duration is necessarily a censored version of the true unemployment duration.

Two key assumptions allow the log-likelihood to be set up. First, a distributional form for the latent variable [math]\displaystyle{ t_i^U }[/math] needs to be assumed. Second, the true duration is assumed to be independent of the starting point of the spell and of the censoring threshold, conditional on the observables, i.e.,

[math]\displaystyle{ F(t_i^U \mid x_i, a_i, L) = F(t_i^U \mid x_i) }[/math]

where F is the conditional cumulative distribution function of the underlying duration variable.[2] This second assumption allows us to model the probability that an observation is censored, i.e.,

[math]\displaystyle{ \Pr(t_i^U \ge L \mid x_i) = 1 - F(L \mid x_i) }[/math]

which leads to the following log-likelihood:

[math]\displaystyle{ \sum_{i=1}^n \left[ d_i \log f(t_i \mid x_i) + (1-d_i) \log\left(1-F(L \mid x_i)\right) \right] }[/math]

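where f is the density associated with the distribution F and [math]\displaystyle{ d_i }[/math] is an indicator equal to one if the observation is uncensored ([math]\displaystyle{ t_i < L }[/math]) and zero if it is censored ([math]\displaystyle{ t_i = L }[/math]), so that uncensored observations contribute the density and censored observations contribute the probability of exceeding the threshold.[3] Additionally, the threshold may vary at the observational level, in which case L is replaced by [math]\displaystyle{ L_i }[/math] in the formulas above.[4]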
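As a concrete illustration of this log-likelihood, the following Python sketch assumes an exponential distribution for the latent duration and no covariates, so that f(t) = λ exp(−λt) and 1 − F(L) = exp(−λL). The exponential form, the simulated data and all function names are assumptions made for this example and are not taken from the cited references. In this special case the maximizer has a closed form: the number of uncensored spells divided by the total observed time.

```python
import math
import random


def censored_exponential_loglik(rate, durations, uncensored):
    """Log-likelihood of right-censored exponential durations.

    durations  : observed t_i = min(t_i^U, L)
    uncensored : d_i = 1 if the spell ended before L, 0 if censored at L
    """
    loglik = 0.0
    for t, d in zip(durations, uncensored):
        if d:
            loglik += math.log(rate) - rate * t  # d_i * log f(t_i)
        else:
            loglik += -rate * t                  # (1 - d_i) * log(1 - F(L)), t_i = L
    return loglik


def censored_exponential_mle(durations, uncensored):
    """Closed-form maximizer: completed spells / total observed time."""
    return sum(uncensored) / sum(durations)


def simulate(n, rate, threshold, seed=0):
    """Hypothetical data: latent exponential durations censored at `threshold`."""
    rng = random.Random(seed)
    latent = [rng.expovariate(rate) for _ in range(n)]
    durations = [min(t, threshold) for t in latent]
    uncensored = [1 if t < threshold else 0 for t in latent]
    return durations, uncensored


durations, uncensored = simulate(n=5000, rate=0.5, threshold=3.0)
rate_hat = censored_exponential_mle(durations, uncensored)
naive_rate = len(durations) / sum(durations)  # wrongly treats every spell as completed

print("censored MLE of the rate:", rate_hat)
print("estimate ignoring censoring:", naive_rate)
print("log-likelihood at the MLE:",
      censored_exponential_loglik(rate_hat, durations, uncensored))
```

Treating censored spells as completed overstates the exit rate, which is why the second estimate exceeds the first. With covariates, a common choice is to let the rate depend on the observables, for example λ_i = exp(x_i'β), in which case the log-likelihood is typically maximized numerically.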

Specification tests for duration models include tests of the validity of the imposed functional form. Tests of restrictions on the functional form are similar to tests for unobserved heterogeneity, in which the restriction being tested is the absence of such heterogeneity. Testing for unobserved heterogeneity is often desirable, because neglecting it can bias the estimation of the hazard rate.[5] Similarly, tests for censoring exist that compare the distribution of the generalized error under the censored and the uncensored assumption.[6]
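One simple way to test a restriction on the functional form, sketched here under assumptions that are not taken from the cited references, is a likelihood ratio test of the exponential model against a Weibull model, which nests the exponential when its shape parameter equals one. This is an illustration only and is not the score test discussed in the literature cited above. The parametrization 1 − F(t) = exp(−θ t^α), the golden-section search, the simulated data and all names are choices made for this example.

```python
import math
import random


def weibull_profile_loglik(shape, durations, uncensored):
    """Censored Weibull log-likelihood with theta concentrated out.

    Assumed model: 1 - F(t) = exp(-theta * t**shape); for a fixed shape
    the maximizing theta is (number of events) / sum(t_i**shape).
    """
    n_events = sum(uncensored)
    total = sum(t ** shape for t in durations)
    theta = n_events / total
    sum_log_t = sum(math.log(t) for t, d in zip(durations, uncensored) if d)
    return (n_events * (math.log(theta) + math.log(shape))
            + (shape - 1.0) * sum_log_t
            - theta * total)


def golden_section_max(func, lo, hi, tol=1e-6):
    """Maximize a unimodal function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if func(c) > func(d):
            b = d
        else:
            a = c
    return (a + b) / 2.0


# Hypothetical data: Weibull durations (scale 2, shape 2) censored at L = 3.
rng = random.Random(1)
threshold = 3.0
latent = [rng.weibullvariate(2.0, 2.0) for _ in range(2000)]
durations = [min(t, threshold) for t in latent]
uncensored = [1 if t < threshold else 0 for t in latent]

shape_hat = golden_section_max(
    lambda s: weibull_profile_loglik(s, durations, uncensored), 0.05, 10.0)

# Restricted model (exponential) is the Weibull with shape fixed at one.
loglik_weibull = weibull_profile_loglik(shape_hat, durations, uncensored)
loglik_exponential = weibull_profile_loglik(1.0, durations, uncensored)
lr_stat = 2.0 * (loglik_weibull - loglik_exponential)

CHI2_1_CRIT_5PCT = 3.841  # 5% critical value of the chi-squared(1) distribution
print("estimated shape:", shape_hat)
print("LR statistic:", lr_stat)
print("reject exponential at 5%:", lr_stat > CHI2_1_CRIT_5PCT)
```

Because the simulated durations have a shape parameter of two, the exponential restriction should be rejected in this example; with data that are in fact exponential, the statistic would be expected to fall below the critical value in most samples.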

References

  1. Wooldridge, J. (2002): Econometric Analysis of Cross Section and Panel Data. MIT Press, Cambridge, Massachusetts.
  2. Wooldridge, J. (2002): Econometric Analysis of Cross Section and Panel Data. MIT Press, Cambridge, Massachusetts.
  3. Hayashi, F. (2000): Econometrics. Princeton University Press, New Jersey.
  4. Wooldridge, J. (2002): Econometric Analysis of Cross Section and Panel Data. MIT Press, Cambridge, Massachusetts.
  5. Cameron, A. C. and P. K. Trivedi (2005): Microeconometrics: Methods and Applications. Cambridge University Press, New York.
  6. Jaggia, S. and P. K. Trivedi (1994): Joint and Separate Score Test for Heterogeneity in a Censored Exponential Model. Review of Economics and Statistics, 79, pp. 340–343.