Tobit model

Short description: Statistical model for censored regressands

In statistics, a tobit model is any of a class of regression models in which the observed range of the dependent variable is censored in some way.^[1] The term was coined by Arthur Goldberger in reference to James Tobin,^[2]^[a] who developed the model in 1958 to mitigate the problem of zero-inflated data for observations of household expenditure on durable goods.^[3]^[b] Because Tobin's method can be easily extended to handle truncated and other non-randomly selected samples,^[c] some authors adopt a broader definition of the tobit model that includes these cases.^[4]

Tobin's idea was to modify the likelihood function so that it reflects the unequal sampling probability for each observation depending on whether the latent dependent variable fell above or below the determined threshold.^[5] For a sample that, as in Tobin's original case, was censored from below at zero, the sampling probability for each non-limit observation is simply the height of the appropriate density function. For any limit observation, it is the cumulative distribution, i.e. the integral below zero of the appropriate density function. The tobit likelihood function is thus a mixture of densities and cumulative distribution functions.^[6]

The likelihood function

Below are the likelihood and log-likelihood functions for a type I tobit. This is a tobit that is censored from below at $y_{L}$ when the latent variable $y_{j}^{*} \leq y_{L}$. In writing out the likelihood function, we first define an indicator function $I$:

$I(y) = \begin{cases} 0 & \text{if } y \leq y_{L}, \\ 1 & \text{if } y > y_{L}. \end{cases}$

Next, let $Φ$ be the standard normal cumulative distribution function and $φ$ the standard normal probability density function. For a data set with $N$ observations, the likelihood function for a type I tobit is

$ℒ(β, σ) = \prod_{j=1}^{N} \left( \frac{1}{σ} φ\left( \frac{y_{j} - X_{j} β}{σ} \right) \right)^{I(y_{j})} \left( 1 - Φ\left( \frac{X_{j} β - y_{L}}{σ} \right) \right)^{1 - I(y_{j})}$

and the log-likelihood is given by

$\begin{aligned} \log ℒ(β, σ) &= \sum_{j=1}^{N} \left[ I(y_{j}) \log \left( \frac{1}{σ} φ\left( \frac{y_{j} - X_{j} β}{σ} \right) \right) + (1 - I(y_{j})) \log \left( 1 - Φ\left( \frac{X_{j} β - y_{L}}{σ} \right) \right) \right] \\ &= \sum_{y_{j} > y_{L}} \log \left( \frac{1}{σ} φ\left( \frac{y_{j} - X_{j} β}{σ} \right) \right) + \sum_{y_{j} = y_{L}} \log Φ\left( \frac{y_{L} - X_{j} β}{σ} \right) \end{aligned}$
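As an illustration, the second form of the log-likelihood can be evaluated directly with NumPy and SciPy. This is a minimal sketch: the function name `tobit_loglik` and the three-row data set are made up for the example, and the design matrix is assumed to carry its own constant column.

```python
import numpy as np
from scipy.stats import norm


def tobit_loglik(beta, sigma, X, y, y_L=0.0):
    """Type I tobit log-likelihood, censored from below at y_L.

    Hypothetical helper for illustration; X is an (N, k) design matrix.
    """
    xb = X @ beta
    uncensored = y > y_L
    # Non-limit observations: log[(1/sigma) * phi((y - Xb) / sigma)]
    ll_obs = norm.logpdf((y[uncensored] - xb[uncensored]) / sigma) - np.log(sigma)
    # Limit observations: log Phi((y_L - Xb) / sigma)
    ll_cens = norm.logcdf((y_L - xb[~uncensored]) / sigma)
    return ll_obs.sum() + ll_cens.sum()


# Tiny made-up data set: the second observation sits at the limit y_L = 0.
X = np.array([[1.0, 0.5], [1.0, -1.0], [1.0, 2.0]])
y = np.array([1.2, 0.0, 3.1])
beta = np.array([0.5, 1.0])
value = tobit_loglik(beta, 1.5, X, y)
```

Working with `logpdf` and `logcdf` rather than taking logs of `pdf` and `cdf` is a numerical choice: it avoids underflow when an observation lies far in the tail.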

Reparametrization

The log-likelihood as stated above is not globally concave, which complicates the maximum likelihood estimation. Olsen suggested the simple reparametrization $β = δ / γ$ and $σ^{2} = γ^{-2}$, resulting in a transformed log-likelihood,

$\log ℒ(δ, γ) = \sum_{y_{j} > y_{L}} \left\{ \log γ + \log \left[ φ(γ y_{j} - X_{j} δ) \right] \right\} + \sum_{y_{j} = y_{L}} \log \left[ Φ(γ y_{L} - X_{j} δ) \right]$

which is globally concave in terms of the transformed parameters.^[7]

For the truncated (tobit II) model, Orme showed that while the log-likelihood is not globally concave, it is concave at any stationary point under the above transformation.^[8]^[9]
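For the censored (type I) case, the reparametrized log-likelihood can be handed to a general-purpose optimizer and the estimates mapped back through $β = δ / γ$ and $σ = 1 / γ$. The sketch below is illustrative only: the simulated data, the true values (1, 2, 1.5), the starting point and the choice of L-BFGS-B are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def olsen_negloglik(theta, X, y, y_L=0.0):
    """Negative tobit log-likelihood in Olsen's (delta, gamma) parameters."""
    delta, gamma = theta[:-1], theta[-1]
    xd = X @ delta
    unc = y > y_L
    # Non-limit terms: log(gamma) + log phi(gamma * y - X delta)
    ll = np.sum(np.log(gamma) + norm.logpdf(gamma * y[unc] - xd[unc]))
    # Limit terms: log Phi(gamma * y_L - X delta)
    ll += np.sum(norm.logcdf(gamma * y_L - xd[~unc]))
    return -ll


# Simulated data (illustrative values): y* = 1 + 2x + e, e ~ N(0, 1.5^2)
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = np.maximum(1.0 + 2.0 * x + rng.normal(scale=1.5, size=n), 0.0)

theta0 = np.array([0.0, 0.0, 1.0])
bounds = [(None, None), (None, None), (1e-6, None)]  # keep gamma > 0
res = minimize(olsen_negloglik, theta0, args=(X, y),
               method="L-BFGS-B", bounds=bounds)

gamma_hat = res.x[-1]
beta_hat = res.x[:-1] / gamma_hat  # beta = delta / gamma
sigma_hat = 1.0 / gamma_hat        # sigma = 1 / gamma
```

Because the transformed problem is globally concave, the starting point should matter little here; with the original $(β, σ)$ parametrization that is not guaranteed.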

Consistency

If the relationship parameter $β$ is estimated by regressing the observed $y_{i}$ on $x_{i}$, the resulting ordinary least squares regression estimator is inconsistent. It will yield a downwards-biased estimate of the slope coefficient and an upward-biased estimate of the intercept. Takeshi Amemiya (1973) proved that the maximum likelihood estimator suggested by Tobin for this model is consistent.^[10]
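The direction of the least squares bias can be seen in a small simulation. This is a sketch under assumed values (intercept 1, slope 2, standard normal regressor and errors, censoring from below at zero), not a general proof.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000
x = rng.normal(size=n)
y_star = 1.0 + 2.0 * x + rng.normal(size=n)  # latent variable
y = np.maximum(y_star, 0.0)                  # observed, censored at zero

X = np.column_stack([np.ones(n), x])
# OLS on the (normally unobservable) latent variable recovers the parameters.
ols_latent, *_ = np.linalg.lstsq(X, y_star, rcond=None)
# OLS on the censored variable flattens the slope and raises the intercept.
ols_censored, *_ = np.linalg.lstsq(X, y, rcond=None)

print("latent   (intercept, slope):", ols_latent)
print("censored (intercept, slope):", ols_censored)
```

In this design the censored-sample slope is pulled towards zero while the intercept rises above 1, matching the signs of the biases described above.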

Interpretation

The $β$ coefficient should not be interpreted as the effect of $x_{i}$ on $y_{i}$, as one would with a linear regression model; this is a common error. Instead, it should be interpreted as the combination of

(1) the change in $y_{i}$ of those above the limit, weighted by the probability of being above the limit; and
(2) the change in the probability of being above the limit, weighted by the expected value of $y_{i}$ if above.^[11]

$\frac{\partial 𝔼 [Y_{i} ∣ X_{i}]}{\partial x_{i k}} = \frac{\partial 𝔼 [Y_{i} ∣ Y_{i} > 0, X_{i}]}{\partial x_{i k}} \cdot ℙ (Y_{i} > 0 ∣ X_{i}) + \frac{\partial ℙ (Y_{i} > 0 ∣ X_{i})}{\partial x_{i k}} \cdot 𝔼 [Y_{i} ∣ Y_{i} > 0, X_{i}] .$
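For a type I tobit with normal errors censored from below at zero, both terms of this decomposition have closed forms, and their sum reduces to $β_{k} Φ(X_{i} β / σ)$. The sketch below computes the two terms; the function names and the numerical values are hypothetical, chosen only to illustrate the identity.

```python
import numpy as np
from scipy.stats import norm


def expected_y(xb, sigma):
    """E[Y | X] for a tobit censored from below at zero."""
    z = xb / sigma
    return xb * norm.cdf(z) + sigma * norm.pdf(z)


def tobit_marginal_effect(xb, beta_k, sigma):
    """Return the two terms of the decomposition of dE[Y|X]/dx_k."""
    z = xb / sigma
    lam = norm.pdf(z) / norm.cdf(z)             # inverse Mills ratio
    p_pos = norm.cdf(z)                         # P(Y > 0 | X)
    e_pos = xb + sigma * lam                    # E[Y | Y > 0, X]
    d_e_pos = beta_k * (1.0 - lam * (z + lam))  # dE[Y | Y > 0, X]/dx_k
    d_p_pos = beta_k * norm.pdf(z) / sigma      # dP(Y > 0 | X)/dx_k
    return d_e_pos * p_pos, d_p_pos * e_pos


# Illustrative values: X_i beta = 0.5, beta_k = 2, sigma = 1.5
above_limit_term, probability_term = tobit_marginal_effect(0.5, 2.0, 1.5)
total_effect = above_limit_term + probability_term
```

The total is smaller in magnitude than `beta_k` whenever the probability of being above the limit is below one, which is why reading $β$ as the marginal effect on the observed outcome overstates it.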

Variations of the tobit model

Variations of the tobit model can be produced by changing where and when censoring occurs. Amemiya (1985) classifies these variations into five categories (tobit type I – tobit type V), where tobit type I stands for the first model described above. Schnedler (2005) provides a general formula to obtain consistent likelihood estimators for these and other variations of the tobit model.^[12]

Type I

The tobit model is a special case of a censored regression model, because the latent variable $y_{i}^{*}$ cannot always be observed while the independent variable $x_{i}$ is observable. A common variation of the tobit model is censoring at a value $y_{L}$ different from zero:

$y_{i} = \begin{cases} y_{i}^{*} & \text{if } y_{i}^{*} > y_{L}, \\ y_{L} & \text{if } y_{i}^{*} \leq y_{L}. \end{cases}$

Another example is censoring of values above $y_{U}$.

$y_{i} = \begin{cases} y_{i}^{*} & \text{if } y_{i}^{*} < y_{U}, \\ y_{U} & \text{if } y_{i}^{*} \geq y_{U}. \end{cases}$

Yet another model results when $y_{i}$ is censored from above and below at the same time.

$y_{i} = \begin{cases} y_{i}^{*} & \text{if } y_{L} < y_{i}^{*} < y_{U}, \\ y_{L} & \text{if } y_{i}^{*} \leq y_{L}, \\ y_{U} & \text{if } y_{i}^{*} \geq y_{U}. \end{cases}$
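The three censoring rules can be written as one small function that maps latent values to observed ones. This is a minimal sketch; the name `censor` and its keyword arguments are hypothetical.

```python
import numpy as np


def censor(y_star, y_L=None, y_U=None):
    """Map latent values to observed values under type I censoring.

    Pass only y_L for censoring from below, only y_U for censoring
    from above, or both for two-sided censoring.
    """
    y = np.array(y_star, dtype=float)
    if y_L is not None:
        y = np.maximum(y, y_L)  # values at or below y_L are recorded as y_L
    if y_U is not None:
        y = np.minimum(y, y_U)  # values at or above y_U are recorded as y_U
    return y


latent = [-1.0, 0.5, 3.0]
below_only = censor(latent, y_L=0.0)
above_only = censor(latent, y_U=2.0)
two_sided = censor(latent, y_L=0.0, y_U=2.0)
```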

The rest of the models will be presented as being bounded from below at 0, though this can be generalized as done for Type I.

Type II

Type II tobit models introduce a second latent variable.^[13]

$y_{2i} = \begin{cases} y_{2i}^{*} & \text{if } y_{1i}^{*} > 0, \\ 0 & \text{if } y_{1i}^{*} \leq 0. \end{cases}$

In Type I tobit, the latent variable absorbs both the process of participation and the outcome of interest. Type II tobit allows the process of participation (selection) and the outcome of interest to be independent, conditional on observable data.

The Heckman selection model falls into the Type II tobit,^[14] which is sometimes called Heckit after James Heckman.^[15]

Type III

Type III introduces a second observed dependent variable.

$y_{1i} = \begin{cases} y_{1i}^{*} & \text{if } y_{1i}^{*} > 0, \\ 0 & \text{if } y_{1i}^{*} \leq 0. \end{cases}$

$y_{2i} = \begin{cases} y_{2i}^{*} & \text{if } y_{1i}^{*} > 0, \\ 0 & \text{if } y_{1i}^{*} \leq 0. \end{cases}$

The Heckman model falls into this type.

Type IV

Type IV introduces a third observed dependent variable and a third latent variable.

$y_{1i} = \begin{cases} y_{1i}^{*} & \text{if } y_{1i}^{*} > 0, \\ 0 & \text{if } y_{1i}^{*} \leq 0. \end{cases}$

$y_{2i} = \begin{cases} y_{2i}^{*} & \text{if } y_{1i}^{*} > 0, \\ 0 & \text{if } y_{1i}^{*} \leq 0. \end{cases}$

$y_{3i} = \begin{cases} y_{3i}^{*} & \text{if } y_{1i}^{*} \leq 0, \\ 0 & \text{if } y_{1i}^{*} > 0. \end{cases}$

Type V

Similar to Type II, in Type V only the sign of $y_{1 i}^{*}$ is observed.

$y_{2i} = \begin{cases} y_{2i}^{*} & \text{if } y_{1i}^{*} > 0, \\ 0 & \text{if } y_{1i}^{*} \leq 0. \end{cases}$

$y_{3i} = \begin{cases} y_{3i}^{*} & \text{if } y_{1i}^{*} \leq 0, \\ 0 & \text{if } y_{1i}^{*} > 0. \end{cases}$
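The observation rules for types II to V differ only in which variables are recorded, which a short sketch can make concrete. The function `observe`, its string labels and the dictionary keys are hypothetical. Two assumptions are made: the $y_{3i}$ rule is taken to be the same for types IV and V ($y_{3i}^{*}$ when $y_{1i}^{*} \leq 0$, zero otherwise), and for types II and V only the sign of $y_{1i}^{*}$ is returned.

```python
import numpy as np


def observe(tobit_type, y1_star, y2_star, y3_star=None):
    """Return the observed variables for tobit types "II" to "V"."""
    y1_star = np.asarray(y1_star, dtype=float)
    y2_star = np.asarray(y2_star, dtype=float)
    positive = y1_star > 0
    out = {}
    if tobit_type in ("II", "V"):
        out["y1_positive"] = positive                # only the sign is observed
    else:                                            # types III and IV
        out["y1"] = np.where(positive, y1_star, 0.0)
    out["y2"] = np.where(positive, y2_star, 0.0)
    if tobit_type in ("IV", "V"):
        y3_star = np.asarray(y3_star, dtype=float)
        out["y3"] = np.where(positive, 0.0, y3_star)  # seen only when y1* <= 0
    return out


# Two illustrative observations: the first selected, the second not.
y1_star, y2_star, y3_star = [1.0, -2.0], [5.0, 6.0], [7.0, 8.0]
type_two = observe("II", y1_star, y2_star)
type_four = observe("IV", y1_star, y2_star, y3_star)
type_five = observe("V", y1_star, y2_star, y3_star)
```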

Non-parametric version

If the underlying latent variable $y_{i}^{*}$ is not normally distributed, one must use quantiles instead of moments to analyze the observable variable $y_{i}$. Powell's CLAD estimator offers a possible way to achieve this.^[16]
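Powell's censored least absolute deviations (CLAD) estimator minimizes the sum of absolute deviations between $y_{i}$ and $\max(0, x_{i} β)$. The sketch below applies it to simulated data with heavy-tailed errors. The data-generating values, the OLS starting point and the use of the Nelder-Mead method are assumptions made for illustration; the objective is non-smooth and not convex, so dedicated algorithms are usually preferable in practice.

```python
import numpy as np
from scipy.optimize import minimize


def clad_objective(beta, X, y):
    """Sum of absolute deviations between y and max(0, X beta)."""
    return np.sum(np.abs(y - np.maximum(0.0, X @ beta)))


# Simulated data (illustrative): y* = 1 + 2x + e with Student-t(3) errors,
# which have median zero but are not normal.
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
errors = rng.standard_t(df=3, size=n)
y = np.maximum(1.0 + 2.0 * x + errors, 0.0)

# OLS on the censored data gives a rough starting point.
beta_start, *_ = np.linalg.lstsq(X, y, rcond=None)
res = minimize(clad_objective, beta_start, args=(X, y), method="Nelder-Mead")
beta_clad = res.x
```

The estimator relies only on the conditional median of the errors being zero, which is why it remains usable when the normality assumption behind the tobit likelihood fails.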

Applications

Tobit models have, for example, been applied to estimate factors that affect grant receipt, including financial transfers distributed to sub-national governments that may apply for these grants. In these cases, grant recipients cannot receive negative amounts, and the data are thus left-censored. For instance, Dahlberg and Johansson (2002) analyse a sample of 115 municipalities (42 of which received a grant).^[17] Dubois and Fattore (2011) use a tobit model to investigate the role of various factors in European Union fund receipt by Polish sub-national governments that applied for funding.^[18] The data may, however, be left-censored at a point higher than zero, with the risk of mis-specification. Both studies apply probit and other models to check for robustness. Tobit models have also been applied in demand analysis to accommodate observations with zero expenditures on some goods. In a related application, a system of nonlinear tobit regression models has been used to jointly estimate a brand demand system with homoscedastic, heteroscedastic and generalized heteroscedastic variants.^[19]

Notes

[a] When asked why it was called the "tobit" model, instead of Tobin, James Tobin explained that this term was introduced by Arthur Goldberger, either as a portmanteau of "Tobin's probit", or as a reference to The Caine Mutiny, a novel by Tobin's friend Herman Wouk, in which Tobin makes a cameo as "Mr Tobit". Tobin reports having asked Goldberger which it was, and Goldberger refused to say. See Shiller, Robert J. (1999). "The ET Interview: Professor James Tobin". Econometric Theory 15 (6): 867–900. doi:10.1017/S0266466699156056.
[b] An almost identical model was independently suggested by Anders Hald in 1949; see Hald, A. (1949). "Maximum Likelihood Estimation of the Parameters of a Normal Distribution which is Truncated at a Known Point". Scandinavian Actuarial Journal 49 (4): 119–134. doi:10.1080/03461238.1949.10419767.
[c] A sample $(y_{i}, 𝐱_{i})$ is censored in $y_{i}$ when $𝐱_{i}$ is observed for all observations $i = 1, 2, \dots, n$, but the true value of $y_{i}$ is known only for a restricted range of observations. If the sample is truncated, both $𝐱_{i}$ and $y_{i}$ are only observed if $y_{i}$ falls in the restricted range. See Breen, Richard (1996). Regression Models: Censored, Sample Selected, or Truncated Data. Thousand Oaks: Sage. pp. 2–4. ISBN 0-8039-5710-6. https://books.google.com/books?id=btrvKnZSqIIC&pg=PA4.

References

[1] Hayashi, Fumio (2000). Econometrics. Princeton: Princeton University Press. pp. 518–521. ISBN 0-691-01018-8. https://archive.org/details/econometrics00haya_012.
[2] Goldberger, Arthur S. (1964). Econometric Theory. New York: J. Wiley. pp. 253–55. ISBN 9780471311010. https://archive.org/details/econometrictheor0000gold.
[3] Tobin, James (1958). "Estimation of Relationships for Limited Dependent Variables". Econometrica 26 (1): 24–36. doi:10.2307/1907382. http://cowles.yale.edu/sites/default/files/files/pub/d00/d0003-r.pdf.
[4] Amemiya, Takeshi (1984). "Tobit Models: A Survey". Journal of Econometrics 24 (1–2): 3–61. doi:10.1016/0304-4076(84)90074-5.
[5] Kennedy, Peter (2003). A Guide to Econometrics (Fifth ed.). Cambridge: MIT Press. pp. 283–284. ISBN 0-262-61183-X.
[6] Bierens, Herman J. (2004). Introduction to the Mathematical and Statistical Foundations of Econometrics. Cambridge University Press. p. 207. https://archive.org/details/introductiontoma00bier_187.
[7] Olsen, Randall J. (1978). "Note on the Uniqueness of the Maximum Likelihood Estimator for the Tobit Model". Econometrica 46 (5): 1211–1215. doi:10.2307/1911445.
[8] Orme, Chris (1989). "On the Uniqueness of the Maximum Likelihood Estimator in Truncated Regression Models". Econometric Reviews 8 (2): 217–222. doi:10.1080/07474938908800171.
[9] Iwata, Shigeru (1993). "A Note on Multiple Roots of the Tobit Log Likelihood". Journal of Econometrics 56 (3): 441–445. doi:10.1016/0304-4076(93)90129-S.
[10] Amemiya, Takeshi (1973). "Regression analysis when the dependent variable is truncated normal". Econometrica 41 (6): 997–1016. doi:10.2307/1914031.
[11] McDonald, John F.; Moffitt, Robert A. (1980). "The Uses of Tobit Analysis". The Review of Economics and Statistics 62 (2): 318–321. doi:10.2307/1924766.
[12] Schnedler, Wendelin (2005). "Likelihood estimation for censored random vectors". Econometric Reviews 24 (2): 195–217. doi:10.1081/ETC-200067925. http://www.uni-heidelberg.de/md/awi/forschung/dp417.pdf.
[13] Amemiya, Takeshi (1985). "Tobit Models". Advanced Econometrics. Cambridge, Mass: Harvard University Press. p. 384. ISBN 0-674-00560-0. OCLC 11728277. https://archive.org/details/advancedeconomet00amem.
[14] Heckman, James J. (1979). "Sample Selection Bias as a Specification Error". Econometrica 47 (1): 153–161. doi:10.2307/1912352. ISSN 0012-9682.
[15] Sigelman, Lee; Zeng, Langche (1999). "Analyzing Censored and Sample-Selected Data with Tobit and Heckit Models". Political Analysis 8 (2): 167–182. doi:10.1093/oxfordjournals.pan.a029811. ISSN 1047-1987.
[16] Powell, James L. (1 July 1984). "Least absolute deviations estimation for the censored regression model". Journal of Econometrics 25 (3): 303–325. doi:10.1016/0304-4076(84)90004-6.
[17] Dahlberg, Matz; Johansson, Eva (2002-03-01). "On the Vote-Purchasing Behavior of Incumbent Governments". American Political Science Review 96 (1): 27–40. doi:10.1017/S0003055402004215. ISSN 1537-5943.
[18] Dubois, Hans F. W.; Fattore, Giovanni (2011-07-01). "Public Fund Assignment through Project Evaluation". Regional & Federal Studies 21 (3): 355–374. doi:10.1080/13597566.2011.578827. ISSN 1359-7566.
[19] Baltas, George (2001). "Utility-consistent Brand Demand Systems with Endogenous Category Consumption: Principles and Marketing Applications". Decision Sciences 32 (3): 399–422. doi:10.1111/j.1540-5915.2001.tb00965.x. ISSN 0011-7315.
