Smoothed maximum score estimator


In discrete choice modelling, it is typically assumed that the choice is determined by comparing the underlying latent utilities.[1] Denote the population of agents by T and the common choice set of each agent by C . For agent t∈T , denote her choice by yt,i , which equals 1 if choice i∈C is chosen and 0 otherwise. Assume linearity in the parameters and additivity of the error term: for an agent t∈T ,

[math]\displaystyle{ y_{t,i} = 1 \iff x_{t,i}\beta + \varepsilon_{t,i} \gt x_{t,j}\beta + \varepsilon_{t,j}, \quad \forall j \neq i \text{ and } j \in C }[/math]

where xt,i and xt,j are q-dimensional observable covariates about the agent and the choice, and εt,i and εt,j are decision errors arising from cognitive limitations or incomplete information. The construction of the observable covariates is very general. For instance, if C is a set of different brands of coffee, then xt,i includes the characteristics both of the agent t , such as age, gender, income and ethnicity, and of the coffee i , such as price, taste and whether it is local or imported.

Manski (1975) proposed an estimator of the parameters that imposes no parametric assumption on the distribution of the error terms. In this model, denote the number of elements of the choice set by J , the total number of agents by N , and let W ( J - 1) > W (J - 2) > ... > W (1) > W (0) be a sequence of real numbers. The Maximum Score (MS) [2] estimator is defined as:

[math]\displaystyle{  \hat {b}_{MS} =  {\operatorname{arg\max}}_b \frac {1}{N} \sum_{t=1}^N \sum_{i=1}^J y_{t,i} W (\sum\nolimits_{j \in C, j \neq i} 1 (x_{t,i}b \gt  x_{t,j}b))  }[/math]
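To make the objective concrete, here is a minimal numerical sketch in Python. The data-generating process, the identity weight function W(r) = r (which satisfies the required monotonicity), and all variable names are illustrative assumptions, not part of the original formulation:

```python
import numpy as np

def ms_objective(b, X, y):
    """Maximum score objective for a candidate parameter b.

    X : (N, J, q) array of covariates x_{t,i}
    y : (N, J) one-hot array of observed choices y_{t,i}
    Uses W(r) = r, which satisfies W(J-1) > ... > W(0).
    """
    util = X @ b  # (N, J): certainty part x_{t,i} b of each utility
    # rank[t, i] = number of rivals j != i with x_{t,i} b > x_{t,j} b
    # (the strict inequality excludes j = i automatically)
    rank = (util[:, :, None] > util[:, None, :]).sum(axis=2)
    return (y * rank).sum() / len(X)  # average W(rank) of chosen options

# Illustrative data: choices generated from the latent-utility model
rng = np.random.default_rng(0)
N, J, q = 200, 3, 2
beta = np.array([1.0, -0.5])  # "true" parameter, chosen for illustration
X = rng.normal(size=(N, J, q))
y = np.eye(J)[(X @ beta + rng.gumbel(size=(N, J))).argmax(axis=1)]

print(ms_objective(beta, X, y))   # score near the true parameter
print(ms_objective(-beta, X, y))  # score at a wrong parameter (lower)
```

Since the chosen alternative tends to have the highest certainty part of utility, the objective is larger near the true parameter; note that only the direction of b is identified, not its scale.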


Here, [math]\displaystyle{ (\sum\nolimits_{j \in C, j \neq i} 1 (x_{t,i}b \gt x_{t,j}b)) }[/math] is the rank of the certainty part of the underlying utility of choosing i . Under certain conditions, the maximum score estimator is weakly consistent, but its asymptotic properties are complicated: it converges at the cube-root rate and its limiting distribution is non-standard.[3] This issue stems mainly from the non-smoothness of the objective function. Horowitz (1992) proposed a Smoothed Maximum Score (SMS) [4] estimator with much better asymptotic properties. The basic idea of this new estimator is to replace the non-smooth weight function [math]\displaystyle{ W (\sum\nolimits_{j \in C, j \neq i} 1 (x_{t,i}b \gt x_{t,j}b)) }[/math] with a smooth one. Define a smooth kernel function K satisfying the following conditions:

(1) |K(·)| is bounded over R ;

(2) [math]\displaystyle{ \lim_{u\to -\infty} K (u) = 0 }[/math] and [math]\displaystyle{ \lim_{u\to +\infty} K (u) = 1 }[/math] ;

(3) [math]\displaystyle{ \dot {K} (u) = \dot {K} (-u) }[/math]
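The standard normal CDF, for instance, satisfies all three conditions. A quick numerical sanity check in Python (the check itself is an illustration added here, not part of the original article):

```python
import math

def K(u):
    """Standard normal CDF: bounded, K(-inf) = 0, K(+inf) = 1,
    and its derivative (the normal density) is symmetric around 0."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def K_dot(u, eps=1e-6):
    """Numerical derivative of K (central difference)."""
    return (K(u + eps) - K(u - eps)) / (2.0 * eps)

print(K(-8.0), K(0.0), K(8.0))   # approximately 0, 0.5, 1
print(K_dot(1.3), K_dot(-1.3))   # equal: the density is symmetric
```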


Here, the kernel function is analogous to a CDF whose PDF is symmetric around 0. Then, the SMS estimator is defined as:

[math]\displaystyle{ \hat {b}_{SMS} = {\operatorname{arg\max}}_b \frac {1}{N} \sum_{t=1}^N \sum_{i=1}^J y_{t,i} \sum\nolimits_{j \in C, j \neq i} K \left( \frac{x_{t,i}b - x_{t,j}b}{h_N} \right) }[/math]

where [math]\displaystyle{ (h_N, N = 1,2, ...) }[/math] is a sequence of strictly positive numbers with [math]\displaystyle{ \lim_{N\to +\infty} h_N = 0 }[/math] . Here, the intuition is the same as in the construction of the traditional MS estimator: an alternative whose certainty part of utility is higher is more likely to be chosen. Under certain conditions, the SMS estimator is consistent and, more importantly, asymptotically normally distributed. Therefore, testing and inference based on the asymptotic distribution can be implemented.[5]
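A compact numerical sketch of the SMS estimator, assuming a logistic-CDF kernel (which satisfies conditions (1)-(3)), simulated data, and a grid search over directions under the scale normalization ||b|| = 1 (scale is not identified). All names, the bandwidth choice, and the grid resolution are illustrative assumptions:

```python
import numpy as np

def sms_objective(b, X, y, h):
    """Smoothed maximum score objective with a logistic-CDF kernel.

    X : (N, J, q) covariates, y : (N, J) one-hot choices, h : bandwidth h_N.
    """
    util = X @ b                                      # (N, J)
    diff = (util[:, :, None] - util[:, None, :]) / h  # (N, J, J)
    Kvals = 1.0 / (1.0 + np.exp(-diff))               # smooth kernel K
    mask = ~np.eye(X.shape[1], dtype=bool)            # drop the j = i term
    score = (Kvals * mask).sum(axis=2)                # sum over rivals j != i
    return (y * score).sum() / len(X)

# Simulated data with a known direction (illustrative, q = 2)
rng = np.random.default_rng(1)
N, J = 500, 3
true_angle = 0.5
beta = np.array([np.cos(true_angle), np.sin(true_angle)])
X = rng.normal(size=(N, J, 2))
y = np.eye(J)[(X @ beta + rng.gumbel(size=(N, J))).argmax(axis=1)]

# Grid search over the unit circle; the smoothness of the objective
# would also permit gradient-based optimization, unlike the MS objective
angles = np.linspace(-np.pi, np.pi, 721)
vals = [sms_objective(np.array([np.cos(a), np.sin(a)]), X, y, h=0.2)
        for a in angles]
a_hat = angles[int(np.argmax(vals))]
print(a_hat)  # should lie near true_angle = 0.5
```

The bandwidth h plays the role of h_N above: in practice it is shrunk toward zero as the sample grows, trading smoothness against bias.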

  1. For more examples, see: Michael D. Smith and Erik Brynjolfsson (2001), “Consumer Decision-Making at an Internet Shopbot”, MIT Sloan School of Management Working Paper No. 4206-01.
  2. Charles F. Manski (1975), “Maximum Score Estimation of the Stochastic Utility Model of Choice”, Journal of Econometrics 3, pp. 205-228.
  3. Jeankyung Kim; David Pollard (1990), “Cube Root Asymptotics”, The Annals of Statistics 18, pp. 191-219.
  4. Joel L. Horowitz (1992), “A Smoothed Maximum Score Estimator for the Binary Response Model”, Econometrica 60, pp. 505-531.
  5. For a survey, see: Jin Yan (2012), “A Smoothed Maximum Score Estimator for Multinomial Discrete Choice Models”, Working Paper.