Decision-theoretic rough sets

In the mathematical theory of decisions, decision-theoretic rough sets (DTRS) is a probabilistic extension of rough set classification. First proposed in 1990 by Yao, Wong, and Lingras,[1] the extension uses loss functions to derive the [math]\displaystyle{ \textstyle \alpha }[/math] and [math]\displaystyle{ \textstyle \beta }[/math] region parameters. As in rough set theory, lower and upper approximations of a set are used.

Definitions

The following contains the basic principles of decision-theoretic rough sets.

Conditional risk

Using the Bayesian decision procedure, the decision-theoretic rough set (DTRS) approach allows for minimum-risk decision making based on observed evidence. Let [math]\displaystyle{ \textstyle A=\{a_1,\ldots,a_m\} }[/math] be a finite set of [math]\displaystyle{ \textstyle m }[/math] possible actions and let [math]\displaystyle{ \textstyle \Omega=\{w_1,\ldots, w_s\} }[/math] be a finite set of [math]\displaystyle{ s }[/math] states. [math]\displaystyle{ \textstyle P(w_j\mid[x]) }[/math] is calculated as the conditional probability of an object [math]\displaystyle{ \textstyle x }[/math] being in state [math]\displaystyle{ \textstyle w_j }[/math] given the object description [math]\displaystyle{ \textstyle [x] }[/math]. [math]\displaystyle{ \textstyle \lambda(a_i\mid w_j) }[/math] denotes the loss, or cost, for performing action [math]\displaystyle{ \textstyle a_i }[/math] when the state is [math]\displaystyle{ \textstyle w_j }[/math]. The expected loss (conditional risk) associated with taking action [math]\displaystyle{ \textstyle a_i }[/math] is given by:

[math]\displaystyle{ R(a_i\mid [x]) = \sum_{j=1}^s \lambda(a_i\mid w_j)P(w_j\mid[x]). }[/math]
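
As an illustrative sketch of this formula, the conditional risk is simply a loss-weighted sum over states. The function and the numbers below are made up for illustration, not taken from the article:

```python
def conditional_risk(losses, probs):
    """R(a_i | [x]) = sum_j lambda(a_i | w_j) * P(w_j | [x]).

    losses[j] is the loss lambda(a_i | w_j) of action a_i in state w_j;
    probs[j] is the conditional probability P(w_j | [x]).
    """
    return sum(l * p for l, p in zip(losses, probs))

# Two states (A, A^c): zero loss when the action matches the state,
# loss 4 when it does not; P(A | [x]) = 0.75.
risk = conditional_risk([0.0, 4.0], [0.75, 0.25])  # 0*0.75 + 4*0.25 = 1.0
```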

Object classification with the approximation operators can be fitted into the Bayesian decision framework. The set of actions is given by [math]\displaystyle{ \textstyle A=\{a_P,a_N,a_B\} }[/math], where [math]\displaystyle{ \textstyle a_P }[/math], [math]\displaystyle{ \textstyle a_N }[/math], and [math]\displaystyle{ \textstyle a_B }[/math] represent the three actions of classifying an object into POS([math]\displaystyle{ \textstyle A }[/math]), NEG([math]\displaystyle{ \textstyle A }[/math]), and BND([math]\displaystyle{ \textstyle A }[/math]) respectively. To indicate whether an element is in [math]\displaystyle{ \textstyle A }[/math] or not in [math]\displaystyle{ \textstyle A }[/math], the set of states is given by [math]\displaystyle{ \textstyle \Omega=\{A,A^c\} }[/math]. Let [math]\displaystyle{ \textstyle \lambda(a_\diamond\mid A) }[/math] denote the loss incurred by taking action [math]\displaystyle{ \textstyle a_\diamond }[/math] when an object belongs to [math]\displaystyle{ \textstyle A }[/math], and let [math]\displaystyle{ \textstyle \lambda(a_\diamond\mid A^c) }[/math] denote the loss incurred by taking the same action when the object belongs to [math]\displaystyle{ \textstyle A^c }[/math].

Loss functions

Let [math]\displaystyle{ \textstyle \lambda_{PP} }[/math] denote the loss function for classifying an object in [math]\displaystyle{ \textstyle A }[/math] into the POS region, [math]\displaystyle{ \textstyle \lambda_{BP} }[/math] denote the loss function for classifying an object in [math]\displaystyle{ \textstyle A }[/math] into the BND region, and let [math]\displaystyle{ \textstyle \lambda_{NP} }[/math] denote the loss function for classifying an object in [math]\displaystyle{ \textstyle A }[/math] into the NEG region. A loss function [math]\displaystyle{ \textstyle \lambda_{\diamond N} }[/math] denotes the loss of classifying an object that does not belong to [math]\displaystyle{ \textstyle A }[/math] into the regions specified by [math]\displaystyle{ \textstyle \diamond }[/math].

The expected loss [math]\displaystyle{ \textstyle R(a_\diamond\mid[x]) }[/math] associated with taking each individual action can be expressed as:

[math]\displaystyle{ \textstyle R(a_P\mid[x]) = \lambda_{PP}P(A\mid[x]) + \lambda_{PN}P(A^c\mid[x]), }[/math]
[math]\displaystyle{ \textstyle R(a_N\mid[x]) = \lambda_{NP}P(A\mid[x]) + \lambda_{NN}P(A^c\mid[x]), }[/math]
[math]\displaystyle{ \textstyle R(a_B\mid[x]) = \lambda_{BP}P(A\mid[x]) + \lambda_{BN}P(A^c\mid[x]), }[/math]

where [math]\displaystyle{ \textstyle \lambda_{\diamond P}=\lambda(a_\diamond\mid A) }[/math], [math]\displaystyle{ \textstyle \lambda_{\diamond N}=\lambda(a_\diamond\mid A^c) }[/math], and [math]\displaystyle{ \textstyle \diamond=P }[/math], [math]\displaystyle{ \textstyle N }[/math], or [math]\displaystyle{ \textstyle B }[/math].
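
The three expected losses above can be sketched directly in code; the loss values used here are illustrative, not from the article, and the minimum-risk action is the one Bayesian decision theory would select:

```python
def region_risks(p_a, lam):
    """Expected losses R(a_P|[x]), R(a_N|[x]), R(a_B|[x]).

    p_a is P(A | [x]); lam maps each region label to the pair
    (lambda_{.P}, lambda_{.N}) = (loss if x in A, loss if x in A^c).
    """
    p_ac = 1.0 - p_a  # P(A^c | [x])
    return {
        'P': lam['P'][0] * p_a + lam['P'][1] * p_ac,
        'N': lam['N'][0] * p_a + lam['N'][1] * p_ac,
        'B': lam['B'][0] * p_a + lam['B'][1] * p_ac,
    }

# Illustrative losses: correct classification is free, misclassification
# costs 4, and deferring to the boundary region costs 1 either way.
lam = {'P': (0.0, 4.0), 'N': (4.0, 0.0), 'B': (1.0, 1.0)}
risks = region_risks(0.8, lam)          # {'P': 0.8, 'N': 3.2, 'B': 1.0}
best = min(risks, key=risks.get)        # minimum-risk action: 'P'
```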

Minimum-risk decision rules

If we consider the loss functions [math]\displaystyle{ \textstyle \lambda_{PP} \leq \lambda_{BP} \lt \lambda_{NP} }[/math] and [math]\displaystyle{ \textstyle \lambda_{NN} \leq \lambda_{BN} \lt \lambda_{PN} }[/math], the following decision rules are formulated (P, N, B):

  • P: If [math]\displaystyle{ \textstyle P(A\mid[x]) \geq \gamma }[/math] and [math]\displaystyle{ \textstyle P(A\mid[x]) \geq \alpha }[/math], decide POS([math]\displaystyle{ \textstyle A }[/math]);
  • N: If [math]\displaystyle{ \textstyle P(A\mid[x]) \leq \beta }[/math] and [math]\displaystyle{ \textstyle P(A\mid[x]) \leq \gamma }[/math], decide NEG([math]\displaystyle{ \textstyle A }[/math]);
  • B: If [math]\displaystyle{ \textstyle \beta \leq P(A\mid[x]) \leq \alpha }[/math], decide BND([math]\displaystyle{ \textstyle A }[/math]);

where,

[math]\displaystyle{ \alpha = \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{BP} - \lambda_{BN}) - (\lambda_{PP}-\lambda_{PN})}, }[/math]
[math]\displaystyle{ \gamma = \frac{\lambda_{PN} - \lambda_{NN}}{(\lambda_{NP} - \lambda_{NN}) - (\lambda_{PP}-\lambda_{PN})}, }[/math]
[math]\displaystyle{ \beta = \frac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{NP} - \lambda_{NN}) - (\lambda_{BP}-\lambda_{BN})}. }[/math]
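
A small sketch of these threshold formulas, using the same illustrative loss values as before (correct = 0, wrong = 4, boundary = 1; these numbers are assumptions, not from the article). Note that the chosen losses satisfy the orderings required by the decision rules, so [math]\displaystyle{ \textstyle \alpha \gt \gamma \gt \beta }[/math] holds:

```python
def thresholds(l_pp, l_bp, l_np, l_nn, l_bn, l_pn):
    """Compute (alpha, gamma, beta) from the six losses, per the
    formulas above: lambda_PP, lambda_BP, lambda_NP act on objects in A;
    lambda_NN, lambda_BN, lambda_PN act on objects in A^c."""
    alpha = (l_pn - l_bn) / ((l_bp - l_bn) - (l_pp - l_pn))
    gamma = (l_pn - l_nn) / ((l_np - l_nn) - (l_pp - l_pn))
    beta = (l_bn - l_nn) / ((l_np - l_nn) - (l_bp - l_bn))
    return alpha, gamma, beta

a, g, b = thresholds(0.0, 1.0, 4.0, 0.0, 1.0, 4.0)  # -> (0.75, 0.5, 0.25)
```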

The [math]\displaystyle{ \textstyle \alpha }[/math], [math]\displaystyle{ \textstyle \beta }[/math], and [math]\displaystyle{ \textstyle \gamma }[/math] values define the three different regions, giving us an associated risk for classifying an object. When [math]\displaystyle{ \textstyle \alpha \gt \beta }[/math], we get [math]\displaystyle{ \textstyle \alpha \gt \gamma \gt \beta }[/math] and can simplify (P, N, B) into (P1, N1, B1):

  • P1: If [math]\displaystyle{ \textstyle P(A\mid [x]) \geq \alpha }[/math], decide POS([math]\displaystyle{ \textstyle A }[/math]);
  • N1: If [math]\displaystyle{ \textstyle P(A\mid[x]) \leq \beta }[/math], decide NEG([math]\displaystyle{ \textstyle A }[/math]);
  • B1: If [math]\displaystyle{ \textstyle \beta \lt P(A\mid[x]) \lt \alpha }[/math], decide BND([math]\displaystyle{ \textstyle A }[/math]).
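
The simplified rules (P1, N1, B1) amount to two threshold comparisons. A minimal sketch, assuming [math]\displaystyle{ \textstyle \alpha \gt \beta }[/math] and using illustrative threshold values:

```python
def classify(p_a, alpha, beta):
    """Three-way decision via rules P1, N1, B1 (requires alpha > beta).

    p_a is P(A | [x]); returns the region the object is assigned to.
    """
    if p_a >= alpha:
        return 'POS'   # rule P1
    if p_a <= beta:
        return 'NEG'   # rule N1
    return 'BND'       # rule B1: beta < p_a < alpha

# Illustrative thresholds alpha = 0.75, beta = 0.25.
region = classify(0.9, 0.75, 0.25)  # 'POS'
```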

When [math]\displaystyle{ \textstyle \alpha = \beta = \gamma }[/math], we can simplify the rules (P, N, B) into (P2, N2, B2), which divide the regions based solely on [math]\displaystyle{ \textstyle \alpha }[/math]:

  • P2: If [math]\displaystyle{ \textstyle P(A\mid[x]) \gt \alpha }[/math], decide POS([math]\displaystyle{ \textstyle A }[/math]);
  • N2: If [math]\displaystyle{ \textstyle P(A\mid[x]) \lt \alpha }[/math], decide NEG([math]\displaystyle{ \textstyle A }[/math]);
  • B2: If [math]\displaystyle{ \textstyle P(A\mid[x]) = \alpha }[/math], decide BND([math]\displaystyle{ \textstyle A }[/math]).

Data mining, feature selection, information retrieval, and classification are some of the applications in which the DTRS approach has been successfully used.

References

  1. Yao, Y.Y.; Wong, S.K.M.; Lingras, P. (1990). "A decision-theoretic rough set model". Methodologies for Intelligent Systems, 5, Proceedings of the 5th International Symposium on Methodologies for Intelligent Systems (Knoxville, Tennessee, USA: North-Holland): 17–25. 