Strong and weak sampling

From HandWiki

Strong and weak sampling are two sampling approaches[1] in statistics, and are popular in computational cognitive science and language learning.[2] In strong sampling, the data are assumed to be intentionally generated as positive examples of a concept,[3] while in weak sampling, the data are assumed to be generated without any restrictions.[4]

Formal Definition

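In strong sampling, we assume that each observation is sampled uniformly at random from the true hypothesis, so that every member of [math]\displaystyle{ h }[/math] is equally likely and [math]\displaystyle{ |h| }[/math] denotes the number of members of [math]\displaystyle{ h }[/math]: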

[math]\displaystyle{ P(x|h) = \begin{cases} \frac{1}{|h|} & \text{, if } x \in h \\ 0 & \text{, otherwise} \end{cases} }[/math]

In weak sampling, we assume that observations are randomly sampled independently of the hypothesis and then classified as belonging to it or not:

[math]\displaystyle{ P(x|h) = \begin{cases} 1 & \text{, if } x \in h \\ 0 & \text{, otherwise} \end{cases} }[/math]
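
The two likelihoods can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a hypothesis is a finite set of integers; the names `strong_likelihood`, `weak_likelihood`, `evens` and `powers_of_two` are hypothetical and do not come from the cited literature.

```python
from fractions import Fraction


def strong_likelihood(x, h):
    """P(x|h) under strong sampling: uniform over the members of h."""
    return Fraction(1, len(h)) if x in h else Fraction(0)


def weak_likelihood(x, h):
    """P(x|h) under weak sampling: 1 if x is consistent with h, else 0."""
    return Fraction(1) if x in h else Fraction(0)


# Hypothetical hypotheses over the integers 1..10 (illustration only).
evens = frozenset({2, 4, 6, 8, 10})
powers_of_two = frozenset({2, 4, 8})

print(strong_likelihood(4, evens))          # 1/5
print(strong_likelihood(4, powers_of_two))  # 1/3
print(weak_likelihood(4, evens))            # 1
print(weak_likelihood(3, evens))            # 0
```

Under strong sampling the same observation receives a different likelihood depending on the size of the hypothesis, whereas under weak sampling every consistent hypothesis receives the same value.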

Consequence: Posterior computation under Weak Sampling

[math]\displaystyle{ P(h|x) = \frac{P(x|h) P(h)}{\sum\limits_{h'} P(x|h') P(h')} = \begin{cases} \frac{P(h)}{\sum\limits_{h': x \in h'} P(h')} & \text{, if } x \in h \\ 0 & \text{, otherwise} \end{cases} }[/math]

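Therefore, under weak sampling the likelihood [math]\displaystyle{ P(x|h') }[/math] equals 1 for every hypothesis [math]\displaystyle{ h' }[/math] that is consistent with the observation, so it drops out of the posterior, which reduces to the prior renormalized over the consistent hypotheses. In this sense the likelihood is "ignored".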
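
The posterior formula above can be checked numerically. The following is a minimal sketch, assuming a small finite hypothesis space of integer sets and an arbitrary prior chosen only for illustration; the function `posterior` and all hypothesis names are hypothetical.

```python
from fractions import Fraction


def weak_likelihood(x, h):
    """P(x|h) under weak sampling: 1 if x is consistent with h, else 0."""
    return Fraction(1) if x in h else Fraction(0)


def strong_likelihood(x, h):
    """P(x|h) under strong sampling: uniform over the members of h."""
    return Fraction(1, len(h)) if x in h else Fraction(0)


def posterior(x, prior, likelihood):
    """Bayes' rule over a finite hypothesis space.

    prior: dict mapping each hypothesis (a frozenset) to P(h).
    likelihood: function (x, h) -> P(x|h).
    """
    unnormalized = {h: likelihood(x, h) * p for h, p in prior.items()}
    evidence = sum(unnormalized.values())
    if evidence == 0:
        raise ValueError("observation is inconsistent with every hypothesis")
    return {h: v / evidence for h, v in unnormalized.items()}


# Hypothetical hypotheses and prior (illustration only).
evens = frozenset({2, 4, 6, 8, 10})
powers_of_two = frozenset({2, 4, 8})
odds = frozenset({1, 3, 5, 7, 9})
prior = {evens: Fraction(1, 2), powers_of_two: Fraction(1, 4), odds: Fraction(1, 4)}

weak_post = posterior(4, prior, weak_likelihood)
strong_post = posterior(4, prior, strong_likelihood)

# Prior renormalized over the hypotheses that contain the observation.
consistent_mass = sum(p for h, p in prior.items() if 4 in h)
renormalized = {h: (p / consistent_mass if 4 in h else Fraction(0))
                for h, p in prior.items()}

print(weak_post == renormalized)   # True
print(weak_post[evens])            # 2/3
print(strong_post[evens])          # 6/11
print(strong_post[powers_of_two])  # 5/11
```

In this example the weak-sampling posterior coincides with the renormalized prior, as the formula states. The strong-sampling posterior, shown only for contrast, additionally depends on the size of each hypothesis.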

References
