W-test


In statistics, the W-test is designed to test for distributional differences between cases and controls for a categorical variable set, which can be a single SNP, a SNP-SNP pair, or a SNP-environment pair. The test statistic takes the form of a combined log odds ratio, calculated from the contingency table of the variable set. It follows a chi-squared distribution with data-set adaptive degrees of freedom f, estimated from smaller bootstrapped samples of the data. This flexible, data-corrected probability distribution allows the W-test to give relatively accurate p-values under complex genetic architectures.
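As an illustration of the "combined log odds ratio" form, the following minimal Python sketch computes a statistic from a 2 × k contingency table (cases and controls by k categories). It assumes the statistic is a scaled sum, over the k categories, of squared log odds ratios divided by their standard errors; the function name `w_statistic`, the default scalar `h = (k - 1) / k`, and the 0.5 zero-cell correction are assumptions made for this example and are not taken from the text above.

```python
import math


def w_statistic(case_counts, control_counts, h=None, correction=0.5):
    """Scaled sum of squared standardized log odds ratios over k categories.

    case_counts / control_counts: counts per category of the variable set
    (e.g. 3 genotypes for one SNP, 9 genotype combinations for a SNP pair).
    h: scalar applied to the sum; (k - 1) / k is only a placeholder default.
    correction: added to all four cells of a 2x2 table that contains a zero
    (an assumption of this sketch, to keep the logarithm finite).
    """
    k = len(case_counts)
    if k != len(control_counts) or k < 2:
        raise ValueError("need two count vectors of equal length k >= 2")
    n_case, n_ctrl = sum(case_counts), sum(control_counts)
    if h is None:
        h = (k - 1) / k
    total = 0.0
    for a, c in zip(case_counts, control_counts):
        # 2x2 table: category i versus all other categories
        b, d = n_case - a, n_ctrl - c
        if min(a, b, c, d) == 0:
            a, b, c, d = (x + correction for x in (a, b, c, d))
        log_or = math.log((a * d) / (b * c))
        se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
        total += (log_or / se) ** 2
    return h * total
```

For example, `w_statistic([30, 50, 20], [45, 40, 15])` returns the statistic for a single SNP with three genotype categories. Under the description above, this value would then be compared against a chi-squared distribution with f degrees of freedom.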

Applications

Theoretically, the test is not restricted to pairwise interactions and can be extended to higher-order interactions if the sample size can support it. The W-test's application to pairwise interaction effects has been tested in common genome-wide association study (GWAS) datasets with fewer than 5,000 subjects [1]. Because it corrects for the probability distribution bias caused by sparse data through the bootstrapped parameters, it retains power in low-frequency variant settings, where the minor allele frequency (MAF) of a single-nucleotide polymorphism (SNP) is between 1% and 5%.
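The text does not spell out how the bootstrapped parameters are obtained. One plausible approach, sketched below in Python, is moment matching: given unscaled statistics X computed on bootstrap samples under the null hypothesis, choose a scalar h and degrees of freedom f so that hX has the mean (f) and variance (2f) of a chi-squared distribution. The function name `estimate_h_f` and the use of moment matching are assumptions made for this example.

```python
import statistics


def estimate_h_f(null_sums):
    """Moment-matching estimates of (h, f) such that h * X ~ chi-squared(f).

    null_sums: unscaled statistics X from bootstrap samples in which
    case/control status is unrelated to the variable set (hypothetical input).
    Matching E[hX] = f and Var[hX] = 2f gives
        h = 2 * mean / variance,   f = 2 * mean**2 / variance.
    """
    if len(null_sums) < 2:
        raise ValueError("need at least two bootstrap statistics")
    mean = statistics.fmean(null_sums)
    var = statistics.variance(null_sums)  # sample variance
    if var <= 0:
        raise ValueError("bootstrap statistics have zero variance")
    h = 2 * mean / var
    f = 2 * mean ** 2 / var
    return h, f
```

When data are sparse, the estimated f can differ from the nominal k - 1, which is how such a procedure would adapt the reference distribution to the data set.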

Software

The W-test C++ software, Linux version, and R package are available from the official wtest website.

References