PermuTest
From IEETA
Title | ON USING PERMUTATION TESTS TO ESTIMATE CLASSIFICATION SIGNIFICANCE |
---|---|
Advisor | Mohammed S. Al-Rawi |
Level | First |
Target | neuroimaging, MVPA, other pattern classification domains, genome-wide association studies (GWAS) |
Area | science and engineering |
See also |
Recent pattern classification practice places increasing emphasis on permutation tests for estimating the statistical significance (p-value) of a classifier. When a dataset is limited, e.g., by low sample size and high dimensionality, as in functional magnetic resonance imaging studies or genome analyses, k-fold cross-validation is often used, and classification performance is usually reported as the average over the k folds. This average can then serve as the test statistic for estimating the significance of the classifier, i.e., how far its performance lies above chance level. However, averaging may mask the least recognizable classes: the resulting p-value is biased towards the most recognizable classes, so low p-values may sometimes be the result of undercoverage. In this work, we investigate this problem and propose a solution that implements permutation tests based on newly introduced test statistics. We also propose a model based on partial scrambling of the testing samples to judge the tolerance of the p-value and to draw conclusions about which statistic is superior. We validate the proposed model using functional magnetic resonance imaging data and show that it can better help in accepting or rejecting the hypothesis that all classes in the problem domain are drawn from the same distribution.
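The contrast between an averaged statistic and a class-sensitive one can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method proposed in this work: the abstract does not specify the new test statistics, so minimum per-class recall is used here purely as a hypothetical stand-in. The nearest-centroid classifier, the synthetic one-dimensional data, and all function names are likewise assumptions made for the example.

```python
import random


def nearest_centroid_predict(train_x, train_y, test_x):
    """Assign each test point to the class with the closest training mean (1-D)."""
    centroids = {}
    for c in set(train_y):
        members = [x for x, y in zip(train_x, train_y) if y == c]
        centroids[c] = sum(members) / len(members)
    return [min(centroids, key=lambda c: abs(x - centroids[c])) for x in test_x]


def cv_predictions(xs, ys, k):
    """Return one out-of-fold prediction per sample using k interleaved folds."""
    n = len(xs)
    preds = [None] * n
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        fold_preds = nearest_centroid_predict(
            [xs[i] for i in train_idx],
            [ys[i] for i in train_idx],
            [xs[i] for i in test_idx],
        )
        for i, p in zip(test_idx, fold_preds):
            preds[i] = p
    return preds


def accuracy(ys, preds):
    """Pooled accuracy; equals the average over folds when folds have equal size."""
    return sum(y == p for y, p in zip(ys, preds)) / len(ys)


def min_class_recall(ys, preds):
    """Hypothetical class-sensitive statistic: recall of the worst-recognized class."""
    recalls = []
    for c in set(ys):
        idx = [i for i, y in enumerate(ys) if y == c]
        recalls.append(sum(preds[i] == c for i in idx) / len(idx))
    return min(recalls)


def permutation_p_value(xs, ys, statistic, k=5, n_perm=200, seed=0):
    """Permute the labels, rerun cross-validation, and compare with the observed value."""
    rng = random.Random(seed)
    observed = statistic(ys, cv_predictions(xs, ys, k))
    exceed = 0
    for _ in range(n_perm):
        perm = list(ys)
        rng.shuffle(perm)
        if statistic(perm, cv_predictions(xs, perm, k)) >= observed:
            exceed += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return (exceed + 1) / (n_perm + 1)


# Synthetic toy data (an assumption for illustration): class 0 is well separated,
# while classes 1 and 2 are drawn from the same distribution.
data_rng = random.Random(42)
xs = ([data_rng.gauss(0.0, 1.0) for _ in range(20)]
      + [data_rng.gauss(6.0, 1.0) for _ in range(40)])
ys = [0] * 20 + [1] * 20 + [2] * 20

p_acc = permutation_p_value(xs, ys, accuracy)
p_min = permutation_p_value(xs, ys, min_class_recall)
print(f"p-value (overall accuracy): {p_acc:.3f}")
print(f"p-value (min class recall): {p_min:.3f}")
```

In this toy setting, one easily recognized class can be enough to lift overall accuracy well above the permutation distribution, whereas a statistic tied to the worst-recognized class may respond differently. Which statistic is preferable in practice is exactly the question the work sets out to study.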