Common component classification: What can we learn from machine learning?

Cited: 2
Authors
Anderson, Ariana [1 ,2 ]
Labus, Jennifer S. [2 ,3 ,4 ]
Vianna, Eduardo P. [2 ,3 ,4 ]
Mayer, Emeran A. [2 ,3 ,4 ]
Cohen, Mark S. [1 ,2 ]
Affiliations
[1] Univ Calif Los Angeles, Ctr Cognit Neurosci, David Geffen Sch Med, Los Angeles, CA 90095 USA
[2] Univ Calif Los Angeles, David Geffen Sch Med, Dept Psychiat & Behav Sci, Los Angeles, CA 90095 USA
[3] Univ Calif Los Angeles, David Geffen Sch Med, Ctr Neurobiol Stress, Los Angeles, CA 90095 USA
[4] Univ Calif Los Angeles, David Geffen Sch Med, Brain Res Inst, Los Angeles, CA 90095 USA
Funding
US National Institutes of Health;
Keywords
Classification; Discrimination; fMRI; Bias; Machine learning; Independent components analysis; Cross-validation; Irritable bowel; FUNCTIONAL MRI; FMRI;
DOI
10.1016/j.neuroimage.2010.05.065
Chinese Library Classification
Q189 [Neuroscience];
Discipline code
071006;
Abstract
Machine learning methods have been applied to classifying fMRI scans by studying locations in the brain that exhibit temporal intensity variation between groups, frequently reporting classification accuracy of 90% or better. Although empirical results are quite favorable, one might doubt the ability of classification methods to withstand changes in task ordering and the reproducibility of activation patterns over runs, and question how much of the classifiers' power is due to artifactual noise rather than genuine neurological signal. To examine the true strength and power of machine learning classifiers, we create and then deconstruct a classifier to examine its sensitivity to physiological noise, task reordering, and across-scan classification ability. The models are trained and tested both within and across runs to assess stability and reproducibility across conditions. We demonstrate the use of independent components analysis for both feature extraction and artifact removal, and show that removal of such artifacts can reduce predictive accuracy even when the data have been cleaned in the preprocessing stages. We demonstrate how mistakes in the feature selection process can cause the cross-validation error reported in publications to be a biased estimate of the testing error seen in practice, and we measure this bias by deliberately constructing flawed models. We discuss other ways in which bias can be introduced, and the statistical assumptions underlying the data and the models themselves. Finally, we discuss the complications of drawing inference from the smaller sample sizes typically seen in fMRI studies, the effects of small or unbalanced samples on Type I and Type II error rates, and how publication bias can give false confidence in the power of such methods. Collectively, this work identifies challenges specific to fMRI classification and methods affecting the stability of models. (C) 2010 Elsevier Inc. All rights reserved.
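The feature-selection bias described in the abstract can be illustrated with a minimal sketch (not the authors' code; all variable names, dimensions, and model choices here are illustrative assumptions). Selecting discriminative voxels on the full dataset before cross-validation leaks information from the held-out folds, so even pure-noise data with arbitrary labels can appear highly classifiable; refitting the selection step inside each training fold removes that optimism.

```python
# Illustrative sketch of cross-validation bias from out-of-fold feature selection.
# Assumes scikit-learn and NumPy; sizes mimic an fMRI setting (few scans, many voxels).
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_scans, n_voxels = 40, 5000
X = rng.standard_normal((n_scans, n_voxels))   # pure noise: no true group difference
y = rng.integers(0, 2, n_scans)                # arbitrary group labels

# Flawed protocol: univariate selection on ALL scans, then cross-validate the classifier.
X_leaky = SelectKBest(f_classif, k=20).fit_transform(X, y)
acc_leaky = cross_val_score(LinearSVC(), X_leaky, y, cv=5).mean()

# Correct protocol: selection is refit inside every training fold via a pipeline.
model = make_pipeline(SelectKBest(f_classif, k=20), LinearSVC())
acc_nested = cross_val_score(model, X, y, cv=5).mean()

print(f"leaky CV accuracy:  {acc_leaky:.2f}")   # typically far above chance on noise
print(f"nested CV accuracy: {acc_nested:.2f}")  # hovers near 0.5 (chance level)
```

The gap between the two accuracy estimates on data with no real signal is one way to quantify the kind of bias the abstract attributes to flawed model-building pipelines.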
Pages: 517-524
Page count: 8