Abstract
Genomic profiling based on high-dimensional data from high-throughput experiments that measure the expression of tens of thousands of genes or biomarkers holds great promise for clinical application. Diagnosis, prognosis and treatment selection for individual patients can become more accurate with strong statistical prediction models based on robust, informative gene lists. Numerous published studies claim to have built accurate prediction models. However, the initial enthusiasm has been tempered by the uncovering of many false claims. These false claims stem mainly from the inadequate statistical methodology used to develop the quantitative model underlying prediction or classification. Predictive modeling of gene expression data is challenging: it suffers from a lack of reproducibility and from instability of the findings, both strongly associated with the curse of dimensionality in these datasets (a very low number of samples relative to the number of available genes). The literature shows that different prediction models constructed on the same data do not yield a unique gene signature list, and that the set of genes included in a predictive model depends heavily on the subset of samples chosen for modeling. We used information from published gene expression studies to serve the following two goals. First, we evaluated potential factors affecting the accuracy of predictive models for binary outcome data. Second, we performed meta-analysis to generate a more accurate list of differentially expressed genes and evaluated its added value as a feature selection method in predictive modeling.
Original language | English |
---|---|
Awarding Institution | |
Supervisors/Advisors | |
Award date | 2 Jun 2015 |
Place of Publication | 's-Hertogenbosch |
Publisher | |
Print ISBNs | 978-94-6295-187-7 |
Publication status | Published - 2 Jun 2015 |
Keywords
- gene expression
- microarray
- classification
- class prediction
- meta-analysis
- sequential meta-analysis
- accuracy
- correlation