Statistical method for modeling sequencing data from different technologies in longitudinal studies with application to Huntington disease

Angga M Fuady, Willeke M C van Roon-Mom, Szymon M Kiełbasa, Hae-Won Uh, Jeanine J Houwing-Duistermaat

    Research output: Contribution to journal, Article, Academic, peer-review


    Abstract

    Advancement of gene expression measurements in longitudinal studies enables the identification of genes associated with disease severity over time. However, problems arise when the technology used to measure gene expression differs between time points. Observed differences between the results obtained at different time points can be caused by technical differences. Modeling the two measurements jointly over time might provide insight into the causes of these different results. Our work is motivated by a study of gene expression data of blood samples from Huntington disease patients, which were obtained using two different sequencing technologies. At time point 1, DeepSAGE technology was used to measure the gene expression, with a subsample also measured using RNA-Seq technology. At time point 2, all samples were measured using RNA-Seq technology. Significant associations between gene expression measured by DeepSAGE and disease severity using data from the first time point could not be replicated by the RNA-Seq data from the second time point. We modeled the relationship between the two sequencing technologies using the data from the overlapping samples. We used linear mixed models with either DeepSAGE or RNA-Seq measurements as the dependent variable and disease severity as the independent variable. In conclusion, (1) for one out of 14 genes, the initial significant result could be replicated with both technologies using data from both time points; (2) statistical efficiency is lost due to disagreement between the two technologies, measurement error when predicting gene expressions, and the need to include additional parameters to account for possible differences.
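The step of relating the two technologies through the overlapping samples can be illustrated with a deliberately simplified sketch. It uses an ordinary least-squares calibration line, not the linear mixed models named in the abstract, and it is not the authors' implementation. The function names (`fit_calibration`, `predict_rnaseq`) and all numbers are hypothetical, chosen only to show how samples measured with both technologies might be used to map DeepSAGE values onto the RNA-Seq scale for one gene.

```python
from typing import List, Tuple


def fit_calibration(deepsage: List[float], rnaseq: List[float]) -> Tuple[float, float]:
    """Fit rnaseq ~ intercept + slope * deepsage by ordinary least squares.

    Both lists hold (hypothetical) expression values for the same gene,
    measured on the same overlapping samples with each technology.
    """
    if len(deepsage) != len(rnaseq) or len(deepsage) < 2:
        raise ValueError("need at least two paired measurements")
    n = len(deepsage)
    mean_x = sum(deepsage) / n
    mean_y = sum(rnaseq) / n
    sxx = sum((x - mean_x) ** 2 for x in deepsage)
    if sxx == 0:
        raise ValueError("DeepSAGE values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(deepsage, rnaseq))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_rnaseq(deepsage_value: float, intercept: float, slope: float) -> float:
    """Map one DeepSAGE value onto the RNA-Seq scale using the fitted line."""
    return intercept + slope * deepsage_value


# Synthetic illustration only: four overlapping samples for a single gene.
deepsage_overlap = [1.0, 2.0, 3.0, 4.0]
rnaseq_overlap = [2.1, 3.9, 6.1, 7.9]

intercept, slope = fit_calibration(deepsage_overlap, rnaseq_overlap)

# Predict the RNA-Seq-scale value for a sample measured only with DeepSAGE.
predicted = predict_rnaseq(5.0, intercept, slope)
```

Predictions obtained this way carry their own uncertainty, which is one of the sources of efficiency loss the abstract mentions (measurement error when predicting gene expression). A fuller analysis would propagate that uncertainty and account for repeated measurements on the same patient, for example through random effects.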

    Original language: English
    Pages (from-to): 745-760
    Number of pages: 16
    Journal: Biometrical Journal
    Volume: 63
    Issue number: 4
    Early online date: 22 Dec 2020
    Publication status: Published - Apr 2021

    Keywords

    • DeepSAGE
    • linear mixed model
    • measurement error
    • quality control
    • RNA-Seq
