Statistical modeling of an outcome variable with integrated omics data

Zhujie Gu

    Research output: Thesis › Doctoral thesis 1 (Research UU / Graduation UU)

    Abstract

    In human disease studies, it has become common to collect multiple omics datasets measured at different molecular levels. The aim is to study the underlying mechanisms of disease from different perspectives by jointly analyzing these datasets. This thesis develops statistical methodologies to model a disease outcome with two omics datasets. We consider latent variable methods for constructing low-dimensional components that represent the two omics datasets, and linear models for associating the components with a disease outcome. The latent variable methods address the statistical challenges of high dimensionality, correlations within and between omics datasets, and systematic differences between datasets. The linear models provide flexibility for various study designs and different distributions of disease outcomes. Two classes of methods are developed: two-stage methods, in which the latent variable model and the linear model are fitted separately, and one-stage methods, in which the two are fitted simultaneously. The two-stage methods are computationally fast and offer more flexibility in the linear models, while the one-stage models provide unbiased inference results. All methods are validated and can be used in a wide range of disease studies.
    Original language: English
    Awarding Institution
    • University Medical Center (UMC) Utrecht
    Supervisors/Advisors
    • Sturkenboom, Miriam, Primary supervisor
    • Duistermaat, Jeanine, Supervisor
    • Uh, Hae-Won, Co-supervisor
    • el Bouhaddani, Said, Co-supervisor
    Award date: 22 May 2023
    Place of Publication: Utrecht
    Print ISBNs: 978-94-6483-123-8
    Publication status: Published - 22 May 2023

    Keywords

    • Omics research
    • Omics heterogeneity
    • Data integration
    • High dimensional statistics
    • Dimension reduction
    • Partial least squares
    • Two-stage modelling
    • Joint modelling
    • Generalized linear models
    • GLM-PO2PLS
