TY - JOUR
T1 - Joint modeling of an outcome variable and integrated omics datasets using GLM-PO2PLS
AU - Gu, Zhujie
AU - Uh, Hae Won
AU - Houwing-Duistermaat, Jeanine
AU - el Bouhaddani, Said
N1 - Publisher Copyright:
© 2024 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
PY - 2024
Y1 - 2024
N2 - In many studies of human diseases, multiple omics datasets are measured. Typically, these omics datasets are studied one by one with the disease, thus the relationship between omics is overlooked. Modeling the joint part of multiple omics and its association to the outcome disease will provide insights into the complex molecular base of the disease. Several dimension reduction methods which jointly model multiple omics and two-stage approaches that model the omics and outcome in separate steps are available. Holistic one-stage models for both omics and outcome are lacking. In this article, we propose a novel one-stage method that jointly models an outcome variable with omics. We establish the model identifiability and develop EM algorithms to obtain maximum likelihood estimators of the parameters for normally and Bernoulli distributed outcomes. Test statistics are proposed to infer the association between the outcome and omics, and their asymptotic distributions are derived. Extensive simulation studies are conducted to evaluate the proposed model. The method is illustrated by modeling Down syndrome as outcome and methylation and glycomics as omics datasets. Here we show that our model provides more insight by jointly considering methylation and glycomics.
AB - In many studies of human diseases, multiple omics datasets are measured. Typically, these omics datasets are studied one by one with the disease, thus the relationship between omics is overlooked. Modeling the joint part of multiple omics and its association to the outcome disease will provide insights into the complex molecular base of the disease. Several dimension reduction methods which jointly model multiple omics and two-stage approaches that model the omics and outcome in separate steps are available. Holistic one-stage models for both omics and outcome are lacking. In this article, we propose a novel one-stage method that jointly models an outcome variable with omics. We establish the model identifiability and develop EM algorithms to obtain maximum likelihood estimators of the parameters for normally and Bernoulli distributed outcomes. Test statistics are proposed to infer the association between the outcome and omics, and their asymptotic distributions are derived. Extensive simulation studies are conducted to evaluate the proposed model. The method is illustrated by modeling Down syndrome as outcome and methylation and glycomics as omics datasets. Here we show that our model provides more insight by jointly considering methylation and glycomics.
KW - data integration
KW - Dimension reduction
KW - generalized linear models
KW - multiple omics
KW - PLS methods
UR - http://www.scopus.com/inward/record.url?scp=85186410683&partnerID=8YFLogxK
U2 - 10.1080/02664763.2024.2313458
DO - 10.1080/02664763.2024.2313458
M3 - Article
C2 - 39290359
AN - SCOPUS:85186410683
SN - 0266-4763
VL - 51
SP - 2627
EP - 2651
JO - Journal of Applied Statistics
JF - Journal of Applied Statistics
IS - 13
ER -