TY - JOUR
T1 - INVESTIGATING THE IMPACT OF DOWN SYNDROME ON METHYLATION AND GLYCOMICS WITH TWO-STAGE PO2PLS
AU - Gu, Zhujie
AU - Bouhaddani, Said El
AU - Houwing-Duistermaat, Jeanine
AU - Uh, Hae Won
N1 - Publisher Copyright: © 2021, Fabrizio Serra Editore Srl. All rights reserved.
PY - 2021
Y1 - 2021
N2 - Down syndrome (DS) is a condition that leads to precocious and accelerated aging in affected subjects. Several alterations in DS cases have been reported at the molecular level, particularly in methylation and glycosylation. Investigating the relation between methylation, glycomics and DS can lead to new insights into this atypical aging. We consider a data integration approach in which we investigate how DS affects the correlated parts of glycomics and methylation, and which CpG sites and glycans are relevant. Our motivating datasets consist of methylation and glycomics data measured on 29 DS patients and their unaffected siblings and mothers. The family-based case-control design needs to be taken into account when studying the relationship between methylation, glycomics and DS. We propose a two-stage approach that first integrates the methylation and glycomics data and then links the joint information to DS. For the data integration step, we consider probabilistic two-way orthogonal partial least squares (PO2PLS). PO2PLS models two omics datasets in terms of low-dimensional joint and omic-specific latent components, and takes into account heterogeneity across the omics data. The relationship between the omics data can be statistically tested. The joint components represent the information shared by methylation and glycomics. In the second stage, we apply a linear mixed model to the relationship between DS and the joint methylation and glycomics components. For the components that are significantly associated with DS, we identify the most important CpG sites and glycans. A simulation study is conducted to evaluate the performance of our approach. The results show that the effects of DS on the omics data can be detected when the sample size is large, and that the accuracy of the feature selection is high for both small and large sample sizes. Applied to the DS datasets, our approach finds a significant effect of DS on the joint components. The identified CpG sites and glycans appear to be related to DS. Our proposed method, which jointly analyzes multiple omics data with an outcome variable, may provide new insight into the molecular implications of DS at different omics levels.
KW - Down Syndrome
KW - Omics Data Integration
KW - Probabilistic O2PLS
UR - http://www.scopus.com/inward/record.url?scp=85134411829&partnerID=8YFLogxK
U2 - 10.19272/202111401004
DO - 10.19272/202111401004
M3 - Comment/Letter to the editor
AN - SCOPUS:85134411829
SN - 2282-2593
VL - 114
SP - 29
EP - 44
JO - Theoretical Biology Forum
JF - Theoretical Biology Forum
IS - 1
ER -