TY - JOUR
T1 - Multi-scale semi-supervised clustering of brain images
T2 - Deriving disease subtypes
AU - Wen, Junhao
AU - Varol, Erdem
AU - Sotiras, Aristeidis
AU - Yang, Zhijian
AU - Chand, Ganesh B
AU - Erus, Guray
AU - Shou, Haochang
AU - Abdulkadir, Ahmed
AU - Hwang, Gyujoon
AU - Dwyer, Dominic B
AU - Pigoni, Alessandro
AU - Dazzan, Paola
AU - Kahn, Rene S
AU - Schnack, Hugo G
AU - Zanetti, Marcus V
AU - Meisenzahl, Eva
AU - Busatto, Geraldo F
AU - Crespo-Facorro, Benedicto
AU - Romero-Garcia, Rafael
AU - Pantelis, Christos
AU - Wood, Stephen J
AU - Zhuo, Chuanjun
AU - Shinohara, Russell T
AU - Fan, Yong
AU - Gur, Ruben C
AU - Gur, Raquel E
AU - Satterthwaite, Theodore D
AU - Koutsouleris, Nikolaos
AU - Wolf, Daniel H
AU - Davatzikos, Christos
N1 - Funding Information:
This work was supported in part by NIH grants R01MH112070, 1U01AG068057, 1RF1AG054409, and R01AG067103. Additional support was provided by S10OD023495, R01MH101111, R01MH113565, R01MH113550, R01MH112847, and R01EB022573. The work was also supported by the PRONIA project as funded by the European Union 7th Framework Program grant 602152. Data collection and sharing for this project were funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer's Association; Alzheimer's Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer's Therapeutic Research Institute at the University of Southern California.
ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California. This research has also been conducted using the UK Biobank Resource (UKBB Application Number: 35148): https://www.ukbiobank.ac.uk/.
Publisher Copyright:
© 2021 Elsevier B.V.
PY - 2022/1
Y1 - 2022/1
N2 - Disease heterogeneity is a significant obstacle to understanding pathological processes and delivering precision diagnostics and treatment. Clustering methods have gained popularity for stratifying patients into subpopulations (i.e., subtypes) of brain diseases using imaging data. However, unsupervised clustering approaches are often confounded by anatomical and functional variations not related to a disease or pathology of interest. Semi-supervised clustering techniques have been proposed to overcome this and, therefore, capture disease-specific patterns more effectively. An additional limitation of both unsupervised and semi-supervised conventional machine learning methods is that they typically model, learn and infer from data using a basis of feature sets pre-defined at a fixed anatomical or functional scale (e.g., atlas-based regions of interest). Herein we propose a novel method, "Multi-scAle heteroGeneity analysIs and Clustering" (MAGIC), to depict the multi-scale presentation of disease heterogeneity, which builds on a previously proposed semi-supervised clustering method, HYDRA. It derives multi-scale and clinically interpretable feature representations and exploits a double-cyclic optimization procedure to effectively drive identification of inter-scale-consistent disease subtypes. More importantly, to understand the conditions under which the clustering model can estimate true heterogeneity related to diseases, we conducted extensive and systematic semi-simulated experiments to evaluate the proposed method on a sizeable healthy control sample from the UK Biobank (N = 4403). We then applied MAGIC to imaging data from Alzheimer's disease (ADNI, N = 1728) and schizophrenia (PHENOM, N = 1166) patients to demonstrate its potential and challenges in dissecting the neuroanatomical heterogeneity of common brain diseases. Taken together, we aim to provide guidance regarding when such analyses can succeed or should be taken with caution. 
The code of the proposed method is publicly available at https://github.com/anbai106/MAGIC.
KW - Clustering
KW - Heterogeneity
KW - Multi-scale
KW - Semi-simulated
KW - Semi-supervised
UR - http://www.scopus.com/inward/record.url?scp=85119419962&partnerID=8YFLogxK
U2 - 10.1016/j.media.2021.102304
DO - 10.1016/j.media.2021.102304
M3 - Article
C2 - 34818611
SN - 1361-8415
VL - 75
SP - 1
EP - 18
JO - Medical Image Analysis
JF - Medical Image Analysis
M1 - 102304
ER -