TY - JOUR
T1 - immuneSIM
T2 - tunable multi-feature simulation of B- and T-cell receptor repertoires for immunoinformatics benchmarking
AU - Weber, Cédric R
AU - Akbar, Rahmad
AU - Yermanos, Alexander
AU - Pavlović, Milena
AU - Snapkov, Igor
AU - Sandve, Geir K
AU - Reddy, Sai T
AU - Greiff, Victor
N1 - Publisher Copyright: © The Author(s) 2020. Published by Oxford University Press.
PY - 2020/6/1
Y1 - 2020/6/1
N2 - B- and T-cell receptor repertoires of the adaptive immune system have become a key target for diagnostics and therapeutics research. Consequently, there is a rapidly growing number of bioinformatics tools for immune repertoire analysis. Benchmarking of such tools is crucial for ensuring reproducible and generalizable computational analyses. Currently, however, it remains challenging to create standardized ground truth immune receptor repertoires for immunoinformatics tool benchmarking. Therefore, we developed immuneSIM, an R package that allows the simulation of native-like and aberrant synthetic full-length variable region immune receptor sequences by tuning the following immune receptor features: (i) species and chain type (BCR, TCR, single and paired), (ii) germline gene usage, (iii) occurrence of insertions and deletions, (iv) clonal abundance, (v) somatic hypermutation and (vi) sequence motifs. Each simulated sequence is annotated by the complete set of simulation events that contributed to its in silico generation. immuneSIM permits the benchmarking of key computational tools for immune receptor analysis, such as germline gene annotation, diversity and overlap estimation, sequence similarity, network architecture, clustering analysis and machine learning methods for motif detection.
AB - B- and T-cell receptor repertoires of the adaptive immune system have become a key target for diagnostics and therapeutics research. Consequently, there is a rapidly growing number of bioinformatics tools for immune repertoire analysis. Benchmarking of such tools is crucial for ensuring reproducible and generalizable computational analyses. Currently, however, it remains challenging to create standardized ground truth immune receptor repertoires for immunoinformatics tool benchmarking. Therefore, we developed immuneSIM, an R package that allows the simulation of native-like and aberrant synthetic full-length variable region immune receptor sequences by tuning the following immune receptor features: (i) species and chain type (BCR, TCR, single and paired), (ii) germline gene usage, (iii) occurrence of insertions and deletions, (iv) clonal abundance, (v) somatic hypermutation and (vi) sequence motifs. Each simulated sequence is annotated by the complete set of simulation events that contributed to its in silico generation. immuneSIM permits the benchmarking of key computational tools for immune receptor analysis, such as germline gene annotation, diversity and overlap estimation, sequence similarity, network architecture, clustering analysis and machine learning methods for motif detection.
KW - Computer Simulation
KW - Receptors, Antigen, T-Cell/genetics
KW - Software
U2 - 10.1093/bioinformatics/btaa158
DO - 10.1093/bioinformatics/btaa158
M3 - Article
C2 - 32154832
SN - 1367-4803
VL - 36
SP - 3594
EP - 3596
JO - Bioinformatics (Oxford, England)
JF - Bioinformatics (Oxford, England)
IS - 11
ER -