Uncovering novel mutational signatures by de novo extraction with SigProfilerExtractor

2022-11-09
Islam, S.M. Ashiqul
Díaz-Gay, Marcos
Wu, Yang
Barnes, Mark
Vangara, Raviteja
Bergstrom, Erik N.
He, Yudou
Vella, Mike
Wang, Jingwei
Teague, Jon W.
Clapham, Peter
Moody, Sarah
Senkin, Sergey
Li, Yun Rose
Riva, Laura
Zhang, Tongwu
Gruber, Andreas J.
Steele, Christopher D.
Otlu Sarıtaş, Burçak
Khandekar, Azhar
Abbasi, Ammal
Humphreys, Laura
Syulyukina, Natalia
Brady, Samuel W.
Alexandrov, Boian S.
Pillay, Nischalan
Zhang, Jinghui
Adams, David J.
Martincorena, Iñigo
Wedge, David C.
Landi, Maria Teresa
Brennan, Paul
Stratton, Michael R.
Rozen, Steven G.
Alexandrov, Ludmil B.
Mutational signature analysis is commonly performed in cancer genomic studies. Here, we present SigProfilerExtractor, an automated tool for de novo extraction of mutational signatures, and benchmark it against 13 other bioinformatics tools by using 34 scenarios encompassing 2,500 simulated signatures found in 60,000 synthetic genomes and 20,000 synthetic exomes. For simulations with 5% noise, reflecting high-quality datasets, SigProfilerExtractor outperforms other approaches by elucidating between 20% and 50% more true-positive signatures while yielding 5-fold fewer false-positive signatures. Applying SigProfilerExtractor to 4,643 whole-genome- and 19,184 whole-exome-sequenced cancers reveals four novel signatures. Two of the signatures are confirmed in independent cohorts, and one of these signatures is associated with tobacco smoking. In summary, this report provides a reference tool for analysis of mutational signatures, a comprehensive benchmarking of bioinformatics tools for extracting signatures, and several novel mutational signatures, including one putatively attributed to direct tobacco smoking mutagenesis in bladder tissues.
Cell Genomics
Citation Formats
S. M. A. Islam et al., “Uncovering novel mutational signatures by de novo extraction with SigProfilerExtractor,” Cell Genomics, vol. 2, no. 11, 2022, Accessed: 00, 2023. [Online]. Available: https://hdl.handle.net/11511/102912.