CONGA: Copy number variation genotyping in ancient genomes and low-coverage sequencing data

2022-12-01
Söylev, Arda
Çokoglu, Sevim Seda
KOPTEKİN, DİLEK
Alkan, Can
Somel, Mehmet
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

To date, ancient genome analyses have been largely confined to the study of single nucleotide polymorphisms (SNPs). Copy number variants (CNVs) are a major contributor to disease and to evolutionary adaptation, but identifying CNVs in ancient shotgun-sequenced genomes is hampered by typical low genome coverage ([…] 1 kbps with F-scores >0.75 at ≥1×, and distinguish between heterozygous and homozygous states. We used CONGA to genotype 10,002 outgroup-ascertained deletions across a heterogeneous set of 71 ancient human genomes spanning the last 50,000 years, produced using variable experimental protocols. A fraction of these (21/71) display divergent deletion profiles unrelated to their population origin, but attributable to technical factors such as coverage and read length. The majority of the sample (50/71), despite originating from nine different laboratories and having coverages ranging from 0.44× to 26× (median 4×) and average read lengths of 52-121 bps (median 69), exhibit coherent deletion frequencies. Across these 50 genomes, inter-individual genetic diversity measured using SNPs and using CONGA-genotyped deletions are highly correlated. CONGA-genotyped deletions also display purifying selection signatures, as expected. CONGA thus paves the way for systematic CNV analyses in ancient genomes, despite the technical challenges posed by low and variable genome coverage.
PLoS Computational Biology

Citation Formats
A. Söylev, S. S. Çokoglu, D. KOPTEKİN, C. Alkan, and M. Somel, “CONGA: Copy number variation genotyping in ancient genomes and low-coverage sequencing data,” PLoS Computational Biology, vol. 18, no. 12, pp. 0–0, 2022, Accessed: 00, 2023. [Online]. Available: https://hdl.handle.net/11511/101825.