A comparative analysis on ancient genomic data: The impact of variant discovery approach on population genetics tests

2022-10
Taç, İdil
Işıldak, Ulaş
Çubukcu, Hande
Karadavut, Damla
Vural, Kıvılcım Başak
Altınışık, Ezgi
Erdal, Yılmaz Selim
Somel, Mehmet
Özer, Füsun
Yet, İdil
Kılınç, Gülşah Merve
Ancient DNA research works with genomic material obtained from biological samples that are highly degraded and have not been stored under controlled conditions. Advances in sequencing techniques and in bioinformatics methods such as imputation improve the quality of the resulting data and allow us to make maximum use of the available material. Because most ancient genomes are low-coverage, the pipelines applied to modern genome datasets cannot be followed directly. Diploid variant calling can be performed reliably on modern genomes owing to their high coverage, whereas ancient genomes with <1x coverage usually undergo a pseudo-haploidization procedure. In this study, we used approaches based either on pseudo-haploidization or on read depths to calculate and compare the allele frequencies of neutral and disease-related SNPs in Neolithic Anatolian individuals. We calculated the minor allele frequency for each SNP using (i) random base selection with the ANGSD tool followed by pseudo-haploidization with the PLINK program, and (ii) diploid variant calling with samtools mpileup followed by an algorithm that uses read depths and finds the values that maximize the log-likelihood, calculated for each population from the binomial probability distribution. The aim of the second method is to maximize the use of low-coverage ancient genome sequence data by keeping all the information that variant calling provides about the locus of interest, without pseudo-haploidization. A one-way ANOVA was used to test whether the methods differed significantly (p < 2e-16 for neutral SNPs and p < 2e-16 for type 2 diabetes-related SNPs), and a Tukey test was performed as a post-hoc test; p values for method 1 vs method 2 were 0.00 for neutral SNPs and 3.64e-11 for phenotype-related SNPs.
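
The contrast between the two approaches can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Hardy-Weinberg genotype prior, the fixed sequencing error rate `err`, the grid search, and all function names are assumptions introduced here for illustration. Each individual is represented by a pair (alternate-allele reads, total reads) at a single SNP.

```python
import math
import random


def binom_pmf(k, n, q):
    """Binomial probability of k alternate reads out of n, given per-read probability q."""
    return math.comb(n, k) * q**k * (1 - q) ** (n - k)


def log_likelihood(p, reads, err=0.01):
    """Log-likelihood of allele frequency p for one population at one SNP.

    reads: list of (alt_reads, total_reads), one pair per individual.
    Assumption: genotypes follow Hardy-Weinberg proportions, and a read shows
    the alternate allele with probability err, 0.5 or 1 - err for genotypes
    carrying 0, 1 or 2 alternate alleles.
    """
    priors = ((1 - p) ** 2, 2 * p * (1 - p), p**2)
    alt_prob = (err, 0.5, 1 - err)
    total = 0.0
    for k, n in reads:
        if n == 0:
            continue  # an individual without reads carries no information
        site = sum(pr * binom_pmf(k, n, q) for pr, q in zip(priors, alt_prob))
        total += math.log(site)
    return total


def ml_allele_frequency(reads, grid_size=1001, err=0.01):
    """Grid search for the allele frequency that maximizes the log-likelihood."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda p: log_likelihood(p, reads, err))


def pseudo_haploid_frequency(reads, rng):
    """Draw one random read per covered individual and average the haploid calls."""
    calls = [1 if rng.random() < k / n else 0 for k, n in reads if n > 0]
    return sum(calls) / len(calls)


if __name__ == "__main__":
    # Hypothetical low-coverage read counts for six individuals.
    example = [(0, 2), (1, 1), (0, 1), (2, 3), (0, 0), (1, 2)]
    print("read-depth ML estimate:", ml_allele_frequency(example))
    print("pseudo-haploid estimate:", pseudo_haploid_frequency(example, random.Random(1)))
```

The pseudo-haploid estimate discards all but one read per individual, while the likelihood-based estimate uses every read, which is the motivation stated for the second method.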

Citation Formats
İ. Taç et al., “A comparative analysis on ancient genomic data: The impact of variant discovery approach on population genetics tests,” Erdemli, Mersin, TÜRKİYE, 2022, p. 3026, Accessed: 00, 2023. [Online]. Available: https://hibit2022.ims.metu.edu.tr.