De Novo SNP calling and demographic inference using trio genome data

2019
Bozlak, Elif.
De novo mutations are novel mutations that are found in the offspring but in neither parent, and therefore do not follow Mendelian inheritance. Determining how many de novo mutations occur is important for genetic studies, because these counts help in understanding the evolutionary history of populations. In this thesis, we aim to examine de novo mutations that occur within one generation in domestic horses and to make estimates of horse demographic history. We used DNA-sequencing data produced by next-generation sequencing technologies from trios of three different horse breeds: Lipizzaner, Noriker, and Haflinger. After quality checks and mapping of the raw data, we called genomic variants with three different variant calling algorithms. We filtered all variants by quality to detect de novo candidates, and the final 50 de novo candidates were tested using Sanger resequencing. About 40% of the candidate variants could be validated. We found more true positives in the high-coverage Lipizzaner data (n=13) than in the low-coverage Noriker (n=3) and Haflinger (n=5) data, which shows the importance of sequencing coverage for detecting true de novo mutations. In addition, we used the Pairwise Sequentially Markovian Coalescent (PSMC) model and performed runs of homozygosity (ROH) analyses to estimate demographic history. Both the PSMC and ROH results were consistent with previous studies. Overall, the results give an indication of the minimum coverage and quality of whole-genome sequencing data needed to determine de novo mutations and to estimate population demography.
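
The core idea of de novo candidate detection in a trio, as described in the abstract, can be illustrated with a minimal Python sketch. It is not the thesis pipeline: the `TrioSite` record, the function names, and the `min_depth` threshold of 10 are hypothetical and chosen only for illustration. The sketch flags a site when the offspring carries an allele seen in neither parent and all three individuals have sufficient read depth.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TrioSite:
    """Genotypes and read depths at one site for a trio (hypothetical record)."""
    chrom: str
    pos: int
    child_gt: Tuple[int, int]   # allele indices, 0 = reference allele
    sire_gt: Tuple[int, int]
    dam_gt: Tuple[int, int]
    child_depth: int
    sire_depth: int
    dam_depth: int


def is_de_novo_candidate(site: TrioSite, min_depth: int = 10) -> bool:
    """Return True if the child carries an allele absent from both parents.

    min_depth is an illustrative threshold, not a value taken from the thesis.
    """
    # Low coverage in any trio member makes the call unreliable, so skip it.
    if min(site.child_depth, site.sire_depth, site.dam_depth) < min_depth:
        return False
    parental_alleles = set(site.sire_gt) | set(site.dam_gt)
    return any(allele not in parental_alleles for allele in site.child_gt)


def find_candidates(sites: List[TrioSite], min_depth: int = 10) -> List[TrioSite]:
    """Keep only the sites that pass the de novo candidate filter."""
    return [s for s in sites if is_de_novo_candidate(s, min_depth)]


example_sites = [
    # Child is heterozygous, both parents homozygous reference: candidate.
    TrioSite("chr1", 100, (0, 1), (0, 0), (0, 0), 30, 28, 31),
    # Alternate allele is inherited from the sire: not a candidate.
    TrioSite("chr1", 200, (0, 1), (0, 1), (0, 0), 30, 30, 30),
    # Same pattern as the first site, but the sire has low coverage: rejected.
    TrioSite("chr2", 300, (0, 1), (0, 0), (0, 0), 30, 4, 30),
]

candidates = find_candidates(example_sites)
```

The third example site shows why coverage matters: with few reads for a parent, a heterozygous parental genotype can be missed and an inherited variant can look like a de novo one.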

Citation Formats
E. Bozlak, “De Novo SNP calling and demographic inference using trio genome data,” M.S. thesis, Graduate School of Informatics, Bioinformatics, Middle East Technical University, 2019.