Assessment of genetic relatedness tools for ancient DNA using pedigree simulations

2023-8
Aktürk, Şevval
Understanding the biological relationships among individuals retrieved in archaeological excavations has long drawn attention in the ancient DNA field. The growing volume of SNP genotype and whole-genome sequence data has recently made it possible to estimate genetic relatedness using genome-wide markers. However, next-generation sequencing of ancient genomes at low depth leads to low precision in genotype calling due to missing alleles, post-mortem damage, and fragmentation. To overcome these challenges, different tools have been developed, including algorithms that rely on genotype likelihoods and population allele frequencies (e.g., NgsRelate and lcMLkin) and tools that compare genotype mismatch rates between a pair of individuals (e.g., READ). To systematically evaluate the reliability of these three most commonly used tools in the presence of a limited number of SNPs and inbreeding, we used ancient genome data produced from simulated pedigrees. Our results show that related pairs can be accurately classified as first-degree, even down to 1K shared SNPs, with F1 scores of 85% using READ and 96% using NgsRelate or lcMLkin. Distinguishing unrelated pairs from close relatives down to third-degree is possible with high accuracy (F1 = 99%) at 5K shared SNPs using NgsRelate and lcMLkin. Further, with 10K shared SNPs, NgsRelate and lcMLkin outperform READ in differentiating third-degree from second-degree relatedness, with F1 scores of 95% and 96% (80% for READ). Last, inbreeding (e.g., first-cousin mating) leads to the overestimation of kinship coefficients. These results are promising but also call for exploring novel approaches to kinship estimation with ultra-low-coverage genomes.
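To make the mismatch-rate idea mentioned above concrete, here is a minimal Python sketch. It is not READ's actual implementation. It assumes pseudo-haploid calls (one allele per site, coded 0/1, with `None` for missing data) and an externally supplied baseline mismatch rate for unrelated pairs. The function names and the normalization step are illustrative assumptions, not taken from the thesis.

```python
from typing import Optional, Sequence


def mismatch_rate(a: Sequence[Optional[int]],
                  b: Sequence[Optional[int]]) -> Optional[float]:
    """Fraction of mismatching alleles over sites called in BOTH individuals.

    Hypothetical helper: `a` and `b` are pseudo-haploid calls (0/1),
    with None marking a missing site. Returns None if no site is shared.
    """
    shared = 0
    mismatches = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # skip sites missing in either individual
        shared += 1
        if x != y:
            mismatches += 1
    return mismatches / shared if shared else None


def kinship_from_mismatch(p0: float, p0_unrelated: float) -> float:
    """Rough kinship coefficient from a normalized mismatch rate.

    Assumption: for pseudo-haploid data from non-inbred individuals, the
    mismatch rate relative to an unrelated baseline is about 1 - theta,
    so theta is about 0.25 for first-degree and 0.125 for second-degree pairs.
    """
    return 1.0 - p0 / p0_unrelated


ind1 = [0, 1, None, 1, 0]
ind2 = [0, 0, 1, 1, None]
p0 = mismatch_rate(ind1, ind2)  # 3 shared sites, 1 mismatch
```

With only a few thousand shared SNPs, the sampling noise in such a mismatch rate becomes large relative to the gap between adjacent relatedness degrees, which is consistent with the drop in classification performance reported at low SNP counts.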
Citation Formats
Ş. Aktürk, “Assessment of genetic relatedness tools for ancient DNA using pedigree simulations,” M.S. - Master of Science, Middle East Technical University, 2023.