DeepKin: Predicting Relatedness From Low-Coverage Genomes and Palaeogenomes With Convolutional Neural Networks

2025-01-01
Güler, Murat
Yılmaz, Ardan
Katırcıoğlu, Büşra
Kantar, Sarp
Ünver, Tara Ekin
Vural, Kıvılcım Başak
Altınışık, Nefize Ezgi
Akbaş, Emre
Somel, Mehmet
DeepKin is a novel tool designed to predict relatedness from genomic data using convolutional neural networks (CNNs). Traditional methods for estimating relatedness often struggle when genomic data are limited, as with palaeogenomes and degraded forensic samples. DeepKin addresses this challenge with two CNN models, trained solely on simulated genomic data, that classify relatedness up to the third degree and identify parent–offspring and sibling pairs. Our benchmarking shows that DeepKin performs comparably to or better than the widely used tool READv2. We validated DeepKin, which takes PLINK .map and .ped files as input, on empirical palaeogenomes from three archaeological sites, demonstrating its robustness and adaptability across different genetic backgrounds, with accuracy above 90% for pairs sharing more than 10,000 SNPs. By capturing information across genomic segments, DeepKin offers a new methodological path for relatedness estimation in settings with highly degraded samples, with applications in ancient DNA as well as forensic and conservation genetics.
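To illustrate the input format and the shared-SNP figure mentioned in the abstract, the following is a minimal sketch of how one might count the SNPs at which two individuals in a PLINK .ped file both have a called genotype. It is not DeepKin's code: the function names, the toy rows, and the use of 10,000 as a hard cutoff are assumptions made for illustration. It relies only on the standard .ped layout (six leading columns, then two alleles per SNP, with "0" marking a missing allele).

```python
# Hypothetical helper names; this is an illustration, not part of DeepKin.

# Illustrative cutoff taken from the abstract's "10 K shared SNPs" figure.
MIN_SHARED_SNPS = 10_000


def parse_ped_line(line):
    """Split one PLINK .ped row into (individual ID, list of allele pairs)."""
    fields = line.split()
    iid = fields[1]        # column 2 is the within-family individual ID
    alleles = fields[6:]   # columns 1-6 are FID, IID, PID, MID, sex, phenotype
    genotypes = list(zip(alleles[0::2], alleles[1::2]))
    return iid, genotypes


def count_shared_snps(geno_a, geno_b, missing="0"):
    """Count sites where both individuals have a non-missing genotype."""
    return sum(
        1
        for (a1, a2), (b1, b2) in zip(geno_a, geno_b)
        if missing not in (a1, a2) and missing not in (b1, b2)
    )


# Toy data: two individuals, four SNPs, one missing site each.
ped_rows = [
    "FAM1 IND1 0 0 1 -9 A A 0 0 G G C C",
    "FAM1 IND2 0 0 2 -9 A A T T 0 0 C C",
]

(id_a, geno_a), (id_b, geno_b) = (parse_ped_line(row) for row in ped_rows)
shared = count_shared_snps(geno_a, geno_b)
enough_overlap = shared >= MIN_SHARED_SNPS

print(f"{id_a} vs {id_b}: {shared} shared SNPs, enough overlap: {enough_overlap}")
```

In practice the rows would be read from a real .ped file, and a check of this kind would only indicate whether a pair falls in the range where the abstract reports accuracy above 90%.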
Molecular Ecology Resources
Citation Formats
M. Güler et al., “DeepKin: Predicting Relatedness From Low-Coverage Genomes and Palaeogenomes With Convolutional Neural Networks,” Molecular Ecology Resources, 2025. [Online]. Available: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=105013569198&origin=inward.