DeepKin: Predicting Relatedness From Low-Coverage Genomes and Palaeogenomes With Convolutional Neural Networks


GÜLER M., YILMAZ A., KATIRCIOĞLU B., Kantar S., Ünver T. E., Vural K. B., et al.

Molecular Ecology Resources, vol.25, no.8, 2025 (SCI-Expanded, Scopus)

  • Publication Type: Article
  • Volume: 25 Issue: 8
  • Publication Date: 2025
  • Doi Number: 10.1111/1755-0998.70032
  • Journal Name: Molecular Ecology Resources
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Aquatic Science & Fisheries Abstracts (ASFA), BIOSIS, CAB Abstracts, Environment Index, MEDLINE, Veterinary Science Database
  • Keywords: ancient DNA, bioinformatics/phyloinformatics, convolutional neural networks, deep learning, palaeogenomics
  • Middle East Technical University Affiliated: Yes

Abstract

DeepKin is a novel tool designed to predict relatedness from genomic data using convolutional neural networks (CNNs). Traditional methods for estimating relatedness often struggle when genomic data are limited, as with palaeogenomes and degraded forensic samples. DeepKin addresses this challenge with two CNN models, trained solely on simulated genomic data, that classify relatedness up to the third degree and identify parent–offspring and sibling pairs. Our benchmarking shows that DeepKin performs comparably to or better than the widely used tool READv2. We validated DeepKin, which takes PLINK's .map and .ped files as input, on empirical palaeogenomes from three archaeological sites, demonstrating its robustness and adaptability across different genetic backgrounds, with accuracy above 90% for pairs sharing more than 10,000 SNPs. By capturing information across genomic segments, DeepKin offers a new methodological path for relatedness estimation in settings with highly degraded samples, with applications in ancient DNA as well as forensic and conservation genetics.
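
Because the reported accuracy depends on the number of SNPs shared by a pair, it can help to check that count before interpreting a prediction. The sketch below is not part of DeepKin; it is a minimal illustration that assumes the standard PLINK .ped layout (six leading columns, then two alleles per variant, with `0` marking a missing call). The function names `read_ped_genotypes` and `count_shared_snps` and the example individuals are hypothetical.

```python
def read_ped_genotypes(ped_text):
    """Parse PLINK .ped text into {individual_id: [(allele1, allele2), ...]}.

    Assumes the standard layout: family ID, individual ID, paternal ID,
    maternal ID, sex, phenotype, then two allele columns per variant.
    """
    genotypes = {}
    for line in ped_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        individual_id = fields[1]
        alleles = fields[6:]
        # Pair consecutive allele columns into one genotype per variant.
        genotypes[individual_id] = list(zip(alleles[0::2], alleles[1::2]))
    return genotypes


def count_shared_snps(geno_a, geno_b, missing="0"):
    """Count variants with a non-missing genotype in both individuals."""
    return sum(
        1
        for (a1, a2), (b1, b2) in zip(geno_a, geno_b)
        if missing not in (a1, a2) and missing not in (b1, b2)
    )


# Hypothetical two-individual .ped content with four variants.
example_ped = """\
FAM1 ind1 0 0 1 -9 A A 0 0 G G C C
FAM1 ind2 0 0 2 -9 A A T T 0 0 C C
"""

genos = read_ped_genotypes(example_ped)
shared = count_shared_snps(genos["ind1"], genos["ind2"])
print(f"Shared SNPs between ind1 and ind2: {shared}")
```

In this toy input only the first and fourth variants are called in both individuals. For real data, the resulting count can be compared against the roughly 10,000 shared SNPs above which the abstract reports accuracy over 90%.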