Show/Hide Menu
Hide/Show Apps
Logout
Türkçe
Türkçe
Search
Search
Login
Login
OpenMETU
OpenMETU
About
About
Open Science Policy
Open Science Policy
Open Access Guideline
Open Access Guideline
Postgraduate Thesis Guideline
Postgraduate Thesis Guideline
Communities & Collections
Communities & Collections
Help
Help
Frequently Asked Questions
Frequently Asked Questions
Guides
Guides
Thesis submission
Thesis submission
MS without thesis term project submission
MS without thesis term project submission
Publication submission with DOI
Publication submission with DOI
Publication submission
Publication submission
Supporting Information
Supporting Information
General Information
General Information
Copyright, Embargo and License
Copyright, Embargo and License
Contact us
Contact us
DARKIN: a zero-shot benchmark for phosphosite–dark kinase association using protein language models
Date
2025-11-01
Author
Sunar, Emine Ayşe
Işık, Zeynep
Pekey, Mert
Cinbiş, Ramazan Gökberk
Tastan, Oznur
Metadata
Show full item record
This work is licensed under a
Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License
.
Item Usage Stats
9
views
0
downloads
Cite This
Motivation: Protein language models (pLMs) have emerged as powerful tools for capturing the intricate information encoded in protein sequences, facilitating various downstream protein prediction tasks. With numerous pLMs available, there is a critical need for diverse benchmarks to systematically evaluate their performance across biologically relevant tasks. Here, we introduce DARKIN, a zero-shot classification benchmark designed to assign phosphosites to understudied kinases, termed dark kinases. Kinases, which catalyze phosphorylation, are central to cellular signaling pathways. While phosphoproteomics enables the large-scale identification of phosphosites, determining the cognate kinase responsible for the phosphorylation event remains an experimental challenge. Results: In DARKIN, we prepared training, validation, and test folds that respect the zero-shot nature of this classification problem, incorporating stratification based on kinase groups and sequence similarity. We evaluated multiple pLMs using two zero-shot classifiers: a novel, training-free k-NN-based method, and a bilinear classifier. Our findings indicate that ESM, ProtT5-XL, and SaProt exhibit superior performance on this task. DARKIN provides a challenging benchmark for assessing pLM efficacy and fosters deeper exploration of under-characterized (dark) kinases by offering a biologically relevant test bed. Availability and implementation The DARKIN benchmark data and the scripts for generating additional splits are publicly available at: https://github.com/tastanlab/darkin
URI
https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=105020681519&origin=inward
https://hdl.handle.net/11511/117139
Journal
Bioinformatics
DOI
https://doi.org/10.1093/bioinformatics/btaf480
Collections
Department of Computer Engineering, Article
Citation Formats
IEEE
ACM
APA
CHICAGO
MLA
BibTeX
E. A. Sunar, Z. Işık, M. Pekey, R. G. Cinbiş, and O. Tastan, “DARKIN: a zero-shot benchmark for phosphosite–dark kinase association using protein language models,”
Bioinformatics
, vol. 41, no. 11, pp. 0–0, 2025, Accessed: 00, 2025. [Online]. Available: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=105020681519&origin=inward.