The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens
Date
2019-11-19
Author
Zhou, N
Jiang, Y
Bergquist, TR
Lee, AJ
Kacsoh, BZ
Crocker, AW
Lewis, KA
Georghiou, G
Nguyen, HN
Hamid, MN
Davis, L
Dogan, T
Atalay, Mehmet Volkan
Rifaioğlu, Ahmet Süreyya
Dalkıran, A
Atalay, Rengül
Zhang, C
Hurto, RL
Freddolino, PL
Zhang, Y
Bhat, P
Supek, F
Fernández, JM
Gemovic, B
Perovic, VR
Davidović, RS
Sumonja, N
Veljkovic, N
Asgari, E
Mofrad, MRK
Profiti, G
Savojardo, C
Martelli, PL
Casadio, R
Boecker, F
Schoof, H
Kahanda, I
Thurlby, N
McHardy, AC
Renaux, A
Saidi, R
Gough, J
Freitas, AA
Antczak, M
Fabris, F
Wass, MN
Hou, J
Cheng, J
Wang, Z
Romero, AE
Paccanaro, A
Yang, H
Goldberg, T
Zhao, C
Holm, L
Törönen, P
Medlar, AJ
Zosa, E
Borukhov, I
Novikov, I
Wilkins, A
Lichtarge, O
Chi, PH
Tseng, WC
Linial, M
Rose, PW
Dessimoz, C
Vidulin, V
Dzeroski, S
Sillitoe, I
Das, S
Lees, JG
Jones, DT
Wan, C
Cozzetto, D
Fa, R
Torres, M
Warwick Vesztrocy, A
Rodriguez, JM
Tress, ML
Frasca, M
Notaro, M
Grossi, G
Petrini, A
Re, M
Valentini, G
Mesiti, M
Roche, DB
Reeb, J
Ritchie, DW
Aridhi, S
Alborzi, SZ
Devignes, MD
Koo, DCE
Bonneau, R
Gligorijević, V
Barot, M
Fang, H
Toppo, S
Lavezzo, E
Falda, M
Berselli, M
Tosatto, SCE
Carraro, M
Piovesan, D
Ur Rehman, H
Mao, Q
Zhang, S
Vucetic, S
Black, GS
Jo, D
Suh, E
Dayton, JB
Larsen, DJ
Omdahl, AR
McGuffin, LJ
Brackenridge, DA
Babbitt, PC
Yunes, JM
Fontana, P
Zhang, F
Zhu, S
You, R
Zhang, Z
Dai, S
Yao, S
Tian, W
Cao, R
Chandler, C
Amezola, M
Johnson, D
Chang, JM
Liao, WH
Liu, YW
Pascarelli, S
Frank, Y
Hoehndorf, R
Kulmanov, M
Boudellioua, I
Politano, G
Di Carlo, S
Benso, A
Hakala, K
Ginter, F
Mehryary, F
Kaewphan, S
Björne, J
Moen, H
Tolvanen, MEE
Salakoski, T
Kihara, D
Jain, A
Šmuc, T
Altenhoff, A
Ben-Hur, A
Rost, B
Brenner, SE
Orengo, CA
Jeffery, CJ
Bosco, G
Hogan, DA
Martin, MJ
O'Donovan, C
Mooney, SD
Greene, CS
Radivojac, P
Friedberg, I
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Abstract

Background: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function.

Results: Here, we report on the results of the third CAFA challenge, CAFA3, which featured an expanded analysis over the previous CAFA rounds, both in the volume of data analyzed and in the types of analysis performed. In a major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster that we suspected of being involved in long-term memory.

Conclusion: We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.
Subject Keywords
Protein function prediction, Long-term memory, Biofilm, Critical assessment, Community challenge
URI
https://hdl.handle.net/11511/31645
Journal
GENOME BIOLOGY
DOI
https://doi.org/10.1186/s13059-019-1835-8
Collections
Graduate School of Natural and Applied Sciences, Article
Citation (IEEE)
N. Zhou et al., “The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens,” Genome Biology, 2019. [Online]. Available: https://hdl.handle.net/11511/31645.