Prediction of enzyme classes in a hierarchical approach by using SPMap
Date
2010-04-01
Author
Yaman, A.
Atalay, Mehmet Volkan
Atalay, Rengül
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Enzymes are proteins that catalyze biochemical reactions. The International Enzyme Commission (EC) classifies them hierarchically according to the reactions they catalyze. The scheme is a four-level tree in which each enzyme class is assigned a unique number. The top level has six major classes defined by the type of reaction carried out, and the sub-classes at the lower levels correspond to more specific reactions within those classes. The aim of this study was to build a three-level classification model based on the hierarchical structure of the EC classes. The ENZYME database was used to obtain the EC classes, and enzymes were then assigned to these classes. Features were extracted from the primary sequences of enzymes taken from the UniProtKB/Swiss-Prot database, using a subsequence-based feature extraction method, Subsequence Profile Map (SPMap). SPMap is a discriminative method that explicitly models the differences between positive and negative examples, and it considers the conserved subsequences of protein sequences in the same class. SPMap builds the feature vector of each protein from the probabilities of its fixed-length subsequences with respect to a probabilistic profile matrix calculated by clustering similar subsequences in the training data set. In our case, positive and negative training data sets were prepared for each class at each level of the tree. SPMap was used for feature extraction and Support Vector Machines (SVMs) were used for classification. Five-fold cross-validation was used to test the performance of the system. The overall sensitivity, specificity and AUC values for the six major EC classes are 93.08%, 98.95% and 0.993, respectively. The results at the second and third levels were comparable to those for the six major classes.
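The abstract describes preparing positive and negative training sets for each class at each level of the EC tree, and reports sensitivity and specificity. The following is a minimal Python sketch of those two steps, not the authors' implementation. The protein identifiers, the function names, and in particular the rule for choosing negatives (proteins in sibling classes under the same parent) are assumptions made for illustration; the abstract does not state how negatives were selected. SPMap feature extraction and SVM training are not shown.

```python
from collections import defaultdict


def ec_prefix(ec_number, level):
    """Return the first `level` fields of an EC number ('1.2.1.3', 2 -> '1.2')."""
    return ".".join(ec_number.split(".")[:level])


def build_training_sets(annotations, level):
    """Map each EC class at `level` to (positives, negatives).

    Assumption (not from the abstract): negatives are proteins that share
    the same parent class but differ at this level. At level 1 the parent
    is the root, so negatives are all other enzymes.
    """
    by_parent = defaultdict(list)
    for protein_id, ec in annotations.items():
        parent = ec_prefix(ec, level - 1)
        by_parent[parent].append((protein_id, ec_prefix(ec, level)))

    datasets = {}
    for members in by_parent.values():
        for cls in {c for _, c in members}:
            positives = sorted(p for p, c in members if c == cls)
            negatives = sorted(p for p, c in members if c != cls)
            datasets[cls] = (positives, negatives)
    return datasets


def sensitivity_specificity(y_true, y_pred):
    """Compute sensitivity TP/(TP+FN) and specificity TN/(TN+FP).

    Labels are 1 (positive) or 0 (negative); both classes must be present.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical annotations: protein identifier -> full EC number.
annotations = {
    "P1": "1.1.1.1",
    "P2": "1.1.1.2",
    "P3": "1.2.1.3",
    "P4": "2.7.1.1",
}

level1 = build_training_sets(annotations, 1)
level2 = build_training_sets(annotations, 2)
```

With these toy annotations, `level1["1"]` holds P1, P2 and P3 as positives and P4 as the only negative, while `level2["1.1"]` holds P1 and P2 as positives and P3 as the negative. Restricting negatives to sibling classes keeps each lower-level classifier focused on the distinctions that matter once the parent class has been predicted, but using all remaining enzymes as negatives would be an equally plausible reading of the abstract.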
Subject Keywords
Biochemical research methods, Biotechnology, Applied microbiology
URI
https://hdl.handle.net/11511/32291
Journal
NEW BIOTECHNOLOGY
DOI
https://doi.org/10.1016/j.nbt.2010.01.023
Collections
Graduate School of Informatics, Article