Machine learning methods for promoter region prediction

Arslan, Hilal
Promoter classification is the task of separating promoter sequences from non-promoter sequences. Determining the promoter regions where transcription initiation takes place is important for several reasons, such as improving genome annotation and defining transcription start sites. In this study, three promoter prediction methods, called ProK-means, ProSVM, and 3S1C, are proposed. The ProSVM and ProK-means algorithms use structural features of DNA sequences to distinguish promoters from non-promoters. The results are compared with those of ProSOM, an existing promoter prediction method, and ProSVM is shown to achieve a higher recall rate than ProSOM. The other promoter prediction method proposed in this study is 3S1C. It differs from existing methods in that it uses the signal, similarity, structure, and context features of DNA sequences in an integrated and hierarchical manner. Unlike current promoter classification methods, the proposed system also includes a similarity feature, which compares promoter regions between human and other species. We show that the similarity feature improves accuracy. To classify core promoter regions, the signal, similarity, structure, and context features are first extracted, and each feature group is then classified separately using Support Vector Machines. Finally, the output predictions are combined using a multilayer perceptron. The results of the 3S1C algorithm are very promising.
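The two-stage idea described for 3S1C (one classifier per feature group, followed by a combiner over their outputs) can be illustrated with a minimal, standard-library-only sketch. It is not the thesis implementation: the data are synthetic, every function name is hypothetical, the per-group classifiers are plain linear SVMs fitted by subgradient descent, and a single logistic unit stands in for the multilayer perceptron to keep the example short.

```python
import math
import random

GROUPS = ("signal", "similarity", "structure", "context")

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM fitted by batch subgradient descent on the hinge loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [lam * wi for wi in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1.0:  # margin violated
                gw = [g - yi * xj / n for g, xj in zip(gw, xi)]
                gb -= yi / n
        w = [wi - lr * g for wi, g in zip(w, gw)]
        b -= lr * gb
    return w, b

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def train_combiner(S, y, lr=0.1, epochs=300):
    """Logistic unit over the per-group SVM scores (stand-in for the MLP)."""
    w, b, n = [0.0] * len(S[0]), 0.0, len(S)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for si, yi in zip(S, y):
            err = sigmoid(dot(w, si) + b) - (1.0 if yi == 1 else 0.0)
            gw = [g + err * sj / n for g, sj in zip(gw, si)]
            gb += err / n
        w = [wi - lr * g for wi, g in zip(w, gw)]
        b -= lr * gb
    return w, b

def make_toy_data(n, rng, shift=1.5, dim=2):
    """Synthetic stand-in data: one small feature vector per group and sample."""
    data, labels = {g: [] for g in GROUPS}, []
    for _ in range(n):
        y = rng.choice((1, -1))  # +1 = promoter, -1 = non-promoter
        labels.append(y)
        for g in GROUPS:
            data[g].append([y * shift + rng.gauss(0.0, 1.0) for _ in range(dim)])
    return data, labels

def group_scores(models, data):
    """Stack the signed SVM scores of all groups into one row per sample."""
    n = len(data[GROUPS[0]])
    return [[dot(models[g][0], data[g][i]) + models[g][1] for g in GROUPS]
            for i in range(n)]

def run_demo(seed=0):
    rng = random.Random(seed)
    train, y_train = make_toy_data(200, rng)
    test, y_test = make_toy_data(100, rng)
    # Stage 1: one SVM per feature group.
    models = {g: train_linear_svm(train[g], y_train) for g in GROUPS}
    # Stage 2: combine the four scores. Simplification: the combiner is
    # fitted on training-set scores rather than on held-out scores.
    cw, cb = train_combiner(group_scores(models, train), y_train)
    preds = [1 if dot(cw, s) + cb > 0 else -1
             for s in group_scores(models, test)]
    return sum(p == t for p, t in zip(preds, y_test)) / len(y_test)

if __name__ == "__main__":
    print("held-out accuracy on toy data:", run_demo())
```

Calling run_demo() returns the held-out accuracy on the toy data. With real sequences, each group's feature extraction would replace make_toy_data, and a multilayer perceptron would replace the logistic combiner.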


Predicting the effect of hydrophobicity surface on binding affinity of PCP-like compounds using machine learning methods
Yoldaş, Mine; Alpaslan, Ferda Nur; Büyükbingöl, Erdem; Department of Computer Engineering (2011)
This study aims to predict the binding affinity of PCP-like compounds by means of molecular hydrophobicity. Molecular hydrophobicity is an important property that affects the binding affinity of molecules. The molecular hydrophobicity values are obtained on a three-dimensional coordinate system. Our aim is to reduce the number of points on the hydrophobicity surface of the molecules. This is modeled by using self-organizing maps (SOM) and k-means clustering. The feature sets obtained fro...
Highly efficient polymer blends from a polyfluorene derivative and PVK for LEDs
NOWACKI, Bruno; IAMAZAKI, Eduardo; Çırpan, Ali; KARASZ, Frank; ATVARS, Teresa D.Z.; AKCELRUD, Leni (Elsevier BV, 2009-11-27)
The photophysical and electroluminescent properties of blends of a polyfluorene derivative of the PPV type, poly[(9,9-dihexyl-9H-fluorene-2,7-diyl)-1,2-ethenediyl-1,4-phenylene-1,2-ethenediyl] (labeled as LaPPS16) and poly(vinylcarbazole) - PVK are presented and discussed in terms of the operating light emission mechanisms. Static and dynamic fluorescence measurements and morphology data showed a powerful exciton migration from the host (PVK) to the guest (LaPPS16) resulting in emission coming from solely L...
An Extension to GOPred to annotate swiss-prot and trembl sequences for all gene ontology categories and EC numbers
Rifaioğlu, Ahmet Süreyya; Toroslu, İsmail Hakkı; Atalay, Rengül; Department of Computer Engineering (2015)
Traditional protein function annotation methods cannot keep up with the annotation of proteins, as the number of proteins whose sequences are known is increasing exponentially. For this reason, protein function prediction has become an important research area. In this thesis, an improved version of the GOPred method is used for the protein function prediction problem. GOPred consists of the SPMap, Blast-kNN and Pepstats methods, which are subsequence-, similarity- and feature-based methods, respectively. Previous version of GOPred method u...
Effect of oxygen transfer conditions on recombinant protein production by Pichia pastoris under glyceraldehyde 3 phosphate dehydrogenase promoter
Güneş, Hande; Çalık, Pınar (2016-04-06)
Glyceraldehyde-3-phosphate dehydrogenase promoter, PGAP, is known to be a strong constitutive promoter of Pichia pastoris expression systems that can achieve expression levels comparable to those of the alcohol oxidase I promoter, PAOX1. Oxygen transfer strategies need to be fine-tuned in order to obtain high expression levels, as P. pastoris is vulnerable to changes in oxygen transfer conditions. In this study, the effect of oxygen transfer conditions on recombinant glucose ...
Theoretical study of tetramethyl- and tetra-tert-butyl-substituted cyclobutadiene and tetrahedrane
Balcı, Metin; McKee, ML; Schleyer, PV (2000-02-17)
The tetramethyl and tetra-tert-butyl derivatives of cyclobutadiene and tetrahedrane have been studied with ab initio and density functional methods. The ring in tetra-tert-butylcyclobutadiene displays very unequal bond lengths (1.354, 1.608 Å) and confirms the earlier suspicion that the low-temperature X-ray structure was distorted. The C-C single bonds have the longest separations found to date between sp2-hybridized carbons. Tetra-tert-butyltetrahedrane, which prefers T over Td symmetry, is calcu...
Citation Formats
H. Arslan, “Machine learning methods for promoter region prediction,” M.S. - Master of Science, Middle East Technical University, 2011.