Prediction of transmembrane regions of G protein-coupled receptors using machine learning techniques

2019
Çınar, Muazzez Çelebi
G protein-coupled receptors (GPCRs) are one of the largest and most significant membrane receptor families in eukaryotes. They transmit extracellular stimuli to the inside of the cell by undergoing conformational changes. GPCRs recognize a diverse range of extracellular ligands, including hormones, neurotransmitters, odorants, photons, and ions. These receptors are associated with a variety of human diseases, such as cancer and central nervous system disorders, and are among the most important targets for the pharmaceutical industry. They have seven transmembrane helices that contain essential regions such as ligand binding sites, actuator protein (e.g. G protein) binding sites, and cholesterol binding sites. There is a large gap in topology data for membrane proteins because of experimental limitations resulting from the instability of the membrane. In UniProt, a freely available database of protein sequences and structural and functional information, only 29 of the thousands of GPCRs have experimentally solved transmembrane (TM) region data. The topology information for other membrane proteins is provided by the TMHMM prediction tool, which is based on hidden Markov models. However, TMHMM incorrectly predicts the total number of TM regions for 6 of the 29 experimentally determined GPCRs. In this study, we aim to develop a GPCR-specific TM prediction algorithm using machine learning techniques. The algorithm is based on the hydrophobicity of each amino acid in the protein sequence and on the secondary structure. For hydrophobicity, the Moon-Fleming and Kyte-Doolittle scales are implemented separately. The secondary structures are derived from the JPred server. With this algorithm, we obtain more than 85% accuracy with a higher true positive rate. The results could inform further research and facilitate structure-based drug discovery, opening therapeutic opportunities for many diseases.
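
The per-residue hydrophobicity feature mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: it applies the published Kyte-Doolittle scale in a plain sliding window and flags hydrophobic stretches with a fixed cutoff, whereas the study combines hydrophobicity with JPred secondary structure in a machine learning model. The function names, the window length of 19, the cutoff of 1.6, and the minimum segment length are assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def windowed_hydropathy(sequence, window=19):
    """Return the mean hydropathy of a window centered on each residue.

    Windows are truncated at the sequence ends. The window length is an
    illustrative assumption, not a value taken from the thesis.
    """
    half = window // 2
    scores = []
    for i in range(len(sequence)):
        start = max(0, i - half)
        end = min(len(sequence), i + half + 1)
        values = [KYTE_DOOLITTLE[aa] for aa in sequence[start:end]]
        scores.append(sum(values) / len(values))
    return scores


def candidate_tm_segments(scores, threshold=1.6, min_len=15):
    """Return (start, end) index pairs, end exclusive, for runs of residues
    whose windowed score stays at or above the threshold.

    Both the threshold and the minimum length are hypothetical defaults.
    """
    segments = []
    run_start = None
    for i, score in enumerate(scores):
        if score >= threshold and run_start is None:
            run_start = i
        elif score < threshold and run_start is not None:
            if i - run_start >= min_len:
                segments.append((run_start, i))
            run_start = None
    # Close a run that extends to the end of the sequence.
    if run_start is not None and len(scores) - run_start >= min_len:
        segments.append((run_start, len(scores)))
    return segments


if __name__ == "__main__":
    # Toy sequence: a leucine stretch flanked by charged residues.
    toy = "D" * 30 + "L" * 25 + "K" * 30
    print(candidate_tm_segments(windowed_hydropathy(toy)))
```

In a learning-based setting such as the one the abstract describes, the windowed scores would more plausibly serve as input features alongside the predicted secondary structure than be thresholded directly; a second scale such as Moon-Fleming could be substituted by replacing the lookup table.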