ARTIFICIAL LEARNING-BASED ANALYSIS OF MOLECULAR, CLINICAL TRIALS AND PATENT DATA FOR IMPROVED DRUG DEVELOPMENT

Date

2022-8-31

Author

Çıray, Fulya

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Drug development is a costly process in terms of both time and money. Many promising drug candidates are eliminated at late development stages, e.g., phase II or III of clinical trials, due to insufficient efficacy or unexpected adverse health effects. Recently, pharmaceutical companies have been evaluating computational approaches to increase the efficiency of this process. In this thesis, we investigated the computational prediction of whether drug candidate compounds will be approved by regulatory bodies (i.e., approved for official use to treat the indicated disease) while the trial process is still ongoing, using machine learning and relevant information from earlier discovery and development stages. As a preliminary analysis, we examined drug substructures to observe whether the presence of specific molecular structures in drug candidates leads to undesirable outcomes (i.e., non-approval). In the main part of the study, we employed a wider and more heterogeneous set of features, including molecular and physicochemical properties of drugs together with clinical trial and patent-related features, to represent each drug-indication pair as a heterogeneous numerical vector. Following data gathering, manual curation and imputation, the finalized feature vectors were used to train independent random forest (RF) drug approval prediction models for 14 different disease groups. We achieved high prediction scores in our cross-validation-based performance evaluation, within the following ranges: accuracy 0.67-0.81, precision 0.77-0.82, recall 0.77-0.96, F1-score 0.77-0.88 and MCC 0.45-0.62. Furthermore, a temporal analysis showed that our method is also capable of producing successful results in a prospective manner.
We also compared the performance of our method against a baseline model and a state-of-the-art method from the literature; the results indicated both the robustness and the generalization capability of our approach. Additionally, we identified the most important features for accurately predicting drug approvals, which consist largely of clinical trial and patent-related features. In a use-case study, we showed that our method can successfully predict approved (phase IV) drugs that were later withdrawn from the market due to severe side effects. Finally, we used the pre-trained models to predict the approval of drug candidates that are currently in clinical trial phases I/II/III and presented the prediction results. We hope that the results of our study and the computational tool we present will contribute to the literature on evaluating and improving the drug development process. All of the datasets, source code, results and pre-trained models of this study are freely available at https://github.com/HUBioDataLab/DrugApp.
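
For readers unfamiliar with the evaluation metrics listed in the abstract (accuracy, precision, recall, F1-score and MCC), the following is a minimal sketch of how they are computed from binary approved/unapproved predictions. It is not taken from the thesis or the DrugApp repository: the function names `confusion_counts` and `binary_metrics` and the toy labels are hypothetical, and label 1 is assumed to mean "approved".

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives/negatives (hypothetical helper; 1 = approved)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def binary_metrics(tp, fp, tn, fn):
    """Standard definitions; each metric falls back to 0.0 if its denominator is zero."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # Matthews correlation coefficient: uses all four cells of the confusion matrix.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Toy labels for illustration only (not data from the study).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
metrics = binary_metrics(*confusion_counts(y_true, y_pred))
print(metrics)
```

MCC is reported alongside the other metrics because, unlike accuracy or F1-score, it accounts for true negatives, which makes it informative when approved and unapproved examples are imbalanced.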

Subject Keywords

Approval of drugs, clinical trials, drug patents, machine learning, predictive modeling

URI

https://hdl.handle.net/11511/99462

Collections

Graduate School of Informatics, Thesis

Citation Formats

F. Çıray, “ARTIFICIAL LEARNING-BASED ANALYSIS OF MOLECULAR, CLINICAL TRIALS AND PATENT DATA FOR IMPROVED DRUG DEVELOPMENT,” Ph.D. - Doctoral Program, Middle East Technical University, 2022.