Disease Centric Large Scale De Novo Design of Drug Candidate Molecules with Graph Generative Deep Adversarial Networks

2022-10
Ünlü, Atabey
Çevrim, Elif
Sarıgün, Ahmet
Ataş, Heval
Koyaş, Altay
Çelikbilek, Hayriye
Kahraman, Deniz Cansen
Olğaç, Abdurrahman
Rifaioğlu, Ahmet Süreyya
Doğan, Tunca
Discovering novel drug candidate molecules is one of the most fundamental and critical steps in drug development. It is especially challenging to develop new drug-based treatments for complex diseases, such as various cancer subtypes, which have heterogeneous structures and affect multiple biological mechanisms. With the advancements in high-throughput screening technology, it is now possible to scan thousands of compounds simultaneously; yet it remains impossible to fully analyze the target and compound spaces due to the excessive number of protein-compound combinations. Furthermore, it is possible to design approximately 10^60 small molecules that differ from each other by at least one atom or bond, indicating the nearly limitless potential of the theoretical space of drug-like molecules. Generative deep learning models, which create new data points according to a probability distribution at hand, have been developed with the purpose of picking completely new samples from a distribution space that is only partially known. In this study, we propose a novel computational system, DrugGEN, for de novo generation of single-target and multi-target drug candidate molecules intended for specific drug-resistant diseases, by constructing a new deep learning architecture that leverages the transformer architecture and graph neural networks in a generative adversarial setting (Figure 1a). The DrugGEN system optimizes two main processes: creating a new molecule and transforming it to target a selected protein. To this end, we developed a two-fold end-to-end model that takes graph representations of small molecules and target proteins as input to stacked generative adversarial networks (sGAN, composed of two modules: GAN1 and GAN2), and outputs de novo drug candidate molecules specific to the given target proteins (Figure 1a).
The main goal of GAN1 is to learn the molecular properties of drug-like small molecules, such as how the atoms and bonds should be arranged for a molecule to be chemically synthesizable and physically stable. GAN1 is composed of a transformer encoder-based generator and a graph convolutional network-based discriminator. The transformer encoder takes random Gaussian noise as input and transforms it into a graph in the form of separate annotation and adjacency matrices (Figure 1b). These generated graphs (representing de novo molecules) are then fed to the discriminator along with real molecules, where graph convolution and graph aggregation operations are applied to predict whether a data point was generated by the model or belongs to the set of real molecules. The aim of the GAN2 module is to modify the previously generated de novo molecules so that they bind effectively to the selected target. GAN2 is composed of a transformer decoder as the generator (Figure 1c) and a graph convolutional network as the discriminator. Here, the transformer decoder architecture was re-designed to process both protein and small molecule feature graphs, creating a pseudo-interaction module. Final molecular products are sampled and sent to the GAN2 discriminator to be compared with the known inhibitors of the protein of interest. The system was trained using all molecules in the ChEMBL database (~2M) in GAN1 and the known inhibitors of the selected target protein (AKT1) in GAN2, to produce novel and effective inhibitory molecules against hepatocellular carcinoma (HCC), a deadly subtype of liver cancer. The hyperparameter values of the system were optimized via multiple rounds of training/validation experiments. Generated molecules were monitored based on their synthetic accessibility and quantitative estimate of drug-likeness. The overall model evaluation was based on the percentage of valid molecules generated by the model, together with their uniqueness and novelty.
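The graph representation mentioned above, with separate annotation and adjacency matrices, can be illustrated with a minimal sketch. The atom and bond vocabularies, the padding scheme, and the function name `encode_molecule` are assumptions made for this example and are not taken from the DrugGEN implementation.

```python
# Hypothetical vocabularies chosen for illustration only; the encodings
# actually used by DrugGEN may differ.
ATOM_TYPES = ["PAD", "C", "N", "O"]        # index 0 marks an empty node slot
BOND_TYPES = ["NONE", "SINGLE", "DOUBLE"]  # index 0 marks "no bond"


def encode_molecule(atoms, bonds, max_atoms):
    """Return (annotation, adjacency) matrices for a small molecule.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j, bond_name) tuples over atom indices
    """
    if len(atoms) > max_atoms:
        raise ValueError("molecule has more atoms than max_atoms")

    # Annotation matrix: one one-hot row per node slot (padded with "PAD").
    annotation = [[0] * len(ATOM_TYPES) for _ in range(max_atoms)]
    for slot in range(max_atoms):
        symbol = atoms[slot] if slot < len(atoms) else "PAD"
        annotation[slot][ATOM_TYPES.index(symbol)] = 1

    # Adjacency matrix: symmetric, each entry is a bond-type index.
    adjacency = [[0] * max_atoms for _ in range(max_atoms)]
    for i, j, bond_name in bonds:
        code = BOND_TYPES.index(bond_name)
        adjacency[i][j] = code
        adjacency[j][i] = code
    return annotation, adjacency


# Heavy-atom skeleton of ethanol (C-C-O), padded to 4 node slots.
annotation, adjacency = encode_molecule(
    ["C", "C", "O"], [(0, 1, "SINGLE"), (1, 2, "SINGLE")], max_atoms=4
)
```

A generator that outputs matrices of this shape produces a fixed-size tensor pair regardless of molecule size, which is one reason padded annotation and adjacency matrices are a common choice for graph-based molecular GANs.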
All the metrics/scores were calculated using the RDKit library to keep results reproducible. The finalized system was run to design thousands of novel AKT1 inhibitors (Figure 1d, right-hand side). The resulting de novo molecule records are being evaluated by medicinal chemists, which will be followed by chemical synthesis of selected molecules and their utilization in wet-lab (in vitro) experiments for validating their inhibitory effects on drug-resistant HCC cell lines. If the expected results are obtained, new drug candidate compounds of critical importance will be discovered for the treatment of HCC, and pre-clinical and clinical studies will be planned for the future. DrugGEN has been developed as a generic system that can easily be used to design new molecules for other targets and diseases. All of the datasets, source code, results and pre-trained models of DrugGEN are freely available at https://github.com/HUBioDataLab/DrugGEN.
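The validity, uniqueness and novelty percentages used for model evaluation can be sketched as follows. The abstract does not define these metrics, so the definitions here are an assumption based on common usage: validity is the fraction of generated molecules that pass a validity check, uniqueness is the fraction of valid molecules that are distinct, and novelty is the fraction of distinct valid molecules absent from the training set. Molecules are assumed to be SMILES strings, and `toy_is_valid` is a hypothetical stand-in for a real chemistry check; with RDKit, one would typically test whether `Chem.MolFromSmiles` returns a molecule rather than `None`, and compare canonical SMILES.

```python
def generation_metrics(generated, training_set, is_valid):
    """Compute validity, uniqueness and novelty as fractions in [0, 1].

    generated:    list of generated molecule strings (assumed SMILES)
    training_set: iterable of molecule strings seen during training
    is_valid:     callable returning True for a chemically valid string
    """
    valid = [s for s in generated if is_valid(s)]
    unique = set(valid)
    novel = unique - set(training_set)
    return {
        "validity": len(valid) / len(generated) if generated else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


def toy_is_valid(smiles):
    """Stand-in validity check (assumption): non-empty, balanced parentheses.

    This is NOT a chemical validity test; it only keeps the sketch
    runnable without a cheminformatics dependency.
    """
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0 and len(smiles) > 0


generated = ["CCO", "CCO", "CCN", "C(", "c1ccccc1"]
metrics = generation_metrics(generated, {"CCO"}, toy_is_valid)
```

The sketch compares strings exactly as given, so two different SMILES spellings of the same molecule would be counted as distinct; canonicalizing the strings first avoids overstating uniqueness and novelty.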

Citation Formats
A. Ünlü et al., “Disease Centric Large Scale De Novo Design of Drug Candidate Molecules with Graph Generative Deep Adversarial Networks,” Erdemli, Mersin, TÜRKİYE, 2022, p. 2032, Accessed: 00, 2023. [Online]. Available: https://hibit2022.ims.metu.edu.tr/.