Tools and techniques for assessing functional relevance of genomic loci

2017
Otlu, Burçak
Genomic studies identify genomic loci representing genetic variations, transcription factor (TF) occupancy, or histone modifications through next-generation sequencing (NGS) technologies. Interpreting these loci requires evaluating them against known genomic and epigenomic annotations. In this thesis, we develop tools and techniques to assess the functional relevance of a set of genomic intervals. Towards this goal, we first introduce the Genomic Loci ANnotation and Enrichment Tool (GLANET), a comprehensive annotation and enrichment analysis tool. The input query to GLANET is a set of genomic intervals. GLANET annotates these loci and performs enrichment analysis on them using a rich library that includes: (i) gene-centric regions that encompass their non-coding neighborhood, (ii) a large collection of regulatory regions from ENCODE, and (iii) gene sets derived from pathways. As a key feature, users can easily extend this library with new gene sets and genomic intervals. GLANET implements a sampling-based enrichment test that can account for the GC content and/or mappability biases inherent to NGS technologies; this test shows high statistical power and a well-controlled Type-I error rate. Other key features of GLANET include assessing the impact of single nucleotide variants on TF binding sites when the input consists only of SNPs, and gene set enrichment analysis that is not only exon-based but also regulation-based, taking into account the introns and proximal regions of the genes in a gene set. GLANET also allows joint enrichment analysis for TF binding sites and KEGG pathways. With this option, users can evaluate whether the input set is concurrently enriched for the binding sites of TFs and for the genes within a KEGG pathway. This joint enrichment analysis provides a detailed functional interpretation of the input loci. As a second contribution, we designed novel data-driven computational experiments for assessing the power and Type-I error of enrichment procedures.
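The idea behind a sampling-based enrichment test can be illustrated with a minimal sketch. It is not GLANET's implementation: the function names, the single-chromosome setup, the overlap-count statistic, and the empirical p-value formula are all assumptions made for illustration, and the GC content and mappability matching described above is omitted. Intervals are assumed to be half-open (start, end) pairs of positive length.

```python
import random


def count_overlapping(query, annotations):
    """Count query intervals that overlap at least one annotation interval.

    Intervals are half-open (start, end) pairs.
    """
    return sum(
        any(q_start < a_end and a_start < q_end for a_start, a_end in annotations)
        for q_start, q_end in query
    )


def empirical_enrichment_p(query, annotations, chrom_length, n_samples=1000, seed=0):
    """Estimate an empirical p-value for enrichment of query in annotations.

    Hypothetical helper: each sampling draws random intervals with the same
    lengths as the query intervals, placed uniformly on one chromosome.
    """
    rng = random.Random(seed)
    observed = count_overlapping(query, annotations)
    at_least_as_extreme = 0
    for _ in range(n_samples):
        sampled = []
        for q_start, q_end in query:
            length = q_end - q_start
            start = rng.randrange(0, chrom_length - length + 1)
            sampled.append((start, start + length))
        if count_overlapping(sampled, annotations) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_samples + 1)
```

A bias-aware variant would restrict each random placement to regions whose GC content or mappability resembles that of the corresponding input interval. How GLANET performs that matching is not described here, so it is left out of the sketch.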
These experiments make detailed quantitative comparisons of GLANET with other tools possible. Our results on these computational experiments showcase GLANET’s unique capabilities as well as its robustness, speed, and accuracy. Finally, as a third contribution, we present an efficient algorithmic solution for finding common overlapping intervals over n interval sets. Our strategy first constructs one segment tree for each interval set and then converts each segment tree into an indexed segment tree forest by cutting the tree at a certain depth. Experiments on real data show that this data structure decreases the search time. This novel representation also enables parallel computations on each segment tree in the forest. We also extend this solution to the problem of finding at least k common overlapping intervals over n interval sets. We hope that the tools and techniques developed herein will expedite genomic research and help improve our understanding of the molecular biology of the cell and the mechanisms underlying diseases.
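To make the overlap problem concrete, the sketch below finds the regions covered by at least k of n interval sets. It deliberately substitutes a simple sweep-line for the indexed segment tree forest described above, so it illustrates only the problem statement, not the thesis's data structure or its performance. The function names and the half-open (start, end) interval convention are assumptions.

```python
def merge(intervals):
    """Merge overlapping or touching intervals within one set."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def regions_covered_by_at_least_k(interval_sets, k):
    """Return regions covered by intervals from at least k of the given sets.

    Each set is merged first so that it contributes at most one unit of
    depth at any position. Intervals are half-open (start, end) pairs.
    """
    events = []
    for intervals in interval_sets:
        for start, end in merge(intervals):
            events.append((start, 1))
            events.append((end, -1))
    # At equal positions, ends (-1) sort before starts (+1), which matches
    # half-open semantics: [1, 5) and [5, 8) do not overlap.
    events.sort()

    regions = []
    depth = 0
    region_start = None
    for pos, delta in events:
        previous = depth
        depth += delta
        if previous < k <= depth:
            # Depth reaches k: open a region, rejoining an adjacent one.
            if regions and regions[-1][1] == pos:
                region_start = regions.pop()[0]
            else:
                region_start = pos
        elif depth < k <= previous:
            # Depth drops below k: close the current region.
            regions.append((region_start, pos))
    return regions
```

Setting k equal to the number of sets yields the regions common to all n sets. Smaller values of k give the "at least k" variant.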

Citation Formats
B. Otlu, “Tools and techniques for assessing functional relevance of genomic loci,” Ph.D. - Doctoral Program, Middle East Technical University, 2017.