CAP-RNAseq: An online tool for Clustering, Annotation and Prioritization of RNAseq data

2022-10
Vural Özdeniz, Merve
Çalışır, Kübra
Arıcı, Burçin Irem
Acar, Rana
Targen, Seniye
Dalgıç, Ertuğrul
Konu, Özlen
Transcriptome analysis is an effective high-throughput method for examining the regulation and function of genomes. However, analysis of RNAseq experiments with more than two groups or factors is complex and can benefit from clustering followed by annotation and prioritization of genes. While a few advanced analysis pipelines are available for such data sets, none of them automates gene selection, which would make it easier to choose candidates for validation experiments. To fill this gap, we have developed a web tool named CAP-RNAseq, which creates co-expressed gene clusters from voom-transformed logCPM values, and then annotates and prioritizes the genes in a selected cluster. Users upload their raw RNAseq count data; ANOVA is then applied to filter the data down to genes showing significant changes between any two groups. k-means clustering is then applied with a user-specified number of clusters, and the resulting clusters can be visualized and functionally enriched using the MSigDB, GO and KEGG databases. CAP-RNAseq also suggests genes for future validation by qRT-PCR and Western blotting. The prioritization algorithm uses distance correlation (a measure of multivariate dependence) to quantify the association between the consensus profile of a cluster and the profile of each gene in that cluster. The top correlated genes are then shown in a table, from which the user can select a gene for primer design. Moreover, Human Protein Atlas data have been integrated into CAP-RNAseq to visualize the protein levels of suggested or user-selected genes in selected cancer types. CAP-RNAseq has been designed and implemented on the R-Shiny platform and is user-friendly. We present several case studies of its use in breast cancer and hormone replacement therapy based on our own RNAseq datasets.
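The prioritization step described in the abstract can be illustrated with a short sketch. This is not the CAP-RNAseq implementation (the tool itself runs on R-Shiny); it is a minimal Python illustration that assumes the consensus profile of a cluster is the per-sample mean across its genes, which the abstract does not specify. The function names `distance_correlation` and `rank_genes_in_cluster`, the `top_n` parameter, and the toy expression values are all hypothetical.

```python
import numpy as np


def _centered_distances(v):
    """Pairwise absolute distances of a 1-D profile, double-centered."""
    v = np.asarray(v, dtype=float)
    d = np.abs(v[:, None] - v[None, :])
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation between two equal-length profiles (0 to 1)."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()                    # squared distance covariance
    dvar2 = (a * a).mean() * (b * b).mean()   # product of squared distance variances
    if dvar2 <= 0:
        return 0.0                            # a constant profile carries no signal
    return float(np.sqrt(max(dcov2, 0.0) / np.sqrt(dvar2)))


def rank_genes_in_cluster(expr, gene_ids, top_n=5):
    """Rank genes by distance correlation with the cluster consensus profile.

    expr: genes x samples matrix (e.g. logCPM values) for ONE cluster.
    Assumption: the consensus profile is the mean over genes in the cluster.
    """
    expr = np.asarray(expr, dtype=float)
    consensus = expr.mean(axis=0)
    scores = [distance_correlation(row, consensus) for row in expr]
    order = np.argsort(scores)[::-1][:top_n]
    return [(gene_ids[i], scores[i]) for i in order]


# Toy cluster with three hypothetical genes measured in four samples.
toy_expr = [[1.0, 2.0, 3.0, 4.0],
            [2.0, 4.0, 6.0, 8.0],
            [4.0, 1.0, 3.0, 2.0]]
ranking = rank_genes_in_cluster(toy_expr, ["geneA", "geneB", "geneC"], top_n=3)
for gene, score in ranking:
    print(f"{gene}\t{score:.3f}")
```

Unlike Pearson correlation, distance correlation is zero only when the two profiles are independent, so it can also pick up non-linear agreement between a gene and the consensus profile. Genes at the top of the ranking would be the candidates offered for validation.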

Citation Formats
M. Vural Özdeniz et al., “CAP-RNAseq: An online tool for Clustering, Annotation and Prioritization of RNAseq data,” Erdemli, Mersin, TÜRKİYE, 2022, p. 3066, Accessed: 00, 2023. [Online]. Available: https://hibit2022.ims.metu.edu.tr/.