Bi-k-bi clustering: mining large scale gene expression data using two-level biclustering

2010-01-01
Carkacioglu, Levent
Atalay, Rengül
KONU KARAKAYALI, ÖZLEN
Atalay, Mehmet Volkan
Can, Tolga
Due to the increase in gene expression data sets in recent years, various data mining techniques have been proposed for mining gene expression profiles. However, most of these methods target single gene expression data sets and cannot handle all the available gene expression data in public databases in a reasonable amount of time and space. In this paper, we propose a novel framework, bi-k-bi clustering, for finding association rules of gene pairs that can easily operate on large-scale and multiple heterogeneous data sets. We applied our proposed framework to the available NCBI GEO Homo sapiens data sets. Our results show consistency and relatedness with the available literature and also provide novel associations.
INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS

Suggestions

Integer linear programming based solutions for construction of biological networks
Eren Özsoy, Öykü; Can, Tolga; Department of Health Informatics (2014)
Inference of gene regulatory or signaling networks from perturbation experiments and gene expression assays is one of the challenging problems in bioinformatics. Recently, the inference problem has been formulated as a reference network editing problem and it has been shown that finding the minimum number of edit operations on a reference network in order to comply with perturbation experiments is an NP-complete problem. In this dissertation, we propose linear programming based solutions for reconstruction o...
Mathematical Modeling and Approximation of Gene Expression Patterns
Yılmaz, Fatih; Öktem, Hüseyin Avni (2004-09-03)
This study concerns modeling, approximation and inference of gene regulatory dynamics on the basis of gene expression patterns. The dynamical behavior of gene expressions is represented by a system of ordinary differential equations. We introduce a gene-interaction matrix with some nonlinear entries, in particular, quadratic polynomials of the expression levels to keep the system solvable. The model parameters are determined by using optimization. Then, we provide the time-discrete approximation of our time...
Short Time Series Microarray Data Analysis and Biological Annotation
Sökmen, Zerrin; Atalay, Mehmet Volkan; Atalay, Rengül (2008-01-01)
The significant gene list that results from microarray data analysis should be explained in terms of biological functions. The aim of this study is to extract biologically related gene clusters from short time series microarray gene data by applying unsupervised methods and to automatically perform biological annotation of those clusters. In the first step of the study, short time series microarray expression data is clustered according to similar expression profiles. After that, several biological d...
Comparing Clustering Techniques for Real Microarray Data
Purutçuoğlu Gazi, Vilda (2012-08-29)
The clustering of genes detected as significant or differentially expressed provides useful information to biologists about the functions and functional relationships of genes. There are various types of clustering methods that can be applied to genomic data. These are mainly divided into two groups, namely, hierarchical and partitional methods. In this paper, as a novelty, we perform a detailed clustering analysis for the recently collected boron microarray dataset to investigate biologically more interes...
mESAdb: microRNA expression and sequence analysis database.
Kaya, KD; Karakülah, G; Yakicier, CM; Acar, Aybar Can; Konu, O (2011-01-01)
MicroRNA expression and sequence analysis database (http://konulab.fen.bilkent.edu.tr/mirna/) (mESAdb) is a regularly updated database for the multivariate analysis of sequences and expression of microRNAs from multiple taxa. mESAdb is modular and has a user interface implemented in PHP and JavaScript and coupled with statistical analysis and visualization packages written for the R language. The database primarily comprises mature microRNA sequences and their target data, along with selected human, mouse a...
Citation Formats
L. Carkacioglu, R. Atalay, Ö. KONU KARAKAYALI, M. V. Atalay, and T. Can, “Bi-k-bi clustering: mining large scale gene expression data using two-level biclustering,” INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, pp. 701–721, 2010, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/32311.