Mining microarray data for biologically important gene sets

Korkmaz, Gülberal Kırçiçeği Yoksul
Microarray technology enables researchers to measure the expression levels of thousands of genes simultaneously in order to understand relationships between genes, extract pathways, and, more generally, study a wide range of biological processes such as diseases and the cell cycle. While microarrays offer a great opportunity to reveal information about biological processes, mining the huge amount of information contained in microarray datasets is a challenging task. Generally, since an accurate model for the data is missing, a clustering algorithm is applied first, and the resulting clusters are then examined manually to find genes related to the biological process under inspection. Automated methods are needed for this analysis that can eliminate unrelated genes from the data and mine for biologically important genes. Here, we introduce a general methodology that makes use of traditional clustering algorithms and integrates the two main sources of biological information, Gene Ontology and interaction networks, with microarray data to eliminate unrelated information and find a clustering result containing only genes related to a given biological process. We applied our methodology successfully to a number of different cases and organisms. We assessed the results with the Gene Set Enrichment Analysis method and showed that our final clusters are highly enriched. We also analyzed the results manually and found that most of the genes in the final clusters are indeed related to the biological process under inspection.
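The idea of pruning clusters with Gene Ontology annotations and an interaction network can be illustrated with a minimal sketch. This is not the procedure developed in the thesis, whose details the abstract does not give. The function name `filter_clusters`, the keep rule (a gene is kept if it carries the target GO term or interacts with a gene in the same cluster that does), and the toy gene names are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Set, Tuple


def filter_clusters(
    clusters: Dict[str, Set[str]],
    go_annotations: Dict[str, Set[str]],
    interactions: Iterable[Tuple[str, str]],
    target_term: str,
) -> Dict[str, Set[str]]:
    """Hypothetical filter: within each cluster, keep genes annotated with
    target_term or interacting with an annotated gene of the same cluster."""
    # Build an undirected adjacency map from the interaction pairs.
    neighbors: Dict[str, Set[str]] = {}
    for a, b in interactions:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    filtered: Dict[str, Set[str]] = {}
    for cluster_id, genes in clusters.items():
        # Genes directly annotated with the process of interest.
        annotated = {g for g in genes if target_term in go_annotations.get(g, set())}
        # Genes with at least one annotated interaction partner in this cluster.
        linked = {g for g in genes if neighbors.get(g, set()) & annotated}
        kept = annotated | linked
        if kept:  # drop clusters with no related genes left
            filtered[cluster_id] = kept
    return filtered


# Toy input (gene names are invented; GO:0007049 is the "cell cycle" term).
clusters = {"c1": {"geneA", "geneB", "geneC"}, "c2": {"geneD", "geneE"}}
go_annotations = {"geneA": {"GO:0007049"}, "geneD": {"GO:0006412"}}
interactions = [("geneA", "geneB"), ("geneD", "geneE")]

result = filter_clusters(clusters, go_annotations, interactions, "GO:0007049")
print(result)
```

In this toy run, cluster `c1` is reduced to `geneA` and `geneB`, and cluster `c2` is dropped because none of its genes is annotated with the target term. The clusters themselves would come from whichever traditional clustering algorithm is applied to the expression data.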
Citation Formats
G. K. Y. Korkmaz, “Mining microarray data for biologically important gene sets,” Ph.D. - Doctoral Program, Middle East Technical University, 2012.