Prediction of Protein-Protein Interaction Relevance of Articles Using References

Calli, Cagatay
Classifying documents as protein-protein interaction (PPI) relevant or not is the first step towards extracting meaningful PPI data from article content. Currently, this classification step is handled manually by expert curators. A number of text-mining methods have been proposed to tackle this problem, using abstracts but not references. We propose that article references contain important information that can be used to enhance these previous techniques. To test the effect of references, we trained an SVM classifier based solely on reference links extracted from Biocreative II data. Our approach includes a feature selection method based on the imbalance in reference counts between positive and negative examples. Classification results on the Biocreative II test and Biocreative II.5 training datasets show that even simple referential information extracted from papers can be effective for predicting the PPI relevance of articles.
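
The feature selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not define the imbalance score, so this example assumes the simplest reading: for each cited reference, count how many positive and how many negative training articles cite it, and keep the reference if the absolute difference reaches a threshold. The names `select_references`, `to_vector`, and `min_imbalance`, as well as the reference IDs, are hypothetical and do not come from the thesis.

```python
from collections import Counter


def select_references(docs, labels, min_imbalance=1):
    """Keep references cited unevenly by the two classes.

    docs: list of sets of reference IDs, one set per article (made-up IDs).
    labels: list of 1 (PPI-relevant) or 0 (not relevant), aligned with docs.
    min_imbalance: assumed threshold on |positive count - negative count|.
    """
    pos, neg = Counter(), Counter()
    for refs, label in zip(docs, labels):
        # Count each reference at most once per article.
        (pos if label == 1 else neg).update(set(refs))
    candidates = set(pos) | set(neg)
    return sorted(r for r in candidates if abs(pos[r] - neg[r]) >= min_imbalance)


def to_vector(refs, selected):
    """Binary feature vector: 1 if the article cites the selected reference."""
    refs = set(refs)
    return [1 if r in refs else 0 for r in selected]


# Toy training data: three relevant and two non-relevant articles.
docs = [
    {"ref_A", "ref_B"},
    {"ref_A", "ref_C"},
    {"ref_A", "ref_D"},
    {"ref_C", "ref_D"},
    {"ref_D", "ref_B"},
]
labels = [1, 1, 1, 0, 0]

selected = select_references(docs, labels, min_imbalance=1)
vectors = [to_vector(d, selected) for d in docs]
print(selected)
print(vectors)
```

In this toy data, references cited equally often by both classes are dropped, and the remaining binary vectors are the kind of input that could then be passed to an SVM implementation of one's choice.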


On Fuzzy Extensions to Energy Ontologies for Text Processing Applications
Kucuk, Dilek; Kucuk, Dogan; Yazıcı, Adnan (2014-10-28)
Ubiquitous application areas of domain ontologies include text processing applications like categorizing related documents of the domain, extraction of information from these documents, and semantic search. In this paper, we focus on the utilization of two energy ontologies, one for electrical power quality and the second for wind energy, within such applications. For this purpose, we present fuzzy extensions to these domain ontologies as fuzziness is an essential feature of the ultimate forms of the ontolo...
Fuzzy data representation and querying in XML database
Ustunkaya, Ekin; Yazıcı, Adnan; George, Roy (2007-02-01)
Real-world information, including subjective opinions and judgments, requires imprecise data to be modeled for representation and querying in databases. The Extensible Markup Language (XML) has become a de facto standard for data modeling and exchange in recent years. Efforts on modeling imprecision and representing such data in XML have not been fully developed. In this paper, an XML-based fuzzy data representation and querying system is presented. Complex and imprecise data are represented using a fuzzy extensi...
Semantic Processing of Database Textual Attributes Using Wikipedia
Campana, Jesus R.; Medina, Juan M.; Vila, M. Amparo (2011-10-28)
Text attributes in databases contain rich semantic information that is seldom processed or used. This paper proposes a method to extract and semantically represent concepts from texts stored in databases. This process relies on tools such as WordNet and Wikipedia to identify concepts extracted from texts and represent them as a basic ontology whose concepts are annotated with search terms. This ontology can play diverse roles. It can be seen as a conceptual summary of the content of an attribute, which can ...
Ecodat: A web-based application and database for limnological monitoring data
Değermenci, Ali; Özen, Can; Department of Biotechnology (2022-10-10)
This study focuses on the development of database application solutions for storing measured data in limnological monitoring. The purpose of the study is to give background information about monitoring data and the creation of a smart-accessible database system using query language and the database management system Microsoft SQL Server 2012, while also providing a web application where researchers can access, interact with, and save their fieldwork using the C# .NET Core MAUI/Blazor framework. The thesis includes a data...
Secure logical schema and decomposition algorithm for proactive context dependent attribute based inference control
Turan, Ugur; Toroslu, İsmail Hakkı; Kantarcioglu, Murat (2017-09-01)
The inference problem has always been an important and challenging topic of data privacy in databases. In relational databases, the traditional solution to this problem was to define views on relational schemas to restrict the subset of attributes and operations available to users in order to prevent unwanted inferences. This method is a form of decomposition strategy, which mainly concentrates on the granularity of the fields accessible to users, to prevent sensitive information inference. Nowadays, du...
