Selective word encoding for effective text representation
Date
2019-01-01
Author
Ozkan, Savas
Ozkan, Akin
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Determining the category of a text document from its semantic content is a well-motivated problem that has been studied extensively across many applications. A compact representation of the text is a fundamental step toward precise results in these applications, and much work has concentrated on improving it. In particular, techniques that aggregate word-level representations are the mainstream approach to the problem. In this paper, we tackle text representation to achieve high performance in different text classification tasks and present three contributions. First, to encode the word-level representations of each text, we adapt a trainable orderless aggregation algorithm that transforms word vectors into a more discriminative text-level representation. Second, we propose an effective term-weighting scheme that computes the relative importance of words from the context, based on their connection to the problem, in an end-to-end learning manner. Third, we present a weighted loss function to mitigate the class-imbalance problem between the categories. To evaluate the performance, we collect two distinct datasets whose data is available on the web: Turkish parliament records (i.e. written speeches of four major political parties, with 30731/7683 train and test documents) and newspaper articles (i.e. daily articles of columnists, with 16000/3200 train and test documents). The proposed method significantly improves on the baseline techniques (i.e. VLAD and Fisher Vector) and achieves true prediction accuracies of 0.823 and 0.878 for party membership and for article category estimation, respectively. These results validate that the proposed contributions (i.e. the trainable word-encoding model, the trainable term-weighting scheme and the weighted loss function) significantly outperform the baselines.
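The abstract does not give the formulas behind these contributions, so the following is only a generic sketch of two of the ideas it names: an order-independent, importance-weighted aggregation of word vectors, and a class-weighted loss for imbalanced categories. The function names, the softmax weighting, and the inverse-frequency class weights are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_pool(word_vectors, word_scores):
    """Order-independent aggregation: softmax-weighted mean of word vectors.

    word_scores stands in for a learned per-word importance score
    (hypothetical; in a trainable model it would come from the network).
    """
    weights = softmax(word_scores)
    dim = len(word_vectors[0])
    return [
        sum(w * vec[i] for w, vec in zip(weights, word_vectors))
        for i in range(dim)
    ]


def inverse_frequency_weights(labels):
    """Class weights proportional to 1 / class count (an assumed scheme).

    Scaled so that the average weight over all samples is 1.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}


def weighted_cross_entropy(probs, labels, class_weights):
    """Mean of -w[y] * log p(y), so rare classes contribute more per sample."""
    losses = [
        -class_weights[y] * math.log(p[y])
        for p, y in zip(probs, labels)
    ]
    return sum(losses) / len(losses)
```

With labels `[0, 0, 0, 1]`, for example, the minority class receives weight 2.0 and the majority class 2/3, so a misclassified minority document is penalised three times as heavily as a majority one.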
Subject Keywords
Electrical and Electronic Engineering, General Computer Science
URI
https://hdl.handle.net/11511/65216
Journal
TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES
DOI
https://doi.org/10.3906/elk-1805-138
Collections
Department of Electrical and Electronics Engineering, Article
Suggestions
Domain adaptation on graphs by learning graph topologies: theoretical analysis and an algorithm
Vural, Elif (The Scientific and Technological Research Council of Turkey, 2019-01-01)
Traditional machine learning algorithms assume that the training and test data have the same distribution, while this assumption does not necessarily hold in real applications. Domain adaptation methods take into account the deviations in data distribution. In this work, we study the problem of domain adaptation on graphs. We consider a source graph and a target graph constructed with samples drawn from data manifolds. We study the problem of estimating the unknown class labels on the target graph using the...
Multipath Characteristics of Frequency Diverse Arrays Over a Ground Plane
Cetintepe, Cagri; Demir, Şimşek (Institute of Electrical and Electronics Engineers (IEEE), 2014-07-01)
This paper presents a theoretical framework for an analytical investigation of multipath characteristics of frequency diverse arrays (FDAs), a task which is attempted for the first time in the open literature. In particular, transmitted field expressions are formulated for an FDA over a perfectly conducting ground plane first in a general analytical form, and these expressions are later simplified under reasonable assumptions. Developed formulation is then applied to a uniform, linear, continuous-wave opera...
Recursive shortest spaning tree algorithms for image segmentation
Bayramoğlu, Neslihan Yalçın; Bazlamaçcı, Cüneyt Fehmi; Department of Electrical and Electronics Engineering (2005)
Image segmentation has an important role in image processing because it is a tool to obtain higher level object descriptions for further processing. In some applications such as large image databases or video image sequence segmentations, the speed of the segmentation algorithm may become a drawback of the application. This thesis work is a study to improve the run-time performance of a well-known segmentation algorithm, namely the Recursive Shortest Spanning Tree (RSST). Both the original and the fast RSST...
State-space identification of switching linear discrete time-periodic systems with known scheduling signals
Uyanik, Ismail; Hamzacebi, Hasan; Ankaralı, Mustafa Mert (The Scientific and Technological Research Council of Turkey, 2019-01-01)
In this paper, we propose a novel frequency domain state-space identification method for switching linear discrete time-periodic (LDTP) systems with known scheduling signals. The state-space identification problem of linear time-invariant (LTI) systems has been widely studied both in the time and frequency domains. Indeed, there have been several studies that also concentrated on state-space identification of both continuous and discrete linear time-periodic (LTP) systems. The focus in this study is the fam...
Positive impact of state similarity on reinforcement learning performance
Girgin, Sertan; Polat, Faruk; Alhaj, Reda (Institute of Electrical and Electronics Engineers (IEEE), 2007-10-01)
In this paper, we propose a novel approach to identify states with similar subpolicies and show how they can be integrated into the reinforcement learning framework to improve learning performance. The method utilizes a specialized tree structure to identify common action sequences of states, which are derived from possible optimal policies, and defines a similarity function between two states based on the number of such sequences. Using this similarity function, updates on the action-value function of a st...
Citation Formats
IEEE
S. Ozkan and A. Ozkan, “Selective word encoding for effective text representation,” TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, pp. 1028–1040, 2019, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/65216.