Text mining : a burgeoning quality improvement tool

J.Mohammad, Mohammad Alkin
While the amount of textual data available to us is constantly increasing, managing these texts by human effort alone is clearly inadequate for the volume and complexity of the information involved. Consequently, the need for automated extraction of useful knowledge from huge amounts of textual data to assist human analysis is apparent. Text mining (TM) is a largely automated technique that aims to discover knowledge from textual data. In this thesis, the notion of text mining, its techniques, and its applications are presented. In particular, the study provides definitions and an overview of concepts in text categorization, including document representation models, weighting schemes, feature selection methods, feature extraction, performance measures, and machine learning techniques. The thesis details the functionality of text mining as a quality improvement tool. It carries out an extensive survey of text mining applications within the service sector and the manufacturing industry. It presents two broad experimental studies examining the potential use of text mining for the hotel industry (comment card analysis) and for an automobile manufacturer (miles per gallon analysis).
Keywords: Text Mining, Text Categorization, Quality Improvement, Service Sector, Manufacturing Industry.


Citation Formats
M. A. J.Mohammad, “Text mining : a burgeoning quality improvement tool,” M.S. - Master of Science, Middle East Technical University, 2007.