APPLICATION OF TEXT MINING TO TECHNOLOGY MANAGEMENT DOMAIN TO EXTRACT TOPICS AND TRENDS
Date
2022-01-17
Author
Tekin, Yaşar
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Topic modeling is a widely used technique for extracting latent topics from large document collections. One of its most notable applications is the analysis of scientific fields. Applied to all articles published in a specific field, it gives an overall view of the topics and trends for the time period under consideration; applied to a single conference or journal, it reveals how that venue differs from global trends. The most popular topic modeling method is Latent Dirichlet Allocation (LDA). Although LDA is used in many different fields, the problems of how to optimize model parameters and how to eliminate topic instability have not yet been fully solved. This thesis consists of two main parts. 1) An empirical investigation is conducted a) to measure the level of topic instability in ordered documents, b) to search for methods that eliminate, or where that is not possible alleviate, the effects of topic instability, and c) to evaluate the use of word vector representations for optimizing LDA parameters. The findings are that a) the level of instability is high even in ordered documents, b) average scores of replicated topic models can be used to alleviate the effects of topic instability, and c) the Skip-gram similarity score is an acceptable measure for optimizing LDA parameters. 2) Using the proposed method, topic modeling is applied to the Technology Management (TM) domain. The top topics, the most studied industries, the most used methods, and the surprising topics of the TM literature are identified.
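The idea of averaging scores over replicated topic models can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not define the Skip-gram similarity score, so the sketch assumes it is the mean pairwise cosine similarity between the word vectors of a topic's top words. The names `toy_train`, `VECTORS`, `topic_similarity`, and `select_num_topics` are hypothetical, the 2-d vectors stand in for real skip-gram embeddings, and `toy_train` stands in for seeded LDA training.

```python
import math
from itertools import combinations
from statistics import mean

# Toy 2-d "word vectors" (assumption); a real run would load skip-gram embeddings.
VECTORS = {
    "patent": (1.0, 0.0), "license": (1.0, 0.0), "royalty": (1.0, 0.0),
    "solar": (0.0, 1.0), "wind": (0.0, 1.0), "battery": (0.0, 1.0),
}
WORDS = list(VECTORS)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def topic_similarity(top_words, vectors):
    """Mean pairwise cosine similarity of a topic's top words (assumed definition)."""
    known = [vectors[w] for w in top_words if w in vectors]
    pairs = list(combinations(known, 2))
    return mean(cosine(u, v) for u, v in pairs) if pairs else 0.0


def model_score(topics, vectors):
    """Score a whole model as the mean of its per-topic scores."""
    return mean(topic_similarity(t, vectors) for t in topics)


def toy_train(k, seed):
    """Hypothetical stand-in for LDA training: odd seeds perturb the word order,
    mimicking the run-to-run instability of a stochastic model."""
    shift = seed % 2
    words = WORDS[shift:] + WORDS[:shift]
    size = len(words) // k
    return [words[i * size:(i + 1) * size] for i in range(k)]


def select_num_topics(train, candidates, vectors, seeds=range(5)):
    """Average the score of replicated models per candidate k; return (best_k, averages)."""
    seeds = list(seeds)
    averaged = {
        k: mean(model_score(train(k, seed), vectors) for seed in seeds)
        for k in candidates
    }
    return max(averaged, key=averaged.get), averaged


if __name__ == "__main__":
    best_k, averaged = select_num_topics(toy_train, [1, 2, 3], VECTORS)
    print(best_k, averaged)
```

In this toy setup a single perturbed run (`seeds=[1]`) prefers three topics, while the average over five runs prefers two, which is the kind of effect that averaging replicated models is meant to dampen.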
Subject Keywords
Technology Management, Topic Modeling, Latent Dirichlet Allocation, Parameter Optimization, Word Vector Representation
URI
https://hdl.handle.net/11511/95456
Collections
Graduate School of Social Sciences, Thesis
Suggestions
Comparison of feature-based and image registration-based retrieval of image data using multidimensional data access methods
Arslan, Serdar; Yazıcı, Adnan; Sacan, Ahmet; Toroslu, İsmail Hakkı; Acar, Esra (Elsevier BV, 2013-07-01)
In information retrieval, efficient similarity search in multimedia collections is a critical task. In this paper, we present a rigorous comparison of three different approaches to the image retrieval problem, including cluster-based indexing, distance-based indexing, and multidimensional scaling methods. The time and accuracy trade-offs for each of these methods are demonstrated on three different image data sets. Similarity of images is obtained either by a feature-based similarity measure using four MPEG-...
Gibbs Sampling in Inference of Copula Gaussian Graphical Model Adapted to Biological Networks
Purutçuoğlu Gazi, Vilda (2017-09-01)
Markov chain Monte Carlo methods (MCMC) are iterative algorithms that are used in many Bayesian simulation studies, where the inference cannot be easily obtained directly through the defined model. Reversible jump MCMC methods belong to a special type of MCMC methods, in which the dimension of parameters can change in each iteration. In this study, we suggest Gibbs sampling in place of RJMCMC, to decrease the computational demand of the calculation of high dimensional systems. We evaluate the performance of...
Abstract or Full-text in Topic Modeling? Konu Modellemede Özet mi Tam Metin mi?
Tekin, Yasar; Coşar, Ahmet (2022-01-01)
Topic modeling is a text mining technique used for automatic extraction of topics addressed in document collections. Although there are different topic models proposed by researchers, the most preferred one is Latent Dirichlet Allocation (LDA). Despite such widespread use, uncertainties about LDA have not been fully resolved yet. In this study, the effect of using abstracts or full-text articles on LDA model parameters is investigated. For this purpose, LDA parameters are optimized on abstracts and full-tex...
Topic-centric querying of web information resources
Altıngövde, İsmail Sengör; Ulusoy, O; Ozsoyoglu, G; Ozsoyoglu, ZM (2001-01-01)
This paper deals with the problem of modeling web information resources using expert knowledge and personalized user information, and querying them in terms of topics and topic relationships. We propose a model for web information resources, and a query language SQL-TC (Topic-Centric SQL) to query the model. The model is composed of web-based information resources (XML or HTML documents on the web), expert advice repositories (domain-expert-specified metadata for information resources), and personalized inf...
Comparison of multidimensional data access methods for feature-based image retrieval
Arslan, Serdar; Saçan, Ahmet; Açar, Esra; Toroslu, İsmail Hakkı; Yazıcı, Adnan (2010-11-18)
Within the scope of information retrieval, efficient similarity search in large document or multimedia collections is a critical task. In this paper, we present a rigorous comparison of three different approaches to the image retrieval problem, including cluster-based indexing, distance-based indexing, and multidimensional scaling methods. The time and accuracy tradeoffs for each of these methods are demonstrated on a large Corel image database. Similarity of images is obtained via a feature-based similarity...
Citation (IEEE)
Y. Tekin, “APPLICATION OF TEXT MINING TO TECHNOLOGY MANAGEMENT DOMAIN TO EXTRACT TOPICS AND TRENDS,” Ph.D. - Doctoral Program, Middle East Technical University, 2022.