Incremental clustering with vector expansion for online event detection in microblogs

2017-11-04
Identifying similarities in microblog posts for event detection is challenging because the texts are short and contain idiosyncratic spellings, irregular writing styles, abbreviations, and synonyms. To overcome these challenges, we present an enhancement to incremental clustering techniques that detects similar terms in microblog posts in a temporal context. We devise an unsupervised method that measures term similarities online using co-occurrence-based techniques and uses them in a vector expansion process. The results of our evaluation on a tweet set indicate that the proposed vector expansion method helps identify similarities between tweets despite differences in their content. This facilitates the clustering of tweets and the detection of events with higher accuracy without incurring a high execution cost.
SOCIAL NETWORK ANALYSIS AND MINING
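
The abstract's idea of expanding tweet vectors with co-occurrence-based term similarities before incremental clustering can be illustrated with a minimal sketch. This is not the authors' implementation: the similarity measure (cosine over document co-occurrence counts), the function names, the thresholds, and the expansion weight are all assumptions chosen for illustration, and the "temporal context" is approximated by passing in only a recent window of tokenized posts.

```python
from collections import Counter, defaultdict
from itertools import combinations
import math


def cooccurrence_similarity(recent_docs):
    """Term-term similarity from co-occurrence in a recent window of posts.

    Assumed measure: co-occurrence count normalised by the geometric mean
    of the two terms' document frequencies.
    """
    df = Counter()
    co = defaultdict(Counter)
    for tokens in recent_docs:
        terms = set(tokens)
        df.update(terms)
        for a, b in combinations(sorted(terms), 2):
            co[a][b] += 1
            co[b][a] += 1
    return {
        a: {b: n / math.sqrt(df[a] * df[b]) for b, n in nbrs.items()}
        for a, nbrs in co.items()
    }


def expand(tokens, sim, min_sim=0.8, weight=0.5):
    """Add similar terms to a post's term vector (hypothetical weighting)."""
    base = Counter(tokens)
    expanded = {t: float(w) for t, w in base.items()}
    for t, w in base.items():
        for u, s in sim.get(t, {}).items():
            if s >= min_sim:
                expanded[u] = expanded.get(u, 0.0) + weight * s * w
    return expanded


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def incremental_cluster(vectors, threshold=0.4):
    """Single-pass clustering: join the most similar cluster or open a new one."""
    centroids, labels = [], []
    for vec in vectors:
        best, best_sim = None, threshold
        for i, c in enumerate(centroids):
            s = cosine(vec, c)
            if s >= best_sim:
                best, best_sim = i, s
        if best is None:
            centroids.append(dict(vec))
            labels.append(len(centroids) - 1)
        else:
            for t, w in vec.items():
                centroids[best][t] = centroids[best].get(t, 0.0) + w
            labels.append(best)
    return labels


# Toy data (invented for illustration only).
history = [
    ["earthquake", "quake", "turkey"],
    ["earthquake", "quake", "izmir"],
    ["quake", "earthquake", "damage"],
    ["football", "match", "goal"],
]
sim = cooccurrence_similarity(history)
a, b = ["earthquake", "ankara"], ["quake", "istanbul"]
raw_labels = incremental_cluster([Counter(a), Counter(b)])
expanded_labels = incremental_cluster([expand(a, sim), expand(b, sim)])
```

In this toy setup the two posts share no terms, so the unexpanded vectors fall into separate clusters, while the expanded vectors overlap on "earthquake" and "quake" and are grouped together.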

Citation Formats
O. Ozdikis, P. Karagöz, and M. H. S. Oğuztüzün, “Incremental clustering with vector expansion for online event detection in microblogs,” SOCIAL NETWORK ANALYSIS AND MINING, 2017. [Online]. Available: https://hdl.handle.net/11511/33056.