Scene Classification: A Comprehensive Study Combining Local and Global Descriptors

Cura, Burak Fatih
Sürer, Elif
In this paper, the local region characteristics and the overall structure of scene images are used for scene classification by combining different local and global descriptors. For this purpose, GIST, Histogram of Oriented Gradients (HOG), dense Scale-Invariant Feature Transform (SIFT), dense Speeded-Up Robust Features (SURF), Daisy and Local Binary Patterns (LBP) features are classified individually and jointly with a Support Vector Machine (SVM), using training sets of different sizes. Evaluation tests were conducted on the Places15, MIT indoor, SUN397 and Places365 datasets. The machine learning algorithms most commonly used in the scene classification literature (SVM with RBF and linear kernels, K-Nearest Neighbors and Random Forest) were evaluated on the Places15 dataset for comparison. In addition to accuracy, recall and precision, the processing time for testing with SVM was measured for the individual and the joint features to allow a deeper evaluation of the features.
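
As a rough illustration of what classifying features "jointly" can mean, the sketch below fuses one local and one global descriptor by concatenation. It is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not say how the descriptors are combined, so concatenation is an assumption; the basic 8-neighbour LBP histogram stands in for the local descriptors; and `grid_means` is a hypothetical, much simpler stand-in for a global descriptor such as GIST. All function names are illustrative, and the classifier step (an SVM in the paper) is omitted.

```python
import math

def lbp_histogram(img):
    """Normalized 256-bin histogram of basic 8-neighbour LBP codes.

    `img` is a grayscale image given as a list of equal-length rows.
    Only interior pixels are coded, so border pixels are skipped.
    """
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0.0] * 256
    rows, cols = len(img), len(img[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the centre.
                if img[r + dr][c + dc] >= img[r][c]:
                    code |= 1 << bit
            hist[code] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def grid_means(img, grid=2):
    """Hypothetical global stand-in (NOT GIST): mean intensity per grid cell."""
    rows, cols = len(img), len(img[0])
    feats = []
    for gr in range(grid):
        for gc in range(grid):
            r0, r1 = gr * rows // grid, (gr + 1) * rows // grid
            c0, c1 = gc * cols // grid, (gc + 1) * cols // grid
            cell = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            feats.append(sum(cell) / len(cell))
    return feats

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else list(vec)

def fused_descriptor(img, grid=2):
    """Assumed fusion scheme: normalize each part, then concatenate."""
    local_part = l2_normalize(lbp_histogram(img))
    global_part = l2_normalize(grid_means(img, grid))
    return local_part + global_part
```

Normalizing each part before concatenation keeps a descriptor with many dimensions or large values from dominating the joint vector. The fused vector would then be passed to a classifier in place of a single descriptor.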


Image categorization using Fisher kernels of non-iid image models
Cinbiş, Ramazan Gökberk; Schmid, Cordelia (2012-01-01)
The bag-of-words (BoW) model treats images as an unordered set of local regions and represents them by visual word histograms. Implicitly, regions are assumed to be identically and independently distributed (iid), which is a poor assumption from a modeling perspective. We introduce non-iid models by treating the parameters of BoW models as latent variables which are integrated out, rendering all local regions dependent. Using the Fisher kernel we encode an image by the gradient of the data log-likelihood w....
Texture and edge preserving multiframe super-resolution
Turgay, Emre; Akar, Gözde (Institution of Engineering and Technology (IET), 2014-09-01)
Super-resolution (SR) image reconstruction refers to methods where a higher resolution image is reconstructed using a set of overlapping aliased low-resolution observations of the same scene. Although edge preservation has been a widely explored topic in SR literature, texture-specific regularisation has recently gained interest. In this study, texture-specific regularisation is handled as a post-processing step. A two stage method is proposed, comprising multiple SR reconstructions with different regularis...
Yuzuguler, Ahmet Caner; Vural, Elif; Frossard, Pascal (2014-05-09)
Sparse representations of images in well-designed dictionaries can be used for effective classification. Meanwhile, training data available in most realistic settings are likely to be exposed to geometric transformations, which poses a challenge for the design of good dictionaries. In this work, we study the problem of learning class-representative dictionaries from geometrically transformed image sets. In order to efficiently take account of arbitrary geometric transformations in the learning, we adopt a r...
Scene representation technologies for 3DTV - A survey
Alatan, Abdullah Aydın; Gueduekbay, Ugur; Zabulis, Xenophon; Mueller, Karsten; Erdem, Cigdem Eroglu; Weigel, Christian; Smolic, Aljoscha (Institute of Electrical and Electronics Engineers (IEEE), 2007-11-01)
3-D scene representation is utilized during scene extraction, modeling, transmission and display stages of a 3DTV framework. To this end, different representation technologies are proposed to fulfill the requirements of 3DTV paradigm. Dense point-based methods are appropriate for free-view 3DTV applications, since they can generate novel views easily. As surface representations, polygonal meshes are quite popular due to their generality and current hardware support. Unfortunately, there is no inherent smoot...
Semantic Reasoning for Scene Interpretation
Jensen, Lars B. W.; Baseski, Emre; Kalkan, Sinan; Pugeault, Nicolas; Woergoetter, Florentin; Krueger, Norbert (2008-01-01)
In this paper, we propose a hierarchical architecture for representing scenes, covering 2D and 3D aspects of visual scenes as well as the semantic relations between the different aspects. We argue that labeled graphs are a suitable representational framework for this representation and demonstrate its potential by two applications. As a first application, we localize lane structures by the semantic descriptors and their relations in a Bayesian framework. As the second application, which is in the context of...
Citation Formats
B. F. Cura and E. Sürer, “Scene Classification: A Comprehensive Study Combining Local and Global Descriptors,” 2019, Accessed: 00, 2020. [Online]. Available: