Approximate Fisher Kernels of Non-iid Image Models for Image Categorization
Date
2016-06-01
Author
Cinbiş, Ramazan Gökberk
Schmid, Cordelia
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The bag-of-words (BoW) model treats images as sets of local descriptors and represents them by visual word histograms. The Fisher vector (FV) representation extends BoW by considering the first- and second-order statistics of local descriptors. Both representations assume that local descriptors are independently and identically distributed (iid), which is a poor assumption from a modeling perspective. It has been observed experimentally that the performance of BoW and FV representations can be improved by discounting transformations such as power normalization. In this paper, we introduce non-iid models by treating the model parameters as latent variables that are integrated out, rendering all local regions dependent. Using the Fisher kernel principle, we encode an image by the gradient of the data log-likelihood with respect to the model hyper-parameters. Our models naturally generate discounting effects in the representations, suggesting that such transformations have proven successful because they closely correspond to the representations obtained for non-iid models. To enable tractable computation, we rely on variational free-energy bounds to learn the hyper-parameters and to compute approximate Fisher kernels. Our experimental evaluation validates that our models lead to performance improvements comparable to those obtained with power normalization, as employed in state-of-the-art feature aggregation methods.
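The discounting transformation mentioned in the abstract, power normalization, can be illustrated with a minimal sketch. The signed-power form followed by L2 normalization is the commonly used variant; the function name `power_normalize`, the exponent `rho=0.5`, and the toy histogram are assumptions made here for illustration and are not taken from the paper.

```python
import numpy as np


def power_normalize(v, rho=0.5):
    """Signed power normalization followed by L2 normalization.

    `rho` is a hypothetical default (0.5 gives the signed square root);
    the abstract does not state a value.
    """
    v = np.asarray(v, dtype=float)
    # Element-wise sign(z) * |z|^rho: large entries are discounted
    # relative to small ones when 0 < rho < 1.
    out = np.sign(v) * np.abs(v) ** rho
    norm = np.linalg.norm(out)
    # Leave an all-zero vector unchanged to avoid dividing by zero.
    return out / norm if norm > 0 else out


# Toy visual word histogram (made-up counts, for illustration only).
hist = np.array([16.0, 4.0, 1.0, 0.0])
normalized = power_normalize(hist)
print(normalized)
```

With `rho=0.5`, the ratio between the largest and the smallest non-zero count in the toy histogram shrinks from 16 to 4, which is the kind of discounting of frequent visual words that the abstract refers to.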
Subject Keywords
Statistical image representations, Object recognition, Image classification, Fisher kernels
URI
https://hdl.handle.net/11511/57819
Journal
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
DOI
https://doi.org/10.1109/tpami.2015.2484342
Collections
Department of Computer Engineering, Article
Citation (IEEE)
R. G. Cinbiş and C. Schmid, “Approximate Fisher Kernels of Non-iid Image Models for Image Categorization,” IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, pp. 1084–1098, 2016, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/57819.