Approximate Fisher Kernels of Non-iid Image Models for Image Categorization
Date
2016-06-01
Author
Cinbiş, Ramazan Gökberk
Verbeek, Jakob
Schmid, Cordelia
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The bag-of-words (BoW) model treats images as sets of local descriptors and represents them by visual word histograms. The Fisher vector (FV) representation extends BoW by considering the first- and second-order statistics of local descriptors. In both representations, local descriptors are assumed to be independent and identically distributed (iid), which is a poor assumption from a modeling perspective. It has been experimentally observed that the performance of BoW and FV representations can be improved by employing discounting transformations such as power normalization. In this paper, we introduce non-iid models by treating the model parameters as latent variables which are integrated out, rendering all local regions dependent. Using the Fisher kernel principle, we encode an image by the gradient of the data log-likelihood w.r.t. the model hyper-parameters. Our models naturally generate discounting effects in the representations, suggesting that such transformations have proven successful because they closely correspond to the representations obtained for non-iid models. To enable tractable computation, we rely on variational free-energy bounds to learn the hyper-parameters and to compute approximate Fisher kernels. Our experimental evaluation validates that our models lead to performance improvements comparable to those obtained with power normalization, as employed in state-of-the-art feature aggregation methods.
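As a rough illustration of the discounting transformation mentioned in the abstract, the sketch below applies signed power normalization (element-wise z -> sign(z)|z|^rho) followed by L2 normalization to a toy visual word histogram. It is a minimal sketch only and does not implement the paper's non-iid models: the function names, the exponent rho = 0.5, and the toy counts are assumptions chosen for illustration.

```python
import math


def power_normalize(vec, rho=0.5):
    """Signed power normalization: z -> sign(z) * |z|**rho, element-wise.

    rho = 0.5 (signed square root) is an illustrative choice, not a value
    taken from the abstract.
    """
    return [math.copysign(abs(z) ** rho, z) for z in vec]


def l2_normalize(vec):
    """Scale the vector to unit Euclidean norm; leave an all-zero vector as is."""
    norm = math.sqrt(sum(z * z for z in vec))
    return [z / norm for z in vec] if norm > 0 else list(vec)


# Hypothetical visual word counts for one image (toy data).
counts = [16.0, 4.0, 1.0, 0.0]

# Large counts are compressed more strongly than small ones.
discounted = power_normalize(counts, rho=0.5)
representation = l2_normalize(discounted)
```

On the toy counts, the ratio between the most and least frequent non-zero visual words shrinks from 16 to 4, which is the kind of discounting effect the abstract refers to: repeated occurrences of the same visual word contribute progressively less to the representation.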
Subject Keywords
Statistical image representations, Object recognition, Image classification, Fisher kernels
URI
https://hdl.handle.net/11511/57819
Journal
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
DOI
https://doi.org/10.1109/tpami.2015.2484342
Collections
Department of Computer Engineering, Article