Relaxed Spatio-Temporal Deep Feature Aggregation for Real-Fake Expression Prediction
Date
2017-10-29
Author
Ozkan, Savas
Akar, Gözde
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Frame-level visual features are generally aggregated over time with techniques such as LSTM, Fisher Vectors, and NetVLAD to produce a robust video-level representation. We introduce a learnable aggregation technique whose primary objective is to retain, in the representation, the short-time temporal structure between frame-level features and their spatial interdependencies. It can also be easily adapted to cases where training samples are very scarce. We evaluate the method on a real-fake expression prediction dataset to demonstrate its superiority. Our method obtains a score of 65% on the test dataset in the official MAP evaluation, only one misclassified decision behind the best reported result in the ChaLearn Challenge (66.7%). Lastly, we believe that this method can be extended to other problems, such as action/event recognition, in the future.
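As a rough illustration of what aggregating frame-level features into a video-level representation means, the sketch below mean-pools features over short overlapping windows and then max-pools across windows. This is a generic, non-learnable baseline written for orientation only; it is not the authors' method, and the function name, window size, stride, and feature dimensions are all assumptions.

```python
import numpy as np


def aggregate_short_windows(frames: np.ndarray, window: int = 4, stride: int = 2) -> np.ndarray:
    """Pool frame-level features of shape (T, D) into one video-level vector of shape (D,).

    Hypothetical baseline: not the learnable aggregation described in the abstract.
    """
    num_frames, _ = frames.shape
    if num_frames < window:
        raise ValueError("need at least `window` frames")
    # Mean-pool each short window so neighbouring frames are summarised together.
    starts = range(0, num_frames - window + 1, stride)
    window_means = np.stack([frames[s:s + window].mean(axis=0) for s in starts])
    # Max-pool across windows to obtain a fixed-size video-level representation.
    return window_means.max(axis=0)


# Usage with random stand-in features: 16 frames, 8 dimensions each (assumed sizes).
rng = np.random.default_rng(0)
frame_features = rng.normal(size=(16, 8))
video_vector = aggregate_short_windows(frame_features)
print(video_vector.shape)
```

Pooling within short windows keeps some local temporal context that a single global average would discard, which is the property the abstract emphasises; a learnable method would replace the fixed mean and max with trained parameters.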
Subject Keywords
Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic
URI
https://hdl.handle.net/11511/47843
DOI
https://doi.org/10.1109/iccvw.2017.366
Collections
Department of Electrical and Electronics Engineering, Conference / Seminar
Suggestions
Adaptive mean-shift for automated multi object tracking
Beyan, C.; Temizel, Alptekin (2012-01-01)
Mean-shift tracking plays an important role in computer vision applications because of its robustness, ease of implementation and computational efficiency. In this study, a fully automatic multiple-object tracker based on the mean-shift algorithm is presented. Foreground is extracted using a mixture of Gaussians followed by shadow and noise removal to initialise the object trackers and also used as a kernel mask to make the system more efficient by decreasing the search area and the number of iterations to conve...
Enhancing the accuracy of the interpolations and anterpolations in MLFMA
Ergül, Özgür Salih (Institute of Electrical and Electronics Engineers (IEEE), 2006-01-01)
We present an efficient technique to reduce the interpolation and anterpolation (transpose interpolation) errors in the aggregation and disaggregation processes of the multilevel fast multipole algorithm (MLFMA), which is based on the sampling of the radiated and incoming fields over all possible solid angles, i.e., all directions on the sphere. The fields sampled on the sphere are subject to various operations, such as interpolation, aggregation, translation, disaggregation, anterpolation, and integration....
3-D structure assisted reference view generation for H.264 based multi-view video coding
Gedik, O. Serdar; Oezkalayci, Burak; Alatan, Abdullah Aydın (2007-06-13)
A 3D geometry-based multi-view video coding (MVC) method is proposed. In order to utilize the spatial redundancies between multiple views, the scene geometry is estimated as dense depth maps. The dense depth estimation problem is modeled by using a Markov random field (MRF) and solved via the belief propagation algorithm. Relying on these depth maps of the scene, novel view estimates of the intermediate views of the multi-view set are obtained with a 3D warping algorithm, which also performs hole-filling in ...
3D object recognition from range images using transform invariant object representation
Akagündüz, Erdem; Ulusoy, İlkay (Institution of Engineering and Technology (IET), 2010-10-28)
3D object recognition is performed using a scale and orientation invariant feature extraction method and a scale and orientation invariant topological representation. 3D surfaces are represented by sparse, repeatable, informative and semantically meaningful 3D surface structures, which are called multiscale features. These features are extracted with their scale (metric size and resolution) using the classified scale-space of 3D surface curvatures. Triplets of these features are used to represent the surfac...
Fusion of Image Segmentations under Markov Random Fields
Karadag, Ozge Oztimur; Yarman Vural, Fatoş Tunay (2014-08-28)
In this study, a fast and efficient consensus segmentation method is proposed which fuses a set of baseline segmentation maps under an unsupervised Markov Random Fields (MRF) framework. The degree of consensus among the segmentation maps is estimated as the relative frequency of co-occurrences among the adjacent segments. Then, these relative frequencies are used to construct the energy function of an unsupervised MRF model. It is well-known that the MRF framework is commonly used for formulating the spatial r...
Citation Formats
IEEE
S. Ozkan and G. Akar, “Relaxed Spatio-Temporal Deep Feature Aggregation for Real-Fake Expression Prediction,” 2017, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/47843.