Matching words and pictures

Date

2003-08-15

Author

Barnard, K
Duygulu, P
Forsyth, D
de Freitas, N
Blei, DM
Jordan, MI

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

We present a new approach for modeling multi-modal data sets, focusing on the specific case of segmented images with associated text. Learning the joint distribution of image regions and words has many applications. We consider in detail predicting words associated with whole images (auto-annotation) and words corresponding to particular image regions (region naming). Auto-annotation might help organize and access large collections of images. Region naming is a model of object recognition as a process of translating image regions to words, much as one might translate from one language to another. Learning the relationships between image regions and semantic correlates (words) is an interesting example of multi-modal data mining, particularly because it is typically hard to apply data mining techniques to collections of images. We develop a number of models for the joint distribution of image regions and words, including several which explicitly learn the correspondence between regions and words. We study multi-modal and correspondence extensions to Hofmann's hierarchical clustering/aspect model, a translation model adapted from statistical machine translation (Brown et al.), and a multi-modal extension to mixture of latent Dirichlet allocation (MoM-LDA). All models are assessed using a large collection of annotated images of real scenes. We study in depth the difficult problem of measuring performance. For the annotation task, we look at prediction performance on held-out data. We present three alternative measures, oriented toward different types of task. Measuring the performance of correspondence methods is harder, because one must determine whether a word has been placed on the right region of an image. We can use annotation performance as a proxy measure, but accurate measurement requires hand-labeled data, and thus must occur on a smaller scale. We show results using both an annotation proxy and manually labeled data.
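The translation view of region naming mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it is a generic IBM Model 1 style EM procedure, and it assumes that image regions have already been quantized into discrete tokens. The function names, token names, and toy data below are all hypothetical.

```python
from collections import defaultdict


def train_translation_table(pairs, n_iter=20):
    """Estimate p(word | region_token) with IBM Model 1 style EM.

    pairs: list of (region_tokens, words), one entry per annotated image.
    Returns a dict-like table keyed by (word, region_token).
    """
    vocabulary = {w for _, words in pairs for w in words}
    # Start from a uniform distribution over the vocabulary.
    table = defaultdict(lambda: 1.0 / len(vocabulary))
    for _ in range(n_iter):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: softly assign each word to the regions of its own image.
        for regions, words in pairs:
            for w in words:
                norm = sum(table[(w, r)] for r in regions)
                for r in regions:
                    frac = table[(w, r)] / norm
                    count[(w, r)] += frac
                    total[r] += frac
        # M-step: renormalize the expected counts per region token.
        table = defaultdict(
            float, {(w, r): c / total[r] for (w, r), c in count.items()}
        )
    return table


def name_region(table, region, vocabulary):
    """Region naming: pick the most probable word for one region token."""
    return max(sorted(vocabulary), key=lambda w: table[(w, region)])


# Hypothetical toy data: each image is (quantized region tokens, caption words).
toy_pairs = [
    (["b_sky", "b_sun"], ["sky", "sun"]),
    (["b_sky", "b_water"], ["sky", "water"]),
    (["b_grass", "b_water"], ["grass", "water"]),
]
toy_vocabulary = {w for _, words in toy_pairs for w in words}
toy_table = train_translation_table(toy_pairs)
```

Calling `name_region(toy_table, "b_sky", toy_vocabulary)` then returns the word that the table associates most strongly with that region token. Although no image says which word belongs to which region, co-occurrence across images is enough for EM to separate them in this toy case.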

Subject Keywords

Physics and Astronomy (miscellaneous)

URI

https://hdl.handle.net/11511/68274

Journal

JOURNAL OF MACHINE LEARNING RESEARCH

DOI

https://doi.org/10.1162/153244303322533214

Collections

Department of Computer Engineering, Article

Citation Formats

K. Barnard, P. Duygulu, D. Forsyth, N. de Freitas, D. Blei, and M. Jordan, “Matching words and pictures,” JOURNAL OF MACHINE LEARNING RESEARCH, pp. 1107–1135, 2003, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/68274.