Object Recognition as Machine Translation
We propose a new approach to the object recognition problem, motivated by the recent availability of large annotated image collections. This approach considers object recognition as the translation of image regions to words, similar to the translation of text from one language to another. The lexicon for the translation is learned from large annotated image collections, which consist of images that are associated with text. Once learned, the correspondences between words and image regions can be used to predict words corresponding to particular image regions (region naming), or words associated with whole images (auto-annotation).
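The lexicon-learning idea above can be illustrated with a minimal sketch. It assumes that image regions have already been quantized into discrete tokens (the names such as `blue_patch` are hypothetical), and it uses an EM procedure in the style of IBM Model 1 from statistical machine translation to estimate p(word | region) from co-occurrence alone. The toy data, function names, and iteration count are illustrative assumptions, not details taken from this page.

```python
from collections import defaultdict


def learn_lexicon(pairs, iterations=20):
    """Estimate t[region][word] = p(word | region) with Model-1-style EM.

    pairs: list of (region_tokens, words); each image contributes one pair.
    Region tokens are assumed to be discrete labels (hypothetical here).
    """
    vocab = sorted({w for _, words in pairs for w in words})
    regions = sorted({r for regs, _ in pairs for r in regs})

    # Start from a uniform translation table.
    t = {r: {w: 1.0 / len(vocab) for w in vocab} for r in regions}

    for _ in range(iterations):
        counts = {r: defaultdict(float) for r in regions}

        # E-step: share each word of an image among that image's regions,
        # in proportion to the current translation probabilities.
        for regs, words in pairs:
            for w in words:
                norm = sum(t[r][w] for r in regs)
                for r in regs:
                    counts[r][w] += t[r][w] / norm

        # M-step: renormalize the expected counts for each region.
        for r in regions:
            total = sum(counts[r].values())
            if total > 0:
                t[r] = {w: counts[r][w] / total for w in vocab}
    return t


def name_region(t, region):
    """Region naming: the most probable word for a single region token."""
    return max(t[region], key=t[region].get)


def annotate(t, regs, k=2):
    """Auto-annotation: the k words with the highest total score over regions."""
    scores = defaultdict(float)
    for r in regs:
        for w, p in t[r].items():
            scores[w] += p
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy annotated collection: the word-to-region alignment is never given.
pairs = [
    (["blue_patch", "green_patch"], ["sky", "grass"]),
    (["blue_patch", "orange_patch"], ["sky", "sun"]),
    (["green_patch", "striped_patch"], ["grass", "tiger"]),
]

t = learn_lexicon(pairs)
print(name_region(t, "blue_patch"))
print(annotate(t, ["blue_patch", "orange_patch"]))
```

The sketch never sees which word belongs to which region; the correspondences emerge because a region token and a word that appear together across many images accumulate more expected count than pairs that co-occur only once.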
People
Pinar Duygulu (Bilkent University)
Kobus Barnard (University of Arizona)
David Forsyth (UC Berkeley)
Nando de Freitas (University of British Columbia)

Publications
Kobus Barnard, Pinar Duygulu, David Forsyth, Nando de Freitas, David Blei, Michael Jordan. Journal of Machine Learning Research, Vol. 3, pp. 1107-1135, Special Issue on Machine Learning Methods for Text and Images, 2003.
Data used in this study

Pinar Duygulu, Kobus Barnard, Nando de Freitas, David Forsyth. European Conference on Computer Vision (ECCV), Copenhagen, 2002. Best paper in Cognitive Vision award (also published in Lecture Notes in Computer Science, Volume 2353, pp. 97).
Data used in this study

Kobus Barnard, Pinar Duygulu, David Forsyth. SPIE Electronic Imaging 2002, Document Recognition and Retrieval IX, 20-25 January 2002, San Jose, California, USA.

Nando de Freitas, Kobus Barnard, Pinar Duygulu, David Forsyth. 7th Valencia International Meeting on Bayesian Models for statistics/2002 ISBA International Meeting, June 2002, Spain.

Related links
Computer Vision meets Digital Libraries, UC Berkeley Digital Library Project
Statistical Multimedia Learning Group at the University of British Columbia