 
Object Recognition as Machine Translation


We propose a new approach to the object recognition problem, motivated by the recent availability of large annotated image collections. This approach considers object recognition as the translation of image regions to words, similar to the translation of text from one language to another. The lexicon for the translation is learned from large annotated image collections, which consist of images that are associated with text. Once learned, the correspondences between words and image regions can be used to predict words corresponding to particular image regions (region naming), or words associated with whole images (auto-annotation).
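The lexicon-learning idea above can be illustrated with a small sketch. It assumes each image region has already been mapped to a discrete token (called a "blob" here), and it estimates a word-given-blob table with an EM procedure in the style of IBM Model 1 from statistical machine translation. The toy corpus, the blob names, and the functions `learn_lexicon`, `name_region`, and `annotate` are all hypothetical and chosen for illustration; this page does not specify the exact training procedure, so this should be read as one plausible instance rather than the authors' implementation.

```python
from collections import defaultdict


def learn_lexicon(corpus, iterations=20):
    """Estimate t[blob][word] ~ p(word | blob) with IBM Model 1 style EM.

    corpus: list of (blobs, words) pairs, one pair per annotated image.
    Which blob goes with which word is NOT given; EM has to infer it.
    """
    vocab = sorted({w for _, words in corpus for w in words})
    all_blobs = sorted({b for blobs, _ in corpus for b in blobs})
    # Start from a uniform table: every blob is equally likely to emit any word.
    t = {b: {w: 1.0 / len(vocab) for w in vocab} for b in all_blobs}

    for _ in range(iterations):
        counts = defaultdict(lambda: defaultdict(float))
        # E-step: split one unit of evidence for each word among the
        # blobs of the same image, in proportion to the current table.
        for blobs, words in corpus:
            for w in words:
                norm = sum(t[b][w] for b in blobs)
                if norm == 0.0:
                    continue
                for b in blobs:
                    counts[b][w] += t[b][w] / norm
        # M-step: renormalise the expected counts per blob.
        for b in all_blobs:
            total = sum(counts[b].values())
            if total > 0.0:
                t[b] = {w: counts[b][w] / total for w in vocab}
    return t


def name_region(t, blob):
    """Region naming: the most probable word for a single blob."""
    return max(t[blob], key=t[blob].get)


def annotate(t, blobs, k=2):
    """Auto-annotation: the k words with the highest average probability
    over all blobs of an image."""
    scores = defaultdict(float)
    for b in blobs:
        for w, p in t[b].items():
            scores[w] += p / len(blobs)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy corpus: blob tokens paired with image-level annotations.
corpus = [
    (["orange_stripes", "green_texture"], ["tiger", "grass"]),
    (["green_texture", "blue_smooth"], ["grass", "sky"]),
    (["orange_stripes", "blue_smooth"], ["tiger", "sky"]),
]
lexicon = learn_lexicon(corpus)
```

With this toy corpus, `name_region(lexicon, "orange_stripes")` picks the word that co-occurs most consistently with that blob, and `annotate(lexicon, blobs)` ranks words for a whole image. A real system would first need a segmentation and feature-quantization step to produce the blob tokens, which this sketch leaves out.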




[Figure: example image with the annotation "tiger cat grass"]

People

Pinar Duygulu (Bilkent University)
Kobus Barnard (University of Arizona)
David Forsyth (UC Berkeley)
Nando de Freitas (University of British Columbia)

Publications

  • Matching Words and Pictures (.ps.gz)
    Kobus Barnard, Pinar Duygulu, David Forsyth, Nando de Freitas, David Blei, Michael Jordan
    Journal of Machine Learning Research, Vol. 3, pp. 1107-1135, Special Issue on Machine Learning Methods for Text and Images, 2003
    Data used in this study

  • Object Recognition as Machine Translation: Learning a lexicon for a fixed image vocabulary (.ps.gz)
    Pinar Duygulu, Kobus Barnard, Nando de Freitas, David Forsyth
    European Conference on Computer Vision (ECCV), Copenhagen, 2002
    Best Paper Award in Cognitive Vision
    (also published in Lecture Notes in Computer Science, Volume 2353, p. 97)
    Data used in this study

  • Modeling the statistics of image features and associated text
    Kobus Barnard, Pinar Duygulu, David Forsyth
    SPIE Electronic Imaging 2002, Document Recognition and Retrieval IX, 20-25 January 2002, San Jose, California, USA

  • Massive Multimedia Databases: a new frontier.
    Nando de Freitas, Kobus Barnard, Pinar Duygulu, David Forsyth
    7th Valencia International Meeting on Bayesian Statistics / 2002 ISBA International Meeting, June 2002, Spain

Related links

  • Computer Vision meets Digital Libraries, UC Berkeley Digital Library Project

  • Statistical Multimedia Learning Group at the University of British Columbia