Senior Project Topics for 2011-2012 Academic Year

 

 

·        Image Annotation Based on Retrieval (3 students) (Will be jointly supervised with Dr. Ugur Gudukbay)

 

Makadia et al. [1] describe a propagation method in which a test image is annotated by propagating the tags of similar training images. Given a test image, similar images that are already annotated are first retrieved using a similarity measure. This is the retrieval phase. The annotations of the retrieved images are then propagated to the test image. Although the training set must be large enough to contain reasonable neighbors for the propagation to succeed, this approach has been shown to give good results.
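
The two phases can be sketched roughly as follows. This is only a minimal illustration: the Euclidean distance, the simple majority vote, the function names and the toy 2-D feature vectors are all assumptions made here, and the vote is not necessarily the label transfer rule used in [1].

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def propagate_tags(test_features, training_set, k=3, n_tags=2):
    """Annotate a test image with the most frequent tags of its k nearest training images.

    training_set is a list of (feature_vector, tags) pairs (hypothetical layout).
    """
    # Retrieval phase: rank annotated training images by distance to the test image.
    neighbors = sorted(training_set, key=lambda item: euclidean(test_features, item[0]))[:k]
    # Propagation phase: count how often each tag occurs among the neighbors.
    votes = Counter(tag for _, tags in neighbors for tag in tags)
    # Sort by descending count, then alphabetically so that ties are deterministic.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:n_tags]]


# Toy training set: made-up 2-D features standing in for real image descriptors.
training = [
    ([0.9, 0.1], ["sky", "sea"]),
    ([0.8, 0.2], ["sky", "beach"]),
    ([0.1, 0.9], ["forest", "tree"]),
    ([0.2, 0.8], ["forest", "grass"]),
]
print(propagate_tags([0.85, 0.15], training, k=2, n_tags=2))
```

With a larger training set, the choice of k and of the similarity measure will largely determine how reasonable the retrieved neighbors are.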

 

In [2], we describe a procedure based on constructing a codebook for each semantic label. In the training phase, for each semantic (annotation) label you retrieve all images that are tagged with it. Then, using a descriptor such as SIFT, you construct a codebook from those images. In the testing phase, given a test image, for each extracted feature you calculate the distance between it and the nearest visual word of a codebook. The sum of these distances gives the distance of the image to that particular codebook. Finally, you annotate the test image with the k labels whose codebooks have the lowest distance to it.
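
The testing phase can be sketched as below. The sketch assumes the per-label codebooks have already been built (for example by clustering the descriptors of each label's training images) and uses made-up 2-D vectors in place of real SIFT descriptors; the function names and the Euclidean distance are illustrative choices, not taken from [2].

```python
import math


def distance(a, b):
    """Euclidean distance between a feature and a visual word."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def image_to_codebook_distance(features, codebook):
    """Sum, over all image features, of the distance to the nearest visual word."""
    return sum(min(distance(f, word) for word in codebook) for f in features)


def annotate(features, codebooks, k=2):
    """Return the k labels whose codebooks have the lowest distance to the image."""
    scores = {label: image_to_codebook_distance(features, cb)
              for label, cb in codebooks.items()}
    # Sort by distance, then by label name so that ties are deterministic.
    return sorted(scores, key=lambda label: (scores[label], label))[:k]


# Hypothetical codebooks: one short list of visual words per semantic label.
codebooks = {
    "sky": [[0.0, 1.0], [0.1, 0.9]],
    "grass": [[1.0, 0.0], [0.9, 0.1]],
    "road": [[0.5, 0.5]],
}
test_image_features = [[0.0, 0.9], [0.2, 0.8], [0.6, 0.5]]
print(annotate(test_image_features, codebooks, k=2))
```

Note that the sum is not normalized by the number of features, so it is only comparable across codebooks for the same test image, which is all the ranking needs.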

 

Improvements can be made in the following areas:

 

o  Integrating contextual information [3].

o  Integrating the spatial information of the visual words, if the codebook framework is used [2].

o  Weighting the visual words according to their discriminative power.

o  Incorporating other feature descriptors into your framework.
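
As an illustration of the last item, one simple way to bring in a second descriptor is to scale the distances computed with each descriptor to a common range and average them. The sketch below assumes min-max scaling and equal weights; the descriptor names and distance values are invented for the example, and other fusion schemes are equally possible.

```python
def min_max_scale(values):
    """Scale a list of distances to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All candidates are equally distant under this descriptor.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combine_distances(distances_per_descriptor):
    """Average the scaled distances obtained with several descriptors.

    distances_per_descriptor maps a descriptor name to a list of distances,
    one per candidate (training image or label codebook), in the same order.
    """
    scaled = [min_max_scale(d) for d in distances_per_descriptor.values()]
    n = len(scaled)
    return [sum(column) / n for column in zip(*scaled)]


# Invented distances from one test image to three candidates.
distances = {
    "sift": [10.0, 30.0, 20.0],
    "color_histogram": [0.2, 0.1, 0.4],
}
combined = combine_distances(distances)
# Index of the candidate with the lowest combined distance.
best = min(range(len(combined)), key=combined.__getitem__)
print(combined, best)
```

Scaling first keeps a descriptor with numerically large distances from dominating the average.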

 

References

[1] Ameesh Makadia, Vladimir Pavlovic, and Sanjiv Kumar (2010). Baselines for Image Annotation. In International Journal of Computer Vision.

 

[2] Fatih Cakir (2011). Nearest Neighbor based Metric Functions for Indoor Scene Recognition. MS thesis. Bilkent University.

 

[3] Yu Xiang, Xiangdong Zhou, Zuotao Liu, Tat-Seng Chua, and Chong-Wah Ngo (2010). Semantic Context Modeling with Maximal Margin Conditional Random Fields for Automatic Image Annotation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3368-3375, 13-18 June 2010.

 

 

·  BilAudio-7: MPEG-7 Compatible Audio Search Engine (3 students) (Will be jointly supervised with Dr. Ugur Gudukbay)

 

In this project, you will develop an MPEG-7 compatible audio indexing and retrieval system (BilAudio-7). You will be building upon a partially completed version of BilAudio-7. The system will consist of 3 main components:

  1. Audio Indexing: Segmenting/annotating audio data and producing an MPEG-7 compatible XML file, which will be stored in an XML database.
  2. Query Interface: A graphical user interface which will enable users to formulate keyword and similarity-based audio queries on the audio database and view/manipulate the search results.
  3. Query Processing: Receiving user queries, executing them and returning the most relevant list of audio segments to the clients.
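
To make the query processing component more concrete, here is a rough sketch of how a keyword and/or similarity-based query could be executed. Everything in it is an assumption made for illustration: the in-memory list of dictionaries stands in for the XML database, the field names are hypothetical, and cosine similarity over a small feature vector stands in for whichever MPEG-7 audio descriptors and matching method the system ends up using.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def process_query(segments, keywords=None, example_features=None, top_n=3):
    """Return the ids of the top_n audio segments, most relevant first."""
    results = []
    for seg in segments:
        # Keyword part: keep only segments whose annotation contains every keyword.
        if keywords and not all(k.lower() in seg["annotation"].lower() for k in keywords):
            continue
        # Similarity part: score against the example if one is given, else a constant.
        score = cosine_similarity(example_features, seg["features"]) if example_features else 1.0
        results.append((score, seg["id"]))
    # Highest score first; ties are broken by segment id.
    results.sort(key=lambda r: (-r[0], r[1]))
    return [seg_id for _, seg_id in results[:top_n]]


# Hypothetical indexed segments (annotation text plus a made-up feature vector).
segments = [
    {"id": "seg1", "annotation": "male speech news", "features": [1.0, 0.0, 0.2]},
    {"id": "seg2", "annotation": "piano music", "features": [0.0, 1.0, 0.1]},
    {"id": "seg3", "annotation": "female speech interview", "features": [0.9, 0.1, 0.3]},
]
print(process_query(segments, keywords=["speech"]))
print(process_query(segments, example_features=[0.0, 0.9, 0.2], top_n=1))
```

In the real system the segments would be read from the MPEG-7 XML descriptions produced by the indexing component rather than from a Python list.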

 

 

·  BilMAT: MPEG-7 Annotation Tool (3 students) (Will be jointly supervised with Dr. Ugur Gudukbay)

 

MPEG-7 is an ISO standard developed by the MPEG group to standardize multimedia indexing and retrieval and to make multimedia data (audio/video/image) as searchable as text. To index multimedia data, it must be processed and its low-level and high-level features extracted. In this project, you will develop an MPEG-7 annotation tool to annotate audio, video and images. The tool should have an easy-to-use GUI and facilities that minimize the human effort and time needed for annotation.

  1. C++ code is available for the extraction of low-level MPEG-7 features.
  2. The system will be developed in C++ and should run on both Linux and Windows.