Project topics for CS490 (Spring 2008)

 

1. Boosting the performance of search engines

 

Typing a few keywords and “googling” for something that you are interested in is a common part of life today.

But do you ever wonder how a search engine finds a single piece of information (say, the latest album of a megastar,

the picture of a beautiful countryside, or some clues about your algorithms homework!) so quickly among billions of

documents? What sort of indexes, algorithms and architectural tricks make this possible? And finally, is it possible to

make it faster?

 

If your answer is “yes!” to the above questions, this research project will be a great opportunity to get a background

on all of these, to investigate new ideas in this exciting field, and, who knows, to obtain results good enough for

publication... If you are really ambitious, you can even find something that can make you rich!!!

 

In this study, you will try to boost search engine performance by combining two popular techniques, namely, index

caching and pruning. This is a well-defined research project with the following stages:

 

·        Getting familiar with the topic: This requires some guided reading about collection indexing, index caching and

 index pruning.

·        Research: You will first investigate a hybrid approach that combines caching and pruning techniques.

New ideas may always arise, and are always welcome!

·        Implementation: You can adapt code either developed within our research group or publicly available

on the net. You may also write a reasonable amount of code for implementing the “new” ideas. In any case,

you should be good at programming in at least one of C/C++ or Java.

·        Experimentation: You will evaluate how good the proposed ideas are with respect to the state-of-the-art solutions.
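
To make the two techniques in the stages above concrete, here is a toy sketch in Python. It is only an illustration under simplifying assumptions, not the method used in the project: the index is an in-memory dictionary, "pruning" simply keeps the highest-frequency postings of each term, and "caching" is a plain LRU cache of posting lists. All names (build_index, prune_index, PostingListCache) are made up for this example.

```python
from collections import OrderedDict, defaultdict


def build_index(docs):
    """Map each term to a list of (doc_id, term_frequency) postings,
    sorted by decreasing frequency (ties broken by doc_id)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return {t: sorted(p.items(), key=lambda x: (-x[1], x[0]))
            for t, p in index.items()}


def prune_index(index, keep):
    """Toy static pruning: keep only the `keep` top postings per term."""
    return {t: postings[:keep] for t, postings in index.items()}


class PostingListCache:
    """LRU cache of posting lists placed in front of a (slower) index."""

    def __init__(self, index, capacity):
        self.index = index
        self.capacity = capacity
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, term):
        if term in self.cache:
            self.cache.move_to_end(term)  # mark as most recently used
            self.hits += 1
            return self.cache[term]
        self.misses += 1
        postings = self.index.get(term, [])
        self.cache[term] = postings
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
        return postings


docs = {1: "a b a", 2: "a c", 3: "a a a b"}
pruned = prune_index(build_index(docs), keep=2)
cache = PostingListCache(pruned, capacity=1)
for term in ["a", "a", "b", "a"]:
    cache.get(term)
print(pruned["a"], cache.hits, cache.misses)
```

A hybrid approach could, for example, cache pruned posting lists so that more terms fit in the same amount of memory; measuring the effect of such choices on hit rate and result quality is the kind of question the experimentation stage would address.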

 

Expected output: We expect this research to shed light on the overall system performance in an environment

where index caching and pruning are used together. The result is expected to be at least worth writing up as a formal

technical report. Depending on the quality and originality of the work, and the final results, the entire study can also

be considered for submission to an international conference and/or journal. If accepted, you will have your first scientific

paper, even before applying for an MS or PhD degree!

 

   

2. Image/Video crawler  
 
This project involves crawling the web to collect images/videos for database construction. The system to be developed 
for crawling should have utilities to collect specific types of media along with the nearby relevant labels. 
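
As a rough illustration of collecting media along with nearby labels, the Python sketch below pulls image URLs out of an HTML page and uses each image's alt (or title) text as its label. This is only one possible starting point, assuming that the alt text counts as a "nearby label"; the class and function names are made up for this example, and a real crawler would also need page fetching, link following, politeness rules and video handling.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageLabelParser(HTMLParser):
    """Collect (absolute_image_url, label) pairs from <img> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        src = attr.get("src")
        if src:
            # Assumption: the alt or title text serves as the label.
            label = (attr.get("alt") or attr.get("title") or "").strip()
            self.images.append((urljoin(self.base_url, src), label))


def extract_images(html, base_url):
    """Return all labelled images found in one HTML document."""
    parser = ImageLabelParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.images


page = '<p>Hills</p><img src="/pics/hill.jpg" alt="Green hills">'
print(extract_images(page, "http://example.com/page/index.html"))
```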
 
 
3. Paper search & bibliography generation  
 
We often need to search the web for publications relevant to our research. The system to be developed 
in this project will search a list of web sites (or use Google search results) and present the collected information 
in an easy-to-use format. 
 
Some of the facilities provided by this system will be:
 
·        A link to the PDF file of the paper if it exists,
·        Abstract of the paper,
·        BibTeX entry to be included as a reference in LaTeX documents, 
·        etc.
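
For the BibTeX facility listed above, a minimal Python sketch of turning collected metadata into an entry might look like the following. The function name, the citation key and the sample paper are invented purely for illustration; a real tool would also need to escape special characters and choose entry types and keys consistently.

```python
def to_bibtex(key, entry_type, fields):
    """Format a BibTeX entry from a dict of field names to values.
    Fields with empty values are skipped."""
    lines = ["@%s{%s," % (entry_type, key)]
    for name, value in fields.items():
        if value:
            lines.append("  %s = {%s}," % (name, value))
    lines.append("}")
    return "\n".join(lines)


# Made-up example record, not a real publication.
print(to_bibtex("doe2008example", "inproceedings", {
    "author": "Doe, Jane",
    "title": "An Example Paper Title",
    "year": "2008",
    "pages": "",
}))
```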
 
If done properly, this tool can be very useful and, if made available online, could be used by many researchers.