Bilkent University
Department of Computer Engineering


Novelty Detection in Topic Tracking


Cem Aksoy
MSc Student
Computer Engineering Department
Bilkent University

News portals provide many services to news consumers, such as information retrieval, personalized information filtering, summarization, and news clustering. Additionally, many news portals that draw on multiple sources enrich their content and so enable users to evaluate developments from different perspectives. However, the increasing number of sources and incoming news items makes it difficult for news consumers to find news of interest. Different types of organizational operations are applied to ease browsing over the news. New event detection and tracking (NEDT) is one of these operations; it aims to organize news with respect to the events that they report. NEDT by itself may still not be enough to satisfy news consumers' needs, because the use of multiple sources can cause information to be repeated across the tracking news of a topic. In this thesis, we investigate the use of novelty detection (ND) in tracking the news of a topic. To this end, we built a Turkish ND experimental collection consisting of 59 topics with an average of 51 tracking news items per topic. We propose an effective, efficient, and language-independent ND method based on the cover coefficient concept. We report experimental results on both the Turkish dataset and the TREC Novelty Track 2004 dataset. Additionally, we experiment with category-based threshold learning, which has not previously been studied in the ND literature. We also provide some experimental pointers for ND in Turkish, such as restricting document vector lengths and using preprocessing methods.


DATE: 23 July, 2010, Friday @ 14:40