Bilkent University
Department of Computer Engineering
MS THESIS PRESENTATION

 

Improving Educational Search and Question Answering

 

Tolga Yılmaz
MS Student
(Supervisor: Prof. Dr. Özgür Ulusoy)
Computer Engineering Department
Bilkent University

Students use general web search engines (GSEs) as their primary research tool when trying to find answers to school-related questions. Although GSEs serve the general population well, they may return results that fall outside the educational context. Community question answering (CQ&A) websites, a rising trend, are the secondary choice for students who seek answers from their peers online. We focus on discovering possible improvements to educational search by leveraging both of these information sources.

The first part of our work involves Q&A websites. In order to gain contextual and behavioral insights, we have extracted the content of a fairly widely used Q&A website with a scraper we implemented. We analyze the content in terms of user behavior and try to understand to what extent educational Q&A differs from general-purpose Q&A.

In the second part, we implement a classifier for educational questions. This classifier is built with an ensemble method that combines several standard learning algorithms with retrieval-based ones that utilize external resources. We also build a query expander to facilitate classification. We further improve the classification using search engine results.

In the third part, in order to find out whether search engine ranking can be improved in the education domain using the classification model, we collect and label a set of query results retrieved from a GSE. We use five ad-hoc methods to improve search ranking based on the idea that the query-document category relation is an indicator of relevance. We evaluate these methods on various query sets and show that some of the methods significantly improve the rankings in the education domain.
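To make the idea of category-based re-ranking concrete, the following is a minimal sketch and not one of the five methods evaluated in the thesis. It assumes a hypothetical classifier score `p_edu` for each document and for the query, and a hypothetical blending weight `alpha`; results whose category agrees with the query's category are moved up.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    rank: int      # 1-based position returned by the search engine
    p_edu: float   # hypothetical classifier probability that the page is educational


def rerank(results, query_p_edu, alpha=0.5):
    """Blend the original rank with query-document category agreement.

    alpha is an assumed weight: 0 keeps the original order,
    1 orders purely by category agreement.
    """
    n = len(results)

    def score(r):
        rank_score = (n - r.rank + 1) / n              # 1.0 for the top result
        agreement = 1.0 - abs(query_p_edu - r.p_edu)   # 1.0 when categories match
        return (1 - alpha) * rank_score + alpha * agreement

    return sorted(results, key=score, reverse=True)


# Toy data, invented for illustration only.
results = [
    Result("a.example/shopping", 1, 0.10),
    Result("b.example/lesson", 2, 0.90),
    Result("c.example/news", 3, 0.20),
]
reranked = rerank(results, query_p_edu=0.95, alpha=0.7)
print([r.url for r in reranked])
```

With an educational query (`query_p_edu` close to 1), the educational page moves above the higher-ranked non-educational one; setting `alpha=0` leaves the original order unchanged.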

In the last part, we focus on educational spell checking. In educational search systems, it is common for users to make spelling mistakes. First, we analyze actual query logs of two commercial search engines in the education domain for spelling mistakes, using five well-known spell-correction tools that are not education-specific and lack the terms used in the education field. We then show that by extending the spell-check dictionary of one of them, even with a small education-oriented word list, one can improve the precision, recall and F1 values of a spell checker.
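As a rough illustration of why a domain word list helps, the sketch below evaluates a toy dictionary-lookup checker before and after adding education terms. The word lists, the labeled tokens, and the `evaluate` helper are all invented for this example; the tools and query logs used in the thesis are not reproduced here. The sketch measures error detection only, so in this toy setting only precision and F1 change.

```python
def evaluate(dictionary, labeled_tokens):
    """Return (precision, recall, F1) for misspelling detection.

    A token is flagged as misspelled when it is absent from the dictionary.
    labeled_tokens is a list of (token, is_misspelled) pairs.
    """
    tp = fp = fn = 0
    for token, is_misspelled in labeled_tokens:
        flagged = token.lower() not in dictionary
        if flagged and is_misspelled:
            tp += 1
        elif flagged and not is_misspelled:
            fp += 1   # a correct word wrongly flagged, e.g. a domain term
        elif not flagged and is_misspelled:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical general-purpose dictionary and education word list.
general = {"the", "of", "cell", "plant", "area"}
education_terms = {"photosynthesis", "mitosis", "isosceles"}

# Hypothetical labeled query tokens: (token, is_misspelled).
tokens = [
    ("photosynthesis", False),
    ("mitosis", False),
    ("plnat", True),
    ("cell", False),
    ("isoceles", True),
    ("area", False),
]

base = evaluate(general, tokens)
extended = evaluate(general | education_terms, tokens)
print("general dictionary: ", base)
print("extended dictionary:", extended)
```

The general dictionary flags the correctly spelled domain terms as errors, which lowers precision; adding the education terms removes those false positives.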

 

DATE: 30 June, 2016, Thursday @ 13:40
PLACE: EA-409