Applying Information Retrieval Paradigms in Natural Language to SQL Translation
(Project No: 118E724)

·         SPONSOR: Scientific and Technological Research Council of Turkey - TÜBİTAK

·         ABSTRACT: The goal of the project is to develop new algorithms that integrate well-known information retrieval paradigms into Natural Language Interfaces to Databases (NLIDBs), making search over relational databases more user-friendly and effective.

Natural language is the ideal way for users to express their search needs. Although natural language interfaces are quite effective in web search, providing them for Relational Database Management Systems (RDBMSs) is challenging. The query language of RDBMSs is SQL, a powerful and flexible language for expressing user intent. Casual users, however, cannot query relational databases because doing so requires a technical background. Natural language to SQL translation has therefore become crucial for providing such interfaces to users.

Having a natural language query interface is the first step toward giving users a friendly search experience. The information retrieval field has introduced well-known paradigms that aim to make search more user-friendly. However, these paradigms require certain modifications to be applicable in NLIDBs.

In this project, we aim to develop information retrieval based algorithms for use in the translation context, making translation more user-friendly and more effective. The information retrieval paradigms involved in our algorithms are listed below:

1. Query auto-completion, to resolve ambiguities in the query and increase translation accuracy,

2. Context-aware query recommendation, to suggest candidate queries to users given the context and improve the search experience,

3. Information retrieval based ranking, to rank the candidate translations of a given query as an alternative to state-of-the-art translation.
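The third paradigm can be illustrated with a minimal sketch. It assumes each candidate SQL query comes with a short natural-language description and ranks the candidates against the user's question by TF-IDF cosine similarity. The table names, the descriptions, the `rank_candidates` function, and the choice of TF-IDF are all hypothetical and chosen only for illustration; the project description does not specify a ranking model.

```python
import math
from collections import Counter

# Hypothetical candidate translations: each SQL query is paired with a short
# natural-language description that acts as its "document" for retrieval.
CANDIDATES = [
    ("SELECT title FROM papers WHERE year = 2019",
     "titles of papers published in a given year"),
    ("SELECT name FROM authors WHERE affiliation = 'Bilkent'",
     "names of authors from a given affiliation"),
    ("SELECT COUNT(*) FROM papers",
     "number of papers in the database"),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; no stemming or stopwords)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) with a smoothed IDF."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(question, candidates=CANDIDATES):
    """Return (score, sql) pairs sorted from most to least similar."""
    docs = [tokenize(desc) for _, desc in candidates]
    vectors, idf = tfidf_vectors(docs)
    # Terms unseen in the candidate descriptions get zero weight.
    q_vec = {t: tf * idf.get(t, 0.0)
             for t, tf in Counter(tokenize(question)).items()}
    scored = [(cosine(q_vec, vec), sql)
              for vec, (sql, _) in zip(vectors, candidates)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    for score, sql in rank_candidates("names of authors"):
        print(f"{score:.3f}  {sql}")
```

In this sketch the ranked list itself is shown to the user as candidate translations, rather than a single generated SQL statement. A real system would need a far richer candidate pool and a stronger matching model than bag-of-words similarity.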

The algorithms will be integrated into a natural language to SQL translation system from the literature.

Although natural language to SQL translation is a popular topic in the database community, no prior work has incorporated well-known information retrieval paradigms into this context, which makes our proposal novel. We believe that these paradigms will make translation more effective and improve search performance in relational databases.

·         DURATION: February 2019 - February 2021

·         PRINCIPAL INVESTIGATOR: Özgür Ulusoy

·         GRADUATE STUDENTS: Arif Usta, Akifhan Karakayalı, Mousa Farshkar Azari

·         BUDGET: 307,500 TL (~$60,000)