Bilkent University
Department of Computer Engineering
Ph.D. DISSERTATION

 

Towards Deeply Intelligent Interfaces in Relational Databases

 

Arif Usta
Ph.D. Candidate
(Supervisor: Prof. Dr. Özgür Ulusoy)
Computer Engineering Department
Bilkent University

Relational databases are among the most popular and broadly utilized infrastructures for storing data in a structured fashion. To retrieve data, users have to phrase their information need in Structured Query Language (SQL). SQL is a powerfully expressive and flexible language, yet one has to know the schema underlying the database on which the query is issued and be familiar with SQL syntax, which is not trivial for casual users. To this end, we propose two different strategies to provide more intelligent user interfaces to relational databases by utilizing deep learning techniques. As the first study, we propose a solution for keyword mapping in Natural Language Interfaces to Databases (NLIDB), which aim to translate Natural Language Queries (NLQs) to SQL. We define the keyword mapping problem as a sequence tagging problem, and propose a novel deep learning based supervised approach that utilizes POS tags of NLQs. Our proposed approach, called DBTagger (DataBase Tagger), is an end-to-end and schema independent solution. The query recommendation paradigm, a well-known strategy broadly utilized in Web search engines, helps casual users with their information needs by suggesting queries issued by expert users. As the second study, we propose Conquer, a CONtextual QUEry Recommendation algorithm on relational databases exploiting deep learning. First, we train local embeddings of database tuples using Graph Convolutional Networks to extract feature representations of the tuples in latent space. We represent each SQL query with a semantic vector by averaging the embeddings of the tuples returned as the result of the query. We employ cosine similarity over the final representations of the queries to generate recommendations, as a Witness-Based approach. Our results show that the classification accuracy of database rows, used as an indicator of the quality of the local embeddings, is comparable to that of state-of-the-art techniques.
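
The recommendation step described in the abstract (average the embeddings of the tuples a query returns, then rank other queries by cosine similarity) can be illustrated with a minimal sketch. This is not the Conquer implementation: the tuple embeddings below are hand-written two-dimensional vectors standing in for the output of a Graph Convolutional Network, and the names TUPLE_EMBEDDINGS, QUERY_LOG, mean_vector, cosine, and recommend are hypothetical, introduced only for illustration.

```python
import math

# Hypothetical tuple embeddings (assumption: in practice these would be
# learned, e.g. by a Graph Convolutional Network; here they are hand-written).
TUPLE_EMBEDDINGS = {
    "movie:1": [1.0, 0.0],
    "movie:2": [0.8, 0.2],
    "actor:7": [0.0, 1.0],
    "actor:9": [0.1, 0.9],
}

# Hypothetical query log: each query id maps to the ids of the tuples it returned.
QUERY_LOG = {
    "q1": ["movie:1", "movie:2"],
    "q2": ["movie:2"],
    "q3": ["actor:7", "actor:9"],
}


def mean_vector(vectors):
    """Element-wise average of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def query_vector(query_id, query_log, tuple_embeddings):
    """Represent a query as the average embedding of its result tuples."""
    return mean_vector([tuple_embeddings[t] for t in query_log[query_id]])


def recommend(target, query_log, tuple_embeddings, k=2):
    """Return up to k other query ids, most similar to the target first."""
    target_vec = query_vector(target, query_log, tuple_embeddings)
    scored = []
    for query_id in query_log:
        if query_id == target:
            continue
        vec = query_vector(query_id, query_log, tuple_embeddings)
        scored.append((cosine(target_vec, vec), query_id))
    scored.sort(reverse=True)
    return [query_id for _, query_id in scored[:k]]


if __name__ == "__main__":
    print(recommend("q1", QUERY_LOG, TUPLE_EMBEDDINGS, k=1))
```

In this toy log, q1 and q2 both return movie tuples, so q2 ranks above q3, whose results are actor tuples. Queries are compared only through the tuples they return, which is why the abstract calls the approach witness-based.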

 

DATE: 11 August 2021, Wednesday @ 13:30
PLACE: Zoom