Bilkent University
Department of Computer Engineering
MS THESIS PRESENTATION

 

TEXT MINING ANALYSIS OF TRANSLATION, SOCIAL COMMUNICATION AND LITERARY WRITING FOR TURKISH

 

Sevil Çalışkan
MS Student
(Supervisor: Prof. Dr. Fazlı Can)
Computer Engineering Department
Bilkent University

Text mining is an important research area, given the growth in text generation and the need to analyze it. Compared with other languages, text mining in Turkish remains under-explored. In this thesis, we analyze different types of Turkish text from different points of view and close with an overall review of text mining in Turkish. First, we analyze the quality of the English, French, and Spanish translations of a Turkish text, the novel My Name Is Red, using features generated for each chapter. With the proposed method, translations can be quantified without any parallel comparison. Then, we analyze the Turkish spoken texts of 98 people in terms of the speakers' gender and age. We also analyze the differences between written and spoken texts in Turkish. The results show that it is possible to predict speaker attributes from the spoken text, and that written and spoken texts differ significantly in terms of stylometric measures. Lastly, we assess the cross-lingual transfer performance of multilingual networks from English to Turkish. We find that transfer is possible; however, zero-shot cross-lingual transfer still has some way to go before it is competitive with monolingual networks for Turkish.

 

DATE: 14 December 2020, Monday @ 10:00