Example-Based Machine Translation Using Morphological Correspondences


Turhan Osman Daybelge
MSc Student
Computer Engineering Department
Bilkent University

Machine Translation (MT) is a subfield of Computational Linguistics that focuses on developing algorithms and software to translate text from one natural language to another. Grammar-based approaches to MT require large numbers of hand-crafted, language-pair-dependent linguistic rules to achieve acceptably accurate translations. Example-Based Machine Translation (EBMT) is a Machine Learning approach to MT that employs the translation-by-analogy paradigm. In EBMT, translation templates are extracted automatically from bilingual corpora of sentence pairs, eliminating the need for hand-crafted rules. In this study, we focus on translation between Turkish and English; our approach, however, is language independent. We use morphologically analyzed sentence pairs to generate translation templates that take morphological correspondences into account. Translation templates are learned by inspecting the similarities and differences between example sentence pairs and are later used to translate new sentences. Because the morphological analyses of words in natural language texts are often ambiguous, morphological disambiguation is needed; we present a rule-based morphological disambiguator developed for this purpose. Another problem that must be solved is the ranking of translation results. In our system, translation results are ranked according to a confidence factor computed from statistical data collected during the learning process.
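
To make the idea of learning templates from similarities and differences concrete, the following is a minimal sketch and not the system presented in the talk. It assumes that each sentence is already a list of morpheme-level tokens, that two examples differ in exactly one contiguous span on each side, and that differing spans correspond to each other. The function names, the variable marker "X", and the tag notation (+Past, +1Sg) are hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple

Tokens = List[str]
Example = Tuple[Tokens, Tokens]  # (English tokens, Turkish tokens)


def split_on_difference(a: Tokens, b: Tokens) -> Tuple[Tokens, Tokens, Tokens, Tokens]:
    """Return (shared prefix, differing part of a, differing part of b, shared suffix)."""
    p = 0
    while p < min(len(a), len(b)) and a[p] == b[p]:
        p += 1
    s = 0
    # The suffix may not overlap the prefix that was already matched.
    while s < min(len(a), len(b)) - p and a[len(a) - 1 - s] == b[len(b) - 1 - s]:
        s += 1
    return a[:p], a[p:len(a) - s], b[p:len(b) - s], a[len(a) - s:]


def learn_template(ex1: Example, ex2: Example) -> Optional[Tuple[Example, List[Example]]]:
    """Generalize two translation examples into one template plus lexical pairs.

    Assumption (illustrative only): a single differing span per side,
    replaced by the variable "X". Returns None when nothing can be learned.
    """
    src_pre, src_d1, src_d2, src_suf = split_on_difference(ex1[0], ex2[0])
    tgt_pre, tgt_d1, tgt_d2, tgt_suf = split_on_difference(ex1[1], ex2[1])
    no_similarity = not (src_pre or src_suf) or not (tgt_pre or tgt_suf)
    no_difference = not (src_d1 and src_d2 and tgt_d1 and tgt_d2)
    if no_similarity or no_difference:
        return None
    template = (src_pre + ["X"] + src_suf, tgt_pre + ["X"] + tgt_suf)
    lexical = [(src_d1, tgt_d1), (src_d2, tgt_d2)]
    return template, lexical


# Hypothetical morphologically analyzed examples: "I came" and "I went".
example_1 = (["I", "come", "+Past"], ["gel", "+Past", "+1Sg"])
example_2 = (["I", "go", "+Past"], ["git", "+Past", "+1Sg"])

learned = learn_template(example_1, example_2)
print(learned)
```

Under these assumptions, the two examples yield the template "I X +Past" <-> "X +Past +1Sg" together with the lexical correspondences come <-> gel and go <-> git. Working on morpheme-level tokens rather than surface words is what lets the shared tense and person markers stay in the template while only the verb roots become variable.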


DATE: Monday, 26 March 2007 @ 15:40