Bilkent University
Department of Computer Engineering


Ottoman Document Analysis


Hande Adiguzel
MSc Student
Computer Engineering Department
Bilkent University

Large archives of historical documents attract many researchers from all around the world. The increasing demand to access those archives makes automatic retrieval and recognition of these documents crucial.

The Ottoman archives are one of the largest collections of historical documents; they include more than 150 million documents from the Ottoman era, ranging from military reports to economic and political correspondence. This thesis presents two studies on Ottoman document analysis. The first is segmentation, covering layout, line, and word segmentation. Layout segmentation is obtained via Log-Gabor filtering, four different algorithms are proposed for line segmentation, and a simple morphological method is used for word segmentation. Experiments conducted on Ottoman documents and on documents in other languages (English, Greek, Bangla) show that the segmentation algorithms give promising results. The second study aims to detect Islamic patterns in Kufic images using sub-pattern matching. Given a query pattern, all of its instances can be found through retrieval. Going further, known patterns can be used to automatically label images across the entire dataset. Finally, patterns that repeat inside an image can be discovered automatically. Graph and sub-graph isomorphism are used to detect the sub-patterns.

Promising results are obtained for finding the instances of a query pattern and for fully automatic detection of repeating patterns on a square Kufic image collection.
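
As a rough illustration of the sub-pattern retrieval idea mentioned in the abstract, the sketch below finds every occurrence of a small query graph inside a larger graph by backtracking. It is not the method of the thesis: the graph representation (nodes as stroke corners, edges as strokes), the function names make_graph and find_subgraph_matches, and the toy data are all assumptions made for this example, and the search implemented is the non-induced variant of sub-graph isomorphism (every query edge must appear in the target, extra target edges are allowed).

```python
from typing import Dict, Hashable, Iterable, Iterator, Set, Tuple

# Assumed representation: undirected graph as adjacency sets.
Graph = Dict[Hashable, Set[Hashable]]


def make_graph(edges: Iterable[Tuple[Hashable, Hashable]]) -> Graph:
    """Build an undirected adjacency-set graph from an edge list."""
    graph: Graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def find_subgraph_matches(pattern: Graph, target: Graph) -> Iterator[dict]:
    """Yield every injective node mapping pattern -> target that keeps all pattern edges."""
    # Place high-degree pattern nodes first so dead ends are pruned early.
    order = sorted(pattern, key=lambda n: -len(pattern[n]))

    def extend(mapping: dict) -> Iterator[dict]:
        if len(mapping) == len(order):
            yield dict(mapping)
            return
        p = order[len(mapping)]
        used = set(mapping.values())
        for t in target:
            # Skip target nodes already taken or with too few neighbours.
            if t in used or len(target[t]) < len(pattern[p]):
                continue
            # Each already-mapped neighbour of p must be adjacent to t.
            if all(mapping[q] in target[t] for q in pattern[p] if q in mapping):
                mapping[p] = t
                yield from extend(mapping)
                del mapping[p]

    yield from extend({})


# Toy data (hypothetical): a square outline and a three-node corner query.
square = make_graph([(0, 1), (1, 2), (2, 3), (3, 0)])
corner = make_graph([("a", "b"), ("b", "c")])
matches = list(find_subgraph_matches(corner, square))
# Distinct node sets covered by the matches, ignoring symmetric duplicates.
locations = {frozenset(m.values()) for m in matches}
```

Each mapping in matches assigns the query nodes to target nodes; collapsing mappings that cover the same node set, as done for locations, gives one entry per place where the query occurs. Backtracking of this kind is exponential in the worst case, so it is only meant for small graphs.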


DATE: 23 July, 2013, Tuesday @ 10:00