Bilkent University
Department of Computer Engineering
S E M I N A R

 

Extracting Protein Names from Biomedical Abstracts

 

Umit Tezcan

MSc Student
Computer Engineering
Bilkent University

Automatically extracting information from the ever-growing body of biomedical text in published articles holds the promise of converting large amounts of biomedical knowledge into computer-accessible form. However, processing this information is difficult for several reasons, including the lack of formal structure in natural language, the absence of standardized naming conventions, and the lack of a fixed nomenclature. The first stage of processing this valuable information is identifying the protein names in these documents. There have been several attempts to develop systems that identify protein names in biomedical texts. The approaches followed in those systems can be roughly categorized into three groups: dictionary-based, heuristic rule-based, and statistical. We have implemented a dictionary-based system that generalizes the entities in a dictionary and learns rules for extracting protein names.

 

DATE: April 18, 2005, Monday @ 16:40
PLACE: EA 409