Bilkent University
Department of Computer Engineering


Searching video for complex activities with finite state models


Nazli Ikizler
Ph.D. Student
Computer Engineering Department
Bilkent University

Understanding what people are doing is one of the great unsolved problems of computer vision. A fair solution opens tremendous application possibilities, ranging from security and improved surveillance to sign language and gesture-based interfaces. While this topic has been studied extensively, understanding activities that depend on detailed information about the body is still very hard. The major difficulties have been that (a) good kinematic tracking is hard; (b) models typically have too many parameters to be learned directly from data; and (c) for much everyday behaviour, there isn't a taxonomy. In this study, we aim to understand human motion by searching for and recognizing complex human activities. By complex activities, we mean composite actions in which the human subject performs more than one action within a given time sequence. We describe a method of representing human activities that allows a collection of motions to be queried without examples. Our approach is based on units of activity at segments of the body, which can be composed across space and across the body to produce complex queries. The presence of search units is inferred automatically by tracking the body, lifting the 2D tracks to 3D, and comparing them to Hidden Markov Models (HMMs) trained using motion capture data. Automatic motion segmentation is also achieved by linking the action models to form activity models.
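
As a rough illustration of the last point, the sketch below links two small action HMMs into one activity model and uses Viterbi decoding to segment a sequence by action. It is a toy under stated assumptions, not the method presented in the talk: the discrete symbols (L, R, U, D) stand in for features of the lifted 3D tracks, the probabilities are set by hand rather than trained on motion capture data, and the names link_models, viterbi, segment and p_switch are hypothetical.

```python
import math
from itertools import groupby

NEG_INF = float("-inf")

def log(p):
    """Log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF

def link_models(models, p_switch=0.2):
    """Chain action HMMs into a single activity HMM.

    The last state of each action (except the final one) gets probability
    p_switch of jumping to the first state of the next action; its other
    transitions are scaled so that each row still sums to 1.
    """
    total = sum(len(m["emit"]) for m in models)
    labels, trans, emit = [], [], []
    offset = 0
    for i, m in enumerate(models):
        n = len(m["emit"])
        for s in range(n):
            row = [0.0] * total
            is_exit = (s == n - 1) and (i < len(models) - 1)
            keep = 1.0 - p_switch if is_exit else 1.0
            for t in range(n):
                row[offset + t] = keep * m["trans"][s][t]
            if is_exit:
                row[offset + n] = p_switch  # link to the next action model
            trans.append(row)
            emit.append(m["emit"][s])
            labels.append(m["name"])
        offset += n
    return labels, trans, emit

def viterbi(obs, trans, emit):
    """Most likely state path in log space, starting in state 0."""
    n = len(trans)
    score = [log(emit[s].get(obs[0], 0.0)) if s == 0 else NEG_INF
             for s in range(n)]
    back = []
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for t in range(n):
            best = max(range(n), key=lambda s: prev[s] + log(trans[s][t]))
            ptr.append(best)
            score.append(prev[best] + log(trans[best][t])
                         + log(emit[t].get(o, 0.0)))
        back.append(ptr)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

def segment(path, labels):
    """Collapse a state path into (action, start, end) spans, end exclusive."""
    spans, start = [], 0
    for name, run in groupby(labels[s] for s in path):
        length = len(list(run))
        spans.append((name, start, start + length))
        start += length
    return spans

# Hand-set toy action models (assumed values, not learned from data).
WALK = {"name": "walk",
        "trans": [[0.7, 0.3], [0.3, 0.7]],
        "emit": [{"L": 0.9, "R": 0.1}, {"L": 0.1, "R": 0.9}]}
WAVE = {"name": "wave",
        "trans": [[0.7, 0.3], [0.3, 0.7]],
        "emit": [{"U": 0.9, "D": 0.1}, {"U": 0.1, "D": 0.9}]}

labels, trans, emit = link_models([WALK, WAVE])
path = viterbi(list("LRLRUDUD"), trans, emit)
print(segment(path, labels))  # [('walk', 0, 4), ('wave', 4, 8)]
```

Because every state of the linked model carries the name of the action it came from, the decoded state path gives the segmentation directly, with no separate boundary detector.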


DATE: November 20, 2006, Monday @ 15:40