Seminar in Computer Engineering

Bilkent University
Department of Computer Engineering
SEMINAR

Bridging Vision and Language, and Beyond: A Unified Journey in Multimodal Generative AI

Prof. Dr. Erkut Erdem

Department of Computer Engineering
Hacettepe University
https://web.cs.hacettepe.edu.tr/~erkut/

Abstract: Over the past decade, our research group has advanced a unified perspective that bridges computer vision and natural language processing, aiming to develop models that more effectively process, understand, and manipulate visual data. In this talk, I will present a personal and comprehensive overview of our contributions to what is now broadly recognized as multimodal generative AI. I will highlight some of our past and recent efforts in modeling and benchmarking approaches that tightly couple vision and language and extend naturally to additional modalities. Through these examples, I will argue that a unified multimodal viewpoint offers a powerful path toward more general and capable AI systems.

Bio: Erkut Erdem is a Professor in the Department of Computer Engineering at Hacettepe University and is co-affiliated with the KUIS AI Center. He received his Ph.D. from Middle East Technical University (METU) and completed postdoctoral research at TELECOM ParisTech, France. He is a founding member of the Hacettepe University Computer Vision Laboratory (HUCVL). His research focuses on developing advanced methods for understanding and manipulating visual data, with a particular emphasis on leveraging additional modalities, such as natural language, as complementary sources of guidance. His contributions have been recognized with the Outstanding Young Scientist Award (GEBIP) from the Turkish Academy of Sciences in 2018, and he is a recent awardee of the TÜBİTAK 2247-A National Outstanding Researchers Program.

Date: Monday, December 1, 13:30

Place: EA 409