Bilkent University
Department of Computer Engineering
CS 590/690 SEMINAR

 

VISION TRANSFORMER IS ALL YOU NEED

 

Yunus Esergün

Master's Student
(Supervisor: Prof. Dr. Özcan Öztürk)
Computer Engineering Department
Bilkent University

Abstract: In recent years, the paper "Attention Is All You Need" has become very influential in Natural Language Processing, and attention-based models are claimed to outperform most earlier deep learning approaches. Approximately three years after its publication, the paper "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" brought the attention mechanism to vision applications. It claimed that vision transformers, which are built from attention layers, can replace conventional deep learning architectures with only a small change in accuracy and better performance in terms of time. Since then, many more papers have been published that try to improve performance in vision applications by using attention layers and other techniques. In this presentation, I will talk about the vision transformer paper "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" and some follow-up vision transformer papers.

 

DATE: Monday, May 8 @ 15:50 PLACE: Zoom