Advanced School for Computing and Imaging (ASCI)


Learning Continuity for Image and Video Recognition


Author : Jiaojiao Zhao
Promotor(s) : Prof.dr. C.G.M. Snoek / Dr. P.S.M. Mettes
University : University of Amsterdam
Year of publication : 2022
Link to repository :


This thesis aims at learning continuity for visual recognition. As a natural property of images and videos, continuity is important for many computer vision tasks. The thesis strives to answer the research question "What is the benefit of continuity for image and video recognition?" Accordingly, it consists of two parts, which examine the spatial continuity of images and the spatio-temporal continuity of videos, respectively. Part I addresses learning continuity for image recognition. In Chapter 2, we explore spatial continuity for image colorization. Chapter 3 presents a new pooling method that better maintains spatial continuity. Part II addresses learning continuity for video recognition. The goal of Chapter 4 is to utilize temporal continuity for action detection. Chapter 5 aims to endow a 3D ConvNet with spatio-temporal continuity. In Chapter 6, we propose TubeR, a simple solution for spatio-temporal video action detection. To summarize, this thesis studies continuity for image and video recognition. In depth, each part starts with the benefit of learning continuity for images or videos and then digs into technological innovations that exploit continuity in various network architectures. In breadth, the thesis explores spatial continuity for images and spatio-temporal continuity for videos. Specifically, it covers image colorization, image classification, semantic segmentation, video action detection, video action recognition, and video object segmentation. We hope our journey will stimulate more research on image and video continuity.