Learning Continuity for Image and Video Recognition
Author: Jiaojiao Zhao
Promotor(s): Prof.dr. C.G.M. Snoek / Dr. P.S.M. Mettes
University: University of Amsterdam
Year of publication: 2022
Link to repository: Dare.uva.nl
This thesis aims at learning continuity for visual recognition. As a natural property of images and videos, continuity is important for many computer vision tasks. The thesis strives to answer the research question "What is the benefit of continuity for image and video recognition?" To that end, it consists of two parts, which examine the spatial continuity of images and the spatio-temporal continuity of videos, respectively.

Part I addresses learning continuity for image recognition. In Chapter 2, we explore spatial continuity for image colorization. Chapter 3 presents a new pooling method that better maintains spatial continuity. Part II addresses learning continuity for video recognition. The goal of Chapter 4 is to utilize temporal continuity for action detection. Chapter 5 aims to endow a 3D-ConvNet with spatio-temporal continuity. In Chapter 6, we propose TubeR: a simple solution for spatio-temporal video action detection.

To summarize, this thesis studies continuity for image and video recognition. In depth, each part starts with the benefit of learning continuity for images or videos and then digs into technical innovations that exploit continuity in various network architectures. In breadth, the thesis explores spatial continuity for images and spatio-temporal continuity for videos. Specifically, it covers image colorization, image classification, semantic segmentation, video action detection, video action recognition, and video object segmentation. We hope this work stimulates further research on image and video continuity.