Outdoor Image Understanding from Multiple Vision Modalities
Author: Hoàng-Ân Lê
Promotor(s): Prof. dr. T. Gevers / dr. T. E. J. Mensink
University: Universiteit van Amsterdam
Year of publication: 2021
Link to repository: UVA repository
The thesis investigates various computer vision modalities, which, broadly defined, include both sensory data and their subsequent interpretations, such as RGB, depth, intrinsic images, semantic maps, surface normals, optical flow, and point clouds. Specifically, the thesis focuses on the research question of how various computer vision modalities can be exploited and combined. The question is tackled from multiple perspectives, starting with the decomposition of a primary modality, followed by a study of how modalities complement one another and can be combined. The subsequent chapters explore multimodality from a generative perspective, examining how one modality benefits the generation of others, and the thesis concludes with the construction of a multimodal synthetic dataset.