Advanced School for Computing and Imaging (ASCI)

ASCI office
Delft University of Technology
Building 28, room 04.E120
Van Mourik Broekmanweg 6
2628 XE – DELFT, The Netherlands

P: +31 15 278 8032

Office visiting hours
Monday, Tuesday, Thursday: 10:00 – 15:00


The ASCI office is located on the Delft University of Technology campus. It is easily accessible by bicycle, public transport, and car. Building numbers help you find your way around the campus, so note the name and building number of your destination before you set out.

Contact us at +31 15 278 8032 or send us an email at

Deep learning with 3D and label geometry

Author : Shuai Liao
Promotor(s) : Prof.dr. C.G.M. Snoek, Co-supervisor: Dr. E. Gavves
University : University of Amsterdam
Year of publication : 2021


A fine-grained understanding of an image is two-fold: visual understanding and semantic understanding. The former strives to understand the intrinsic properties of the objects in the image, whereas the latter aims to associate those objects with semantics. Together, they form the basis of an in-depth understanding of images.
Today’s default deep convolutional network architectures have already shown a remarkable ability to capture the 2D visual appearance of images and then map that visual content to semantic classes. However, fine-grained image understanding, such as inferring intrinsic 3D information and more structured semantics, remains less explored. In this thesis, we address these problems by asking “How to better utilize geometry for better image understanding?”
In the first part, we research visual image understanding with 3D geometry. We show that it is possible to automatically explain a variety of visual content in an image with texture-free 3D shapes. Furthermore, we develop a deep learning framework that reliably recovers a set of 3D geometric attributes, such as the pose of an object and the surface normals of its shape, from a 2D image.
In the second part, we explore label geometry for semantic image understanding. We find that several image classification problems have geometrically similar probability spaces. We therefore introduce label geometry, which unifies one-vs.-rest classification, multi-label classification, and out-of-distribution classification in one framework. Moreover, we show that learned hierarchical label geometries can balance the accuracy and specificity of an image classifier.