
PhD position in Computer Science (AI)


Application deadline: Thursday, 27 January 2022



General information

Reference : UMR7357-HYESEO-004
Workplace : STRASBOURG
Date of publication : Thursday, January 6, 2022
Scientific Responsible name : Hyewon Seo
Type of Contract : PhD Student contract / Thesis offer
Contract Period : 36 months
Start date of the thesis : 1 March 2022
Proportion of work : Full time
Remuneration : 2 135,00 € gross monthly

Description of the thesis topic

The proposed PhD study is organized into the following three tasks:
1. A photo-realistic 4D human model will be developed by coupling a geometric human model with color and illumination components, so that any desired shape, pose, and attribute can be generated. We will build our human model on our previous work, as well as on other standard geometric human models learned from multiple images of real people. Additional effort will be made to consolidate the model by adding a cloth and garment draping module, again based on our previous work.
2. The evolution of the human model over time due to motion will be learned by a deep neural network (DNN) from multiple annotated motion datasets. We will focus on combinations of RNNs and variational autoencoders, which allow stochastic prediction of shape and pose sequences in a latent space. Different data representations and network hyperparameters will be experimented with to obtain the best results.
3. An inverse problem will be formulated to reconstruct a 4D human model from an image, a video, or a simple action command. All model components will be developed in a differentiable manner so that an inverse problem can be formulated: given observation data, we will be able to build a model instance by optimal model fitting, and even to predict its future state.
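To make task 2 concrete, the single-step logic of a variational recurrent predictor can be sketched in a toy form. Everything below is an illustrative assumption, not the project's architecture: the dimensions, the randomly initialised weight matrices standing in for a trained network, and the one-layer encoder/decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions): a 3-joint pose in 3D, a small recurrent state.
POSE_DIM, HIDDEN_DIM, LATENT_DIM = 9, 16, 4

# Random weights stand in for a trained network (illustration only).
W_enc = rng.normal(scale=0.1, size=(POSE_DIM + HIDDEN_DIM, 2 * LATENT_DIM))
W_dec = rng.normal(scale=0.1, size=(LATENT_DIM + HIDDEN_DIM, POSE_DIM))
W_rnn = rng.normal(scale=0.1, size=(POSE_DIM + HIDDEN_DIM, HIDDEN_DIM))

def step(pose, h, eps):
    """One recurrent step: encode to a latent Gaussian, sample, decode the next pose."""
    stats = np.concatenate([pose, h]) @ W_enc
    mu, logvar = stats[:LATENT_DIM], stats[LATENT_DIM:]
    z = mu + np.exp(0.5 * logvar) * eps              # reparameterisation trick
    next_pose = np.concatenate([z, h]) @ W_dec       # decode a plausible next pose
    h_next = np.tanh(np.concatenate([pose, h]) @ W_rnn)  # recurrent state update
    return next_pose, h_next

pose, h = rng.normal(size=POSE_DIM), np.zeros(HIDDEN_DIM)
# Two different latent samples yield two different plausible futures,
# which is the "stochastic prediction in a latent space" of task 2.
future_a, _ = step(pose, h, rng.normal(size=LATENT_DIM))
future_b, _ = step(pose, h, rng.normal(size=LATENT_DIM))
```

In a trained model, `W_enc` would come from an inference network conditioned on the observed sequence and the weights would be learned with the usual evidence-lower-bound objective; only the sampling-then-decoding structure is the point here.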

Work Context

Robot vision for human cognition often fails to work well in real-world situations, despite the disruptive results achieved in Computer Vision and Artificial Intelligence. While most training data have been collected against well-conditioned, easy-to-isolate backgrounds, in-the-wild videos from the real world may contain varying environmental conditions such as lighting, background patterns, and, most notoriously, occlusions. The latter is the source of recurrent problems in human cognition by care robots in in-home settings. Large variations in body shapes, motions, and clothes, as well as frequent interactions with objects, also contribute to the difficulty. Finally, the viewing dynamics of moving robots and humans are another source of problems. Learning-based methods relying on datasets inevitably show limited performance, as it is almost impossible to collect a large annotated dataset that spans all possible configurations of real-world scenes.

Model-based learning approaches are good alternatives, as confirmed by a considerable body of previous work on human face and body recognition from images based on pre-constructed 3D models. However, many existing models consider only the geometry of the human face or body in isolation, making them difficult to apply to real-world situations involving obstacles or environmental objects. In this project, we will also adopt a model-based approach, but with enhanced realism, an additional dimension (i.e. time), and beyond. Our aim is to push the current limits of robot vision for human cognition by care robots in in-home settings. The specific goal is to make the performance of the vision intelligence robust to large variations (in body shapes, motions, etc.) and to occlusions (by clothes, furniture, walls, etc.), and capable of understanding interactions, by developing a photo-realistic, physics-aware 4D human model.
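The differentiable model-fitting idea behind this approach can be sketched on a deliberately tiny example. The "model" below is an assumption for illustration only: a single limb of fixed length whose endpoint is a differentiable function of one pose angle, fitted to a synthetic observation by gradient descent on the squared error, exactly as a full differentiable 4D model would be fitted to image evidence over many more parameters.

```python
import numpy as np

# Toy "body model" (assumption): one limb of fixed length whose endpoint
# is a differentiable function of a single pose parameter theta.
L_LIMB = 1.0

def endpoint(theta):
    return L_LIMB * np.array([np.cos(theta), np.sin(theta)])

def d_endpoint(theta):
    # Analytic derivative of the endpooint w.r.t. the pose parameter;
    # in a full pipeline this comes from automatic differentiation.
    return L_LIMB * np.array([-np.sin(theta), np.cos(theta)])

# Synthetic "observation": the endpoint at a ground-truth angle of 1.1 rad.
observed = endpoint(1.1)

# Fit the pose parameter by gradient descent on the squared residual.
theta, lr = 0.0, 0.5
for _ in range(200):
    residual = endpoint(theta) - observed
    grad = 2.0 * residual @ d_endpoint(theta)   # chain rule on the squared error
    theta -= lr * grad
```

After the loop, `theta` has converged to the ground-truth angle. The same optimal-fitting structure, with the model made differentiable end to end, is what lets an observation (image, video, or action command) be inverted into a model instance.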
