Reference : UMR6072-SOPRAS-007
Workplace : CAEN
Date of publication : Monday, August 1, 2022
Type of Contract : FTC Technical / Administrative
Contract Period : 8 months
Expected date of employment : 10 October 2022
Proportion of work : Full time
Remuneration : Between 2400 and 2600 € gross monthly salary
Desired level of education : PhD
Experience required : 1 to 4 years
The work for this engineering position will be carried out within the ANR project Mobideep. The project aims to provide navigation, or navigation assistance, solutions for robots and/or blind people. In this context, it is important to classify the available video streams at the pixel level in order to identify the navigable space and to anticipate how the environment in which the agent moves will evolve. Bounding boxes do not give the system sufficient precision, so a finer delineation of objects is necessary. The work will be done in collaboration with INRIA and INJA and will be integrated into a common platform. After a state-of-the-art study, the candidate will propose new solutions to the specific problems of the project.
Semantic segmentation is a field that has seen important advances in recent years. The emergence of fully convolutional networks has made it possible both to reduce the number of parameters of the architectures, thus accelerating computation and lowering memory cost, and to work with images at different resolutions. Subsequent works such as SegNet and U-Net have studied in more depth how to transmit information between the convolution and deconvolution layers, allowing semantic maps to be generated at any scale. Indeed, in the first layers of the network, the use of max-pooling causes part of the localization information to be lost. It is therefore necessary to propagate part of this information to the later layers to allow an accurate reconstruction of the semantic map. For example, ParseNet was introduced to add global scene context knowledge. Recurrent components such as Conv-LSTM layers have been used to improve video segmentation. Some works also adapt mechanisms from object detection to segmentation, such as Mask R-CNN.
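To illustrate the point about max-pooling and localization, here is a minimal pure-Python sketch, not the project's method. It assumes a small 2D grid with even dimensions, and the function names `max_pool_2x2_with_indices` and `unpool_2x2` are hypothetical. It shows how recording the position of each maximum during pooling lets a later layer put values back at their original pixel locations, which is the idea behind SegNet-style index transfer.

```python
def max_pool_2x2_with_indices(image):
    """2x2 max pooling with stride 2 on a 2D list with even dimensions.

    Returns the pooled grid and, for each pooled cell, the (row, col)
    of the maximum in the input (the localization information).
    """
    pooled, indices = [], []
    for r in range(0, len(image), 2):
        pooled_row, index_row = [], []
        for c in range(0, len(image[0]), 2):
            # Collect the four values of the window with their positions.
            window = [
                (image[r + dr][c + dc], (r + dr, c + dc))
                for dr in (0, 1)
                for dc in (0, 1)
            ]
            value, position = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            index_row.append(position)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def unpool_2x2(pooled, indices, rows, cols):
    """Place each pooled value back at its recorded position; zeros elsewhere."""
    output = [[0] * cols for _ in range(rows)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (r, c) in zip(pooled_row, index_row):
            output[r][c] = value
    return output


image = [
    [1, 2, 0, 0],
    [3, 4, 0, 5],
    [0, 0, 9, 8],
    [7, 0, 6, 0],
]
pooled, indices = max_pool_2x2_with_indices(image)
restored = unpool_2x2(pooled, indices, rows=4, cols=4)
```

Without the stored indices, the decoder would only know that a maximum occurred somewhere in each 2x2 window, not at which pixel. U-Net addresses the same problem differently, by passing encoder feature maps directly to the decoder.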
After writing a precise state of the art of semantic segmentation methods applicable to stereo-vision videos, the engineer will develop new architectures to meet the specific constraints of the project. We aim to take full advantage of the stereo images as well as of the annotations from the previous frames of the video.
- PhD in Machine learning or equivalent,
- Experience in deep learning and associated frameworks (PyTorch, TensorFlow...),
- Good experience in publishing scientific articles.
The GREYC lab is a research lab located in the city of Caen in Normandie (France). It conducts research in the field of digital science, with activities in image processing, machine learning, artificial intelligence, computer security, fundamental computer science, Web science, and electronics. The work will be carried out within the Image team, whose research activities focus on the development of new methods to process and analyze images and signals. The team has solid expertise in pattern recognition and information retrieval in images/videos using methods based on graphs, neural networks, metric learning, knowledge engineering… The team members have different backgrounds (computer science, signal/image processing, applied mathematics, artificial intelligence). This variety of skills is one of our strengths, as we can approach image processing and analysis from different scientific viewpoints and paradigms.
In the context of the collaboration with the other partners, the implementation will take place on the project's experimental platform located at Inria in Sophia Antipolis, which may require some travel to the site.