M/F Post-doc position - Real-time gestural control of intonation for voice substitution

Application Deadline : Tuesday, December 13, 2022

General information

Reference : UMR5216-CHRROM-022
Date of publication : Tuesday, November 22, 2022
Type of Contract : Fixed-term contract (FTC), Scientist
Contract Period : 6 months
Expected date of employment : 1 February 2023
Proportion of work : Full time
Remuneration : 2,805.35 € to 3,963.98 €
Desired level of education : PhD
Experience required : 1 to 4 years


This post-doctoral position is part of the ANR GEPETO* (GEstures and PEdagogy of InTOnation) project, which aims to study the use of manual gestures through human-machine interfaces, for the design of tools and methods for learning how to control intonation (melody) in speech.
In particular, this position is set in the context of voice rehabilitation, in the case of impairment or absence of vocal fold vibration caused by larynx disorders. Current medical solutions to replace this vibration consist of injecting an artificial sound source directly into the mouth or against the neck with an electrolarynx. This vibrator generates a substitute voice source on which the user can articulate speech. Alternatively, a microphone can be used to capture unvoiced speech that is produced without vibration of the vocal folds (e.g., whispering), and voicing can be re-introduced in real time using voice synthesis. The reconstructed voice is then played in real time on a loudspeaker. As of today, all these systems generate signals that have a monotonous melody, leading to extremely robotic voices.
The goal of the GEPETO project at GIPSA-lab is to supplement these two solutions with real-time control of intonation through hand gestures, and to study the use of such systems in oral interaction situations.

In the first part of the project, we developed a real-time whisper-to-speech conversion solution to which various interfaces can be connected in order to capture hand gestures in different conditions (trajectory on a surface, in space, pressure, etc.) for intonation control. The aim of the post-doc is to evaluate such a system in oral production and interaction situations in a voice substitution application.
We will first address the issue of voicing control (activation or not of the voiced source). We have implemented a semi-automatic method based on the spectral centroid of the whisper signal which requires the user to adjust the production of the whisper for a correct voicing decision to be made (Ardaillon et al., Interspeech 2022)**. We will evaluate to what extent this adaptation to the system is possible.
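For candidates unfamiliar with this kind of approach, a voicing decision driven by the spectral centroid can be sketched as follows. This is only an illustrative sketch and not the implementation described in Ardaillon et al.: the function names, the 3000 Hz threshold, the energy floor, and the rule that a low centroid switches the voiced source on are all assumptions made for the example.

```python
import cmath
import math


def spectral_centroid(frame, sample_rate):
    """Return the magnitude-weighted mean frequency (Hz) of one audio frame."""
    n = len(frame)
    weighted_sum = 0.0
    magnitude_sum = 0.0
    # Naive DFT over the non-negative frequency bins (fine for short frames).
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)
        )
        magnitude = abs(coeff)
        weighted_sum += (k * sample_rate / n) * magnitude
        magnitude_sum += magnitude
    return weighted_sum / magnitude_sum if magnitude_sum > 0 else 0.0


def is_voiced(frame, sample_rate, threshold_hz=3000.0, energy_floor=1e-6):
    """Hypothetical rule: voice the frame when its centroid is below a threshold."""
    energy = sum(x * x for x in frame) / len(frame)
    if energy < energy_floor:
        return False  # assumption: near-silent frames are never voiced
    return spectral_centroid(frame, sample_rate) < threshold_hz


# Synthetic frames standing in for whispered audio (not real data).
SAMPLE_RATE = 16000
N = 64
low_frame = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(N)]
high_frame = [math.sin(2 * math.pi * 6000 * i / SAMPLE_RATE) for i in range(N)]

print(is_voiced(low_frame, SAMPLE_RATE))   # centroid near 1000 Hz
print(is_voiced(high_frame, SAMPLE_RATE))  # centroid near 6000 Hz
```

With a fixed threshold of this kind, the decision depends on how the user shapes the whisper, which is why the user's ability to adapt to the system is one of the questions to be evaluated.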
In a second step, we will study the prosodic control by manual gesture of typical intonational patterns (e.g. sentence modalities, accentuation), depending on the degrees of freedom offered by the available interfaces (Sensel Morph touch tablet, accelerometer, etc.). This stage will be evaluated both on simple sentence imitation tasks and in communication situations where the user has to produce intelligible and expressive sentences for the interlocutor, without any reference to imitate.
These two research questions therefore focus on the adaptation of users' speech production and of their manual gestures, for voicing and intonation control respectively. For each study, experimental protocols will be proposed to evaluate subjects on control tasks and to assess their ability to learn to use such a system.
* http://gepeto.dalembert.upmc.fr/project_fr.html
** http://www.gipsa-lab.grenoble-inp.fr/~olivier.perrotin/media/papers/10675_file_Paper.pdf


- Become familiar with the whisper-to-speech conversion system in the Max/MSP environment
- Propose a learning protocol for voicing control
- Propose a protocol for evaluating intonation control on imitation and production tasks
- Propose a learning protocol for intonation control
- Perform evaluations following these protocols on groups of users


Expertise is sought in the areas listed below. Those without expertise in some of these areas are nevertheless encouraged to apply.
- Perception and Production of Speech
- Experimental Methodology in Speech Science / Phonetics
- Tools for results analysis (e.g., R/Python/Matlab)
- Human Machine Interaction
- Max/MSP programming (familiarisation with the existing system and improvements)
- French language comprehension (language used for the development and evaluation of the system)

Work Context

Gipsa-lab is a joint research unit of CNRS, Grenoble-INP (Grenoble Institute of Technology) and the University of
Grenoble, under agreement with Inria and the Observatory of Sciences of the Universe of Grenoble.
With 350 people including about 150 doctoral students, Gipsa-lab is a multidisciplinary research unit
developing both basic and applied research on complex signals and systems.
Gipsa-lab develops projects in the strategic areas of energy, environment, communication, intelligent
systems, life and health and linguistic engineering.
Through its research activities, Gipsa-lab maintains a constant link with the economic environment via
strong partnerships with companies.
Gipsa-lab staff is involved in teaching and training in the various universities and engineering schools of the
Grenoble academic area (Université Grenoble Alpes).
Gipsa-lab is internationally recognized for its research in Automatic & Diagnostics, Signal Image
Information Data Sciences, Speech and Cognition. The research unit develops projects in 16 teams organized
in 4 research centers:
- Automatic & Diagnostic
- Data Science
- Geometry, Learning, Information and Algorithms
- Speech-cognition
Gipsa-lab brings together 150 permanent staff and around 250 non-permanent staff (PhD students, post-doctoral
researchers, visiting scholars, administrative and technical staff, Master's trainees, etc.).
The post-doctoral fellow will be part of the CRISSP team (Cognitive Robotics, Interactive Systems, Speech Processing) of the GIPSA-lab Speech and Cognition Division. He/she will work with Olivier Perrotin of the CRISSP team, and Nathalie Henrich Bernardoni of the MOVE team (Analysis and Modelling of Human in Motion: Biomechanics, Cognition, Vocology) of the Data Science Division.

Constraints and risks

