General information
Job title: PhD (M/F): Grounding Meanings in Intrinsic Social Motivations for Language Models
Reference: UMR8051-VIRLAI-012
Number of positions: 1
Workplace: CERGY
Publication date: Friday, May 23, 2025
Contract type: Fixed-term doctoral contract
Contract duration: 36 months
Thesis start date: October 1, 2025
Working time: Full time
Remuneration: €2,200 gross per month
CoNRS section(s): 01 - Interactions, particles, nuclei, from the laboratory to the cosmos
Description of the thesis topic
Context:
Recent advances in Transformer-based Large Language Models (LLMs) [1] have produced systems that generate coherent conversations and respond to a wide range of prompts with impressive fluency. However, they must be trained on massive, internet-scale text datasets, whereas human children reach language literacy with roughly four to five orders of magnitude less language data. This sample inefficiency is a central obstacle to reducing the environmental footprint of AI, especially as Generative AI is expected to become a $1.3 trillion market by 2032 according to a recent Bloomberg Intelligence report [2].

Another fundamental difference between LLMs and humans is that living organisms are inextricably anchored to their body and world [3], while relating the language produced by LLMs to the real world, outside of a text-based interaction, remains a challenge [4]. This limits their adoption in robotics, where acting appropriately requires adapting to the physical and perceptual environment and to social partners. The problem of associating language with objects and actions is known as the Symbol Grounding Problem [5].

We posit that principles inspired by human language acquisition can address these limitations. Furthermore, Language Models capable of learning from the same kind and quantity of data as humans could serve as more plausible cognitive models of language acquisition and improve our understanding of how humans manage to acquire language so efficiently. This question lies at the heart of this project.
Objectives:
Our goal is twofold:
1. Improving the potential of LLMs to serve as cognitive models of language understanding in humans.
2. Addressing the limitations of these systems by drawing on bio-inspired approaches to improve sample efficiency, grounding, and social competence for social robotics applications.
To this end, the goal of this PhD is to propose a cognitive computational model that grounds LLMs in an intrinsically motivated robotic agent interacting socially with the world, by leveraging principles of human language acquisition.
Research Program:
Building on our previous work inspired by child language development, in which we proposed a cognitive model for grounding symbols in basic motivations [6, 7], this PhD project will explore the following directions:
1. Enrich the intrinsic motivation module by enabling the system to autonomously generate new motivations discovered through interaction [7, 8]. These motivations will be driven by homeostatic needs, curiosity, and social rewards. This shift from static, manually defined motivations to flexible, self-emerging goals will make the agent more adaptive (a toy illustration of such a combined drive signal is sketched after this list).
2. Develop a cognitive model that integrates this motivation system with a language model, enabling the agent to interpret and produce language based on its evolving internal drives and social experiences (one possible drive-to-language interface is sketched below).
3. Evaluate and validate the model through a series of experiments on a humanoid robot (the Reachy robot from Pollen Robotics), ranging from controlled protocols to more realistic scenarios modeled on developmental psychology studies. These experiments will allow us to benchmark the system's performance against patterns observed in human language development (the last sketch below outlines one such evaluation episode).
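To make the drive signal of direction 1 concrete, below is a minimal Python sketch of a motivation module combining a homeostatic term (distance of internal variables to their setpoints), a curiosity term (prediction error on observations), and a social-reward term into a single scalar. Every name, weight, and formula here is an illustrative assumption for readability, not the architecture of [6, 7]:

import numpy as np

class IntrinsicMotivation:
    # Toy motivation module: combines homeostatic, curiosity, and
    # social reward signals into one scalar drive. Illustrative only.

    def __init__(self, setpoints, weights=(1.0, 1.0, 1.0)):
        self.setpoints = np.asarray(setpoints, dtype=float)
        self.w_homeo, self.w_curio, self.w_social = weights

    def homeostatic(self, internal_state):
        # Negative mean deviation: reward is highest at the setpoints.
        return -np.abs(np.asarray(internal_state) - self.setpoints).mean()

    def curiosity(self, predicted_obs, actual_obs):
        # Prediction error as a novelty signal: higher error, more curiosity reward.
        diff = np.asarray(predicted_obs) - np.asarray(actual_obs)
        return float(np.mean(diff ** 2))

    def reward(self, internal_state, predicted_obs, actual_obs, social_feedback):
        # social_feedback: e.g., caregiver responsiveness in [0, 1], as studied in [7].
        return (self.w_homeo * self.homeostatic(internal_state)
                + self.w_curio * self.curiosity(predicted_obs, actual_obs)
                + self.w_social * social_feedback)

For example, an agent with hypothetical "energy" and "social contact" needs:

motivation = IntrinsicMotivation(setpoints=[0.8, 0.5])
r = motivation.reward(internal_state=[0.6, 0.4],
                      predicted_obs=[0.1, 0.2],
                      actual_obs=[0.3, 0.1],
                      social_feedback=0.7)
print(f"combined drive signal: {r:.3f}")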
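Direction 2 leaves open how internal drives reach the language model. One simple option, shown here purely as a hypothetical sketch, is to serialize the drive state and perceptual context into the prompt of an off-the-shelf LLM; the function name and fields below are assumptions, not a committed interface of the project:

def drive_conditioned_prompt(drives, observation, history):
    # Serialize internal drives and the current perceptual context
    # into a text prompt for any text-completion backend.
    drive_desc = ", ".join(f"{name}={level:.2f}" for name, level in drives.items())
    return (f"Internal drives: {drive_desc}\n"
            f"Current observation: {observation}\n"
            f"Dialogue so far: {history}\n"
            "Next utterance:")

prompt = drive_conditioned_prompt(
    drives={"energy": 0.3, "social_contact": 0.9},
    observation="caregiver is holding a red cup",
    history="caregiver: 'Do you want the cup?'",
)
# `prompt` would then be sent to the chosen language model backend.

Prompt-level conditioning is only one design point; tighter couplings (e.g., fine-tuning on drive-annotated interactions) are equally plausible within the project.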
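Finally, for direction 3, the sketch below wires the two illustrative pieces above into one evaluation episode whose per-step log (scene, reward, utterance) could later be compared against developmental benchmarks. The `robot` object and the `llm_complete` callable are hypothetical stand-ins; the actual Reachy SDK exposes a different interface:

def run_episode(robot, motivation, llm_complete, n_steps=10):
    # One interaction episode: perceive, compute the combined drive
    # signal, produce a drive-conditioned utterance, and log each step.
    history, log, predicted = "", [], None
    for _ in range(n_steps):
        features = robot.observe()            # numeric perceptual features
        scene = robot.describe_scene()        # short text description of the scene
        feedback = robot.social_feedback()    # caregiver responsiveness in [0, 1]
        r = (motivation.reward(robot.internal_state(), predicted, features, feedback)
             if predicted is not None else 0.0)
        utterance = llm_complete(drive_conditioned_prompt(robot.drives(), scene, history))
        robot.say(utterance)
        history += f"\nrobot: {utterance}"
        log.append({"scene": scene, "reward": r, "utterance": utterance})
        predicted = features  # naive last-observation forecast stands in for a learned model
    return log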
[1] A. Vaswani et al., "Attention is all you need," Adv. Neural Inf. Process. Syst., vol. 30, 2017.
[2] Bloomberg Intelligence, "Generative AI to become a $1.3 trillion market by 2032, research finds," Bloomberg.com, 2023.
[3] G. Pezzulo, T. Parr, P. Cisek, A. Clark, and K. Friston, "Generating meaning: active inference and the scope and limits of passive AI," Trends Cogn. Sci., vol. 28, no. 2, pp. 97–112, Feb. 2024, doi: 10.1016/j.tics.2023.10.002.
[4] E. Pavlick, "Symbols and grounding in large language models," Philos. Trans. R. Soc. A, vol. 381, no. 2251, p. 20220041, Jul. 2023, doi: 10.1098/rsta.2022.0041.
[5] S. Harnad, "The symbol grounding problem," Phys. D: Nonlinear Phenom., vol. 42, no. 1–3, pp. 335–346, 1990.
[6] L. Cohen and A. Billard, "Social babbling: The emergence of symbolic gestures and words," Neural Netw., vol. 106, pp. 194–204, 2018.
[7] Z. Lemhaouri, L. Cohen, and L. Cañamero, "The role of the caregiver's responsiveness in affect-grounded language learning by a robot: Architecture and first experiments," in Proc. 2022 IEEE Int. Conf. on Development and Learning (ICDL), 2022, pp. 349–354.
[8] N. Duminy, S. M. Nguyen, J. Zhu, D. Duhaut, and J. Kerdreux, "Intrinsically motivated open-ended multi-task learning using transfer learning to discover task hierarchy," Appl. Sci., vol. 11, no. 3, p. 975, 2021.
[9] A. Manoury, S. M. Nguyen, and C. Buche, "Hierarchical affordance discovery using intrinsic motivation," in Proc. 7th Int. Conf. on Human-Agent Interaction (HAI), Kyoto, Japan: ACM, Sep. 2019, pp. 186–193. doi: 10.1145/3349537.335189
Work context
This PhD is funded by the ANR JCJC project GISMo (Grounding Meaning in Intrinsic Social Motivation), led by Laura Cohen (CY Cergy Paris Université). It is part of a collaborative, interdisciplinary initiative involving key partners such as Julia Ive (University College London, a specialist in language models) and Sao Mai Nguyen (ENSTA, an expert in intrinsic motivation). The PhD candidate will also benefit from the support of other project members, including several interns recruited in parallel.
The research will be carried out within the NEURO team of the ETIS laboratory, a joint research unit (UMR 8051) between CY Cergy Paris Université, ENSEA, and the CNRS, recognized for its work in bio-inspired robotics and artificial intelligence.
The NEURO team designs cognitive and social robots based on biological models of human cognition, with strong foundations in developmental science, psychology, and neuroscience. The team includes 16 permanent researchers and 22 PhD students and postdoctoral researchers, and is actively involved in numerous national and international collaborative projects.
Constraints and risks
Requirements
• Master's or engineering degree in Computer Science, Robotics, or related fields
• Languages: Python, C/C++
• Experience with neural networks, image processing, reinforcement learning, and robotics
• Interest in statistical analysis of results and in cognitive science
• Scientific rigor and writing skills; ability to conduct a literature review
• A good level of English is a plus; French is not mandatory
Hiring process
All applications must be submitted via this portal (i.e., the CNRS job portal, Portail emploi CNRS) by June 15, 2025, and should include:
• Detailed CV
• Master's transcripts (M1 and M2)
• Motivation letter including a short description of your background, a statement of your research interests and motivation for this position, and why you think you would be a good fit (1 page)
• Two reference letters or contact details of two referees
Informal inquiries via email prior to full applications are welcome.
Shortlisted candidates will be invited for an interview.