PhD (M/F): Multimodal Question-Answering Explainability: Toward Controllable and Interpretable Methods
- FTC PhD student / Offer for thesis
- 36 months
- BAC+5
Offer at a glance
The Unit
Laboratoire Interdisciplinaire des Sciences du Numérique
Contract Type
FTC PhD student / Offer for thesis
Working Hours
Full Time
Workplace
91190 GIF SUR YVETTE
Contract Duration
36 months
Date of Hire
01/10/2026
Remuneration
2300 € gross monthly
Application Deadline: 03 July 2026 23:59
Job Description
Thesis Subject
This thesis will address the problem of black-box models in a multimodal question-answering setting. In recent years, question-answering systems have benefited from large neural network models. While these approaches improve the precision and accuracy of generated answers, they still lack interpretability, owing to the opacity and complexity of the models. Indeed, large language models and visual language models rely on an extremely large number of operations and parameters (namely, neural network weights), which makes their predictions very difficult to interpret. A direct consequence of this opacity is a lack of user confidence and the impossibility of verifying how an answer was constructed. In addition, generated answers are sometimes untruthful or contain inaccurate or unfounded elements. The generation of unfounded, imprecise, or unexpected responses is known as hallucination.
The objective of this thesis is to propose new approaches to explain or interpret model behaviour. The first part will aim to determine what is used to generate content in a question-answering setting, in particular whether the information comes from the model's internal knowledge (known as parametric knowledge) or from the context provided as input to an LLM (information supplied to the model ahead of the response). The second part will be dedicated to methods for locating the contextual information used during generation. These approaches are referred to as attribution methods, i.e. methods that assign importance scores to parts of the context. While such approaches have been studied for unimodal data, few methods address the multimodal setting.
Your Work Environment
The research will be conducted at the Laboratoire Interdisciplinaire des Sciences du Numérique (LISN) in Paris-Saclay. The doctoral student will be fully integrated into the ANR EQUATION project.
Constraints and risks
Working on a screen
Compensation and benefits
Compensation
2300 € gross monthly
Annual leave and RTT
44 days
Remote Working practice and compensation
Teleworking practice and allowance
Transport
75% of the cost covered, plus a sustainable mobility allowance of up to 300 €
About the offer
| Offer reference | UMR9015-THOGER-008 |
|---|---|
| CN Section(s) / Research Area | Information sciences: processing, integrated hardware-software systems, robots, commands, images, content, interactions, signals and languages |
About the CNRS
The CNRS is a major player in fundamental research on a global scale. The CNRS is the only French organization active in all scientific fields. Its unique position as a multi-specialist allows it to bring together different disciplines to address the most important challenges of the contemporary world, in connection with the actors of change.