
PhD Thesis Formalisms for Generative Conversation Pathways - Application to Q&A M/F


Application deadline: Tuesday, May 27, 2025, 23:59:00 Paris time

Make sure your candidate profile is filled in correctly before applying

General information

Offer title: PhD Thesis Formalisms for Generative Conversation Pathways - Application to Q&A (M/F)
Reference: UMR5217-SIHAME-012
Number of positions: 1
Workplace: ST MARTIN D HERES
Publication date: Tuesday, May 6, 2025
Contract type: Fixed-term PhD contract (CDD Doctorant)
Contract duration: 36 months
Thesis start date: November 1, 2025
Working time: Full-time
Remuneration: 2200 gross monthly
CN section(s): 06 - Information sciences: foundations of computer science, computation, algorithms, representations, uses

Description of the thesis topic

Conversational AI systems are built on Large Language Models (LLMs), which use Transformer neural networks. These models are trained on supercomputers, over several days or more, on large amounts of text collected from the web. To give an idea of the scale, PaLM, an LLM by Google, has 540 billion parameters and required more than a month of training on a specialized computer cluster. The rapid adoption of LLMs has outpaced the development of techniques for evaluating the quality of their output. This gap is critical because LLMs have been shown to be prone to "hallucinations": plausible responses that are nonetheless factually incorrect or inconsistent with the user's intent. Relying on LLMs without proper assessment may therefore have severe consequences. Ensuring the quality of LLM output is essential for leveraging the transformative power of these models while mitigating potential risks. By developing robust validation methodologies and incorporating quality-control measures, businesses can harness the benefits of LLMs while safeguarding their decision-making.

Multiple-Choice Questions (MCQs) have long been a cornerstone of education, providing a standardized means of assessing student knowledge and comprehension. However, creating high-quality MCQs remains a time-consuming and labor-intensive process for medical educators. This challenge has prompted the exploration of innovative solutions that can alleviate the burden on professors while maintaining the quality and relevance of educational content. Recent advances in LLMs present a promising avenue for addressing this issue. LLMs have demonstrated remarkable capabilities in language understanding and generation tasks across various domains. The use of these models for developing educational materials, particularly for creating MCQs for medical training, merits comprehensive research and evaluation. In this PhD proposal, we introduce the notion of Generative Conversation Pathways, which leverage LLMs to produce sequences of questions for assessing student knowledge and comprehension.

Work context

ITN ARMADA project within the Laboratoire d'Informatique de Grenoble (LIG)
LIG is a 500-member laboratory with teaching faculty, full-time researchers, PhD students, and administrative and technical staff. The mission of LIG is to contribute to the development of fundamental aspects of Computer Science (models, languages, methodologies, algorithms) and to address conceptual, technological, and societal challenges. The increasing diversity and dynamism of data, services, interaction devices, and use cases drive the evolution of software and systems, which must guarantee essential properties such as reliability, performance, autonomy, and adaptability. The 24 research teams in LIG work on these challenges. Research within LIG is organized into 5 focus areas: Intelligent Systems for Bridging Data, Knowledge and Humans; Software and Information System Engineering; Formal Methods, Models, and Languages; Interactive and Cognitive Systems; Distributed Systems, Parallel Computing, and Networks.

ARMADA is a doctoral network that aims to train 15 versatile and interconnected Early Stage Researchers (ESRs) to specialize in the overarching area of Conversational Artificial Intelligence (Conversational AI) and the challenges associated with recent advances in Large Language Models (LLMs), such as ChatGPT and Bard. These specialists will acquire unique knowledge and skills in Artificial Intelligence, Natural Language Processing, Machine Learning, Data Management, and Algorithm Design to improve the reliability of LLMs. A reliable LLM produces timely, consistent, and verifiable answers, and provides guidance to the user. Given the highly interdisciplinary nature of the topic, the program includes a number of training activities designed to hone the skills of the trainees. The network provides research training through summer and winter schools on the multidisciplinary aspects of the topic, as well as workshops and courses that foster non-technical social and interpersonal skills, such as scientific writing, innovation, supervision, and management. The program addresses the EU's crucial need for AI regulation by training experts in Conversational AI who could advise EU bodies on technical matters related to the adoption of these technologies in critical disciplines, such as medicine, education, and business intelligence. The 8 organizations, spread across 7 countries, will form an interoperability platform to share knowledge and skills.

The position is located in a sector falling under the protection of scientific and technical potential (PPST) and therefore requires, in accordance with regulations, that your arrival be authorized by the competent authority of the MESR.

Constraints and risks

N/A