General information
Offer title : Postdoctoral Researcher (or Engineer) in Machine Learning and Databases (M/F)
Reference : UAR3565-FATIDM-006
Number of positions : 1
Workplace : POITIERS
Date of publication : 16 November 2024
Type of Contract : Researcher on a fixed-term contract (FTC)
Contract Period : 10 months
Expected date of employment : 3 February 2025
Proportion of work : Full Time
Remuneration : 2,991.58 euros gross
Desired level of education : Doctorate
Experience required : 1 to 4 years
Section(s) CN : 1 - Interactions, particles, nuclei, from laboratory to cosmos
Missions
As part of the OSCARS project "AMIS" (Advanced Metadata Intelligent System), the Consortium-HN ARIANE is recruiting a Postdoctoral Researcher or Engineer specializing in Data Science, Machine Learning, and Databases. The Consortium-HN ARIANE (Analysis, Research, Artificial Intelligence, and New Digital Editions) is an interdisciplinary scientific network within the Huma-Num infrastructure. It brings together experts in the humanities (literature, linguistics, history, etc.) and computer science to create a collaborative space between these fields, advancing methodological and epistemological approaches to the analysis of text-based scholarly materials.
ARIANE aims to contribute to the design, adaptation, and refinement of digital tools currently used for analyzing textual data in the humanities. The consortium's mission is to foster an interdisciplinary approach that bridges digital humanities methodologies with advanced natural language processing technologies. It seeks to enhance text analysis processes through (semi-)automated tools while also providing a critical space for discussing the interpretation of results generated by these methods.
The recruited researcher will join the team responsible for developing the innovative web application "AMIS," designed to enhance metadata for humanities researchers. This individual will develop the "Robot AMIS" module, which leverages artificial intelligence and machine learning techniques to provide metadata recommendations based on text data analysis. They will play a key role in managing and analyzing large databases and in training Large Language Models (LLMs). Additionally, the researcher may be expected to supervise interns recruited by the consortium to help achieve the project's objectives.
Activities
More specifically, the duties and responsibilities of the recruited researcher or engineer will include:
Designing and implementing the "Robot AMIS" (Module 2) to query external databases via APIs and process the results to provide metadata recommendations.
Training and fine-tuning Large Language Models (LLMs) for text analysis and extraction of relevant metadata.
Analyzing results from databases and proposing enriched metadata based on criteria such as content, genre, themes, sentiment, thesauri, ontologies, etc.
Integrating explainability features (X-AI) to trace the steps and provide justifications for the recommendations made by the model.
Managing databases and cloud infrastructures required to perform large-scale machine learning tasks.
Optimizing models and data processing workflows to improve performance and accuracy of results.
Skills
Technical Skills
AI/ML Technologies (e.g., Python, TensorFlow, PyTorch, scikit-learn, etc.)
Natural Language Processing (NLP) Models: Fine-tuning LLMs, semantic analysis, text mining
Databases: Management of relational (SQL) and non-relational (NoSQL) databases
APIs and REST services: Development and integration of APIs to query external databases
Knowledge of ontologies and controlled vocabularies used in text sciences (XML-TEI, RDF)
Experience with cloud infrastructure (Google Colab, AWS, or equivalent platforms) for projects requiring high computational power
Soft Skills
Interest in the humanities
Innovation, intellectual curiosity, communication skills, and technical support abilities
Strong interpersonal skills, attention to detail, and reliability
Ability to work in a team and collaborate with multidisciplinary teams
Interest in open-source projects
Desired Profile
Degree: Ideally, a Ph.D. (Post-Doc) in Computer Science (Data Science, AI)
Experience: Ideally, at least 2 years of experience in machine learning and text data processing projects
Ability to work with complex models and explain their results clearly.
Work Context
The recruited candidate will work for the Consortium-HN ARIANE at the MSHS in Poitiers (https://mshs.univ-poitiers.fr/).
Remote work may be possible, depending on the conditions set by the Consortium-HN ARIANE.
Frequent travel within France and potentially abroad is expected.
In Poitiers, the candidate will report to Fatiha IDMHAND (Professor, University of Poitiers), coordinator of the ARIANE Consortium.
Additional Information
European project "AMIS" funded by OSCARS.