
Data-mining for electrochemical transformations


Application deadline : Thursday, June 2, 2022


General information

Reference : UMR6072-BERCUI-001
Workplace : CAEN
Date of publication : Thursday, May 12, 2022
Scientific lead : Bertrand Cuissart
Type of Contract : PhD Student contract / Thesis offer
Contract Period : 36 months
Start date of the thesis : 3 October 2022
Proportion of work : Full time
Remuneration : 2 135,00 € gross monthly

Description of the thesis topic

By using a single electron to form or break bonds during complex processes, electrosynthesis has great potential for the discovery of new processes and for low-cost industrialization. However, the control and optimization of electrocatalyzed reactions remain difficult, and applying Artificial Intelligence to chemical problems represents a unique opportunity. The AMPERE project, led by the LIMA (UMR 7042, INC), LBM (UMR 7203, INC), LHFA (UMR 5069, INC) and GREYC (UMR 6072, INS2I) laboratories, is part of this dynamic. It gathers a community of chemists and computer scientists who wish to develop decision support processes that facilitate the discovery and optimization of electrochemical transformations. AMPERE is funded as part of the CNRS 80/PRIME program, which was initiated on the occasion of the 80th anniversary of the CNRS and aims to finance original and disruptive interdisciplinary projects (https://www.cnrs.fr/fr/cnrsinfo/80-new-projects-for-the-80prime-program). Beyond this project, the interdisciplinary exchange between artificial intelligence and chemistry enriches each community with fundamental and applied advances.

The research work proposed here is fully integrated into the AMPERE project and is therefore directly motivated by a chemical application. Starting from a screening tool that experimentally generates data on electrochemical reactions, the aim is to design and implement data mining methods. Drawing on current techniques used in chemical reaction databases, we will start by building a database suited to our electrochemical reactions. As this tool is key to the project, it will have to meet the consultation needs of chemists and serve as a source for computer analyses. Consequently, its design will result from a discussion between experts from both disciplines. Based on this database of electrochemical reactions, the thesis will develop two research axes.

In the first part, algorithms dedicated to mining the reaction data under study will be designed. The dynamic and complex nature of the reactions requires representing them as structured data from which the relevant descriptors will be extracted. The computer analysis associated with this part of the work will compute the remarkable statistical associations between experimental conditions and reaction yields. For a given reaction, the system will be able to propose new parameters in order to increase the yield, together with an explanation of its choices that a chemist can understand.
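
As a purely illustrative sketch of what an association between experimental conditions and yields could look like, the snippet below compares the mean yield observed for each condition value with the overall mean yield. The record fields (solvent, electrode), the values, and the function names are all hypothetical and are not taken from the AMPERE project; the methods actually developed in the thesis may differ substantially.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical screening records: field names and values are invented for illustration.
RUNS = [
    {"solvent": "MeCN", "electrode": "graphite", "yield": 72.0},
    {"solvent": "MeCN", "electrode": "platinum", "yield": 64.0},
    {"solvent": "DMF", "electrode": "graphite", "yield": 41.0},
    {"solvent": "DMF", "electrode": "platinum", "yield": 35.0},
]


def condition_effects(runs, target="yield"):
    """Return {(parameter, value): mean target for that value minus overall mean}."""
    overall = mean(run[target] for run in runs)
    grouped = defaultdict(list)
    for run in runs:
        for param, value in run.items():
            if param != target:
                grouped[(param, value)].append(run[target])
    return {key: mean(vals) - overall for key, vals in grouped.items()}


def best_conditions(runs, target="yield"):
    """For each parameter, keep the value with the largest effect on the target."""
    best = {}
    for (param, value), effect in condition_effects(runs, target).items():
        if param not in best or effect > best[param][1]:
            best[param] = (value, effect)
    return best


effects = condition_effects(RUNS)
suggestion = best_conditions(RUNS)
for param, (value, effect) in suggestion.items():
    # The effect doubles as a simple, human-readable justification of the choice.
    print(f"{param}: try {value} (mean yield {effect:+.1f} points vs. overall)")
```

Reporting the per-condition difference alongside each suggestion is one simple way to keep the proposal explainable to a chemist; it ignores interactions between parameters, which a real analysis would need to address.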

Then, research in sequential data mining is planned. A reaction will be modeled as a sequence of states, each state describing a point of the reaction. Sequence mining techniques adapted to this case will make it possible to extract subsequences of "remarkable" reaction states. By remarkable, we can mean "frequent", "unique" or strongly associated with an external characteristic such as a level of performance. To model possible uncertainties, it may be possible to replace the sequences with directed trees. Carrying out the subsequent analysis will require designing an original mining process; this work has a strong innovative character.
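
To make the notion of a "frequent" subsequence concrete, here is a minimal sketch that enumerates order-preserving subsequences of reaction states and keeps those occurring in at least a given number of reactions. The state labels, the `min_support` threshold, and the brute-force enumeration are assumptions chosen for readability; they do not describe the mining process the thesis is expected to design.

```python
from collections import Counter
from itertools import combinations

# Hypothetical reactions, each modeled as a sequence of state labels.
REACTIONS = [
    ["start", "oxidation", "intermediate", "product"],
    ["start", "oxidation", "side_product"],
    ["start", "intermediate", "product"],
]


def frequent_subsequences(sequences, min_support, max_len=3):
    """Return {subsequence: support} for subsequences found in >= min_support sequences.

    A subsequence keeps the order of states but may skip some of them.
    Support counts each sequence at most once per subsequence.
    """
    support = Counter()
    for seq in sequences:
        seen = set()
        for length in range(1, max_len + 1):
            for idx in combinations(range(len(seq)), length):
                seen.add(tuple(seq[i] for i in idx))
        support.update(seen)
    return {pat: n for pat, n in support.items() if n >= min_support}


patterns = frequent_subsequences(REACTIONS, min_support=2)
for pattern, count in sorted(patterns.items(), key=lambda item: (-item[1], item[0])):
    print(" -> ".join(pattern), f"(support {count})")
```

This exhaustive enumeration grows exponentially with sequence length, so it is only suitable for toy data; dedicated sequential pattern mining algorithms prune the search space instead.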

Work Context

The main methodological contributions will be in artificial intelligence for data science. The doctoral student will have the opportunity to build expertise in the search for associations within structured data, the data being represented here as graphs or sequences. As the work is integrated into an interdisciplinary project, the doctoral student will experience first-hand the dialogue that guides this type of project. The doctoral student will also acquire skills in computer science applied to chemistry. Because innovation in chemical data analysis is tied to technological or scientific challenges, this skill offers the opportunity to participate in ambitious projects of varied nature and with significant spin-offs.

The applicant is enrolled in the final year of a Master's degree in a field related to computer science or applied mathematics, or already holds such a degree, and has strong programming skills. Experience in data science (data mining, machine learning, ...) will be a plus. The applicant must be able to write scientific reports and to communicate research results at international conferences.

The thesis will begin in the fall of 2022, in early September or early October. The work will mainly take place at GREYC, an academic laboratory located in Caen, Normandy. As the project involves several French laboratories, the thesis includes several weeks of work per year in the partner laboratories.

Applications must include the following documents in electronic format: a letter of motivation, a detailed CV describing your studies and your research experience,

Please send your application via the CNRS job portal.

Constraints and risks


Additional Information

