PhD in bioinformatics and deep learning (M/F)
- FTC PhD student / Offer for thesis
- 36 months
- BAC+5
Offer at a glance
The Unit
Laboratoire de Recherche en Sciences Végétales
Contract Type
FTC PhD student / Offer for thesis
Working Hours
Full Time
Workplace
31326 AUZEVILLE TOLOSANE
Contract Duration
36 months
Date of Hire
01/10/2026
Remuneration
2300 € gross monthly
Application Deadline: 23 June 2026 23:59
Job Description
Thesis Subject
The astonishing diversity of eukaryotes is the result of a continuous evolutionary process and the emergence of functional innovations. One of the most critical functional innovations associated with the diversification and adaptation of species is their ability to establish mutualistic interactions with microorganisms. Such interactions are found across eukaryotes and have played a critical role in the diversification of two lineages, plants and beetles. So far, most of these interactions have been studied through the prism of protein-coding gene content. This has led to the identification of gene presence/absence and gene family contraction/expansion associated with the ability to engage in these symbioses. However, we recently demonstrated that these symbioses also evolved through the rewiring of pre-existing genes. This transcriptomic rewiring can be explained by different factors, one of the most critical being cis-regulatory elements (CREs). CREs are short sequences, often located in the non-coding part of genomes, that are recognized by transcription factors and lead to the regulation of target genes. Predicting CREs associated with functional traits is one of the key challenges in biology. Indeed, CREs are often relatively small (a few base pairs), embedded in highly variable non-coding regions of the genome, and sometimes far from the genes they regulate. Current methods for predicting CREs have limitations, such as being restricted to a few model species (ignoring the diversity of organisms) or relying on sequence alignment. In recent years, vast amounts of omics data (RNA-Seq, ATAC-Seq, Single-Cell Seq, DAP-Seq…) have been produced for a huge diversity of species, along with high-quality genomes. However, these data have only been used to answer specific biological questions and not in a comparative framework.
In the INTERSTELLAR (Identification of Eukaryotic cis-regulatory elements through deep-learning approaches) project, we aim to integrate these data in order to develop a deep-learning approach able to predict non-coding elements associated with functional traits over deep evolutionary scales (hundreds of millions of years) without sequence alignments in two main lineages, plants and beetles.
Project objectives
To reach its goal, the INTERSTELLAR project is divided into three main objectives:
1. Developing a deep-learning tool to predict CREs. In this first axis, the PhD candidate will re-analyze multi-omics data for model species covering the diversity of plant and beetle species, using methods that comply with FAIR principles. Some of these methods will need to be developed. The candidate will then integrate these data to develop a deep-learning approach able to predict CREs in each species.
2. Defining the logic of CRE impact on genome expression. The second objective aims to deploy the trained model to any species using a transfer-learning approach. For this, the PhD candidate will reconstruct orthogroups to define species relationships and use them to define the shared and specific CREs across deep evolutionary time.
3. Identifying CREs associated with functional traits. The last objective of the PhD will be to correlate the CREs discovered in the second objective with functional traits using statistical approaches. The focus will first be on mutualistic interactions, since a significant amount of knowledge and data is available in the host team.
PhD candidate profile
We are looking for a highly motivated person with a strong background in bioinformatics and/or deep-learning approaches.
Candidates should hold a Master 2 degree or equivalent in bioinformatics, data analysis, AI or a similar field. Previous experience in omics data analysis (e.g. transcriptomics), phylogenetics or phylogenomics will be a benefit, and the ability to code in Python, R or bash is required. Skills in deep learning, theoretical knowledge of it, or at least a strong motivation to learn these methods are also required. Experience working on high-performance computing clusters (SLURM or SGE), as well as the ability to use and write workflows with managers such as Nextflow or Snakemake, is a benefit.
The hired candidate should also be able to speak and write in English. Finally, he or she should demonstrate scientific curiosity and be able to conduct a literature review as well as work in a dynamic team in an international context.
Your Work Environment
This PhD offer is funded by the CNRS 80Prime program and will be co-supervised by Jean Keller (CNRS, UMR5546 LRSV, Toulouse, France) and Bastien Boussau (CNRS, UMR5558 LBBE, Lyon). The hired candidate will be based at LRSV in Toulouse and will regularly visit the LBBE unit in Lyon. Within the LRSV, the candidate will join the “evolution of plant-microorganisms interactions” team led by Pierre-Marc Delaux. In this interdisciplinary research group, the candidate will benefit from access to a bioinformatics platform including a high-performance computing cluster, as well as to data generated within the team, which will make it possible to apply the deep-learning approaches to under-studied species. The LRSV is part of a group of laboratories working on a range of subjects, from ecology to mechanistic biology and in silico analysis of various data.
Within the LBBE, the candidate will be part of the “Génomique Fonctionnelle et Evolutive” team, which studies the evolution of genomes in relation to the diverse phenotypes of species. The team uses and develops standard evolutionary genomics methods as well as deep-learning approaches. The team is part of the LBBE, which brings together more than 200 scientists working in genomics, bioinformatics, evolution, ecology, and human and animal health. LBBE maintains a pleasant workplace, and the candidate will benefit from the support of a team of IT engineers and access to a high-performance computing cluster.
Constraints and risks
This bioinformatics project is a desk-based role; a suitable workstation will therefore be provided to the candidate. In addition, regular visits to the LBBE unit should be expected and will be supported by the host team.
Compensation and benefits
Compensation
2300 € gross monthly
Annual leave and RTT
44 days
Remote Working practice and compensation
Remote working arrangements and allowance
Transport
75% of the cost covered, plus a sustainable mobility allowance of up to 300 €
About the offer
| Offer reference | UMR5546-JEAKEL-001 |
|---|---|
| CN Section(s) / Research Area | Integrative plant biology |
About the CNRS
The CNRS is a major player in fundamental research on a global scale. The CNRS is the only French organization active in all scientific fields. Its unique position as a multi-specialist allows it to bring together different disciplines to address the most important challenges of the contemporary world, in connection with the actors of change.