General information
Job title: Bioinformatician (M/F)
Reference: UMR7280-MARDAL-003
Number of positions: 1
Workplace: MARSEILLE 09
Publication date: Tuesday, May 20, 2025
Contract type: Fixed-term PhD contract
Contract duration: 36 months
Thesis start date: October 1, 2025
Working time: Full-time
Remuneration: 2200 gross monthly
CN section(s): 51 - Mathematical modeling, computer science and physics for the life sciences
Description of the thesis topic
Thesis title: Computational deciphering and mathematical modeling of the regulatory networks controlling plasmacytoid dendritic cell biology.
PhD project summary: Our aim is to develop a new interdisciplinary strategy to identify and model the molecular regulatory networks and cellular interactions controlling the identity and physiological functions of cell types. We will focus on plasmacytoid dendritic cells (pDCs). These immune cells are of interest to the project because their identity and functions are controversial due to their high plasticity. From a methodological point of view, the remarkable plasticity of pDCs poses a unique computational challenge: depending on their tissue microenvironment and state of activation, they express different gene modules, some specific, others shared with dendritic cells or with innate lymphoid cells. This complexity raises fundamental questions about the identity and functions of pDCs, including whether they constitute a unique cell type or belong to the dendritic cell versus innate lymphoid cell family (1, 2). Recent advances in bioinformatics, notably in single-cell molecular analysis and gene network inference, offer new opportunities to decipher this plasticity (3). These approaches enable the unsupervised identification of coregulated gene modules, opening the way to a better understanding of the molecular mechanisms that define and maintain the identity of pDCs despite their plasticity, or that control their functions according to their state of activation. To resolve this controversy, we are combining the strengths of two teams, one expert in immunology (Marc Dalod, at CIML, Marseille) and the other in computer science (Magali Richard, at TIMC, Grenoble). Using single-cell characterization techniques, we have established the activation trajectory of pDCs during viral infection. Further analysis of these data will require jointly developing an algorithmic approach to model the mechanisms governing pDC identity and plasticity. pDCs are a major source of type I and III interferons (IFNs), key cytokines in the antiviral defense of vertebrates (4).
Some of the molecular mechanisms promoting IFN production by pDCs are known; however, this is not sufficient to understand how this function is controlled in time and space. Filling this gap is crucial, as deregulation of the control of IFN production by pDCs contributes to various diseases, including inflammatory or autoimmune diseases such as psoriasis and lupus, or immune dysregulation diseases induced by respiratory or chronic viral infections (4, 5). The project could therefore open up novel therapeutic avenues for promoting the antiviral role or inhibiting the autoimmune activity of pDCs.
Specific aims:
1. Define the identity of pDCs across tissues, species and conditions, characterizing their regulatory networks compared to those of other immune cell types.
2. Understand the functional plasticity of pDCs, by identifying and characterizing gene modules that vary between successive cellular states of the pDC activation trajectory during viral infection, or according to the tissue or anatomical microenvironment in which they reside.
3. Develop new bioinformatics methods to better identify cell types by finely characterizing the molecular and functional activities of cells in relation to their environment.
Scientific fields and themes:
Bioinformatics / computational biology, mathematical modeling of biological processes, applied to immunology.
Methodology. The project will require an interdisciplinary approach combining experimental and computational analyses. The Dalod team has already generated a robust set of biological data, mapping pDC activation states in various tissues and two viral infection models (5, 6, 7 unpublished), using state-of-the-art technologies (high-content flow cytometry, single-cell RNA sequencing, quantitative spectral confocal microscopy). We will integrate it with currently generated or publicly available datasets, including spatial transcriptomics to facilitate micro-anatomical mapping of pDC niches and their in-situ characterization. On the computational side, our methodological approach is structured in three complementary phases. 1) Using algorithmic approaches (graph theory), we will construct gene networks specific to pDCs, to other dendritic cells and to innate lymphoid cells, and we will develop robust metrics to quantify similarities and dissimilarities between these cell types, in their basal state and according to their activation states. 2) We will exploit bioinformatic approaches for differential analysis on massive data, developed by M. Richard (8), to identify expression deregulations specific to each activation state of pDCs, in order to identify variable gene modules and analyze their dependence on tissue context. 3) We will iteratively validate and refine our models. We will systematically evaluate the performance of the algorithms on reference datasets, using benchmarking protocols developed by M. Richard's team (9). The predictions of our models will be validated experimentally by the Dalod team. This iterative approach will enable continuous adjustment of the models, based on experimental feedback, thus guaranteeing the biological relevance of our methodological developments.
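For candidates unfamiliar with the idea, phase 1 of the methodology (building gene networks per cell type and quantifying their similarity) can be illustrated with a deliberately minimal sketch. It is not the method used by either team: the correlation threshold, the Jaccard index on edge sets, the gene names and the expression values are all assumptions chosen purely for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expr, threshold=0.9):
    """Edges of a toy co-expression network.

    expr maps a gene name to its expression values across cells.
    Two genes are linked when |Pearson r| >= threshold (an arbitrary cutoff).
    """
    edges = set()
    for g1, g2 in combinations(sorted(expr), 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            edges.add((g1, g2))
    return edges


def jaccard(edges_a, edges_b):
    """Jaccard similarity between two edge sets (1.0 when both are empty)."""
    union = edges_a | edges_b
    if not union:
        return 1.0
    return len(edges_a & edges_b) / len(union)


# Hypothetical expression values for two cell types (4 cells each).
type_1 = {"geneA": [1, 2, 3, 4], "geneB": [2, 4, 6, 8], "geneC": [4, 1, 3, 2]}
type_2 = {"geneA": [1, 2, 3, 4], "geneB": [2, 4, 6, 8], "geneC": [2, 4, 6, 9]}

edges_1 = coexpression_edges(type_1)
edges_2 = coexpression_edges(type_2)
similarity = jaccard(edges_1, edges_2)
```

In this toy case the two networks share one edge out of three, so the similarity is 1/3. Real analyses would rely on dedicated network inference tools and on metrics robust to noise and sparsity.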
Role of the PhD student. He/she will perform computational analyses to define the identity of cell types from omics data and measure the distance between them. This will involve building and analyzing a single-cell RNA-seq gene expression profiling atlas of the different immune cell types of interest in mouse spleens. We have already identified complementary public datasets, verified their compatibility and initiated their analysis. Epigenetic data of interest have also been identified. The PhD student will therefore have a solid foundation on which to start this task. The PhD student will need to use different strategies for defining cell identities and different methods for calculating distance between cell types. The PhD student will be required to carry out advanced computational analyses to identify the networks of molecular regulation and cellular interactions controlling the activation trajectory of pDCs and their functional plasticity, during an immune response and/or as a function of their anatomical microenvironment. Datasets are already available from the Dalod team. We have determined by single-cell RNA sequencing the activation trajectory of pDCs in vivo in the spleen during a kinetic analysis of the innate immune response to viral infection, in relation to the functions exerted by the different activation states of pDCs and their micro-anatomical localization (6, 7). We have also characterized the gene expression profile of pDCs in different tissues in the basal state (unpublished). The PhD student will therefore have a solid foundation for identifying and analyzing gene regulatory networks showing dynamic variations in activity during the activation trajectory of pDCs or as a function of their tissue localization. For this task, the PhD student will have to test various existing algorithms, some of which have been developed by M. Richard's team.
The PhD student will also be expected to develop new algorithms, including advanced machine learning / artificial intelligence methods. The PhD student will then generate mathematical models of these regulatory networks and use them to infer new hypotheses about the mechanisms controlling pDC biology. An iterative dialogue with experimental researchers, and the comparison of model predictions with experimental data designed to test the new hypotheses, will make it possible to refine the model.
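As an illustration of what "calculating distance between cell types" can mean in its simplest form, the sketch below compares the mean expression profiles (centroids) of cell types using cosine distance. This is only one of many possible strategies and is not presented as the approach the teams will adopt; the cell-type labels and expression values are hypothetical.

```python
from math import sqrt


def centroid(cells):
    """Mean expression profile of a cell type.

    cells is a list of per-cell expression vectors sharing the same gene order.
    """
    n = len(cells)
    return [sum(column) / n for column in zip(*cells)]


def cosine_distance(u, v):
    """1 - cosine similarity; 0 for parallel profiles, 1 for orthogonal ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for an all-zero profile")
    return 1 - dot / (norm_u * norm_v)


# Hypothetical cell types, each with two cells measured on two genes.
profiles = {
    "type_a": centroid([[1, 0], [3, 0]]),
    "type_b": centroid([[0, 2], [0, 4]]),
    "type_c": centroid([[2, 0], [4, 0]]),
}

dist_ab = cosine_distance(profiles["type_a"], profiles["type_b"])
dist_ac = cosine_distance(profiles["type_a"], profiles["type_c"])
```

Here `type_a` and `type_c` express the same gene at different levels, so their distance is 0, whereas `type_a` and `type_b` express disjoint genes and sit at distance 1. Centroid-based distances ignore within-type heterogeneity, which is one reason the project calls for testing several definitions of identity and distance.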
References. *Equal contribution. **Co-senior authors.
1. Ziegler-Heitbrock L, Ohteki T, Ginhoux F, Shortman K, Spits H. Reclassifying pDCs as innate lymphocytes. Nat Rev Immunol. 2023. doi: 10.1038/s41577-022-00806-0.
2. Reizis B, Idoyaga J, Dalod M, Barrat F, Naik S, Trinchieri G, Tussiwand R, Cella M, Colonna M. Reclassification of pDCs as innate lymphocytes is premature. Nat Rev Immunol. 2023. doi: 10.1038/s41577-023-00864-y.
3. Badia-i-Mompel P, ..., Saez-Rodriguez J. Gene regulatory network inference in the era of single-cell multi-omics. Nat Rev Genet. 2023. doi: 10.1038/s41576-023-00618-5.
4. Ngo C*, Garrec C*, Tomasello E*, Dalod M.* The role of plasmacytoid dendritic cells (pDCs) in immunity during viral infections and beyond. Cell Mol Immunol. 2024. doi: 10.1038/s41423-024-01167-5.
5. Ngo C, Rahmani K, Valente M … Zarubica A**, Dalod M**, Tomasello E**. pDCs are dispensable or detrimental in murine systemic or respiratory viral infections. bioRxiv. doi: 10.1101/2024.05.20.594961.
6. Abbas A*, Vu Manh TP*, Valente M … Milpied P, Dalod M**, Tomasello E**. The activation trajectory of pDCs in vivo during a viral infection. Nat Immunol. 2020. doi: 10.1038/s41590-020-0731-4.
7. Valente M, Collinet N, Vu Manh TP … Milpied P, Tomasello E**, Dalod M**. Novel mouse models based on intersectional genetics to identify and characterize pDCs. Nat Immunol. 2023. doi: 10.1038/s41590-023-01454-9.
8. Richard M, Decamps C, Chuffart F, Brambilla E, Rousseaux S, Khochbin S, Jost D. PenDA, a rank-based method for personalized differential analysis: Application to lung cancer. PLoS Comput Biol. 2020. doi: 10.1371/journal.pcbi.1007869.
9. Amblard E, Bertrand V, Martin Pena L, Karkar S, Chuffart F, Ayadi M, Baures A, Armenoult L, Kermezli Y, Cros J, Blum Y**, Richard M**. A robust workflow to benchmark deconvolution of multi-omic data. bioRxiv. doi: 10.1101/2024.11.08.622633.
Work context
This position is funded through the CNRS action “80PRIME 2025”, which supports interdisciplinary projects between research units belonging to different CNRS institutes. Successful completion of this project requires interdisciplinary synergy between experimental immunology (Team 1 “Dendritic cells and antiviral defense”, Marc Dalod, CIML unit, Marseille, CNRS Biology Institute) and computer science (Team 2 “Models and Algorithms for Genomics”, Magali Richard, TIMC unit, Grenoble, CNRS Computer Science Institute). None of the objectives of the project can be achieved without this partnership. The interpretation of analyses will draw on the complementary expertise of the 2 teams. Model validation will be based on an iterative process between the partners.
The position is located at the Centre d'Immunologie de Marseille-Luminy (CIML), a joint research unit of the CNRS, Inserm and Aix Marseille University, comprising around 200 staff and 16 research teams, in the Luminy science and technology park in Marseille (France).
The team of Marc Dalod brings its expertise in antiviral immunology and pDC biology, unique animal models (5, 6, 7) and unique data. It is at the origin of the biological question and guarantees its relevance. It is essential to provide the experimental data that will feed the modeling work. It will carry out the experimental tests of the hypotheses derived from the analyses and models. These tasks will be carried out in close collaboration with Team 2, to ensure the best possible match between the design of experiments and the requirements for computational analysis and mathematical modeling.
The team of Magali Richard specializes in computational analysis and modeling of cellular heterogeneity. The central bioinformatics question it addresses is to understand what defines the identity of a cell, both functionally and molecularly, in close connection with the dynamic interactions of its environment. This project explores this question by drawing on data generated by Team 1. The computational expertise of Team 2 is essential to interpret these complex data and generate novel hypotheses that are difficult to explore using experimental approaches alone. Team 2 will lead the design and implementation of the bioinformatics analyses and mathematical models, to which Team 1 will contribute. Magali Richard will directly supervise the PhD student on computational aspects, with the support of Lucie Lamothe (IR CNRS).
At CIML, the CB2M (Computational Biology, Biostatistics and Modelling) group will help to ensure a suitable working environment for the PhD student, facilitate communication between Teams 1 and 2 by contributing its expertise in both computational biology and immunology, and lend its skills to Team 2 to help develop new analysis methods. CB2M comprises 4 engineers specialized in computational biology, biostatistics and mathematical modeling. They collaborate with the research teams to provide support for interdisciplinary projects and promote the most advanced methods and technologies. The CB2M is also a collaborative ecosystem, where all the bioinformaticians can work in a community within the same premises, offering rich and dynamic daily interactions. CB2M works alongside researchers to monitor projects carried out by bioinformaticians, and co-supervises their activities. The CB2M is strongly committed to the implementation of best practices concerning the reproducibility of analyses, the FAIR aspect of data and the dissemination of results (Open Science).
Constraints and risks
The PhD student will be co-supervised by Marc Dalod and Magali Richard, with CIML as main affiliation.
Secondary location: The PhD student will spend long periods (2 to 3 months each) in the team of Magali Richard, to assimilate the complementary expertise and scientific culture of the two laboratories, and promote synergy between them.
The PhD student will have to keep abreast of technological and methodological developments.
Additional information
Expected training:
1. Master's degree in bioinformatics / computational biology and/or mathematics.
2. Strong interest in mathematical modeling of biological processes.
Expected scientific skills:
1. Excellent programming skills in R or Python.
2. Knowledge of software required for genomic (e.g. CellRanger, STAR) and single-cell (e.g. Seurat, Monocle, Velocyto or Scanpy) data analysis.
3. Practical understanding of mathematical methods for multidimensional data analysis (PCA, t-SNE, UMAP, pseudotime, statistical learning, etc.).
Desired scientific skills.
1. Working knowledge of containerization tools (Docker, Singularity), version managers (Git) and workflow managers (Snakemake).
2. Interest in methodological aspects of data analysis.
Desired experience.
Successful experience in bioinformatics applied to genomic or transcriptomic data, ideally at single-cell scale.
Language skills.
Good written and oral communication skills. Fluency in scientific English (written and spoken).
Other qualities expected.
1. Ability to work as part of a team in a multidisciplinary environment; ability to listen and make suggestions.
2. Excellent analytical and synthesis skills.
3. Excellent organization and autonomy in project management.
4. Interest in biology, particularly in immunology.