PhD in statistical mechanics and sampling (M/F)
- FTC PhD student / Offer for thesis
- 36 months
- BAC+5
Offer at a glance
The Unit
Laboratoire de physique de l'ENS
Contract Type
FTC PhD student / Offer for thesis
Working Hours
Full Time
Workplace
75005 PARIS 05
Contract Duration
36 months
Date of Hire
01/10/2026
Remuneration
2300 € gross monthly
Application Deadline: 09 June 2026 23:59
Job Description
Thesis Subject
Generative modelling aims at the unsupervised learning of a probabilistic model capable of generating data matching typical realizations provided as training data. Various approaches have now demonstrated that deep generative models can faithfully model complex data distributions, such as distributions over images, audio, or text. A well-known example is the generation of an image with specific content, say a face, starting from a collection of such images. Recently, it has been proposed to repurpose these powerful generative models (GMs) to tackle the sampling problem, which arises when one has no data from the distribution of interest but instead knows its unnormalized density [LW18; AKS19; Noé+19]. The goal then becomes to train a generative model that approximates this target distribution and facilitates its sampling, as required in statistical mechanics or Bayesian inference.
Here, a debiasing step is crucial to avoid uncontrolled approximations in sampling. Pioneering works in this direction, including those of the supervisor [RV22; Gre+23], have demonstrated that subclasses of GMs — normalizing flows and autoregressive networks — can achieve exact sampling by enabling the computation of reweightings of the generated realizations with respect to the target measure. Proofs of concept have been demonstrated across various areas of physics and chemistry, including lattice quantum field theories [Abb+24], biomolecules [Noé+19], and nano-clusters of heavy atoms [Mol+24]. However, while debiasing is straightforward and computationally inexpensive for normalizing flows and autoregressive networks thanks to their tractable likelihood, these classes of GMs are limited by their lack of expressiveness.
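As an illustration of the reweighting idea described above, here is a minimal sketch of self-normalized importance sampling. It is not the method of the cited works: the one-dimensional two-mode target, the wide Gaussian standing in for a trained generative model with tractable likelihood, and all function names are assumptions made for this example.

```python
import numpy as np

def log_target_unnorm(x):
    """Unnormalized log-density of an illustrative 1D two-mode Gaussian mixture."""
    # Equal-weight unit-variance modes at -2 and +2; the normalization constant is omitted on purpose.
    return np.logaddexp(-0.5 * (x + 2.0) ** 2, -0.5 * (x - 2.0) ** 2)

def log_proposal(x, scale=3.0):
    """Exact log-density of the stand-in generative model (assumption: a centered Gaussian)."""
    return -0.5 * (x / scale) ** 2 - np.log(scale) - 0.5 * np.log(2.0 * np.pi)

def snis_mean(f, n_samples=100_000, scale=3.0, seed=0):
    """Estimate E_target[f(x)] from proposal samples using self-normalized importance weights."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, scale, size=n_samples)
    # Log-weights: unnormalized target density over the exact proposal density.
    log_w = log_target_unnorm(x) - log_proposal(x, scale)
    log_w -= log_w.max()  # stabilize the exponential
    w = np.exp(log_w)
    w /= w.sum()  # self-normalization removes the unknown constant
    ess = 1.0 / np.sum(w ** 2)  # effective sample size diagnostic
    return np.sum(w * f(x)), ess

# The second moment of this mixture is variance + mean^2 = 1 + 4 = 5.
est_second_moment, ess = snis_mean(lambda x: x ** 2)
```

The effective sample size gives a rough indication of how well the proposal covers the target: the closer it is to the number of samples, the less the reweighting has to correct.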
Project: The goal of this project is to benchmark and develop debiasing methods for the more powerful diffusion models [Soh+15; Son+21] and flow matching models [Lip+23; ABV23]. We will aim to explore two possible strategies. On the one hand, an approximate likelihood of these models can be computed using the ordinary differential equation (ODE) description equivalent to their traditional stochastic differential equation (SDE) implementation. On the other hand, these models are rooted in non-equilibrium statistical mechanics, which provides tools to estimate trajectory reweightings [Cro98; AV24], in a manner closely related to sequential Monte Carlo [CP20] developed in statistics. Starting from the simple case of sampling from Gaussian mixtures, we will develop and benchmark approaches exploiting these two directions. The most successful design will then be tested on more challenging tasks, such as sampling from molecular systems.
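The ODE route to a likelihood mentioned above can be sketched on a toy case. Along a flow dx/dt = v_t(x), the log-density of a sample evolves as d/dt log p_t(x_t) = -div v_t(x_t). The example below integrates both with an explicit Euler scheme for a hypothetical linear velocity field, chosen only because the exact answer is known; a learned diffusion or flow matching model would replace `velocity` and `divergence`, and time discretization is one source of approximation in the resulting likelihood.

```python
import numpy as np

A = 0.5  # assumption: linear velocity field v(x) = A * x, standing in for a learned network

def velocity(x, t):
    return A * x

def divergence(x, t):
    # dv/dx for the 1D linear field; a learned model would need autodiff or a stochastic estimator.
    return np.full_like(x, A)

def log_base(x):
    """Log-density of the standard normal base distribution."""
    return -0.5 * x ** 2 - 0.5 * np.log(2.0 * np.pi)

def sample_with_log_density(n_samples=5, n_steps=1000, seed=0):
    """Push base samples through the ODE while accumulating their log-density."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_samples)
    log_p = log_base(x)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        log_p = log_p - divergence(x, t) * dt  # d/dt log p_t(x_t) = -div v_t(x_t)
        x = x + velocity(x, t) * dt            # explicit Euler step of the flow
    return x, log_p

x1, log_p1 = sample_with_log_density()

# For this linear field the pushforward is exactly Gaussian with standard deviation exp(A).
sigma1 = np.exp(A)
exact = -0.5 * (x1 / sigma1) ** 2 - np.log(sigma1) - 0.5 * np.log(2.0 * np.pi)
max_err = np.max(np.abs(log_p1 - exact))
```

The accumulated `log_p1` can then play the role of the proposal log-density in an importance reweighting against an unnormalized target.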
[Abb+24] Ryan Abbott et al. "Applications of flow models to the generation of correlated lattice QCD ensembles". In: Physical Review D 109.9 (May 2024), p. 094514. doi: 10.1103/PhysRevD.109.094514.
[ABV23] Michael S. Albergo, Nicholas M. Boffi, and Eric Vanden-Eijnden. Stochastic Interpolants: A Unifying Framework for Flows and Diffusions. arXiv:2303.08797 [cond-mat]. Mar. 2023.
[AKS19] M.S. Albergo, G. Kanwar, and P.E. Shanahan. "Flow-based generative models for Markov chain Monte Carlo in lattice field theory". In: Physical Review D 100.3 (Aug. 2019), p. 034515. issn: 2470-0010, 2470-0029. doi: 10.1103/PhysRevD.100.034515.
[AV24] Michael S. Albergo and Eric Vanden-Eijnden. NETS: A Non-Equilibrium Transport Sampler. arXiv:2410.02711. Oct. 2024. doi: 10.48550/arXiv.2410.02711.
[CP20] Nicolas Chopin and Omiros Papaspiliopoulos. An introduction to sequential Monte Carlo. 1st ed. 2020. Springer Series in Statistics. Cham: Springer International Publishing, 2020. isbn: 978-3-030-47847-6 978-3-030-47845-2. doi: 10.1007/978-3-030-47845-2.
[Cro98] Gavin E. Crooks. "Nonequilibrium Measurements of Free Energy Differences for Microscopically Reversible Markovian Systems". In: Journal of Statistical Physics 90.5 (Mar. 1998), pp. 1481–1487. issn: 1572-9613. doi: 10.1023/A:1023208217925.
[Gre+23] Louis Grenioux et al. "On Sampling with Approximate Transport Maps". In: Proceedings of the 40th International Conference on Machine Learning. PMLR, July 2023, pp. 11698–11733.
[Lip+23] Yaron Lipman et al. "Flow Matching for Generative Modeling". In: ICLR. Sept. 2023.
[LW18] Shuo-Hui Li and Lei Wang. "Neural Network Renormalization Group". In: Physical Review Letters 121.26 (Dec. 2018), p. 260601. issn: 0031-9007, 1079-7114. doi: 10.1103/PhysRevLett.121.260601.
[Mol+24] Ana Molina-Taborda et al. "Active Learning of Boltzmann Samplers and Potential Energies with Quantum Mechanical Accuracy". In: J. Chem. Theory Comput. (Oct. 2024). issn: 1549-9618. doi: 10.1021/acs.jctc.4c00506.
[Noé+19] Frank Noé et al. "Boltzmann generators: Sampling equilibrium states of many-body systems with deep learning". In: Science 365.6457 (Sept. 2019), eaaw1147. issn: 0036-8075, 1095-9203. doi: 10.1126/science.aaw1147.
[RV22] Marylou Gabrié, Grant M. Rotskoff, and Eric Vanden-Eijnden. "Adaptive Monte Carlo augmented with normalizing flows". In: Proceedings of the National Academy of Sciences 119.10 (Mar. 2022). issn: 0027-8424. doi: 10.1073/pnas.2109420119. arXiv: 2105.12603.
[Soh+15] Jascha Sohl-Dickstein et al. "Deep Unsupervised Learning using Nonequilibrium Thermodynamics". In: Proceedings of the 32nd International Conference on Machine Learning. Lille, France: PMLR, July 2015, pp. 2256–2265.
[Son+21] Yang Song et al. "Score-Based Generative Modeling through Stochastic Differential Equations". In: International Conference on Learning Representations. Oct. 2021.
Your Work Environment
The PhD will be hosted jointly by the LPENS and the Centre de Sciences des Données of the École Normale Supérieure.
Constraints and risks
None
Compensation and benefits
Compensation
2300 € gross monthly
Annual leave and RTT
44 days
Remote Working practice and compensation
Remote working practice and allowance
Transport
75% of the cost covered, plus a sustainable mobility allowance of up to €300
About the offer
Offer reference
UMR8023-MARGAB-003
About the CNRS
The CNRS is a major player in fundamental research on a global scale. The CNRS is the only French organization active in all scientific fields. Its unique position as a multi-specialist allows it to bring together different disciplines to address the most important challenges of the contemporary world, in connection with the actors of change.