PhD position on high-performance ensemble-variational data assimilation

General information

Reference : UMR5505-CLEROG-012
Workplace : TOULOUSE
Date of publication : Thursday, June 04, 2020
Scientific supervisor : Ehouarn SIMON
Type of Contract : PhD Student contract / Thesis offer
Contract Period : 36 months
Start date of the thesis : 1 October 2020
Proportion of work : Full time
Remuneration : 2 135,00 € gross monthly

Description of the thesis topic

Context :
The development of supercomputers over the past few decades has led to tremendous progress in Earth System model (ESM) forecasting. For example, operational ocean forecasting centres are now able to run global configurations at a resolution of 1/12° (around 9-10 km), which allows mesoscale dynamics to be represented, and are targeting a resolution of 1/36° (around 3 km) for the next generation of their models. In the same way, the very high resolutions used by weather forecasting centres have led to the development of non-hydrostatic models. While the increase in model resolution leads to a better representation of complex nonlinear phenomena, it also results in a significant increase in computational costs.
Data assimilation methods combine the heterogeneous and uncertain information provided by models and observations to estimate the state and/or some parameters of a system. These methods require error covariance matrices to quantify the uncertainties in the information being assimilated. These uncertainties are associated with errors in the background (forecast) state, the observations, and the model. Despite their fundamental impact on the estimates of the state of the system, the error covariances remain poorly known in problems arising in oceanography or meteorology. Modern approaches to estimating background error covariances involve ensembles of model states. Ensembles are designed to sample the probability density function of the background error and thus provide information that is useful for improving background error specification. However, for operational ESM forecasting systems, ensemble sizes tend to be very small due to the high computational cost involved in producing them. Therefore, techniques are required to reduce computational costs while still allowing the ensembles to provide useful information for covariance estimation.
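As an illustration of why small ensembles are a concern, the sketch below estimates a covariance matrix from ensemble perturbations about the ensemble mean, once with 10 members and once with 10,000. It is only a toy: the three-variable "state", the function names and the ensemble sizes are assumptions made for this example and are not taken from the project.

```python
import random


def ensemble_covariance(ensemble):
    """Unbiased sample covariance of an ensemble given as a list of state vectors."""
    n_members = len(ensemble)
    n_state = len(ensemble[0])
    mean = [sum(member[i] for member in ensemble) / n_members for i in range(n_state)]
    # Perturbations (anomalies) of each member about the ensemble mean.
    perturbations = [[member[i] - mean[i] for i in range(n_state)] for member in ensemble]
    return [
        [sum(p[i] * p[j] for p in perturbations) / (n_members - 1) for j in range(n_state)]
        for i in range(n_state)
    ]


def draw_ensemble(n_members, rng):
    """Hypothetical 3-variable state: x1 is correlated with x0, x2 is independent."""
    members = []
    for _ in range(n_members):
        x0 = rng.gauss(0.0, 1.0)
        x1 = 0.8 * x0 + 0.6 * rng.gauss(0.0, 1.0)  # true cov(x0, x1) = 0.8
        x2 = rng.gauss(0.0, 1.0)                   # true cov(x0, x2) = 0
        members.append([x0, x1, x2])
    return members


rng = random.Random(0)
small = ensemble_covariance(draw_ensemble(10, rng))
large = ensemble_covariance(draw_ensemble(10000, rng))
print("cov(x0, x2) with 10 members:    ", small[0][2])
print("cov(x0, x2) with 10000 members: ", large[0][2])
```

With few members, entries whose true value is zero are typically contaminated by sampling noise, which is one reason small operational ensembles need additional treatment.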
In this PhD project, the multilevel Monte Carlo (MLMC) methodology will be used to exploit ensembles of different fidelity levels. MLMC is a well-established statistical method whose underlying idea is to take advantage of different levels of numerical resolution in such a way that many (cheap) evaluations of the numerical model are performed on the coarsest levels while fewer (expensive) computations are required on the finest levels, resulting in a reduced computational cost. The core of the method relies on a correction mechanism, based on a telescoping sum of contributions from successive resolutions (levels), and can be seen as a multilevel variance reduction technique. In terms of root mean square error, the many coarse-grid evaluations help reduce the sampling error while the (fewer) fine-grid evaluations help reduce the discretisation error.
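The telescoping-sum idea can be sketched on a toy problem. The example below is a minimal illustration under stated assumptions, not the method to be developed in the thesis: the "model" is an explicit Euler integration of dx/dt = -rate * x with a random rate, level l uses 2**(l + 1) time steps, and the function names and sample counts are hypothetical. Each correction term evaluates two successive levels on the same random input, so its variance is small and few samples are needed on the expensive levels.

```python
import random


def euler_decay(rate, level):
    """Approximate x(1) for dx/dt = -rate * x, x(0) = 1, with 2**(level + 1) Euler steps."""
    n_steps = 2 ** (level + 1)
    x = 1.0
    for _ in range(n_steps):
        x -= (rate / n_steps) * x
    return x


def correction(rate, level):
    """Level-0 output, or the difference between successive levels for the same input."""
    if level == 0:
        return euler_decay(rate, 0)
    return euler_decay(rate, level) - euler_decay(rate, level - 1)


def mlmc_mean(samples_per_level, rng):
    """Telescoping-sum estimate of the mean of the finest-level output."""
    estimate = 0.0
    for level, n_samples in enumerate(samples_per_level):
        total = 0.0
        for _ in range(n_samples):
            rate = rng.uniform(0.5, 1.5)  # one draw feeds both levels of the correction
            total += correction(rate, level)
        estimate += total / n_samples
    return estimate


rng = random.Random(1)
# Many cheap coarse samples, few expensive fine samples (assumed counts).
estimate = mlmc_mean([4000, 1000, 250, 60], rng)
print("MLMC estimate of E[x(1)]:", estimate)
```

For this toy problem the exact mean of exp(-rate) is exp(-0.5) - exp(-1.5), roughly 0.383. The estimate differs from it by a sampling error, controlled mostly by the coarse levels, and a discretisation error set by the finest level.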

Mission:
The primary objective of the project is to develop MLMC strategies to reduce the additional CPU cost associated with the estimation of the background error covariance matrix for ensemble-variational data assimilation methods. Additionally, the successful candidate will investigate the similarities in methodology and ingredients between multigrid methods and MLMC, and develop a unified framework for spatiotemporal discretisations of PDEs. The proposed methodology will first be validated through numerical experiments on Burgers' equation, which is a widely used 1D toy model in data assimilation. Then, the 3D extension will be developed and validated on a 2-layer quasi-geostrophic (QG) model implemented in the generic Object-Oriented Prediction System (OOPS) developed at the European Centre for Medium-Range Weather Forecasts (ECMWF). Applications in atmospheric chemistry and/or ocean data assimilation will also be considered.

Activities :
The PhD student's role and work plan will be structured around the 3 tasks detailed below:
Task 1: Estimation of the background error covariance matrix using multi-fidelity ensembles.
This task will focus on generating ensembles at different fidelity levels and on using these ensembles to estimate the background error covariance matrix at a lower computational cost. Strategies for the localisation of the covariances estimated from the ensembles within the MLMC framework will be developed. The investigations will address both theoretical issues (spatiotemporal localisation, consistency between levels) and computational issues.
Task 2: State analysis estimation using multi-fidelity ensembles in ensemble variational data assimilation.
This task consists of formulating the multilevel ensemble variational data assimilation algorithm. OOPS currently incorporates the 4DEnVar algorithm, which uses ensemble-derived error covariance matrices in the variational framework and estimates the linearised trajectory of the model by using 4D ensembles. Two complementary strategies will be investigated: the MLMC approach and multigrid methods. This will naturally lead to the need to derive consistency conditions between the ensembles running on the different levels in order to guarantee the convergence of the estimation on the original high-resolution level.
Task 3: Numerical experiments on idealised cases in atmospheric chemistry and ocean applications.
Once the multilevel algorithm for ensemble variational data assimilation has been formulated and first validated on Burgers' equation and/or the QG model, Task 3 will be dedicated to applying the algorithm to more complex problems arising in ESM data assimilation, such as atmospheric chemistry and ocean applications. This task will aim to demonstrate the capacity of the methods to tackle complex, large-scale problems.
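Covariance localisation, mentioned in Task 1, is commonly applied as an element-wise (Schur) product between the ensemble covariance and a distance-dependent taper that vanishes beyond a chosen radius. The sketch below assumes a 1D grid with unit spacing and a simple triangular taper chosen for brevity; the function names and the radius are hypothetical, and the localisation strategies to be developed within the MLMC framework may differ.

```python
def triangular_taper(distance, radius):
    """Weight that decays linearly from 1 at zero distance to 0 at the given radius."""
    return max(0.0, 1.0 - abs(distance) / radius)


def localise(covariance, radius):
    """Element-wise product of a covariance matrix with the taper on a unit-spaced 1D grid."""
    n = len(covariance)
    return [
        [covariance[i][j] * triangular_taper(i - j, radius) for j in range(n)]
        for i in range(n)
    ]


# A raw "covariance" with spurious long-range entries (all ones, for illustration only).
raw = [[1.0] * 4 for _ in range(4)]
localised = localise(raw, radius=2)
for row in localised:
    print(row)
```

Variances on the diagonal are left unchanged, while entries linking grid points at or beyond the radius are set to zero.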

Work Context

This PhD is part of the MFDA project, funded by the CNRS programme 80|PRIME 2020. The PhD student will work at CECI (Cerfacs) during the first year and again during part of the third year, when finalising the thesis (numerical experiments on more complex systems). He/She will work at IRIT (INP ENSEEIHT) during the second year and part of the third year. The advisors will be S. Gratton (Toulouse INP, IRIT) and A. Weaver (Cerfacs). He/She will be co-supervised by the participants of the MFDA project (S. Gürol, P. Mycek, E. Simon).

Constraints and risks

Nothing to report