
High Performance Computing for the Large Hadron Collider M/F


Application deadline: Friday 14 April 2023


General information

Job title: High Performance Computing for the Large Hadron Collider M/F
Acronym: HPC4LHC
Reference: CPJ-2023-001
Number of positions: 1
Site(s) concerned: Université Lyon / Centre de Calcul de l’IN2P3
Academic region(s): Lyon
Prospective partner institution(s): Université de Lyon 1
Institution code(s): UAR6402
Publication date: Thursday 16 March 2023
Contract type: Junior professorship chair (Chaire de professeur junior)
Contract duration: 4 years
Expected start date: 1 July 2023
Working time: Full time
Remuneration: Annual salary from 54,600 to 57,800 euros, depending on professional experience
Scientific theme: Experimental or observational Big Data processing, high-throughput and real-time processing on high performance computing infrastructures. Artificial intelligence approaches will be used with a view to exascale computing, taking into account the need for energy efficiency.
Keywords: Big Data, HPC, HTC, Artificial intelligence, Real-time computing, Exascale
CN section(s): Information sciences: bases of information technology, calculations, algorithms, representations, uses

Candidate profile

Candidates must hold a doctorate or an equivalent degree, or be able to show scientific qualifications and work judged equivalent by the competent body of the institution. There is no age or nationality requirement to apply. All CNRS positions are open to people with disabilities, with the adjustments to the selection process made necessary by the nature of the disability.

Institutional strategy

Computing infrastructures are essential for turning the data produced by research infrastructures into scientific results. The CNRS has large computing capacities with the IN2P3 Computing Centre (CC-IN2P3), which plays a central role in the processing of LHC data, and IDRIS, an intensive computing centre that hosts, in particular, the Jean Zay supercomputer. Within the framework of the Equipex FITS, the two computing centres have started a collaboration for an efficient and transparent cross-use of their computing resources. An exemplary use case for such a structure is the processing of LHC data in its high-luminosity phase, which represents an increase of at least one order of magnitude in the volume and flow of data compared to the current phase. To handle such data flows, ongoing and future developments in artificial intelligence and online data processing will require resources with heterogeneous technologies. The tools that will have to be developed for a joint and optimal use of the two computing centres in this context will pave the way for an optimised exploitation of the available resources and of the more heterogeneous technologies that are currently being developed.

Host laboratory strategy

This project continues the synergy between CC-IN2P3, IDRIS and GENCI initiated by the FITS project in order to provide resources (computing, storage) and tools for data processing in research infrastructures (IR and IR*). As Exascale computers, characterised by massively accelerated architectures, become available, the proposed project will prepare for their use, with a view to processing massive data, in particular for the LHC experiments. The LHC collaborations have already produced 2 exabytes of data, and their current exploitation relies mainly on distributed High Throughput Computing infrastructures: the WLCG computing grid is built on a worldwide network of more than 250 sites operating interconnected computing farms and storage facilities to process the data. The CC-IN2P3 is one of the twelve level 1 centres (Tier 1). Exploring other possible architectures, in particular an optimised use of infrastructures such as the Jean Zay supercomputer, could eventually allow cross-use of the different computing infrastructures. This project will also strengthen cooperation between the CC-IN2P3 research team and those of the INS2I laboratories in Lyon, such as LIRIS or LIP.

International strategy

Thanks to this rapprochement, the CNRS can equip itself with complementary, interoperable data processing infrastructures such as those that exist in other European countries (Germany, Netherlands, Italy). HPC is already widely used for processing massive data in the United States. The collaboration between the CC-IN2P3 and IDRIS on the cross-use of resources continues the projects in which these two units have been involved for several years, within the framework of leading international collaborations, in particular with CERN. CERN is leading the development of data processing architectures for the LHC on the one hand and, on the other hand, is contributing to actions aimed at proposing unified processing or storage architectures that can also meet the needs of other scientific communities.

National directory of research structures (RNSR) of the host laboratory

197619804K

Summary of the scientific project

The project will develop the tools necessary for data processing on different types of computing and data storage infrastructures. To this end, the tools must be designed to make this use largely transparent to users, taking into account the constraints imposed by each type of infrastructure. Workflows and software that manage both computing resources and data storage and transfer in an optimised manner must be designed, adapted and integrated. These developments will require the use of artificial intelligence and diversified computing technologies as well as the adaptation or design of new methods to make the best use of exascale computers. The energy efficiency of data processing chains will also be at the heart of this research. Although the HL-LHC data processing will be taken as a concrete example of implementation, the work will have a broader scope and will certainly guide practices for the intensive data processing of other large research infrastructures in the future.

Summary of the teaching project

The activities of the person recruited will be part of the training offer of one of the partner institutions in Lyon in the field of data computing, processing and management, which leads to a profile commonly called "Data Scientist". Through this chair project, the person will naturally be led to propose modules of training through research in the field. His/her skills should not only be in the field of computing infrastructure architectures, but also in domains and artificial intelligence. The recruited candidate will be able to contribute to training at various levels and in various teaching structures, such as the Claude Bernard Lyon 1 University IUT, the Bachelor and Master of Physics, or the ENS Lyon. He/she will be tasked with developing high-level teaching, including in English, in response to the drive for international openness.

Financial environment

  • Total funded (including ANR package): 200 k€
  • Co-funding: 210 k€
  • Project total: 410 k€

Scientific dissemination

The results will be disseminated through world-class scientific output: publications, patents, software, etc. In addition, the results will be communicated to various audiences such as scientific communities, the media, decision makers, the general public and schools, on a suitable schedule. Specific tools may be developed, such as websites, newsletters, meetings, international symposia, summer schools and conferences.
More specifically, the results of this research and their impact on physics applications, as well as the numerical methods developed within the framework of this project, will be presented at dedicated conferences and published in scientific journals in the field. In addition to computer science journals, the project will target the very prestigious International Conference on Computing in High Energy & Nuclear Physics (CHEP) series, organised every 18 months, and publications in the journal Computing and Software for Big Science.

Open science

The CNRS is developing a strong policy in favor of open science. Open science consists of making research results "as accessible as possible and closed as necessary". As such, the CNRS aims to make 100% of the texts of publications resulting from the work of its laboratories accessible, in particular through deposit in HAL. The data produced must also be made available and reusable, subject to specific restrictions. In addition, the guiding principles of individual evaluation have been revised in accordance with the DORA declaration, to be more qualitative and to take into account all facets of the researcher's profession.
The publication of codes is a widespread practice that drives progress in data science. This project naturally adopts this established culture and will make public the methods, algorithms and codes developed. Furthermore, the LHC collaborations mentioned in this project each have a strong policy of opening up data, software and publications, supported by CERN, which has been a forerunner in this area. Similarly, the vast majority (>90%) of IN2P3 publications are already open access.

Science and society

The relationship between science and society is now recognized as a full dimension of scientific activity. The project will develop this dimension in synergy with all the partners. The resulting research work will contribute to informing public decision-making. Participatory science initiatives may be launched with actors from the project’s socio-economic and cultural ecosystem.
In addition, the project will communicate with various audiences such as the scientific community, the media, decision-makers, the general public and schools, on a suitable schedule. The communication actions at the CC-IN2P3 use a wide variety of formats: articles, conferences, meetings with the public (science festival), visits to the infrastructures and the museum, and interactive digital events (Twitch platform, etc.).

Indicators

The activity will be evaluated in particular on the basis of scientific production (publications, software, patents, etc.), institutional and private partnerships formalized by contracts, international presence, the promotion of work to multidisciplinary scientific communities, innovation and its transfer to society, and scientific dissemination to non-specialist audiences.
More specifically, the progress of the project will be monitored through the project reviews that are standard practice at the CC-IN2P3 and IN2P3. The success of the project will be measured by the functionality of the developed system, but also by publications in journals such as Computing and Software for Big Science and those related to the research areas concerned. It will also be measured by conference presentations, such as at the International Conference on Computing in High Energy & Nuclear Physics (CHEP), and by papers and demonstrations, for example at the annual Supercomputing conference in the United States or in journals and conferences related to high performance computing, data analysis or AI.

Organisation of the interviews

Only candidates shortlisted by the selection committee on the basis of their application will be invited to an interview.