PhD student (M/F): Ontology-Driven Phenotyping from Electronic Health Records

Application Deadline : Wednesday, June 29, 2022

Make sure that your candidate profile is filled in correctly before applying. The information in your profile complements the information attached to each application. To increase your visibility on our Job Portal and allow recruiters to view your candidate profile, you can upload your CV to our CV library in one click.

General information

Reference : UMR5800-MAGHIN-020
Workplace : BORDEAUX
Date of publication : Wednesday, June 8, 2022
Name of the scientific supervisor : Meghyn Bienvenu
Type of Contract : PhD Student contract / Thesis offer
Contract Period : 36 months
Start date of the thesis : 1 October 2022
Proportion of work : Full time
Remuneration : €2,135.00 gross per month

Description of the thesis topic

Scientific context:

With the increasing adoption of electronic health records (EHRs), the amount of data produced at the patient bedside is growing rapidly. These data offer new opportunities to create and disseminate knowledge and to enable personalized and predictive medicine. The secondary use of biomedical data produced throughout a patient's care is therefore an essential issue and has been the subject of numerous studies for several years. Although platforms for the secondary use of EHR data have been developed, their adoption within healthcare remains slow, and exploiting these data remains difficult for clinical and research experts.

In particular, the use of raw EHRs for phenotyping (i.e. identifying patients based on their clinical characteristics) remains a challenge. Querying these data to identify patients matching a specific phenotype (such as patients with metastatic lung cancer treated with tyrosine kinase inhibitors) often yields an incomplete and noisy set of patients because the data are imperfect, erroneous, and inconsistent. For example, the data may not contain an accurate diagnosis if patients were diagnosed outside the hospital. This information must then be extracted from free-text documents or inferred from other types of available information (treatments, surgical procedures, etc.). Moreover, the information shared by healthcare professionals in the context of care relies on common knowledge, and much information (or many relationships between pieces of information) is not explicit in the data. It is therefore necessary both to identify patients who meet the phenotype criteria and to classify them according to the consistency and completeness of their data, in a setting where the data are incomplete, heterogeneous, and dependent on time and external knowledge.

Ontology-based data access (OBDA) is a promising declarative approach that leverages semantic knowledge and automatic reasoning to bridge the gap between users' information needs and the way data is actually stored. As OBDA systems mature, it is interesting to study their contribution in the context of secondary use of health data.
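
To give a rough idea of the OBDA principle described above, here is a minimal toy sketch in Python. It is not the system to be developed in the thesis, and it does not use any real OBDA tool: the concept names, the codes, the subclass axioms, the mappings, and the patient records are all invented for illustration. The sketch only shows how a query phrased in ontology terms can be rewritten, using subclass axioms and mappings, into a lookup over the codes that are actually stored.

```python
# Toy illustration of ontology-based data access (OBDA).
# Every concept, code, axiom, and record below is hypothetical.

# Ontology: each concept is listed with its direct superconcepts.
SUBCLASS_OF = {
    "MetastaticLungCancer": {"LungCancer"},
    "OnLungCancerTherapy": {"LungCancer"},  # made-up axiom: the therapy implies the diagnosis
    "LungCancer": {"Cancer"},
}

# Mappings: which raw (source, code) pairs populate which ontology concept.
MAPPINGS = {
    ("diagnosis", "DX_LUNG"): "LungCancer",
    ("diagnosis", "DX_LUNG_META"): "MetastaticLungCancer",
    ("drug", "RX_LUNG_THERAPY"): "OnLungCancerTherapy",
}


def subconcepts(concept):
    """Return the concept together with all of its direct and indirect subconcepts."""
    result = {concept}
    changed = True
    while changed:
        changed = False
        for child, parents in SUBCLASS_OF.items():
            if child not in result and parents & result:
                result.add(child)
                changed = True
    return result


def rewrite(concept):
    """Rewrite a concept query into the set of raw (source, code) pairs that answer it."""
    targets = subconcepts(concept)
    return {key for key, mapped in MAPPINGS.items() if mapped in targets}


def answer(concept, records):
    """Evaluate the rewritten query over raw records and return sorted patient ids."""
    keys = rewrite(concept)
    return sorted({r["patient"] for r in records if (r["source"], r["code"]) in keys})


RECORDS = [
    {"patient": "p1", "source": "diagnosis", "code": "DX_LUNG_META"},
    {"patient": "p2", "source": "drug", "code": "RX_LUNG_THERAPY"},
    {"patient": "p3", "source": "diagnosis", "code": "DX_OTHER"},
]

print(answer("LungCancer", RECORDS))            # ['p1', 'p2']
print(answer("MetastaticLungCancer", RECORDS))  # ['p1']
```

In this toy setting, patient p2 has no recorded diagnosis but is still returned for the "LungCancer" query, because the (invented) axiom lets the diagnosis be inferred from a treatment record. Real OBDA systems rely on formal ontology languages and database mappings rather than Python dictionaries.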

Research topic:

The recruited student will work in collaboration with the supervisors to study the capacity of OBDA systems to perform tasks allowing the exploitation of health data in the context of applied use cases. This work will be done using data from the Health Data Warehouse of the Bordeaux University Hospital. The steps envisaged for this thesis are the following:

(1) Implementation and evaluation of an OBDA system in the context of patient phenotyping in oncology (identification and automatic annotation of cohorts of patients with malignant lung tumors),

(2) Implementation and evaluation of an OBDA system in the context of visualization and rapid exploration of complex patient records (problem-based visualization, patient-centric semantic search engine, temporal representation of health events dependent on a point data set).

The thesis will primarily focus on applied research, with the methods developed being evaluated in the context of real research projects (metastatic lung cancer, VEXAS syndrome, etc.) that involve incomplete, potentially inconsistent, and temporal data.

Desired profile

- Knowledge of the medical domain and knowledge representation and reasoning (ontologies and description logics) is required.

- Strong English language skills (reading, writing, and speaking) are expected.

Work Context

This position is part of the INTENDED Chair on Artificial Intelligence (https://intended.labri.fr/, 2020-2025), whose aim is to develop intelligent, knowledge-based methods for handling imperfect data.

The PhD thesis will be co-supervised by Meghyn Bienvenu (LaBRI, Bordeaux), Vianney Jouhet (UIAM - CHU de Bordeaux, AheAD - BPH - Inserm U1219) and Fleur Mougin (AheAD - BPH - Inserm U1219). The student will work mainly in the Medical Informatics and Archivistics Unit (UIAM) of the Bordeaux University Hospital, whose missions include the secondary use of data (Health Data Warehouse, phenotyping) and the centralized management of biomedical terminologies. The student will also join the ERIAS team (Equipe de Recherche en Informatique Appliquée à la Santé) of the Inserm Bordeaux Population Health center, which includes a dozen researchers developing approaches to the extraction, integration, representation and querying of biomedical data and knowledge, as well as the RATIO team (Reasoning with data, knowledge and constraints) of the LaBRI (Laboratoire Bordelais de Recherche en Informatique), which brings together around fifteen researchers interested in diverse topics related to logical reasoning.

Additional Information

For more information on the position, please consult the project website (https://intended.labri.fr/), or get in touch with Vianney Jouhet (vianney.jouhet@chu-bordeaux.fr).
