Explainability and privacy for synthetic time series generation
ABG-133320 | Thesis topic
04/09/2025 | Doctoral contract
- Computer science
Subject description
Financial transactions; water, gas, or electricity consumption; biomedical signals... A vast amount of personal data is generated today in the form of timestamped sequences, hereafter called time series. These time series are collected and stored by companies and public organizations to support a wide variety of uses (e.g., fraud detection in financial flows, epidemiology, smart-grid management). They carry detailed information about individual behaviors or health status. As a result, for obvious privacy reasons (e.g., large-scale re-identification [5]), they are today mostly secluded within the systems that collect them, forfeiting the benefits expected from large-scale time series sharing.
Generative models are promising solutions. Given an input set of time series, they generate a set of synthetic time series that is different from, but statistically close to, the input training set [3]. When protected by sound privacy-preserving mechanisms (e.g., differentially private perturbation [4]), they carry the promise of enabling organizations to share (synthetic) time series at a large scale without jeopardizing privacy guarantees. However, the utility of synthetic time series is both hard to define and hard to achieve, especially when strong privacy guarantees such as differential privacy must be met. First, no generative model consistently outperforms the others on all datasets or on all utility metrics. Second, within a set of synthetic time series, some series may exhibit localized anomalies. As a result, time series generative models need to be able to explain their outputs both globally and locally in order to ensure their validity, to understand the sources of errors, and ultimately to enable reliable use.
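For reference, the differential privacy guarantee mentioned above [4] is usually stated as follows; this is a standard sketch (not specific to this thesis), where D and D' are neighboring datasets differing in a single individual's time series, M is the randomized mechanism (here, the generator together with its training procedure), and S is any set of possible outputs:

    % (epsilon, delta)-differential privacy of a randomized mechanism M:
    % for all neighboring datasets D, D' and all output sets S,
    \Pr[\mathcal{M}(D) \in S] \le e^{\varepsilon} \cdot \Pr[\mathcal{M}(D') \in S] + \delta

The smaller epsilon and delta, the less any single individual's series can influence the distribution of released outputs, which is precisely why random perturbation has to be injected.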
While there exists a rich literature studying explainability techniques for classifiers (e.g., [2]), the issue of explaining generative models had largely been ignored until very recently. The need to provide differential privacy guarantees further complicates the issue: the privacy guarantees must cover the explainability algorithms in addition to the generative model, and they require injecting possibly large random perturbations at training time, which introduces additional variance in the results.
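To make the training-time perturbation concrete, the following minimal Python sketch shows a DP-SGD-style update (per-example gradient clipping followed by calibrated Gaussian noise, in the spirit of Abadi et al.'s DP-SGD); the function name, hyperparameters, and toy gradients are purely illustrative and not part of the thesis proposal:

    import numpy as np

    def dp_sgd_update(per_example_grads, clip_norm=1.0, noise_multiplier=1.0, lr=0.01):
        # Illustrative helper (hypothetical), not the project's actual code.
        # 1) Clip each record's gradient to bound its influence (sensitivity).
        clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
                   for g in per_example_grads]
        # 2) Average the clipped gradients over the batch.
        mean_grad = np.mean(clipped, axis=0)
        # 3) Add Gaussian noise calibrated to the clipping bound; a larger
        #    noise_multiplier gives stronger privacy but a noisier model,
        #    i.e., the additional variance mentioned above.
        noise = np.random.normal(0.0, noise_multiplier * clip_norm / len(clipped),
                                 size=mean_grad.shape)
        return -lr * (mean_grad + noise)  # parameter update to apply

    # Toy usage with gradients for two records:
    grads = [np.array([0.5, -2.0]), np.array([3.0, 1.0])]
    update = dp_sgd_update(grads)

In practice the noise multiplier is calibrated to a target (epsilon, delta) budget by a privacy accountant, and the same accounting would have to be extended to any explanations released alongside the synthetic series.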
The goal of this PhD thesis is to design, implement, and thoroughly evaluate explainability techniques for synthetic time series generation algorithms that come with differential privacy guarantees.
The main tasks of the PhD student will be to:
- Study the state of the art on privacy-preserving synthetic time series generation algorithms, time series explainability, and privacy-preserving explainability techniques for classifiers.
- Design differentially private explainability techniques for privacy-preserving synthetic time series generation algorithms, and thoroughly demonstrate and evaluate their privacy and utility guarantees.
- Contribute to the organisation of competitions in which the privacy guarantees of synthetic time series generation algorithms are challenged [1] (see for example the Snake challenge: https://snake-challenge.github.io/).
Presentation of the host institution and laboratory
This PhD offer is funded by the Chaire CPDDF (Fondation Univ. Rennes) and proposed by the Security and Privacy team (SPICY) of the IRISA institute in Rennes, France. The work will be supervised jointly by Tristan Allard (PhD, HDR), associate professor at the University of Rennes and expert in privacy in data-intensive systems, and Romaric Gaudel (PhD, HDR), expert in machine learning and explainability.
The Chaire CPDDF involves six industrial partners (Apixit, Chambre Interdépartemental des Notaires, Crédit Mutuel Arkea, Enedis, Sogescot, Veolia) and one public organization (Région Bretagne). Regular meetings will be organized with the partners to foster collaboration on the topic. The successful candidate will work at IRISA, the largest French research laboratory in the field of computer science and information technologies (more than 850 people). IRISA provides an exciting environment where French and international researchers carry out cutting-edge scientific activities in all domains of computer science.
Rennes is located in the western part of France, in the beautiful region of Brittany. From Rennes, you can reach the seaside in about 45 minutes by car and central Paris in about 90 minutes by train. Rennes is a vibrant, student-friendly city, often ranked among the best student cities in France. It is known and appreciated for its academic excellence, especially in the field of cybersecurity, its professional landscape, the quality of its student life, its affordable housing, its rich cultural life, and much more.
Candidate profile
- The candidate must have obtained, or be about to obtain, a master's degree in computer science or a related field.
- The candidate must be curious, autonomous, and rigorous.
- The candidate must be able to communicate in English (oral and written). Knowledge of French is not required.
- The candidate must have a strong interest in machine learning.
- Skills in cybersecurity, especially in privacy, will be appreciated.