
Taming Linux memory management for data science: process placement, NUMA topology, and runtime interaction

ABG-138650
ADUM-74411
Thesis topic
2026-04-22
Université Côte d'Azur
Sophia Antipolis Cedex - Provence-Alpes-Côte d'Azur - France
  • Computer science
Memory management, process placement, data science, experimental science

Topic description

The subject is only described in English. B2/C1 level of English is mandatory to apply.

Large-scale data science workloads are increasingly constrained not by algorithmic complexity or model architecture, but by the physical limits of memory hierarchies and processor topology. On modern servers, Non-Uniform Memory Access (NUMA) architectures and GPU accelerators introduce asymmetric memory access costs that remain largely invisible to application-level code yet have a decisive impact on performance. The Linux kernel mediates access to these resources through a set of scheduling, memory placement, and migration policies that were designed and configured for general-purpose workloads in standard distributions, and that interact in poorly documented ways with the Python runtime, its garbage collector, and the memory allocation patterns of data science libraries such as NumPy, Pandas, and Polars. The result is a class of performance pathologies that are reproducible in practice but have not yet been systematically characterized or addressed.

The central research question is: 'How do Linux process and thread placement policies interact with NUMA topology and GPU memory hierarchies to produce performance pathologies in memory-intensive data science workloads, and what kernel-level or runtime-level interventions can systematically eliminate them?'

Two axes structure the investigation. The first, system-level, characterizes and models the interaction between the Linux memory management subsystem, NUMA placement policies, and the Python runtime through kernel instrumentation, source code analysis, and controlled experiments designed to reproduce and bound the conditions under which pathological behaviour emerges.
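As a concrete starting point for this kind of characterization, the NUMA topology visible to a process can be read directly from sysfs. The sketch below is a minimal illustration, assuming a Linux system with sysfs mounted; it parses the kernel's node-list format and falls back to a single node on non-NUMA kernels.

```python
from pathlib import Path

def parse_node_list(spec: str) -> list[int]:
    """Parse the kernel's list format, e.g. '0-3,8' -> [0, 1, 2, 3, 8]."""
    nodes = []
    for part in spec.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            nodes.extend(range(int(lo), int(hi) + 1))
        elif part:
            nodes.append(int(part))
    return nodes

def online_numa_nodes() -> list[int]:
    """Return the online NUMA node ids, or [0] if the kernel exposes none."""
    path = Path("/sys/devices/system/node/online")
    if not path.exists():
        return [0]  # non-NUMA kernel: treat the machine as a single node
    return parse_node_list(path.read_text())
```

Per-node memory counters (`/sys/devices/system/node/node<N>/meminfo` and `numastat`) can then be sampled around an experiment to attribute allocations to nodes.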

The second, intervention-level, proposes, implements, and evaluates concrete mechanisms for process and thread placement optimization, ranging from numactl-based static binding strategies to dynamic kernel-level migration policies and runtime-aware allocation hooks, with evaluation on realistic data science workloads.
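For illustration, a user-space analogue of `numactl --cpunodebind` can be sketched with the Python standard library alone (Linux only; the node-to-CPU mapping is an assumption read from sysfs, and memory binding would additionally require `set_mempolicy` via libnuma, which is not shown):

```python
import os
from pathlib import Path

def cpus_of_node(node: int) -> set[int]:
    """Read the CPU set of one NUMA node from sysfs (Linux only)."""
    text = Path(f"/sys/devices/system/node/node{node}/cpulist").read_text()
    cpus: set[int] = set()
    for part in text.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        elif part:
            cpus.add(int(part))
    return cpus

def pin_to_cpus(cpus: set[int]) -> set[int]:
    """Restrict the current process to the given CPUs; return the new mask."""
    os.sched_setaffinity(0, cpus)
    return os.sched_getaffinity(0)

def bind_to_node(node: int) -> set[int]:
    """User-space analogue of `numactl --cpunodebind=<node>` (CPU side only)."""
    return pin_to_cpus(cpus_of_node(node))
```

Static binding like this is the baseline that the dynamic, runtime-aware interventions studied in the thesis would be evaluated against.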

The thesis sits at the intersection of three active research communities: operating systems memory management (NUMA policy, page migration, swap behaviour), high-performance and GPU computing (memory coalescing, unified memory, PCIe transfer costs), and Python runtime internals (the CPython allocator, garbage collector, and their interaction with the OS virtual memory subsystem). Its distinguishing contribution is the exclusive focus on the data science execution context, where large working sets, irregular access patterns, and the overhead of interpreted runtimes create a qualitatively different performance profile from the HPC or database workloads that dominate the existing NUMA literature.
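As a small illustration of the runtime/OS interaction, the effect of Python-level allocation on the process's resident set can be observed through /proc. This is a minimal sketch, Linux only; real experiments would control for huge pages, overcommit, and allocator arena behaviour.

```python
from pathlib import Path

def rss_kib() -> int:
    """Resident set size of the current process in KiB, from /proc."""
    for line in Path("/proc/self/status").read_text().splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    raise RuntimeError("VmRSS not found")

before = rss_kib()
buf = bytearray(b"x") * (64 * 1024 * 1024)  # 64 MiB, pages touched on creation
after = rss_kib()
print(f"RSS grew by ~{(after - before) // 1024} MiB")
del buf  # CPython returns the block to the allocator; whether the OS actually
         # reclaims the pages depends on malloc/mmap thresholds, not on Python
```

The gap between what `del` releases at the Python level and what the kernel reclaims is exactly the kind of runtime/OS boundary the thesis proposes to instrument.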

What makes this thesis distinctive is its methodological depth: it combines low-level kernel instrumentation and source code analysis with controlled performance experiments, bridging the gap between operating systems research and the practical realities of data science at scale. This combination is rare in the systems performance literature and constitutes the core scientific bet of the thesis.

The specific research directions, experimental designs, and target systems will be refined throughout the thesis in response to emerging results and ongoing collaboration with supervisors. The above framing defines the problem space and initial methodology, not a fixed programme.

Thesis start date: 01/10/2026

Funding category

Funding further details

Contrat doctoral EDSTIC-DS4H

Presentation of host institution and host laboratory

Université Côte d'Azur

Institution awarding doctoral degree

Université Côte d'Azur

Graduate school

84 STIC - Sciences et Technologies de l'Information et de la Communication

Candidate's profile

The profile is only described in English. B2/C1 level of English is mandatory to apply. The candidate must hold a Master's degree or equivalent when starting the PhD.

Required skills:
- C1 level in English (possibly B2, close to reaching C1).
- Excellent programming and systems skills, including C programming. We work in a Linux environment with Python and its data science libraries (numpy, pandas, polars, seaborn, scikit-learn, statsmodels), and we use Git. A candidate who is not fluent in Python must be *fluent* in another language and able to learn Python fast.
- A solid grasp of operating systems architecture. Experience in kernel development will be considered a strong asset.
- Excellent communication skills. An important part of the Ph.D. is communicating the results: the candidate must be ready to write high-quality papers and give stunning talks. These skills will be nurtured during the Ph.D. thesis.
- Curious, highly motivated, hard-working, autonomous, perfectionist. A good sign that you have the profile for an excellent Ph.D. thesis is that you cannot stand not understanding something and will work hard until you do (or until your stuff works).

Before deciding to pursue a Ph.D. thesis, you must read the references on this page to be sure you are making the right decision: http://www-sop.inria.fr/members/Arnaud.Legout/phdstudents.html

If you apply, we expect you to get in touch with us very early in the process (arnaud.legout@inria.fr, damien.saucez@inria.fr) to discuss whether you are a good fit for the subject and whether we are a good fit as supervisors. Discussing the subject and the supervision style with a potential supervisor early (even if you are not sure you will apply) is a sign of maturity and will be highly appreciated.
2026-05-03
