Research Internship - Scalable Reinforcement Learning for Online Multi-Agent Decision Making F/M
- Contract type: Internship
- Work time: Full time
- Location: Meylan
About NAVER LABS Europe
NAVER LABS Europe is part of the R&D division of NAVER, Korea’s leading Internet portal and a global tech company with a range of services that include search, commerce, content, fintech, robotics and cloud.
About the team
The Optimization with Learning team at NAVER LABS Europe conducts research at the intersection of machine learning and mathematical optimization, with a focus on sequential decision making, combinatorial optimization, and multi-agent coordination. Our work is motivated by challenging real-world robotics problems, in particular large-scale robot fleet coordination tasks in uncertain and dynamic environments. Our goal is to develop principled learning-based approaches for sequential decision making and combinatorial optimization that generalize beyond a single application and advance the broader research field. Through close collaborations with robotics teams across NAVER LABS, researchers have the opportunity to connect fundamental research questions with real operational problems. This creates a unique environment to pursue ambitious research directions, publish in leading conferences, and contribute to emerging AI-driven robotics systems.
The position
With the growing development of robotics services, the problem of orchestrating a fleet of robots (or autonomous agents) under various constraints has recently become a major design bottleneck, especially when seeking to optimize service operations. In the Optimization with Learning team, we are interested in optimizing multi-agent services involving robot fleets moving in open environments. The underlying challenges stem from hard combinatorial optimization problems, such as multi-robot routing and scheduling under uncertainty.
Recent advances in Reinforcement Learning (RL) have demonstrated remarkable capabilities in solving complex sequential decision-making problems. However, applying RL to real-world multi-agent systems remains challenging due to the combinatorial nature of the decision space, intricate operational constraints, and the long horizon of the tasks. These challenges are especially relevant in multi-robot coordination and logistics applications, where a large number of agents must continuously make decisions in dynamic and uncertain environments.
The purpose of the internship is to:
- Review the state of the art in Reinforcement Learning for online multi-agent coordination under uncertainty,
- Design, implement and evaluate RL policies for selected multi-agent coordination tasks,
- Develop strategies to improve the robustness and scalability of these policies in stochastic environments,
- Evaluate the proposed approach on realistic multi-robot service and logistics scenarios using simulation environments representative of our industrial applications.
What we're looking for
- Enrollment in a PhD or Master's program in machine learning or computer science
- Strong background in deep Reinforcement Learning
- Hands-on experience with Python and PyTorch
- Interest in combinatorial optimization and decision-making under uncertainty
- Familiarity with machine learning methods for graph-structured data (e.g., Graph Neural Networks or Graph Transformers) is a plus
Team Publications
Related team publications include:
- Learning to Solve the Multi-Agent Task Assignment Problem for Automated Data Centers - IROS 2025
- GOAL: a Generalist Combinatorial Optimization Agent Learner - ICLR 2025
- Multi-Agent Path Finding with Real Robot Dynamics and Interdependent Tasks for Automated Warehouses - ECAI 2024
- BQ-NCO: Bisimulation Quotienting for Efficient Neural Combinatorial Optimization - NeurIPS 2023
What we offer
- We foster a collaborative environment dedicated to ambitious, multidisciplinary projects that translate advanced research into impactful, real-world solutions, supported by 30+ years of experience in AI and related fields.
- Flexible work/life balance.
- We are an equal opportunity employer that hires based on skills, experience, and merit. We foster an inclusive and diverse workplace where all qualified candidates are considered fairly, regardless of background.
- We’re based in Meylan, close to Grenoble, a city that offers the perfect balance of urban life, cutting-edge research and technology, and spectacular mountain landscapes that provide countless opportunities to relax, recharge, and enjoy the outdoors.
All applications will be carefully considered, even if not all required skills are met. We value diverse backgrounds and the potential of each candidate, and we offer training to support the development of necessary skills.
NAVER LABS, co-located in Korea and France, is the organization dedicated to preparing NAVER’s future. Scientists at NAVER LABS Europe are empowered to pursue long-term research problems that, if successful, can have significant impact and transform NAVER. We take our ideas as far as research can take them to create the best technology of its kind. Active participation in the academic community and collaborations with world-class public research groups are, among others, important ways to achieve these goals. Teamwork, focus and persistence are important values for us.
When applying for this position online, please don't forget to upload your CV and cover letter. Incomplete applications will not be considered.
NAVER LABS Europe is subject to French jurisdiction, which requires organisations to stipulate that a job/internship is open to both women and men. None of our jobs/internships are gender specific.

References
Ref: 50fa5976-74bc-4cf2-9b0c-6b1a490b3627