Research Internship - Scalable Reinforcement Learning for Online Multi-Agent Decision Making F/M


Published on 16/06/2026

Contract type: Internship
Work time: Full time
Location: Meylan

About NAVER LABS Europe

NAVER LABS Europe is part of the R&D division of NAVER, Korea’s leading Internet portal and a global tech company with a range of services that include search, commerce, content, fintech, robotics and cloud.

About the team

The Optimization with Learning team at NAVER LABS Europe conducts research at the intersection of machine learning and mathematical optimization, with a focus on sequential decision making, combinatorial optimization, and multi-agent coordination. Our work is motivated by challenging real-world robotics problems, in particular large-scale robot fleet coordination tasks in uncertain and dynamic environments. Our goal is to develop principled learning-based approaches for sequential decision making and combinatorial optimization that generalize beyond a single application and advance the broader research field. Through close collaborations with robotics teams across NAVER LABS, researchers have the opportunity to connect fundamental research questions with real operational problems. This creates a unique environment to pursue ambitious research directions, publish in leading conferences, and contribute to emerging AI-driven robotics systems.

The position

With the growing development of robotics services, the problem of orchestrating a fleet of robots (or autonomous agents) under various constraints has recently become a major design bottleneck, especially when seeking to optimize service operations. In the Optimization with Learning team, we are interested in optimizing multi-agent services involving robot fleets moving in open environments. The underlying challenges stem from hard combinatorial optimization problems, such as multi-robot routing and scheduling under uncertainty.

Recent advances in Reinforcement Learning (RL) have demonstrated remarkable capabilities in solving complex sequential decision-making problems. However, applying RL to real-world multi-agent systems remains challenging due to the combinatorial nature of the decision space, intricate operational constraints, and long task horizons. These challenges are especially relevant in multi-robot coordination and logistics applications, where a large number of agents must continuously make decisions in dynamic and uncertain environments.

The purpose of the internship is to:

Review the state of the art in Reinforcement Learning for online multi-agent coordination under uncertainty,
Design, implement and evaluate RL policies for selected multi-agent coordination tasks,
Develop strategies to improve the robustness and scalability of these policies in stochastic environments,
Evaluate the proposed approach on realistic multi-robot service and logistics scenarios using simulation environments representative of our industrial applications.

What we're looking for

Enrollment in a PhD or Master's program in machine learning or computer science
Strong background in deep reinforcement learning
Hands-on experience with Python and PyTorch
Interest in combinatorial optimization and decision-making under uncertainty
Familiarity with machine learning methods for graph-structured data (e.g., Graph Neural Networks or Graph Transformers) is a plus

Team Publications

Related team publications include:

Learning to Solve the Multi-Agent Task Assignment Problem for Automated Data Centers - IROS 2025
GOAL: a Generalist Combinatorial Optimization Agent Learner - ICLR 2025
Multi-Agent Path Finding with Real Robot Dynamics and Interdependent Tasks for Automated Warehouses - ECAI 2024
BQ-NCO: Bisimulation Quotienting for Efficient Neural Combinatorial Optimization - NeurIPS 2023

What we offer

We foster a collaborative environment dedicated to ambitious, multidisciplinary projects that translate advanced research into impactful, real-world solutions, supported by 30+ years of experience in AI and related fields.
Flexible work/life balance.
We are an equal opportunity employer that hires based on skills, experience, and merit. We foster an inclusive and diverse workplace where all qualified candidates are considered fairly, regardless of background.
We’re based in Meylan, close to Grenoble, a city that offers the perfect balance of urban life, cutting-edge research and technology, and spectacular mountain landscapes that provide countless opportunities to relax, recharge, and enjoy the outdoors.

All applications will be carefully considered, even if not all required skills are met. We value diverse backgrounds and the potential of each candidate, and we offer training to support the development of necessary skills.

NAVER LABS, co-located in Korea and France, is the organization dedicated to preparing NAVER’s future. Scientists at NAVER LABS Europe are empowered to pursue long-term research problems that, if successful, can have significant impact and transform NAVER. We take our ideas as far as research can take them to create the best technology of its kind. Active participation in the academic community and collaborations with world-class public research groups are among the important ways we achieve these goals. Teamwork, focus and persistence are important values for us.

When applying for this position online, please don't forget to upload your CV and cover letter. Incomplete applications will not be considered.

NAVER LABS Europe is subject to French law, which requires organisations to state that a job/internship is open to both women and men. None of our jobs/internships are gender specific.

References

Ref: 50fa5976-74bc-4cf2-9b0c-6b1a490b3627



NAVER LABS Europe is an equal opportunity employer.