Principal AI/ML Researcher / Engineer — Reasoning, Planning, and Decision-Making Systems

at Airbnb

📍 United States

USD 296,000-370,000 per year

SENIOR

✅ Remote ✅ Hybrid

Used Tools & Technologies

RAG

Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level. The proficiency levels are defined as follows:

1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;

3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology;

7-9: expert. You can teach others and you know the pitfalls and tricks;

10: exceptional knowledge. Comprehensive understanding of and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.

Python @ 6, Java @ 6, Machine Learning @ 4, Leadership @ 4, Communication @ 4, Planning @ 7, LLM @ 4, PyTorch @ 4, AI @ 7, Reinforcement Learning @ 4, Robotics @ 4, GenAI @ 4, JAX @ 4

Details

Airbnb is seeking a Principal / Distinguished AI/ML Researcher and/or Engineer with deep experience in reasoning, planning, and decision-making systems. The role focuses on inventing, scaling, and operationalizing intelligent decisioning substrates that blend symbolic and sub-symbolic methods, enabling AI systems that move beyond pattern recognition into deliberation, foresight, and agency. You will design and deploy multi-agent systems, integrate post-trained foundational models with explicit memory and knowledge, and apply reinforcement learning as a first-class component of adaptive planning and control. Collaboration across disciplines and influence on company-wide AI architecture are core aspects of this position.

Responsibilities

Drive foundational and applied research in reasoning engines, planning architectures, and decision-making frameworks at scale; incorporate genAI into ranking, recommendation, and personalization stacks across single-model and multi-agent system-level intelligence.
Advance techniques in LLM/LRM post-training, reinforcement learning–based decisioning, and knowledge-integrated agents.
Design methods for plan induction, value estimation, and contingency modeling within intelligent agents.
Explore and validate protocols for distributed reasoning and joint planning among cooperative agents in multi-agent systems.
Architect reasoning, planning, and decision-making (RPD) systems that integrate post-trained LLMs/LRMs, graph-structured memory (e.g., knowledge graphs), and RL-driven controllers.
Design recursive task planners, search-based or policy-based reasoners, and belief-state trackers interoperable with large model substrates.
Ensure modularity and extensibility through multi-agent frameworks, agentic substrates, and declarative planning pipelines; define communication protocols and cross-agent knowledge alignment mechanisms.
Build and evolve stateful, dynamic models combining supervised learning with online/offline reinforcement, simulation-based rollouts, and symbol grounding.
Implement hybrid pipelines coupling learned embeddings, prompted generative models, and graph-theoretic inference; optimize for adaptive exploration, planning horizon control, and policy robustness.
Set technical direction and provide leadership for planning/reasoning infrastructure; mentor teams in systems thinking, causal modeling, and connectionist-symbolic integrations.
Productionize real-time reasoning loops with low-latency inference, caching, retrieval-augmented generation, and streaming updates to symbolic memory; deploy post-training hooks for inserting logic, constraints, and domain priors into large models.
Create monitoring, attribution, and evaluation pipelines for agent behavior and decision quality; operationalize multi-agent orchestration for reliable and fault-tolerant communication and decision propagation.

Requirements (Minimum Qualifications)

Master's degree or equivalent in Computer Science, AI, Cognitive Science, or a related field.
Recent published work or patents in AI, Cognitive Science, or related fields.
15+ years in AI/ML, including post-training architectures and production-scale reasoning systems.
Advanced coding proficiency in Java, Python, C++, or similar.
Experience with ML/RL frameworks at scale (e.g., PyTorch, Ray, JAX, RLlib).
Proven experience integrating LLMs/LRMs with Knowledge Graphs or structured world models.
Deep understanding of Reinforcement Learning and its application to decisioning and planning.
Fluency in hybrid model architectures: connectionist-symbolic fusion, retrieval-based agents, or goal-directed transformers.
Experience working on multi-agent coordination, distributed RL, or cooperative inference systems.

Preferred Qualifications

Ph.D. in AI, Machine Learning, Robotics, Cognitive Systems, or related areas.
Published work or patents in multi-agent reasoning, plan synthesis, knowledge-augmented learning, or generative control.
Experience in cognitive architectures, neuro-symbolic systems, or agent-based simulation environments.
Demonstrated ability to lead cross-functional research-to-production transitions.
Experience with memory architectures, task graphs, or semantic program induction.
Prior work on distributed intelligence platforms with explicit agent interaction models and collective decision-making logic.

Your Location / Work Policy

This position is US - Remote Eligible. The role may include occasional work at an Airbnb office or attendance at offsites, as agreed with your manager. You must live in a U.S. state where Airbnb, Inc. has a registered entity (some states may be excluded).

How We'll Take Care of You

The role's actual base pay depends on many factors (training, transferable skills, experience, business needs). The base pay range is subject to change. This role may also be eligible for bonus, equity, benefits, and Employee Travel Credits.

Pay Range

$296,000-$370,000 USD