AI Research Engineer - Applied Scientist Compilers

at NVIDIA

📍 Santa Clara, United States

USD 152,000-241,500 per year

MIDDLE

✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences
Tag name is followed by "@" symbol and proficiency level value. About proficiency levels:

1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;

3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;

7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;

10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such level.

Python @ 6, Machine Learning @ 3, Hiring @ 3, Experimentation @ 3, CUDA @ 3, GPU @ 3, AI @ 3, Reinforcement Learning @ 3, Profiling @ 3, Prompt Engineering @ 3

Details

NVIDIA's GPUs are at the core of modern AI infrastructure, from training large-scale models to running inference in production. This position focuses on software and compiler engineering that enable GPU acceleration for modern machine learning models. The team develops AI-based compiler solutions that integrate with NVIDIA's software stack.

Responsibilities

Help lead the company's efforts to apply AI within conventional compilation pipelines.
Design and implement AI-based technology addressing core problems of low-level GPU programming.
Build training pipelines for supervised fine-tuning and reinforcement learning (RL/RLHF-style or policy optimization variants).
Define model inputs/outputs over low-level compiler representations.
Develop evaluation frameworks to measure code quality, runtime, compile-time overhead, and correctness.
Apply domain- and task-specific prompt engineering.
Collaborate with compiler engineers to integrate learned policies into production toolchains.
Prototype and iterate on model architectures, prompts, and fine-tuning strategies for scheduling and allocation tasks.
Create datasets from compiler traces, optimization passes, and target-specific performance signals.
Apply RL techniques to optimize for downstream objectives (performance, spill reduction, instruction-level parallelism, etc.).
Run rigorous experiments, ablations, and benchmarking across workloads and hardware targets.

Requirements

M.S. or Ph.D. degree in Computer Engineering, Computer Science, or a related technical field (or equivalent experience).
5+ years of experience building AI/ML systems.
Strong software engineering skills in Python and at least one systems language (C++ preferred).
Hands-on experience training/fine-tuning large models (Transformers, PEFT/LoRA, distributed training).
Solid understanding of machine learning fundamentals and experimentation best practices.
Experience with reinforcement learning (e.g., policy gradients, actor-critic, offline RL, bandit-style optimization).
Knowledge of prompt-engineering techniques.
Ability to work across research and engineering, from prototype to production.

Preferred / Ways to Stand Out

Distributed training/inference at scale.
Experience working with the NVIDIA NeMo framework.
Understanding of GPU performance and experience with benchmarking suites and performance profiling tools.
Familiarity with formal methods or static analysis for correctness guarantees.
CUDA programming experience.

Compensation & Benefits

Base salary range: 152,000 USD - 241,500 USD (base salary will be determined based on location, experience, and pay of employees in similar positions).
Eligible for equity and benefits (link provided in original posting).

Additional Information

Applications for this job will be accepted at least until April 26, 2026.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.
NVIDIA is an equal opportunity employer and states that it does not discriminate in its hiring and promotion practices.