Senior Software Engineer, Recipe Pathfinding

at NVIDIA

📍 Redmond, United States

USD 184,000-356,500 per year

SENIOR

✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Python @ 6, Communication @ 7, Performance Optimization @ 4, Data Analysis @ 4, Debugging @ 4, LLM @ 4, PyTorch @ 4, CUDA @ 3

Details

NVIDIA is seeking a Senior Software Engineer to discover and develop new low-precision and sparsity recipes in the pretraining setting. A recipe defines which operators in an LLM are transformed into low-precision and/or sparsified variants, thereby unlocking efficiency gains. Recipes can be statically defined at model load time or can dynamically adapt to a layer’s input distribution. They can also incorporate and compose algorithmic techniques such as rotations or low-rank decompositions to tame aggressors like outliers. The team develops next-generation software to make use of novel hardware features on current and next-generation GPUs. The scope spans all phases of the LLM life cycle: pretraining, post-training, and generation. This is a coding-heavy role focused on infrastructure, data, tooling, and performance. The work directly supports NVIDIA's internal software systems for recipe prototyping and is a component of productization in libraries like Megatron-LM and Transformer Engine.
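
For readers unfamiliar with the term, the idea of a recipe described above can be sketched in a few lines of Python. This is an illustrative toy only, not NVIDIA's implementation or the API of Megatron-LM or Transformer Engine: the names `OpRule`, `resolve`, and `dynamic_precision`, the operator naming scheme, and the outlier threshold are all assumptions made up for this example. It shows a static mapping from operator names to precision and sparsity settings, plus one possible dynamic rule that falls back to higher precision when a layer's inputs contain strong outliers.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class OpRule:
    """One recipe entry (hypothetical): which operators get which treatment."""
    pattern: str                    # glob over operator names, e.g. "*.mlp.*"
    precision: str                  # e.g. "fp8" or "bf16"
    sparsity: Optional[str] = None  # e.g. "2:4", or None for dense


# Operators not matched by any rule stay dense in bf16 (an assumption).
DEFAULT_RULE = OpRule("*", "bf16")


def resolve(recipe: List[OpRule], op_name: str) -> OpRule:
    """Static recipe lookup: first matching rule wins, else the default."""
    for rule in recipe:
        if fnmatchcase(op_name, rule.pattern):
            return rule
    return DEFAULT_RULE


def dynamic_precision(values: Sequence[float], base: str,
                      outlier_ratio: float = 20.0) -> str:
    """Dynamic variant: keep `base` precision unless the largest magnitude
    exceeds `outlier_ratio` times the mean magnitude, then use bf16."""
    magnitudes = [abs(v) for v in values]
    mean = sum(magnitudes) / len(magnitudes)
    if mean == 0.0:
        return base
    return "bf16" if max(magnitudes) / mean > outlier_ratio else base


# A toy static recipe, defined once (for example at model load time).
recipe = [
    OpRule("*.attention.qkv", "fp8"),
    OpRule("*.mlp.*", "fp8", sparsity="2:4"),
]

mlp_rule = resolve(recipe, "layers.0.mlp.fc1")
norm_rule = resolve(recipe, "layers.0.norm")
smooth_choice = dynamic_precision([1.0, -2.0, 1.5], base="fp8")
outlier_choice = dynamic_precision([1.0] * 99 + [1000.0], base="fp8")
```

In this sketch the MLP operator resolves to FP8 with 2:4 sparsity, the unmatched norm layer stays in BF16, and the input with a single large outlier triggers the BF16 fallback. A real recipe would act on tensors and kernels and would likely use techniques such as rotations rather than a simple fallback.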

Applications for this job will be accepted at least until August 31, 2025. Base salary ranges (dependent on location, experience, and level): Level 4: 184,000 USD - 287,500 USD; Level 5: 224,000 USD - 356,500 USD. You will also be eligible for equity and benefits.

Responsibilities

Create well-designed and well-tested software systems and proofs-of-concept to support recipe exploration for research settings.
Analyze and prototype state-of-the-art methods for quantization and sparsity.
Benchmark, profile, and optimize LLM workloads in cluster settings.
Develop new data analysis tools and visualizations to aid in numerics debugging.
Improve developer and researcher productivity by addressing obstacles (e.g., slow CI systems, slow training systems).
Participate in code reviews and address code review feedback.

Requirements

MS or PhD or equivalent experience in Computer Science or a related field, and 5+ years of relevant software engineering experience.
Proficient in Python.
Experience with PyTorch or a similar deep learning framework.
Strong software engineering background with a focus on concise and well-tested code.
Experience working with ML accelerators, performance optimization, and debugging.
Strong written and oral communication skills.

Ways to stand out

Proficiency in precision and numerics for ML.
Familiarity with C++ and CUDA.
Strong foundation in LLM pretraining, post-training, or generation.

Technologies and libraries mentioned

Python, PyTorch
C++, CUDA
Megatron-LM, Transformer Engine
Quantization, sparsity, low-precision methods, low-rank decompositions
Benchmarking, profiling, cluster/distributed training tools
Data analysis and visualization tools for numerics debugging

Benefits & culture

Eligible for equity and company benefits (see NVIDIA benefits page).
NVIDIA is an equal opportunity employer and committed to fostering a diverse work environment.