Senior Software Engineer, Recipe Pathfinding

at NVIDIA
USD 184,000-356,500 per year
SENIOR
✅ On-site

Required Skills & Competences

  • Python @ 6
  • Communication @ 7
  • Performance Optimization @ 4
  • Data Analysis @ 4
  • Debugging @ 4
  • LLM @ 4
  • PyTorch @ 4
  • CUDA @ 3

Details

NVIDIA is seeking a Senior Software Engineer to discover and develop new low-precision and sparsity recipes in the pretraining setting. A recipe defines which operators in an LLM are transformed into low-precision and/or sparsified variants, thereby unlocking efficiency gains. Recipes can be statically defined at model load time or can adapt dynamically to a layer’s input distribution. They can also incorporate and compose algorithmic techniques such as rotations or low-rank decompositions to tame aggressors like outliers.

The team develops next-generation software to make use of novel hardware features on current and next-gen GPUs. The scope spans all phases of the LLM life cycle: pretraining, post-training, and generation. This is a coding-heavy role focused on infrastructure, data, tooling, and performance. The work directly supports NVIDIA's internal software systems for recipe prototyping and contributes to productization in libraries like Megatron-LM and Transformer Engine.
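
As a rough illustration of the recipe idea in the description above, the sketch below models a static recipe as an ordered list of rules that map operator names to a precision label and an optional sparsity pattern, plus a toy dynamic check that falls back to higher precision when a layer's input contains large outliers. This is not NVIDIA's implementation and does not use the Megatron-LM or Transformer Engine APIs. Every name in it (OpRule, Recipe, pick_precision, the operator naming scheme, the precision labels, and the outlier threshold) is a hypothetical choice made for the example.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class OpRule:
    """One recipe entry (hypothetical structure, for illustration only)."""
    pattern: str                    # glob over operator names, e.g. "layers.*.mlp.*"
    precision: str                  # label only, e.g. "fp8" or "bf16"
    sparsity: Optional[str] = None  # label only, e.g. "2:4"; None means dense


# Operators that match no rule stay dense and in higher precision (assumed default).
DEFAULT_RULE = OpRule(pattern="*", precision="bf16")


@dataclass
class Recipe:
    """A static recipe: the first rule whose pattern matches the operator wins."""
    rules: List[OpRule]

    def resolve(self, op_name: str) -> OpRule:
        for rule in self.rules:
            if fnmatchcase(op_name, rule.pattern):
                return rule
        return DEFAULT_RULE


def pick_precision(rule: OpRule, activations: Sequence[float],
                   outlier_ratio: float = 20.0) -> str:
    """Toy dynamic variant: keep the rule's precision unless the largest
    magnitude is more than `outlier_ratio` times the mean magnitude, in
    which case fall back to bf16. The threshold is an arbitrary assumption."""
    magnitudes = [abs(x) for x in activations]
    mean = sum(magnitudes) / len(magnitudes)
    if mean > 0 and max(magnitudes) / mean > outlier_ratio:
        return "bf16"
    return rule.precision


# Example usage with made-up operator names.
recipe = Recipe(rules=[
    OpRule("layers.*.attn.qkv", "fp8", sparsity="2:4"),
    OpRule("layers.*.mlp.*", "fp8"),
])
mlp_rule = recipe.resolve("layers.3.mlp.fc1")   # matches the MLP rule
head_rule = recipe.resolve("lm_head")           # no match, uses DEFAULT_RULE
calm_input = [1.0, -2.0, 1.5]
spiky_input = [1.0] * 99 + [500.0]              # one large outlier
```

Here pick_precision(mlp_rule, calm_input) keeps the rule's "fp8" label, while pick_precision(mlp_rule, spiky_input) falls back to "bf16". A real recipe would instead apply techniques such as the rotations or low-rank decompositions mentioned above; the fallback is only the simplest stand-in for "adapting to the input distribution".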

Applications for this job will be accepted at least until August 31, 2025. Base salary ranges (dependent on location, experience, and level): Level 4: 184,000 USD - 287,500 USD; Level 5: 224,000 USD - 356,500 USD. You will also be eligible for equity and benefits.

Responsibilities

  • Create well-designed and well-tested software systems and proofs-of-concept to support recipe exploration for research settings.
  • Analyze and prototype state-of-the-art methods for quantization and sparsity.
  • Benchmark, profile, and optimize LLM workloads in cluster settings.
  • Develop new data analysis tools and visualizations to aid in numerics debugging.
  • Improve developer and researcher productivity by addressing obstacles (e.g., slow CI systems, slow training systems).
  • Participate in code reviews and address code review feedback.

Requirements

  • MS or PhD in Computer Science or a related field (or equivalent experience), plus 5+ years of relevant software engineering experience.
  • Proficient in Python.
  • Experience with PyTorch or a similar deep learning framework.
  • Strong software engineering background with a focus on concise and well-tested code.
  • Experience with ML accelerators, performance optimization, and debugging.
  • Strong written and oral communication skills.

Ways to stand out

  • Proficiency in precision and numerics for ML.
  • Familiarity with C++ and CUDA.
  • Strong foundation in LLM pretraining, post-training, or generation.

Technologies and libraries mentioned

  • Python, PyTorch
  • C++, CUDA
  • Megatron-LM, Transformer Engine
  • Quantization, sparsity, low-precision methods, low-rank decompositions
  • Benchmarking, profiling, cluster/distributed training tools
  • Data analysis and visualization tools for numerics debugging

Benefits & culture

  • Eligible for equity and company benefits (see NVIDIA benefits page).
  • NVIDIA is an equal opportunity employer and is committed to fostering a diverse work environment.