Senior Machine Learning Applications and Compiler Engineer, LPX

at NVIDIA
📍 Toronto, Canada
CAD 135,000-220,000 per year
SENIOR
✅ Hybrid

Used Tools & Technologies

GPU

Required Skills & Competences

  • Algorithms @ 4
  • Data Structures @ 7
  • Machine Learning @ 4
  • TensorFlow @ 3
  • Communication @ 4
  • Rust @ 7
  • Debugging @ 7
  • PyTorch @ 3
  • Deep Learning @ 4
  • AI @ 4
  • Profiling @ 7
  • LLVM @ 4

Details

We are seeking a Senior Machine Learning Applications and Compiler Engineer to develop algorithms and optimizations for the LPX inference and compiler stack. You will work at the intersection of large-scale systems, compilers, and deep learning, shaping how neural network workloads map onto future NVIDIA platforms.

Responsibilities

  • Build, develop, and maintain high-performance runtime and compiler components, focusing on end-to-end inference optimization.
  • Define and implement mappings of large-scale inference workloads onto NVIDIA's systems.
  • Extend and integrate with NVIDIA's software ecosystem, contributing to libraries, tooling, and interfaces that enable seamless deployment of models across platforms.
  • Benchmark, profile, and monitor key performance and efficiency metrics to ensure the compiler generates efficient mappings of neural network graphs to inference hardware.
  • Collaborate closely with hardware architects and design teams to feed back software observations, influence future architectures, and co-design features that unlock new performance and efficiency points.
  • Prototype and evaluate new compilation and runtime techniques, including graph transformations, scheduling strategies, and memory/layout optimizations tailored to spatial processors.
  • Publish and present technical work on novel compilation approaches for inference and related spatial accelerators at top-tier ML, compiler, and computer architecture venues.

Requirements

  • MS or PhD in Computer Science, Electrical/Computer Engineering, or a related field, or equivalent experience, with 5 years of relevant experience.
  • Strong software engineering background with proficiency in systems-level programming (e.g., C/C++ and/or Rust) and solid CS fundamentals in data structures, algorithms, and concurrency.
  • Hands-on experience with compiler or runtime development, including IR design, optimization passes, or code generation.
  • Experience with LLVM and/or MLIR, including building custom passes, dialects, or integrations.
  • Familiarity with deep learning frameworks such as TensorFlow and PyTorch, and experience working with portable graph formats such as ONNX.
  • Solid understanding of parallel and heterogeneous compute architectures, such as GPUs, spatial accelerators, or other domain-specific processors.
  • Strong analytical and debugging skills, with experience using profiling, tracing, and benchmarking tools to drive performance improvements.
  • Excellent communication and collaboration skills, with the ability to work across hardware, systems, and software teams.
  • Ideal candidates will have direct experience with MLIR-based compilers or other multilevel IR stacks, especially in the context of graph-based deep learning workloads.

Ways to stand out from the crowd

  • Prior work on spatial or dataflow architectures, including static scheduling, pipeline parallelism, or tensor parallelism at scale.
  • Contributions to open source ML frameworks, compilers, or runtime systems, particularly in areas related to performance or scalability.
  • Demonstrated research impact, such as publications or presentations at conferences like PLDI, CGO, ASPLOS, ISCA, MICRO, MLSys, NeurIPS, or similar.
  • Experience with large-scale distributed AI inference or training systems, including performance modeling and capacity planning for multi-rack deployments.

Other information

  • Location: Toronto, Canada.
  • Office policy: Hybrid (#LI-Hybrid).
  • Base salary range: 135,000 CAD - 185,000 CAD for Level 3; 170,000 CAD - 220,000 CAD for Level 4.
  • You will also be eligible for equity and benefits.
  • Applications accepted at least until March 27, 2026.
  • This posting is for an existing vacancy. NVIDIA uses AI tools in its recruiting processes.