Senior CPU Workloads and Simulation Architect

at NVIDIA
USD 224,000-431,250 per year
SENIOR
✅ On-site

Required Skills & Competences

Python @ 7, Data Science @ 7, TensorFlow @ 4, Communication @ 4, System Architecture @ 4, PyTorch @ 4, GPU @ 4, Deep Learning @ 4, AI @ 4, Robotics @ 4, HPC @ 4

Details

Do you want to help improve CPU architectures to support growth in AI, deep learning, HPC, gaming, virtual reality, and autonomous vehicles? Come join the CPU performance architecture team as a Senior CPU Workloads & Simulation Architect and help us push performance boundaries for NVIDIA’s line of CPU products!

Responsibilities

  • Research, architect, implement, and evaluate mechanisms for capturing and studying complex applications suitable for architectural and microarchitectural CPU analysis in simulation. This includes multi-core, multi-threaded, and heterogeneous workloads spanning CPU/GPU/NIC, simulated at the user, VM, and full-system levels.
  • Implement tools, processes, and systems for collecting traces and checkpoints for complex multi-threaded heterogeneous applications and support other architects in using those tools to study workloads.
  • Contribute to developing functional and performance models of ARM-based systems. Focus on infrastructure for recording and replaying workload sequences for performance and power analysis.
  • Stay on top of guidelines in industry and academia relating to simulation, checkpointing, tracing, deterministic replay, and architectural/microarchitectural analysis of complex heterogeneous computer systems.

Requirements

  • BS/MS in EE, CE, or CS or equivalent experience
  • 12 or more years of relevant experience
  • Experience with CPU workload methodology: state capture and replay, trace analysis, SimPoint, etc.
  • Knowledge of CPU and system architecture and microarchitecture
  • Strong C/C++ and Python programming skills
  • Excellent communication and collaboration skills

Ways to Stand Out

  • Strong knowledge of sampling methodology and data science
  • Experience with CPU/GPU application development and optimization in PyTorch, TensorFlow, and similar frameworks
  • Proficiency in the ARM instruction set architecture
  • Experience developing user-mode and/or kernel-mode drivers
  • Background in writing functional and/or performance simulators

About NVIDIA

NVIDIA is a global leader in accelerated computing, delivering breakthroughs in AI, HPC, and advanced system design. Its technologies power applications across industries, from robotics and autonomous vehicles to healthcare and climate research. With the introduction of the Grace CPU Superchip and the announcement of the Vera CPU, NVIDIA has expanded into the CPU server market, complementing its GPUs and SoCs. The CPU architecture team drives innovations that integrate with NVIDIA’s broader technology stack to enable faster AI model training, efficient data processing, and scalable cloud deployments.

Compensation and Benefits

  • Base salary range:
    • Level 5: 224,000 USD - 356,500 USD
    • Level 6: 272,000 USD - 431,250 USD
  • Eligible for equity and benefits (link to NVIDIA benefits in original posting).

Application Details

  • Applications for this job will be accepted at least until May 2, 2026.
  • This posting is for an existing vacancy.
  • NVIDIA uses AI tools in its recruiting processes.
  • NVIDIA is an equal opportunity employer and does not discriminate on the basis of protected characteristics listed in the posting.