Senior Test Development Engineer – Datacenter GPU Systems

at NVIDIA
USD 168,000-310,500 per year
SENIOR
✅ Hybrid

Required Skills & Competences

Linux @ 4 Python @ 4 Hiring @ 4 Leadership @ 4 Networking @ 7 Technical Leadership @ 4 CUDA @ 4 GPU @ 4

Details

NVIDIA is at the forefront of the AI revolution, and our datacenter products are the engines powering this transformation. We seek a Senior Test Development Engineer to join our Silicon Solutions Architecture Development team. In this critical role, you will design next-generation testing methodologies that ensure the performance, reliability, and integrity of pioneering GPU server systems used in the world's most demanding computing environments. If you thrive on solving sophisticated technical challenges, shaping future hardware, and ensuring flawless product quality, this is your opportunity to make a direct impact on the future of AI and high-performance computing.

Responsibilities

  • Design and implement novel test plans, tools, and automation frameworks to validate GPU functionality, performance, and reliability in complex datacenter environments.
  • Develop stress tests and methodologies to detect, characterize, and eliminate silent data errors.
  • Partner with architecture and silicon construction teams to influence system and chip-level features that improve diagnostics, debuggability, and root-cause analysis.
  • Analyze test results, investigate complex failures, and drive solutions in close collaboration with design, firmware, and software teams.
  • Provide technical leadership, guide junior engineers, and shape validation strategy across datacenter product lines.

Requirements

  • BS/MS in Electrical Engineering, Computer Engineering, Computer Science, or related field (or equivalent experience).
  • 8+ years of experience in hardware validation, test development, or datacenter hardware engineering.
  • Expert programming skills in Python and/or C/C++ for automation and tool development.
  • Deep Linux/Unix expertise, including advanced shell scripting.
  • Strong knowledge of server architecture: CPUs, GPUs, PCIe, networking, and storage.
  • Proven ability to own and deliver complex projects; proactive, hard-working approach.

Ways to Stand Out

  • Hands-on experience with NVIDIA GPU architecture (Hopper, Ampere) and software stack (CUDA, NCCL).
  • Experience testing high-speed interconnects such as NVLink or InfiniBand.
  • Familiarity with AI/ML or HPC benchmarking and stress-testing tools.
  • Proven track record of identifying and resolving critical bugs in pre-production hardware.

Compensation & Benefits

  • Base salary is determined by location, experience, and internal pay. Ranges by level:
    • Level 4: 168,000 USD – 264,500 USD
    • Level 5: 196,000 USD – 310,500 USD
  • Eligible for equity and company benefits.

Other Details

  • Location: Santa Clara, CA, United States.
  • Office policy: Hybrid (#LI-Hybrid).
  • Employment type: Full time.
  • Applications accepted at least until September 12, 2025.

NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer. We do not discriminate in hiring and promotion practices on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or other characteristics protected by law.