Senior Test Development Engineer - Datacenter GPU Systems
NVIDIA is at the forefront of the AI revolution, and our datacenter products are the engines powering this transformation. We seek a Senior Test Development Engineer to join our Silicon Solutions Architecture Development team. In this critical role, you will design next-generation testing methodologies that ensure the performance, reliability, and integrity of pioneering GPU server systems used in the world's most demanding computing environments. If you thrive on solving sophisticated technical challenges, shaping future hardware, and ensuring flawless product quality, this is your opportunity to make a direct impact on the future of AI and high-performance computing.
Responsibilities
- Innovate & Build: Design and implement novel test plans, tools, and automation frameworks to validate GPU functionality, performance, and reliability in complex datacenter environments.
- Safeguard Data Integrity: Develop stress tests and methodologies to detect, characterize, and eliminate silent data errors.
- Build the Future of Hardware: Partner with architecture and silicon design teams to influence system and chip-level features that improve diagnostics, debuggability, and root-cause analysis.
- Deep Dive Debugging: Analyze test results, investigate complex failures, and drive solutions in close collaboration with design, firmware, and software teams.
- Lead & Mentor: Provide technical leadership, guide junior engineers, and shape validation strategy across datacenter product lines.
Requirements
- BS/MS in Electrical Engineering, Computer Engineering, Computer Science, or a related field (or equivalent experience).
- 8+ years of experience in hardware validation, test development, or datacenter hardware engineering.
- Expert programming skills in Python and/or C/C++ for automation and tool development.
- Deep Linux/Unix expertise, including advanced shell scripting.
- Strong knowledge of server architecture: CPUs, GPUs, PCIe, networking, and storage.
- Demonstrated ability to own and deliver complex projects; proactive and hardworking approach.
Ways to Stand Out
- Hands-on experience with NVIDIA GPU architecture (Hopper, Ampere) and software stack (CUDA, NCCL).
- Experience testing high-speed interconnects such as NVLink or InfiniBand.
- Familiarity with AI/ML or HPC benchmarking and stress-testing tools.
- Proven track record of identifying and resolving critical bugs in pre-production hardware.
Benefits & Compensation
- Base salary range by level:
  - Level 4: 168,000 USD - 264,500 USD
  - Level 5: 196,000 USD - 310,500 USD
- Eligible for equity and company benefits. (See NVIDIA benefits page.)
- NVIDIA is an equal opportunity employer committed to diversity and inclusion.
Applications for this job will be accepted at least until September 26, 2025.
#LI-Hybrid