Senior Software SDET Test Development Engineer

at NVIDIA
USD 136,000-264,500 per year
SENIOR
✅ On-site

Required Skills & Competences

Software Development @ 4, Ansible @ 4, CentOS @ 7, Docker @ 4, Jenkins @ 4, Kubernetes @ 4, Linux @ 7, DevOps @ 7, Python @ 4, Java @ 4, GitHub @ 4, CI/CD @ 7, TensorFlow @ 4, JavaScript @ 4, Parallel Programming @ 4, Debugging @ 7, NLP @ 4, LLM @ 4, PyTorch @ 4, Agile @ 4, CUDA @ 4, GPU @ 4

Details

NVIDIA is the world leader in GPU computing and an AI Computing Company. This role joins the platform SWQA team to develop and execute reliability and validation tests for NVIDIA HGX/DGX/MGX platforms, server OS, firmware and CUDA software stacks. The position requires strong Linux and server-level automation experience, reliability testing, CI/CD/DevOps, and hands-on work with AI tools and model benchmarking.

Responsibilities

  • Develop and execute platform test plans for NVIDIA HGX/DGX/MGX platforms covering servers, OS, firmware and CUDA software stack.
  • Install and test various operating systems, server firmware and software stacks in bare-metal and virtualized environments.
  • Drive root cause analysis for reliability and validation test failures and propose mitigations.
  • Build, develop and debug server- and OS-level automation frameworks (front-end and back-end) and tests.
  • Review partner and supplier test results and prescribe additional reliability testing on components, servers, and packaging as needed.
  • Work in an agile software development team with high production quality standards.
  • Manage bug lifecycle and collaborate across teams to drive solutions.

Requirements

  • Bachelor's degree in a STEM field (or equivalent experience); a Master's degree or 5+ years of proven experience is also acceptable.
  • Proven OS and server-level automation experience; CI/CD and DevOps experience.
  • Hands-on experience with automation and tooling such as Python, shell, Ansible, Jenkins, C/C++, Java, JavaScript.
  • Strong server and Linux troubleshooting and debugging experience (Ubuntu, Red Hat, CentOS, SUSE, Fedora) in bare-metal and KVM/VMware/Hyper-V environments.
  • Experience with model testing, AI tools and frameworks (TensorFlow, PyTorch, Cursor), and NLP/LLM benchmarking.
  • Experience using AI development tools for test plan creation, test case development and automation.
  • Strong experience with firmware (FW), BMC/OpenBMC, network protocols, enterprise storage devices, PCIe devices, IO sub-devices, CPU and memory, ACPI, UEFI, Redfish (a huge plus).
  • Experience with GitHub/GitLab/Gerrit, PXE, SLURM, Docker, Kubernetes, and the related container orchestration stack (a huge plus).

Ways to stand out

  • Experience with AI-related tools, LLMs and NLP.
  • Experience working with NVIDIA GPU hardware.
  • Solid understanding of virtualization in Linux (KVM, Docker orchestrated with Kubernetes).
  • Background in parallel programming, ideally CUDA/OpenCL.

Benefits & Compensation

  • Base salary ranges by level:
    • Level 3: 136,000 USD - 212,750 USD
    • Level 4: 168,000 USD - 264,500 USD
  • Eligible for equity and NVIDIA benefits.
  • Competitive total compensation and generous benefits package.

Additional information

  • Applications accepted at least until November 11, 2025.
  • NVIDIA is an equal opportunity employer committed to diversity and inclusion.