Senior Software SDET Test Development Engineer

at NVIDIA
USD 140,000-270,250 per year
SENIOR
✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Software Development @ 4, Ansible @ 4, CentOS @ 7, Docker @ 3, Jenkins @ 4, Kubernetes @ 3, Linux @ 7, DevOps @ 7, Python @ 4, Java @ 4, GitHub @ 3, CI/CD @ 4, TensorFlow @ 4, JavaScript @ 4, Parallel Programming @ 4, Debugging @ 4, NLP @ 7, LLM @ 4, PyTorch @ 4, Agile @ 4, CUDA @ 4, GPU @ 4, AI @ 4, OpenCL @ 4, Slurm @ 3

Details

NVIDIA, the world leader in GPU computing, positions itself as an AI computing company. The platform SWQA team is looking for a candidate with enterprise server integration experience, strong Linux skills, and a background in reliability testing with telemetry, scale-out clusters, test plan development, AI tools and NLP, and DevOps/CI-CD.

Responsibilities

  • Develop test plans from design documentation and execute them on NVIDIA HGX/DGX/MGX platforms, covering servers, OS, firmware, and the CUDA software stack.
  • Install and test a range of operating systems, server firmware, and software stacks.
  • Drive root cause analysis for reliability and validation test failures and implement mitigations.
  • Build, develop, and debug front-end and back-end automation frameworks and tests at the server and OS level.
  • Review partner and supplier test results and prescribe additional reliability testing on components, servers, and packaging as needed.
  • Work in an agile software development team with high production quality standards.
  • Manage the bug lifecycle and collaborate across groups to drive solutions.

Requirements

  • Bachelor’s degree (or equivalent experience) in a STEM field. Master’s degree or 5+ years of proven experience preferred.
  • Proven experience in OS- and server-level automation, CI/CD processes, and DevOps using technologies such as Python, Shell, Ansible, Jenkins, C/C++, Java, and JavaScript.
  • Strong server and Linux (Ubuntu, Red Hat, CentOS, SUSE, Fedora, etc.) troubleshooting and debugging experience in bare-metal and KVM/VMware/Hyper-V environments.
  • Hands-on experience in model testing and AI frameworks/tools (TensorFlow, PyTorch, Cursor, etc.), plus NLP and LLM benchmarking.
  • Experience using AI development tools for test plan creation, test case development and test case automation.
  • Experience with firmware (FW), BMC/OpenBMC, network protocols, enterprise storage devices, PCIe buses/devices, IO sub-devices, CPU and memory, ACPI, UEFI spec, and Redfish is a strong plus.
  • Familiarity with GitHub/GitLab/Gerrit, PXE, SLURM, Kubernetes, Docker, and container/orchestration tooling is a plus.

Ways to Stand Out

  • Experience with AI-related tools, LLMs and NLP.
  • Experience working with NVIDIA GPU hardware is a strong plus.
  • Solid understanding of virtualization in Linux (KVM, Docker orchestrated with Kubernetes).
  • Background in parallel programming, ideally CUDA/OpenCL.

Compensation

  • Base salary ranges by level:
    • Level 3: 140,000 USD - 224,250 USD
    • Level 4: 168,000 USD - 270,250 USD
  • Eligible for equity and benefits.

Additional Information

  • Applications are accepted at least until February 28, 2026.
  • This posting is for an existing vacancy. NVIDIA uses AI tools in its recruiting processes. NVIDIA is an equal opportunity employer and values diversity.