Used Tools & Technologies
GPU
Required Skills & Competences
Each tag name is followed by the "@" symbol and a proficiency level.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such level.
Software Development @ 4
Ansible @ 4
CentOS @ 4
Docker @ 4
Linux @ 4
Python @ 4
Parallel Programming @ 4
QA @ 4
Agile @ 4
CUDA @ 4
Cloud Computing @ 6
AI @ 4
Details
We are looking for a Senior Software Development Engineer in Test to join the Compute CUDA Quality Assurance team for NVIDIA's Enterprise SWQA release schedules. The role focuses on automation development, test and validation infrastructure, and applying AI tools to aid in solving complex issues and improving test automation.
Responsibilities
- Automate testbench-independent test specification and execution workflows for worldwide chip validation teams running tests on silicon.
- Develop and maintain automation framework and infrastructure used by distributed heterogeneous servers with NVIDIA GPUs to verify multiple designs/points-of-reference in many configurations (automation farm and cloud).
- Develop systems that run at large scale (hundreds of tests per day) in distributed environments.
- Develop test plans and orchestrate testing for Compute software releases across compute architecture platforms including Tesla GPUs, NVIDIA turnkey systems and OEM systems.
- Incorporate advanced AI tools into test infrastructure to enhance testing capabilities and streamline operations.
- Improve code coverage and develop roadmaps prioritizing software development schedule for full lifecycle of tool development, test, and deployment.
- Collaborate across teams to define and implement automation and to productize new features.
- Build and operate key pieces of automation framework infrastructure; lead automation support and participate in automating manual test cases.
- Focus on customer experience by improving usability and performance attainment.
- Test software functionality and internal code/structure and run regression tests for existing CUDA/Driver features.
- Work in a dynamic agile software development team with high production quality standards.
Requirements
- BS or MS in Engineering (or equivalent experience) with 5+ years of experience testing across the software development cycle.
- Solid understanding of embedded systems, Linux, Python, C and C++.
- Experience with cloud infrastructure is a big plus.
- Proven experience using AI tools for automation and test plan development applied to daily tasks.
- Strong technical skills with deep understanding of orchestration & automation systems, data center and cloud architecture.
- Solid understanding of QA methodology and attention to detail.
- Knowledge of cluster and cluster management.
- Experience developing test strategies, high-quality test plans, and executing tests.
- Proficient in building test setups and fine tuning hardware and software components that enable cloud computing services.
Preferred / Ways to stand out
- Expertise packaging software in Linux (rpms, debs) and knowledge of Linux distributions (CentOS, Ubuntu, SLES, RedHat, Fedora).
- Applying AI-powered tools to improve efficiency and quality (test case/plan/script generation, defect detection, bug fixing, day-to-day assistance).
- Experience with configuration and deployment management (Ansible), containers (Docker), and virtualization infrastructure software (Xen, KVM).
- Good understanding of the C/C++ toolchain in Linux including cross-compilation (automake/autoconf, cmake, meson).
- Background in parallel programming, ideally CUDA C/C++ and OpenACC.
Compensation & Other Details
- Base salary range: 140,000 USD - 224,250 USD for Level 3, and 168,000 USD - 270,250 USD for Level 4.
- You will also be eligible for equity and benefits (see https://www.nvidia.com/en-us/benefits/).
- Applications accepted at least until May 31, 2026.
NVIDIA is an equal opportunity employer. NVIDIA uses AI tools in its recruiting processes.