Senior Network Validation Engineer

at Nvidia
USD 160,000-253,000 per year
SENIOR
✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Security @ 4, Ansible @ 4, Grafana @ 4, Jenkins @ 4, Kubernetes @ 4, Linux @ 4, Prometheus @ 4, Python @ 7, CI/CD @ 4, Networking @ 4, Perl @ 7, Debugging @ 7, API @ 4, Reporting @ 4, QA @ 6, Robot Framework @ 4, GPU @ 4

Details

Our technology has no boundaries! NVIDIA is building the most modern and groundbreaking compute platforms globally for widespread use. Because of this work, scientists, researchers and engineers can advance their ideas. At its core, our visual computing technology not only enables an amazing computing experience, but it is also energy efficient. NVIDIA pioneered a supercharged form of computing loved by the most demanding computer users in the world: scientists, designers, artists, and gamers. Beyond technology, our people and diverse company culture make NVIDIA an innovative and dynamic place to work.

We are looking for a Senior Network Validation Engineer to lead and contribute hands-on to network validation activities in the Datacenter Systems Engineering team. You will work closely with solutions, network & storage architects, hardware system engineers, validation engineers, OEM/ODMs, and AE teams to ensure product validation and test coverage are optimal for data center scale AI products. The ideal candidate is self-motivated, comfortable in a lab environment, experienced in debugging and root-cause analysis, and has strong automation and scripting skills. You should be capable of thriving in a fast-paced environment with evolving product definitions.

Responsibilities

  • Design validation plans from bare metal to at-scale data center integration tests.
  • Debug, triage issues, perform root cause analysis, verify fixes, define new tests, and improve product test plans.
  • Configure, administer, troubleshoot, and oversee the qualification of Ethernet and InfiniBand networks in large-scale datacenter environments.
  • Perform server function and network validations including Ethernet & InfiniBand protocol and system-level reliability tests and end-to-end application tests.
  • Design, develop, and maintain automation frameworks and test automation suites, including automated reporting, while increasing end-to-end automation coverage with each release cycle.
  • Track and coordinate all validation activities from bring-up to production release.
  • Collaborate with cross-functional teams, including application, hardware design, networking, firmware, and security teams, to debug HW/SW product issues.
  • Provide inputs to architecture teams for next generation Data Center networking design.

Requirements

  • M.S. degree in Engineering, Computer Science, or a related field (or equivalent experience).
  • 10+ years of experience overall.
  • Over 5 years of proven experience in Software Quality Engineering and Network Testing, including significant contributions to QA strategies and test documentation.
  • Strong skills in Python (preferred) or other scripting languages such as Perl and Shell.
  • Hands-on experience with Jenkins or similar CI/CD pipelines.
  • Strong technical abilities in problem solving, design, coding and debugging.
  • Extensive hands-on experience configuring and troubleshooting data center networking, including Layer 2/Layer 3 protocols such as VLAN, BGP, EVPN, and spine-leaf topologies. InfiniBand network experience is desired.
  • Experience with test tools from Ixia or Spirent and working experience in test management.
  • Hands-on experience with Unix or Linux operating systems.
  • Strong team collaboration, multitasking ability, and good interpersonal and documentation skills.
  • Solid foundation in and understanding of software engineering practices.
  • Excellent design, debugging and problem-solving skills, with a strong bias for action, quality and engineering excellence.

Preferred / Ways to stand out

  • CCIE (Routing & Switching / Service Provider / Data Center) certification.
  • Demonstrated experience with RDMA technologies and related protocols such as InfiniBand or RoCE.
  • Knowledge of or experience with AI Data Center validation with GPU clusters.
  • Experience with REST APIs and Kubernetes, and a background in network automation tools such as Ansible, Jenkins, and Robot Framework.
  • Experience with IPv6 and telemetry at data center scale, using observability tools such as Grafana and Prometheus.

Compensation & Benefits

  • Base salary range: 160,000 USD - 253,000 USD (will be determined based on location, experience, and internal pay equity).
  • You will also be eligible for equity and benefits (see NVIDIA benefits page).

Additional Information

  • Location: Santa Clara, California, United States.
  • Employment type: Full time.
  • Applications accepted at least until October 18, 2025.
  • NVIDIA is an equal opportunity employer committed to fostering a diverse work environment.