System Software Architect, HPC Networking

at NVIDIA
USD 224,000-356,500 per year
MIDDLE
✅ On-site

Required Skills & Competences

Ansible @ 5, Docker @ 3, Grafana @ 3, Kubernetes @ 3, Linux @ 6, Prometheus @ 3, DevOps @ 3, Python @ 5, Datadog @ 3, Git @ 3, Mathematics @ 3, Networking @ 3, Salt @ 5, GPU @ 3

Details

Our technology has no boundaries! NVIDIA is building state-of-the-art GPU platforms that enable scientists, researchers and engineers to advance their ideas. The role sits at the intersection of networking, data center engineering, and GPU-accelerated computing. You will work on product-level architecture, design, development and hands-on validation for high-performance DGX SuperPOD systems in data center environments.

Responsibilities

  • Provide Ethernet and routing expertise to customers during project delivery to design, architect, and test Ethernet networking solutions.
  • Work on multi-functional teams to provide Ethernet network expertise for server infrastructure builds, accelerated computing workloads, and GPU-enabled AI applications.
  • Craft and evaluate DevOps automation scripts for network operations, design architectures, and develop switch fabric configurations.
  • Implement network configuration and validation tasks for data centers.
  • Create Methods of Procedure (MOP) and deployment documentation.
  • Validate network architectures and configurations in the lab.
  • Interact with customers to obtain required information to design and build optimal solutions.
  • Use software tools to validate and monitor network performance.

Requirements

  • Bachelor’s degree in Engineering, Computer Science, Mathematics, Physics, or equivalent experience.
  • 10+ years of experience with networking fundamentals, switching, routing, TCP/IP, and data center architecture.
  • Strong hands-on experience with Ethernet, TCP/IP, and routing, including BGP, VXLAN, and EVPN, plus network troubleshooting using packet sniffers (8+ years).
  • Strong Linux administration expertise with understanding of system-level issues, kernel drivers, PCIe devices, and computer hardware architecture (8+ years).
  • Experience engaging with clients for requirement analysis, designing next-gen data center network architectures, and validating functionality through simulations (6+ years).
  • Extensive experience in consulting and/or technical support (6+ years).
  • Background in data center engineering, Clos networking architecture, or cloud applications is advantageous (4+ years).
  • Automation experience delivering fully automated network provisioning solutions using Ansible, Salt, and Python (3+ years).
  • Ability to work independently, prioritize tasks, and deliver high-quality customer experience.
  • Strong presentation skills and ability to provide internal and customer training.
  • Ability to travel domestically and internationally up to 25% and work non-standard hours (nights/weekends) as needed.

Ways to stand out

  • Strong social and collaboration skills and accountability for results.
  • Experience with Kubernetes and Docker.
  • Familiarity with network simulation platforms (NVIDIA Air, GNS3, EVE-NG).
  • Experience with network and server observability and management tools such as Grafana, Prometheus, Datadog.
  • Personal Git repository with Python code examples.
  • Experience using AI tools to improve work quality and meet timelines.

Compensation & Benefits

  • Base salary range: 224,000 USD - 356,500 USD (determined based on location, experience, and internal pay equity).
  • Eligible for equity and benefits (link to benefits provided in original posting).

Other information

  • Location: Santa Clara, CA, United States.
  • Employment type: Full time.
  • Applications accepted at least until July 29, 2025.
  • NVIDIA is an equal opportunity employer and values diversity across all characteristics protected by law.