Senior Network Engineer, Enterprise Products

at NVIDIA

📍 Santa Clara, United States

USD 168,000-327,750 per year

SENIOR

✅ On-site

Required Skills & Competences

Security @ 7, Ansible @ 4, Cumulus Linux @ 3, Docker @ 3, Grafana @ 4, Kubernetes @ 3, Linux @ 6, Prometheus @ 4, Terraform @ 4, Python @ 6, GitHub @ 3, Datadog @ 3, Algorithms @ 7, Distributed Systems @ 4, Leadership @ 4, Bash @ 6, Communication @ 7, Networking @ 4, Performance Monitoring @ 3

Details

Join the NVIDIA Enterprise Products team as a Senior Network Engineer focused on designing, deploying, validating, and automating scalable datacenter networking for enterprise AI/ML systems. You will work cross-functionally with compute, software, and storage experts to develop reference cluster designs and on-prem cloud-ready solutions that interoperate with cloud service providers to enable hybrid enterprise AI. Emphasis is placed on large network design, distributed systems, datacenter architecture, and automation of repetitive networking activities.

Responsibilities

Own the deployment of scalable datacenter networking for enterprise AI/ML systems.
Deploy and validate cluster designs, optimizing them for enterprise facilities and real-world operation.
Collaborate with experts in networking, compute, software, and storage to drive innovation and delivery.
Lead multi-disciplinary projects, translating high-level goals into detailed specifications and robust implementations.
Engineer on-premises cloud-native solutions that integrate with multiple cloud providers (hybrid solutions).
Act as a pivotal contributor in compute and hardware architecture domains, driving technical excellence.
Apply multidisciplinary knowledge across Ethernet, InfiniBand, data center LAN, WAN, and software-defined networks.
Conduct TCO analysis and identify opportunities for operational improvements, sustainability, and cost optimization in network operations.
Automate repetitive network activities and produce clear documentation (Methods of Procedure, deployment guides) to support production readiness and incident resolution.

Requirements

Bachelor's degree or equivalent experience plus 10+ years in hardware or infrastructure architecture.
Proven experience designing and deploying on-prem cloud-native platforms with scaling and resilience considerations at chassis, rack, cluster, and data center levels.
Deep knowledge of networking protocols and technologies: Ethernet, TCP/IP, VLAN, VXLAN, BGP, EVPN, MPLS, QoS, and InfiniBand.
Extensive experience with optical networking and cabling: fiber types, transceiver modules (SFP/SFP+, QSFP, OSFP), signal modulation, FEC, and multi-platform compatibility.
Strong system-level thinking around high availability, scalability, and security in compute environments; experience enhancing reference designs.
Hands-on experience with infrastructure-as-code and monitoring tools including Base Command Manager (BCM), Ansible, Terraform, Grafana, and Prometheus.
Proficient with Linux (including Cumulus Linux) and scripting in Python and Bash.
Familiarity with NVIDIA networking products such as Mellanox switches, Cumulus Linux, BlueField DPUs, and InfiniBand technologies.
Demonstrated leadership in cluster design, networking, security, and remote access management; experienced working independently and across distributed teams and time zones.
Strong written and verbal communication skills for conveying complex technical concepts and creating clear operational documentation.

Ways to stand out

Certifications: Cisco (CCIE), Arista (ACE), Juniper (JNCIE), NVIDIA (NCP-AIN).
Deep expertise in RDMA technologies such as RoCE.
Broad cross-domain experience across Networking, Compute, Storage, and Platform Sizing, with emphasis on Infrastructure Cost Optimization and TCO analysis.
Strong understanding of network topologies, load balancing, and congestion control algorithms; engagement with standards and open-source communities.
Proficiency in Python, with a public GitHub portfolio of relevant projects; experience with Kubernetes and Docker; familiarity with Datadog and advanced performance monitoring.
Hands-on experience with networking simulators and digital twin tools such as NVIDIA Air, GNS3, and EVE-NG for virtual testing.

Compensation & Benefits

Base salary ranges (depending on level and location):
- Level 4: 168,000 USD - 258,750 USD
- Level 5: 208,000 USD - 327,750 USD
Eligibility for equity and NVIDIA benefits.
Applications accepted at least until August 9, 2025.

Equal Opportunity

NVIDIA is an equal opportunity employer and is committed to fostering a diverse work environment. We do not discriminate on the basis of legally protected characteristics and value diversity in our workforce.