Senior Storage And Networking Product Engineer

at NVIDIA
USD 168,000-264,500 per year
SENIOR
✅ On-site

Required Skills & Competences

Ansible @ 4, Ceph @ 4, Chef @ 4, Go @ 4, Grafana @ 3, Kubernetes @ 4, Linux @ 4, Prometheus @ 3, Terraform @ 4, Python @ 4, Hiring @ 4, Bash @ 4, Networking @ 4, Debugging @ 4, Puppet @ 4, Compliance @ 4, GPU @ 4

Details

At NVIDIA, the Storage & Networking Product Engineer role contributes to building highly available, high-performance infrastructure for AI, ML, and HPC workloads. The role focuses on integrating storage systems and advanced networking technologies to ensure low latency, high efficiency, scalability, system resilience, and automation.

Responsibilities

  • Architect, deploy, and maintain distributed storage clusters with a focus on scalable performance and data durability.
  • Develop and improve high-performance networking architectures for storage environments, ensuring low-latency data paths for AI/ML and HPC workloads.
  • Configure and tune RDMA, NVMe-over-Fabrics, RoCE, InfiniBand, and Ethernet-based fabrics for maximum performance.
  • Partner with GPU, networking, and systems teams to ensure seamless end-to-end performance across the full stack.
  • Develop automated monitoring, logging, and alerting systems for storage and networking.
  • Build and maintain capacity planning models for network efficiency and storage growth.
  • Troubleshoot complex network-storage interactions, including bottlenecks in distributed filesystems, parallel storage, and interconnects.
  • Implement data protection and compliance controls such as encryption in transit, access control, and auditing.
  • Drive automation of storage and networking operations through infrastructure-as-code and AI/ML-guided orchestration.

Requirements

  • BS/MS in Computer Science, Electrical Engineering, or a related field, or equivalent experience.
  • 12+ years of experience in storage systems engineering, production infrastructure, or large-scale data center operations.
  • Deep knowledge of networking protocols and technologies: TCP/IP, Ethernet, InfiniBand, RDMA, RoCE, NVMe-oF, Fibre Channel.
  • Hands-on experience with high-performance storage systems: Lustre, GPFS, Ceph, distributed object storage, enterprise SAN/NAS.
  • Expertise in Linux systems engineering, including tuning, performance analysis, and debugging.
  • Coding and scripting skills in Python, Bash, Go, or C/C++ for automation, monitoring, and performance optimization.
  • Experience with configuration management and orchestration tools (Ansible, Terraform, Puppet, Chef, Kubernetes).
  • Familiarity with observability stacks (Prometheus, Grafana, Elastic, InfluxDB) to monitor and optimize storage and network performance.
  • Proficiency in identifying and resolving complex bottlenecks across the storage and networking layers.

Ways To Stand Out

  • Experience designing and operating RDMA-accelerated HPC/AI clusters at scale, with hands-on expertise in network topologies and large-scale switch/router deployments.
  • Familiarity with network telemetry and packet capture tools (sFlow, NetFlow, Wireshark), and a proven track record of capacity planning and performance optimization for distributed storage systems over high-speed networks.
  • Background in jointly developing storage networks for AI/ML training pipelines, large-scale inference, and RAG workflows.
  • Proficiency in hybrid cloud storage and networking solutions (Kubernetes CSI, cloud-native fabrics, hybrid on-prem/cloud setups).
  • Contributions to open-source networking or storage projects.

Compensation & Benefits

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 168,000 USD - 264,500 USD. You will also be eligible for equity and benefits.

Equal Opportunity

NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer. The company does not discriminate in hiring or promotion practices on the basis of legally protected characteristics.

Additional Info

  • Location: Santa Clara, California, United States.
  • Applications are accepted at least until September 29, 2025.