Principal Software Engineer - Networking Hyperscale Engineering
at NVIDIA
USD 248,000-391,000 per year
Used Tools & Technologies
Not specified
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level value.
About proficiency levels:
- 1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;
- 3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;
- 7-9 — expert. You can teach others and know all the pitfalls and tricks;
- 10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Linux @ 6
Communication @ 6
Networking @ 4
Debugging @ 4
CUDA @ 4
GPU @ 6
AI @ 6
NCCL @ 6
Details
NVIDIA is seeking an experienced Principal Software Engineer to join the US-based Networking Hyperscale Engineering team. The role focuses on co-developing NIC software and communication paths with top-tier cloud and AI customers, designing and optimizing those paths for next-generation GPU and NIC platforms, and influencing NVIDIA's NIC software roadmap across Linux kernel, RDMA/RoCE, DPDK, DOCA, NCCL, and NIC firmware.
Responsibilities
- Co-develop NIC software and communication paths with strategic, top-tier customers to enable and scale large AI superclusters.
- Design and implement high-performance C/C++ components on Linux using DPDK, kernel-bypass techniques, and RDMA/RoCE.
- Develop and integrate kernel, driver, and NIC firmware features to improve throughput, latency, and reliability for AI workloads.
- Work closely with NCCL and distributed training teams to tune end-to-end collectives performance over NVIDIA networking at scale.
- Own complex performance and functionality debugging with customers and represent the team in cross-organization architecture discussions.
Requirements
- 15+ years of overall experience in a similar or related systems / networking software role.
- Bachelor’s, Master’s or PhD in Software Engineering, Computer Science, Computer Engineering, Electrical Engineering, or a related field (or equivalent experience).
- Deep C/C++ expertise and strong Linux systems knowledge.
- Hands-on experience with kernel networking, RDMA/RoCE, NIC drivers, or DPDK.
- Proven experience developing and debugging network operating systems (NOS) and routing/switching protocols used in AI data centers (for example BGP, ECMP, EVPN/VXLAN).
- Practical experience with DOCA, NIC firmware interfaces, or other hardware-accelerated networking stacks for large-scale systems.
- Excellent communication skills and a track record of effective collaboration with developers, partners, and customers.
Ways to stand out
- Deep knowledge of Linux kernel / systems internals, SoC / SmartNIC / NIC embedded systems, and data center switches and NOS.
- Hands-on experience with RDMA/RoCE, GPU-related networking (for example GPUDirect RDMA), and high-performance, low-latency data paths.
- Background optimizing NCCL or other distributed training stacks on large GPU clusters for throughput and tail latency.
- Experience working with hyperscalers or major cloud providers on strategic, performance-critical AI networking deployments.
- Contributions to open-source networking, RDMA, DPDK, kernel, CUDA/NCCL, or related ecosystems.
Compensation and benefits
- Base salary range: 248,000 USD - 391,000 USD (determined based on location, experience, and pay of employees in similar positions).
- Eligible for equity and benefits (link to NVIDIA benefits provided in original posting).
Additional information
- Applications accepted at least until May 19, 2026.
- This posting is for an existing vacancy.
- NVIDIA uses AI tools in its recruiting processes.
- NVIDIA is an equal opportunity employer and committed to fostering a diverse work environment.