Distinguished Engineer – Data Center System Software Architect
at NVIDIA
USD 308,000-471,500 per year
Required Skills & Competences
Security @ 4, Linux @ 4, Networking @ 4, System Architecture @ 4, CUDA @ 3
Details
NVIDIA data center systems (DGX, HGX) integrate NVIDIA GPUs, NVLink, InfiniBand networking, NVIDIA Grace CPUs and an optimized AI/HPC software stack. This role is for a senior system software architect to own end-to-end architecture at the system software level, across firmware, kernel drivers, operating systems, and user-mode drivers. The role engages internal component leads and hyperscaler/cloud customers to take products to market.
Responsibilities
- Serve as the primary technical point of contact for major customers: lead technical discussions, define KPIs, gather requirements, and address complex technical queries.
- Lead technical innovation and strategic collaborations with major hyperscalers to architect next-generation data center products.
- Align NVIDIA's roadmap with major customers' requirements through direct engagement.
- Develop and drive adoption of new technologies and protocols.
- Make critical technical decisions in ambiguous situations and mitigate risks through left-shift strategies.
- Lead complex, cross-functional projects to completion and influence large-scale collaborative environments without direct authority.
Requirements
- Deep expertise in scalable and performant server system architecture, with focus on software/hardware interfaces.
- Extensive experience with complex system software for accelerators (GPUs, DPUs, FPGAs).
- Mastery of system firmware (SBIOS, OpenBMC), embedded systems, and Linux kernel internals.
- Strong experience implementing and developing kernel drivers, user-mode drivers, and related OS-level components.
- Proficiency with Out-of-Band and In-Band management architectures and device management protocols (MCTP, PLDM, SPDM, RDE), and system management protocols (Redfish, IPMI).
- Extensive knowledge of networking technologies and protocols including TCP/IP, Ethernet, InfiniBand, and advanced switching and routing concepts.
- Experience collaborating with platform security experts to define tradeoffs between security and usability.
- Demonstrated success leading complex, cross-functional programs and executing left-shift strategies to de-risk program execution.
- BS or MS in Computer Science, Electrical Engineering or related field (or equivalent experience).
- 20+ years of experience in system architecture and design.
Ways to stand out / Preferred
- Knowledge of cloud and cluster-level deployment and management systems.
- Participation in standards bodies such as OCP and DMTF.
- Familiarity with NVIDIA HPC programming models and libraries (CUDA, cuDNN, DOCA).
- Knowledge of enterprise storage architectures and distributed parallel processing paradigms.
Benefits
- Base salary range: 308,000 USD - 471,500 USD (determined based on location, experience, and market).
- Eligibility for equity and company benefits.
- NVIDIA is an equal opportunity employer committed to diversity and inclusion.
Additional details
- Location: Santa Clara, CA, United States.
- Employment type: Full time.
- Applications accepted at least until August 13, 2025.