Distinguished Engineer - Data Center System Software Architect
Used Tools & Technologies
Not specified
Required Skills & Competences
Security @ 4, Linux @ 4, Networking @ 4, System Architecture @ 4, CUDA @ 3
Details
NVIDIA data center systems, such as DGX and HGX, have become core to NVIDIA's rapidly growing enterprise and cloud provider businesses. These platforms bring together the full power of NVIDIA GPUs, NVIDIA NVLink, NVIDIA InfiniBand networking, NVIDIA Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. This role involves owning the end-to-end architecture of these products at the system software level, including firmware, kernel drivers, operating systems, and user-mode drivers. The candidate will collaborate internally across component teams and externally with industry-leading cloud service providers to take these products to market.
Responsibilities
- Serve as the primary technical point of contact for major customers.
- Lead technological discussions, define KPIs, gather requirements, and address complex technical queries.
- Lead technical innovation and strategic collaborations with major hyperscalers to architect next-generation data center products.
- Align NVIDIA's roadmap with major customers' requirements.
- Develop and drive adoption of new technologies and protocols.
- Make critical technical decisions to mitigate risks in ambiguous situations.
Requirements
- Deep expertise in scalable and performant server system architecture focusing on software/hardware interfaces.
- Extensive experience with complex system software for accelerators (GPUs, DPUs, FPGAs).
- Mastery of system firmware (SBIOS, OpenBMC), embedded systems, and Linux kernel internals.
- Proficiency in management architectures and device/system management protocols such as MCTP, PLDM, SPDM, RDE, Redfish, and IPMI.
- Extensive knowledge of networking technologies and protocols, including TCP/IP, Ethernet, InfiniBand, and advanced switching and routing.
- Experience collaborating with platform security experts regarding security and usability tradeoffs.
- Demonstrated success leading large cross-functional projects through influence rather than direct authority.
- BS or MS degree in Computer Science, Electrical Engineering, or related field (or equivalent experience).
- 20+ years in system architecture and design.
Ways to stand out
- Knowledge of cloud and cluster deployment and management systems.
- Contributions to standards bodies such as OCP and DMTF.
- Familiarity with NVIDIA HPC programming models and libraries (CUDA, cuDNN, DOCA).
- Knowledge of enterprise storage architectures and distributed parallel processing paradigms.
NVIDIA is a leader in AI, HPC, and visualization. It emphasizes innovation, creativity, and passion within its teams. The company values diversity and is an equal opportunity employer.
Compensation
The base salary range is $308,000 - $471,500 USD, adjusted by location, experience, and peer pay. The role is also eligible for equity and benefits.