Distinguished Engineer – Data Center System Software Architect
at Nvidia
📍 Santa Clara, United States
$308,000-471,500 per year
Required Skills & Competences
Security @ 4, Communication @ 4, Networking @ 4, System Architecture @ 4
Details
NVIDIA data center systems, such as DGX and HGX, have become core to NVIDIA's rapidly growing enterprise and cloud provider businesses. These platforms bring together the full power of NVIDIA GPUs, NVIDIA NVLink, NVIDIA InfiniBand networking, NVIDIA Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. We're looking for a strong technical architect to own the end-to-end architecture of these products at the system software level, including firmware, kernel drivers, operating systems, and user-mode drivers. You will work with component leads internally and engage with industry-leading cloud service providers to take these products to market.
Responsibilities
- Drive the system architecture for a complex server platform in a cross-functional environment.
- Work directly with major customers to understand their requirements and align their roadmaps with NVIDIA’s roadmap.
- Work with business partners and vendors to shape their products to meet NVIDIA’s needs.
- Develop a roadmap of new technologies and protocols and drive their design and adoption.
- Mentor architects and engineering teams to grow them into future leaders.
- Make key technical decisions even when faced with ambiguity, and mitigate execution risks by following a shift-left strategy.
Requirements
- Deep experience in designing architecture for scalable and performant server systems, particularly at the SW/HW interface.
- Previous experience working with complex system software for accelerators such as GPUs, DPUs, or FPGAs.
- Expertise in out-of-band and in-band management architectures.
- Knowledge of device management protocols such as MCTP, PLDM, and RDE.
- Knowledge of system management protocols such as Redfish and IPMI.
- Experience working with platform security experts to define tradeoffs between security and ease of use.
- Demonstrable experience in implementing a shift-left strategy to de-risk program execution.
- Excellent written and verbal communication skills.
- BS or MS degree in Computer Engineering, Computer Science, or a related field, or equivalent experience.
- 20+ years of experience in system architecture and design.
Ways to stand out from the crowd:
- Knowledge of cloud- and cluster-level deployment and management systems.
- Participation in and contributions to standards bodies such as OCP and DMTF.
- Familiarity with CXL architectures.
- Knowledge of storage and networking technologies.
NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing, and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. We have some of the most forward-thinking and hardworking people on the planet working for us. If you're creative, passionate, and self-motivated, we want to hear from you!