Distinguished Software Engineer - NVLink Fusion Software
at NVIDIA
Santa Clara, United States
USD 308,000-471,500 per year
Required Skills & Competences
Communication @ 4, Networking @ 4, System Architecture @ 7
Details
NVIDIA is expanding its data center platform and node designs from single-node HGX/DGX systems to large multi-node NVLink domain rack architectures. NVLink Fusion will enable industry-leading AI scale-up and scale-out performance by integrating NVIDIA GPUs, NVLink, InfiniBand networking, NVIDIA Grace CPUs, and partner ASICs/CPUs into rack-scale architectures. This is a technical architect role that champions cross-team work across Software, Architecture, Networking, and Systems engineering to define the NVLink Fusion architecture, ensure seamless integration of partner ASICs/CPUs, and establish software abstraction layers, reference software, guidance and documentation, and partner engagement models.
Responsibilities
- Define NVLink Fusion architecture, leveraging NVIDIA scale-up and scale-out technologies as a foundation.
- Establish appropriate software abstraction layers and reference software required for NVLink Fusion partners to extend NVIDIA's rack-scale architecture.
- Work directly with major customers to understand requirements and align their roadmaps with NVIDIA's roadmap.
- Collaborate with business partners and vendors to shape partner products to meet NVIDIA's needs.
- Mentor architects and engineering teams to grow them into future leaders.
- Make key technical decisions when faced with ambiguity and mitigate execution risks by applying a left-shift strategy to accelerate time to market.
Requirements
- BS or MS degree in Computer Engineering, Computer Science, or a related field, or equivalent experience.
- 16+ years of experience in system architecture and design.
- Deep experience designing scalable and performant server systems, particularly at the software/hardware interface.
- Previous experience working with complex system software for accelerators (GPUs, DPUs, or FPGAs).
- Expertise in out-of-band and in-band management architectures.
- Knowledge of device management protocols such as MCTP, PLDM, and RDE.
- Knowledge of system management protocols such as Redfish and IPMI.
- Demonstrable experience implementing left-shift strategies to de-risk program execution.
- Excellent written and verbal communication skills.
Ways to Stand Out
- Knowledge of cloud and cluster-level deployment and management systems.
- Participation in and contributions to standards bodies such as OCP and DMTF.
- Familiarity with CXL, UCIe, and other chip-to-chip (C2C) technology architectures.
- Knowledge of storage and networking technologies.
Compensation & Benefits
- Base salary range: 308,000 USD - 471,500 USD (determined by location, experience, and pay of employees in similar positions).
- Eligibility for equity and benefits (see NVIDIA benefits page).
Additional Information
- Applications accepted at least until August 13, 2025.
- NVIDIA is an equal opportunity employer and is committed to fostering a diverse work environment.