Senior Firmware Engineer - CSP Engagements
at NVIDIA
Santa Clara, United States
USD 184,000-356,500 per year
Used Tools & Technologies
Not specified
Required Skills & Competences
- Software Development @ 4
- Performance Optimization @ 4
- Debugging @ 4
- CUDA @ 4
- GPU @ 4
Details
NVIDIA is seeking a Senior Firmware Engineer to join the CSP Engagements team, focusing on system software for datacenter products such as GB200. This role combines deep technical expertise in embedded firmware development with customer-facing responsibilities, helping cloud service providers (CSPs) adopt next-generation computing platforms. You will work at the intersection of hardware and software, driving technical solutions from concept through deployment.
Responsibilities
- Design and develop firmware solutions for manageability and observability of data center servers.
- Actively participate in hardware bring-up, out-of-band (OOB) firmware development, protocol stack work (Redfish, PLDM, MCTP, NSM), and hardware-software co-design for Cloud Service Provider (CSP) deployments.
- Debug and troubleshoot NVIDIA GPU firmware, power management, performance, and thermal control issues in data center deployments; provide active support to CSPs.
- Partner directly with CSPs to deliver technical solutions, co-develop and co-debug features and optimizations, and provide support during new product introductions.
- Perform advanced system debugging, root cause analysis, and performance optimization for large-scale data center environments.
- Collaborate with AE, FAE, and Solution Architect teams to deliver integrated customer solutions and technical documentation.
Requirements
- Deep expertise in data center server architectures, HPC systems, and hardware-software co-design.
- Deep expertise in embedded firmware, server management controllers, and hardware bring-up with a proven track record of shipping production BMC solutions.
- Strong knowledge of DMTF protocols and related management/provisioning stacks (Redfish, IPMI, PLDM, MCTP, SPDM), telemetry frameworks, and out-of-band management architectures.
- Expert-level skills in C/C++ in resource-constrained embedded environments, RTOS, device drivers, and low-level protocols (I2C, SPI, UART, PCIe, MCTP).
- Experience with RAS (reliability, availability, and serviceability), including error handling, error injection, fault isolation, and system health monitoring.
- BS or MS in Computer Engineering, Computer Science, or related field (or equivalent experience).
- 8-12 years of system software development experience.
Preferred / Ways to stand out
- Knowledge of cloud and cluster-level deployment and management systems.
- Experience with GPU computing (CUDA) and deep learning workloads.
- Knowledge of memory fabric and CXL architectures.
Compensation & Other Information
- Base salary range: 184,000 USD - 287,500 USD (Level 4); 224,000 USD - 356,500 USD (Level 5). Actual base salary will be determined based on location, experience, and internal pay considerations.
- You will also be eligible for equity and benefits.
- Applications for this job will be accepted at least until August 12, 2025.
NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer.