Manager, Engineering - Data Center Management
at Nvidia
Santa Clara, United States
USD 224,000-425,500 per year
Used Tools & Technologies
Not specified
Required Skills & Competences
- Software Development @ 7
- Python @ 6
- Leadership @ 3
- Communication @ 6
- Git @ 3
- Jira @ 3
- Debugging @ 6
- Technical Leadership @ 3
- Project Management @ 3
Details
NVIDIA is seeking a strong technical architect to own end-to-end manageability architecture for next-generation, rack-level AI supercomputing platforms deployed in data centers. You will work with internal and external component leads, drive customer use cases, align architecture with customer requirements, and release high-quality products to market.
Responsibilities
- Drive server management for large clusters and data centers deploying NVIDIA GPUs and Grace solutions.
- Work with data center architects and cloud customers to define implementation requirements that enable rapid product development.
- Collaborate closely with hardware teams to define low-level requirements and architecture for data center management products.
- Own and deliver firmware for low-level management components and manage a team to deliver firmware with quality.
- Work with internal teams to ensure requirements are designed and implemented correctly across firmware and software modules.
- Collaborate with other leads to design and build data center health management workflows.
- Drive reliability and optimization in firmware architecture from a data center viewpoint.
- Work closely with cluster bring-up teams to resolve issues quickly and own delivered firmware in terms of quality, reliability, and telemetry performance.
Requirements
- 10+ years of relevant experience working on server firmware (BMC) and platform software development.
- BS, MS, or PhD in Electrical Engineering, Computer Science, or a related field, or equivalent experience.
- Hands-on experience with data center health management workflows and a proven record of delivering server firmware for large data centers.
- Strong knowledge of data center management, server architecture, and server manageability in data centers.
- 4+ years of proven experience managing teams of engineers.
- Strong and demonstrable skills in C/C++ and Python; experience programming and debugging server platforms.
- Experience with SCM systems (e.g., Git, Perforce) and project management tools like Jira.
- Excellent written and oral communication skills, strong work ethic, teamwork orientation, and commitment to delivering quality work.
- Self-starter who enjoys finding creative solutions to complicated problems and is hands-on with coding.
Ways to Stand Out
- Hands-on experience with data center health management and server manageability.
- Proven technical leadership driving large, complex problems with 25+ engineers.
Compensation & Benefits
- The base salary range is 224,000 USD - 356,500 USD for Level 3, and 272,000 USD - 425,500 USD for Level 4. The final base salary will be determined based on location, experience, and comparable pay for similar positions.
- You will also be eligible for equity and company benefits.
Other Information
- Applications for this job will be accepted at least until August 13, 2025.
- NVIDIA is an equal opportunity employer and is committed to fostering a diverse work environment.