Senior Engineering Manager - Compute Server Bring-Up
Used Tools & Technologies
Not specified
Required Skills & Competences
Python @ 4, Communication @ 7, Git @ 4, Networking @ 4, Jira @ 4, Project Management @ 4, Reporting @ 4, QA @ 4, System Architecture @ 7, Customer Support @ 4, GPU @ 4
Details
NVIDIA data center systems have become core to NVIDIA's rapidly growing enterprise and cloud provider businesses. These platforms bring together the full power of NVIDIA GPUs, NVIDIA NVLink, NVIDIA Networking, NVIDIA Data Center CPUs, and a fully optimized NVIDIA AI and HPC software stack.
We are seeking an excellent Senior Engineering Manager to lead the Compute Server Bring-Up team. This team is responsible for the bring-up, integration, validation, and troubleshooting of compute tray platforms for GPU racks, ensuring servers are fully functional and validated against requirements before mass deployment in data centers. You will directly lead a group of bring-up engineers and form a larger virtual team spanning NVIDIA software and firmware teams to ensure successful bring-up of compute platforms, both internally and with customers.
Responsibilities
- Own initial power-on and board bring-up: lead the initial power-on and functional validation of compute trays (CPU, GPU, NIC, storage including NVMe, cooling, etc.) internally and with customers. Ensure all functional requirements are met.
- Form and lead a virtual team across NVIDIA software & firmware teams to ensure subject matter experts are available as needed throughout bring-up. Provide regular reporting on status of bring-up and ensure cross-company teams are activated to help.
- Oversee flashing, updating, and validation of firmware for all server components as per the defined architecture. Ensure validation covers boundary, stress, and regression testing; confirm telemetry, logging, and hardware management features work as required. Document pain points, bring-up failures, and recovery flows, and provide actionable feedback to hardware, firmware, and software teams.
- Support factory & manufacturing flows, firmware updates, and diagnostic procedures. Ensure BOM change signoff and process optimization.
- Lead debug, issue resolution and customer support: root cause analysis and resolution of bring-up failures. Collaborate with partners, ODMs, and customers for technical support.
- Own and maintain platform design guides, bring-up checklists, and install instructions. Provide training and enablement for internal and external teams.
- Drive product life cycles with QA teams, ensuring robust bring-up, productization, and delivery.
- Conduct performance evaluations, develop a culture of excellence, and ensure high productivity.
Requirements
- 5+ years of relevant experience managing systems/platform software teams, ideally in server bring-up, firmware development, or data center solutions. Deep experience operating successfully in a matrix environment, forming and leading high-impact virtual teams spanning multiple disciplines.
- BS, MS, or PhD in EE/CS or a related field (or equivalent experience) with 12+ years of overall experience. Strong knowledge of compute tray designs, firmware enablement, and system-level architecture.
- Proven track record of delivering scalable server products and solutions for large-scale data centers. Experience collaborating with hardware, firmware, manufacturing, diagnostics and QA teams.
- Experience with SCM (Git, Perforce) and project management tools (Jira).
- Excellent written and oral communication skills, strong work ethic, and dedication to teamwork.
- Hands-on experience with x86/ARM system architecture and coding (C/C++, Python).
- Self-starter who finds creative solutions to complicated problems.
- Proven excellence in server architecture and cross-team collaboration to deliver server products to defined KPIs.
Ways to stand out
- Experience leading bring-up for sophisticated compute architectures like GB200 NVL72.
Compensation & Other Details
- Base salary range: 272,000 USD - 425,500 USD (determined based on location, experience, and pay of employees in similar positions).
- You will also be eligible for equity and benefits (see NVIDIA benefits).
- Applications accepted at least until December 17, 2025.
Equal Opportunity
NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer. NVIDIA does not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.