Vacancy is archived. Applications are no longer accepted.
Required Skills & Competences
Security @ 3 Docker @ 4 Jenkins @ 4 Kubernetes @ 4 Linux @ 4 Vault @ 4 Python @ 7 Communication @ 4 Mentoring @ 4 Debugging @ 4 API @ 4 JSON @ 3 OAuth @ 3 QA @ 4 Robot Framework @ 3
Details
NVIDIA is seeking a Senior Firmware Release Lifecycle Infrastructure Architect to design and build infrastructure and tools that manage the end-to-end firmware lifecycle for data center compute servers (DGX, HGX, MGX) built with NVIDIA GPUs, CPUs, and DPUs. The role focuses on ingesting firmware binaries, assembling secure validated firmware bundles, and providing scalable, maintainable tooling and automation to support high-concurrency firmware packaging and flashing for large-scale data centers.
Responsibilities
- Design and architect scalable Firmware Lifecycle Management (FLM) systems that ingest firmware binary images and assemble secure, validated firmware bundles for deployment across servers, blades, and racks.
- Develop infrastructure to support high-concurrency firmware packaging pipelines across multiple platforms and SKUs.
- Collaborate with cross-functional teams (firmware, hardware, software, QA) to gather requirements and deliver robust, scalable solutions across tens of products and hundreds of variants.
- Architect and implement front-end and back-end components, APIs, UIs, and CLIs to support FLM workflows, ensuring maintainability and performance.
- Integrate third-party components and services (examples: Jenkins, Artifactory, Vault) into the FLM ecosystem.
- Own design and evolution of scalable APIs with a focus on maintainability and extensibility.
- Implement automation frameworks and pipelines using Jenkins, Docker, and Kubernetes, designed for observability, telemetry, and high availability.
- Continuously seek process automation and resilience improvements, including telemetry and observability enhancements.
Requirements
- 8+ years of experience in software architecture, systems programming, automation infrastructure, and firmware package creation, preferably in data center or enterprise environments.
- Bachelor's degree or equivalent experience.
- Strong background in designing scalable and modular architectures, with the ability to identify and mitigate performance bottlenecks.
- Advanced Python programming skills and deep understanding of object-oriented design principles and scalable code practices.
- Expertise in Linux system programming, including shell scripting, system debugging, and automation toolchains.
- Experience with firmware workflows and lifecycle management, including familiarity with Redfish APIs, update mechanisms, and industry standards (e.g., DMTF).
- Hands-on experience integrating third-party tools and building automation frameworks using Jenkins, Docker, and Kubernetes.
- Familiarity with Artifactory and Robot Framework is a plus.
- Excellent communication skills; ability to document and present technical designs to stakeholders across hardware, firmware, software, and QA.
- Familiarity with OS fundamentals such as process scheduling, memory management, and system security models.
Preferred / Ways to Stand Out
- Understanding of Out-of-Band/In-Band management and one or more of the following protocols: MCTP, PLDM, SPDM, Redfish.
- Experience with common firmware stacks such as OpenBMC and BIOS/UEFI.
- Prior experience in firmware provisioning and knowledge of low-level hardware interfaces such as PCIe, I2C, SPI, USB.
- Familiarity with RESTful architectures, JSON-over-HTTPS, OAuth-based authentication, and secure API development.
- Experience leading cross-team architectural discussions and mentoring junior engineers.
Compensation & Benefits
- Base salary range: 184,000 USD - 287,500 USD (will be determined based on location, experience, and internal pay parity).
- Eligible for equity and company benefits.
Additional Information
- Applications accepted at least until August 9, 2025.
- NVIDIA is an equal opportunity employer and values diversity.