Software Architect, NIM Factory
at NVIDIA
Santa Clara, United States
USD 272,000-425,500 per year
Required Skills & Competences
Security @ 3, Software Development @ 3, Kubernetes @ 3, Python @ 5, Airflow @ 3, CI/CD @ 3, Distributed Systems @ 7, Leadership @ 5, Communication @ 3, SRE @ 5, Microservices @ 3, API @ 3, Technical Leadership @ 3, LLM @ 3, GPU @ 3
Details
NVIDIA is seeking a Software Architect to define and own the technical vision for the NVIDIA Inference Microservices (NIM) Factory. You will set architectural direction for building, deploying, and scaling enterprise-grade AI services while staying hands-on to guide critical implementations. The scope spans day-0 launches through hardening into enterprise-grade software, ensuring reliability, performance, and security across thousands of GPUs. You will also shape strategy for emerging challenges such as disaggregated LLM inference and safeguard the long-term technical health of the platform.
Responsibilities
- Define the end-to-end technical architecture for the NIM Factory, including container build systems, CI/CD, Kubernetes deployment patterns, and runtime optimization.
- Drive technical strategy and roadmap; make high-impact decisions on frameworks, technologies, and standards that empower dozens of engineering teams.
- Architect and influence the design of workflow orchestration systems that underpin the NIM Factory.
- Guide and support senior engineers across the organization and cultivate a culture centered on technical excellence and innovation.
- Advocate for software development guidelines covering API composition, automation, observability, and secure supply chain management.
- Collaborate with leadership across research, backend, SRE, and product to align technical vision with product goals and influence technical roadmaps.
Requirements
- 15+ years of experience building large-scale, production distributed systems.
- Consistent track record in a technical leadership or architect role, setting technical direction and implementing solutions.
- Deep architectural expertise in cloud-native technologies, including Kubernetes, containers, and microservices.
- Exceptional ability to mentor and grow senior engineers, with a passion for raising the technical bar across the organization.
- Proficiency in languages like Python for building tooling and services.
- Experience architecting solutions for GPU-accelerated or other high-performance computing workloads.
- Excellent communication and collaboration skills; ability to articulate complex technical concepts to diverse audiences and drive consensus.
- A degree in Computer Science, Computer Engineering, or a related field (BS or MS) or equivalent experience.
Ways to stand out (Nice-to-have)
- Hands-on experience with LLM inference stacks such as Triton Inference Server, TensorRT-LLM, or vLLM.
- Experience optimizing large-model serving (KV cache sharding/paging, tensor/sequence parallelism, speculative decoding, dynamic batching).
- Experience architecting next-generation container build systems or CI/CD platforms at scale.
- Background with workflow orchestration engines (e.g., Temporal, Airflow) for complex distributed processes.
- Expertise in designing multi-tenant, multi-cluster, or edge/air-gapped deployment architectures.
Benefits
- Competitive base salary (range provided below), equity eligibility, and a generous benefits package.
- NVIDIA emphasizes diversity and is an equal opportunity employer.
Additional details
- Base salary range: 272,000 USD - 425,500 USD.
- Applications for this job will be accepted at least until September 18, 2025.