Software Architect, NIM Factory

at NVIDIA
USD 272,000-425,500 per year
MIDDLE
✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Software Development @ 3, Kubernetes @ 3, Python @ 5, Airflow @ 3, CI/CD @ 3, Distributed Systems @ 7, Leadership @ 5, Communication @ 3, SRE @ 5, Microservices @ 3, API @ 3, Technical Leadership @ 3, LLM @ 3, GPU @ 3

Details

NVIDIA is seeking a Software Architect to define and own the technical vision for the NVIDIA Inference Microservices (NIM) Factory. The role sets architectural direction for building, deploying, and scaling enterprise-grade AI services, and requires hands-on guidance for critical implementations. The scope spans day-0 launches and hardening those launches into reliable, performant, and secure software running across thousands of GPUs. The architect will also shape strategy for emerging challenges such as disaggregated LLM inference and safeguard the long-term technical health of the platform.

Responsibilities

  • Define the end-to-end technical architecture for the NIM Factory, including container build systems, CI/CD, Kubernetes deployment patterns, and runtime optimization.
  • Drive technical strategy and roadmap; make high-impact decisions on frameworks, technologies, and standards to empower dozens of engineering teams.
  • Architect and influence the design of workflow orchestration systems that underpin the NIM Factory.
  • Guide and support senior engineers across the organization and cultivate a culture of technical excellence and innovation.
  • Advocate for software development guidelines covering API composition, automation, observability, and secure supply chain management.
  • Collaborate with leadership across research, backend, SRE, and product to align technical vision with product goals and influence technical roadmaps.

Requirements

  • 15+ years of experience building large-scale, production distributed systems.
  • Consistent track record in a technical leadership or architect role, setting technical direction and implementing solutions.
  • Deep architectural expertise in cloud-native technologies, including Kubernetes, containers, and microservices.
  • Exceptional ability to mentor and grow senior engineers.
  • Proficiency in languages like Python for building tooling and services.
  • Experience architecting solutions for GPU-accelerated or other high-performance computing workloads.
  • Excellent communication and collaboration skills; ability to articulate complex technical concepts and drive consensus.
  • Degree in Computer Science, Computer Engineering, or related field (BS or MS) or equivalent experience.

Ways to stand out

  • Hands-on experience with LLM inference stacks such as Triton Inference Server, TensorRT-LLM, vLLM.
  • Experience optimizing large-model serving (KV cache sharding/paging, tensor/sequence parallelism, speculative decoding, dynamic batching).
  • Experience architecting next-generation container build systems or CI/CD platforms at scale.
  • Background with workflow orchestration engines (e.g., Temporal, Airflow) for complex distributed processes.
  • Expertise in designing multi-tenant, multi-cluster, or edge/air-gapped deployment architectures.

Benefits

  • Competitive base salary (see range below), plus eligibility for equity and a comprehensive benefits package.
  • Opportunity to work at a leading AI platform company and influence large-scale AI infrastructure.

Compensation

  • Base salary range: 272,000 USD - 425,500 USD (determined by location, experience, and comparable pay).
  • Eligible for equity and benefits.

Additional notes

  • Applications accepted at least until September 18, 2025.
  • NVIDIA is an equal opportunity employer committed to a diverse work environment.