Software Architect, NIM Factory

at NVIDIA
USD 272,000-425,500 per year
MIDDLE
✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Security @ 3, Software Development @ 3, Kubernetes @ 3, Python @ 5, Airflow @ 3, CI/CD @ 3, Distributed Systems @ 7, Leadership @ 5, Communication @ 3, SRE @ 5, Microservices @ 3, API @ 3, Technical Leadership @ 3, LLM @ 3, GPU @ 3

Details

NVIDIA is seeking a Software Architect to define and own the technical vision for the NVIDIA Inference Microservices (NIM) Factory. You will set architectural direction for building, deploying, and scaling enterprise-grade AI services while staying hands-on to guide critical implementations. The scope spans day-0 launches through hardening into enterprise-grade software, ensuring reliability, performance, and security across thousands of GPUs. You will also shape strategy for emerging challenges like disaggregated LLM inference and safeguard the long-term technical health of the platform.

Responsibilities

  • Define the end-to-end technical architecture for the NIM Factory, including container build systems, CI/CD, Kubernetes deployment patterns, and runtime optimization.
  • Drive technical strategy and roadmap; make high-impact decisions on frameworks, technologies, and standards that empower dozens of engineering teams.
  • Architect and influence the design of workflow orchestration systems that underpin the NIM Factory.
  • Guide and support senior engineers across the organization and cultivate a culture centered on technical excellence and innovation.
  • Advocate for software development guidelines covering API composition, automation, observability, and secure supply chain management.
  • Collaborate with leadership across research, backend, SRE, and product to align technical vision with product goals and influence technical roadmaps.

Requirements

  • 15+ years of experience building large-scale, production distributed systems.
  • Consistent track record in a technical leadership or architect role, setting technical direction and implementing solutions.
  • Deep architectural expertise in cloud-native technologies, including Kubernetes, containers, and microservices.
  • Exceptional ability to mentor and grow senior engineers, with a passion for raising the technical bar across the organization.
  • Proficiency in languages like Python for building tooling and services.
  • Experience architecting solutions for GPU-accelerated or other high-performance computing workloads.
  • Excellent communication and collaboration skills; ability to articulate complex technical concepts to diverse audiences and drive consensus.
  • A degree in Computer Science, Computer Engineering, or a related field (BS or MS) or equivalent experience.

Ways to stand out (Nice-to-have)

  • Hands-on experience with LLM inference stacks such as Triton Inference Server, TensorRT-LLM, or vLLM.
  • Experience optimizing large-model serving (KV cache sharding/paging, tensor/sequence parallelism, speculative decoding, dynamic batching).
  • Experience architecting next-generation container build systems or CI/CD platforms at scale.
  • Background with workflow orchestration engines (e.g., Temporal, Airflow) for complex distributed processes.
  • Expertise in designing multi-tenant, multi-cluster, or edge/air-gapped deployment architectures.

Benefits

  • Competitive base salary (range provided below), equity eligibility, and a generous benefits package.
  • NVIDIA emphasizes diversity and is an equal opportunity employer.

Additional details

  • Base salary range: 272,000 USD - 425,500 USD.
  • Applications for this job will be accepted at least until September 18, 2025.