Senior Software Engineer - NIM Factory Container and Cloud Infrastructure

at NVIDIA
USD 184,000-356,500 per year
SENIOR
✅ On-site

Required Skills & Competences

Security @ 4, Docker @ 4, Kubernetes @ 4, Python @ 4, CI/CD @ 4, Hiring @ 4, Communication @ 6, Helm @ 4, SRE @ 4, Microservices @ 4, API @ 4, LLM @ 4, CUDA @ 4, GPU @ 4

Details

NVIDIA is hiring a Senior Software Engineer to focus on container and cloud infrastructure for NVIDIA Inference Microservices (NIMs) and hosted services. The role involves designing and implementing a core container strategy, building enterprise-grade tooling for container build, packaging, and deployment, and improving reliability, performance, and scale across thousands of GPUs. The team also supports emerging deployment patterns such as disaggregated LLM inference.

Responsibilities

  • Design, build, and harden containers for NIM runtimes and inference backends; enable reproducible, multi-arch, CUDA-optimized builds.
  • Develop Python tooling and services for build orchestration, CI/CD integrations, Helm/Operator automation, and test harnesses.
  • Enforce quality with typing, linting, and unit/integration tests.
  • Help design and evolve Kubernetes deployment patterns for NIMs, including GPU scheduling, autoscaling, and multi-cluster rollouts.
  • Optimize container performance (layer layout, startup time, build caching, runtime memory/IO, network, and GPU utilization); instrument with metrics and tracing.
  • Evolve base image strategy, dependency management, and artifact/registry topology.
  • Collaborate across research, backend, SRE, and product teams to ensure day-0 availability of new models.
  • Mentor teammates and set high engineering standards for container quality, security, and operability.

Requirements

  • BS/MS in Computer Science, Computer Engineering, or a related field, or equivalent experience.
  • 6+ years building production software with a strong focus on containers and Kubernetes.
  • Strong Python skills building production-grade tooling and services.
  • Experience with Python SDKs/clients for Kubernetes and cloud services.
  • Expert knowledge of Docker/BuildKit, containerd/OCI, image layering, multi-stage builds, and registry workflows.
  • Deep experience operating workloads on Kubernetes.
  • Hands-on experience building and running GPU workloads in Kubernetes, including the NVIDIA device plugin, MIG, CUDA drivers/runtime, and resource isolation.
  • Excellent collaboration and communication skills; ability to influence cross-functional design decisions.

Ways to stand out

  • Expertise with Helm chart design systems, Operators, and platform APIs serving many teams.
  • Experience with the OpenAI and Hugging Face APIs; understanding of the differences between inference backends (vLLM, SGLang, TRT-LLM).
  • Background in benchmarking and optimizing inference container performance and startup latency at scale.
  • Prior experience designing multi-tenant, multi-cluster, or edge/air-gapped container delivery.
  • Contributions to open-source container, Kubernetes, or GPU ecosystems.

Benefits & Compensation

  • Competitive salary and a generous benefits package; eligible for equity and NVIDIA benefits.
  • Base salary ranges by level:
    • Level 4: 184,000 USD - 287,500 USD
    • Level 5: 224,000 USD - 356,500 USD

Applications are accepted at least until September 14, 2025. NVIDIA is an equal opportunity employer committed to diversity and inclusion.