Senior Software Engineer, DGX Cloud Lepton Marketplace
at NVIDIA
USD 224,000-425,500 per year
Required Skills & Competences
Software Development @ 4, Go @ 7, Kubernetes @ 4, Distributed Systems @ 4, Mathematics @ 4, IaaS @ 4, API @ 4, Design Patterns @ 7, GPU @ 4
Details
One of DGX Cloud’s top priorities is to build a two-sided marketplace that connects AI start-ups and ISVs to NCPs and CSPs via DGX Cloud Lepton, an opinionated PaaS. This mission supports NVIDIA’s larger push to grow the AI ecosystem and support sovereign AI buildouts around the world. The team accelerates the buildout and integration of NCPs and CSPs into this marketplace to enable seamless access to GPU-optimized virtual machines for developers worldwide.
Responsibilities
- Develop the two-sided marketplace, including integrating compute providers and building discovery and bidding experiences that match supply with demand.
- Design and implement IaaS API integrations, collaborating with external engineering teams to ensure reliable, scalable, and consistent connectivity across diverse cloud environments.
- Shape integration strategies and develop stateful workflow orchestration.
- Drive improvements in testing, observability, and automation to ensure high-quality, fault-tolerant solutions.
- Work on cluster operations, operator development, node health monitoring, and GPU resource scheduling as part of Kubernetes-focused infrastructure engineering.
Requirements
- 12+ years of experience in developing software infrastructure for large-scale AI systems.
- Direct experience in a software engineering role within a highly technical organization with demonstrable impact.
- Software development experience with Kubernetes APIs and frameworks (including cluster operations and operator development).
- Familiarity with setting up cloud infrastructure environments (VMaaS, VPCs, RDMA, shared file-systems).
- Proven track record with 3rd-party API integrations: communicating with external teams, writing API clients, and improving integration reliability.
- Comfortable working in a fast-paced environment and collaborating with external engineering teams to test and debug integrations.
- Technical knowledge of a systems programming language (strong preference for production Go) and a solid understanding of software design patterns for stateful workflow orchestration.
- BS in Computer Science, Engineering, Physics, Mathematics, or a comparable field, or equivalent experience.
- 2+ years in a similar role and experience on large-scale production systems. Experience with common software engineering principles, tools and techniques.
Compensation and Benefits
- Base salary ranges (determined by location, experience, and pay of employees in similar positions):
  - Level 5: 224,000 USD - 356,500 USD
  - Level 6: 272,000 USD - 425,500 USD
- You will also be eligible for equity and benefits (see NVIDIA benefits page).
Additional information
- If you are excited about deepening your experience in cloud infrastructure, Kubernetes, distributed systems, and API development, and you love working in dynamic, fast-moving teams, please apply.
- Applications for this job will be accepted at least until July 29, 2025.
- NVIDIA is an equal opportunity employer and is committed to fostering a diverse work environment.