Senior AI Infrastructure Software Engineer

at NVIDIA

📍 Santa Clara, United States

USD 184,000-287,500 per year

SENIOR

✅ Hybrid

Required Skills & Competences

Kafka @ 4, Kubernetes @ 4, Redis @ 4, Python @ 7, SQL @ 4, NoSQL @ 4, CI/CD @ 4, Distributed Systems @ 6, Hiring @ 4, Communication @ 4, JavaScript @ 7, MongoDB @ 4, Microservices @ 4

Details

NVIDIA's Applied AI team for chip design is building scalable agentic systems to support researchers and production applications. This role focuses on designing, developing, and operating AI infrastructure that enables agents, copilots, and other generative-AI tools to reason, plan, call tools, and generate code reliably at scale.

Responsibilities

Design, develop, and improve scalable infrastructure to support next-generation AI applications, including copilots and agentic tools.
Drive improvements in architecture, performance, and reliability to bring LLMs and advanced agent frameworks to production at scale.
Build and maintain core infrastructure for deploying and running agents and agentic applications in production.
Collaborate across hardware, software, and research teams; mentor and support peers; promote best engineering practices and technical excellence.
Stay informed on AI infrastructure advancements and contribute to continuous innovation across the organization.

Requirements

Master's or PhD in Computer Science or a related field, or equivalent experience.
Minimum of 5 years of experience in large-scale distributed systems or AI infrastructure.
Advanced expertise in Python (required) and strong experience with JavaScript.
Deep knowledge of software engineering principles, OOP and functional programming, and writing high-performance, maintainable code.
Demonstrated expertise building scalable microservices and web applications.
Experience with SQL and NoSQL databases (especially MongoDB and Redis) in production.
Experience with containers, Kubernetes, and CI/CD pipelines.
Solid experience with distributed messaging systems (e.g., Kafka) and event-driven or decoupled architectures.
Practical experience integrating and fine-tuning LLMs or agent frameworks (e.g., LangChain, LangGraph, AutoGen, OpenAI Functions), as well as with RAG, vector databases, and related retrieval/embedding systems.
Demonstrated end-to-end ownership across architecture, development, deployment, integration, and operations/support.
Excellent communication skills and a collaborative, proactive approach.

Benefits

Base salary range (location- and level-dependent):
- Level 4: 184,000 USD - 287,500 USD
- Level 3: 148,000 USD - 235,750 USD
Eligible for equity and company benefits (see NVIDIA benefits page).
Role tagged as hybrid (#LI-Hybrid).

Additional Information

Applications accepted at least until July 29, 2025.
NVIDIA is an equal opportunity employer and values diversity in hiring and promotion practices.