Senior AI Infrastructure Software Engineer
at NVIDIA
Santa Clara, United States
USD 148,000-287,500 per year
Used Tools & Technologies
Not specified
Required Skills & Competences
Kafka @ 4, Kubernetes @ 4, Redis @ 4, Python @ 7, SQL @ 4, NoSQL @ 4, CI/CD @ 4, Distributed Systems @ 4, Communication @ 4, JavaScript @ 7, MongoDB @ 4, Microservices @ 4
Details
NVIDIA's Applied AI team for chip design is building agentic tools and infrastructure to enable researchers and engineers to deploy LLMs and advanced agent frameworks at scale. This role focuses on designing, developing, and maintaining the core infrastructure that powers agentic applications, copilots, and other generative AI systems used across hardware, software, and research teams.
Responsibilities
- Design, develop, and improve scalable infrastructure to support next-generation AI applications, including copilots and agentic tools.
- Drive improvements in architecture, performance, and reliability to enable use of LLMs and advanced agent frameworks at scale.
- Build and maintain production systems including microservices, web apps, databases, containers, Kubernetes, and CI/CD pipelines.
- Integrate distributed messaging systems and event-driven or decoupled architectures for robust enterprise solutions (see the sketch after this list).
- Collaborate closely with researchers, hardware, and software teams; mentor and support peers and encourage best engineering practices.
- Stay informed of the latest advancements in AI infrastructure and contribute to continuous innovation across the organization.
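For illustration only, a minimal sketch of the kind of event-driven, decoupled integration these responsibilities describe, using the kafka-python client. The broker address, topic name, consumer group, and event schema are assumptions for the example, not details from the posting.

```python
# Minimal event-driven sketch with kafka-python.
# Broker address, topic, group id, and event schema are hypothetical.
import json
from kafka import KafkaConsumer, KafkaProducer

TOPIC = "agent-events"  # hypothetical topic name


def publish_event(producer: KafkaProducer, event: dict) -> None:
    # Serialize the event to JSON and send it to the broker.
    producer.send(TOPIC, value=event)
    producer.flush()  # block until the broker acknowledges the message


def consume_events() -> None:
    # A consumer group lets multiple service replicas share partition load.
    consumer = KafkaConsumer(
        TOPIC,
        bootstrap_servers="localhost:9092",
        group_id="copilot-workers",  # hypothetical group id
        auto_offset_reset="earliest",
        value_deserializer=lambda m: json.loads(m.decode("utf-8")),
    )
    for message in consumer:
        handle(message.value)  # hand off to downstream business logic


def handle(event: dict) -> None:
    print("received event:", event)


if __name__ == "__main__":
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    publish_event(producer, {"type": "inference_request", "prompt": "hello"})
```

In this pattern the producer never calls the worker directly; any number of consumer replicas in the same group can share the topic's partitions, which is what lets the architecture scale and degrade gracefully.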
Requirements
- Master's or PhD in Computer Science or a related field, or equivalent experience.
- Minimum of 5 years of experience in large-scale distributed systems or AI infrastructure.
- Advanced expertise in Python (required) and strong experience with JavaScript.
- Deep knowledge of software engineering principles, OOP and functional programming, and writing high-performance, maintainable code.
- Demonstrated experience building scalable microservices and web applications.
- Experience with SQL and NoSQL databases (especially MongoDB and Redis) in production.
- Experience with containers, Kubernetes, and CI/CD tooling and practices.
- Practical experience with distributed messaging systems (e.g., Kafka) and event-driven architectures.
- Practical experience integrating and fine-tuning LLMs or agent frameworks (e.g., LangChain, LangGraph, AutoGen, OpenAI Functions), building retrieval-augmented generation (RAG) pipelines, and working with vector databases (see the sketch after this list).
- Demonstrated end-to-end ownership of engineering solutions: architecture, development, deployment, integration, and ongoing operations/support.
- Excellent communication skills and a collaborative, proactive approach.
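For illustration only, a minimal retrieval-augmented generation sketch matching the RAG/vector-database requirement above. The embed() and generate() functions and the in-memory store are placeholders standing in for whatever embedding model, LLM endpoint, and vector database a real system would use.

```python
# Minimal RAG sketch: embed documents, retrieve by cosine similarity,
# then condition the model's answer on the retrieved context.
import hashlib
import numpy as np


def embed(text: str) -> np.ndarray:
    # Placeholder: deterministic pseudo-embedding; replace with a real model.
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).standard_normal(384)
    return v / np.linalg.norm(v)


def generate(prompt: str) -> str:
    # Placeholder: replace with a real LLM call.
    return f"[model answer conditioned on]\n{prompt}"


class InMemoryVectorStore:
    """Tiny stand-in for a vector database (FAISS, Milvus, etc.)."""

    def __init__(self):
        self.texts, self.vectors = [], []

    def add(self, text: str) -> None:
        self.texts.append(text)
        self.vectors.append(embed(text))

    def search(self, query: str, k: int = 3) -> list[str]:
        q = embed(query)
        scores = [float(q @ v) for v in self.vectors]  # cosine similarity on unit vectors
        top = np.argsort(scores)[::-1][:k]
        return [self.texts[i] for i in top]


def answer(store: InMemoryVectorStore, question: str) -> str:
    context = "\n".join(store.search(question))
    prompt = f"Use the context to answer.\nContext:\n{context}\n\nQuestion: {question}"
    return generate(prompt)


if __name__ == "__main__":
    store = InMemoryVectorStore()
    for doc in [
        "Kafka decouples producers from consumers.",
        "Kubernetes schedules containers across a cluster.",
        "RAG grounds LLM answers in retrieved documents.",
    ]:
        store.add(doc)
    print(answer(store, "How does RAG keep answers grounded?"))
```

A production setup would swap the in-memory store for a managed vector database and add chunking, metadata filtering, and evaluation, but the retrieve-then-generate flow is the same.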
Compensation & Additional Info
- Base salary ranges (location- and level-dependent):
- Level 3: 148,000 USD - 235,750 USD
- Level 4: 184,000 USD - 287,500 USD
- Eligible for equity and benefits (see NVIDIA benefits).
- Applications accepted at least until September 28, 2025.
- Location: Santa Clara, CA, United States. #LI-Hybrid
Benefits
- Equity eligibility and NVIDIA benefits (health, retirement, etc.).
- Opportunity to work at the intersection of research, engineering, and product development on cutting-edge generative AI and chip design problems.