Senior AI Infrastructure Software Engineer

at NVIDIA
USD 148,000-287,500 per year
SENIOR
✅ Hybrid


Used Tools & Technologies

Not specified

Required Skills & Competences

Kafka @ 4, Kubernetes @ 4, Redis @ 4, Python @ 7, SQL @ 4, NoSQL @ 4, CI/CD @ 4, Distributed Systems @ 4, Communication @ 4, JavaScript @ 7, MongoDB @ 4, Microservices @ 4

Details

NVIDIA's Applied AI team for chip design is building agentic tools and infrastructure to enable researchers and engineers to deploy LLMs and advanced agent frameworks at scale. This role focuses on designing, developing, and maintaining the core infrastructure that powers agentic applications, copilots, and other generative AI systems used across hardware, software, and research teams.

Responsibilities

  • Design, develop, and improve scalable infrastructure to support next-generation AI applications, including copilots and agentic tools.
  • Drive improvements in architecture, performance, and reliability to enable use of LLMs and advanced agent frameworks at scale.
  • Build and maintain production systems including microservices, web apps, databases, containers, Kubernetes, and CI/CD pipelines.
  • Integrate distributed messaging systems and event-driven or decoupled architectures for robust enterprise solutions (see the illustrative sketch after this list).
  • Collaborate closely with researchers, hardware, and software teams; mentor and support peers and encourage best engineering practices.
  • Stay informed of the latest advancements in AI infrastructure and contribute to continuous innovation across the organization.
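
To make the event-driven responsibility above concrete, here is a minimal sketch of a decoupled worker that consumes job events from Kafka and records results in Redis. This is an illustration only, not NVIDIA's actual stack; the topic name, hosts, and payload fields are assumptions for the example.

```python
# Illustrative event-driven worker (hypothetical names throughout):
# consumes agent-job events from a Kafka topic and stores results in Redis.
import json

from kafka import KafkaConsumer  # kafka-python
import redis

consumer = KafkaConsumer(
    "agent-jobs",                            # hypothetical topic name
    bootstrap_servers="localhost:9092",
    group_id="agent-workers",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)
cache = redis.Redis(host="localhost", port=6379, db=0)

for message in consumer:
    job = message.value                      # e.g. {"job_id": "...", "prompt": "..."}
    # ... invoke an LLM or agent framework here ...
    result = {"job_id": job["job_id"], "status": "done"}
    cache.set(f"job:{job['job_id']}", json.dumps(result))
```

Decoupling producers from this worker through the topic is what lets such infrastructure scale horizontally: additional workers in the same consumer group share the partitioned load without any change to the producers.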

Requirements

  • Master's or PhD in Computer Science or a related field, or equivalent experience.
  • Minimum of 5 years of experience in large-scale distributed systems or AI infrastructure.
  • Advanced expertise in Python (required) and strong experience with JavaScript.
  • Deep knowledge of software engineering principles, OOP and functional programming, and writing high-performance, maintainable code.
  • Demonstrated experience building scalable microservices and web applications.
  • Experience with SQL and NoSQL databases (especially MongoDB and Redis) in production.
  • Experience with containers, Kubernetes, and CI/CD tooling and practices.
  • Practical experience with distributed messaging systems (e.g., Kafka) and event-driven architectures.
  • Practical experience integrating and fine-tuning LLMs or agent frameworks (e.g., LangChain, LangGraph, AutoGen, OpenAI Functions), building retrieval-augmented generation (RAG) pipelines, and working with vector databases (a minimal retrieval sketch follows this list).
  • Demonstrated end-to-end ownership of engineering solutions: architecture, development, deployment, integration, and ongoing operations/support.
  • Excellent communication skills and a collaborative, proactive approach.
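
For readers unfamiliar with the RAG requirement above, the sketch below shows the retrieval step in framework-agnostic form: documents and the query are embedded as vectors, the closest documents are selected by cosine similarity, and those documents are prepended to the prompt. The embed() function and the document strings are placeholders; in production the index would live in a vector database and the embeddings would come from a real model.

```python
# Framework-agnostic retrieval sketch for a RAG pipeline (placeholders only).
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding: hashes characters into a small unit vector."""
    vec = np.zeros(64)
    for i, ch in enumerate(text.lower()):
        vec[(i + ord(ch)) % 64] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

documents = [
    "Clock-tree synthesis notes for block A",
    "Timing-closure checklist for the memory controller",
    "Onboarding guide for the agent framework",
]
index = np.stack([embed(d) for d in documents])     # in-memory "vector store"

def retrieve(query: str, k: int = 2) -> list[str]:
    scores = index @ embed(query)                   # cosine similarity (unit vectors)
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

context = retrieve("How do I close timing on the memory controller?")
prompt = "Answer using this context:\n" + "\n".join(context)
# The prompt would then be passed to an LLM or agent framework.
```

The same shape carries over to frameworks such as LangChain or LangGraph, where the vector store and retriever are provided components rather than hand-rolled NumPy.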

Compensation & Additional Info

  • Base salary ranges (location- and level-dependent):
    • Level 3: 148,000 USD - 235,750 USD
    • Level 4: 184,000 USD - 287,500 USD
  • Eligible for equity and benefits (see NVIDIA benefits).
  • Applications are accepted at least until September 28, 2025.
  • Location: Santa Clara, CA, United States. #LI-Hybrid

Benefits

  • Equity eligibility and NVIDIA benefits (health, retirement, etc.).
  • Opportunity to work at the intersection of research, engineering, and product development on cutting-edge generative AI and chip design problems.