Senior AI Engineer, Nemo Retriever - Model Optimization and MLOps

at NVIDIA

📍 Santa Clara, United States

USD 184,000-356,500 per year

SENIOR

✅ On-site

Required Skills & Competences

Docker @ 3, Kubernetes @ 3, Python @ 4, MLOps @ 4, Helm @ 3, Microservices @ 4, NLP @ 7, LLM @ 4, PyTorch @ 4, OpenAPI @ 4, GPU @ 4

Details

NVIDIA's technology is at the heart of the AI revolution, powering solutions from self-driving cars and robotics to co-pilots and more. This role places you at the forefront of intelligent assistants and information retrieval, working with the NVIDIA NIM platform, which offers GPU-accelerated inferencing microservices for pre-trained and customized AI models across environments including clouds, data centers, and workstations.

NVIDIA NeMo Retriever supports building multimodal extraction, re-ranking, and embedding pipelines with strong accuracy and data privacy. It boosts AI applications such as advanced retrieval-augmented generation (RAG) and agentic AI workflows. The team is seeking an AI Engineer skilled in ML development, system optimization, and MLOps to tackle challenges in Generative AI, LLMs, MLLMs, and RAG using cutting-edge platforms.

Responsibilities

Develop and maintain NIM microservices that containerize optimized AI models, following OpenAPI standards, in Python or an equivalent performant language.
Collaborate with partner teams to gather requirements, build and evaluate proofs of concept, and develop production tool roadmaps.
Support the development of integrated systems called AI Blueprints to enable unified, turnkey experiences.
Build and maintain Continuous Delivery pipelines to facilitate faster and safer deployment while upholding operational standards.
Conduct peer reviews focused on performance, scalability, and correctness.

Requirements

Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or related fields, or equivalent experience.
8+ years of experience in similar roles.
Expertise in Python programming and deep learning frameworks such as PyTorch.
Experience delivering software in cloud environments with knowledge of cloud infrastructure patterns.
Familiarity with MLOps tools and technologies such as Docker Compose, containers, Kubernetes, and Helm, and with data center deployments.
Hands-on knowledge of ML libraries, especially PyTorch, TensorRT, and TensorRT-LLM.
Deep understanding of NLP, LLMs, MLLMs, Generative AI, and RAG workflows.
Self-starter mindset with enthusiasm for continuous learning and team knowledge sharing.
Highly motivated and curious about emerging technologies.

Benefits

NVIDIA offers competitive salaries and a comprehensive benefits package, and is recognized as a top-tier employer in technology. The role is eligible for equity and benefits. NVIDIA fosters diversity and is an equal opportunity employer.