Vacancy is archived. Applications are no longer accepted.

Senior LLM Systems Engineer, NeMo Microservices Platform

at Nvidia
SENIOR
✅ On-site

Required Skills & Competences

Security @ 4, Python @ 4, Machine Learning @ 4, gRPC @ 4, Protobuf @ 4, Rust @ 4, Microservices @ 4, Debugging @ 4, HTTP @ 4, JSON @ 4, NLP @ 4, LLM @ 4, GPU @ 4

Details

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPUs act as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work.

We are looking for a Senior LLM Systems Engineer to help build our NeMo Microservices Platform. Our team is building the primitives that let software engineers train and deploy AI at scale. We are dedicated to developing speech and NLP technologies that tackle real problems. We contribute to every step of the machine learning lifecycle, from conceptualization and applied research to engineering for optimized inference and deployment.

Responsibilities

  • Developing distributed cloud applications, microservices, and a platform that can scale to very large models
  • Creating microservices for task-specific AI cloud services
  • Implementing core infrastructure for cloud-native AI evaluation
  • Improving service stability, observability, and reliability
  • Pursuing speed-of-light performance under high load

Requirements

  • BS, Master's, or equivalent experience in computer science, computer architecture, or a related field
  • 5+ years of experience
  • Ability to work independently, define project goals and scope, interact directly with the open-source community, and manage your own development effort
  • Experience implementing microservices and cloud-native applications using HTTP REST, gRPC, protobuf, JSON and related technologies
  • Understanding of performance, security, and reliability in complex distributed infrastructure
  • Excellent Python programming and software design skills, including debugging, performance and service health analysis, and test design

Ways to Stand Out from the Crowd

  • Background in the enterprise software life cycle
  • Experience in deep learning research, particularly model evaluation techniques
  • Experience with Rust or Golang

Benefits

  • Eligible for equity and additional benefits

Salaries are based on location, experience, and comparable pay in similar positions.