Senior Systems Software Engineer, TAO Machine Learning Data Modeling

at NVIDIA
USD 148,000-287,500 per year
SENIOR
✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Docker @ 6, Kubernetes @ 6, ETL @ 4, Algorithms @ 4, Machine Learning @ 4, Hiring @ 4, LLM @ 3, PyTorch @ 4

Details

NVIDIA is hiring a Senior Systems Software Engineer to join the TAO Toolkit ML Data and Platforms Team. The team builds frameworks, services, algorithms, and tools that power large NVIDIA multi-modal foundation models and their customization. In this role you will develop algorithms and scalable systems that automatically make sense of petabytes of unstructured data using machine learning and deep learning, collaborating with deep learning architects and engineers to enable pioneering AI models.

Responsibilities

  • Use scalable systems to find and create the right data for multi-modal models, including synthetic data generated with GenAI and simulation.
  • Design ML and DL architectures and loss functions to formulate automated pseudo-labeling and GenAI solutions for multi-modal tasks.
  • Design and develop active and passive learning paradigms (in-loop and out-of-loop annotators) to iteratively mine informative data.
  • Design insightful evaluation metrics (for unsupervised, semi-supervised, and supervised settings) to characterize model and data performance.
  • Build scalable and robust ETL pipelines using ML and DL models to deliver high-quality datasets.
  • Collaborate with internal teams to define requirements, enhance products, and automate workflows.

Requirements

  • Bachelor’s degree in Computer Engineering, Computer Science, Electrical Engineering, Robotics, or related field (or equivalent experience).
  • 5+ years of ML/DL-related engineering experience with strong architecture and design skills.
  • Excellent foundational understanding of machine learning and deep learning.
  • Strong understanding of perception systems (2D, 3D, and/or temporal).
  • Expertise in out-of-distribution concepts and related issues.
  • Knowledge of PyTorch, distributed machine learning, and distributed file systems.
  • 5+ years leading complex, sometimes ambiguous projects, particularly for high-throughput services at supercomputing scale.

Ways to stand out / Nice to have

  • Familiarity with multiple perception domains: object detection, segmentation, multiple object tracking, metric learning.
  • Knowledge of internal workings of diffusion models.
  • Familiarity with 3D geometrical aspects of simulation and inverse computer graphics.
  • Proficiency in running applications on cloud platforms using Kubernetes and Docker, and in using ML frameworks like PyTorch.
  • Experience building systems and familiarity with deep learning tooling such as NVIDIA TensorRT-LLM, Multimodal-LLM, and Triton Server.

Compensation & Benefits

  • Base salary ranges (determined by location, experience, and comparable roles):
    • Level 3: 148,000 USD - 235,750 USD
    • Level 4: 184,000 USD - 287,500 USD
  • Eligible for equity and benefits.

Additional information

  • Applications accepted at least until August 3, 2025.
  • NVIDIA is an equal opportunity employer and is committed to fostering a diverse work environment.