Senior Deep Learning Systems Engineer, Datacenters

at NVIDIA

📍 Santa Clara, United States

USD 184,000-356,500 per year

SENIOR

✅ Hybrid

Required Skills & Competences

Software Development @ 4
Docker @ 4
Linux @ 4
Python @ 4
TensorFlow @ 4
Bash @ 4
Networking @ 4
Performance Monitoring @ 4
System Architecture @ 7
PyTorch @ 4
CUDA @ 4
GPU @ 4

Details

As NVIDIA makes inroads into the Datacenter business, this team is focused on maximizing performance and power efficiency of deep learning applications on datacenter-class hardware and establishing data-driven approaches to hardware design and system software development.

Responsibilities

Develop software infrastructure to characterize and analyze a broad range of Deep Learning applications.
Evolve cost-efficient datacenter architectures tailored to meet the needs of Large Language Models (LLMs).
Work with experts to develop analysis and profiling tools in Python, Bash, and C++ that measure key performance metrics of DL workloads running on NVIDIA systems.
Analyze system and software characteristics of DL applications (CPU, GPU, networking, IO interactions with DL workloads).
Develop analysis tools and methodologies to measure key performance metrics and estimate potential for efficiency improvement.

Requirements

Bachelor’s degree in Electrical Engineering or Computer Science or equivalent experience (Master's or PhD preferred).
8 years or more of relevant experience.
Experience in at least one of:
- System Software: Operating Systems (Linux), Compilers, GPU kernels (CUDA), DL Frameworks (PyTorch, TensorFlow).
- Silicon Architecture and Performance Modeling/Analysis: CPU, GPU, Memory or Network Architecture.
Programming experience in C/C++ and Python. Exposure to Bash scripting.
Exposure to containerization platforms (Docker) and datacenter workload managers (Slurm) is a plus.
Deep understanding of computer system architecture and performance analysis with demonstrated hands-on experience.
Demonstrated ability to work in virtual/multi-site environments and to take ownership from start to finish.

Ways to stand out

Background with system software, OS intrinsics, GPU kernels (CUDA), or DL frameworks (PyTorch, TensorFlow).
Experience with silicon performance monitoring or profiling tools (e.g., perf, gprof, nvidia-smi, dcgm).
In-depth performance modeling experience in CPU, GPU, Memory or Network Architecture.
Exposure to containerization platforms (Docker) and datacenter workload managers (Slurm).
Prior experience working with multi-site or cross-functional teams.

Compensation & Additional Info

Base salary ranges (determined by location, experience, and comparable roles):
- Level 4: 184,000 USD - 287,500 USD
- Level 5: 224,000 USD - 356,500 USD
Eligible for equity and benefits.
Applications accepted at least until July 29, 2025.

Benefits

NVIDIA benefits (details available on NVIDIA website).