Senior Data Center Performance Engineer - Benchmarking and Optimization
at NVIDIA
USD 184,000-356,500 per year
Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level.
About proficiency levels:
- 1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;
- 3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology;
- 7-9: expert. You can teach others and you know the pitfalls and tricks;
- 10: exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.
Docker @ 4
Kubernetes @ 4
Linux @ 7
Python @ 4
TensorFlow @ 4
Networking @ 4
Parallel Programming @ 3
Performance Monitoring @ 4
Performance Optimization @ 4
System Architecture @ 7
PyTorch @ 4
CUDA @ 3
GPU @ 3
Details
NVIDIA is expanding its data center platform ecosystem from single-node HGX/DGX systems to large multi-node NVLink domain rack architectures. These platforms combine NVIDIA GPUs, NVLink, InfiniBand networking, Grace CPUs, and an optimized AI/HPC software stack. This role leads performance benchmarking and optimization efforts to ensure data center solutions deliver industry-leading performance for AI training, inference, and HPC workloads at scale.
Responsibilities
- Design and execute comprehensive performance benchmarking strategies for data center platforms and products.
- Characterize real-world AI training, inference, and HPC workloads at scale.
- Define, track, and report key performance indicators (throughput, latency, efficiency, scaling).
- Build automation tools and frameworks for performance monitoring and analysis.
- Identify and analyze performance bottlenecks across compute, memory, network, and storage subsystems.
- Work closely with architecture, hardware, software, networking, storage, and customer teams to resolve performance issues.
- Drive performance improvements through system tuning, configuration optimization, and architectural recommendations for future systems.
Requirements
- M.S. or Ph.D. in Computer Science, Electrical Engineering, or a related field (or equivalent experience).
- 8+ years of experience in performance engineering or system architecture.
- Deep understanding of computer architecture, hardware-software interaction, and computing at scale.
- Strong proficiency with performance profiling tools (Linux perf, NVIDIA Nsight Systems).
- Familiarity with GPU computing and parallel programming (CUDA).
- Background with HPC networking technologies (InfiniBand, RoCE, NVLink).
- Programming skills in Python, C++, and shell scripting.
- Excellent analytical and problem-solving abilities; adaptability and passion to learn new technologies.
- Ability to communicate effectively and work with cross-functional global teams.
Ways to Stand Out
- Experience with AI/ML frameworks (PyTorch, TensorFlow, JAX).
- Knowledge of MPI, collective communications (NCCL), and distributed training/inference.
- Familiarity with NVIDIA DGX/HGX platforms and other data center solutions.
- Experience with containers, cloud provisioning and scheduling tools (Docker, Kubernetes, SLURM).
- Understanding of storage systems and I/O performance.
- Track record of performance optimization in production environments.
- Experience with AI code generation tools.
Compensation and Benefits
- Base salary ranges by level:
  - Level 4: 184,000 USD - 287,500 USD
  - Level 5: 224,000 USD - 356,500 USD
- Eligible for equity and benefits.
Location & Schedule
- Location: Santa Clara, California, United States.
- Full-time role (hours not specified in the posting; 40 hours/week assumed).
Additional Information
- Applications accepted through at least December 20, 2025.
- NVIDIA is an equal opportunity employer and commits to fostering a diverse work environment.