Senior Storage Performance Engineer
at NVIDIA
Santa Clara, United States
USD 200,000-322,000 per year
Used Tools & Technologies
Not specified
Required Skills & Competences
Grafana @ 4, Kubernetes @ 3, Prometheus @ 4, Python @ 6, TensorFlow @ 4, Bash @ 6, Communication @ 4, PyTorch @ 4, GPU @ 4
Details
NVIDIA is seeking a Senior Storage Performance Engineer to join the team in Santa Clara, CA. The role focuses on creating, implementing, and analyzing complex performance benchmarks to optimize NVIDIA's infrastructure stack. Your work will directly impact AI inference and training, NVIDIA NIMs, retrieval-augmented generation (RAG) pipelines, HPC applications, and storage platforms.
Responsibilities
- Craft and deliver performance benchmarks across AI, HPC, and enterprise storage platforms.
- Test and benchmark storage appliances (block, file, object) against NVIDIA data center solutions.
- Operate and tune AI inference and training workloads using tools such as PyTorch, TensorFlow, and NVIDIA NIMs.
- Benchmark and analyze RAG pipelines (ingestion, retrieval, inference) including vector database interactions.
- Profile and optimize MPI-based and multi-node distributed applications.
- Collaborate with product managers, system architects, and partners to fine-tune hardware/software stack performance.
Requirements
- 12+ years of experience in performance engineering, benchmarking, or HPC/AI systems.
- Deep expertise in AI/ML and deep learning frameworks: PyTorch, TensorFlow, Triton.
- Strong background in storage systems and filesystems (block, file, object storage).
- Proven experience with MPI, OpenMP, and Slurm in large-scale compute environments.
- Proficiency in Python, Bash, and automation frameworks for job orchestration and results parsing.
- Experience profiling and optimizing multi-node, distributed HPC applications.
- Excellent communication skills and the ability to switch between deep technical work and high-level business impact.
- BS, MS, or PhD (or equivalent experience) in Computer Science, Electrical Engineering, or related field.
Ways to stand out
- Experience with RAG pipelines and vector databases (FAISS, Milvus, Qdrant).
- Familiarity with Kubernetes and CSI-based persistent storage systems.
- Knowledge of GPU profiling tools such as Nsight Systems and PyTorch Profiler.
- Experience with telemetry and monitoring frameworks (Prometheus, Grafana).
- Demonstrated enthusiasm for exploring the boundaries of AI, HPC, and storage capabilities.
Benefits / Additional information
- Base salary range (location and experience dependent): 200,000 USD - 322,000 USD.
- Eligible for equity and company benefits (link to NVIDIA benefits referenced in original posting).
- Applications accepted at least until September 29, 2025.
- NVIDIA is an equal opportunity employer committed to diversity and inclusion.