Senior Developer Technology Engineer, CPU Performance

at NVIDIA
USD 152,000-287,500 per year
SENIOR
✅ Hybrid

Used Tools & Technologies

Not specified

Required Skills & Competences

  • Spark @ 4
  • Algorithms @ 4
  • Data Structures @ 4
  • Communication @ 4
  • Networking @ 4
  • Parallel Programming @ 4
  • Prioritization @ 4
  • GPU @ 4
  • AI @ 4

Details

We are currently seeking a Senior Developer Technology Engineer, CPU Performance at NVIDIA. You will research and develop techniques to accelerate large-scale applications running on NVIDIA’s advanced CPU platforms, analyze and optimize complex database and data analytics workloads, and influence next-generation hardware and software designs. The role involves publishing findings, presenting to the developer community, and collaborating with internal research, hardware, system software, libraries, and tools teams.

Responsibilities

  • Research and develop techniques to accelerate large-scale applications on NVIDIA CPU platforms.
  • Perform in-depth analysis and optimization of complex database and data analytics workloads to ensure the best possible performance on modern CPU-focused hardware architectures.
  • Investigate hardware and system bottlenecks and optimize performance of critical applications on heterogeneous systems (CPUs and GPUs).
  • Work directly with external and internal technical experts (industry and academia) to design parallel algorithms and implement optimizations.
  • Publish and present discovered optimization techniques in developer blogs or relevant conferences to engage and educate the developer community.
  • Influence the design of next-generation hardware architectures, software, and programming models in collaboration with research, hardware, system software, libraries, and tools teams.

Requirements

  • Master's or PhD in Computer Science, Computer Engineering, or a related computationally focused science degree (or equivalent experience).
  • At least 5 years of relevant work or research experience.
  • Expert knowledge of modern CPU architectures (ARM, x86) and system/OS concepts.
  • In-depth expertise with CPU architecture fundamentals, especially the memory subsystem (cache, DRAM, storage).
  • Hands-on experience with low-level parallel programming, vectorization, CPU intrinsics, and concurrent data structures.
  • Programming fluency in modern C/C++ with a deep understanding of algorithms, concurrency, and optimization techniques.
  • Good communication and organization skills, with logical problem-solving and prioritization abilities.

Ways to Stand Out

  • Experience optimizing the performance of distributed database systems and frameworks (e.g., production databases or Spark).
  • Background with compression, storage systems, networking, and distributed computer architectures.
  • Knowledge of GPU architectures.

Compensation & Benefits

  • Base salary ranges: $152,000 - $241,500 USD for Level 3, and $184,000 - $287,500 USD for Level 4.
  • Eligible for equity and benefits (link to NVIDIA benefits provided in the posting).

Additional Information

  • #LI-Hybrid (hybrid office policy).
  • Applications accepted at least until March 21, 2026. This posting is for an existing vacancy.
  • NVIDIA uses AI tools in its recruiting processes and is an equal opportunity employer committed to diversity.