Senior Accelerated Computing Architect

at NVIDIA
USD 184,000-356,500 per year
SENIOR
✅ On-site

Used Tools & Technologies

HPC

Required Skills & Competences

  • Python @ 3
  • Algorithms @ 4
  • Data Structures @ 4
  • Machine Learning @ 4
  • Communication @ 3
  • Prioritization @ 4
  • Performance Optimization @ 4
  • API @ 3
  • CUDA @ 4
  • GPU @ 4
  • AI @ 4
  • Profiling @ 4
  • OpenCL @ 4

Details

NVIDIA is developing software and system architectures for accelerated high-performance computing, scientific computing, machine learning, AI, datacenter, and automotive computing. This position offers the opportunity to make a meaningful impact in a fast-moving, technology-focused company.

Responsibilities

  • Perform in-depth analysis and optimization to ensure the best possible performance on current and/or next-generation NVIDIA GPUs.
  • Create and optimize core parallel algorithms, data structures, and reference codes to provide the best possible solutions for NVIDIA GPUs.
  • Understand and analyze how hardware and software architectures interact and how that interplay affects core algorithms, programming models, and applications.
  • Actively collaborate with hardware design, software engineering, product, and research teams to guide the direction of accelerated computing.
  • Dive into accelerated computing applications to facilitate software-hardware co-design.
  • Write and present work via white papers, conference publications, official blog posts, patent applications, etc., as appropriate.

Requirements

  • MS or Ph.D. in Computer Science, Computer Engineering, Electrical Engineering, or equivalent experience.
  • 6+ years of relevant work experience.
  • Strong mathematical fundamentals, including linear algebra and numerical methods.
  • A passion for performance optimization.
  • Hands-on experience with a massively parallel GPU programming model (e.g., CUDA or OpenCL). Familiarity with multi-node communication APIs such as MPI or OpenSHMEM/NVSHMEM is a plus.
  • Strong knowledge of C and C++ with a solid understanding of software design, programming techniques, and algorithms. Familiarity with threading APIs for multicore CPUs and Unix-style inter-process communication (IPC) APIs is a plus.
  • Familiarity with Python is a plus.
  • Good communication and organization skills, logical problem-solving, time management, and task prioritization.
  • Experience benchmarking, profiling, and characterizing workloads on GPU and CPU clusters.

Compensation & Benefits

  • Base salary ranges (location, experience, and level dependent):
    • Level 4: 184,000 USD - 287,500 USD
    • Level 5: 224,000 USD - 356,500 USD
  • Eligible for equity and company benefits.

Additional Information

  • Applications for this job will be accepted at least until May 11, 2026.
  • This posting is for an existing vacancy.
  • NVIDIA uses AI tools in its recruiting processes.
  • NVIDIA is an equal opportunity employer and is committed to fostering a diverse work environment.