Senior Solutions Architect, HPC Systems Engineer

at NVIDIA
World
United States
USD 184,000-287,500 per year
SENIOR
Hybrid

Used Tools & Technologies

Not specified

Required Skills & Competences

Docker @ 4, Kubernetes @ 4, Linux @ 4, DevOps @ 4, MLOps @ 4, Communication @ 7, Networking @ 4, Product Management @ 4, Debugging @ 4, CUDA @ 1, GPU @ 4

Details

NVIDIA is looking for an experienced GPU and network systems Solutions Architect & Engineer to help bring new AI hardware and software technologies to production in customer data centers. As part of the NVIDIA SA organization, you will drive the integration and deployment of end-to-end technology solutions at strategic customers and provide recommendations to business and engineering teams on the product roadmap.

Responsibilities

  • Work with NVIDIA AI Native, Consumer Internet and Enterprise customers on large data center GPU server and networking system deployments as a Solution Architect Engineer.
  • Guide customer discussions on network design and compute/storage, and support bring-up of server/network/cluster deployments. Visit customer data centers during bring-up phases.
  • Demonstrate subject matter expertise in advanced GPU and network systems and act as a trusted technical advisor to NVIDIA's strategic customers.
  • Bring customer-specific requirements to product teams to influence product roadmap features.
  • Identify new project opportunities for NVIDIA products and technology solutions in data center and AI applications.
  • Work closely with GPU/Network Systems Engineering, Product Management and Sales teams.
  • Conduct regular technical customer meetings for product roadmap reviews, cluster debugging, feature discussions and introductions to new technology solutions.
  • Build custom product demonstrations and POCs addressing critical business needs.
  • Analyze and debug compute/network configuration and performance issues to deliver performant clusters.
  • Use conferencing tools extensively; travel required for on-site customer visits and industry events (~20%).

Requirements

  • BS/MS/PhD in Electrical/Computer Engineering, Computer Science, Physics, or other Engineering fields, or equivalent experience.
  • 8+ years of Systems/Solution Engineering (or similar engineering roles) experience is ideal.
  • System-level expertise in CPU/GPU server architecture, NICs, Linux, system software and kernel drivers.
  • Experience with networking switches for Ethernet/InfiniBand and data center infrastructure (power/cooling).
  • Knowledge of DevOps/MLOps technologies such as Docker/containers and Kubernetes.
  • Strong systems engineering, coding and debugging skills; experience with C/C++ and Linux kernel/drivers is valued.
  • Hands-on experience with NVIDIA GPU systems/SDKs (e.g., CUDA), NVIDIA networking technologies (NICs, RoCE, InfiniBand), and/or ARM CPU solutions is a plus.
  • Familiarity with virtualization technology concepts.
  • Effective time management and ability to balance multiple tasks.
  • Strong verbal and written communication skills; able to share ideas and code clearly via documents and presentations.

Ways to stand out

  • External customer-facing background.
  • Experience with bring-up and deployment of large clusters.
  • Systems engineering, coding and debugging including C/C++ and Linux kernel/drivers.
  • Hands-on experience with NVIDIA GPU systems/SDKs (CUDA), NVIDIA Networking (NICs, RoCE, InfiniBand), and ARM CPU solutions.
  • Familiarity with virtualization.

Compensation & Benefits

  • Base salary range: 184,000 USD - 287,500 USD (determined based on location, experience, and pay of employees in similar positions).
  • Eligible for equity and benefits.

Other details

  • Occasional travel (~20%) for on-site customer visits and industry events.
  • Applications accepted at least until July 29, 2025.
  • NVIDIA is an equal opportunity employer and committed to fostering a diverse work environment.