Senior HPC Support Engineer, InfiniBand - NVLink

at NVIDIA

📍 Seattle, United States

USD 108,000-201,250 per year

SENIOR

✅ On-site

Required Skills & Competences

Marketing @ 4, System Administration @ 4, Linux @ 4, Python @ 4, R @ 4, AWS @ 4, Bash @ 4, Networking @ 4, Debugging @ 4, Customer Support @ 6, ChatGPT @ 4, GPU @ 4

Details

We are seeking a motivated Senior HPC Technical Support Engineer - AI Infrastructure, focusing on InfiniBand, NVLink, and AI GPU cluster technology. You are passionate about data center and networking technologies and will provide comprehensive solutions for sophisticated installations, maintenance, or operations across a broad range of groundbreaking networking products. As a primary point of contact for our customers, you will assist them with technical questions and with debugging and resolving their issues. As a member of our Technical Support team, you are a conscientious, proficient communicator who takes ownership of resolving issues while ensuring a high level of customer satisfaction. The role also involves regular interaction with Engineering, Marketing, and Support teams on technical matters.

Responsibilities

Resolve sophisticated customer concerns and technical issues through research, reproduction, and problem solving for customers installing and supporting systems using Linux (multi-distro), with focus on NVIDIA InfiniBand, NVLink, GPU technology and end-to-end solutions.
Respond to customer product support inquiries via telephone, email, or conference calls.
Resolve customer issues during installation, operation, maintenance, or with product application/interoperability with other vendors.
Participate in cross-functional team meetings and provide feedback to engineering and marketing regarding product requirements, customer experience, and support tools.
Act as a technical resource: develop, refine, and document standard methodologies and share them with internal teams (Support/R&D) to improve support processes.
Conduct site visits and conference calls with customers.

Requirements

5+ years providing in-depth customer support and debugging for hardware and software products.
Exceptional interpersonal skills and ownership of issue resolution for critical customer problems.
Linux OS experience including system administration and networking (LFCS / RHCSA level).
Knowledge of networking technologies, protocols, and routing, including IP, L2 and L3 (CCNP / CompTIA Network+ / Cloud+ level).
Experience with containerized solutions (DCA and/or CKA level), virtualization (KVM / ESXi), and cloud infrastructure (AWS / OCI).
Ability to debug networking protocols using tools such as tcpdump and Wireshark or similar packet-generation and analysis tools.
Bash and Python scripting abilities.
Strong organizational skills; able to prioritize and multi-task with limited supervision.
Experience integrating AI tools (Cursor, Gemini, ChatGPT, Copilot, Glean, etc.) into your daily workflow.
Four-year degree in Computer Science, or Electrical or Computer Engineering, or equivalent experience.

Ways to stand out

NVIDIA certifications related to AI infrastructure, operations and networking.
Deep knowledge of InfiniBand, RDMA, NVLink and NVIDIA GPU technology.
Experience with clustering or HPC data-center technologies including upper-layer protocols (MPI, NCCL).
Additional OS experience such as Microsoft Windows, VMware, Unix.
Configuration and operational expertise with traditional network switch/router and open platforms.

Compensation & Benefits

Base salary range (determined by location, experience, and comparator pay):
- Level 3: 108,000 USD - 172,500 USD
- Level 4: 120,000 USD - 201,250 USD
Eligible for equity and benefits (see NVIDIA benefits page: https://www.nvidia.com/en-us/benefits/).

Other details

Applications for this job will be accepted at least until September 12, 2025.
NVIDIA is an equal opportunity employer committed to fostering a diverse work environment and does not discriminate on the basis of legally protected characteristics.