NCX Engineer, AI Accelerator

at NVIDIA
USD 184,000-356,500 per year
MIDDLE
✅ On-site

Used Tools & Technologies

Machine Learning, LLM

Required Skills & Competences

Ansible @ 3, Go @ 3, Grafana @ 2, Kubernetes @ 3, Linux @ 6, Prometheus @ 2, DevOps @ 6, Terraform @ 3, Python @ 3, CI/CD @ 2, Distributed Systems @ 3, MLOps @ 3, TensorFlow @ 3, Leadership @ 3, Networking @ 3, CRM @ 3, ServiceNow @ 3, API @ 3, PyTorch @ 3, Salesforce @ 3, OpenTelemetry @ 2, CUDA @ 3, GPU @ 3, Observability @ 3, AI @ 3, InfiniBand @ 3

Details

NVIDIA is seeking an NCX Engineer, AI Accelerator to join the AI Accelerator team and collaborate closely with strategic customers to implement and enhance advanced AI workloads. You will deliver hands-on technical assistance for large-scale AI deployments and distributed systems, helping customers get efficient performance from NVIDIA's AI platform across varied environments and partner platforms.

Responsibilities

  • Build and deploy custom AI solutions on NVIDIA Cloud Partner (NCP) and Neo Cloud platforms, including distributed training, inference optimization, and MLOps pipelines based on NVIDIA reference architectures.
  • Act as the primary technical contact for strategic NCP customers: provide remote and on-site support, troubleshoot complex production issues, and guide partner engineering teams on NVIDIA platform guidelines.
  • Deploy and manage AI workloads across DGX Cloud, NCP data centers, and major cloud service provider environments using Kubernetes, containers, and GPU scheduling systems aligned to NCP builds.
  • Profile and tune large-scale training and inference workloads to reduce latency, cost, and operational risk, and implement observability and SLO/SLA monitoring.
  • Implement and extend NVIDIA reference architectures on partner platforms; develop integrations with partner control planes and customer environments to ensure API, data pipeline, and enterprise software connectivity.
  • Produce implementation guides, runbooks, and post-mortem documentation that codify standard methodologies for running NVIDIA AI workloads at scale on NCP platforms.

Requirements

  • BS, MS, or Ph.D. in Computer Science, Computer/Electrical Engineering, or a related technical field, or equivalent experience.
  • 8+ years of experience in customer-facing technical roles such as Solutions Engineering, DevOps, Site Reliability, or ML Infrastructure Engineering, ideally supporting large-scale cloud or service-provider environments.
  • Strong expertise in Linux systems and distributed computing.
  • Experience with Kubernetes, containers, and GPU scheduling on multi-tenant or service-provider platforms.
  • Demonstrated AI/ML experience supporting large-scale training and inference workloads (LLMs, generative models, recommendation systems) in production or critical environments.
  • Solid programming skills in Python and Go, with hands-on experience using frameworks such as PyTorch or TensorFlow for training and serving.
  • Ability to collaborate with customer and partner engineering teams, lead complex technical investigations to root cause, and communicate architectures and recommendations to engineering and leadership audiences.

Ways to stand out

  • Experience with the NVIDIA ecosystem: DGX systems, CUDA, NeMo, Triton, NIM, and NVIDIA networking technologies (InfiniBand, RoCE).
  • Direct experience collaborating with NVIDIA Cloud Partners, hyperscale CSPs, or managed AI cloud platforms, and implementing NVIDIA reference architectures for AI infrastructure.
  • Deep familiarity with MLOps and cloud-native practices: containerization, CI/CD pipelines, observability stacks (Prometheus, Grafana, OpenTelemetry), and GitOps workflows.
  • Experience with infrastructure-as-code tools (Terraform, Ansible) for repeatable deployment and configuration of GPU-accelerated clusters and NCP building blocks.
  • Experience integrating AI platforms with enterprise systems (Salesforce, ServiceNow, other ITSM/CRM) to support end-to-end customer solutions and managed services.

Compensation & Benefits

  • Base salary ranges (location-, level-, and experience-dependent):
    • Level 4: 184,000 USD - 287,500 USD
    • Level 5: 224,000 USD - 356,500 USD
  • You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until May 9, 2026. NVIDIA uses AI tools in its recruiting processes and is an equal opportunity employer committed to diversity.