Solution Architect - OEM AI Software

at Nvidia
📍 United States
USD 120,000 - 235,750 per year
Seniority: Middle
✅ Remote


Used Tools & Technologies

Not specified

Required Skills & Competences

Software Development (3), Docker (2), Kubernetes (2), TensorFlow (3), Communication (3), Mathematics (3), LLM (3), PyTorch (3), GPU (3)

Details

NVIDIA is seeking an outstanding Solutions Architect to help grow our OEM enterprise AI business. You will become a trusted technical advisor to OEM partners and work on software solutions that enable enterprise Generative AI workflows. This role requires hands-on experience with Generative AI, LLMs, deep learning, and GPU technologies, and involves collaboration across sales, engineering, and partner teams to design, deploy, and optimize production AI solutions.

Responsibilities

  • Architect enterprise-grade end-to-end generative AI software solutions for OEM partners.
  • Collaborate closely with OEM partners' software development teams to craft joint AI solutions.
  • Support pre-sales activities including technical presentations and demonstrations of Generative AI capabilities.
  • Work with NVIDIA engineering teams to provide feedback and contribute to the evolution of generative AI software.
  • Engage directly with customers and partners to understand requirements and challenges.
  • Lead workshops and design sessions to define and refine generative AI solutions, with emphasis on enterprise workflows.
  • Implement strategies for efficient and effective training of LLMs to achieve peak performance.
  • Design and implement RAG-based workflows to improve content generation and information retrieval (a minimal sketch follows this list).
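
For illustration only, here is a minimal sketch of the kind of RAG workflow the last item refers to, assuming sentence-transformers for embeddings. The document corpus, the embedding model name, and the generate() stub are placeholders, not details from the posting.

```python
# Minimal retrieval-augmented generation (RAG) sketch.
# Assumptions: sentence-transformers is installed; the embedding model name is
# illustrative; generate() stands in for whatever LLM endpoint is actually used.
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "NVIDIA NIM provides prebuilt inference microservices.",
    "Triton Inference Server serves models over HTTP and gRPC.",
    "TensorRT-LLM optimizes large language models for GPU inference.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice
doc_vectors = embedder.encode(documents, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (cosine similarity)."""
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ q
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

def generate(prompt: str) -> str:
    """Placeholder for an LLM call (e.g. a NIM or Triton endpoint)."""
    raise NotImplementedError

def answer(question: str) -> str:
    # Ground the generation step in the retrieved context.
    context = "\n".join(retrieve(question))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)
```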

Requirements

  • 3-5+ years of hands-on experience as a solution architect or in a similar role, with a focus on AI solutions.
  • BS, MS, or PhD in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, other Engineering or related fields (or equivalent experience).
  • Proven track record deploying and optimizing Generative AI models for inference in production.
  • Expertise in training and fine-tuning LLMs using frameworks such as TensorFlow, PyTorch, or Hugging Face Transformers (see the fine-tuning sketch after this list).
  • Proficiency in model deployment and optimization techniques for efficient inference on various hardware platforms, with emphasis on GPUs.
  • Solid understanding of GPU cluster architecture and parallel processing for accelerated training and inference.
  • Excellent communication and collaboration skills; ability to articulate complex technical concepts to technical and non-technical stakeholders.
  • Experience leading workshops, training sessions, and presenting technical solutions to diverse audiences.
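
For illustration, a minimal causal-LM fine-tuning sketch with Hugging Face Transformers, one of the frameworks named above. The model name, dataset file, and training arguments are placeholder assumptions, not specifics from the posting.

```python
# Minimal causal-LM fine-tuning sketch with Hugging Face Transformers.
# Assumptions: model and dataset names are illustrative placeholders and a
# single GPU is available; real work would add evaluation, LoRA, parallelism, etc.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "gpt2"  # placeholder; swap in the target LLM
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Illustrative plain-text training corpus.
dataset = load_dataset("text", data_files={"train": "corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="finetuned-model",
        per_device_train_batch_size=4,
        num_train_epochs=1,
        fp16=True,  # mixed precision on GPU
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```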

Ways To Stand Out

  • Experience deploying Generative AI models in cloud and on-premises infrastructures.
  • Experience with NVIDIA GPUs and software libraries such as NVIDIA NIM, NVIDIA NeMo, NVIDIA Triton Inference Server, TensorRT, and TensorRT-LLM (a minimal client sketch follows this list).
  • Proven ability to optimize LLMs for inference speed, memory efficiency, and resource utilization.
  • Familiarity with Docker/containerization and Kubernetes for scalable model deployment.
  • Deep understanding of GPU cluster architecture, parallel computing, and distributed computing concepts.
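
For illustration, a minimal sketch of querying a model served by NVIDIA Triton Inference Server over HTTP. The model name, tensor names, shape, and dtype are placeholder assumptions and would need to match the deployed model's configuration; in a Docker/Kubernetes deployment the same client would simply target the service's exposed HTTP port.

```python
# Minimal Triton Inference Server HTTP client sketch.
# Assumptions: a Triton server is already running on localhost:8000; the model
# name ("my_llm") and tensor names ("input_ids", "logits") are illustrative.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Illustrative input: one batch of 128 token IDs.
input_ids = np.random.randint(0, 32000, size=(1, 128), dtype=np.int64)

infer_input = httpclient.InferInput("input_ids", list(input_ids.shape), "INT64")
infer_input.set_data_from_numpy(input_ids)

result = client.infer(
    model_name="my_llm",
    inputs=[infer_input],
    outputs=[httpclient.InferRequestedOutput("logits")],
)
print(result.as_numpy("logits").shape)
```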

Compensation & Benefits

  • Base salary ranges by level:
    • Level 2: 120,000 USD - 189,750 USD
    • Level 3: 148,000 USD - 235,750 USD
  • You will also be eligible for equity and benefits. Exact base salary will be determined based on location, experience, and internal pay parity.

Additional Information

  • Locations indicated: US (TX) and Remote.
  • Employment type: Full time.
  • Applications will be accepted at least until July 29, 2025.
  • NVIDIA is an equal opportunity employer committed to fostering a diverse work environment.