Principal Engineer, Performance Analysis - AI Applications and Services

at Nvidia

📍 Santa Clara, United States

$272,000-419,800 per year

SENIOR
✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

  • Kubernetes @ 4
  • Linux @ 8
  • Algorithms @ 4
  • Data Structures @ 4
  • Distributed Systems @ 4
  • Communication @ 7

Details

We are seeking a highly motivated performance engineer to join our AI Applications organization to work on distributed, cloud-native, accelerated video analytics applications. Our team is building distributed, cloud-native, accelerated platforms for real-time video streaming AI inference and video analytics, running on the edge and in the cloud in a Kubernetes environment as part of the Metropolis ecosystem. As a performance engineer, you will work with the application teams to understand their architecture, profile the applications, identify bottlenecks, and optimize. You will build a good understanding of application resource utilization characteristics across CPU, GPU, and network accelerators. A good understanding of distributed systems performance is a must to scale these applications across multiple CPU and GPU nodes. Your duties include collecting data on the applications you are optimizing, identifying areas for improvement, and developing strategies to deliver those improvements.

Responsibilities

  • You will plan, enable and drive performance initiatives across our Cloud Native application teams.
  • Review, develop, deploy and manage tools and strategies to systematically run performance experiments.
  • Collect and organize performance data and share with key partners.
  • Work closely with application teams to understand application resource utilization characteristics. Identify performance issues through profiling of the various components.
  • You will develop a good understanding of the various accelerators available in the system for an application workload and recommend E2E performance optimizations that match the capabilities of the system.
  • You will advise developers and product teams on the best accelerators and systems for E2E system performance.
  • Improve and standardize performance measurement processes across our applications and GPU systems.
  • Work closely with GPU cloud native teams at Nvidia to deploy the latest and most optimal GPU resource sharing strategies for our applications in a Kubernetes environment.

Requirements

  • Master's degree or PhD in Computer Science or a related field, or equivalent experience.
  • 15+ years of experience in system design optimization, complexity analysis, and software design on Unix/Linux systems, including resolving performance and application issues.
  • Experience in real-time streaming AI inference systems.
  • A history of working on distributed accelerated systems and solving sophisticated performance problems.
  • Deep hands-on experience with distributed systems based on Kubernetes.
  • Experience with on-prem and cloud systems and ability to work with partners across multiple teams.
  • Experience using, managing, and optimizing modern cloud and container-based enterprise computing architectures.
  • Strong verbal and written communication and teamwork skills.
  • Ability to multitask effectively in a multifaceted environment; action-driven, with strong analytical skills.

Ways To Stand Out From The Crowd

  • Background in real-time computer vision AI inference and/or analytics platforms.
  • Experience in application issues, algorithms, and data structures.
  • Understanding of how AI services and deep learning work.
  • Exposure to scheduling and resource management systems.
  • Knowledge of GPU programming, such as OpenCL or CUDA, and of multi-node GPU setups, GPU clusters, or cloud computing.