Senior Systems Engineer - AV Infrastructure Cloud Platform

at NVIDIA

📍 Santa Clara, United States

USD 184,000-356,500 per year

SENIOR

✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Security @ 4, Go @ 7, Kubernetes @ 6, Linux @ 4, IaC @ 7, Python @ 7, CI/CD @ 4, AWS @ 4, Communication @ 7, Networking @ 4, SRE @ 4, GPU @ 4

Details

We are seeking a motivated Senior Systems Engineer to join our cloud platform team in building and scaling our cloud-native infrastructure, which enables developers to run hundreds of services. You'll play a critical role in driving infrastructure innovation across our organization.

Responsibilities

Apply strong programming skills to develop cloud platform tooling and automation to enhance developer productivity and operational efficiency across our cloud infrastructure.
Lead development of infrastructure automation frameworks and CI/CD pipelines, ensuring robust, scalable, and secure cloud-native application deployment.
Engage with engineering users to understand requirements and improve their experience by recommending robust, scalable cloud solutions.
Contribute to the design and architecture of cloud infrastructure and networking components to meet the evolving needs of the internal developer platform.
Improve the reliability and performance of cloud infrastructure and services.

Requirements

BS/MS in Computer Science, Engineering, or equivalent STEM-related experience.
8+ years of professional experience in a related field.
4+ years of experience in Kubernetes-based platform tooling development.
4+ years of experience in cloud infrastructure automation and management.
Strong programming fundamentals with expertise in Go and Python.
Ability to shift seamlessly between Linux system environments and Python programming.
Deep AWS expertise across core services including VPC, IAM, EC2, S3, RDS, CloudFront, EKS.
Comprehensive understanding of Kubernetes and Cloud Native Architecture, with hands-on experience managing large-scale production clusters.
Good understanding of Site Reliability Engineering best practices, alerting, and observability.
Advanced Kubernetes workload management expertise including traffic management, deployment strategies, observability, and security implementations.
Strong Infrastructure as Code fundamentals with experience in infrastructure CI/CD pipelines, automation frameworks, and IaC libraries.

Ways to Stand Out

Fun and motivated teammate who enjoys challenges and celebrates success.
Experience with Agentic AI tools for computing infrastructure management.
Self-starter with strong problem-solving and customer-facing communication skills.
Excellent written, verbal, and interpersonal skills.
Contributions to open-source cloud-native projects, especially in Kubernetes tooling, infrastructure automation, or cloud-native applications.
Experience building sophisticated tooling and SRE automation on large GPU/CPU clusters.

Additional Information

Our 20 years of expertise in visual computing include inventing the GPU for graphics across diverse fields. We stand at the beginning of a new AI computing era, driven by GPU deep learning. We value diversity, intellectual curiosity, problem-solving, openness, and a collaborative, blame-free environment. We promote self-direction to work on meaningful projects with support and mentorship.

The base salary range is 184,000 USD - 356,500 USD. The actual salary is determined by location, experience, and the pay of employees in similar roles. The role is also eligible for equity and benefits.

NVIDIA is an equal opportunity employer committed to diversity and non-discrimination practices.