Senior Software Engineer, Cloud Functions
at NVIDIA
Santa Clara, United States
USD 184,000-356,500 per year
Used Tools & Technologies
Not specified
Required Skills & Competences
Go @ 4, SQL @ 4, Java @ 4, Distributed Systems @ 4, Hiring @ 4, SRE @ 6, Rust @ 4, GPU @ 4
Details
NVIDIA is seeking a Senior Software Engineer to join the Cloud Functions / Mission Control team. The team builds NVIDIA Mission Control Software that runs on superpods as an autonomous hardware recovery engine responsible for baseline validation tests, remedial actions (break/fix workflows), and periodic health checks for hardware components. The platform automates diagnosis and repair across public clouds, private clouds, and virtual and physical hardware to improve GPU/CPU utilization.
Responsibilities
- Design and implement scalable, reliable software components to maintain an inventory of resources (hosts, GPUs, switches) and to automate diagnosis and repair actions.
- Enable agentic AI within the core platform to create remedial workflows.
- Influence the product roadmap and collaborate across departments to reduce SRE toil and improve hardware utilization.
- Collaborate with organizations across NVIDIA to drive platform adoption and improve GPU utilization.
- Define and run benchmarks for various subsystems to validate performance and stability.
- Lead and deliver high-impact projects focused on quality, performance, stability, and low resource consumption.
- Develop a robust feedback control system that analyzes system-health signals and automatically runs commands to remediate issues.
- Program in modern languages such as Go and Rust (C/C++ and Java are also used within the organization).
Requirements
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent experience).
- Keen interest in driving agentic AI projects and integrating agentic workflows into operational platforms.
- Approximately 10 years of equivalent experience.
- Demonstrated ability to build scalable and robust distributed systems.
- Proven track record of product rollouts and collaborating with early adopters.
- Proficiency in programming in C/C++, Java, Rust, or Go.
- Technical stewardship experience driving projects across an organization.
Preferred / Ways to Stand Out
- Deep understanding of multi-threading and distributed systems concepts.
- Strong track record of delivering complex projects to production.
- Expertise in optimizing SQL queries and database performance.
- Expert-level knowledge of Go and/or Rust.
Compensation & Benefits
- Base salary is determined by location, experience, and internal pay equity.
- Base salary ranges provided in the posting:
  - Level 4: 184,000 USD - 287,500 USD
  - Level 5: 224,000 USD - 356,500 USD
- Eligible for equity and benefits. See NVIDIA benefits for details.
Additional Information
- Location (as listed): Santa Clara, CA, United States.
- Applications accepted at least until September 12, 2025.
- NVIDIA is an equal opportunity employer and values diversity in hiring and promotion practices.