Senior Software Engineer, Cloud Functions

at Nvidia
USD 184,000-356,500 per year
SENIOR
✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Go @ 4, SQL @ 4, Java @ 4, Distributed Systems @ 4, Hiring @ 4, SRE @ 6, Rust @ 4, GPU @ 4

Details

NVIDIA is seeking a Senior Software Engineer to join the Cloud Functions / Mission Control team. The team builds NVIDIA Mission Control Software that runs on superpods as an autonomous hardware recovery engine responsible for baseline validation tests, remedial actions (break/fix workflows), and periodic health checks for hardware components. The platform automates diagnosis and repair across public clouds, private clouds, and virtual and physical hardware to improve GPU/CPU utilization.

Responsibilities

  • Design and implement scalable, reliable software components to maintain an inventory of resources (hosts, GPUs, switches) and to automate diagnosis and repair actions.
  • Enable Agentic AI within the core platform to create remedial workflows.
  • Influence the product roadmap and collaborate across departments to reduce SRE toil and improve hardware utilization.
  • Collaborate with organizations across NVIDIA to drive platform adoption and improve GPU utilization.
  • Define and run benchmarks for various subsystems to validate performance and stability.
  • Lead and deliver high-impact projects focused on quality, performance, stability, and low resource consumption.
  • Develop a robust feedback control system that analyzes system-health signals and automatically runs commands to remediate issues.
  • Program in modern languages such as Go and Rust (C/C++ and Java are also used within the organization).

Requirements

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • Keen interest in driving agentic AI projects and integrating agentic workflows into operational platforms.
  • Approximately 10 years of equivalent experience.
  • Demonstrated ability to build scalable and robust distributed systems.
  • Proven track record of product rollouts and collaborating with early adopters.
  • Proficiency in programming in C/C++, Java, Rust, or Go.
  • Technical stewardship experience driving projects across an organization.

Preferred / Ways to Stand Out

  • Deep understanding of multi-threading and distributed systems concepts.
  • Strong track record of delivering complex projects to production.
  • Expertise in optimizing SQL queries and database performance.
  • Expert-level knowledge of Go and/or Rust.

Compensation & Benefits

  • Base salary is determined by location, experience, and internal pay equity.
  • Base salary ranges provided in the posting:
    • Level 4: 184,000 USD - 287,500 USD
    • Level 5: 224,000 USD - 356,500 USD
  • Eligible for equity and benefits. See NVIDIA benefits for details.

Additional Information

  • Location (as listed): Santa Clara, CA, United States.
  • Applications accepted at least until September 12, 2025.
  • NVIDIA is an equal opportunity employer and values diversity in hiring and promotion practices.