Cloud Efficiency Architect

at Nvidia
USD 208,000-396,750 per year
MIDDLE
✅ On-site

Used Tools & Technologies

Not specified

Required Skills & Competences

Software Development @ 3, Kubernetes @ 3, MySQL @ 1, GCP @ 2, CI/CD @ 3, Leadership @ 3, AWS @ 2, Azure @ 2, BI @ 3, Reporting @ 3, QA @ 3, Splunk @ 1, Power BI @ 3, GPU @ 3

Details

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. Today, the company is focused on AI and accelerated computing. Colossus Cloud is at the heart of NVIDIA's GPU bring-up infrastructure strategy and is used for the company's software development and QA. The cloud service offers many resource types to support various use cases (bare metal for development, managed Kubernetes services for CI/CD, etc.). As Colossus grows and expands into new datacenters for product bring-up and scale, this role helps craft, implement, and maintain cost, utilization, and TCO models that enable data-driven decisions across infrastructure, collaborators, and finance.

Responsibilities

  • Design, build, and maintain comprehensive cost models for private cloud services, including compute, storage, network, and platform services.
  • Develop predictive models for Colossus resource consumption and demand using historical data and future projections to guide TCO predictions.
  • Create granular build/test job costing models, attributing costs to individual pipelines, projects, or teams.
  • Develop and refine organizational (OrgN) level cost allocation strategies to provide actionable cost breakdowns by organizational unit, department, or business function.
  • Analyze large datasets from Colossus to identify cost anomalies, optimization opportunities, and trends; develop and automate reports and dashboards to visualize key cost and utilization metrics for various collaborators.
  • Evaluate, implement, and leverage FinOps and cloud cost management tools to improve reporting, forecasting, and optimization capabilities; automate data collection and reporting processes where feasible.
  • Present utilization models and insights clearly and concisely to technical and non-technical audiences, including senior leadership.

Requirements

  • 12+ years of proven experience, including 5+ years in Cloud TCO (billing, utilization and TCO analysis).
  • Strong business and technical competence with cloud concepts and cloud-native product/services environments.
  • Experience generating Power BI or equivalent dashboards to drive actions.
  • Experience working in large-scale cloud environments.
  • Familiarity with AI/ML infrastructure, vibe coding, and cloud services.
  • MBA, MS, or equivalent experience.
  • Willingness to adapt quickly, learn new skills, and lead collaborative initiatives across departments.

Preferred / Ways to stand out

  • Expertise in optimizing cloud infrastructure for total cost of ownership (TCO).
  • Familiarity with cloud-native cost tools such as AWS Cost Explorer, GCP Billing, and Azure Cost Management.
  • Strong collaborative and interpersonal skills with a proven track record of influencing stakeholders in dynamic environments.
  • Experience with MySQL and Splunk is a plus.
  • Deep knowledge and hands-on experience with one or more major cloud providers (AWS, Azure, GCP).

Compensation & Benefits

  • Base salary ranges provided by level:
    • Level 5: 208,000 USD - 333,500 USD per year
    • Level 6: 248,000 USD - 396,750 USD per year
  • Eligible for equity and benefits (see https://www.nvidia.com/en-us/benefits/).

Other details

  • Location: Santa Clara, CA, United States
  • Employment type: Full time
  • Applications accepted at least until August 12, 2025.

NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer. The company does not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or other protected characteristics.