Staff Infrastructure Engineer, GroqCloud

at Groq
USD 226,000-305,000 per year
MIDDLE SENIOR
✅ Remote

Required Skills & Competences

Security @ 3, Go @ 3, Grafana @ 3, Kubernetes @ 3, Prometheus @ 3, VictoriaMetrics @ 3, Terraform @ 3, GCP @ 5, CI/CD @ 3, ArgoCD @ 3, Networking @ 3, Rust @ 3, Debugging @ 3, Swift @ 3, Compliance @ 3

Details

Groq delivers fast, efficient AI inference. Our LPU-based system powers GroqCloud™, giving businesses and developers the speed and scale they need. Headquartered in Silicon Valley, we are on a mission to make high performance AI compute more accessible and affordable. When real-time AI is within reach, anything is possible. Build fast.

Responsibilities & Opportunities in this Role

  • Infrastructure Development: Design, build, and automate cloud infrastructure using Terraform to support a wide variety of needs.
  • Service Deployment & Orchestration: Build and manage robust deployment pipelines and GitOps workflows into Kubernetes-based environments. Continuously improve CI/CD processes to facilitate rapid, reliable rollouts of new features and services, ensuring minimal downtime and maximum velocity.
  • System Troubleshooting: Lead investigations to determine root causes of system failures and develop scripts to repair and automate the upkeep of infrastructure components.
  • Observability Enhancement: Implement comprehensive monitoring (tracing, metrics, logging, alerting) to swiftly pinpoint, diagnose, and resolve system issues.
  • Efficient Incident Response: Manage critical system incidents as a first responder, ensuring swift resolution and comprehensive post-incident analyses with implemented remediations.
  • Cross-Functional Collaboration: Collaborate with software engineers, platform & networking engineers, product managers, and sales to enable feature delivery.

Requirements

  • 6+ years of experience in software engineering or a related field.
  • 4+ years of experience with GCP (especially VPC, Hybrid Networking, IAM, and GKE).
  • Actively working with modern Infrastructure-as-Code technologies (Kubernetes, Terraform, Flux/ArgoCD, Kustomize, Crossplane).
  • Experience with open-source monitoring tools (Prometheus, Grafana, VictoriaMetrics, VictoriaLogs, and Alertmanager).
  • Deep experience in cloud technologies, global-scale applications, and automation.
  • Familiarity with multi-region deployments, including the associated networking, latency, and failover challenges.
  • History of debugging and mitigating production issues and driving them to efficient resolution.
  • Comfortable reading, writing, and debugging software in multiple languages, especially Go and Rust.
  • Thorough understanding of cloud-security best practices and modern compliance controls.

Attributes of a Groqster

  • Humility: Egos are checked at the door.
  • Collaborative & Team Savvy: We make up the smartest person in the room, together.
  • Growth & Giver Mindset: Learn-it-all versus know-it-all; we share knowledge generously.
  • Curious & Innovative: Take a creative approach to projects, problems, and design.
  • Passion, Grit, & Boldness: No-limit thinking, fueling informed risk-taking.

If this sounds like you, we’d love to hear from you!

Compensation

At Groq, a competitive base salary is part of our comprehensive compensation package, which includes equity and benefits. For this role, the base salary range is $226,000 to $305,000, determined by your skills, qualifications, experience and internal benchmarks.

Location

Groq is a geo-agnostic company, meaning you work where you are. Exceptional candidates will thrive in asynchronous partnerships and remote collaboration methods.