Required Skills & Competences
Security @ 3, Go @ 3, Grafana @ 3, Kubernetes @ 3, Prometheus @ 3, VictoriaMetrics @ 3, Terraform @ 3, GCP @ 5, CI/CD @ 3, ArgoCD @ 3, Networking @ 3, Rust @ 3, Debugging @ 3, Swift @ 3, Compliance @ 3
Details
Groq delivers fast, efficient AI inference. Our LPU-based system powers GroqCloud™, giving businesses and developers the speed and scale they need. Headquartered in Silicon Valley, we are on a mission to make high-performance AI compute more accessible and affordable. When real-time AI is within reach, anything is possible. Build fast.
Responsibilities & Opportunities in this Role
- Infrastructure Development: Design, build, and automate cloud infrastructure using Terraform to support a wide variety of needs.
- Service Deployment & Orchestration: Build and manage robust deployment pipelines and GitOps workflows into Kubernetes-based environments. Continuously improve CI/CD processes to facilitate rapid, reliable rollouts of new features and services, ensuring minimal downtime and maximum velocity.
- System Troubleshooting: Lead investigations to determine root causes of system failures and develop scripts to repair and automate the upkeep of infrastructure components.
- Observability Enhancement: Implement comprehensive monitoring (tracing, metrics, logging, alerting) to swiftly pinpoint, diagnose, and resolve system issues.
- Efficient Incident Response: Manage critical system incidents as a first responder, ensuring swift resolution and comprehensive post-incident analyses with implemented remediations.
- Cross-Functional Collaboration: Collaborate with software engineers, platform and networking engineers, product managers, and sales to enable feature delivery.
Requirements
- 6+ years of experience in software engineering or a related field.
- 4+ years of experience with GCP (especially VPC, Hybrid Networking, IAM, and GKE).
- Active, hands-on work with modern Infrastructure-as-Code technologies (Kubernetes, Terraform, Flux/ArgoCD, Kustomize, Crossplane).
- Experience with open-source monitoring tools (Prometheus, Grafana, VictoriaMetrics, VictoriaLogs, and Alertmanager).
- Deep experience in cloud technologies, global scale applications, and automation.
- Familiarity with multi-region deployments, including the associated networking, latency, and failover challenges.
- History of debugging and mitigating production issues and driving them to efficient resolution.
- Comfortable reading, writing, and debugging software in multiple languages, especially Go and Rust.
- Thorough understanding of cloud-security best practices and modern compliance controls.
Attributes of a Groqster
- Humility: Egos are checked at the door.
- Collaborative & Team Savvy: We make up the smartest person in the room, together.
- Growth & Giver Mindset: Learn-it-all versus know-it-all; we share knowledge generously.
- Curious & Innovative: Take a creative approach to projects, problems, and design.
- Passion, Grit, & Boldness: No-limit thinking that fuels informed risk-taking.
If this sounds like you, we’d love to hear from you!
Compensation
At Groq, a competitive base salary is part of our comprehensive compensation package, which includes equity and benefits. For this role, the base salary range is $226,000 to $305,000, determined by your skills, qualifications, experience and internal benchmarks.
Location
Groq is a geo-agnostic company, meaning you work where you are. Exceptional candidates will thrive in asynchronous partnerships and remote collaboration methods.