Required Skills & Competences
Security @ 4, Ansible @ 4, Go @ 4, Terraform @ 4, Python @ 4, GCP @ 4, Java @ 4, Machine Learning @ 4, Data Science @ 4, Leadership @ 4, AWS @ 4, Bash @ 4, Communication @ 4, SRE @ 4, Compliance @ 4
Details
Ready to be pushed beyond what you think you’re capable of?
At Coinbase, our mission is to increase economic freedom in the world. It’s a massive, ambitious opportunity that demands the best of us, every day, as we build the emerging onchain platform — and with it, the future global financial system.
Coinbase seeks a passionate individual who believes in the power of crypto and blockchain to update the financial system, eager to solve the company’s hardest problems and excel in a high-caliber, intense work culture.
We are looking for a Site Reliability Engineer (SRE) to join the IT AI Infrastructure team to deploy, manage, and optimize AI-powered productivity tools and in-house AI solutions that enhance employee efficiency at scale.
Responsibilities
- Deploy, configure, and manage AI-powered employee productivity tools and in-house AI solutions.
- Ensure high availability, reliability, and optimal performance of AI platforms and services, implementing monitoring, alerting, and incident response.
- Design and implement scalable infrastructure that supports the AI tools and their user base, optimizing resource utilization and capacity planning.
- Develop and maintain automation scripts and tools for deployment, monitoring, and maintenance.
- Collaborate with cross-functional teams (Machine Learning, HR, Security, Data Science, Developer Experience) on the development and integration of AI solutions.
- Adhere to security and privacy policies; ensure compliance with regulatory requirements.
- Implement comprehensive monitoring and metrics, and analyze the resulting data to identify improvements.
- Participate in incident response and troubleshooting, maintaining incident response plans.
- Contribute to backend development that supports AI tool integration and functionality.
- Deploy and manage AI solutions on public cloud platforms (AWS/GCP), using cloud-native services.
- Communicate technical information effectively to non-technical audiences including senior leadership.
Requirements
- Proven experience as a Site Reliability Engineer or similar role.
- Strong understanding of AI technologies and platforms.
- Experience with deploying and managing cloud applications (AWS/GCP).
- Backend development experience with Python, Java, or Go.
- Proficiency in managing public cloud services (AWS/GCP) for scalability and reliability.
- Experience with automation tools and scripting (Ansible, Terraform, Bash, Python).
- Excellent troubleshooting, problem-solving, communication, and collaboration skills.
- Strong security and compliance knowledge.
- Experience working in highly regulated, fast-paced, high-growth environments.
Benefits
- Medical, dental, vision plans with employee contributions.
- Health Savings Account with company contributions.
- Disability and life insurance.
- 401(k) plan with company match.
- Wellness stipend and mobile/internet reimbursement.
- Connections stipend and volunteer time off.
- Fertility counseling and benefits.
- Generous time off/leave policy.
- Option to be paid in digital currency.