Required Skills & Competences
Ansible @ 4, Consul @ 4, Docker @ 4, ElasticSearch @ 4, Grafana @ 4, Linux @ 4, Nginx @ 4, Redis @ 4, Python @ 6, Networking @ 4, SRE @ 4
Details
The Infrastructure department is responsible for influencing and tracking change, providing frontline support, and delivering software-defined solutions.
Are you excited by the challenge of managing large-scale systems, automating infrastructure, and ensuring seamless service reliability? We’re seeking a Site Reliability Engineer (SRE) to play a key role in shaping the future of our global infrastructure.
Overseeing a global infrastructure of ~10,000 on-prem servers, you’ll tackle unique technical challenges, engineer scalable systems, and have a direct impact on the reliability and performance of our products.
Responsibilities
- Build Reliable Infrastructure: Design, develop, and maintain highly available, scalable systems.
- Automate Everything: Create and optimize automation workflows to streamline deployments, improve speed, and eliminate manual overhead.
- Ensure Observability: Build monitoring and alerting systems that provide deep visibility into performance, reliability, and health.
- Solve Complex Issues: Troubleshoot, debug, and resolve critical issues in complex systems.
- Collaborate & Innovate: Work closely with QoS and operations teams to enhance reliability, develop new features, and drive technical excellence.
Requirements
- Linux Expertise: Good knowledge of Linux systems, particularly Debian-based distributions.
- Automation Skills: Hands-on experience with configuration management tools such as SaltStack, Ansible, or similar.
- Programming: Proficiency in Python for building automation scripts and tools.
- Observability Knowledge: Experience with monitoring tools and frameworks to enhance system observability.
- Networking: A solid understanding of TCP/IP protocols and concepts.
- Problem-Solving Skills: Proven ability to debug and troubleshoot complex systems effectively.
Tools You Will Use
- Operating Systems: Linux (Debian)
- Firewalls: nftables
- Load Balancing & Proxying: HAProxy, NGINX
- Containers: Docker
- Automation Frameworks: SaltStack
- Key-Value Stores: Redis, Consul
- Analytics: Elasticsearch, VictoriaMetrics, Grafana
- Programming Languages: Python