Staff Infrastructure Engineer

USD 132,000-215,000 per year
SENIOR
✅ Remote

Required Skills & Competences

Security @ 4, Ansible @ 4, Go @ 7, Grafana @ 4, Kubernetes @ 4, Prometheus @ 4, IaC @ 6, Terraform @ 4, Python @ 7, GCP @ 4, Java @ 7, GitHub @ 4, GitHub Actions @ 4, CI/CD @ 4, Distributed Systems @ 4, AWS @ 4, SRE @ 7, Thanos @ 4, Compliance @ 4, OpenTelemetry @ 4, Observability @ 4, AI @ 4

Details

At SentinelOne, we are driven by a clear purpose: to give the advantage to those who secure our future. As AI reshapes how organizations build, operate, and innovate, the responsibility to protect them becomes more critical than ever. When you join SentinelOne, your work helps protect global enterprises, critical infrastructure, and the technologies shaping tomorrow.

We are seeking a Staff Infrastructure Engineer to be a pivotal technical leader and architect within our Observability team. You will design, implement, and optimize observability solutions that underpin SentinelOne's global platform, enabling engineering teams across the organization to gain real-time visibility and actionable insights.

Due to Federal Government contract requirements, U.S. citizenship is required for this position. FedRAMP staff may be subject to customer or third-party background checks, up to and including Secret Clearance, if required by their role at SentinelOne.

Responsibilities

  • Architect and implement robust, scalable telemetry and observability platforms that enable rapid, safe delivery and monitoring of features.
  • Serve as the primary Subject Matter Expert (SME) and administrator for the core observability stack, including Grafana, Prometheus, Thanos/Mimir/Cortex, and OpenTelemetry (OTEL) pipelines.
  • Partner with engineering teams across the organization to define platform requirements and evolve the observability ecosystem ahead of stakeholder needs.
  • Take end-to-end ownership of critical features from architecture and requirements through production deployment and operational maturity.
  • Drive operational efficiency for observability services across AWS and GCP with attention to reliability and cloud cost-optimization.
  • Build automation and self-service tooling to reduce operational toil and minimize pager fatigue.
  • Deploy, maintain, and ensure compliance of observability systems in high-security environments, including FedRAMP and air-gapped deployments.
  • Implement and standardize Infrastructure as Code (Terraform/Ansible) and industry best practices to increase platform transparency and reliability.
  • Mentor engineers, lead technical design and code reviews, and provide guidance that elevates engineering quality.
  • Lead resolution of complex production incidents, perform root-cause analyses, and participate in on-call rotations.

Requirements

  • 8+ years of experience in Infrastructure Engineering, Site Reliability Engineering (SRE), or a related systems-focused field.
  • 8+ years of experience architecting, scaling, and managing enterprise-grade observability stacks using Prometheus, Grafana, Thanos (or Mimir/Cortex), and OpenTelemetry.
  • Experience designing cloud-native infrastructure in major cloud providers (AWS or GCP) and managing production Kubernetes environments (EKS, GKE).
  • Advanced proficiency with IaC and automation tools, specifically Terraform and Ansible.
  • Experience maintaining and optimizing high-throughput, large-scale distributed systems with a focus on cost-efficiency, scalability, and disaster recovery.
  • Demonstrated ability to lead complex technical designs, mentor engineers, and collaborate cross-functionally.
  • U.S. citizenship and the ability to work in a government-regulated environment.

Preferred Qualifications

  • 8+ years of production-level programming experience in Go (highly desirable) or another mainstream language such as Python or Java, with a willingness to adopt Go.
  • Experience with FedRAMP or other sovereign cloud / high-security compliance frameworks.
  • Familiarity with the operational challenges of on-premises, hybrid, or air-gapped Kubernetes deployments.
  • Experience designing advanced CI/CD pipelines (e.g., GitHub Actions) and deployment strategies such as canary, blue-green, and rolling updates.

Compensation

  • Base salary range (U.S. role): $132,000 to $215,000 USD. The range may vary based on candidate location; different pay ranges for some locations may be provided during the recruiting process.

Benefits

  • Restricted Stock Units (RSUs) and Employee Stock Purchase Plan (ESPP)
  • Flexible time off, paid company holidays and sick time, gender-neutral parental leave, grandparent leave
  • Medical, dental, vision, 401(k) with company match, life and disability insurance, FSAs
  • Home office allowance, mobile phone reimbursement
  • Wellness programs, fertility coverage, adoption & surrogacy reimbursement

SentinelOne participates in the E-Verify Program for all U.S.-based roles and is an Equal Employment Opportunity and Affirmative Action employer.