Member of Technical Staff - Data Platform

at xAI
USD 180,000-440,000 per year
MIDDLE
✅ On-site

Used Tools & Technologies

Machine Learning

Required Skills & Competences

Go @ 5, Kafka @ 3, Scala @ 5, Spark @ 3, Distributed Systems @ 3, Flink @ 3, Performance Optimization @ 6, Rust @ 5, Debugging @ 6, API @ 3, Hadoop @ 3, Experimentation @ 3, Trino @ 3, Observability @ 3, AI @ 3, Profiling @ 6

Details

About xAI

xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. The team is small, highly motivated, and focused on engineering excellence. The organization operates with a flat structure where all employees are expected to be hands-on, show initiative, and communicate effectively.

Role overview

The Data Platform team builds and operates infrastructure for large-scale data transport and processing across the company. Core systems include Apache Kafka, HDFS, Spark, Flink, and Trino. The team supports real-time ML pipelines, feed ranking, experimentation, analytics, and observability at petabyte scale. It handles latency-critical, high-throughput streaming and distributed compute workloads that demand fault tolerance, performance, and reliability. As a software engineer on the team, you will design, build, and operate distributed systems that process trillions of events daily and power product and ML workloads across the company.

Responsibilities

  • Design and implement high-throughput, low-latency data ingestion and transport systems.
  • Scale and optimize multi-tenant Kafka infrastructure supporting real-time workloads.
  • Extend and tune Spark, Flink, and Trino for demanding production pipelines.
  • Build interfaces, APIs, and pipelines enabling teams to query, process, and move data at petabyte scale.
  • Debug and optimize distributed systems, focusing on reliability and performance under load.
  • Collaborate with ML, product, and infrastructure teams to unblock critical data workflows.

Requirements

  • Proven expertise in distributed systems, stream processing, or large-scale data platforms.
  • Proficiency in Rust, Go, Scala, or similar systems languages.
  • Hands-on experience with Kafka, Flink, Spark, Trino, or Hadoop in production.
  • Strong debugging, profiling, and performance optimization skills.
  • Track record of shipping and maintaining critical infrastructure.
  • Comfort working in fast-moving, high-stakes environments with minimal guardrails.

Compensation and Benefits

  • Base salary: $180,000 - $440,000 USD
  • The total rewards package also includes equity, comprehensive medical/vision/dental coverage, access to a 401(k) retirement plan, short- and long-term disability insurance, life insurance, and other discounts and perks.

Additional details

  • The team operates at petabyte scale and processes trillions of events daily.
  • Emphasis on reliability, performance, fault tolerance, and low-latency processing.