Senior Staff Software Engineer, Indexing & Retrieval Platform

at Reddit
USD 279,200-390,900 per year
SENIOR
✅ Remote

Used Tools & Technologies

Not specified

Required Skills & Competences

Docker @ 4 ElasticSearch @ 4 Go @ 4 Kubernetes @ 4 DevOps @ 4 Python @ 4 Spark @ 3 GCP @ 4 Java @ 4 Airflow @ 3 Distributed Systems @ 4 Flink @ 3 Machine Learning @ 4 Leadership @ 6 AWS @ 4 SRE @ 4 Vespa @ 4 Technical Leadership @ 6 Observability @ 4 AI @ 4 GenAI @ 4

Details

Reddit is a community of communities built on shared interests, passion, and trust. Every day, Reddit users submit, vote, and comment on the topics they care most about. The ML Indexing & Retrieval Platform team at Reddit builds and scales the core infrastructure that powers machine-learning-driven recommendations. The team designs and maintains systems for ML data ingestion, low-latency retrieval services, and end-to-end lifecycle management of data, enabling real-time access to high-quality data for Content Understanding, Semantic and Lexical retrieval, and GenAI applications.

Responsibilities

  • Lead the technical strategy, architecture, and implementation of Reddit’s next-generation ML Indexing & Retrieval engine, integrating lexical and vector indexing, low-latency retrieval, and GenAI applications.
  • Own the full lifecycle from ideation to production and drive beyond incremental improvements to reimagine core platform capabilities.
  • Partner with product engineers across Content Understanding, Search, Feeds, Ads, Growth, and Safety to deliver high-quality experiences.
  • Define and enforce best practices for observability, reliability, and operational excellence in large-scale distributed systems.
  • Mentor and guide engineers in designing scalable infrastructure and adopting robust DevOps and SRE principles.
  • Collaborate with infrastructure and ML teams to ensure the platform meets the needs of Reddit’s growing user base and diverse content ecosystem.

Requirements

  • 10+ years of experience in software engineering, specializing in indexing and retrieval systems.
  • 3+ years in technical leadership, architecting and scaling distributed systems in production.
  • Deep expertise in large-scale data platforms, including batch indexing and stream processing.
  • Proven experience designing and operating large-scale, low-latency retrieval services.
  • Expertise in lexical and vector search/retrieval technologies (examples cited: Milvus, Vespa, Elasticsearch).
  • Experience with languages such as Go, Java, Python, or other object-oriented languages.
  • Familiarity with large-scale batch and stream processing frameworks: Flink, Airflow, Spark.
  • Skilled in designing cloud-native architectures and managing containerized workloads using Kubernetes and AWS/GCP; experience with Docker.
  • Strong communicator, effective mentor, and able to translate complex technical challenges into actionable strategies.

Tools & Technologies Mentioned

  • Languages: Go, Java, Python (object-oriented languages)
  • Frameworks: Flink, Airflow, Spark
  • Databases: Vector, Lexical, Key-Value databases (examples in posting: Milvus, Vespa, Elasticsearch)
  • Infrastructure/Tools: Kubernetes, Docker, AWS, GCP
  • Areas: batch indexing, stream processing, low-latency retrieval, observability, SRE/DevOps, GenAI integration

Benefits

  • Comprehensive healthcare benefits and income replacement programs
  • 401(k) with employer match
  • Global benefit programs for workspace, professional development, and caregiving support
  • Family planning support and gender-affirming care
  • Mental health & coaching benefits
  • Flexible vacation & paid volunteer time off
  • Generous paid parental leave

Pay Transparency

  • Base salary range for US-based candidates: $279,200 - $390,900 USD

Other Notes

  • Role is listed as Remote - United States. Interviews for select roles/locations may be recorded, transcribed and summarized by AI; candidates may opt out prior to scheduled interviews.