Software Engineer - Data Platform | NYC, Seattle, SF

USD 200,000–300,000 per year
Seniority: Senior
Workplace: Hybrid

Used Tools & Technologies

Not specified

Required Skills & Competences

Go (6), Kafka (4), TypeScript (6), Python (6), Airflow (4), Kinesis (4), Data Science (4), Dagster (4), Streaming Data Processing (4), API (4), Experimentation (4)

Details

About Perplexity

Perplexity is an AI-powered answer engine founded in December 2022. The company builds accurate, trustworthy AI that powers decision-making and assists users across many contexts. Perplexity handles hundreds of millions of queries per month and has raised significant venture investment. Full-time U.S. and international employees receive comprehensive benefits, and equity is offered.

About the Role

Perplexity is looking for experienced Data Infrastructure Engineers to design, build, and scale the foundational data systems that power product, AI research, analytics, and decision-making at scale. You will develop and own critical infrastructure for batch and streaming data processing, data orchestration, reliability, and developer experience across the data stack. This is a high-impact role at the senior or staff level, where you will shape architecture, set standards, and drive long-term technical direction for Perplexity's data ecosystem.

Responsibilities

  • Design and operate large-scale batch and streaming data pipelines supporting product features, AI training/evaluation, analytics, and experimentation.
  • Build and evolve event-driven and streaming systems (examples: Kafka/Kinesis/PubSub-style architectures) for real-time ingestion, transformation, and delivery.
  • Own batch processing frameworks for backfills, aggregations, and offline computation.
  • Lead the design and operation of data orchestration systems (e.g., Airflow, Dagster, or equivalent) including scheduling, dependency management, retries, SLAs, and observability.
  • Establish strong guarantees around data correctness, freshness, lineage, and recoverability.
  • Design systems that handle scale, partial failure, and evolving schemas.
  • Build self-serve data platforms that empower engineers, data scientists, and analysts to safely create and operate pipelines.
  • Improve developer experience for data work through abstractions, tooling, documentation, and paved paths.
  • Set standards for data modeling, testing, validation, and deployment.
  • Drive architectural decisions across data infrastructure for storage, compute, orchestration, and APIs.
  • Partner with engineering and data science teams to align data systems with evolving requirements.
  • Mentor engineers, review designs, and raise the technical bar across the organization.

Requirements

Minimum Qualifications

  • 5+ years (Senior) or 8+ years (Staff) of software engineering experience.
  • Strong experience building production data infrastructure systems.
  • Hands-on experience with batch and/or streaming data processing at scale.
  • Deep familiarity with data orchestration systems (Airflow, Dagster, or similar).
  • Proficiency in Python and at least one additional backend language (examples: Go, TypeScript).
  • Strong systems thinking: understanding tradeoffs across reliability, latency, cost, and complexity.
  • Experience supporting ML/AI workflows, training pipelines, or evaluation systems.
  • Familiarity with data quality, lineage, observability, and governance tooling.
  • Prior ownership of internal platforms used by many teams.

Benefits

  • Compensation range listed: $200K–$300K plus equity.
  • Full-time U.S. employees: equity, health, dental, vision, retirement, fitness, commuter and dependent care accounts, and more.
  • International full-time employees: comprehensive benefits program tailored to region of residence.

Other Details

  • Department: Platform & Infrastructure
  • Workplace: Hybrid
  • Employment type: Full-time