Engineering Manager, Ads ML Efficiency

at Reddit

📍 United States

USD 230,000-322,000 per year

MIDDLE

✅ Remote

Used Tools & Technologies

Machine Learning

Required Skills & Competences
Tag name is followed by "@" symbol and proficiency level value. About proficiency levels:

1-2 — basic awareness. Minimal hands-on experience, and a rudimentary understanding of the technology's purpose;

3-6 — daily use. Comfortable and regular usage, capable of handling common tasks and challenges related to the technology;

7-9 — you are an expert, you can teach others, you know all the pitfalls and tricks;

10 — exceptional knowledge, comprehensive understanding, and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such level.

Distributed Systems @ 5 Communication @ 6 Load Testing @ 3 Performance Optimization @ 3 Debugging @ 3 PyTorch @ 3 GPU @ 3 Observability @ 3 AI @ 3 Profiling @ 3

Details

Reddit is building a dedicated Ads ML Efficiency function to make model training and inference faster, cheaper, safer, and more scalable. As the Engineering Manager for this team, you will lead engineers focused on model optimization, training efficiency, GPU enablement, load testing, model performance tooling, and efficiency guardrails across Ads ML. You will partner with ranking teams, ML Platform teams and serving owners to identify high-value bottlenecks, land measurable efficiency wins, and build tooling and operating mechanisms to make those wins repeatable.

Responsibilities

Hire, mentor, and retain a high-performing team of ML engineers / systems-oriented engineers working on model optimization and ML efficiency.
Define the roadmap for training optimization, inference optimization, launch-readiness tooling, and reusable efficiency primitives across Ads ML.
Drive measurable reductions in model training time, online latency, serving cost, and infra-driven launch risk.
Guide development of profiling, benchmarking, load testing, observability, cost analysis, debugging, and efficiency certification systems.
Partner with model owners and platform teams to accelerate high-priority launches and remove bottlenecks to production.
Balance near-term white-glove optimization work with medium-term platformization and automation.
Build cross-functional alignment with MLP, AMP, Ranking, and serving teams; establish engineering rigor around measurement, performance debugging, launch safety, and technical decision-making for efficiency work.

Requirements

Deep ML engineering experience with close familiarity of models: training, serving, debugging, and optimization.
Hands-on background improving training loops, serving systems, profiling workflows, model/inference efficiency, or GPU utilization.
Strong managerial ability: experience building and leading teams, coaching engineers, managing delivery, and prioritizing under ambiguity.
Fluency with distributed systems and production-scale ML systems tradeoffs (reliability, speed, cost, scale).
Customer and platform instincts: able to act as a service provider to modeling teams while building reusable systems rather than only one-offs.
Strong communication skills to explain technical tradeoffs to engineers, PMs, and senior stakeholders.
Ads experience (ads ranking, recommender systems, marketplace ML, or adjacent production ML domains) is strongly preferred.

Nice-to-have

Experience with GPU training and serving migrations.
Experience with PyTorch, distributed training frameworks, or kernel/performance optimization.
Experience building efficiency benchmarking or launch certification frameworks.
Experience working in organizations where ML platform and applied modeling responsibilities are split across multiple teams.

Benefits

Comprehensive healthcare benefits and income replacement programs
401(k) with employer match
Global benefit programs (workspace, professional development, caregiving support)
Family planning support
Gender-affirming care
Mental health & coaching benefits
Flexible vacation & paid volunteer time off
Generous paid parental leave

Pay Transparency

Base salary range (US): $230,000 - $322,000 USD
Role may be eligible for equity (restricted stock units) and, depending on position, commission
Final offers determined by skills, depth of experience, and other factors

Other notes

Remote - United States (Reddit supports remote work and has flexible workforce policies; applicants can apply to work remotely in any country where Reddit has a physical presence).
Interviews in select roles/locations may be recorded and transcribed by AI; candidates can opt out prior to scheduled interviews.