AI Training Infrastructure Engineer - Post Training
USD 220,000-290,000 per year
Required Skills & Competences
Python @ 6, Algorithms @ 3, LLM @ 3, PyTorch @ 3, CUDA @ 1
Perplexity is an AI-powered answer engine founded in December 2022 and growing rapidly as one of the world’s leading AI platforms. Perplexity has raised over $1B in venture investment from a range of investors. The company builds accurate, trustworthy AI to power decision-making and assistive AI. The team’s in-house online LLMs are the Sonar models.
This role focuses on building a robust post-training framework (on top of Megatron/PyTorch) and owning the full-stack data, training, and evaluation pipelines required to post-train LLMs and integrate them into the product.
Responsibilities
- Build a post-training framework capable of running cutting-edge model training jobs at scale.
- Implement infrastructure and components to support the latest models and algorithms, such as SFT and RL (DPO/GRPO).
- Own the full-stack data, training, and evaluation pipelines required to post-train LLMs.
- Work closely with engineering teams to integrate Sonar models into Perplexity’s products.
Qualifications
- Proven experience building large-scale LLM frameworks.
- Strong experience with Python and PyTorch.
- C++ and CUDA experience is a plus.
- Self-starter with willingness to take ownership of tasks.
- Passion for tackling challenging problems.
- Minimum of 6 years working on relevant projects.
Bonus
- PhD in AI/ML/Systems or related areas.
- Experience building LLM training frameworks, especially post-training.
Compensation and Benefits
- Cash compensation range: $220,000 - $290,000 per year.
- Final offer amounts are determined by multiple factors, including experience and expertise, and may vary from the amounts listed above.
- Equity may be part of the total compensation package.
- Benefits include comprehensive health, dental, and vision insurance for you and your dependents, and a 401(k) plan.