Senior Software Engineer, AI Eval

at Sentry

📍 San Francisco, United States

USD 240,000-280,000 per year

SENIOR

✅ Hybrid

Used Tools & Technologies

Not specified

Required Skills & Competences
Each tag name is followed by an "@" symbol and a proficiency level. About proficiency levels:

1-2: basic awareness. Minimal hands-on experience and a rudimentary understanding of the technology's purpose;

3-6: daily use. Comfortable, regular usage; capable of handling common tasks and challenges related to the technology;

7-9: expert. You can teach others, and you know all the pitfalls and tricks;

10: exceptional knowledge. Comprehensive understanding of and adeptness in all aspects of the technology, including advanced problem-solving. Think twice before claiming or demanding such a level.

TypeScript @ 4
Python @ 4
Machine Learning @ 6
Debugging @ 4
Experimentation @ 4
Sentry @ 4

Details

About Sentry

Bad software is everywhere, and we’re tired of it. Sentry is on a mission to help developers write better software faster so we can get back to enjoying technology. With more than $217 million in funding and 100,000+ organizations using Sentry, the company builds performance and error monitoring tools used by companies like Disney, Microsoft, and Atlassian.

Sentry embraces a hybrid work model, with Mondays, Tuesdays, and Thursdays set as in-office anchor days to encourage meaningful collaboration.

About the role

As a Senior Software Engineer on Sentry’s AI/ML team, you will build evaluation infrastructure that measures the accuracy, reliability, and real-world performance of AI systems. This role ensures debugging agents and AI-powered features behave correctly, safely, and predictably at scale. You will design datasets, benchmarks, and test harnesses that turn ambiguous AI behavior into measurable signals to help the team ship AI with confidence.

Responsibilities

Design and build robust evaluation frameworks to measure accuracy, reliability, regressions, and edge cases in AI systems
Create and curate high-quality datasets, golden test cases, and benchmarks grounded in real production data
Build automated test harnesses and metrics pipelines to continuously evaluate models, prompts, and agentic workflows
Partner with applied AI engineers and product leaders to define what “good” looks like and translate it into measurable criteria
Own the evaluation lifecycle for major AI initiatives, from early experimentation through production monitoring

You’ll love this job if you

Care deeply about correctness, rigor, and measurement in AI systems
Enjoy turning fuzzy product goals and model behavior into concrete tests and metrics
Like building foundational infrastructure that unlocks faster iteration and higher confidence for the entire AI team
Thrive in cross-functional environments and enjoy influencing model design through better evaluation

Qualifications

5+ years of professional experience with a Bachelor’s degree in computer science, machine learning, or a related field
Experience building testing, evaluation, or data infrastructure for complex systems (AI/ML experience strongly preferred)
Comfortable writing production-quality code (we use Python and TypeScript)
Experience working with structured and unstructured datasets, labeling workflows, or data quality pipelines
Familiarity with modern ML systems and evaluation techniques (e.g., offline metrics, online evaluation, regression testing for models or prompts)
Bonus: experience evaluating LLMs, agentic systems, or AI-assisted developer tools

Compensation & Benefits

The base salary range that Sentry reasonably expects to pay for this position is $240,000 to $280,000. A successful candidate’s actual base salary will be determined by factors including work location, education, relevant experience, skills, and job-related knowledge. Eligible candidates may participate in Sentry’s employee benefit plans/programs (including incentive compensation, equity grants, paid time off, and group health insurance coverage).

Workplace

Workplace type: Hybrid — in-office anchor days on Mondays, Tuesdays, and Thursdays

Equal Opportunity & Accommodations

Sentry is committed to providing equal employment opportunities and reasonable accommodations for employees and candidates with physical or mental disabilities. If you need assistance or an accommodation due to a disability, contact [email protected]. For details about applicant data handling, see Sentry's Applicant Privacy Policy.