posted Jun 03

Research Engineer, Model Evaluations

Benchmarks Python mid

Job Location: Manhattan, New York

Salary: $315,000 - $510,000 a year

Job Description

• Designing and running a new evaluation that tests Claude’s reasoning capabilities, and creating a compelling visualization that illustrates the results
• Running experiments to determine how prompting techniques affect results on industry benchmarks
• Improving the tooling that researchers use to implement evaluations
• Explaining our evaluations and their results to internal decision makers and stakeholders
• Collaborating with a research team to develop a robust evaluation for a new model capability they are developing

Qualifications

• Have significant Python programming experience / machine learning research
• Are excellent at data visualization
• Have experience using Large Language Models such as Claude
• Are results-oriented, with a bias towards flexibility and impact
• Pick up slack, even if it goes outside your job description
• Enjoy pair programming (we love to pair!)
• Want to learn more about machine learning research
• Care about the societal impacts of your work
• Have clear written and verbal communication
• You want to design and implement rigorous evaluations to deeply understand the capabilities, personality, and safety of large language models like Claude.
• You're excited to turn fuzzy notions of "AI intelligence" into clear, well-defined metrics that provide insight to researchers, decision-makers and the public.
• You're energized by the challenge of assessing and steering powerful AI to be safe and beneficial.

Benefits

• Optional equity donation matching
• Comprehensive health, dental, and vision insurance for you and all your dependents
• 401(k) plan with 4% matching
• 22 weeks of paid parental leave
• Unlimited PTO – most staff take between 4-6 weeks each year, sometimes more!
• Stipends for education, home office improvements, commuting, and wellness
• Fertility benefits via Carrot
• Daily lunches and snacks in our office
• Relocation support for those moving to the Bay Area
