posted Jun 21

Software Engineer - ML Reliability

AWS, Azure, Clojure, Cloud, Google Cloud Platform, Grafana, Open Source, Prometheus, Python, PyTorch, Spark, TensorFlow, Go, mid

Job Location: Remote

Salary: $155,000 - $190,000 a year

Job Description

• Design, implement, and maintain robust ML architecture
• Implement monitoring tools and processes
• Provide best practices and proof-of-concepts for automated model operations
• Lead incident response efforts and conduct root cause analysis
• Work closely with cross-functional teams to align on goals and deliverables
• Contribute to an 'engineering excellence' culture

Qualifications

• 5+ years of software engineering experience
• 3+ years of experience in machine learning, software engineering, or reliability engineering
• Solid core CS fundamentals
• Proficiency in Python, Go, or similar programming languages
• Experience with ML frameworks and cloud platforms
• Strong problem-solving skills
• Excited to work on large scale ML and data systems

Benefits

• Full compensation package including equity and health/vision/dental benefits
• Base compensation varies based on location and experience
