posted Jul 04

Sr. Site Reliability Engineer I

Bash Cloud Flux Grafana Kafka Kubernetes MongoDB Postgres Python Redis Splunk Go senior

Job Location: Remote

Job Description

• Help define technology choices, best practices, and processes for the team.
• Develop and maintain documentation standards for the team.
• Develop new tools and libraries for broader use by SaaS Operations and Engineering teams. Enable engineering teams to discover and understand problems more quickly.
• Work with product architects to suggest architectural changes and design platform component roadmaps.
• Act as a subject matter expert (SME) for the components and functions that need one, developing new skills as required.
• Assist engineering teams with deep troubleshooting and application code review to find opportunities to improve performance and scalability.
• Work closely with Engineering and peer SRE teams to design and use Smarsh coding standards and best practices.
• Respond to incidents coordinated by SRE and Incident Response teams. Act as an Incident Commander during incidents.
• Participate in the escalation and off-hours on-call schedules.
• Adopt and embrace the qualities of an SRE as defined in the team charter. Help set them for the rest of the team.
• Mentor and train junior members of the team. Design a training curriculum for the team.

Qualifications

• BS in CS or an equivalent combination of education and experience.
• Minimum of 7 years of industry experience.
• Strong experience operating Kubernetes in production environments (EKS Anywhere preferred).
• Experience with middleware systems (Kafka, AMQ, Redis, Memcache, etcd).
• Experience managing CI/CD systems (Flux, Concourse).
• Experience deploying and/or operating an observability stack (Splunk, Datadog, Grafana).
• Experience with large-scale systems.
• Familiarity with PostgreSQL and MongoDB.
• Background working in a multi-platform environment (Linux, Windows).
• Familiarity with programming/scripting languages (e.g., Python, Bash, PowerShell, Go).
• Familiarity with Agile/Scrum/Kanban methodologies.
• Strong interpersonal skills with a can-do attitude and a sense of urgency for a high-growth, fast-paced environment.
• A curious mind, eager to learn new technologies and share them with others.
• The ability to think outside the box to resolve issues and create solutions.

Benefits

• 11 paid holidays
• Generous Accrued Time Off increasing with years of service
• Generous paid sick time
• Annual day of service
