posted Jul 04
Sr. Site Reliability Engineer I
Job Location: Remote
Job Description
• Help define technology choices, best practices, and processes for the team.
• Develop and maintain documentation standards for the team.
• Develop new tools and libraries for broader use by SaaS Operations and Engineering teams. Enable engineering teams to discover and understand problems more quickly.
• Work with product architects to suggest architectural changes and design platform component roadmaps.
• Act as a subject matter expert (SME) for components and functions as needed. Develop the skills required to become an SME for components that need one.
• Assist engineering teams with deep troubleshooting and application code review to find opportunities to improve performance and scalability.
• Work closely with Engineering and peer SRE teams to design and use Smarsh coding standards and best practices.
• Respond to incidents coordinated by SRE and Incident Response teams. Act as an Incident Commander during incidents.
• Participate in the escalation and off-hours on-call schedule.
• Adopt and embrace the qualities of an SRE as defined in the team charter. Help set them for the rest of the team.
• Mentor and train junior members of the team. Design a training curriculum for the team.
Qualifications
• BS in CS or an equivalent combination of education and experience.
• 7+ years of industry experience
• Strong experience operating Kubernetes in production environments; EKS Anywhere is preferred
• Experience with middleware systems (Kafka, AMQ, Redis, Memcache, etcd)
• Experience managing CI/CD systems (Flux, Concourse)
• Experience deploying and/or operating an observability stack (Splunk, Datadog, Grafana)
• Experience with large-scale systems
• Familiarity with PostgreSQL and MongoDB
• Background working in a multi-platform environment (Linux, Windows)
• Familiarity with programming/scripting languages (e.g., Python, Bash, PowerShell, Go)
• Familiarity with Agile/Scrum/Kanban methodologies
• Strong interpersonal skills with a can-do attitude and a sense of urgency for a high-growth, fast-paced environment
• A curious mind, wanting to learn new technologies and share them with others
• The ability to think outside the box to resolve issues and create solutions
Benefits
• 11 paid holidays
• Generous Accrued Time Off increasing with years of service
• Generous paid sick time
• Annual day of service