posted Jun 01

Senior Site Reliability Engineer

Ansible AWS Cloud DNS GCP Go Jenkins Kubernetes Prometheus Puppet Rust Terraform TypeScript senior

Job Location: San Francisco, California

Salary: $150,000 - $200,000 a year

Job Description

• Maintain, improve, scale and secure our AWS/GCP infrastructure and Linux systems
• Assist our development teams in running, packaging, deploying and troubleshooting applications
• Work with developers on streamlining deployment processes with Jenkins and other CI/CD tooling
• Build, maintain, monitor and improve our Kubernetes clusters
• Work with development teams on migrating applications to Kubernetes
• Be responsible for maintaining and improving multiple internal services, for example Kubernetes, Prometheus and ELK
• Monitor, triage and respond to alerts in our high-availability environments
• Participate in design and code reviews, and ensure that the foundation for our services is best in class
• Evaluate new technologies, and design and implement them as appropriate
• Identify automation opportunities and implement them by building custom solutions or using off-the-shelf ones

Qualifications

• 5+ years of experience working in cloud-based systems operations as an SRE or DevOps engineer
• First-hand experience with configuration management and infrastructure as code (Ansible, Puppet, Terraform)
• Proficient in applying SRE methodologies such as capacity planning and disaster recovery testing to ensure the scalability, resilience, and availability of critical services
• A strong understanding of computer networking, TCP/UDP, load balancing, distributed computing, web services, and the fundamental protocols used by the internet (HTTP, HTTPS, DNS, etc.)
• Experienced in managing production workloads and skilled in using monitoring tools to detect issues early
• Comfortable participating in on-call rotations and conducting thorough root cause analyses to keep systems running smoothly
• Proficiency in at least one programming language
• Committed to supporting teammates, especially during challenging times, and excited about working in a close-knit, growing team. Approachable, empathetic, and proactive in promoting collaboration and innovation
• Excels at working independently and can accomplish tasks without constant supervision
• Production experience building and maintaining Kubernetes clusters
• Bonus: Ability to understand Go, Rust, C++ and TypeScript source code

Benefits

• Competitive health, dental & vision coverage
• Flexible time off + 15 company holidays, including a company-wide holiday break
• Paid parental leave
• Life & AD&D insurance
• Short- & long-term disability
• FSA & Dependent Care Accounts
• 401(k) (4% match)
• Employee Assistance Program
• Monthly gym allowance
• Daily lunch and snacks in-office
• L&D budget of $1,500/year
• Company retreats

View all open positions at Stellar

Apply for this position

Stellar

San Francisco, California


🚀 We believe this opportunity is eligible for visa sponsorship because the employer sponsored 742 visa applications in Q4 2023.
