posted Jun 01

Senior Site Reliability Engineer

Ansible AWS Cloud DNS GCP Go Jenkins Kubernetes Prometheus Puppet Rust Terraform TypeScript senior

Job Location: Brooklyn, New York

Salary: $150,000 - $200,000 a year

Job Description

• Interested in working on cutting-edge blockchain technology and creating equitable access to the global financial system since 2014
• Maintaining, improving, scaling, and securing AWS/GCP infrastructure and Linux systems
• Assisting development teams in running, packaging, deploying, and troubleshooting applications
• Working with developers on streamlining deployment processes with Jenkins and other CI/CD tooling
• Building, maintaining, monitoring, and improving Kubernetes clusters
• Working with development teams on migrating applications to Kubernetes
• Maintaining and improving multiple internal services, for example Kubernetes, Prometheus, and ELK
• Monitoring, triaging, and responding to alerts in high-availability environments
• Participating in design and code reviews, and ensuring that the foundation for services is best in class
• Evaluating new technologies, and designing and implementing them as appropriate
• Identifying automation opportunities and implementing them with custom or off-the-shelf solutions

Qualifications

• 5+ years of experience working in cloud-based systems operations as an SRE or DevOps engineer
• First-hand experience with configuration management and infrastructure as code (Ansible, Puppet, Terraform)
• Proficient in SRE methodologies such as capacity planning and disaster recovery testing to ensure the scalability, resilience, and availability of critical services
• A strong understanding of computer networking, TCP/UDP, load balancing, distributed computing, web services, and the fundamental protocols used by the internet (HTTP, HTTPS, DNS, etc.)
• Experienced in managing production workloads and skilled in using monitoring tools to detect issues early
• Comfortable participating in on-call rotations and conducting thorough root cause analyses to keep systems running smoothly
• Proficiency in at least one programming language
• Committed to supporting teammates, especially during challenging times, and excited about working in a close-knit, growing team. Approachable, empathetic, and proactive in promoting collaboration and innovation
• Excels at working independently and can accomplish tasks without constant monitoring
• Production experience building and maintaining Kubernetes clusters

Benefits

• Competitive health, dental & vision coverage
• Flexible time off + 15 company holidays, including a company-wide holiday break
• Paid parental leave
• Life & AD&D
• Short- & long-term disability
• FSA & Dependent Care Accounts
• 401(k) (4% match)
• Employee Assistance Program
• Monthly gym allowance
• Daily lunch and snacks in-office
• L&D budget of $1,500/year
• Company retreats
