Site Reliability Engineer
Amsterdam, Noord-Holland · Booking.com
Description
The core premise of SRE is to treat operations as a software problem, where operations means addressing the availability, scalability, latency, and efficiency of Booking.com’s systems & services. The SRE is tasked with engineering efforts to solve complex problems, which requires a strong aptitude for developing software systems that minimize human labor through automation and increase system & service reliability.
A Booking Reliability Engineering team has full vertical ownership of a system, from the server configuration up to the application interfaces. This gives the team full control of a service and avoids situations where different teams own different areas of a system and some parts fall through the cracks. SREs can wear several hats: at times an SRE is part of the product development team itself, and at other times acts as a consultant who supports a product development team in implementing Booking Reliability Engineering best practices.
As systems & services grow in size and complexity, so too does the operational overhead. It is a fundamental principle of SRE to break this link between operational toil on the one hand and system size and complexity on the other. This requires the team to limit operations work and to protect the engineering development efforts that are at the heart of Booking Reliability Engineering. Ultimately, fundamental software engineering skills coupled with strong systems and networking knowledge will guide the SRE to create more reliable systems & services that are highly available, scale with growth, and are efficient and latency-sensitive.
Requirements:
In-depth knowledge, understanding, and experience (minimum 3 years) of Apache Kafka administration.
Strong software engineering skills with the ability to write robust code.
Decent experience with Java.
Decent experience with Kubernetes (Docker, Helm, Argo).
Experience building and using monitoring components for distributed systems.
Experience building and maintaining distributed multi-tenant systems.
Oriented towards automating tasks and working closely with the team.
Expected to participate in daytime operational shifts (reacting to outages) and to provide customer support (ticket work).
Proven problem-solving capabilities in complex distributed environments.
Understanding of Confluent Platform and Confluent Cloud is a plus.
Experience with databases is a plus.
Bachelor's degree.
Key Responsibilities
Building software applications:
Responsible for building software applications using relevant development languages and applying knowledge of systems, services, and tools appropriate for the business area.
Responsible for writing readable and reusable code by applying standard patterns and using standard libraries.
Responsible for refactoring and simplifying code by introducing design patterns when necessary.
Responsible for ensuring the quality of the application by following standard testing techniques and methods that adhere to the test strategy.
Responsible for maintaining data security, integrity, and quality by effectively following company standards and best practices.
End-to-End System Ownership:
… read the full description at Booking.com.