Site Reliability Engineer

What is the impact you want to make?

Unleash your biggest strengths, apply skills & knowledge, learn new things, connect with your peers and build your career with us!

Why rinf.tech?

#EngineerOfTheFuture, #PeopleofManyTalents

  • At rinf.tech, you’ll encounter friendly people who are eager to explore and reinvent the world of technology.
  • We encourage ideas - we like to share and learn from each other. We’re all in for curious & ambitious people.

#GrowOpportunities

  • We continuously invest in developing core teams focused on technologies like Blockchain, AI, and IoT: www.rinf.tech/careers/core-blockchain-and-ai-teams/
  • Our Technical Management team has a robust technical background, and many of our team members have advanced to strategic roles through internal promotions.
  • In a spirit of mutual willingness to share and grow, our RINFers commit to a minimum tenure of 2.5 years on a project.

#EngineeringExcellence

  • Fail fast, learn fast: we experiment, we iterate, we know when to stop and we don't repeat the same mistakes.
  • The right technology stack for the right problem: we don't force technology choices just because we know them; our focus is on solving problems, not on pushing predefined stacks.

#Innovation

Why do we do what we do?

We inspire one another to share our tech-works in this amazing and abundant world. So we became developers, innovators, thinkers, software builders, and hardware makers!

Our Vision!

Founded in 2006 and now home to 650+ engineers with a global presence (8 delivery centers in Europe & North America), we strive to become a leading East-European technology partner for growing organizations in need of digital transformation of their products and services!

What you’ll do

  • Ensure the stability and reliability of cloud-native applications deployed on GCP, containerized with Docker and orchestrated via Kubernetes.
  • Define, implement, and monitor SLOs, SLAs, and SLIs to measure system performance and user experience.
  • Automate infrastructure provisioning using Terraform and manage Kubernetes configurations with Kustomize and Helm.
  • Develop and maintain monitoring and alerting systems using Datadog and GCP-native tools.
  • Conduct incident analysis and postmortems to drive continuous improvement.
  • Collaborate with development teams to integrate reliability practices into CI/CD pipelines using GitHub Actions.
  • Manage and troubleshoot database systems, particularly PostgreSQL and Cassandra.
  • Apply networking knowledge and Linux system administration skills to troubleshoot and optimize system connectivity and performance.

What you need to be successful

  • 5+ years of experience in Site Reliability Engineering.
  • Proven experience designing and operating elastic, resilient systems in cloud environments.
  • Strong understanding of GCP, Kubernetes, and container orchestration.
  • Proficiency in infrastructure as code and configuration management tools (Terraform, Helm, Kustomize).
  • Experience with monitoring and observability tools (Datadog, GCP Monitoring).
  • Solid scripting skills in Bash and familiarity with automation frameworks.
  • Experience with CI/CD pipelines, especially using GitHub Actions.
  • Familiarity with networking fundamentals and troubleshooting.
  • Strong coding skills and ability to develop reliability-focused tooling.
  • Excellent communication skills in English (written and spoken).
  • Familiarity with monitoring tools (e.g., Datadog, Prometheus, GCP Monitoring).

Next Steps for you!

  • Apply
  • CV screening
  • HR Interview
  • Technical Interview
  • Offer presented by our CEO

Meet us!

Let's meet! We invite you to drop by anytime for a tour of our office, without any commitment.

Join the #PeopleofManyTalents #EngineerOfTheFuture