Jobs at RINF TECH

Site Reliability Engineer

What is the impact you want to make?

Unleash your biggest strengths, apply skills & knowledge, learn new things, connect with your peers and build your career with us!

Why rinf.tech?

#EngineerOfTheFuture, #PeopleofManyTalents

At rinf.tech, you’ll encounter friendly people who are eager to explore and reinvent the world of technology.
We encourage ideas - we like to share and learn from each other. We’re all in for curious & ambitious people.

#GrowOpportunities

We continuously invest in developing core teams focused on technologies like Blockchain, AI, and IoT - www.rinf.tech/careers/core-blockchain-and-ai-teams/
Our Technical Management team possesses a robust technical background. Many of our team members have advanced to strategic roles through internal promotions.
In a spirit of mutual willingness to share & grow, our RINFers commit to a minimum tenure of 2.5 years on a project.

#EngineeringExcellence

Fail fast, learn fast: we experiment, we iterate, we know when to stop and we don't repeat the same mistakes.
The right technology stack for the right problem: we don't force technology choices just because we know them; our focus is on solving problems, not on pushing predefined stacks.

#Innovation

Adapta Robotics is a successful spin-off born through an R&D project within rinf.tech: www.adaptarobotics.com/

Why do we do what we do?

We inspire one another to share our tech-works in this amazing and abundant world. So we became developers, innovators, thinkers, software builders, and hardware makers!

Our Vision!

Founded in 2006, we now have 650+ engineers and a global presence (8 delivery centers in Europe & North America). We strive to become a leading East-European technology partner for growing organizations in need of digital transformation of their products and services!

What you’ll do

Ensure the stability and reliability of cloud-native applications deployed on GCP, containerized with Docker and orchestrated via Kubernetes.
Define, implement, and monitor SLOs, SLAs, and SLIs to measure system performance and user experience.
Automate infrastructure provisioning using Terraform and manage Kubernetes configurations with Kustomize and Helm.
Develop and maintain monitoring and alerting systems using Datadog and GCP-native tools.
Conduct incident analysis and postmortems to drive continuous improvement.
Collaborate with development teams to integrate reliability practices into CI/CD pipelines using GitHub Actions.
Manage and troubleshoot database systems, particularly PostgreSQL and Cassandra.
Apply networking knowledge and Linux system administration skills to troubleshoot and optimize system connectivity and performance.

What you need to be successful

5+ years of experience in Site Reliability Engineering.
Proven experience designing and operating elastic, resilient systems in cloud environments.
Strong understanding of GCP, Kubernetes, and container orchestration.
Proficiency in infrastructure as code and configuration management tools (Terraform, Helm, Kustomize).
Experience with monitoring and observability tools (Datadog, GCP Monitoring).
Solid scripting skills in Bash and familiarity with automation frameworks.
Experience with CI/CD pipelines, especially using GitHub Actions.
Familiarity with networking fundamentals and troubleshooting.
Strong coding skills and ability to develop reliability-focused tooling.
Excellent communication skills in English (written and spoken).
Familiarity with monitoring tools (e.g., Datadog, Prometheus, GCP Monitoring).

Next Steps for you!

Apply
CV screening
HR Interview
Technical Interview
Offer presented by our CEO

Meet us!

Let's meet! We invite you to drop by anytime for a tour of our office, without any commitment.

Join the #PeopleofManyTalents #EngineerOfTheFuture