Site Reliability Engineer

Posted 1 May by Context Recruitment Limited

Site Reliability Engineer / DevOps Engineer (Azure)

Opportunity to join one of the top UK insurers, which is on a mission to become the leading 'digital first' insurer in the UK. As a Site Reliability Engineer, you will be the backbone of their Azure environment, ensuring its **scalability, reliability, and operational excellence**. You will work closely with cross-functional teams to build and maintain a robust infrastructure that supports their dynamic needs.

Key Responsibilities:

  • Assume responsibility for the observability suite, encompassing tools for monitoring, logging, and alerting, to guarantee a thorough and integrated understanding of system functionality and health.
  • Set up and oversee APM tools like Dynatrace or New Relic, leveraging their features to effectively monitor application performance and resolve problems.
  • Employ extensive DevOps expertise to establish and uphold infrastructure as code (IaC) methodologies, streamlining the processes of deployment, scaling, and management through automation.
  • Actively track and pinpoint issues related to performance and reliability in APIs and applications, and devise strategies to address these concerns.
  • Work in tandem with development teams to fine-tune application performance, enhance the efficiency of resource use, and improve scalability.
  • Develop and sustain comprehensive incident response and review protocols to reduce system downtime and avert the repetition of problems.
  • Propel ongoing enhancement efforts to boost the dependability, scalability, and operational efficiency of Ageas' infrastructure and services, staying ahead of client expectations.
  • Engage in the on-call schedule, offering support for resolving incidents and conducting necessary troubleshooting.

Qualifications:

  • Experience in a DevOps / Site Reliability Engineer (SRE) position, dedicated to ensuring the high availability, reliability, and scalability of live systems.
  • Proficient in observability tools like Prometheus, ELK stack, Grafana, and Azure Monitor, capable of fully managing the suite for optimal system oversight.
  • Skilled in operating APM tools such as Dynatrace or New Relic, with a track record of using these tools to effectively monitor and enhance application performance.
  • A thorough grasp of DevOps methodologies, including the use of Terraform for infrastructure as code (IaC), and expertise in automated deployment and configuration management.
  • Hands-on experience with programming environments such as Node.js, Java, and various JavaScript frameworks.
  • Familiarity with cloud platforms, especially Azure, and adept at administering cloud-based infrastructures.
  • Demonstrated ability to anticipate and rectify issues impacting the performance and reliability of APIs and applications.
  • Excellent teamwork and communication abilities, ensuring productive collaboration with diverse functional groups.

Remote based.

Paying up to 75k, depending on experience.

Reference: 52561311