Site Reliability Engineer

SITA

Posted 30+ days ago

Experience

4 - 9 Years

Job Location

Cairo - Egypt

Education

Bachelor of Technology/Engineering

Nationality

Any Nationality

Gender

Not Mentioned

Vacancy

1 Vacancy

Job Description

Roles & Responsibilities

ABOUT THE ROLE & TEAM
The Site Reliability Engineer is responsible for the proactive support of products to ensure high product performance, with a continuous focus on improvement. The role involves identifying and resolving the root causes of operational incidents, implementing solutions to enhance stability, and preventing recurrence. The Site Reliability Engineer manages the creation and maintenance of the event catalogue to trigger events and develops both manual remediation approaches and automated workflows to address alerts. Additionally, they oversee the deployment of IT services and solutions, ensuring seamless integration with minimal disruption.

WHAT YOU'LL DO
  • Design, build, and maintain support systems to ensure high availability, scalability, and performance of critical infrastructure.
  • Lead incident response and root cause analysis for system failures, including problem investigations and coordination with relevant teams.
  • Implement and manage automation for system provisioning, deployment, self-healing, and performance monitoring to increase operational efficiency.
  • Establish and monitor SLIs/SLOs, proactively identify performance issues, and drive continuous improvements in service reliability.
  • Collaborate with development and operations teams to embed reliability best practices and evolve toward zero-downtime architecture.
  • Manage and optimize an event catalog, including event definitions, thresholds, remediation actions, and relevance across products.
  • Develop event response protocols, provide training, and ensure efficient handling of incidents across teams.
  • Drive post-incident reviews and feedback loops to enhance event definitions and service reliability.
  • Oversee quality and readiness of deployments, ensuring clear processes, assigned responsibilities, and minimal disruption.
  • Maintain deployment schedules and conduct risk assessments to ensure operational stability and deployment readiness.
  • Coordinate and execute deployment plans, manage resources, and incorporate feedback for continuous process improvement.
  • Manage CI/CD pipelines and infrastructure as code, ensuring seamless integration between development and operations.
  • Support and evolve DevOps practices, automating operational tasks and maintaining tools to drive ongoing efficiency.

Qualifications

ABOUT YOUR SKILLS
  • Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field.
  • Several years of experience in IT operations, service management, or infrastructure management, including roles such as Site Reliability Engineer, Problem Manager, or DevOps Manager.
  • Proven experience in managing high-availability systems and ensuring operational reliability.
  • Extensive experience in root cause analysis (RCA), incident management, and developing permanent solutions for recurring service disruptions.
  • Hands-on experience with CI/CD pipelines, automation, system performance monitoring, and the implementation of infrastructure as code.
  • Strong background in collaborating with cross-functional teams (development, operations, engineering, etc.) to improve operational processes and service delivery.
  • Experience in managing deployments, risk assessments, and optimizing event and problem management processes.
  • Familiarity with cloud technologies, containerization, and scalable architecture, including experience with zero-downtime deployment strategies.

NICE-TO-HAVE
  • Master's degree or professional certifications in Service Management, ITIL, or related fields.

Keywords

  • Site Reliability Engineer
