Senior Site Reliability Engineer (SRE) Salla

Posted 30+ days ago

Experience

8 - 13 Years

Job Location

Saudi Arabia - Saudi Arabia

Education

Bachelor of Science (Computers)

Nationality

Any Nationality

Gender

Not Mentioned

Vacancy

1 Vacancy

Job Description

Roles & Responsibilities

You'll be hands-on with Kubernetes, observability, GitOps, automation, and cloud infrastructure, while partnering closely with application, platform, and data teams to deliver a highly reliable and self-healing environment.

Design, deploy, monitor, and maintain production workloads across Kubernetes (EKS/AKS/GKE) clusters.

Build self-healing, auto-scaling systems that minimize toil and manual intervention.
Optimize networking, ingress/egress traffic control, and service mesh for secure & performant communication.
Design and operate reliable database and storage platforms (SQL, NoSQL, and object stores) in Kubernetes environments.
Own backup, disaster recovery, replication, and failover strategies to meet RPO/RTO targets for critical data services.
Optimize storage performance and cost through multi-tier strategies, hot/cold data separation, and S3/offloading lifecycle policies.
Troubleshoot and recover Kubernetes Persistent Volumes confidently during incidents (StorageClasses, CSI drivers, PVC issues).
Secure and scale object storage platforms (e.g., MinIO/S3-compatible) and integrate with workloads for high-throughput data pipelines.
Work with block storage (EBS/io2/gp3) and shared file systems (EFS, NFS) to balance performance, resiliency, and cost.

Automation & Delivery

Champion GitOps and CI/CD best practices (ArgoCD, Flux, GitHub Actions).
Build automation for infrastructure provisioning and upgrades using Terraform, Helm, and Kubernetes Operators.
Reduce release risk through progressive delivery strategies (blue/green, canary, spot instance rolling updates).

Observability & Incident Response

Own the monitoring and alerting stack (Prometheus, Grafana, Loki, VictoriaMetrics, OpenSearch).
Lead incident management and postmortems to prevent recurrence.
Provide real-time visibility into system health, performance, and cost metrics.

Security & Compliance

Implement least-privilege IAM policies, secure service-to-service communication, and network ACLs/firewalls.
Enforce Kubernetes RBAC, secret management, and secure image supply chain.
Participate in audit readiness and compliance efforts.

Performance & Cost Optimization

Analyze and tune system performance under scale (CPU/memory/IO).
Partner with product and platform teams to right-size clusters, databases, and storage tiers.

Introduce cost visibility dashboards for engineering leadership.

Desired Candidate Profile

Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent work experience.

8+ years in SRE / DevOps / Infrastructure Engineering roles.

Deep Kubernetes expertise (multi-cluster, Helm chart development, advanced networking).

Strong experience with GitOps workflows using ArgoCD/Flux.

Expertise with AWS (preferred) or Azure/GCP, plus Infrastructure-as-Code (Terraform, Pulumi, CloudFormation).

Advanced knowledge of SQL & NoSQL databases (MySQL/Aurora, PostgreSQL, MongoDB, Redis).

Scripting/automation skills in Python, Bash, or Go.

Solid background in monitoring/observability (Prometheus, Grafana, Loki, ELK/OpenSearch, VictoriaMetrics).

Experience with CI/CD at scale and managing production incidents.

Experience with streaming/messaging (Kafka, RabbitMQ, or similar).

Strong communication skills and a team player able to collaborate across engineering, DevOps, security, and product teams.

Company Industry

Internet
E-commerce
Dotcom

Department / Functional Area

IT Software

Keywords

Senior Site Reliability Engineer (SRE)

Salla

https://apply.workable.com/salla/j/CC7B11DB9D/
