Experience
5 - 7 Years
Job Location
Education
Bachelor of Science (Computers)
Nationality
Any Nationality
Gender
Not Mentioned
Vacancy
1 Vacancy
Job Description
Roles & Responsibilities
- Lead and manage a team of Site Reliability Engineers, providing guidance, coaching, and support.
- Develop and implement best practices for system reliability, performance, and scalability.
- Collaborate with cross-functional teams, including development, operations, and infrastructure teams, to ensure the smooth operation of services.
- Define and track key performance indicators (KPIs) to measure and monitor system reliability and performance.
- Establish incident response processes and lead the team in resolving critical incidents in a timely manner.
- Conduct post-incident reviews and implement preventive measures to avoid future incidents.
- Manage and prioritize multiple projects and tasks, ensuring adherence to timelines and deliverables.
- Collaborate with vendors and service providers to ensure the availability and reliability of external systems.
- Drive continuous improvement initiatives, identifying opportunities for automation, process optimization, and efficiency gains.
- Stay updated with emerging technologies and industry best practices to drive innovation and enhance system reliability.
Desired Candidate Profile
- Bachelor's or Master's degree in computer science, information technology, or a related field (or equivalent experience).
- Professional certifications, such as AWS Certified DevOps Engineer or Google Cloud Professional DevOps Engineer, are a plus.
- Proven experience in a Site Reliability Engineering role, including experience in managing and leading a team.
- In-depth knowledge of system architecture, networking, and infrastructure components.
- Experience with configuration management tools, such as Ansible, Puppet, or Chef.
- Strong leadership and communication skills, with the ability to collaborate effectively with cross-functional teams and stakeholders at all levels.
- Expertise in ArcGIS Enterprise, ArcGIS Server, ArcGIS Portal, ArcSDE.
- Strong experience with FME Desktop & FME Server (automation workflows, data modeling, attribute mapping).
- Oracle Server administration and performance tuning for GIS applications.
- Geodatabase design, management, replication, and backup.
- Development and deployment of enterprise geospatial databases and server software.
- Support and service delivery management in multi-location environments.
- Experience in implementing technical support operations and automation of internal processes.
- Strong understanding of cloud platforms, such as AWS, Azure, or Google Cloud.
- Proficiency in scripting and automation, using languages such as Python, Bash, or PowerShell.
- Familiarity with monitoring and logging tools, such as Nagios, Prometheus, or ELK stack.
- Excellent problem-solving and troubleshooting skills, with the ability to analyze complex issues and provide effective solutions.
Company Industry
- IT - Software Services
Department / Functional Area
- IT Software
Keywords
- Manager - Site Reliability
Group 42
ABOUT US
For more information, visit www.space42.ai and follow us on X and Instagram @Space42ai.
Space42 (ADX: SPACE42) is a UAE-based AI-powered SpaceTech company that integrates satellite communications, geospatial analytics and artificial intelligence capabilities to enlighten the Earth from space. Established in 2024 following the successful merger between Bayanat and Yahsat, Space42's global reach allows it to address the rapidly evolving needs of its customers in governments, enterprises, and communities.
Our vision is to pioneer beyond today for humanity to experience a better tomorrow. Space42 challenges traditional approaches with advanced AI and cutting-edge satellite technology, making space more accessible and redefining how data from space can be used on Earth. We aim to achieve this by connecting people to rewire potential, informing decisions to reimagine impact and enabling action to redefine tomorrow.
ROLE SUMMARY
As a Site Reliability Manager, you will oversee the operations and reliability of the organization's IT infrastructure and systems. You will lead a team of Site Reliability Engineers (SREs) and collaborate with cross-functional teams to ensure the availability, performance, and scalability of the organization's services. Your responsibilities will include developing and implementing best practices, managing incident response and resolution, and driving continuous improvement initiatives. This role requires strong leadership skills, technical expertise, and the ability to manage and prioritize multiple projects and tasks effectively.