Senior Observability Engineer

Devsinc

Posted 5 hrs ago

Experience

6 - 11 Years

Education

Any Graduation

Nationality

Any Nationality

Gender

Not Mentioned

Vacancy

1 Vacancy

Job Description

Roles & Responsibilities

Key Responsibilities:

  • Own and evolve the end-to-end observability architecture across applications, infrastructure, and cloud environments
  • Centralize metrics, logs, traces, and events with high reliability and scalability
  • Design and enforce SLOs, SLIs, and error budgets for critical financial systems
  • Build advanced real-time dashboards and business-aligned KPIs for engineering and leadership
  • Develop intelligent alerting frameworks to minimize noise and enable faster incident resolution
  • Ensure observability pipelines are resilient, scalable, and cost-optimized
  • Collaborate with DevOps and engineering teams to implement instrumentation, distributed tracing, and logging standards
  • Integrate observability systems with incident management, on-call, and escalation workflows
  • Support compliance, audit, and forensic analysis through structured logging and traceability
  • Drive root cause analysis (RCA) and continuous improvement of system reliability
  • Automate monitoring, alerting, and data enrichment workflows

Core Expertise:

  • Observability pillars: metrics, logs, traces, events
  • Golden signals: latency, traffic, errors, saturation
  • SLO/SLI-driven reliability engineering
  • Alert design with high signal-to-noise ratio
  • Telemetry standardization and instrumentation strategies
  • Mapping technical metrics to financial/business KPIs

Desired Candidate Profile

  • 6 to 10 years of experience in Observability, SRE, or Monitoring Engineering roles
  • Mandatory experience in fintech, banking, or highly regulated environments
  • Strong hands-on expertise with:
    • Monitoring: Dynatrace, Prometheus, Grafana
    • Logging: Elastic Stack (ELK), Splunk, Fluent Bit, Logstash
    • Alerting & Correlation: Dynatrace, ELK, Splunk, Alertmanager
  • Proficiency in PromQL, SPL, and KQL for advanced log and metric analysis
  • Experience developing high-performance, scalable dashboards in Grafana and Kibana that integrate application, infrastructure, and business KPIs for end-to-end observability
  • Deep understanding of distributed systems observability and performance monitoring
  • Experience with high-throughput, low-latency systems
  • Experience with enterprise monitoring tools such as Riverbed and SolarWinds for network performance monitoring (NPM), application visibility, traffic analysis, and infrastructure health tracking across distributed systems

Preferred Qualifications and FinTech Alignment:

  • Proven experience supporting audit, compliance, and regulatory requirements within fintech, banking, or other regulated environments
  • Strong familiarity with industry frameworks such as:
    • PCI DSS
    • ISO 27001
    • SAMA / NCA
  • Solid understanding of data sensitivity, traceability, and audit logging standards for financial systems
  • Experience working on large-scale fintech or digital banking platforms
  • Exposure to CI/CD-integrated observability and DevSecOps practices
  • Proficiency in scripting and automation (Python, Bash)
  • Hands-on experience with incident management and on-call frameworks (e.g., PagerDuty, Opsgenie)

What We're Looking For:

  • A proactive engineer with a strong reliability and performance mindset
  • Ability to translate observability data into actionable insights
  • Experience working cross-functionally with SRE, DevOps, and product teams
  • Ownership-driven individual focused on continuous improvement of monitoring systems

Keywords

  • Senior Observability Engineer
