Full-time

Site Reliability Engineer, Observability

Posted by Confidential • Toronto, ON, Canada

📍 Toronto, ON 🕒 May 24, 2026

About the Role

Role Overview

This role is eligible for our hybrid work model: two days in-office. As a Site Reliability Engineer – Observability, you will play a key part in maturing our observability capabilities by standardizing instrumentation, improving telemetry quality, and enabling faster root cause analysis that directly impacts MTTR and MTTD.

Responsibilities

  • Support and evolve end-to-end observability solutions for collecting, shipping, storing, and querying OpenTelemetry signals (metrics, logs, and traces) across infrastructure, containers, and Kubernetes environments.
  • Administer and operate core observability platforms (Splunk, New Relic, ClickHouse, Grafana, Lightrun), including service onboarding, access management, configuration, upgrades, and ongoing platform health.
  • Contribute to building and advancing a modern OpenTelemetry-based observability ecosystem that supports multiple telemetry types at scale.
  • Improve and standar...
