Employee

Senior Service Reliability Engineer

Posted by ThoughtWorks • Singapore

📍 Singapore, Singapore 🕒 June 21, 2026

About the Role

Job responsibilities

  • You will improve site reliability by building mechanisms and architectures that enable fault tolerance and shorten median time to detect and median time to respond.

  • You will drive the integration of observability automation into the CI/CD pipeline.

  • You will handle production incidents, manage incident communication with clients, and draft root cause analysis documents.

  • You will monitor the performance of production systems and improve their scaling so that business goals are met within the expected SLA and SLO targets.

  • You will work closely with application development teams, advising on system reliability and assisting with the implementation of reliability improvements.

  • You will improve system observability across multiple facets, such as logging and metrics, reducing false alarms to eliminate unnecessary toil and improve process efficiency.

  • You will implement chaos engin...