Sr. Site Reliability Engineer (SRE), Cloud Incident Response
Posted by SS&C Technologies Holdings • Bangkok, Bangkok, Thailand
About the Role
Overall job purpose:
Be part of a global team that ensures the performance, scalability, and reliability of critical cloud-based applications. As part of the Global Investor and Distribution Solutions (GIDS) Platform Services team, you’ll play a key role in keeping our systems running smoothly and efficiently—while helping shape the future of our platform.
What You’ll Do:
Collaborate with global teams as part of a follow-the-sun support model.
Respond to, troubleshoot, and resolve Level 2 application incidents.
Ensure critical applications are effectively monitored using tools like Prometheus and Grafana.
Create and maintain dashboards and alerts to enhance visibility into application health.
Define, implement, and track key SRE metrics (SLOs, SLIs, error budgets).
Partner with development teams to improve ap...