About the Role
- Manage, monitor, and improve application reliability, scalability, and performance.
- Implement and maintain monitoring, alerting, and observability tools (Dynatrace, Kibana, CloudWatch).
- Troubleshoot production issues and drive root cause analysis (RCA) for incidents.
- Automate operational processes using scripting (Python, Shell, or similar).
- Collaborate with development and DevOps teams to improve CI/CD and infrastructure reliability.
- Ensure high system uptime through proactive performance tuning and incident management.
- Work with AWS services (EC2, ECS, EKS, Lambda, S3, CloudWatch, etc.) for deployment and monitoring.
- Participate in the on-call rotation and provide production support as required.
- Support Java and microservices-based environments, ensuring efficient scaling and health monitoring.
- Maintain documentation for SRE processes, runbooks, and automation workflows.
Skills Required
Monitoring Tools...
Ready to Apply?
Submit your application today and take the next step in your career journey with Pathfinders Global P Ltd.