Full-time
Site Reliability Engineer
Posted by Krisvconsulting Services • Cyberjaya, Selangor, Malaysia
About the Role
Responsibilities:
- Ensure high availability and performance of systems
- Analyze performance metrics and resolve incidents (P0-P3)
- Participate in system design and set reliability goals
- Continuously optimize and innovate for better user experience
- Improve and maintain the full lifecycle of services, from development to deployment
- Provide observability, monitoring, and troubleshooting for distributed cloud systems
- Debug and automate tasks across OS, networking, databases, and applications
Requirements:
- Programming in Java, Python, or Go; scripting with Shell, Terraform, Ansible, Chef, or Puppet
- Strong understanding of Linux/Unix, containers, VMs, and cloud platforms
- Experience with DevOps processes and automation using SaltStack, Spinnaker, or StackStorm
- Experience with big data, chaos engineering, and auto-scaling container platforms
- Background in data science, cyb...
Ready to Apply?
Submit your application today and take the next step in your career journey with Krisvconsulting Services.