About the Role
The Site Reliability Engineer (SRE) at iCareManager (iCM) is responsible for ensuring the reliability, scalability, performance, and availability of production systems. The role blends software engineering and operations, with a strong focus on automation, observability, incident management, and proactive reliability engineering.
SREs enable engineering teams to deliver features rapidly without compromising system stability, especially in regulated healthcare environments.
Key Responsibilities
1. Reliability Engineering
Define, implement, and enforce Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets
Design and review resilience patterns (redundancy, failover, graceful degradation)
Perform capacity planning, load modeling, and scalability analysis
Conduct chaos testing and failure injection to identify system weaknesses
Reduce Mean Time to Recovery (MT...
Ready to Apply?
Submit your application today and take the next step in your career journey with iCareManager, LLC.