Full-time

Software Engineer, Kernel Reliability

Posted by Cerebras • Remote, Canada

📍 Remote 🕒 May 26, 2026

About the Role

We're looking for a deeply technical, hands-on software engineer to join our on-field Kernel Reliability team. You'll help tackle a critical challenge: improving the reliability of our advanced compute clusters and the underlying inference, training, and internal production services. In this role, you'll work close to the code and design solutions that scale with our rapidly growing system production and software service offerings. If you have strong fundamentals in systems, debugging, and failure analysis, and you enjoy building tools and solving hard reliability problems, we want to hear from you. New college graduates are welcome.

Responsibilities

  • Contribute to the technical roadmap and execution for kernel-centric reliability of our internal and customer-facing systems.
  • Partner with System and Cluster Operations teams to reduce system and service downtime after failure through tooling, analysis, and...
