Full-time

Deployment Engineer, AI Inference

Posted by Cerebras Systems Inc. • Toronto, ON, Canada

🕒 February 19, 2026

About Cerebras Systems

Cerebras Systems builds the world's largest AI chip, 56 times larger than the largest GPU. Our novel wafer‑scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of one device. This enables industry‑leading training and inference speeds and lets machine learning users run large‑scale ML applications without the hassle of managing hundreds of GPUs or TPUs.

Our customers include global corporations, national labs, and top‑tier healthcare systems. In 2024, we launched Cerebras Inference, the fastest generative AI inference solution, over 10 times faster than GPU‑based hyperscale cloud inference services.

About the Role

We are seeking a highly skilled Deployment Engineer to build and operate cutting‑edge inference clusters on the world’s largest computer chip, the Wafer‑Scale Engine (WSE). You will play a critical role in ensuring reliable, efficient, and scalable deployment of...

Ready to Apply?

Submit your application today and take the next step in your career journey with Cerebras Systems Inc.
