Full-time

Staff Research Engineer, Model Efficiency

Posted by Cohere • Montreal, Montreal (administrative region), Canada

📍 Montreal, Montreal (administrative region) 🕒 June 04, 2026

About the Role

Who are we?

Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.

Why this role?

Large Language Models (LLMs) continue to push the boundaries of what AI systems can do, but inference is still the bottleneck. The Model Efficiency team is responsible for pushing the limits of LLM inference efficiency across our foundation models. We explore and ship breakthroughs across the model execution stack, including:
model architecture and MoE routing optimization 
decoding and inference-time algorithm improvements 
software/hardware co-de...

Job Details

Location Montreal, Montreal (administrative region)
Job Type Full-time
Category Other-General
Posted June 04, 2026
Deadline July 14, 2026
