Full-time

Lead Inference Platform Support Engineer - AI I

Posted by Refinitiv • Toronto, ON, Canada

📍 Toronto, ON 🕒 May 29, 2026

About the Role

* Optimize LLMs and ML models for high-performance inference using techniques such as quantization, pruning, distillation, and hardware-specific tuning
* Deploy and scale inference workloads on GPUs across AWS, Azure, GCP, and internal Kubernetes clusters, ensuring predictable performance during peak traffic, especially during business hours
* Implement routing and failover strategies for OpenAI/Anthropic/Vertex AI traffic
* Integrate models into production-grade APIs supporting TR products and enterprise workflows
* Develop highly optimized environments and eliminate performance bottlenecks to reduce latency
* Collaborate with Platform Engineering teams (Landing Zones, Network, Storage, Compute, AI) to ensure inference workloads align with TR’s cloud-native patterns (AWS, Azure, GCP, OCI)
* Build and optimize containerized inference pipelines using Kubernetes for large-scale distributed workloads
* Ensure compliance with TR’s AI standards for d...

Ready to Apply?

Submit your application today and take the next step in your career journey with Refinitiv.
