Full-time

AI Inference Systems Engineer

Posted by NVIDIA • Toronto, ON, Canada

📍 Toronto, ON 🕒 June 04, 2026

About the Role

Join NVIDIA as an AI Inference Systems Engineer to improve model efficiency and scalability. You will focus on optimizing GPU performance while collaborating with top-tier AI development teams.

In this role, you'll leverage your extensive programming background to build high-performance inference frameworks and tools, and to lead the optimization of GPU kernels and benchmarking methodologies. Your work will shape the future of ML systems and improve deployment across clouds.

Key Responsibilities:
• Enhance vLLM with the latest NVIDIA GPU features
• Benchmark and optimize GPU kernels for peak performance
• Contribute to industry-leading MLPerf Inference submissions
• Architect scheduling for large-scale GPU inference
• Push boundaries of ML research and system integration

Requirements:
• Master’s in CS/CE/SE with 5+ years of experience
• Strong skills in Python, C/C++, and ML frameworks
• Familiar with CUDA and GPU performance tools
• Experience in container ...

Ready to Apply?

Submit your application today and take the next step in your career journey with NVIDIA.