Full-time

AI Inference Senior Engineer at NVIDIA

Posted by NVIDIA • Toronto, ON, Canada

📍 Toronto, ON 🕒 June 19, 2026

About the Role

Join NVIDIA as a Senior Engineer focused on high-efficiency AI inference systems. You will use cutting-edge GPU capabilities to optimize the performance of large-scale AI models.
In this senior role, you will apply deep software engineering expertise while driving advancements in AI frameworks. Your contributions will improve inference stacks, support research, and help developers make full use of GPU features.
Key Responsibilities:
• Develop advanced AI model features with vLLM
• Optimize GPU kernels and compilers for performance
• Create inference benchmarking strategies
• Oversee inference deployment orchestration
• Research and integrate novel ML ideas
Requirements:
• PhD or 7+ years of experience in AI or a related field
• Proficient in Python, C/C++, and performance-critical systems
• Knowledge of GPU programming and profiling
• Experience with Docker and Kubernetes orchestration
• Strong analytical and e...

Ready to Apply?

Submit your application today and take the next step in your career journey with NVIDIA.