Full-time

Architect - AppDev

Posted by Red Hat • New Delhi, India

📍 New Delhi, India 🕒 February 23, 2026

**About The Job:**

We are seeking a visionary and hands-on Senior AI Technical Lead to spearhead our Generative AI initiatives. While many can build a prototype, you are the expert who can take it to production. This role focuses on the end-to-end lifecycle of GenAI: from high-performance inference hosting and automated MLOps pipelines to rigorous model benchmarking and safety guardrails.
You will lead a high-performing team to design systems that are not only intelligent but also scalable, cost-optimized, and ethically governed.

**What Will You Do:**

MLOps & High-Performance Inference

+ Inference Server Management: Architect and optimize model serving using high-throughput engines like vLLM, NVIDIA Triton Inference Server, or TGI (Text Generation Inference).
+ Scalable Hosting: Deploy and manage LLMs on Kubernetes (K8s), implementing auto-scaling based on concurrency and token throughput.
+ MLOps Pipelines: Build robust CI/CD/CT (Continuous ...
