Full-time

AI Infrastructure Engineer

Posted by HCLTech • Noida, Uttar Pradesh, India

📍 Noida, Uttar Pradesh 🕒 March 01, 2026

About the Role

AI Infrastructure Engineer - L3
The Role
The AI Infrastructure Engineer (L3) provides advanced engineering and architectural expertise for high‑performance AI and ML infrastructure. This role focuses on building, optimizing, and scaling GPU/accelerator environments and distributed systems for large‑scale training and inference workloads.
Competency Focus: High‑performance computing (HPC), distributed systems, Kubernetes, GPU orchestration, cloud optimization
Keywords: NVIDIA GPU Infrastructure, Kubernetes, GPU Cluster Administrator, Infrastructure SME, RCA
Responsibilities:
Deploy, configure, and manage GPU and AI accelerator platforms (NVIDIA A100/H100/L40, AMD Instinct, TPU).
Troubleshoot GPU hardware and software issues, including failures, thermal throttling, PCIe/NVLink topology, and driver conflicts.
Install, upgrade, and maintain GPU software stacks, including drivers, CUDA, cuDNN, TensorRT, and firmware.
Perform capacity planning and resource optimiza...

Ready to Apply?

Submit your application today and take the next step in your career journey with HCLTech.
