Full-time

Program Manager

Posted by Yantran LLC • Austin, Texas, United States

📍 Austin, Texas 🕒 February 20, 2026

About the Role

We are looking for an experienced ML Engineer with a strong background in machine learning who is ready to take on the tasks and responsibilities of this position.

ML Infrastructure Performance Engineer

Focus:

This role focuses on the "serving plane." The engineer will integrate high-speed inference runtimes with streaming loaders and take ownership of the performance benchmarking mandate.

Key Responsibilities:

Integrate SGLang with the Run:ai Model Streamer to enable concurrent tensor streaming directly to GPU memory, reducing model "cold start" times.

Optimize SGLang's backend runtime, leveraging features like RadixAttention for prefix caching and compressed finite-state machines for faster decoding.

Design and execute rigorous performance benchmarking suites to...

Ready to Apply?

Submit your application today and take the next step in your career journey with Yantran LLC.
