Full-time

Program Manager

Posted by Yantran LLC • Austin, Texas, United States

📍 Austin, Texas 🕒 February 20, 2026


About the Role

We are looking for an experienced ML Engineer who is ready to take on this role. The candidate should have a strong background in ML and be able to handle the responsibilities of the position.
 
ML Infrastructure Performance Engineer
 
Focus:
 
This role focuses on the "serving plane." The engineer will integrate high-speed inference runtimes with streaming loaders and take ownership of the performance benchmarking mandate.
 
Key Responsibilities:
 
Integrate SGLang with the Run:ai Model Streamer to enable concurrent tensor streaming directly to GPU memory, reducing model "cold start" times.
 
Optimize SGLang's backend runtime, leveraging features like RadixAttention for prefix caching and compressed finite-state machines for faster decoding.
 
Design and execute rigorous performance benchmarking suites to...

Job Details

Location Austin, Texas
Job Type Full-time
Category other-general
Posted February 20, 2026
Deadline April 01, 2026

Ready to Apply?

Submit your application today and take the next step in your career journey with Yantran LLC.
