Full-time

Cybersecurity Benchmark Engineer

Posted by Pilotcrew AI • Remote, India

📍 Remote 🕒 June 06, 2026

About the Role

Cybersecurity Benchmark Engineer  
Location: Remote  
Company: Pilotcrew AI  
Type: Contract (monthly basis)  
Experience: 3+ yrs  

 
About Pilotcrew AI  
Pilotcrew AI builds infrastructure for AI Agent Evaluation. We benchmark large language models, run automated agent evaluations, power human-in-the-loop assessments, and host AI arenas for competitive testing. Our mission is to make AI agents measurable, reliable, and production-ready through structured, scalable evaluation systems.  

 
Role Overview  
We are building a large-scale benchmark for evaluating the cybersecurity capabilities of frontier LLMs. To grow this benchmark, we need hands-on security engineers who can craft real-world vulnerability tasks that are genuinely difficult for state-of-the-art LLMs and agentic systems.  
Your core output: carefully designed benchmark instances  real s...
                

Job Details

Location: Remote

Job Type: Full-time

Category: C++, automation, bash, ctf, cybersecurity, data, docker, engineering, git, infrastructure, mobile, oop, porting, python, red, search, sed, startup, test

Posted: June 06, 2026

Deadline: July 16, 2026
