Permanent contract (CDI)
AI Evaluation Engineer
Posted by Openchip And Software Technologies SL • Ghent, Flanders, Belgium
The Role:
We are seeking an exceptional AI Evaluation Engineer to design, implement, and scale frameworks for assessing the performance, reliability, and trustworthiness of advanced AI systems. This individual will be responsible for developing methodologies and tools to measure model quality across diverse dimensions, such as accuracy, robustness, reasoning, safety, and efficiency.
Key Responsibilities:
- Design and Develop Evaluation Frameworks: Create scalable, reproducible evaluation pipelines for large-scale AI systems, including LLMs and multi-agent architectures, covering both automated and human-in-the-loop testing strategies.
- Metric Innovation: Define and implement novel evaluation metrics that capture model capabilities beyond traditional benchmarks.
- Benchmarking & Performance Analysis: Benchmark AI models across domains, tasks, modalities, anal...