Evaluation Archives

Towards Responsibility Evaluation of Generative Language Models

Sabrina Hartung, Marina Tropmann-Frick and Boštjan Brumen
Conference Paper
February 15, 2026 / May 17, 2026

An evaluation of the responsibility of generative AI models presents unique challenges that require holistic and practical solutions. This paper introduces an enhanced version of the VERIFAI framework, which extends beyond classification models to assess generative language models as well…

Bridging the Gap between Theory and Practice: Towards Responsible AI Evaluation

Sabrina Hartung and Marina Tropmann-Frick
Paper
September 1, 2023 / May 17, 2026

The growing integration of artificial intelligence (AI) in diverse sectors underscores the need for comprehensive and standardized approaches to ensure AI responsibility. However, the absence of a holistic framework to evaluate the fairness, privacy-preserving, secure, explainable, and human-centered facets of…