Exploring GenAI and machine learning in external quality assurance

The ENQA Workshop on the Responsible Use of Artificial Intelligence in Quality Assurance Agencies, hosted by the Estonian Quality Agency for Education (HAKA) in Tallinn on 11-12 June, provided a timely platform to share practices and engage with colleagues from other agencies on the integration of generative artificial intelligence (GenAI) and machine learning in quality assurance processes. ENQA will follow up on the outcomes of the workshop and publish a set of principles for quality assurance agencies. Two case studies are presented below, drawing on the experience of Quality and Qualifications Ireland (QQI) and the Catalan University Quality Assurance Agency (AQU Catalunya, Spain) in exploring the use of these tools.

GenAI pilot projects at QQI

By Marie Gould, Head of Tertiary Education Monitoring and Review, Quality and Qualifications Ireland (QQI)

Aligned with QQI’s Statement of Strategy 2025-2027, which commits to using GenAI where appropriate, two pilot projects aimed at enhancing QQI’s external quality assurance (EQA) processes have recently been implemented:

  • Supporting QA Evaluation Processes: Leveraging GenAI within a European Approach evaluation.
  • Thematic Analysis of EQA Outcomes: Utilising GenAI to support thematic analysis.

These projects have demonstrated the potential of GenAI to enhance efficiency and effectiveness in EQA processes.

Using a hybrid approach that combined GenAI (Microsoft Copilot) with manual, human-led methods, we successfully identified and organised key themes for a published Thematic Analysis on Reviews of Independent and Private HEIs. This pilot project highlighted GenAI’s strengths and limitations, as well as the challenges that EQA agencies face due to limited practice and experience. A notable challenge, for example, was describing the methodology transparently in the final thematic analysis. Sharing these insights with other agencies at the ENQA workshop prompted valuable discussion, including on the development of core principles for using GenAI in EQA.

QQI is at the early stages of exploring GenAI, and keeping up with the pace of technological advances is challenging. We are continuing with thematic analysis pilots and will be trying MAXQDA AI Tools in our next pilot. To ensure effective quality assurance, EQA must remain responsive and efficient.

Piloting machine learning to support quality assurance at AQU Catalunya

By Dani Torrents, Anna Prades, Sandra Nieto, Núria Mancho, and José Luis Mateos, Catalan University Quality Assurance Agency (AQU Catalunya, Spain)

AQU Catalunya is currently developing a strategy to integrate GenAI and machine learning techniques into its internal operations and QA activities. While still in its early stages, this strategy focuses on two main lines of action:

  • Establishing internal use criteria for freely available GenAI tools and exploring licensed options, with a strong focus on confidentiality, security, and privacy.
  • Exploring the potential of GenAI and machine learning tools to enhance QA processes, with support from field experts.

An example from the latter line of action was presented in Tallinn: a pilot using machine learning techniques to identify quality risks in higher education degree programmes.

The aim of the pilot was to evaluate whether these emerging technologies could assist evaluators in identifying higher education programmes at risk of not achieving the desired accreditation outcomes. Additionally, we explored whether machine learning could help us make better use of the large set of indicators that we collect—many of which often remain underutilised—and whether it could offer insights into the link between peer review processes and indicators. We also saw this as a valuable opportunity to assess how AI might help address KPI overload, a common issue where excessive indicators lead to diminished attention and clarity.

We trained several models using the accreditation outcomes of over 1,400 evaluations and their associated indicators. The results showed that the models performed well in identifying at-risk bachelor’s programmes, although accuracy was significantly lower for master’s programmes. In designing the model, we prioritised sensitivity to risk: the model was more likely to flag borderline programmes, even some well-performing ones, in order to minimise false negatives (at-risk programmes left unflagged).
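The trade-off described above, accepting more false alarms so that fewer at-risk programmes slip through, is typically achieved by lowering the classifier's decision threshold. The sketch below illustrates this with purely invented scores and labels (not AQU Catalunya's actual data or model); `sensitivity` is the standard recall measure on the "at risk" class.

```python
# Illustrative sketch of threshold tuning to prioritise sensitivity (recall
# on at-risk programmes). All scores and labels here are made up.

def sensitivity(scores, labels, threshold):
    """Fraction of truly at-risk programmes (label 1) flagged at this threshold."""
    flagged = [s >= threshold for s in scores]
    true_pos = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    return true_pos / sum(labels)

# Hypothetical model scores (estimated probability of "at risk") and outcomes.
scores = [0.92, 0.80, 0.61, 0.55, 0.40, 0.35, 0.20, 0.10]
labels = [1,    1,    1,    0,    1,    0,    0,    0]

# A conventional 0.5 cut-off misses the at-risk programme scored 0.40 ...
print(sensitivity(scores, labels, 0.50))  # 0.75
# ... while a lower, risk-averse threshold catches all at-risk programmes,
# at the cost of also flagging a borderline well-performing one (0.35).
print(sensitivity(scores, labels, 0.35))  # 1.0
```

The lower threshold trades precision for sensitivity, which matches the pilot's stated priority: a flagged programme still goes to expert review, whereas a missed one does not.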

This pilot yielded two important insights for QA in Catalonia and potentially elsewhere:

  • QA processes are inherently multidimensional and cannot be reduced solely to quantitative indicators.
  • Machine learning and AI techniques can provide valuable support in flagging risks, but they cannot replace expert panel judgment.
