Commentary

AI-supported and human-led: Rethinking humanitarian evaluation systems

Yellow sticky note with 'AI' text at center
Photo: Shutterstock | Karina Mate

Elias Sagmeister’s recent ALNAP piece asks a question many evaluators are asking: if Artificial Intelligence (AI) is changing other sectors, why do humanitarian evaluations still look so familiar? I agree with his central point. AI should push us to rethink the evaluation model, not just make the old model faster. The real question is not whether AI can reduce costs or speed up analysis. It is whether AI can help make humanitarian evaluations more useful, more ethical and more accountable to people affected by crisis.

What AI can and cannot do

AI can take on many labour- and resource-intensive tasks that take up too much evaluator time: desk reviews, document synthesis, transcript coding, translation and routine analysis. Used responsibly, this is not a shortcut. It can be good evaluation practice. The purpose, though, should be clear. AI should free evaluators to spend more time on work that needs human judgement.

The harder question is what AI cannot reliably do. In crisis-affected settings, evidence is not only found in words. It is also found in silence, hesitation, body language, fear, trust and social relationships. In my evaluation work across Ethiopia, the Horn of Africa and the wider East Africa region, I have seen how people often communicate carefully, indirectly or selectively. A displaced woman may avoid naming a protection concern in a mixed group. A young person may withhold criticism if local authorities, elders or aid providers are nearby. These are not simply “data gaps”. They are part of the evidence.

An AI-moderated interview may capture the spoken answer, translate it and summarise it neatly. But without human contextual judgement, it may miss why the answer was given in that way, what risks shaped it, and what has been left unsaid. This is why human-led qualitative inquiry remains essential. Trust-based engagement is not just a method. In many humanitarian settings, it is an ethical requirement.

The ethical non-negotiable

Other sectors are not just automating old tasks; they are redesigning workflows and feedback loops. In health, AI is helping redesign diagnostic pathways; in agriculture, it supports climate advice for farmers; in social protection, it improves targeting and delivery.

Humanitarian evaluation can learn from other sectors, but carefully. We often work with people facing displacement, conflict, exclusion and protection risks. In these contexts, mismanaged information can cause harm. UNEG's ethical principles for harnessing AI in evaluations remind us that AI use must be ethical, valid, inclusive and rights-based.

Many evaluations are still designed mainly for upward accountability to donors. They produce reports and recommendations but not always learning or change. As ALNAP’s latest evaluation guide reminds us, the value of evaluation lies not in the report itself, but in whether it improves decision-making, learning and performance. If AI only helps us produce the same underused reports more quickly, we have missed the point.

Decentralise the design, not just the tools

If AI is introduced badly, it may recentralise evaluation power in global headquarters, international firms and English-language systems. That would be a step backward. National and local evaluators should be involved from the start, as co-designers, analysts and interpreters, not only as enumerators. AI tools must also be secure, multilingual and accessible to researchers working in languages such as Amharic, Somali, Oromiffa, Yoruba, Arabic, French and Hausa.

Future terms of reference should state which AI tools may be used, what data must not be uploaded, how outputs will be verified, and how AI use will be disclosed to stakeholders.

Responsible AI should become a standard part of evaluation procurement, inception reporting and quality assurance.

Humanitarian evaluation must be AI-supported and human-led. Otherwise, it risks being neither.

Author's bio

Dr. Amha Ermias Teferi is CEO and Lead Consultant of SEGEL Consulting. He is a senior evaluator, MEAL and accountability specialist, and CHS-approved and Sphere-accredited trainer, with over 18 years of experience across humanitarian, development and peace contexts in Ethiopia, the Horn of Africa and wider African settings. His current work spans evaluation systems, organisational learning, quality and accountability, and responsible AI use in evaluation and governance.