Much of what passes for “evidence” in learning evaluation persists not because it is strong, but because it is administratively convenient. In many organisations, evidence is selected not to inform design decisions but to satisfy assurance, compliance, or audit requirements, and those purposes quietly shape what counts as “acceptable” data.
Metrics such as satisfaction scores, completion rates, or brief post‑course surveys are frequently privileged because they are easy to gather, compare, and report, not because they meaningfully inform design decisions. More demanding forms of evidence, such as observation over time, contextual judgement, or engagement with learners’ actual practice, are often excluded precisely because they resist standardisation.
These are not neutral omissions. They reflect value judgements about what institutions are willing to see and support. Treating evidence as a design decision rather than a data collection problem makes these trade‑offs explicit, and forces a shift from asking what can be measured efficiently to asking what ought to be understood.
What data is collected should follow from what the programme is trying to understand, and that understanding is shaped by assessment design. Choices about what to capture, when, and from whom determine what learning can plausibly be evidenced.
Designing for Evidence
Decisions about what evidence to collect shape what can later be defended, challenged, or obscured, as well as what can be known about learning.
Effective data collection begins by clarifying what would count as meaningful evidence. This requires translating learning intentions into observable or interpretable indicators, while accepting that no single measure will be sufficient.
Where learning objectives remain vague (for example, framed around learners “understanding the importance of” a concept), the evidence available is inevitably weak. Assessments designed against such objectives often default to low‑value proxies that signal exposure rather than learning.
Some evaluation approaches intentionally separate learning design from measurement in an attempt to reduce confirmation bias, though this separation brings its own trade‑offs when evidence no longer reflects what learning was actually designed to change. The problem is not a lack of evaluation frameworks, but the quiet normalisation of evidence that is easy to defend rather than useful to learn from.
Meaningful evidence depends on objectives that make an intended change in knowledge, thinking, or practice explicit, and assessments that are capable of eliciting that change. When objectives and assessments are poorly aligned, data collection becomes an exercise in confirmation rather than inquiry.
Pre‑course surveys, knowledge checks, reflective prompts, performance observations, and follow‑up interviews all serve different purposes. Their usefulness depends not on sophistication, but on alignment with the questions being asked.
Surveys, Assessments, and Context
Surveys are flexible and efficient, but they are also highly sensitive to wording, timing, and context. The persistence of weak instruments is often less about ignorance than about risk management: surveys are attractive precisely because they are easy to administer, analyse, and defend.
Asking learners whether they feel confident, satisfied, or informed produces data, but it does not demonstrate competence or capability. Such questions capture perception framed by the prompt itself, shaped by social desirability, expectations, and the immediate learning context.
Assessments are often treated as more objective alternatives, yet they too embed strong design assumptions. What an assessment makes visible depends on what it asks learners to do, under what conditions, and using which criteria. Many commonly used recall and recognition tests privilege short‑term retention over judgement or application, while scenario‑based tasks, simulations, and applied activities surface judgement, decision‑making, and transfer.
Neither is inherently superior, but each constrains the kind of evidence produced. Treating surveys and assessments as neutral instruments obscures the fact that both actively shape responses. Designing for evidence therefore requires explicit consideration of what each method invites learners to reveal, and what it systematically leaves unobserved.
Designing for Use, Not Volume
Collecting more data does not improve evaluation. It often obscures it.
Data that cannot be acted upon is noise.
Effective collection focuses on sufficiency rather than completeness: enough evidence to support informed judgement, gathered with minimal friction for learners and educators. This often means fewer instruments, not more.
The Role of Timing
When evidence is gathered matters as much as what is gathered. Immediate post‑course feedback captures reaction, not impact. Behavioural evidence collected too early may miss transfer entirely.
Designing collection points across time recognises learning as a process rather than an event. It also requires realistic expectations about what can be observed and when.
Data collection is not neutral infrastructure. It is part of the learning design.