[Evaluating with Purpose is a multi-post series that makes a case for development organizations to adopt better evaluation strategies and methods. Part 1: The Evaluation Charade unmasks the gap between what organizations think they are doing about evaluation and what most are really doing. Upcoming posts are Part 2: what makes a good impact evaluation, Part 3: what makes a good evaluation strategy, and Part 4: transformative evaluation.]
For someone whose job description is built around doing evaluation work, I can be remarkably hostile toward project evaluations. They tend to be expensive, distracting, and mostly useless. They can be surprisingly costly, often requiring projects to set aside 5-10% of the project funds that could otherwise have been budgeted to activities that further project objectives. They eat up a lot of staff time, not so much during the evaluation itself, but in the setup for it – designing indicators, measuring baselines, managing monitoring activities, etc. But most critically, very few ever accomplish the stated purpose of the evaluation – to provide rigorous evidence about whether the project was any good or not.
To be clear, there are different kinds of project evaluations and I’m not out to pillory them all. You can classify evaluation types by either purpose or method, but since form follows function, the basic typology is based on the evaluation’s purpose. Here are some of the main types.
- Needs Assessment: evaluations that look at a potential site or community to assess the needs that justify a project design. These often utilize third party data and community interaction to define the situation and build a case for directing a specific intervention toward a target population.
- Design Analysis: evaluations that assess the logic of the project design, asking if the theory of change makes any sense. They assess if there is a complete, coherent, causal chain from the project inputs and activities to the project objectives as well as the validity of the assumptions made in the chain’s links.
- Process Evaluation: evaluations that determine if the project was implemented as planned. Did the money get spent as budgeted, did the services get delivered as designed, did the benefits reach the right people, was the project participatory and culturally sustainable, did anything unexpected happen?
- Impact Evaluation: evaluations that look at whether or not the project produced the desired effects. The project happened and lives were changed – impact evaluation asks if these changes can be causally attributed to the project.
- Cost-Benefit Analysis: efficiency evaluations that compare different interventions and determine what each costs to produce an equivalent impact. In other words, they identify the biggest bang for the buck.
However, 95% of the time when people say, “We are going to evaluate a project”, what they’re talking about is impact evaluation.
At least that’s what they think they’re talking about.
95% of the time, what they then go on to do is not actually an impact evaluation, and this is where my big beef about project evaluations being expensive, distracting, and mostly useless comes in.
When someone decides to evaluate a project, what they generally have in mind is answering the question, “Was this a good project?” And by “good” they mean that it achieved the change objectives it was designed for. A literacy project was good if people learned to read; an economic development project was good if people’s wealth increased. Since only good projects are worth doing, repeating, or scaling up, organizations evaluate in order to learn what works and to demonstrate it to stakeholders. In sum, most project evaluations are done for learning and accountability.* These are noble reasons.
Learning: Organizations may want to know if an innovative project design or new intervention is effective. Perhaps they want to know if something that had worked well in one context also works well given new conditions in a different context. It could even be that they want to know if a new less-costly way to implement a project is as effective as the old expensive way. You can only learn these things with confidence if you carefully evaluate project impact.
Accountability: Most organizations also evaluate their projects because they have to. Donors and Boards generally require at least some of their projects to be evaluated. They want to see evidence that previous projects were good before they’ll agree to fund or support more of them. Accountability ties consequences to performance; bad projects get killed, good projects repeated or expanded.
At least this is what most organizations think they get out of their project evaluations.
Evaluation and Self-Deception
The truth is that most project evaluations fail to provide either learning or accountability. Here are three reasons why.
1: They don’t inform decisions.
Evaluation is an empty exercise if it can’t trigger change. In theory, evaluations are backward looking but forward informing. Nevertheless, the same stakeholders who demand project evaluations often refuse to be guided by the results. What leaders believe about a project can easily override what an evaluation report says about it, and the evaluator’s recommendations must compete with organizational constraints, personal loyalties, political considerations, and economic incentives that have at least as much sway in determining what happens next.
As a result, many evaluation reports fail to influence the decisions they were designed to inform. Evaluation reports get filed away quickly with little real reflection on their implications. There may be some perfunctory discussion and finger wagging around negative findings, but this is largely just a charade – everyone pretends the evaluation mattered a great deal when in fact it mattered not at all.
I think the root problem is that few stakeholders really believe the evaluation reports they get. The evidence in most just isn’t that compelling; often the findings are little more than opinions, and the professional recommendations more like personal suggestions. If the project manager is well liked and trusted, the evaluator’s take simply counts for less. The reports may be used to bolster a position already taken, but they’ll rarely sway someone off their position. They’re just too easy to disregard when they don’t say what you want them to say. The truth is, few evaluation reports have evidence too compelling to dismiss. And if this is the case, then maybe it’s quite appropriate that few evaluations actually inform decisions. But now I’m getting ahead of myself.
2: They aren’t assimilated.
Many development organizations simply don’t have the capacity to absorb the “lessons learned”, which ends up being a misnomer anyway since they aren’t really learned. Assimilation becomes particularly problematic in organizations with policies that require all projects to be evaluated, for two reasons. First, compliance replaces learning as the organization’s de facto evaluation objective. Organizations end up focusing more on ensuring the standards were adhered to, the templates used, and the deadlines hit than on figuring out how to achieve better development results.
Second, the returns on evaluation information diminish toward zero as the number of evaluated projects increases. Consider what a comprehensive evaluation policy implies for large organizations. By my own estimate, the largest of these can produce up to 1000 evaluation reports a year. How can any organization handle that quantity of information in a meaningful way? It’s like drinking water from a fire hose – most of it just sprays every which way but in. The marginal value of the nine-hundredth evaluation report pales in comparison to its cost. Of course, somewhere in all those reports are some really important evaluation findings that the organization should take to heart. But hearing those findings above the noise of all the others is nearly impossible, and so the organization fails to learn its own lessons “learned”.
3: They don’t answer the critical question.
Recall, the critical question of an impact evaluation is whether or not the project was good (i.e., did it achieve its outcome objectives?). We can assume that most development projects have a genuine intention to improve the lives of the beneficiaries; the problem is that there is a gap between the intention and the result. In Spanish they say “del dicho al hecho hay un trecho”: between saying and doing there is a long stretch. (Thanks to Dan Levy for that one.) In order to narrow this gap, you must first have a reliable measurement of the project’s impact. The problem is that most evaluations utilize methods that are too weak to produce a reliable measurement.
The most commonly used technique for estimating the impact of a project is to take measurements of the key indicators before the project was implemented and compare them to measurements of the same indicators afterwards. This pre and post technique may initially sound like a good idea, but it is profoundly flawed. In nearly every project you can imagine, there are myriad variables external to the project design that influence the results the project is trying to achieve. The effects of these other variables obscure the impact of the project such that you can mistakenly conclude that a good project is bad and a bad project good.
In order to isolate the impact of the project from the effects of these other variables, an evaluation must have a good estimate of the counterfactual – what would have happened to the beneficiaries had the project never happened. Obviously, the counterfactual can never be measured directly – that would require conducting an evaluation in two parallel universes. However, there are many evaluation techniques to estimate the counterfactual; some are more reliable than others, but all have one thing in common – a control group. A control group is a set of people that is similar to the group of project beneficiaries, but the control group doesn’t benefit from the project. I’ll go into detail about this in the next installment of this series “Evaluating with Purpose – Part 2: What Makes a Good Impact Evaluation.” But for now, the critical point is this: nearly all project “impact” evaluations fail to create a control group, which means they’re not really impact evaluations at all!
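To make the point concrete, here is a minimal sketch in Python with made-up numbers. It compares the pre and post estimate with one simple way of using a control group, commonly called difference-in-differences. That method is my choice for illustration only, and it assumes the control group would have followed the same trend as the beneficiaries had there been no project. Every figure and name below is hypothetical.

```python
# All figures are hypothetical, invented purely for illustration.
# Scenario: a literacy project runs during a period when some outside
# shock (say, flooding that closes schools) pushes literacy down everywhere.


def pre_post_change(before: float, after: float) -> float:
    """Naive estimate: how much the indicator changed in one group."""
    return after - before


def difference_in_differences(
    treated_before: float,
    treated_after: float,
    control_before: float,
    control_after: float,
) -> float:
    """Beneficiaries' change minus the control group's change.

    Assumes both groups would have followed the same trend
    if the project had never happened.
    """
    treated_change = pre_post_change(treated_before, treated_after)
    control_change = pre_post_change(control_before, control_after)
    return treated_change - control_change


# Hypothetical literacy rates, in percent of adults who can read.
beneficiaries = {"before": 50.0, "after": 45.0}
control = {"before": 50.0, "after": 35.0}

naive = pre_post_change(beneficiaries["before"], beneficiaries["after"])
adjusted = difference_in_differences(
    beneficiaries["before"],
    beneficiaries["after"],
    control["before"],
    control["after"],
)

print(f"Pre and post estimate:            {naive:+.1f} points")
print(f"Estimate using the control group: {adjusted:+.1f} points")
```

With these invented numbers, the pre and post comparison reports a 5-point drop and would brand the project a failure, while the comparison against the control group suggests the project kept literacy 10 points higher than it otherwise would have been.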
So what are they? And what question do they answer? Well, I don’t know what they are apart from a misnomer. But the question they answer is not what project implementers and stakeholders think they answer. They only answer the question of what happened, but they can’t tell you why. Now some organizations recognize this, but instead of changing their evaluation methods, they sophisticate their interpretation by drawing a distinction between “contribution” and “attribution.” But what does an organization learn by looking at the results and saying “We think we contributed to these”? Remember, a project can have a positive impact despite negative results and vice versa. And what kind of accountability is there in this equivocating stance? The organization can take credit for contributing to positive results while side-stepping responsibility where negative results occur.
The Great Pantomime
Development organizations spend millions of dollars every year – millions! – on evaluations that don’t answer the fundamental question of whether or not their projects are good. They utilize evaluation techniques that they know (or should know) can’t produce reliable measurements of impact. As a result, what they think they learn from project evaluations may be perniciously inaccurate. Good projects get killed, bad projects get scaled up. Likewise, what accountability they think they have is eroded to a charade – a vain gesture of professionalism and responsibility. If organizations were really concerned about learning and accountability, they wouldn’t allow such sloppy evaluation work to be done. They would demand a rigorous demonstration of impact. They would rethink the real cost of evaluation strategies that value quantity over quality. They would stop sophisticating and be more transparent about what they really get from their evaluations.
Stop the charade.
* There is a third purpose – transformation – which I’ll save for a future blog post.
Mask image by Simon Howden / FreeDigitalPhotos.net
Gauges images by Michal Marco / FreeDigitalPhotos.net