What could the politician at the party claim credit for?

I was at a party the other night talking with a group of people about what I do in the outcomes area. The normal reaction I get when I tell them that I’m a psychologist is straightforward. However, when I tell them that I’m an outcomes strategist I usually get the following reaction – they look at me, gesticulate, roll their eyes and say, ‘Oh, it’s so hard to prove that what you did changed high-level outcomes’. Of course, this is what happens in the capital city where I work because just about everyone here is either a policy wonk, or in a relationship with one. And we all know that the whole international wonkery is obsessed with measuring outcomes.

In the rest of the country I usually get blank stares and people tend to quickly move on to the next guest to talk about something that makes sense. But sometimes I get people who just don’t perceive that there’s any problem to be solved in measuring outcomes. It’s always a little disturbing to have someone implying that there’s no real basis for a whole area of work you’re involved in. I got this some time ago from a taxi driver on the way to an evaluation conference. I got it again the other night at the party.

A guest, who I later found out was a local government politician, heard me talking about being an outcomes strategist. He launched into something along the lines of: ‘I would have thought it was very easy, just measure the dollars’. Initially presuming he worked in the private sector, I gave my usual spiel about the private sector and outcomes. In comparison to the public sector, the private sector has the huge advantage that its outcomes are always measured (well, the ones that people mostly focus on) and the measure is a common one (the dollar) which is used right across the entire sector, regardless of the type of work people are involved in. There’s also some more complicated stuff about the private sector tending to have a more relaxed attitude towards attribution (proving exactly what caused what) than the public sector. I’ll blog about that second part sometime in the future.

When I introduced the point that non-financial outcomes, rather than financial outcomes, are at the heart of what’s done in the public sector, he then said something like: ‘well you just measure all that in surveys’. He thought that the whole problem of outcomes was simply solved by tracking outcomes over time. I pointed out that whether things were getting better in the district where he was in charge said nothing about whether this was caused by his work. Things might be getting better in every city in the world because of general positive trends affecting everyone.

Up until this point, in my view, he was simply committing the basic outcomes fallacy of thinking that measuring a not-necessarily controllable indicator somehow shows that one has improved it. (See Duignan’s Six Types of Evidence That a Program Works diagram.)
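To make the fallacy concrete, here is a minimal sketch in Python. The survey scores, the comparison group and the variable names are all made up for illustration; nothing here comes from the politician's actual district. It simply separates the raw change in an indicator from the change that would have been seen anyway under a general trend.

```python
# Hypothetical survey scores (0-100); all numbers are invented for illustration.
district = {"before": 62.0, "after": 70.0}      # the politician's district
comparison = {"before": 60.0, "after": 67.0}    # similar districts elsewhere


def change(scores):
    """Change in the indicator between the two survey rounds."""
    return scores["after"] - scores["before"]


raw_change = change(district)               # what 'tracking over time' shows
general_trend = change(comparison)          # what happened everywhere anyway
excess_change = raw_change - general_trend  # what is left once the trend is removed

print(f"Raw change in district:   {raw_change:+.1f}")
print(f"Change in comparison set: {general_trend:+.1f}")
print(f"Change beyond the trend:  {excess_change:+.1f}")
```

With these invented numbers most of the improvement disappears once the general trend is taken out. Even the remainder is only suggestive: it would take a credible impact evaluation design to attribute it to anyone's work.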

When I told him as politely as I could that I thought he was not actually proving anything about what he was personally making happen, he introduced a more sophisticated argument which cannot be dismissed so easily. This argument was that he ‘hears from the people all the time’ and that he gets feedback from the different encounters he has with the people who live in his district. He also added that ultimately they would tell him if he wasn’t doing a good job.

Our conversation got interrupted about this time so I didn’t get to continue talking to him. However, thinking in formal outcomes theory terms, in this second part of the conversation, he could have been making two somewhat different arguments. One is that his immersion in the nitty-gritty of working with the people in his district brought him into direct contact with the lower levels of the outcomes model he was seeking to achieve (the model of the steps needed to achieve high-level outcomes – which can be operationalized in the form of a visual DoView). Being able to directly ‘see’ that the lower-level steps were being put in place (e.g. new environmental regulations), and having a sound logic of the intervention at hand (environmental regulation leading to a better environment), plus a measure that environmental issues were improving, he could reasonably claim that he had established he was having an impact. In Duignan’s Types of Impact Evaluation Designs, this is the seventh type of design: Intervention Logic (Program Theory/Theory of Change) Based Designs. It can be accepted as a credible impact design by stakeholders in some situations. Of course there’s always the question of who the observer is who is making the claim that lower-level steps have been achieved. But presumably we could get some independent assessment as to whether the lower-level steps were, as he was claiming, happening, so the logic of the design makes theoretical sense as a way of attempting to prove impact.

An alternative argument he could have been mounting, if he wanted to be very pragmatic, is that the fact that he keeps getting re-elected is what ‘hearing from the people all the time’ means in practice. Looking at it this way, he would be defining his outcome not as changing things in his community (which he may well wish to do) but simply as getting re-elected. If this is the case, then the fact that he is regularly re-elected means that, by definition, he is achieving his ‘outcome’. And this outcome could be translated into something like ‘keeping the people satisfied’. The argument then would be that keeping the people satisfied was the best way of achieving outcomes for the community within a democracy. I think that this is an example of pulling the ‘outcome’ you’re trying to show you changed back down the outcomes model to some lower level where it’s easier to prove attribution.

So while, in my view, his initial claims about it being easy to figure out what is causing outcomes were weak and did not establish anything actually about him having an effect on outcomes, his second round of argument had more substance to it.

Want to know more? http://About.Me/PaulDuignan

Are expert and key informant judgment evaluation designs types of ‘impact evaluation’?

Up on the American Evaluation Association Linkedin group, I’ve started a discussion about the range of evaluation designs which can be regarded as impact evaluation designs.

I have a typology of seven major impact evaluation design types used in Duignan’s Impact Evaluation Feasibility Check. http://doview.com/plan/impact.html.

At least two of those design types – expert judgment and key informant judgment design types – are not seen by some as being appropriate to be called ‘impact evaluation’ designs. Some want to restrict the definition of impact evaluation designs to types such as Randomized Controlled Trials. Key informant designs are where groups of people ‘in the know’ about a program are asked questions about the program.

My definition of an impact evaluation design is one where someone is making a claim that they believe a program has changed high-level outcomes. In my Types of Evidence That a Program ‘Works’ Diagram (http://outcomescentral.org/outcomestheory.html#4), impact evaluation is conceptually distinguished from implementation evaluation on the basis of it making such a claim.

In contrast, non-impact, implementation evaluation (where you do evaluation for program improvement even in situations where you cannot measure impact) is not trying to make such a claim. I am not saying here that every type of key informant or expert design is impact evaluation, just ones where a question is asked along the lines of: ‘In your opinion did the program improve high-level outcomes’.

I think that if this question is asked, then the evaluation is trying to ‘make a claim about whether a program changed high-level outcomes’. The question of whether particular stakeholders believe this to be a credible claim in a particular situation is a conceptually different question. And there are many stakeholders who would not regard it as such. However, this does not detract from the conceptual point that, if you can find stakeholders who in some situations would regard key informant or expert judgement designs as sufficiently credible for their purposes, then these designs can be regarded as a type of impact evaluation.

My broader purpose with this thinking within outcomes theory is to get the full list of possible impact evaluation designs considered in the case of any program so that we don’t just get obsessed with a limited range of impact evaluation designs, useful though things like Randomized Controlled Trials (RCTs) may be in some circumstances.

Putting the Planning back into M&E – PME or PM&E what’s the acronym going to be?

In a posting on Linkedin, Leslie Ayre-Jaschke talked about the growth of PME – or maybe it will end up being called PM&E, or something else. Regardless of the acronym, it’s the movement to put planning back into monitoring and evaluation. ‘Putting the P back into M&E’ was the subtitle of a workshop I ran in South Africa for UNFPA several years ago. I think that it’s a concept that’s going to get a lot more traction over the next few years.

It’s consistent with what evaluators like Michael Patton, and many of us in the evaluation community, have been talking about for years. We’ve been talking up the key role of formative evaluation – evaluation aimed at making sure that programs are optimized. And formative evaluation is all about making sure that programs are well planned.

The point of this approach within evaluation is that it’s often pointless to evaluate a badly planned program. Evaluation resources would be better spent on making sure that the program is better planned than on confirming that it has failed to achieve its outcomes because planning was poor.

The new PM&E movement is not just about evaluators and evaluation; it is much broader than that, taking in people from a range of disciplines. This new integrated approach which is emerging needs an underlying theory which will appeal to all of the different disciplines involved – strategic planners, performance managers, evaluators, contract managers, policy analysts etc. The work I’ve been doing in outcomes theory has been designed to meet this need.

The purpose of outcomes theory is to provide an integrated conceptual basis for PM&E-type approaches. A common conceptual basis is needed if people across the different disciplines and sectors are going to be able to share conceptual insights about how they identify, measure, attribute and hold parties to account for outcomes when doing planning, monitoring and evaluation. Good theory is needed to help them quickly sort out the type of conceptual confusion that currently characterizes much of the discussion of outcomes-related issues. As the famous social scientist Kurt Lewin said – ‘there’s nothing so practical as a good theory’.

This aspiration of outcomes theory is summarized in the diagram below showing how it’s a meso-level theory reaching across strategic planning, monitoring, evaluation etc.

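[Diagram: outcomes theory as a meso-level theory reaching across strategic planning, monitoring, evaluation etc.]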
(see http://www.outcomescentral.org/outcomestheory.html for more on this)

For people just working out in the field, who don’t need to know much theory, outcomes theory principles have been hard-wired into the DoView Visual Planning, Monitoring and Evaluation approach http://doview.com/plan. Using the approach means that they will avoid many of the technical problems which are highlighted by outcomes theory.

Large-scale visual models of a program (drawn in the correct way, for instance as ‘DoViews’) provide the ideal foundation for the new fully integrated approach to planning, monitoring and evaluation which many are now seeking. http://doview.com/plan/draw.html.

Outcomes theory Unequal Input Principle – Op Ed applied to school national league tables

Just written an Op Ed on the question of school national league tables. It tries to move the argument away from a political argument to a technical one. It’s part of my initiative to show how outcomes theory (outcomestheory.org) can provide a technical lens on debates that are currently seen as largely political.

The outcomes theory principle is the Unequal Inputs Principle (the ‘level playing field’ principle – in the article called the Equivalent Input, Equivalent Outcome Principle) which states: Where the inputs of units being compared are equivalent, the raw outcomes from such units can be used to measure the relative performance of different units. Where inputs to units differ, the amount of improvement is a better measure of unit performance than just raw outcomes.

The Op Ed argues that if you want to improve student performance you should look to see if you can develop a measure of value-added rather than just raw academic performance. This is because of the differing academic level of kids entering different schools. Whether you can develop a value-added measure is another question, but the clear problem with a raw score approach when it is used to incentivize teachers is that they will just move to schools which have pupils entering who are already functioning at a high academic level.
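The difference between the two measures can be shown with a minimal Python sketch. The school names and scores below are invented, and taking value-added as a simple exit score minus intake score is an assumption made purely for illustration; a real value-added measure would need far more care.

```python
from dataclasses import dataclass


@dataclass
class School:
    name: str
    intake_score: float  # hypothetical average score of pupils on entry
    exit_score: float    # hypothetical average score of pupils on leaving

    @property
    def value_added(self) -> float:
        # Simplest possible improvement measure: exit minus intake.
        return self.exit_score - self.intake_score


# Invented data: School A takes in high-scoring pupils, School B does not.
schools = [
    School("School A", intake_score=80.0, exit_score=85.0),
    School("School B", intake_score=50.0, exit_score=65.0),
]

top_by_raw = max(schools, key=lambda s: s.exit_score)
top_by_value_added = max(schools, key=lambda s: s.value_added)

print("Top of the raw league table:", top_by_raw.name)
print("Top on value-added:         ", top_by_value_added.name)
```

Because the inputs to the two schools differ, the raw table and the value-added table put different schools on top, which is the situation the principle describes.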

Here’s the Op Ed. http://www.stuff.co.nz/dominion-post/comment/7753957/Outcome-theory-and-education

Paul Duignan, PhD. More at OutcomesCentral.org and on Twitter at Twitter.com/paulduignan.

Stop the terminological madness now! ‘Outcomes’, ‘impact’, ‘results’, ‘goals’ and the Buffalo Dung Problem

All I can ask is ‘when will it stop’? As we speak I’m burning up bandwidth on an EVALTALK (the evaluators list) discussion about the distinction between a ‘goal’ and a ‘mission’. I’m on Linkedin where people are arguing about the distinction between a ‘result’ and an ‘outcome’ and I’ve someone emailing me from Europe preoccupied about why I don’t draw a distinction between an ‘outcome’ and an ‘impact’ in my outcomes theory work.

I think that Karyn Hicks on EVALTALK has come up with the best term for these debates, calling them the Buffalo Dung Problem! This stems from her being in a meeting involving one of these endless debates and her Director hollering ‘Well #!@ we can just call it buffalo dung for all I care’! From then on she’s called it the Buffalo Dung Problem.

Most of these Buffalo Dung discussions are a total waste of time and we can think about this in terms of there being two underlying issues:

1. These terms are all used in a common sense way by stakeholders to mean roughly the same thing: ‘the stuff we’re trying to achieve’. It’s ultimately futile to try and force the rest of the world to use them in very specific ways that suit us for our technical work. If we were physicists and no one had any common sense uses for our terms – like Boson Particles and Quarks – we could define them how we liked and insist that the people using them use them in a very precise technical way. We simply do not have the power to insist that people use terms in the way we want because we work amongst a wide variety of lay stakeholders who will use terms in whatever way they want to.

2. When we insist on using terms in a particular way we are usually trying to pack into the one term a number of technical distinctions which it is better to tease apart. These distinctions include things such as: 1) where something fits within a causal pathway; 2) whether it’s measurable or not; 3) whether it’s controllable or not; 4) whether it’s going to be used for accountability or not.

For instance in one of the discussions I’m involved in at the moment, it’s being suggested that maybe the term goal should be restricted to: 1) something below a thing called a ‘mission’ within a causal pathway; 2) something that is measurable; and, 3) something that is controllable. The problem is that when we ask an unsuspecting lay person to give us their ‘goals’, they have no way of knowing from just this word that we want a very specific thing from a technical point of view. We want something which has three specific technical characteristics. It’s far clearer to forget the word goal and tell them that we want something that is measurable and controllable by them (distinctions 2 and 3 above). We can achieve our first distinction – the position in the causal pathway – much more elegantly by just doing the whole thing in the form of a visual outcomes model.

A fully visual approach gets rid of a lot of the terminological madness which stems from trying to specify a particular location within a causal pathway, e.g. having to insist that a process is before an immediate outcome and that is before an intermediate outcome and that is before an impact. When you try to do it in this way you inevitably get people then asking you where a result, goal, mission and vision fit into the schema.

You can eliminate this entire debate by simply working in a totally visual way. You can do the whole work of building an outcomes model visually just by talking about boxes within the model and the end-box(es). Being a little less extreme, I normally talk about steps and at the end of the steps there are final outcomes. But I couldn’t care less what people want to call the boxes at the end of the visual model. The visual approach eliminates the need to use words to describe particular positions within the causal pathway – you can just point at them (or if you are not physically present color them up, e.g. the green boxes).

Having eliminated this major cause of terminological stress by working visually you can then deal with distinction 2, measurement. This is best thought of in terms of a measurement being an object you put onto a visual model next to a box. It is something that measures that box. I happen to call these indicators but again couldn’t really care less what you call them as long as you maintain the idea of measuring things.

Then you need to deal with the 3rd distinction – controllability. This is best done by simply marking up the indicators that are controllable in some way. Make them red, put a letter next to them, whatever you like. But just think of it in terms of a particular type of indicator being controllable.

Lastly you need to deal with distinction 4 – whether a party is going to be held accountable for something. This is best dealt with by simply marking up the indicators which a party will be held accountable for. In the public and non-profit sector, these usually are exactly the same as the controllable indicators you’ve just marked up.

It’s as easy as that, you simply do not need the terminological madness so many people are currently involved in. I would love someone to work out the sum total of human time, effort and bandwidth (and hence dollars) which is currently going into these endless terminological debates.
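For readers who think in data structures, the four distinctions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, field names and the example model are hypothetical and are not taken from DoView or any other tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Indicator:
    name: str
    controllable: bool = False  # distinction 3: marked up as controllable
    accountable: bool = False   # distinction 4: a party is held accountable


@dataclass
class Box:
    label: str
    leads_to: List[str] = field(default_factory=list)            # distinction 1: position in the pathway
    indicators: List[Indicator] = field(default_factory=list)    # distinction 2: measurement


# Hypothetical two-box model; the labels need no special terminology.
model: Dict[str, Box] = {
    "New environmental regulations": Box(
        label="New environmental regulations",
        leads_to=["Better environment"],
        indicators=[Indicator("Regulations passed", controllable=True, accountable=True)],
    ),
    "Better environment": Box(
        label="Better environment",
        indicators=[Indicator("Air quality index")],
    ),
}


def end_boxes(m: Dict[str, Box]) -> List[str]:
    """Boxes that lead to nothing further: the end of the pathway."""
    return [box.label for box in m.values() if not box.leads_to]


def accountable_indicators(m: Dict[str, Box]) -> List[str]:
    """Indicators a party has been marked as accountable for."""
    return [i.name for box in m.values() for i in box.indicators if i.accountable]


print("End boxes:", end_boxes(model))
print("Accountable indicators:", accountable_indicators(model))
```

Nothing in the sketch needs the words ‘goal’, ‘result’, ‘impact’ or ‘mission’: position comes from the links between boxes, and the other three distinctions are just attributes of boxes and indicators.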

William of Occam was a medieval philosopher who came up with Occam’s Razor – ‘do not multiply entities beyond necessity’. He was trying to stop the type of madness where people in his time used to make dozens of distinctions between different types of angels. We have the same problem on our hands at the moment with the Buffalo Dung problem. I’m an Occam’s Razor fan myself – let’s just stop the madness!

To see how to work practically in this way, as I and those who use DoView Visual Planning and Management do all the time, see: http://doview.com/plan/draw.html. That link shows you the 13 rules for building elegant but accessible visual models that you can use in the way described above. This url: http://doview.com/plan shows you how you can use the whole process for accountability, evaluation, reporting etc.

Want more detail and references to this thinking? The following is a technical article about this issue (read the summary referenced at the start of it if you do not have time to read the whole article): Duignan, P. (2009). Simplifying the use of terms when working with outcomes. Outcomes Theory Knowledge Base Article No. 236. ( http://outcomestheory.wordpress.com/article/simplifying-terms-used-when-working-2m7zd68aaz774-73/ ). The substance of this article formed the basis for Duignan, P. (2009). Rejecting the traditional outputs, intermediate and final outcomes logic modeling approach and building more stakeholder-friendly visual outcomes models. American Evaluation Association Conference, Orlando, Florida, 11-14 November 2009.

And the following article talks about the different dimensions we get mixed up in our outcomes and evaluation work:

Duignan, P. (2009). Features of steps and outcomes appearing in outcomes models. Outcomes Theory Knowledge Base Article No. 208. ( http://outcomestheory.wordpress.com/article/features-of-steps-and-outcomes-2m7zd68aaz774-20/ ).

Paul Duignan, PhD. Follow me on this OutcomesBlog.org; Twitter.com/PaulDuignan; or via my E-newsletter and resources at OutcomesCentral.org.

Theory of Change Versus Theory of Action

What’s the difference between a Theory of Change and a Theory of Action? I’m just clarifying my thoughts on this issue and how it relates to my work thinking about how we conceptualize outcomes models (logic models) within outcomes theory. In summary, at the moment – apart from a Theory of Action just being an outcomes model drawn at a lower level – I can’t see a major difference. However I’m happy to be contradicted on this and will change my view if there are convincing arguments for making the distinction. My current thinking is as set out below.

Can an exhaustive list of impact evaluation designs be developed, or is my mission on this futile?

I have set out on a mission as a part of outcomes theory to attempt to develop an exhaustive list of impact/outcome evaluation designs – evaluation designs which make a claim that changes in high-level outcomes can be attributed to a particular intervention. If we could pull off developing such a list that most people are happy with, it would be very powerful. First it could be used in evaluation planning to work out if all of the possible impact evaluation designs had been assessed for their appropriateness, feasibility and/or affordability. At the moment I think that almost every evaluation planner walks around wondering if there is some sort of impact evaluation design they have not considered.

Formative evaluation versus impact/outcome evaluation

In response to a posting on one of my outcomes theory articles by Marcus Pilgrim who ran the recent YEN Evaluation Clinic in Damascus, I have worked up an article on the difference between formative, process and impact/outcome evaluation. As Marcus points out in his posting, the term formative (or developmental) evaluation is not one which is widely known in all sectors. Formative evaluation is directed at optimizing program implementation. Process evaluation attempts to describe the course and context of a program. Impact/outcome evaluation looks at the intended and unintended, positive and negative outcomes of a program and whether they can be attributed to the program.

Mapping indicators onto a logic model is obvious – but why haven't we always done it?

I was running a workshop today teaching policy analysts the basics of my approach to program evaluation (Easy Outcomes). One of the participants, when I talked about the importance of always mapping indicators back onto a visual model, commented that when you do it, it’s so obviously the right approach that you can’t understand why we’ve not been doing it for years.

The idea behind this approach is that the way we almost always approach indicator work is to eye-ball a list or table of indicators and ask the question of a group of busy people sitting around a table – ‘does this list of indicators look any good?’

Randomistas Rule

Just read and commented on an interesting article referred to on the 3IE site – a site dedicated to improving evidence about what works in international development. The article was by Martin Ravallion and was about the rise of the Randomistas in international development economics. Randomistas are those who promote much more use of randomized trials to try and work out what works in international development. It is a good article which points out the fact that randomized trials are not feasible in many important types of development interventions. This debate is the same one which is occurring in many sectors at the moment and one which has been debated on and off in the evaluation field for many years. My take on it is that we need to develop some underlying principle which we can debate and generally agree on so that we don’t need to have this debate endlessly without seemingly making much progress on it.