Development and Cooperation

Evaluation

Dubious precision

The results of statistical analyses can only be as good as the data they are based upon. Quantified assessments often seem more precise than they actually are. To determine the effects of developmental interventions, it is advisable to draw upon a range of methods. As the target groups are the core constituents of development, they should be involved in all evaluation efforts.

[ By Susanne Neubert and Rita Walraf ]

The call to apply rigorous methods when evaluating the impact of developmental interventions can no longer be ignored. The Network of Networks on Impact Evaluation (NONIE) and the International Initiative for Impact Evaluation (3IE) are working towards standardisation of evaluation methods, and both emphasise rigorous approaches. That is significant. A few years ago, such methods tended to be side-lined in development circles. In this essay, the term “rigorous methods” is used for experimental and econometric approaches.

Experimental methods are based on the comparison between intervention and control groups. Econometric methods analyse statistical data sets (see related article by Jörg Faust on page 14). The two approaches differ considerably. However, in practice they are often combined, and they do have some common features.
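To make the comparison between intervention and control groups concrete, the following is a minimal sketch, not a method taken from this essay. It assumes that units were assigned to the two groups at random and that a single quantified outcome was measured for each unit. The function name `difference_in_means` and all scores are hypothetical and invented purely for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def difference_in_means(intervention, control):
    """Estimate an effect as the gap between two group averages.

    Returns the mean outcome of the intervention group minus the mean
    outcome of the control group, together with an unpooled standard
    error that indicates how uncertain that estimate is.
    """
    effect = mean(intervention) - mean(control)
    standard_error = sqrt(
        stdev(intervention) ** 2 / len(intervention)
        + stdev(control) ** 2 / len(control)
    )
    return effect, standard_error


# Hypothetical outcome scores (assumption: made-up numbers, not real data).
intervention_scores = [12.0, 15.0, 14.0, 10.0, 13.0, 16.0]
control_scores = [11.0, 12.0, 10.0, 9.0, 13.0, 11.0]

effect, standard_error = difference_in_means(intervention_scores, control_scores)
print(f"estimated effect: {effect:.2f} (standard error: {standard_error:.2f})")
```

The estimate is only as trustworthy as the assignment to groups and the measured outcomes. With few units or noisy data, the standard error becomes large relative to the effect.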

It seems questionable, however, whether rigorous methods really take us much further. Their main advantage is that they help to close the so-called “attribution gap” between the development intervention and its impact. They thus make it possible to determine and weigh causal relations. On top of that, econometric approaches have the advantage that their quantified results can be aggregated.

However, there are also numerous disadvantages to rigorous methods. These should be considered before making a decision on what method to use. While rigorous methods do prove causal relations, they do not help to explain such relations. For that purpose, one needs qualitative methods. Rigorous methods, in themselves, do not result in useful policy advice. That is quite unsatisfying – particularly in view of the fact that randomised experiments normally incur very high costs.

Another disadvantage is limited applicability. Randomised experiments are suitable for relatively simple interventions, such as vaccination campaigns. Such analyses may indeed be relevant, for instance if they prove a vaccination to be effective. Nonetheless, rigorous methods often do little more than confirm what common sense would have suggested anyway.

Moreover, programmes in international development policy and their scope are becoming increasingly complex, so that the question of success cannot normally be answered with a simple yes or no. An additional challenge is that it is often difficult to form useful control groups. In many cases, doing so is impossible – just consider ambitious nation-wide sector reforms, budget support or expert advice to a national government.

Econometric approaches do not have to deal with the control group problem. In principle, they make sense when there are sufficiently large databases with large, statistically representative sets (“large N”). In addition, the progress under review must be quantifiable. However, econometrics cannot be used if “N” is small or the phenomenon at stake is unique (the United Nations, for example).

It is true that some progress has been made in recent years concerning econometric evaluation methods. For instance, we now have indices for multi-dimensional phenomena (governance, human development et cetera), which were not assessable in quantified terms previously. The snag, however, is again that the methods, while linking causes to effects, do not explain such links. In addition, the picture tends to blur the more factors come into play.

Most fundamentally, however, the validity of econometric results depends on the validity of the data sets on which they are based. Unfortunately, such data sets tend to be rather unreliable. Even in rich nations, official statistics are often distorted and misleading. In poor countries, such statistics – if there are any at all – are rarely more than rough estimates. Econometric methods thus tend to give a vastly overstated impression of objectivity.

Qualitative potential

Despite these defects, the call for rigorous methods of impact analysis can no longer be ignored. We believe this is due to failings in how systematic qualitative and participatory designs have been used in practice. A number of such concepts exist (IMA, MAPP, MSC, PIM, PCA), but they have not become commonplace in development contexts.

One common point of criticism is that these methods are geared too much towards the target groups, whereas modern development-policy approaches aim to have an overall structural impact. This argument, however, misses the fact that even “target-group remote” programmes at the macro-level must ultimately serve local people. If administrative decentralisation, for instance, does not lead to positive results at the grass-roots level, something has obviously gone wrong. Accordingly, it makes sense to involve the target group.

Any experimental method, on the other hand, will be futile unless it considers suitable indicators of success at this level. Experimental methods, by the way, are only applicable to specific groups too, and do not provide evidence of any systemic, macro-level effects.

Advocates of experimental methods are fond of complaining that participatory evaluation methods only consider target groups that are affected in a positive way (beneficiaries), but not those affected negatively or not at all. This shortcoming, however, can be rectified incrementally. Most systematic qualitative methods already provide for the inclusion of different stakeholders. The design can easily be extended to non-affected people; this is merely a question of budget.

A further widespread objection to qualitative methods is that the results cannot be aggregated. However, various approaches today use quantified grades, which make it possible to compare results.

What those who criticise participatory approaches overlook, however, is that inclusion of target groups is a worthy aim in its own right. These are the people who matter. They are the ones who feel the impacts most keenly (whether positive or negative), and they are the ones who demand or even initiate change. The target groups are the subjects of reform, and they should be included in impact assessments as a matter of course.

Academic debate over methodology has its intellectual appeal. In practice, however, it is never possible to satisfy all wishes. What method to choose will always depend on what is to be evaluated, who will use the results, what capacities there are and other similar issues. The choice should be based on a profound understanding of every method’s pros and cons.

A “roadmap” will soon be published with two dozen tried and tested, qualitative and participatory but also rigorous methods for impact evaluation in developmental practice. This is most welcome. The roadmap was designed by the Impact Analysis Working Group of the German Evaluation Society (DeGEval). In principle, a well-considered mix of methods with qualitative and quantitative elements is always to be recommended.