Evaluation Guidelines

Colophon
Title: Evaluation Guidelines
Publisher: Ministry of Foreign Affairs of Denmark
Responsible institution: Evaluation Department, Ministry of Foreign Affairs
Author: Evaluation Department, Ministry of Foreign Affairs
Other contributors: Graphic Production: Designgrafik A/S, Copenhagen; Print and Web: Schultz Grafisk
Language: English
URL: http://www.netpublikationer.dk/um/7571/index.htm
ISBN: 87-7667-616-1
Digital ISBN: 87-7667-618-8
ISSN: 1399-4972
Version/edition: 08-12-2006
Data formats: html,htm,jpg,gif,pdf,css,js
Publisher category: statslig
Table Of Contents
PREFACE
INTRODUCTION
CHAPTER 1 EVALUATION DEFINITION AND PARTIES
CHAPTER 2 DANIDA EVALUATIONS
CHAPTER 3 DANIDA’S EVALUATION PROCESS
CHAPTER 4 EVALUATION QUESTIONS
CHAPTER 5 EVALUATION TEAM TASKS
ANNEX 1 BIBLIOGRAPHY
ANNEX 2 DAC EVALUATION QUALITY STANDARDS
ANNEX 3 DANIDA’S EVALUATION POLICY
PREFACE
Evaluation has been used systematically since the early days of Danish development assistance. In 1982 Danida established a special unit responsible for evaluation. The use of evaluations has developed successively in three main stages:
Prior to 1982 evaluations focussed essentially on individual projects and programmes. Most of these were mid-term or phase evaluations conducted as the project moved from one phase to the next. Only a few end-evaluations were conducted, and only occasionally were ex-post evaluations carried out to study the long-term effect of projects.
In the period 1982-87, after Danida’s Evaluation Unit was established, it was agreed to use evaluations more systematically to improve the quality of Danida-financed development activities. In this period, too, most evaluations were mid-term or phase evaluations of individual projects. The tendency was to replace mid-term evaluations with internal reviews and to increase the number of end-evaluations. The use of evaluations was more systematic in the sense that it was guided by an annual evaluation programme to ensure that the sample of evaluated projects and programmes was representative of Danish bilateral assistance.
In the period 1987-97 the number of individual project evaluations was reduced and the number of thematic and sector evaluations increased. As a policy, all evaluation reports were made public.
In 1992, informing the public was included as a distinct objective for evaluation in accordance with DAC principles. In this period evaluations became more experimental and included a small number of impact evaluations as well as use of participatory methods. In general, all evaluations were – and still are – conducted by independent consultants.
In 1997 a formal evaluation policy was promulgated. At the same time the Evaluation Secretariat was established as a separate, independent entity within the Ministry of Foreign Affairs. Since then the development evaluation scene has changed, with joint evaluations becoming much more prominent. The Millennium Development Goals, the Rome Declaration on Harmonisation and Alignment and the Paris Declaration on Aid Effectiveness have wide implications for the ways evaluations are carried out. Consequently the evaluation policy was revised in 2006 (see Annex 3).
The first evaluation guidelines were adopted in 1986. They were substantially revised in 1994 and again in 1999. These guidelines build on the 1999 guidelines and reflect the new policy as well as the DAC Guidance for Conducting Effective Joint Evaluations and the DAC Evaluation Quality Standards, both issued in 2006.
The primary purpose of the Evaluation Guidelines is to communicate to partners and external consultants Danida’s expectations of the quality of evaluations. They constitute a framework built on principles, criteria, standards, good practices and information about Danida’s evaluation process. Particular emphasis has been put on defining the roles of various stakeholders in evaluations and on developing codes of conduct for these stakeholders.
The guidelines are not intended as an evaluation manual. Danida requires that its external consultants have the competencies expected of evaluation professionals: a sound grasp of evaluation methodologies, the skills and abilities to carry out evaluations as well as up to date knowledge of developments in the field of evaluation, in particular development evaluation.
Many people have contributed to these guidelines, first and foremost the staff of the Evaluation Department with support from Ian Davies. Feedback from consultants, colleagues and partners on how we do business has been invaluable, as have comments on drafts from Ray Rist and Linda Morra of IPDET.
Niels Dabelstein
Head of Evaluation Department
INTRODUCTION
Purpose The primary purpose of the Evaluation Guidelines is to communicate to partners and external consultants Danida’s expectations of the quality of evaluations carried out on its behalf.
Quality Because quality touches on different aspects of the evaluative enterprise such as use, methodology and management, the guidelines explain Danida’s approach to evaluation of development interventions and identify those attributes it considers important to quality.
The Evaluation Guidelines apply to evaluation activities financed fully or partly by Danida: they constitute a framework for achieving quality that is built on principles, criteria, standards, good practices and information about Danida’s evaluation process.
External consultants The guidelines are not intended as an evaluation manual. Danida requires that its external consultants have the competencies expected of evaluation professionals: a sound grasp of evaluation methodologies, the skills and abilities to carry out evaluations as well as up to date knowledge of developments in the field of evaluation, in particular development evaluation.
As such, the guidelines incorporate concepts, methods, tools and issues basic to professional evaluation practice without discussing them at length; in some cases they provide a reference to more information.
Interested parties The guidelines are also intended as a source of information on Danida’s evaluation practice for partner countries, colleagues outside of the Evaluation Department (EVAL), staff of the Ministry of Foreign Affairs and of the government of Denmark in general, Danida’s technical assistance personnel in the field, bilateral and multilateral assistance organisations, and the broader evaluation community.
Contents The guidelines are presented in five chapters.
Chapter 1 provides key definitions and distinctions used by Danida in its evaluation activities, identifies main parties to the evaluation process and discusses joint evaluation in Danish development assistance.
A discussion of differences and similarities in evaluating different types and aspects of development cooperation is included in Chapter 2.
Chapter 3 presents the steps and procedures taken by the Evaluation Department to prepare the evaluation, oversee its implementation and disseminate its results.
The evaluation criteria that serve as a basis for developing evaluative questions are presented and discussed in Chapter 4. Chapter 5 addresses the implementation of the evaluation by the external evaluator.
CHAPTER 1 EVALUATION DEFINITION AND PARTIES
What is evaluation?
In general terms evaluation is used in society at large to establish achievements with some accuracy. All major development agencies involved in international development assistance undertake formal evaluations of part of their development activities each year.
Definition Danida uses the OECD Development Assistance Committee’s1 (DAC) definition of evaluation.
An assessment, as systematic and objective as possible, of an on-going or completed project, programme or policy, its design, implementation and results. The aim is to determine the relevance and fulfilment of objectives, developmental efficiency, effectiveness, impact and sustainability. An evaluation should provide information that is credible and useful, enabling the incorporation of lessons learned into the decision-making process of both recipients and donors.
(DAC 1991)
Evaluation Danida carries out or participates in evaluations at different stages of development interventions and defines their purpose as primarily knowledge generation, primarily accountability, or both.
Evaluations of ongoing development activities whose primary purpose is to generate information to improve the quality of the intervention typically focus on implementation issues and operational activities, but may also take a wider perspective and consider effects. As they are usually performed about midway in the cycle of the intervention, other organisations may refer to these evaluations as mid-term evaluations.
Other evaluations are undertaken after completion of the aid intervention to understand the factors that affected performance, to assess the sustainability of results and impacts, and to draw lessons that may inform other interventions. The terms ‘formative’ and ‘summative’ evaluation are also used to distinguish these evaluation types.
Management’s operational tools
Although there is increasing agreement in the evaluation community on the use of basic terms2, differences exist and it is useful for clarity to explain the terms used in evaluation and various assessment tools related to the operational preparation and monitoring of the implementation of development interventions:
Appraisal Some agencies use the term evaluation for prospective studies conducted prior to a project’s approval. For example the European Commission conducts ex-ante evaluations of intended programmes3.
Such studies are termed “appraisals” by Danida: “The aim of the appraisal is to provide quality assurance at an advanced stage of the preparation process of a programme or a component, but early enough for the recommendations provided by the appraisal to influence the final preparation of programme and component documents.” (Guidelines for Programme Management)4 Danida’s Technical Advisory Services is responsible for all major appraisals.
Review DAC defines “review” as an assessment of the performance of an intervention, periodically or on an ad hoc basis. Danida refers to the same definition in its Aid Management Guidelines. The main distinction is that a review is regarded as an internal management tool for operational monitoring of the implementation of the targets of the interventions, while the evaluation is an independent, in-depth external assessment of the objectives, implementation and results of the interventions. Major reviews are undertaken by the Technical Advisory Services, frequently as part of joint partner reviews.
Performance monitoring The trend of public administrations worldwide to reform their input-driven and highly procedural administrative practices in favour of results-based approaches to management and accountability, and the movement in the international donor community towards providing an increasingly greater proportion of funding in the form of sector or general budget support to governments of partner countries, have underlined management’s responsibility to measure and report on the achievement of agreed results. Performance measurement of set targets has become another term for management’s monitoring function. Danida’s Quality Assurance Department undertakes regular performance reviews of the Danish Embassies.
Performance audit The distinction between performance audit and evaluation can be thin as typically both deal with the “what” and “why” questions of programme and organisational performance.
A useful way to distinguish performance audit from evaluation generally is to highlight that performance audit is carried out primarily to provide assurance, i.e. accountability, on a direct or attest basis, with knowledge generation a secondary spin-off. Evaluation, on the other hand, is usually carried out to inform policy and decision-making as well as to produce knowledge.
Main parties to an evaluation
Successful evaluations are based on collaboration between key participants and due consideration to the interest of the different parties in the evaluation. These include the commissioning organisations, evaluators, users, and stakeholders.
Main parties to an evaluation
| Commissioner | The organisation that initiates and funds the evaluation, such as Danida, other donor agencies, partner countries, or these jointly. |
| Evaluators | Independent expertise from private companies or public institutions. |
| Stakeholders | Agencies, organisations, groups or individuals with direct or indirect interest in the development intervention or its evaluation. |
| Users | A sub-set of stakeholders with formal or direct involvement in the development intervention or its evaluation: decision-makers with a formal role in following up the evaluation, e.g. policy-makers, auditors, etc., and managers of development interventions. |
Evaluations are commissioned by Danida’s Evaluation Department, alone or jointly with other donors, and with the partner country.
Increasingly, joint evaluations are used to rationalise the evaluation process, reduce transaction costs for partner countries, improve quality of the work undertaken, and increase weight and legitimacy of the evaluation. Usually one of the collaborating agencies is responsible for commissioning the joint evaluation. (See next section on joint evaluation).
Evaluators are the organisations or individuals that carry out the evaluation, i.e. collect and analyse data, judge the value of the assistance intervention and produce the evaluation report. They are selected usually among international and Danish consulting firms and research institutions.
Stakeholders are those agencies, organisations, groups or individuals that have an interest in the development intervention or its evaluation but not necessarily a formal role in it.
Stakeholder and user participation in the evaluation process promotes consensus and ownership in relation to development activities. Participatory approaches are capacity building processes that develop evaluative thinking and contribute to indigenous knowledge and sustainability.
Users of evaluations are stakeholders with a specific relationship to the intervention and its evaluation. They include policy makers, Danida policy and programme managers and staff, managers of assistance interventions, developing-country partners and other parties with a formal or direct role in relation to the development activities under evaluation.
To impact decision-making, evaluations address questions that are relevant to users and responsive to their information needs, i.e. evaluative information is meaningful, reliable and timely.
Inclusion of users and stakeholders in the evaluation process contributes to useful evaluations. Dialogue with stakeholders and users improves the evaluator’s understanding of, and the evaluation’s responsiveness to, their needs and priorities. The views and expertise of groups affected by the intervention and its evaluation are considered and integrated into the evaluation process whenever appropriate through mechanisms such as steering or advisory groups.
Joint evaluation
Joint evaluations of development assistance have been conducted by various groups of donors and countries since the late 1980s, and a significant number have been undertaken since 2000. Danida has participated in joint evaluations since 1989 (see box).
What is joint evaluation? According to the DAC Network on Development Evaluation5: “Joint evaluations are development evaluations conducted collaboratively by more than one agency. The focus here is not on participatory evaluation with its techniques for bringing stakeholder communities into the process, but on evaluations undertaken jointly by more than one development cooperation agency. Joint evaluations vary considerably in the number of participating agencies and in their focus, purpose and approach. Methodologies used can also differ widely, ranging from desk reviews of existing information to fieldwork in developing countries.”
Danida and joint evaluation The changing modes of Danida’s, and the global donor community’s, development assistance increase the significance of joint evaluations. Sector programme support and national development programmes rely on collaborative, multi-donor assistance efforts with shared common objectives, joint strategies and on partner country leadership roles.
Effective evaluations of such new types of assistance call for more collaboration. Clearly, it is a waste of resources on both the donor and partner side when several donors perform individual evaluations of their support to the same sector.
Key objectives of a joint evaluation process are to ensure that the evaluation becomes an efficient learning tool, helps promote good governance, enables the partners to be fully accountable and is cost-effective. Joint evaluations can focus on vital areas and help consolidate international responses and development policy.
Joint evaluations with Danida leadership or participation
- The Nordic Agricultural Programme in Mozambique (MONAP) was evaluated jointly by Denmark, Norway and Sweden in 1991, which resulted in a shift in the priorities of the three countries.
- UNICEF’s technical program was evaluated jointly by Australia, Canada, Denmark and Switzerland in 1993. It contributed to major changes in UNICEF’s mode of operation.
- The international response to conflict and genocide in Rwanda was evaluated jointly by 38 bilateral donors under Danish leadership, UN agencies and NGOs in 1996. It identified key issues and problems of international humanitarian assistance and has induced changes of policies and strategies of humanitarian agencies.
- The road sub-sector programme in Ghana was evaluated jointly by eight development agencies and the Ministry of Road & Transport in 2000. A follow-up study was conducted in 2005.
- The Basic and Primary Education Programme in Nepal was evaluated jointly by five development agencies and the Ministry of Education & Sports in 2004.
- The humanitarian and reconstruction assistance to Afghanistan was evaluated jointly by five development agencies in 2004-05.
- The Ugandan Plan of Modernisation of Agriculture (PMA) was evaluated jointly under the overall coordination and guidance of the PMA Steering Committee in 2005.
- The Tsunami Evaluation Coalition with over 50 member agencies from across the humanitarian sector evaluated jointly the response to the Asian earthquake and tsunamis in 2005-06.
- The joint evaluation of general budget support was carried out in 2003-06. More than 20 development agencies were represented in the steering committee together with seven partner countries.
Types of joint evaluations Joint evaluations, like single agency evaluations, come in different sizes and shapes: their focus spans a continuum ranging from project to policy including topics such as assistance modalities, themes, delivery mechanisms, etc. They have different purposes, cover single countries, regions or more, use a variety of approaches and methodologies, and involve developing country partners to varying degrees.
In joint evaluations including larger groups of contributors, the involvement of the majority is typically limited to a few key stages such as review and approval of the design and final product. Management and coordination are typically delegated to one or a few agencies taking a lead responsibility or to an external organisation designated to manage the study.
The following table describes the main characteristics of three broad categories of joint evaluations, based on their degree of “jointness”, i.e. the extent to which individual partners cooperate in the evaluation process.
| Type of joint evaluation | Mode of work / examples |
| 1. Classic joint evaluation | Participation is open to all stakeholder agencies. All partners participate and contribute actively and on equal terms. Examples of classic joint evaluations include: the Rwanda Evaluation, the tripartite evaluation of WFP, the CDF Evaluation and the GBS Evaluation. |
| 2. Qualified joint evaluation | Participation is open only to those who qualify – through membership of a certain grouping (e.g. DAC, EU, Nordics, UNEG, ECG, Utstein) or through active participation in the activity (e.g. SWAp, basket funding mechanism, jointly implemented programme) that is being evaluated. Examples include EU aid evaluations, the evaluation of the Road Sub-sector in Ghana, the Basic Education Evaluation, the ITC Evaluation, and the evaluation of Uganda’s Plan on Modernisation of Agriculture (PMA). |
| 3. Hybrid joint evaluation | Includes a wide range of more complex ways of joint working: (a) responsibilities are delegated to one or more agencies while others take a ‘silent partnership’ role; (b) some components of the evaluation are undertaken jointly while other parts are delivered separately; (c) various levels of linkage are established between separate but parallel and interrelated evaluations; (d) the joint activity is agreeing a common evaluation framework and responsibility for implementation of individual evaluations is devolved to different partners; (e) research, interviews and team visits are undertaken jointly but each partner prepares a separate report. Examples include Support to Internally Displaced Persons and the Tsunami Evaluation. |
Acknowledging the existence and typology of many complex forms of joint evaluations contributes to reducing confusion and misunderstanding when partners work together.
Benefits and challenges of joint evaluations Joint evaluations provide a number of significant benefits to donors and recipient countries alike, particularly in terms of reduced transaction costs for developing country partners and improved harmonisation of development interventions among donors; however, they also present their own set of particular challenges due primarily to the increased complexity of managing and conducting this type of evaluation.
A key advantage of joint evaluations is that they have greater credibility and broader ownership of findings among the larger development community than would be the case with single agency evaluations.
Their value lies in the comprehensiveness of findings and in the validity of lessons learned. They generally yield useful best practices, focus policy discussions among development agencies, and achieve an impetus for joint action that is out of reach of smaller evaluations.
On the other hand, just as co-ordination of development interventions is notoriously difficult, the same applies to evaluations. Donors prepare their evaluation programmes in response to agency needs for lesson learning and accountability, and these are geared to the planning and programming cycle of the agency. For evaluation to develop into a joint management and accountability tool, the donors need to be flexible, to adjust to the planning cycle of the partner countries and to enable them to take the lead in evaluations.
The following is a summary6 of the key benefits and particular challenges of joint evaluations seen from the perspective of commissioners and organisers:
Benefits
- Diversity, legitimacy and influence: Joint evaluations bring a diverse range of perspectives and talents to the evaluation. This can lead to high quality, transparent and credible evaluation reports, with broad ownership of findings and greater legitimacy and influence on decision-makers.
- Broader scope: Joint evaluations enable a larger number of evaluation questions to be addressed, given extra (jointly-shared) resources.
- Harmonisation and alignment: Donors are increasingly prioritising partnership-based assistance modalities. Sector-wide approaches, budget support and country-led poverty reduction strategies all stress greater participation and leadership roles for partner countries. Effective evaluations of such assistance modes call for – even require – greater collaboration. Joint evaluations therefore have distinct advantages as development agencies shift their strategies away from isolated projects towards programmatic approaches. Joint evaluations also foster greater consensus among the partner agencies on upcoming development priorities and needs, thus stimulating improved coordination of future programming.
- Rationalisation and reduced transaction costs for developing countries: Joint evaluations reduce the burden and associated costs of multiple donor evaluation efforts on partner country institutions, including overlapping team visits and duplicative data collection. Joint evaluations also help to avoid conveying to partner countries too many different and sometimes conflicting evaluation messages.
- Mutual capacity development: Joint evaluations provide opportunities for agencies to learn from each other and to share their evaluation processes and techniques. They can also lead to enhanced use of local consultants, with consequent local capacity development.
- Cost-sharing: Joint evaluations are normally undertaken with shared or pooled financing, and the cost-burden is therefore divided among the various partners.
- Applicability: Joint evaluations have additional advantages and applicability in certain specific circumstances: if there are issues in the evaluation that are too sensitive or controversial for one agency alone to tackle; if the activities being evaluated have been jointly financed or implemented; when a meta-evaluation is being undertaken; and when an evaluation of the work of a multilateral organisation is being undertaken.
Challenges
Challenges are often related to difficulties in harmonising differences among agencies’ objectives, operational processes and organisational cultures.
- Planning joint evaluations: A particular problem is that by the time agencies share or make public their evaluation work programmes, it may be too late in the planning cycle of any given agency to agree to undertake an evaluation jointly. Also, all agencies have limited time and resources, and getting one’s ‘own’ work completed can take priority over ‘joint’ efforts.
- Consensus on purpose: Agreeing the overall and political objectives for the evaluation, and ensuring that no partners are driving hidden agendas.
- Consensus on Terms of Reference: Agreeing comprehensive, yet manageable, Terms of Reference that accommodate the particular issues and interests of all participating agencies.
- Management: Developing joint management structures and communication processes that work effectively.
- Suitability of evaluators: Selecting evaluation teams that are mutually acceptable to all participants.
- Logistics: Coordinating schedules and travel logistics amongst the various partners.
- Evaluative process: Reaching agreement on methodologies, recommendations and reporting formats.
As well, joint evaluations create particular conditions for the external evaluators charged with carrying out a joint evaluation and some principles have been developed for organizing and managing the external evaluation team during joint evaluations.7
1 OECD (1991), Principles for Evaluation of Development Assistance. Paris.
2 OECD (2002), Glossary of Key Terms in Evaluation and Results Based Management. Paris.
3 European Commission (2001), Guide to Ex Ante Evaluation. Brussels.
4 Danish Ministry of Foreign Affairs. Guidelines for Programme Management. Copenhagen.
5 OECD (2006), DAC Guidance for Managing Joint Evaluations. Paris.
6 Breier, H. (2005), Joint Evaluations: Recent Experiences, Lessons Learnt and Options for the Future, DAC Evaluation Network Working Paper, OECD, Paris.
7 Freeman, T. (2004). Managing the External Evaluation Team During Joint Evaluations: Challenges and Responses. Ottawa, Canada: Goss Gilroy Inc.
CHAPTER 2 DANIDA EVALUATIONS
Evaluating development assistance
The emergence of new types of development interventions and the focus on specific priority themes in development assistance pose new challenges to evaluators.
Some of those challenges are methodological: how to evaluate support for good governance, human rights, civil service reform and private sector support.
Some are concerned with policy: how to move from project assistance to more varied and flexible modes of assistance within sector frameworks designed to address policy and institutional problems; how to improve development co-operation and partnership; and how to evaluate the increasing volume of humanitarian relief operations.
Danida’s evaluation programme includes single intervention evaluations (although these are increasingly rare), impact evaluations, sector evaluations, country programme evaluations, evaluations of forms of development cooperation, multilateral evaluations and thematic evaluations.
As the breadth, depth and diversity of Danida’s evaluation topics and collaborations expand, including joint evaluations with other donors or partner governments, their growing complexity poses increasing methodological and logistical challenges to the oversight and conduct of evaluations.
Over time Danida has shifted the focus of its evaluations from projects to more complex modes of development assistance, e.g. sector programme, country programme or thematic evaluations, and these have proved to be increasingly cost-effective and to have more impact on policy decisions.
Focus on Danida Evaluations
| Single intervention | Evaluation of an individual development intervention designed to achieve specific objectives within specified resources and implementation schedule, often within a broader program. |
| Sector | Evaluation of a cluster of development interventions in a sector within a country or across countries, all of which contribute to the achievement of a specific development goal. |
| Country Programme | Evaluation of one or more donor’s or agency’s portfolio of development interventions, and the assistance strategy behind them, in a partner country. |
| Modes of development co-operation | Evaluation of a specific instrument or channel for development assistance (research, NGOs, humanitarian assistance, balance of payment support, general budget support, technical assistance, etc.). |
| Thematic | Evaluation of a selection of development interventions, all of which address a specific development priority that cuts across countries, regions and sectors. |
| Impact | Evaluation of a project, programme or policy to assess the intended or unintended effects, positive or negative, of a specific intervention on people’s welfare. |
| Humanitarian | Evaluation of humanitarian action to draw lessons to improve policy and practice and enhance accountability. |
Single intervention evaluations
A single intervention, traditionally called a development project, is defined as a time-bound intervention designed to achieve specific objectives with specified resources, usually within a short to medium term implementation schedule, and often within a broader program.
The limited scope of the single intervention makes it relatively easy to focus the evaluation.

Although overall project success is commonly determined by whether outputs are achieved on time and on budget, single interventions in development contexts are evaluated usually from the broader perspective of attainment of objectives and impact, in order to learn from the experience and to improve the design of similar future projects.
To the extent that objectives are specifically expressed and in measurable terms, single intervention evaluations present few methodological challenges.
However, a typical complicating factor is that the objectives are often complex, confounding, unrealistic or ill-defined, or change as the project develops. It follows that a single intervention is judged not only on the basis of what has formally been agreed; the assessment also takes into account what can realistically be achieved with the resources available.
A single intervention evaluation takes into consideration, yet distinguishes between, the assessment of the project’s outputs, i.e. the quality and quantity of what it is expected to produce, and its related, broader and more complex, effectiveness and impact questions.
Sector programme evaluations
Sector programmes are a well established mode of support for Danida in which the organisational performance of relevant institutions is a key factor for successful development.
As such, sector programme evaluations focus on questions of institutional performance, processes, changes and interrelationships, and on their effects on the sector.
Questions of organisational capacity of institutions, partner country ownership and responsibility are central to evaluating the success of sector programme support interventions and their sustainability.
A key success factor in sector programme evaluations is the involvement of the partner country and its ownership of the evaluation results.
Consistent with the emphasis on development partnerships, local ownership and good governance, donors increasingly programme sector assistance jointly and assist in developing an evaluation culture by incorporating joint evaluations into sector programmes.
Ultimately the goal is for partner countries to coordinate joint evaluations in their sectors.
Country programme evaluations
Country programme evaluations focus on the entire Danish assistance to one of the programme countries.
They provide an assessment of past interventions to Danida and partner countries to improve cooperation strategies, country programmes and sector interventions. As well, they generate knowledge to improve future assistance to the country and other national country programmes.
A country programme evaluation looks at the relevance of Danish assistance against Danida’s overall policy and country strategy, as well as the development policy of the partner country. It reviews the instruments used in the bilateral cooperation, the modalities, and the relative weight given to assistance for economic and social development.

Country programme evaluations are important for policy planning at the highest level and provide a basis for bilateral negotiations. They represent an opportunity to focus on specific relevant issues such as the country’s dependency on development assistance, institutional capabilities, democracy, human rights, gender, and environment in a wide context.
Because of the importance of these evaluations, involvement of the partner country in the evaluation process and its ensuing acceptance, ownership and use of the evaluative information is paramount.
Country programme evaluations are truly cross-cutting, i.e. they cover all sectors and modes of cooperation, and, as a result, focussing the evaluation poses difficult challenges. The evaluation team is usually interdisciplinary, with expertise reflecting the key issues the evaluation will focus on.
To focus the country programme evaluations within the resources available, the scope is typically limited to the development issues and the strategic choices made at the national and overall levels; the economic, political and social context of the country; and the individual development agency’s strategy or the joint assistance strategy of the involved agencies.
The evaluation will take its departure from other available sector or programme evaluations and programme documentation. However, the fieldwork will cover observations and interviews with government officials, programme staff, beneficiaries and interested parties as appropriate.
Evaluations of modes of assistance
These evaluations focus on the specific instrument or channel for development assistance, e.g. research, NGOs, balance of payment support, general budget support for poverty reduction, technical assistance. Such evaluations are usually cross-sectoral and cross-country and designed to assess the efficiency of a particular mode of assistance as a means of promoting development.
Evaluations of assistance instruments are initiated to provide accountability information on past activities and to generate knowledge to improve future performance. The focus is usually on the efficiency of assistance delivery, but also to some extent on the effect of assistance delivered under a given mode of cooperation.
The methodological approach varies according to the form of development assistance considered. Evaluating humanitarian assistance requires a significantly different approach from evaluating research assistance or balance of payment support.
A typical problem in evaluating assistance instruments is that the individual cases being assessed often have different objectives. This makes it difficult to compare systematically the effects of different modes of development assistance and their impact. As a result, these evaluations tend to focus more on the performance of the various parties involved in development assistance and the use of the resources made available.

Evaluation of assistance instruments has a narrow perspective, yet it allows lessons to be drawn from larger samples of development interventions, which in turn can have wide-ranging implications for policy and for improving bilateral co-operation.
Thematic evaluations
Thematic evaluations deal with selected aspects or themes in a number of development activities. There are several such themes that have been highlighted in Danish development assistance over the years. These themes are often born out of policy statements and are often termed “cross-cutting issues”.
This means that they are systematically integrated in all sectors of cooperation and in all modes of assistance. Some of the factors that have been highlighted in recent years are gender aspects, institutional development, human rights, and environmental aspects.
Over the years, such cross-cutting issues have been explored on a project-by-project basis by including specific questions in the Terms of Reference. This provides a wealth of useful information when thematic evaluations are undertaken.

Thematic evaluations cut across countries, regions and sectors and are usually the most complex type of evaluation. The topic is analysed in relation to Danida’s development policy, but also in the context of international conventions and the partner’s priorities and strategies. As in all complex evaluations, a key methodological challenge lies in distinguishing between the impact of Danish assistance and the impact of national activities and policies and of other donor initiatives.
Another common challenge in thematic evaluations is, in the preparatory phase, to draw a sample of development activities that reflects Danish development assistance in countries, sectors and forms of assistance.
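One way to approach such a sample is proportional stratified sampling. The sketch below is illustrative only and does not describe Danida’s actual procedure: the function name, the record fields (country, sector) and the portfolio data are all hypothetical. It groups activities into strata and draws from each stratum in proportion to its share of the portfolio.

```python
import random
from collections import defaultdict


def draw_stratified_sample(activities, strata_keys, sample_size, seed=0):
    """Draw a sample whose strata shares roughly match the portfolio.

    Hypothetical helper: `activities` is a list of dicts and `strata_keys`
    names the fields (e.g. country, sector) that define a stratum.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    strata = defaultdict(list)
    for activity in activities:
        strata[tuple(activity[key] for key in strata_keys)].append(activity)

    total = len(activities)
    sample = []
    for stratum in sorted(strata):
        members = strata[stratum]
        # Proportional allocation with at least one activity per stratum.
        # Rounding means the final size can differ slightly from sample_size.
        quota = max(1, round(sample_size * len(members) / total))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample


# Hypothetical portfolio of 12 activities in three strata (6, 4 and 2).
portfolio = (
    [{"id": i, "country": "Country A", "sector": "water"} for i in range(6)]
    + [{"id": i, "country": "Country A", "sector": "health"} for i in range(6, 10)]
    + [{"id": i, "country": "Country B", "sector": "water"} for i in range(10, 12)]
)

sample = draw_stratified_sample(portfolio, ("country", "sector"), sample_size=6)
for activity in sample:
    print(activity["id"], activity["country"], activity["sector"])
```

In practice the strata would also need to cover forms of assistance, and small strata may deserve deliberate over-sampling; the at-least-one rule here is only a simple stand-in for that judgement.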
Thematic evaluations are usually based on relatively large samples of development activities implemented over a relatively long period of time. Much of the study contends with older documents on file that are incomplete or marginal to the theme; as a result, fieldwork concentrates on the more recent development activities and aims at verifying information and developing an understanding that is meaningful in current contexts.
For all their difficulties, thematic evaluations have proved useful instruments in generating specific knowledge and recommendations at the highest level of aggregation, i.e. the policy level.
Impact evaluation
Impact evaluations do not focus on the type of intervention that is being evaluated, i.e. the evaluand, but rather on results different from outputs or outcomes. Impact evaluations assess the positive and negative changes in terms of people’s welfare, produced by a development intervention, directly or indirectly, intended or unintended.
By examining whether changes in the well-being of individuals can be attributed to a specific intervention, an impact evaluation can help understand whether or to what extent an intervention has produced the intended effect, e.g. increased incomes, decreased maternal mortality, increased literacy, etc.
An impact evaluation can also help to judge the value of an intervention in relation to its cost and provide information to decide whether it should be continued, scaled-up, altered or otherwise.
CHAPTER 3 DANIDA’S EVALUATION PROCESS
Evaluation programme
Danida’s evaluations are planned on the basis of a rolling two-year programme that is updated annually by the Evaluation Department. All relevant departments and representations are consulted and invited to propose evaluations to be included in the programme.
The Evaluation Department ensures that the programme reflects the distribution of Danish development assistance in countries, sectors, assistance instruments and priority areas.
The evaluation programme is posted on the Evaluation Department’s website http://www.evaluation.dk.
The main steps in Danida’s evaluation process, from the initial decision through to dissemination, use and follow-up, are described in the following sections.
Launching an evaluation
The Evaluation Department ensures that all involved parties are informed about the evaluation in due time before it is initiated. In the case of larger evaluations this is done through an approach paper that provides information on:
1. The background for the evaluation
2. The main objective
3. Outputs (reports etc. to be produced)
4. Major issues (themes or aspects to be covered)
5. Approach (methodological considerations)
6. Organisation and management
7. Preliminary work plan with a time schedule
The approach paper indicates the extent to which the partners will be involved in preparation and implementation of the evaluation, as well as possible co-operation with other donors. It is distributed for information and comments to relevant Danida departments and representations as well as partner institutions.
Where there are many interests involved, a workshop may be organised as a basis for the approach paper, and a reference group for the evaluation appointed. The composition of the reference group reflects the topic and purpose of the evaluation and is decided in consultation with relevant Danida departments.
Steps, roles and responsibilities in preparing the evaluation
| Evaluation step | EVAL | Other MFA staff | Consultant | Others |
| 1. Approach paper | a. Prepares draft. b. Ensures involvement of relevant MFA staff and other stakeholders. c. Finalises paper. | Embassy and BFT: Consultation. | | Partners, reference group and possibly other donors: Consultation. |
| 2. Terms of Reference | a. Prepares draft. b. Ensures involvement of relevant MFA staff and other stakeholders. c. Finalises Terms of Reference. | Embassy and BFT: Consultation. | | Partners, reference group and possibly other donors: Consultation. |
| 3. Tender process | a. Requests support from ERH. b. Prepares Terms of Reference for tender consultant. c. Participates in tender committee. | ERH: a. Announces pre-qualification. b. Announces tender. c. Hires tender consultant. d. Chairs the tender committee. e. Enters contract with consultant. | Prepares and submits bid. | Tender consultant: a. Prepares specifications. b. Assesses Letter of Interest. c. Assesses proposals. d. Participates in tender committee. Partners: Participate in tender committee. |
BFT is Danida’s Technical Assistance Services. ERH is Danida’s Contracts Department.
Terms of Reference
The Terms of Reference provide the background for the evaluation, define its objectives and scope, the composition of the evaluation team and the timing.
The Evaluation Department drafts the Terms of Reference based on the approach paper, and ensures that the evaluation is consistent with Danida’s evaluation policy and practices.
It is essential to consider the information needs of all involved parties. To this end a workshop or seminar may be organised to bring together the Evaluation Department, the representation involved, Danida’s Technical Assistance Services (BFT), the partner country (the line ministry), the project’s management (in single intervention evaluations) and other development agencies (in sector programme evaluations). Since the partner should be involved as a matter of priority, such workshops may be organised in the partner country.
The Terms of Reference include the following chapters:
1. Background
2. Objective
3. Output
4. Scope of work
5. Methodology
6. Work plan
7. Composition of evaluation team
8. Documents available
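As a small illustration, the chapter list above can serve as a completeness checklist for a draft. The sketch below is not a Danida tool; the function name and the sample draft headings are hypothetical.

```python
# Chapters that the Terms of Reference are expected to contain (from the list above).
REQUIRED_CHAPTERS = [
    "Background",
    "Objective",
    "Output",
    "Scope of work",
    "Methodology",
    "Work plan",
    "Composition of evaluation team",
    "Documents available",
]


def missing_chapters(draft_headings):
    """Return the required chapters that a draft does not yet contain.

    Hypothetical helper: matching ignores case and surrounding whitespace.
    """
    present = {heading.strip().lower() for heading in draft_headings}
    return [chapter for chapter in REQUIRED_CHAPTERS if chapter.lower() not in present]


# Hypothetical draft that so far contains only three chapters.
draft = ["Background", "objective", " Methodology "]
print(missing_chapters(draft))
```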
Evaluations are usually carried out in several phases such as desk study, fieldwork, and synthesis. In such cases overall Terms of Reference provide the framework for the evaluation, and one of the objectives of the desk study is to formulate a detailed fieldwork plan for the evaluation.
Assigning the evaluation team
Danida, with the participation of the partner country, establishes the evaluation team by contracting with individual consultants or, as is typically the case, with consulting organisations. Competence in the field and experience relevant to the task are the key selection criteria.
All Danida evaluations are tendered according to EU procurement directives and invitations to bid are disseminated internationally, i.e. beyond the EU. Tenders are appraised on the basis of topical content, professional composition and competence of the evaluation team, and price. This tender procedure requires a preparation period that may take three to six months more than the process for evaluations below the directives’ threshold.
In cases of partner-led joint evaluations, the lead partner’s procurement policies and processes may apply.
The success of an evaluation depends on the composition of the evaluation team and the competence and personal abilities of the team members. This applies in particular to the team leader who should be the one concerned with the overall perspective, able to organise and co-ordinate the work of the team members, assess the quality and relevance of their contributions and act as a spokesperson for the team.
Members of the evaluation team are selected to represent relevant professional areas and to include professional expertise from the programme country whenever possible.
To safeguard impartiality, members of the evaluation team may not have been personally involved in the activities to be evaluated; as well, organisations conducting evaluations may not have been involved in the preparation or implementation of those activities.
Overseeing the evaluation
The Evaluation Department is responsible for managing the contractual relationship with the selected evaluation team to make sure that the evaluation is consistent with its Terms of Reference, Danida’s evaluation policy and good evaluation practice, and that it is carried out in a cost-effective and timely fashion.
As well, the Evaluation Department is responsible for ensuring that the evaluation process and products meet the DAC Evaluation Quality Standards (see note 8).
The DAC Standards are intended to:
- Provide standards for the process (conduct) and products (outputs) of evaluations;
- Facilitate the comparison of evaluations across countries (meta-evaluation);
- Facilitate partnerships and collaboration on joint evaluations;
- Better enable member countries to make use of each other’s evaluation findings and reports (including good practice and lessons learned); and
- Streamline evaluation efforts.
The Evaluation Department may set up a reference group for each evaluation that includes representatives from the main users of the evaluation and serves to provide guidance and feedback to the external evaluation team. In the case of joint evaluations the structure and mandate of the reference group may differ.
Furthermore, to strengthen quality assurance, the Evaluation Department may use external peer reviewers.
Once the evaluation team has been assigned, the Evaluation Department monitors the conduct of the evaluation in each of the following phases:
Inception: During this first phase the evaluation team works on the basis of the Terms of Reference with the Evaluation Department to develop a detailed plan for the field study phase.
The evaluation team discusses any methodological adjustments with the Evaluation Department, whose role it is to consider significant proposed changes to what was described in the Terms of Reference and reach agreement with the evaluation team on the course of action.
At the end of this first phase, the evaluation team produces an inception report, i.e. a detailed operational plan for the conduct of the evaluation fieldwork that is submitted for approval to the Evaluation Department.
The inception report is reviewed by the Evaluation Department and the reference group and, where appropriate, consultations take place with other key partners and stakeholders before the Evaluation Department signs off on the inception report.
Fieldwork: This second phase typically encompasses data collection, analysis, discussion of findings, conclusions and recommendations through to the drafting of the final report.
The role of the Evaluation Department is to facilitate the gathering of information and general conduct of the fieldwork by the evaluation team, while monitoring that the evaluation plan is being followed and changes discussed and agreed to with the team.
As well, the Evaluation Department is available to provide guidance and assistance to the evaluation team in dealing with management, logistic and methodological issues as they arise. It acts to facilitate coordination of the evaluation with the country partner, Danish representation in the country, collaborating organisations, users and stakeholders.
Together with the reference group, the Evaluation Department considers and discusses with the team the evaluation’s findings and the preliminary conclusions and recommendations before these are included in the draft final report.
It makes sure that all relevant users and stakeholders are included in the draft discussion process as appropriate, while supporting the evaluation team’s independence.
Use of evaluation
The users of evaluations are Danida, partner country authorities, stakeholders, the media, politicians, the public, and external resources (researchers, consultants, professional agencies, etc.).
Making information available is essential to general awareness, interest and support for development assistance. The evaluation report provides a status of strengths and weaknesses of development interventions and suggests solutions to major problems. To the extent that it does so, evaluation is a piece of technical assistance in itself which is appreciated as a test of the viability of development activities and the bilateral co-operation between the partner authorities and Danida.
Follow-up of evaluations
At the conclusion of an evaluation, a follow-up memo is prepared, taking note of Danida’s position on the conclusions and recommendations as well as identifying which departments are responsible for the agreed follow-up activities. The follow-up memo is discussed in the Programme Committee and signed off by the State Secretary. The Evaluation Department undertakes to monitor the implementation of the follow-up activities at regular intervals.
The Evaluation Department contributes actively to the dissemination of Danida’s own as well as other organisations’ evaluation experience via workshops and seminars for staff in co-operation with the Ministry’s education section. Further, the Evaluation Department assists Danida’s Centre for Competence Development in the dissemination of evaluation experience.
On the basis of Danida’s own and others’ evaluation experiences, “best practices” will be compiled and formulated. Furthermore, the Evaluation Department will contribute to the incorporation of evaluation experience in policies, strategies and guidelines, etc.
After the final report has been received, Danida decides what actions are to be taken in the light of the conclusions and recommendations presented. Salient issues are brought to the attention of Danida’s senior level management. This may eventually be used to further develop and improve overall policies and methods of work. It is the task of the Evaluation Department to systematise such experience and formulate practical proposals on that basis.
Summaries of evaluation reports are distributed to all Danida staff members. It is the responsibility of Danida’s operational departments and Embassies to ensure that relevant past experience is built into the design and preparation of future activities.
The significance of an evaluation report is different for different users. It is essential that the final users of the information are able to utilise the evaluation system for their own needs. This implies that for the evaluation system to function properly, the demand for information needs to be carefully analysed. In an evaluation system designed to respond to demands rather than having a control function, the information is more likely to be utilised.
In order to ensure this linkage, the Evaluation Department has a role in assessing the needs of users and developing evaluation to respond to shifting policy and modes of development co-operation. Also, it is vital for the Department to assess continuously the quality of evaluation activities and explore mechanisms for feedback and improved learning.
The main feedback linkages are to make use of evaluation in planning of new development interventions, to manage existing activities, to develop policy and strategies, and to train staff members and external resources.
Apart from the evaluation report itself, feedback is provided through the summary of evaluation, the follow-up memorandum, specific studies and reports produced by the Evaluation Department, seminars and workshops, and guidelines issued by the Evaluation Department. Brief summaries of evaluation reports are submitted to the database of the OECD/DAC Network on Development Evaluation, and all Danida’s evaluations are published and made public in the form of printed reports, summaries, and electronically on http://www.evaluation.dk.
Dissemination
Whether evaluation reports should be made public has been a matter of debate. Evaluation activities were originally not meant to be a means of accountability for development assistance. In the early 1980s, however, the policy was that public information activities should to some extent make use of the results of evaluations. The policy has since changed successively, and since 1989 all evaluation reports have been available to the public.
The reaction has been generally positive. The availability of information has lifted the professional debate regarding Danish development assistance. Researchers, students, companies and individuals have benefited. Some of the debate in mass media has been more informed with the availability of evaluation reports.
Apart from distributing the evaluation report itself, the most common ways to disseminate evaluation information are through the evaluation summaries, annual reports, bibliographies, thematic reports, the web, seminars, press releases, and public debate.
Whatever channel is preferred, the best way to ensure dissemination of lessons learned and knowledge gained in evaluations is to improve both the content of reports and the presentation of material. A key benefit of good dissemination practices is transparency of development interventions and public insight into their value.
Code of conduct
Danida has developed a code of conduct to ensure the independence and credibility of evaluations by clarifying the responsibilities, proper conduct and mutual relations of consultants, the Evaluation Department, other Danida staff, and other parties responsible for the activities under evaluation. Other Danida staff includes both Embassy staff and Headquarters staff, while other parties responsible for the activities under evaluation include partner organisations and institutions in partner countries (e.g. line ministries, private sector actors, and civil society organisations), organisations in Denmark (e.g. Danish NGOs), and multilateral organisations and other international institutions supported by Danida.
Evaluators have cited two particularly difficult phases in the evaluation process where uncertainty about roles may threaten their independence: organisation of fieldwork and discussion of the draft report. The code, therefore, concentrates on issues relating to these two phases in the evaluation process, while also addressing a range of other potential conflicts.
Other key actors in the evaluation process are the target group of the activities under evaluation, other persons interviewed and contacted during the process, reference groups and peers. When the evaluation is undertaken as a joint evaluation involving several donors and partners, the described role of the Evaluation Department is also applicable to the role of other donors (the lead agency).
The code for consultants can be found in Chapter 5 – Evaluation team tasks.
Code for Evaluation Department staff
The Evaluation Department is the client, in the case of joint evaluations together with other partners. It organises the assignment, controls the quality of the evaluation method and the quality of the report, and is responsible for dissemination and for monitoring the follow-up of evaluation results. The Evaluation Department prepares approach papers and Terms of Reference, identifies consultants, facilitates contacts to Danida staff and other relevant resource persons, facilitates the collection of documents, ensures thorough assessment of the evaluation method, approves the inception report, assesses the quality of the draft report and approves the final report.
- The Evaluation Department is responsible for briefing the consultants, at the inception stage and as often as necessary during the evaluation process, on the operations, the expected role of all parties involved in the process, and relevant documents and data sources. Contact between the Evaluation Department and consultants may take place at the initiative of either party.
- While consultants have chief responsibility for preparing the fieldwork, the task of the Evaluation Department is to facilitate the process and provide quality control of the final evaluation methodology and work plan. Consultants suggest sites for field visits, samples of interview persons, etc., but final decisions on methodology are subject to approval by the Evaluation Department. After the inception report has been approved, adjustments of the methodology described are subject to agreement between the Evaluation Department and the consultants.
- Evaluation Department staff participating in a scoping mission, the initial part of the fieldwork or in post-fieldwork debriefing meetings are responsible for facilitating the evaluation and assuring that the evaluation is proceeding correctly with regard to methodology, timing, etc. The participation and role of the Evaluation Department, and possible stakeholders, in meetings, interviews and field visits must be agreed with the consultant team leader. Stakeholders may include Embassy staff, Danida advisers and company advisers, partner representatives, other parties responsible for the activities under evaluation, and others.
- The Evaluation Department provides quality assurance of the report by checking the validity of evidence, the analytical rigour, the consistency of analyses, conclusions and recommendations, that requirements in Terms of Reference have been adequately dealt with, that the report is clearly structured and well-written, and that the report does not contain factual errors and inaccuracies.
- It is the responsibility of the Evaluation Department to synthesise comments received from Danida staff and other stakeholders into a set of consolidated comments pointing out which of the comments received are considered the most important, citing necessary changes to be made (comments relating to the issues above as well as further suggested changes, which the consultant may consider when preparing the final report).
- It is the responsibility of the Evaluation Department to ensure that relevant Danida staff and other stakeholders are kept informed about the progress of the evaluation. After the draft report has been presented, all communication between consultants and Danida staff and other stakeholders should go through the Evaluation Department.
- The Evaluation Department is required to react to all requests for assistance relating to situations in which consultants feel their independence threatened. Thus, during all steps of the evaluation process, it is an important task of the Evaluation Department to protect consultants against undue pressure from Danida staff and other stakeholders. If consultants encounter insufficient assistance or outright resistance during the evaluation process, including when carrying out the fieldwork, it is the responsibility of the Evaluation Department to contact the persons involved and ensure that proper cooperation is established. If the atmosphere during discussions of the draft report turns hostile, or arguments are not professionally sustained, it is the task of the Evaluation Department to intervene. In the extreme situation where comments can be interpreted as subtle or overt pressure against consultants to achieve specific conclusions, the Evaluation Department has a particular responsibility to take immediate action.
- If consultants suspect mismanagement, corruption or other illicit practices, the Evaluation Department must ensure that the information is passed on to relevant departments for appropriate action.
Code for Danida staff (Embassy and Headquarters staff)
Danida staff has a complex role in the evaluation process: they can request an evaluation, be the object of an evaluation, function as key resource persons during the evaluation, and/or be users of the results. They have a key role in identifying relevant evaluation topics and in assuring the usefulness of evaluation findings for the learning processes of Danida and its partners. They facilitate evaluations, and they contribute to quality assurance of the reports by pointing out factual errors and inaccuracies. Danida staff, thus, has the following responsibilities:
- When facilitating evaluations by providing contacts, references, information about activities and logistical support to the consultants, Danida staff is expected to respect the integrity of the evaluation team in making its own decisions about where to go and whom to see. Danida staff may certainly provide comments or background information on suggested sites for field visits and persons to be interviewed, but the final decision to visit the site rests with the evaluation team.
- Danida staff should assist with the identification of relevant documents, even if the material has not been specifically requested.
- When contributing to quality assurance of the report, Danida staff should observe the right of consultants to make conclusions and recommendations, which may not be shared by Danida. It is the responsibility of Danida staff to contribute to a fruitful exchange of information and views and to avoid any forms of communication, which could be perceived by consultants as undue pressure or even threats to produce certain conclusions.
- Reservations regarding the competence of evaluation team members, the quality of the fieldwork, the quality of analyses, etc. should be reported immediately to the Evaluation Department. If such reservations have not been expressed in due time, Danida staff should refrain from criticising the quality of the evaluation after its publication.
- After the draft report has been produced, all contact between consultants and Danida staff should go through the Evaluation Department. If consultants meet with Danida staff, the Evaluation Department should be present, and all correspondence between consultants and Danida staff should be copied to the Evaluation Department.
Code for other parties responsible for the activities under evaluation
With the increasing alignment of Danish development cooperation with partner management structures and with the emphasis put on partner ownership, actors in partner countries such as line ministries, private sector actors, and civil society organisations are key parties to the evaluation process. Other parties responsible for the activities under evaluation may include organisations in Denmark (e.g. Danish NGOs), multilateral organisations and other international institutions supported by Danida. Similar to Danida staff, these stakeholders have the complex role of being both the object of an evaluation, key resource persons during the evaluation, and/or users of the results. They have a key role in assuring the usefulness of evaluation findings for their own learning processes. They facilitate evaluations, and they provide quality assurance of the reports by pointing out factual errors and inaccuracies. Their responsibilities are as follows:
- When facilitating evaluations by providing contacts, references, information about activities and logistical support to the consultants, stakeholders are expected to respect the integrity of the evaluation team in making its own decisions about where to go and whom to see. Stakeholders may certainly provide comments or background information on suggested sites for field visits and persons to be interviewed, but the final decision to visit the site should be made by the evaluation team.
- Stakeholders should assist with the identification of relevant documents, even if the material has not been specifically requested.
- When providing quality assurance of the report, stakeholders should observe the right of consultants to make conclusions and recommendations, which may not be shared by their organisation. It is the responsibility of the stakeholders to contribute to a fruitful exchange of information and views and to avoid any forms of communication, which could be perceived by consultants as undue pressure or even threats to produce certain conclusions.
- Reservations regarding the competence of evaluation team members, the quality of the fieldwork, the quality of analyses, etc. should be reported immediately to the Evaluation Department. If such reservations have not been expressed in due time, stakeholders should refrain from criticising the quality of the evaluation after its publication.
- After the draft report has been produced, all contact between consultants and stakeholders should go through the Evaluation Department. If consultants meet with stakeholders, the Evaluation Department should be present, and all correspondence between consultants and stakeholders should be copied to the Evaluation Department.
8 Annex 2.
CHAPTER 4 EVALUATION QUESTIONS
Evaluation criteria
The OECD/DAC definition of evaluation has been adopted by Danida and all major development agencies internationally. The definition contains five evaluation criteria that should be used in assessing development interventions: relevance, efficiency, effectiveness, impact and sustainability.
These are general criteria that should be used as a basis for developing evaluative questions through the full range of evaluation topics, i.e. from single intervention through to thematic, and ways of conducting the evaluation, e.g. joint evaluation.
Taken together, these five criteria should provide the decision-maker with the essential information and clues to understand the situation and determine what should be done next.
To the extent that specific evaluations have specific purposes, that there is no one right way to conduct an evaluation and that these criteria are interdependent and not mutually exclusive, their relative meaningfulness for a specific evaluation should be assessed and trade-offs discussed in each case to ensure that key questions are addressed and to avoid unnecessary effort and expense.
In addition, use of the five criteria does not exclude that other criteria might be used as well to better focus the evaluation on specific characteristics of the intervention and its context.
The criteria for the evaluation of humanitarian assistance are a case in point: because of some unique features of humanitarian intervention, the Active Learning Network for Accountability and Performance in Humanitarian Action (ALNAP, http://www.alnap.org) has introduced three additional evaluation criteria: connectedness, coherence and coverage.9
Evaluation criteria
Relevance: The extent to which the objectives of a development intervention are consistent with beneficiaries’ requirements, country needs, global priorities and partners’ and donors’ policies.
Efficiency: A measure of how economically resources/inputs (funds, expertise, time, etc.) are converted to results.
Effectiveness: The extent to which the development intervention’s objectives were achieved, or are expected to be achieved, taking into account their relative importance.
Impacts: The positive and negative, primary and secondary long-term effects produced by a development intervention, directly or indirectly, intended or unintended.
Sustainability: The continuation of benefits from a development intervention after major development assistance has been completed. The probability of long-term benefits. The resilience to risk of the net benefit flows over time.
Additional criteria for evaluation of humanitarian action
Connectedness: The need to ensure that activities of a short-term emergency nature are carried out in a context that takes longer-term and interconnected problems into account.
Coherence: The need to assess security, developmental, trade and military policies as well as humanitarian policies, to ensure that there is consistency and, in particular, that all policies take into account humanitarian and human rights considerations.
Coverage: The need to reach major population groups facing life-threatening suffering wherever they are.
Relevance
Relevance is a measure of the extent to which development interventions meet population needs and country priorities, and are consistent with donor policies.
For example, in a road project relevance could be assessed in terms of the rationale for constructing the road: was it to serve a political agenda of the few or to exploit real economic potential? In a sector programme to support agriculture, relevance could be assessed in terms of domestic market responses to new crops, farmers’ responses to the various programme initiatives, etc.
A change in society’s policies or priorities could imply that the development interventions are now accorded lower priority, or lose some of their rationale. Once an endemic disease has been eradicated, for instance, it could mean there is no longer any need for a special health programme.
In other words, relevance is basically a question of usefulness; in turn, the assessment of relevance leads to higher level decisions as to whether the development activities in question ought to be terminated or allowed to continue. And, if the latter is the case, what changes ought to be made, and in what direction? Are the agreed objectives still valid, and do they represent sufficient rationale for continuing the activities?
These questions should be addressed at various levels with reference to the partner country:
- At the higher level it concerns the relationship between the development activities and the development policy of the partner country, as well as whether the activities are in keeping with the priorities of the donor country.
- At the middle level it is a question of how development activities fit into a larger context, e.g. in relation to other development interventions and development efforts within a larger programme or sector.
- At the lower level it is a question of whether the development activities are directed towards areas accorded high priority by the affected parties.
Efficiency
Efficiency is a measure of the relationship between outputs, i.e. the products or services of an intervention, and inputs, i.e. the resources that it uses.
An output is a measure of effort; it is the immediate observable result of intervention processes over which the managers of the intervention, i.e. the implementers, have some measure of control. An intervention can be thought of as efficient if it uses the least costly resources that are appropriate and available to achieve the desired outputs, i.e. deliverables, in terms of quantity and quality.
The quality of the inputs and the outputs is an important consideration in assessing efficiency: the most economical resource is not necessarily the most appropriate and the trade-offs between the quantity of outputs and their quality are a key factor of overall performance.
Furthermore, assessing the efficiency of an intervention generally requires comparing alternative approaches to achieving the same outputs, and this will be easier for some types of intervention than for others.
In practice, the extent to which intervention activities are standardised, i.e. whether the factors of production are well known, usually determines how efficiency is measured and assessed.
In a road building project for example, where the methods of construction are fairly well established, a typical measure of efficiency would be the cost per km per class of road. As well, because other projects and jurisdictions are also likely to use that same measure of efficiency, among others, the bases for comparison and assessment, or benchmarks, are readily available in most cases.
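The cost-per-km measure described above can be sketched as a small calculation. The following Python example is only an illustration, not a Danida tool: the road classes, lengths, costs and benchmark values are hypothetical figures invented for the example.

```python
from collections import defaultdict

# Hypothetical road sections: (road class, length in km, total cost).
# All figures are invented for illustration only.
sections = [
    ("gravel", 12.0, 600_000.0),
    ("gravel", 8.0, 440_000.0),
    ("paved", 5.0, 1_500_000.0),
]

# Hypothetical benchmarks (cost per km) taken from comparable projects.
benchmarks = {"gravel": 50_000.0, "paved": 280_000.0}


def cost_per_km_by_class(sections):
    """Return total cost divided by total length for each road class."""
    total_cost = defaultdict(float)
    total_km = defaultdict(float)
    for road_class, km, cost in sections:
        total_cost[road_class] += cost
        total_km[road_class] += km
    return {c: total_cost[c] / total_km[c] for c in total_cost}


unit_costs = cost_per_km_by_class(sections)
for road_class, unit_cost in unit_costs.items():
    ratio = unit_cost / benchmarks[road_class]
    print(f"{road_class}: {unit_cost:,.0f} per km ({ratio:.2f} x benchmark)")
```

A ratio above 1 does not in itself show inefficiency; it flags a question for the evaluator, who still has to consider the quality of the outputs and the comparability of the benchmark.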
On the other hand, a national initiative on women’s rights, for example, is not standardised across countries. In such cases, relevant measures of efficiency typically address waste, either at the level of inputs, i.e. economy (obtaining appropriate resources at least cost or fair market value), or at the level of process, i.e. duplication or triplication of activities, conflicting processes, or throughputs that do not link to outputs. As well, good practices, i.e. lessons learned from similar endeavours, can be used as benchmarks for assessing efficiency.
Some examples of useful and practical criteria for assessing the efficiency of a programme or a project are:
- Appropriate resources acquired with due regard for economy
- Activities carried out as simply as possible
- Decisions made as close as possible to where the products or services are delivered
- Overhead as low as possible
- Duplication or conflicts addressed and resolved
- Deliverables achieved on time and on budget.
Effectiveness
Effectiveness is a measure of the extent to which the intervention’s intended outcomes, i.e. its specific objectives – intermediate results – have been achieved.
Explicitly, effectiveness is the relationship between the intervention’s outputs, i.e. its products or services – its immediate results – and its outcomes, meaning usually the intended benefits for a particular target group of beneficiaries.
As such, an intervention is considered effective when its outputs produce the desired outcomes; it is efficient when it uses resources appropriately and economically to produce the desired outputs.
For example, a teaching programme is considered effective if students learn, i.e. acquire intended knowledge, skills and abilities; it is considered efficient if it provides instruction, i.e. teaching time and materials, economically and of quality.
An efficient intervention is not necessarily effective. Teaching may be provided economically and efficiently, but if it is not of good quality, e.g. appropriate to the needs and interests of the students, intended learning outcomes will not be achieved, i.e. it will not be effective.
Evaluating the effectiveness of an intervention involves three steps:
1. Measuring for change in the observed outcome, e.g. did the students learn something;
2. Attributing the change in the observed outcome to the intervention, e.g. did the students learn something because of the teaching;
3. Judging the value of the teaching to the learning, e.g. by using comparisons such as targets, benchmarks, similar interventions, the assessments of teachers, students, others, etc.
Interventions have no control per se over outcomes; at best, a programme strives to produce those outputs that have the greatest likelihood of producing the intended outcomes.
As such, an intervention’s effectiveness is driven primarily by two things: its design and its implementation, i.e. its management.
Attribution, i.e. internal validity, is typically the central challenge in assessing the effectiveness of interventions: how, and to what extent, can observed changes in outcomes be validly attributed to the intervention?
Where the internal validity of the programme is well established, e.g. an immunisation programme, attribution of outcomes, e.g. beneficiaries protected from disease because they have been vaccinated against that disease, can be fairly straightforward.
However, in the case of many development interventions, internal validity is not well established and attribution can become a significant challenge. For example, attributing validly a change in the incidence of human rights violations in a country to an intervention or set of interventions might be difficult for most evaluations.
Generally, the more the evaluand, i.e. what is being evaluated, is large and complex, the more attribution is likely to be difficult.
The reality of methodological and resource constraints in carrying out practical evaluation means that often attribution is expressed in terms of likelihood rather than proof, and that ultimately the test of validity is credibility.
Other challenges to assessing effectiveness are typically:
- Non-existent or poorly defined objectives, e.g. intended outcomes are not stated as measurable change over time in target groups
- Unrealistic and/or conflicting objectives
- Lack of targets or measures of success.
Impact
Impact is a measure of all significant effects of the development intervention, positive or negative, expected or unforeseen, on its beneficiaries and other affected parties.
Whereas effectiveness focuses on the intended outcomes of an intervention, impact is a measure of the broader consequences of the intervention such as economic, social, political, technical or environmental effects; locally, regionally, or at the national level; on the target group and other directly or indirectly affected parties.
For example, an HIV/AIDS prevention and treatment programme targeting vulnerable groups could have broader effects both positive, such as a reduction in the incidence of tuberculosis in other groups, and negative, such as a reduction of funding to malaria prevention. Effects may also be economic in nature, e.g. size of the workforce, political, e.g. state budget allocation, and so on.
A broad assessment of impact is essential in a comprehensive evaluation; however, there are two central challenges to assessing impact: boundary judgment, i.e. deciding what effects to select for consideration, and attribution, i.e. what effect is due to what.
Because, on the one hand, effects can be numerous and varied, and on the other they are typically the result of complex interactions, assessing impact is difficult in most circumstances.
As is the case for effectiveness, the assessment of impact poses a particular challenge with regard to attribution; in most cases, it is difficult to attribute rigorously broad effects on different groups and at different levels over time to a specific intervention or set of interventions.
Systems theory approaches typically provide more appropriate and useful tools for dealing with complex adaptive systems, e.g. societies, than simple linear causal approaches.
As well, a useful principle for dealing pragmatically with the issue of selection of effects, levels and groups for the evaluation, is making choices consistent with the intended use of the evaluative information.
In the final analysis, usually and at best, evaluations estimate impact on probability-based inferences derived from assumptions of simplified cause and effect relationships.
In the case of an impact evaluation, i.e. one in which measurement and assessment of impact is given priority, one must estimate the counterfactual – that means what would have happened if the intervention had not taken place. This can be done by choosing a control or comparison group – a group of individuals, households, etc., that are identical to the project group except for participation in the project or programme.
As well, baseline data, i.e. information about the state of groups before the intervention, is useful in order to measure differences after the intervention has taken place.
These methodological requirements pose particular challenges to the conduct of impact evaluation.
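One common way of combining a comparison group with baseline data is a difference-in-differences estimate. The guidelines do not name this technique; it is shown here only as one possible illustration of how a counterfactual can be approximated. The group scores below are hypothetical values invented for the example.

```python
from statistics import mean

# Hypothetical outcome scores (e.g. a household income index).
# All figures are invented for illustration only.
project_before = [40.0, 42.0, 38.0, 44.0]
project_after = [52.0, 55.0, 49.0, 56.0]
comparison_before = [41.0, 39.0, 43.0, 41.0]
comparison_after = [45.0, 44.0, 47.0, 44.0]


def difference_in_differences(p_before, p_after, c_before, c_after):
    """Change in the project group minus change in the comparison group."""
    project_change = mean(p_after) - mean(p_before)
    comparison_change = mean(c_after) - mean(c_before)
    return project_change - comparison_change


estimate = difference_in_differences(
    project_before, project_after, comparison_before, comparison_after
)
print(f"Estimated effect of the intervention: {estimate:.1f} index points")
```

The estimate can be read as the effect of the intervention only on the assumption that both groups would have changed in the same way had the intervention not taken place. Where the comparison group differs from the project group in other respects, the result is an indication of likelihood rather than proof.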
Sustainability
Sustainability is a measure of whether the benefits of a development intervention are likely to continue after external support has been completed.
While the four preceding criteria concern specific development interventions, the assessment of sustainability addresses the effects of the development process itself over the long term.
For example, in a road construction project, sustainability can be measured in terms of whether the road is likely to be maintained, the extent to which it will be used and provide benefits in the future, etc. In a sector programme to support agriculture, it could be measured in terms of financial and economic viability of the agricultural production and the supporting institutions, the extent to which economic surplus is reinvested productively by farmers, etc.
Sustainability is in many ways a higher level test of whether or not the development intervention has been a success. Far too many development initiatives tend to fail once the implementation phase is over, because either the target group or the responsible parties do not have the means or sufficient motivation to provide the resources needed for the activities to go further.
Sustainability is becoming an increasingly central theme in evaluation work since many development agencies are putting greater emphasis on long term perspectives and on lasting improvements.
As a result, capacity-development of communities and organisations is a common objective of development interventions, consistent with the overall goal of promoting increased autonomy and self-reliance of partner countries for the provision of public services.
Useful questions for assessing sustainability address the extent to which capacity has been successfully developed (e.g. through participation, empowerment and ownership), local resources are available, and sustained political support exists.
As well, the sustainability of development interventions depends to a large extent on whether the positive impacts justify the required investments and whether the community values the benefits sufficiently to devote scarce resources to generating them.
Because sustainability is concerned with what happens after development activities are completed, it should ideally be measured some years afterwards. It is difficult to provide a reliable assessment of sustainability while activities are still underway, or immediately afterwards. In such cases, the assessment is based on projections of future developments, drawing on available knowledge about the intervention and the capacity of involved parties to deal with changing contexts. It requires an analysis of the contextual setting – its capabilities and restraints – and future scenarios.
The experience of donor agencies indicates that a development intervention’s sustainability hinges mainly on seven areas, also called sustainability factors.
These seven factors should be taken into account all along the implementation cycle, and should be used as a checklist during the evaluation to identify relevant questions:
Sustainability factors
Policy support measures: Policies, priorities, and specific commitments of the recipient supporting the chances of success.
Choice of technology: Choice and adaptation of technology appropriate to existing conditions.
Environmental matters: Exploitation, management and development of resources. Protection of the environment.
Socio-cultural aspects: Socio-cultural integration. Impact on various groups (gender, ethnic, religious, etc.).
Institutional aspects: Institutional and organisational capacity and distribution of responsibilities between existing bodies.
Economic and financial aspects: Economic viability and financial sustainability.
External factors: Political stability, economic crises and shocks, overall level of development, balance of payments status and natural disasters.
Policy support measures
The recipient’s commitment is one of the most commonly identified factors affecting success of development interventions. Commitment is expressed in terms of agreement on objectives, the scope of support to responsible organisations and the willingness to provide financial and personnel resources. Country commitment is also shaped by perceptions of mutuality of interests versus perceptions of predominantly donor-driven interests.
Choice of technology
The partner country’s financial and institutional capabilities are crucial determinants of the technology chosen and of whether that technology is accepted and backed by mechanisms for its maintenance and renewal. Evaluation teams should consider the effects of technology in society and the costs of providing and maintaining the technology versus the benefits generated.
Environmental matters
The importance of environmental considerations is now widely recognised. Although environmental effects may be negligible seen in a narrow, short-term perspective, the broader effects may be significant in a long-term perspective. Evaluations will frequently have to look specifically at environmental policy, incentives and regulatory measures, the interests of different stakeholders, and the effects of development interventions.
Socio-cultural aspects
Social and cultural factors influence the adaptability and relevance of various development activities. They also influence motivation among the target group members and hence whether individuals and groups will participate and accept responsibilities in the development process. Development interventions that are consistent with local traditions or do not assume major changes in behaviour patterns stand a better chance of success. Danida requires special attention to the roles of both women and men in implementing development interventions, particularly their access to means of production and support services, as well as their rights and benefits.
Institutional aspects
The strength of institutions and the capacity of organisations are the most important factors in the success of development interventions. Current trends to change the division of roles between public and private organisations may raise the issue of how development interventions affect the co-operation and co-ordination of participating bodies. At the more detailed level, assessment may include considerations of managerial leadership, administrative systems and the involvement of beneficiaries.
Economic and financial aspects
Evaluations should focus essentially on three aspects of economic and financial performance. Firstly, the cost effectiveness of the intervention strategy. Secondly, the economic and financial benefits of investments as compared with the funds and resources spent. Finally, the financial sustainability of operations in the future to explore whether funds are or will be sufficient to cover future operations, maintenance and depreciation of investments.
External factors
Development assistance takes place in the context of political, economic and cultural environments that are beyond its control yet can influence it significantly. Factors such as political stability, economic crises and shocks, overall level of development, balance of payments status and natural disasters can have a determining influence on the sustainability of development interventions.
Additional criteria for evaluating humanitarian action
(Adapted from Beck, T. 2006)
Evaluation of humanitarian action, in response to natural disasters and to conflicts, has been the subject of some attention over the last few years with a view to improving its quality. References for evaluators and for evaluation managers, such as “Evaluating Humanitarian Assistance Programmes in Complex Emergencies”,10 “Guidance for Evaluating Humanitarian Assistance in Complex Emergencies”11 and “Evaluating Humanitarian Action using the OECD/DAC Criteria”, are key to furthering the understanding and the quality of evaluation practice in this area.
Because of some of the distinct characteristics of humanitarian action, ALNAP proposes three additional criteria (to OECD/DAC’s five): connectedness, coherence and coverage.
Evaluation of humanitarian action (EHA) is defined by ALNAP as “a systematic and impartial examination of humanitarian action intended to draw lessons to improve policy and practice and enhance accountability.”
Because they are usually undertaken during severe disruptions, often prolonged in the case of complex emergencies, EHAs have some distinct characteristics that can make access to data and information difficult, for example:
- Polarisation of perspectives in conflict situations that reduces the “space” for fair and balanced views;
- High turnover of staff working in humanitarian action that affects “organisational memory”;
- Reactive and quick implementation of humanitarian action that affects planning and the identification of performance measures;
- The context of humanitarian action may be disordered, with rapid changes in circumstances, so that usual social and physical conditions no longer hold.
Connectedness
Connectedness is defined as “The need to ensure that activities of a short-term emergency nature are carried out in a context that takes longer-term and interconnected problems into account.”
Connectedness is adapted from the concept of sustainability and is a measure of the relationship between humanitarian interventions and longer term goals, in particular the linkages between humanitarian action, recovery and development.
Some of the issues to consider when addressing connectedness are relative expenditure on relief and recovery, the nature of partnerships, in particular between international and national NGOs, and the extent to which local capacity is supported and developed.
Coherence
Coherence is defined as “The need to assess security, developmental, trade and military policies as well as humanitarian policies, to ensure that there is consistency and, in particular, that all policies take into account humanitarian and human rights considerations.”
The focus of this criterion is on the extent to which the policies of different actors, e.g. military, civilian, political, were complementary or contradictory. Policies may be of any type such as promoting gender equality, participation or environmental protection.
Coherence may be the most difficult criterion to evaluate, in part because it is often confused with coordination. The evaluation of coherence focuses mainly on the policy level while that of coordination more on operational issues.
Evaluation of coherence is important where there are many actors and increased risk of conflicting mandates and interests. Questions on the degree of coherence observed, or otherwise, are also important and useful to address.
Coverage
Coverage is defined as “The need to reach major population groups facing life-threatening suffering wherever they are.” The key questions this criterion generates are who was supported by humanitarian action and why.
Evaluation of coverage usually takes place at three levels: international, national or regional, and local, and the evaluation should consider whether assistance was provided proportionally according to the need at each level.
Whether protection needs have been met is an important question to address and so are issues of inclusion and exclusion bias at regional and local levels.
Political factors often determine coverage so that analysing them is key to understanding the nature and extent of coverage of groups, including issues of protection and humanitarian space.
Finally, equity questions are central to the assessment of coverage and are addressed through geographical analysis and the organisation of data by socioeconomic categories such as gender, socioeconomic groupings, ethnicity, age and ability.
9 Beck, T. (2006). Evaluating humanitarian action using the OECD-DAC criteria: An ALNAP guide for humanitarian agencies. London, UK: Overseas Development Institute.
10 Hallam, A. (1998). Evaluating Humanitarian Assistance Programmes in Complex Emergencies. Good Practice Review 7. London, UK: ODI.
11 OECD (1999). Guidance for Evaluating Humanitarian Assistance in Complex Emergencies. Paris.
CHAPTER 5 EVALUATION TEAM TASKS
This chapter explains the expectations Danida and its Evaluation Department have of the external evaluation team with regard to the content and quality of the evaluation it is responsible for.
The information in this chapter intends to provide the external evaluation team with a clear understanding of the framework of principles, standards, roles, responsibilities and practices that Danida applies both to the methodologies and the management of its evaluations.
As such, the information is presented for each of the three main phases of the evaluation process that the external consultant team is responsible for: inception, fieldwork and reporting.
Inception: Planning the evaluation
The purpose of the inception phase is for the evaluation team to prepare a detailed operational plan, i.e. the inception report, for the next phases of the evaluation: fieldwork and reporting.
Proper planning is essential to identifying those activities required to provide well-supported answers to the evaluation questions and to avoiding other unnecessary activities and related expenditures of time, effort and money.
The planning phase gives the evaluation team the opportunity, and the responsibility, to discuss methodological specificities, fieldwork activities and reporting strategy with the Evaluation Department, to obtain its approval where required, and to consult with other stakeholders.
The inception report should present:
- An overall logic model of the intervention (the evaluand), depicting the linkages between resources (inputs), intervention activities (processes), intervention results (outputs or deliverables), intended outcomes (intervention objectives) and overall impacts, and their relationships in terms of the criteria of relevance, efficiency, effectiveness and impact, together with an explanation of how the sustainability criterion is defined and operationalised.
- The methodology: design, approach, sufficiency and appropriateness of evidence, data collection strategy and methods, analytical framework and reporting outline.
- The hierarchy of evaluation questions starting from the general ones that are presented in the Terms of Reference through to the specific ones that will produce data and information.
- For each specific question the basis for assessment, i.e. indicator of minimum acceptable performance.
- A matrix indicating for each specific question the nature and source of evidence.
- A schedule of activities.
- A communication and consultation plan (with stakeholders).
- In the case of evaluations with complex evaluation team organisation and logistics, e.g. joint evaluations, a systematic management plan that addresses key issues of management, coordination, authorities, responsibilities, etc.
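The question hierarchy, the basis for assessment and the evidence matrix listed above can be pictured as a simple data structure. The sketch below is a hypothetical illustration in Python, not a format prescribed by Danida: the class names, field names and example questions are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SpecificQuestion:
    text: str
    indicator: str  # basis for assessment (hypothetical field name)
    evidence_sources: List[str] = field(default_factory=list)


@dataclass
class GeneralQuestion:
    criterion: str  # e.g. "relevance" or "efficiency"
    text: str       # general question as worded in the Terms of Reference
    specific: List[SpecificQuestion] = field(default_factory=list)


def matrix_rows(questions):
    """Flatten the hierarchy into one row per specific question."""
    rows = []
    for general in questions:
        for s in general.specific:
            rows.append(
                (general.criterion, s.text, s.indicator, "; ".join(s.evidence_sources))
            )
    return rows


def missing_evidence(questions):
    """List specific questions that have no evidence source yet."""
    return [s.text for g in questions for s in g.specific if not s.evidence_sources]


# Invented example content, for illustration only.
questions = [
    GeneralQuestion(
        criterion="efficiency",
        text="Were outputs delivered economically?",
        specific=[
            SpecificQuestion(
                text="What was the cost per km for each class of road?",
                indicator="Cost per km at or below comparable projects",
                evidence_sources=["project accounts", "completion reports"],
            ),
            SpecificQuestion(
                text="Were deliverables achieved on time?",
                indicator="Completion within the agreed schedule",
            ),
        ],
    ),
]

for row in matrix_rows(questions):
    print(" | ".join(row))
print("Questions still lacking an evidence source:", missing_evidence(questions))
```

Keeping the matrix in a structured form of this kind makes it easy to check, before fieldwork starts, that every specific question has both a basis for assessment and at least one identified source of evidence.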
It is expected that the external evaluation team presents, applies and documents its quality control and assurance process from the beginning of its work, i.e. at the start of the inception phase.
It should address key questions of methodology, e.g. reliability and validity issues, as well as of project management, e.g. security and confidentiality of data and information.
As well, as part of its quality control and assurance process, it is expected that the evaluation team defines, documents and presents to the Evaluation Department the structure and organisation of its evaluation file, i.e. the way in which all the information relating to the evaluation is organised and kept.
The inception phase and its attendant tasks, roles and responsibilities are organised as follows:
Inception: Steps, roles and responsibilities
| Evaluation step | EVAL | Other MFA staff | Consultant | Others |
| --- | --- | --- | --- | --- |
| 1. Briefing of consultants | a. Clarifies Terms of Reference b. Clarifies role of consultant c. Presents expectations to inception report d. Adjusts time-table e. Provides background materials f. Provides writing instruction and Code of Conduct | | Raise uncertainties concerning interpretation of assignment and role of consultants for clarification by the Evaluation Department. | |
| 2. Desk study | a. Facilitates study by providing relevant documents and contacts b. Ensures involvement of relevant MFA staff and other stakeholders | Embassy, BFT and other departments: Facilitate the study by providing relevant documents and contacts. | | Partners and possibly other donors: Facilitate the study by providing relevant documents and contacts. |
| 3. Preliminary field visit | Facilitates visit by providing contacts, information and logistical support. | Embassy: Facilitates visit by providing contacts, information and logistical support. | Undertakes the visit. | Partners: Facilitate visit by providing contacts, information and logistical support. |
| 4. Inception report, incl. evaluation matrix, proposed methodology, proposed sources of information, work plan and communication strategy | a. Ensures involvement of relevant MFA staff and other stakeholders. b. Provides consolidated comments on the report c. Approves the report | Embassy and BFT: Comment on the report. | Prepares the report. | Partners, reference group and possibly other donors: Comment on the report. |
Inception: Methodological considerations
The Terms of Reference of the evaluation usually stipulate, to varying degrees, the methodology to be employed as well as the organisation of the evaluation activities, e.g. budget, schedule and travel. However, drawing on discussions with the Evaluation Department and other stakeholders, the desk study, the preliminary field visit, etc., and on its own evaluation expertise, the evaluation team is expected to address thoroughly the key issues and questions related to methodology and management, in order to propose defensible activities, changes and/or alternatives that will improve the quality of the evaluation.
The DAC Evaluation Quality Standards require that sound methodology be used and explained in the evaluation report.
The purpose of methodology, and the basis on which its soundness is assessed, is to produce reliable data that allow for valid evaluative judgments that are useful for learning and making decisions.
While evaluation methodology is grounded firmly in traditional social science approaches, evaluation is more than the application of social science methods to study social problems.
To make judgments and facilitate decisions the evaluation team is required to engage with stakeholders and users, in order to integrate empirical evidence with standards and values to reach evaluative conclusions.
The methods used for one evaluation can differ greatly from the next, depending on what the evaluation intends to accomplish or answer and the theories and preferences that are used.
For example, if the priority is to make sure audiences use the evaluation, a utilisation-focused approach might be employed; if the priority is answering as unequivocally as possible “what works,” a randomised trial may be chosen; if the priority is engaging stakeholders and building evaluation capacity, an empowerment or participatory approach may be privileged.
Increasingly, evaluators use “mixed methods” to optimise the configuration of components that make up the evaluation’s methodological framework, since an evaluation’s methodology involves choice and allows for mixing of (see note 12):
- Design, i.e. experimental, quasi-experimental, non-experimental;
- Approaches, e.g. theory-based, goal-free, constructivist, empowerment, utilisation-focused, etc.;
- Qualitative and quantitative methods;
- Tools and techniques (e.g. logic modelling, strategic planning, concept mapping, etc.).
Methodology is also the basis on which the validity and reliability of the evaluation are determined. As such, the evaluation team is expected to develop its methodology with due regard to minimising threats to validity and addressing issues of reliability (see note 13).
Validity
Validity is a measure of the extent to which, taken together, the evaluation’s design, data collection methods and analyses provide a reasonable basis for conclusions about the evaluation’s questions.
Four types of validity are important to consider in planning and carrying out evaluations:
- External validity – the extent to which issues related to the generalisation of conclusions are addressed; this is a particularly significant issue whenever sampling is used.
- Construct validity – the extent to which inferences can be made legitimately from the operationalisations, e.g. the specific questions and indicators, to the constructs on which they are based, i.e. the general evaluation criteria.
- Internal validity – whether observed changes can be attributed to the intervention (i.e. the cause) and not to other possible causes, sometimes described, for outcomes and impacts, as alternative explanations, rival hypotheses or counterfactuals.
- Conclusion validity – the degree to which conclusions reached about relationships in the data are reasonable.
12 Adapted from Weiss, H. B. (Summer 2005). The Evaluation Exchange: Editorial. Harvard Family Research Project.
13 Adapted from Trochim, W. M. K. (2002). Research Methods Knowledge Base.
Reliability
Reliability is a measure of the quality of measurement; information is reliable if the measurement procedure yields the same results if applied repeatedly.
Reliability is a key factor for the quality of the evaluation and, as such, the evaluation team is expected to incorporate into its methodology ways of estimating the reliability of the data it gathers.
One of the most commonly used techniques for approximating the reliability of data is triangulation, i.e. applying the same measurement procedure to three or more different sources to obtain data that can be compared for similarity.
This type of reliability estimate is called internal consistency reliability; other forms of reliability estimates are:
- Parallel-forms reliability, e.g. where triangulation is applied to three different categories of data: verbal, documentary and observation.
- Test-retest reliability, used to assess the consistency of a measure from one time to another, e.g. where the same household survey is administered at periodic intervals.
- Inter-rater or inter-observer reliability, used to assess the degree to which different raters/observers give consistent estimates of the same phenomenon, e.g. an assessment of the fairness and freedom of elections by different observers.
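Two of these estimates lend themselves to simple calculation. The sketch below is illustrative only: it assumes Cohen's kappa as the measure of agreement between two observers and a Pearson correlation between two survey rounds as the test-retest measure. Neither statistic is prescribed by these guidelines, and the function names and sample data are hypothetical.

```python
from collections import Counter
from math import sqrt
from typing import Sequence


def cohen_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Agreement between two raters, corrected for chance agreement."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(rater_a)
    # Share of cases on which the two raters gave the same rating.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance, from each rater's category frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical category throughout; treated
        # here as perfect agreement (a convention assumed for this sketch).
        return 1.0
    return (observed - expected) / (1.0 - expected)


def pearson_r(first: Sequence[float], second: Sequence[float]) -> float:
    """Correlation between two rounds of the same measure (test-retest)."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two rounds of equal length with at least 2 values")
    mean_1 = sum(first) / len(first)
    mean_2 = sum(second) / len(second)
    cov = sum((x - mean_1) * (y - mean_2) for x, y in zip(first, second))
    var_1 = sum((x - mean_1) ** 2 for x in first)
    var_2 = sum((y - mean_2) ** 2 for y in second)
    if var_1 == 0 or var_2 == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_1 * var_2)


# Hypothetical ratings of six polling stations by two election observers.
observer_1 = ["free", "free", "not free", "free", "not free", "free"]
observer_2 = ["free", "not free", "not free", "free", "not free", "free"]
print(round(cohen_kappa(observer_1, observer_2), 2))

# Hypothetical household scores from two rounds of the same survey.
round_1 = [3.0, 5.0, 2.0, 4.0, 1.0]
round_2 = [3.5, 4.5, 2.0, 4.0, 1.5]
print(round(pearson_r(round_1, round_2), 2))
```

In the observer example the two raters agree on five of six cases, which gives a kappa of about 0.67 once chance agreement is discounted. How such a value should be interpreted remains a matter of judgement for the evaluation team.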
The evaluation team is expected to consider the relationship, i.e. the trade-offs, between issues of validity and reliability, as part of the development of the methodology.
For example, although a design that incorporates randomised control groups might be the most appropriate for dealing with issues of validity, the availability and reliability of the necessary data (comparison group data, baseline data, surveys that could be substituted for baseline data, valid and reliable measures, and so on) might preclude such an approach. Moreover, the contexts in which most development interventions take place are such that even if such data are available they should be treated with caution.
As a result of these kinds of considerations, an approach that relies predominantly on qualitative data gathered through interviews, observations and documentary reviews might often be more appropriate. What is expected of the evaluation team is that it work through these considerations systematically and document them in the evaluation file.
Sufficient and appropriate evidence
When making choices about the amount and nature of data to gather, it is expected that the evaluation team will collect only the information required to answer the evaluation questions.
Sufficiency has to do with the amount of information required to provide persuasive support for the contents of the evaluation report, i.e. whether the collective weight of the evidence is sufficient to persuade a reasonable person that the observations and conclusions are valid and the recommendations appropriate.
Some of the factors to consider when judging sufficiency are:
- The quality of the data, i.e. its relevance, reliability and validity;
- The significance of the finding and conclusion the data are intended to support, e.g. how important is it?
- How much assurance is intended, e.g. is the evaluation important for accountability purposes?
- What is the risk of making an incorrect observation or reaching an invalid conclusion?
- What is the cost of obtaining additional information in relation to its additional benefits, i.e. in terms of support for observations and conclusions?
Appropriateness of data includes questions of reliability and validity, but also of relevance, i.e. the extent to which information bears a clear and logical relationship to the evaluation criteria and questions.
General categories of data are verbal, documentary and observational; as a general rule of thumb, observations are considered the most robust type |