PASSIA
Seminars
Civil Society Empowerment Through Training & Skills Development
3.1 Overview of Types of Evaluations
Program evaluations are carried
out at different stages of project planning and implementation. They can
include many types of evaluations (needs assessments, accreditation, cost/benefit
analysis, effectiveness, efficiency, formative, summative, goal-based, process,
outcomes, etc.). The type of evaluation you undertake to improve your programs
depends on what you want to learn about the program.
In general, there are two main
categories of evaluations of development projects:
Formative evaluations (process
evaluations) examine the development of the project and may lead to changes in
the way the project is structured and carried out. Those types of evaluations
are often called interim evaluations. One of the most commonly used formative
evaluations is the midterm evaluation.
In general, formative evaluations are process
oriented and involve a systematic collection of information to assist
decision-making during the planning or implementation stages of a program. They
usually focus on operational activities, but might also take a wider
perspective and possibly give some consideration to long-term effects. While
staff members directly responsible for the activity or project are usually involved
in planning and implementing formative evaluations, external evaluators might
also be engaged to bring new approaches or perspectives. Questions typically
asked in those evaluations include:
· To what extent do the activities and strategies correspond with those presented in the plan? If they are not in harmony, why are there changes? Are the changes justified?
· To what extent did the project follow the timeline presented in the work plan?
· Are activities carried out by the appropriate personnel?
· To what extent are actual project costs in line with initial budget allocations?
· To what extent is the project moving toward its anticipated goals and objectives?
· Which of the activities or strategies are more effective in moving toward achieving the goals and objectives?
· What barriers were identified? How and to what extent were they dealt with?
· What are the main strengths and weaknesses of the project?
· To what extent are beneficiaries of the project active in decision-making and implementation?
· To what extent do project beneficiaries have access to services provided by the project? What are the obstacles?
· To what extent are the project beneficiaries satisfied with project services?
Summative evaluations (also
called outcome or impact evaluations) form the second category. They
look at what a project has actually accomplished in terms of its stated goals.
There are two types of summative evaluations. (1) End evaluations aim to
establish the situation when external aid is terminated and to identify the
possible need for follow-up activities by either donors or project staff.
(2) Ex-post evaluations are carried out two to five years after external
support is terminated. Their main purpose is to assess what lasting impact the
project has had or is likely to have and to extract lessons learned.
Summative evaluation questions include:
· To what extent did the project meet its overall goals and objectives?
· What impact did the project have on the lives of beneficiaries?
· Was the project equally effective for all beneficiaries?
· What components were the most effective?
· What significant unintended impacts did the project have?
· Is the project replicable?
· Is the project sustainable?
For each of these questions, both quantitative data
(data expressed in numbers) and qualitative data (data expressed in narratives
or words) can be useful.
Summative evaluations are usually carried out as a
program is ending or after completion of a program in order to “sum up” the
achievements, impact and lessons learned. They are useful for planning
follow-up activities or related future programs. Evaluators generally include
individuals not directly associated with the program.
3.2 Overview of Summative Evaluation Models
Terms like "outcome"
and "impact" are often used interchangeably. A distinction should be
made. Outcomes refer to any results or consequences of an intervention or a
project. Impact is a particular type of outcome. It refers to the ultimate
results (i.e. what the situation will be if the outcome is achieved). A UNICEF
publication clarifies the relationship between the two terms:
“Some people distinguish between outcomes and
impacts, referring to outcomes as short-term results (on the level of purpose)
and impacts as long-term results (on the level of broader goals). Outcomes are
usually changes in the way people do things as a result of the project (for
example, mothers properly treating diarrhea at home), while impacts refer to
the eventual result of these changes (the lowered death rate from diarrhea
disease). Demonstrating that a project caused a particular impact is usually
difficult since many factors outside the project influence the results.” (UNICEF,
A UNICEF Guide for Monitoring and Evaluation: Making a Difference?, New
York, 1991, p. 40.)
Impact evaluation should be
carried out only after a program or project has reached a sufficient level of
stability. It is usually preceded by an implementation evaluation to make sure
that the intended program/project elements have been put in place and are
operational before we try to assess their effects. Assessing the impact at an
early stage is meaningless and a waste of resources.
The main question that impact
evaluations try to answer is whether the intervention or project has made a
difference for the target groups. There are different ways to find out and
prove if the intervention or project has made a difference. Those ways are
referred to as evaluation models.
Evaluation models differ in the
extent to which they are able to identify and prove project outcome or impact
and link them with project interventions, i.e. to make a causal link between
the two. Some models are more likely than others to generate reliable results
that could establish a causal link. In evaluation terms this is called the
scientific rigor or validity of the model. There are many evaluation models.
The following section reviews two commonly used models: the pretest-posttest
model and the comparison group model.
A. Pretest-Posttest Model
The basic assumption of this
model is that without project interventions, the situation that existed before
the implementation of the project will continue as it did before. As a result of
the intervention, the situation will change over time. Therefore, we measure
the situation before the project starts and repeat the same measures after the
project is completed. The differences or changes between the two points in time
can be attributed to the project interventions.
To increase the validity of this
model, we have to control some biases that might result from the application of
the model. For example, the pretest and posttest should use the same measures,
the measures should be taken from the same groups, and so on. In addition, to
establish a strong link between project interventions and project impact, the
model should take into account other biases that might occur between the two
points in time. Some of those biases might be outside the project's control,
e.g., social, political, economic, and environmental factors.
Advantages: The main advantage of the pretest-posttest model
is that it is relatively easy to implement. It can be implemented with the same
group of project beneficiaries (does not require a control or comparison
group). It does not usually require a high level of statistical expertise to
implement and is able to assess progress over time by comparing the results of
projects against baseline data.
Disadvantages: The
main disadvantage of the pretest-posttest model is that it lacks scientific
rigor. Many biases might arise between the pretest and the posttest that could
affect the results and therefore weaken the direct link between project
interventions and project outcomes or impact. In other words, changes in the
situation before and after project implementation might (at least in part) be
attributable to external factors. This problem can be reduced by adopting what
is called the multiple time-series model, i.e., repeating the measures at
several points in time during the implementation of the project and not only
at its beginning and end. This way, the results can be tracked over time and
the effects of external factors can be assessed and controlled. However, this
might increase the workload and the cost of the evaluation.
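The multiple time-series idea can be illustrated with a short Python sketch. It assumes a single indicator measured at several points during implementation; the readings, the six-month spacing and the function name `period_changes` are hypothetical and serve only as an illustration.

```python
# Hypothetical indicator readings (e.g., % of households using a service),
# keyed by months since project start. All values are illustrative.
readings = {0: 40.0, 6: 46.0, 12: 51.0, 18: 59.0}


def period_changes(series):
    """Return (start, end, change) for each pair of consecutive measurements."""
    months = sorted(series)
    return [
        (start, end, series[end] - series[start])
        for start, end in zip(months, months[1:])
    ]


changes = period_changes(readings)
for start, end, delta in changes:
    print(f"months {start}-{end}: change of {delta:+.1f} points")
```

Looking at the change period by period, rather than only from start to end, makes it easier to notice when a jump or drop coincides with an external event and not with a project activity.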
Implementation Steps: Applying the pretest-posttest model involves the following
main stages:
1. Prepare a list of indicators that would test project outcomes.
2. Design evaluation tools and instruments for data collection.
3. Apply the tools and instruments with the target group or a representative sample of the target group at the pretest time (at the beginning of the project implementation phase or before the implementation starts).
4. Repeat the same measures at the posttest time (at the end of the project implementation phase) with the same target group or a representative sample of the target group.
5. Analyze, compare and interpret the two sets of evaluation data.
6. Report findings.
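As a rough illustration of step 5, the following Python sketch compares pretest and posttest scores for the same people. The scores, the participant labels and the function name `mean_change` are hypothetical; a real evaluation would use the indicators and instruments designed in steps 1 and 2.

```python
from statistics import mean

# Hypothetical indicator scores for the same five beneficiaries,
# measured with the same instrument before and after the project.
pretest = {"A": 10, "B": 12, "C": 9, "D": 14, "E": 11}
posttest = {"A": 13, "B": 15, "C": 9, "D": 18, "E": 12}


def mean_change(before, after):
    """Average per-person change, using only people measured at both points."""
    common = sorted(set(before) & set(after))
    if not common:
        raise ValueError("no participant was measured at both points in time")
    return mean(after[person] - before[person] for person in common)


change = mean_change(pretest, posttest)
print(f"Average change in indicator: {change:+.2f}")
```

The sketch only measures the change. As noted above, attributing that change to the project still requires considering external factors.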
B. Comparison Group Model
This evaluation model assesses
project outcomes or impact by comparing results for two comparable groups at
the same point in time (say, the end of the project implementation phase). The
first group represents beneficiaries of the project and the second represents
a group that has not benefited from the project. To control for design biases,
the two groups should have the same characteristics in many respects
(socioeconomic status, gender balance, education, and other geographic and
demographic aspects). Differences between the two groups can then be
attributed to the project interventions.
Advantages: This model has relatively strong scientific rigor.
It is able to link project impact with project interventions or to attribute
outcomes to the intervention. The implementation of this model is relatively
easy when naturally existing comparison groups can be found.
Disadvantages: In many situations it is difficult to find a
comparison group. In addition, working with two different groups might increase
the research burden and increase the cost of evaluation.
Implementation Steps: Applying the comparison group model involves the following
main stages:
1. Prepare a list of indicators that would test project outcomes.
2. Design evaluation tools and instruments for data collection.
3. Select a comparison group based on an appropriate set of criteria.
4. Apply the tools and instruments with the target and comparison groups, or representative samples of both, at the same time.
5. Analyze, compare and interpret the two sets of evaluation data.
6. Report findings.
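The comparison in step 5 can be sketched in Python as follows. The scores and the function name `group_difference` are hypothetical, and the sketch assumes the two groups have already been matched on the characteristics listed above.

```python
from statistics import mean

# Hypothetical end-of-project indicator scores; all values are illustrative.
beneficiaries = [15, 18, 13, 17, 16]  # group that took part in the project
comparison = [12, 13, 11, 14, 12]     # similar group that did not take part


def group_difference(project_group, comparison_group):
    """Difference between the two group averages on the same indicator."""
    return mean(project_group) - mean(comparison_group)


difference = group_difference(beneficiaries, comparison)
print(f"Beneficiary average:      {mean(beneficiaries):.1f}")
print(f"Comparison group average: {mean(comparison):.1f}")
print(f"Difference:               {difference:+.1f}")
```

A difference in averages is only as credible as the match between the groups. If the groups differ in other respects, part of the difference may reflect those characteristics and not the project.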
The impact or
results of a project are difficult to prove if we do not know the situation
prior to project implementation. Baseline surveys are surveys carried
out before project implementation starts to generate data about the existing
situation of a target area or group. Such data become the reference against
which project/program impact can be assessed when summative evaluations are
carried out. For example, if the objective of the project is to reduce school
dropout rates, we have to know those rates prior to project implementation and
compare them with rates after the completion of the project.
Baseline surveys are especially important when the pretest-posttest
evaluation model is adopted. The logic behind carrying out baseline
surveys is that by comparing data that describe the situation to be addressed
by a project or a program with data generated after the completion of the
project, evaluators are able to measure progress or changes in the
situation and link those changes to project interventions. In addition, baseline
data can be useful for tracking the changes that the project brings about over
time and for refining the indicators that are important for project monitoring
or for evaluating project impact.
Baseline surveys are especially important for
assessing a project's higher-level objectives. Special focus is given to gathering
information about the various indicators developed to measure project effects. Both
quantitative and qualitative information are used in baseline surveys (see next
section). To control methodological biases, the indicators, methods and tools
used in the baseline survey should be repeated when carrying out summative
evaluations.
Source: United Nations Development
Programme (UNDP), Who Are the Question-makers? A
Participatory Evaluation Handbook, OESP Handbook Series, 1997.
3.4 Review of Key Outcome and Impact Evaluation Indicators
The success of programs and projects can be measured along a number of
interrelated dimensions, including effectiveness, efficiency, relevance,
impact, and sustainability. Following is a summary review of each of those
dimensions:
1. Effectiveness
Effectiveness in simple terms
is the measure of the degree to which the formally stated project objectives
have been achieved or can be achieved. To make such measurement and verification
possible, project objectives should be defined clearly and realistically.
Often, evaluators have to deal with unclear and highly general objectives that
are hard to assess: “upgraded health conditions,” “improved living conditions”
or unrealistic objectives (in comparison with allocated resources, time or
level of activities). In such situations, the measurement of effectiveness
becomes difficult. Evaluators have to work with project staff to try to operationalize
those objectives based on existing documents and to draw clear and realistic
objectives as the point of reference for measuring effectiveness.
2. Efficiency
Efficiency is the measure of
the economic relationship between the allocated inputs and the project outputs
generated from those inputs (i.e., the cost effectiveness of the project). It is a
measure of the productivity of the project, i.e., to what degree the outputs
were achieved at an acceptable cost. This includes the efficient use of
financial, human and material resources. In other words, efficiency asks
whether the use of resources in comparison with the outputs is justified.
The main difficulty in measuring efficiency is determining what standards to
follow as a point of reference. This might be easy in the field of business.
The question, however, becomes more difficult in a social context, especially
when ethical considerations are involved. For example, how can we judge whether
spending X dollars to save the lives of Y children or to rehabilitate Z disabled
persons is justified? What are the acceptable standards in such situations?
In the absence of agreed-upon
and predetermined standards, evaluators have to come up with some justifiable
standards. Following is a list of recommendations that evaluators may use:
· Compare project inputs and outputs against other comparable activities and projects.
· Use elements of best practice standards.
· Use criteria to judge what might be reasonable.
· Ask questions such as: Could the project or intervention achieve the same results at a lower cost? Could the project achieve more results at the same cost?
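The first recommendation, comparing inputs and outputs against a comparable project, can be illustrated with a simple cost-per-output calculation in Python. The project names, costs and output counts below are hypothetical, as is the function name `cost_per_output`.

```python
# Hypothetical figures: total cost (in dollars) and number of outputs
# (e.g., people trained) for two comparable projects.
projects = {
    "Project under evaluation": {"cost": 50_000, "outputs": 400},
    "Comparable project": {"cost": 60_000, "outputs": 600},
}


def cost_per_output(cost, outputs):
    """Average cost of producing one unit of output."""
    if outputs <= 0:
        raise ValueError("outputs must be a positive number")
    return cost / outputs


unit_costs = {
    name: cost_per_output(figures["cost"], figures["outputs"])
    for name, figures in projects.items()
}
for name, unit_cost in unit_costs.items():
    print(f"{name}: {unit_cost:.2f} dollars per output")
```

A lower unit cost does not by itself show that a project is more efficient. The outputs must be of comparable quality, and the ethical considerations discussed above still apply.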
3. Relevance
Relevance is a measure used to
determine the degree to which the objectives of a program or project remain
valid as planned. It refers to an overall assessment to determine whether
project interventions and objectives are still in harmony with the needs and
priorities of beneficiaries. In other words, are the agreed objectives still
valid? Is there a sufficient rationale for continuing the project or activity?
What is the value of the project in relation to other priority needs? Is the problem
addressed still a major problem?
Society’s priorities might
change over time as a result of social, political,
demographic or environmental changes. As a result, a given project might
not be as important as it was when it was initiated. For example, once an
infectious epidemic has been eradicated, the justification for the project
that dealt with the problem might no longer exist. Or, if a natural disaster
happens, society’s priorities shift to emergency or relief interventions,
and other projects and interventions might become less important.
In many cases, the continued relevance of a project depends on the
seriousness and quality of the needs assessment and on the rationale upon
which the project has been developed.
4. Impact
Project impact is a measure of
all positive and negative changes and effects caused by the project, whether
planned or unplanned. While effectiveness focuses only on specific positive
and planned effects expected to accrue as a result
of the project and is expressed in terms of the immediate objective, impact
is a far broader measure as it includes both positive and negative project
results, whether they are intended or unintended. Impact is often the most
difficult and demanding part of the evaluation work since it requires the
establishment of complex causal conditions that are difficult to prove unless a
strong evaluation model and a diverse set of techniques are used.
In assessing impacts, the point
of reference is the status of project beneficiaries and stakeholders prior to
implementation. Questions often asked in impact evaluations include: What are
the results of the project? What difference has the project made to the
beneficiaries and how many have been affected? What are the social, economic,
technical, environmental, and other effects on the direct or indirect
individual beneficiaries, communities and institutions? What are the positive
or negative, intended and unintended, effects that come about as a result of
the project activities?
Project impacts can be immediate
or long-range. Project staff and evaluators should decide how much time must
elapse until project impacts are generated. For example, an agricultural
project may produce important impacts after only a few months – whereas an
educational project might not generate significant effects until several years
after the completion of the project. Therefore, it is important to design the
program or project in a way that will lend itself to impact assessment at a
later stage, e.g., through the preparation of baseline data, setting of
indicators for monitoring and evaluation, etc.
5. Sustainability
Sustainability in simple terms
is a measure of the continuation of the project or program, or of its positive
results, after external support has been concluded. It has become a major issue in
development work and evaluation of projects.
Many development initiatives
fail once the implementation phase is over because
neither the target group nor the
responsible organizations have the means, capacity or motivation to
provide the resources needed for the activities to continue. As a result, many
development organizations have become more interested in the long-term and lasting
improvements of projects. In addition, many donors are becoming interested in
knowing how long they will need to support a project before it can run on
local resources.
During the last decade, the
concept of sustainability has broadened. It once meant merely asking whether
the project had succeeded in contributing to the achievement of its objectives
or whether the project would be able to cover its operational costs from local
sources. It now covers a wider set of issues, including whether there are
indications that the positive impacts are likely to continue after the
termination of external support. In addition, environmental, financial,
institutional and social dimensions have become major issues in the assessment
of sustainability.
Since sustainability is
concerned with what happens after external support is completed, it should
ideally be measured after the completion of the project. It is difficult
to provide a definitive assessment of sustainability while the project is still
running. In such cases, the assessment has to be based on projections
about future developments.
There are a number of factors that can be used to
ensure that project interventions are likely to become self-sustaining and
continue after the termination of external funding, including:
· economic (future expenses, especially recurrent costs)
· institutional (administrative capacity, technical capacity, institutional motivation, ownership of the project, etc.)
· social (community interest, political will, etc.)
· factors related to overall environmental benefits.
PASSIA
The Palestinian Academic Society for the Study of International Affairs, Jerusalem
Tel: +972-2-6264426 / 6286566 Fax: +972-2-6282819
P.O. Box 19545, Jerusalem
Email: passia@palnet.com
Copyright © PASSIA