This glossary covers the main terms used in evaluation and the broad context of research: planning a research project, and using its findings.
Abduction
A useful but little-known concept, first used by the philosopher Peirce around 1900. Similar to induction, it can be described as testing a theory by fitting it over a framework of facts.
Analysis
Understanding something by dividing it into smaller parts and studying each part separately. The opposite of analysis is synthesis.
Benchmarking
Comparison of one organization's performance with that of a group of other organizations, on a particular set of measures. The best-performing organization is considered to have best practice, and the others may adopt its methods.
Deduction
Deduction is what you do when you know the principles of something and deduce a particular case. For example, if you know the principles of arithmetic, you can deduce that 10023 + 61 = 10084, even if you have never seen this example before. Induction is the opposite process. See also abduction.
Effectiveness and efficiency
Efficiency measures how economically something is done - trying to achieve the maximum output for the minimum input. Effectiveness measures whether it was worth doing in the first place.
Empirical
Based on actual data. You might believe that 50% of the population is male and 50% female, but empirical data for nearly all countries shows that the balance is closer to 49% men and 51% women. The opposite of empirical is theoretical.
Evaluation
Any activity that assesses the success or otherwise of actions or policies, mostly in the public sector. Evaluation and research overlap rather than differ sharply: evaluation can use research methods, and one use of research is evaluation. If research is a cup, evaluation is the saucer it sits on (does that make sense?).
External validity
The extent to which the findings of a study can be generalized to other populations. For example, how far might the findings of a survey of the population of Croatia in 2003 be generalized to the population of Serbia in 2004? Such questions can usually be answered only by comparing the results of a variety of studies. See also validity.
Formative evaluation
Research done to help create or improve a process or product. Contrasts with summative and process evaluation. Formative research is often qualitative, while summative research is usually quantitative.
Generalization
When you generalize from a particular case to a broad conclusion, you are making a generalization. For example, "All my friends agree with me on this question, so everybody else must agree with me too." On a more professional scale, when a survey takes a sample from a population, the results are based on the sample, but are generalized to the whole population that the sample was taken from. See also sample and population and external validity.
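To make the sample-to-population step concrete, here is a minimal sketch of generalizing a sample proportion with a margin of error. The function name and the survey figures (540 of 1000 respondents) are made up for illustration; it uses the standard normal approximation for a simple random sample.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Approximate 95% confidence interval for a population proportion,
    generalized from a simple random sample (normal approximation)."""
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of the sample proportion
    return p, max(0.0, p - z * se), min(1.0, p + z * se)

# Hypothetical survey: 540 of 1000 respondents agree.
p, low, high = proportion_ci(540, 1000)
print(f"sample: {p:.1%}; population estimate: {low:.1%} to {high:.1%}")
```

The interval is the honest form of a generalization: it says what range of population values is plausible given the sample, rather than asserting the sample figure exactly.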
Holon
A system that can be viewed at the same time as part of a larger system or as a group of smaller systems. For example, an organization is a holon: it is made up of smaller systems (e.g. people) but is also part of a larger system (e.g. the community where it is based). For more detail, see this History of Holons.
Hypothesis
A statement or proposition capable of being tested. It must be stated in enough detail that its truth or falsity can be checked. For example, "TV news is more interesting than comedies" is not a hypothesis, but "The majority of Australians think that TV news is more interesting than comedies" is a hypothesis. A set of related hypotheses can be built into a theory.
Indicator
An indirect measure of a broad concept which can't be measured directly. E.g. the visible wear in a museum carpet in front of an exhibit is an indicator of the exhibit's popularity. See also performance indicator and proxy and validity.
Intervention
A generic word for any program that tries to change the status quo through deliberate action by some organization. When, for example, all children in an area are vaccinated against TB, that's an intervention. Often called a social intervention.
KPI
Key Performance Indicator: a proxy measure of the success of part of an organization, or of the manager of that part. A type of indicator, with the difference that the future of the unit or person depends on achieving a satisfactory figure. See also performance indicator.
Penetration
Equivalent to reach or cumulative audience: that is, the number of different people who use a service, buy a product, or see a program. Saturation and incidence also have much the same meaning.
Performance indicator
An indicator of the success of a government or corporate program. For example, a performance indicator for an anti-poverty aid program could be the number of people visiting medical clinics who are judged to be suffering from malnutrition.
Prevalence
In epidemiology, the percentage of the population who have the disease being studied at a particular time. If you replace the topic with a media research measure, such as watching TV at a particular time, you immediately see the close link between epidemiological and media research measures. Just as the average audience of a station is its percentage reach times the average duration of viewing, the prevalence of a disease is its incidence times its average duration.
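The incidence-times-duration relationship can be shown in a few lines. The figures below (2 new cases per 1000 people per year, average duration 3 years) are hypothetical, and the steady-state formula assumes rates are stable and the disease is relatively rare.

```python
def prevalence(incidence_per_year, mean_duration_years):
    """Steady-state prevalence ~ incidence x average duration.
    Assumes stable rates and a relatively rare condition."""
    return incidence_per_year * mean_duration_years

# Hypothetical: 2 new cases per 1000 people per year, lasting 3 years
# on average, gives a prevalence of about 6 per 1000.
print(prevalence(2 / 1000, 3))
```

The media analogue has the same shape: a station reached by 40% of people, each watching for a quarter of the time period, has an average audience of about 10%.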
Process evaluation
Formative evaluation is concerned with the start of a project, and summative evaluation reviews the whole project. Process evaluation sits between those two: it evaluates the success of the processes used.
Proposition
Similar to a hypothesis, but need not be expressed in directly testable form. If the proposition is "After reading this web page, people will know more about evaluation," the corresponding hypothesis could be "After reading this web page, the average score on the Schinkel-Winkel Evaluation Comprehension Test will increase at least 10%." (Don't seek out the Schinkel-Winkel test - it's just a hypothetical example.)
Proxy indicator
A proxy is a type of indicator, used when you can't measure the real thing. In other words, not a very accurate indicator, but better than nothing. For example, if you are comparing innovation levels in different countries, the number of patents issued per million people per year is a proxy indicator.
Reliability
A statistical term used in assessing an instrument, meaning consistency or predictability. E.g. a survey question has 100% reliability if, when the survey is repeated, each respondent gives the same answer both times. See also validity.
Stakeholder
Any person, group of people, or organization affected by or affecting the system being studied. Very similar to actor, except that totally passive stakeholders are not actors.
Strategic
A management buzzword that's often misunderstood. Strategic, in its original military sense, refers to looking at the "big picture" and long-term goals. Reacting to a short-term problem without changing your overall goals is usually tactical, not strategic.
Synthesis
Understanding that arises from combining parts into a whole. Almost the opposite of analysis.
System
Anything that has boundaries, receives inputs, and processes them to produce outputs. For example, you: the boundary is your skin, you receive sensory and food inputs, and your outputs are whatever you do. There are also computer systems, energy systems, geophysical systems, mechanical systems, business systems, etc. In evaluation, a vital question is "What is the system being evaluated, and what are its boundaries?"
Theory
A theory is usually a set of hypotheses, suggesting a form of causal connection between sets of variables. A well-known example is Darwin's theory of evolution. When knowledge is described as theoretical, the contrast is usually with empirical knowledge (see empirical).
Triangulation
Studying an issue using several different methods (e.g. a survey and focus groups), as if you're seeing it from different angles. Though the different methods will come up with somewhat different results, the results should be similar enough that they might be plotted on a graph as a small triangle. Somewhere inside that triangle is the "real truth."
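The "small triangle" test can be sketched numerically: collect one estimate per method and check how tightly they cluster. The method names and figures below are invented for illustration, and the spread threshold is an arbitrary choice, not a standard.

```python
from statistics import mean

# Hypothetical estimates of the same quantity (say, % of adults who
# listen weekly) from three different methods.
estimates = {"survey": 41.0, "focus groups": 38.5, "diaries": 43.0}

centre = mean(estimates.values())                       # middle of the triangle
spread = max(estimates.values()) - min(estimates.values())  # size of the triangle

print(f"centre {centre:.1f}, spread {spread:.1f}")
# A small spread suggests the methods triangulate on the same answer;
# a large one suggests at least one method is measuring something else.
```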
Validity
The extent to which an instrument measures what it's supposed to be measuring. For example, counting growth rings is a valid measure of a tree's age. When no fully valid measure exists, indicators can be used. See also reliability and external validity.
Vividness effect
Imagine an evaluation in which 99% of the sample liked the program. The remaining one person strongly disliked it, because, "very late one night during a conference in Tokyo, I came out of a nightclub only to find an inflatable rhinoceros blocking the street, and ever since that incident I have nightmares whenever I see that program."
If, a few days later, the manager remembers not that 99% liked the program but that it caused nightmares, that's a vividness effect: a vivid anecdote crowds out a statistical finding.