June 2006 // Volume 44 // Number 3 // Feature Articles // 3FEA7


Factors Affecting Program Evaluation Behaviours of Natural Resource Extension Practitioners--Motivation and Capacity Building

Despite expectations for natural resource Extension practitioners to measure impacts of their programs, evaluation practices among this group are highly variable across individuals and states. The study described here assessed attitude towards evaluation, perceived organizational commitment to evaluation, practitioner characteristics, and levels of program evaluation conducted among natural resource Extension practitioners in the U.S. The study showed that age, years of experience, belief that one's job performance is assessed on the basis of program evaluation behavior, and other factors are linked to evaluation behavior. It also investigated factors in institutional capacity building for evaluation.

Shawn Morford
Benchmark Consulting
Forest Grove, Oregon

Robert Kozak
Associate Professor
University of British Columbia
Vancouver, British Columbia, Canada

Murari Suvedi
Associate Professor
Dept. of Community, Agriculture, Recreation and Resource Studies
Michigan State University
East Lansing, Michigan

John Innes
Dept. of Forest Resources Management
University of British Columbia
Vancouver, British Columbia, Canada


Over the past 15 years, natural resource Extension programs in the United States have shifted from a primary focus on landowner education to broader programs targeted at indicators relating to sustainability, wildlife habitat, water quality, and other environmental factors. At the same time, Extension practitioners are increasingly expected to evaluate programs to measure impacts of their efforts (AREERA 2004; GPRA 2004). Many organizations are exploring how to make evaluation a core part of their business (Barnette & Sanders, 2003; CDC 2004; Compton, Baizerman, & Stockdill, 2002; World Bank, 2002).

Despite the expectations for impact data, evaluation practices among natural resource Extension practitioners (NREPs) in the U.S. remain highly variable. A wide range of administrative interventions are used to induce Extension practitioners to conduct program evaluations, including training, hiring state-level evaluation specialists, using evaluation experience as a hiring criterion, and using evaluation practice as a job performance and promotion/tenure assessment criterion. However, the degree to which these administrative tools are used varies considerably across states and regions.

There are several key variables among natural resource Extension practitioners that could affect program evaluation behaviors. For example, in some states, NREP positions are tenure-track, while in other states they are considered non-tenure track. In some states, NREP positions are base (core) funded, while in others they are funded by grants and other "soft" funds. If evaluation is to become a "part of doing business" among Extension practitioners (Decker & Yerka, 1990), then understanding what variables account for the differences in evaluation practices is a critical step.

The research questions for the study described here stemmed from personal experience in natural resource Extension relating to evaluation practices. The authors perceived there was high variability among Extension practitioners in terms of evaluation practice from individual to individual and across states. The hypotheses were developed to examine what factors are linked with evaluation behaviours among natural resource Extension practitioners in the U.S. in an attempt to help guide administrators in finding ways to improve evaluation performance within their organizations. The hypotheses related to these factors:

  • Years of experience in Extension (NREPs with more years of experience conduct higher levels of evaluation than less experienced NREPs)

  • Age (older NREPs conduct higher levels of evaluation than younger NREPs)

  • Attitude towards evaluation (NREPs with positive attitudes towards evaluation conduct higher levels of evaluation than those with less positive attitudes towards evaluation)

  • Perceived organizational commitment to evaluation (the more NREPs perceive that their organizations are committed to evaluation, the higher the level of evaluation they conduct)

  • Funding source for salary (NREPs whose positions have a higher percentage of grant or soft funds conduct higher levels of evaluation than those whose positions have a greater percentage of core funds)

  • Position classification (NREPs whose positions are considered "tenure track" conduct higher-level evaluations than those whose positions are not "tenure track")

  • Belief that one's job performance is assessed on the basis of program evaluation practices (NREPs who believe their own job performance assessment is based, in part, on the level of evaluation conducted tend to conduct higher-level evaluations than those who don't)

The study also examined how institutional factors such as internal evaluation processes contribute to overall organizational performance in evaluation.

The study used a tool called "Bennett's Hierarchy" (Bennett, 1976, Figure 1), which has been used extensively by Extension practitioners for planning and evaluation. The Hierarchy describes a series of staircase levels of evidence of program impacts, beginning at the bottom step with "inputs" (allocation of resources to a program) and progressing to the top step, "end result" (measuring impacts of a program on long-term goals or conditions). Evidence of program impact at each ascending step is progressively more substantial, albeit more difficult, costly, and time-consuming to measure, and use of the tool assumes equal quality of evaluation at each step. According to Bennett, "higher-level" evaluations provide stronger evidence of impact than "lower-level" evaluations.

Figure 1.
Bennett's Hierarchy (source: Bennett 1976)

Bennett's Hierarchy from Level 1 Inputs to Level 7 End results.

At the lowest level on the Hierarchy, an Extension practitioner measures and reports on the dollars allocated to a project as an indicator of program success (level 1: Inputs, as shown in Figure 1). While these data are relatively easy to obtain, they do not say much about "what difference" the program makes. Higher in the Hierarchy, however, Extension practitioners measure changes in knowledge, attitude, skills, aspirations, and behaviors of the target audience as a result of their program. The highest level of Bennett's Hierarchy shows progress toward a long-term goal or desired condition (level 7: End Results).
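Because the Hierarchy is an ordered staircase, it can be represented as an ordered enumeration. The following is a minimal sketch, not part of the study: the text names only level 1 (Inputs) and level 7 (End Results), so the names given to levels 2 through 6 follow common summaries of Bennett (1976) and should be read as assumptions. The class and function names are hypothetical.

```python
from enum import IntEnum


class BennettLevel(IntEnum):
    """Seven steps of Bennett's Hierarchy, lowest to highest.

    Levels 1 and 7 are named in the article; the labels for levels 2-6
    are assumed from common summaries of Bennett (1976).
    """
    INPUTS = 1
    ACTIVITIES = 2
    PARTICIPATION = 3
    REACTIONS = 4
    KASA_CHANGE = 5      # knowledge, attitudes, skills, aspirations
    PRACTICE_CHANGE = 6  # behavior / practice change
    END_RESULTS = 7


def stronger_evidence(a: BennettLevel, b: BennettLevel) -> BennettLevel:
    """Return whichever level sits higher on the staircase."""
    return max(a, b)


higher = stronger_evidence(BennettLevel.REACTIONS, BennettLevel.PRACTICE_CHANGE)
print(higher.name)  # PRACTICE_CHANGE
```

Using an integer-backed enumeration keeps the ordering explicit, which matters later when the level of evaluation is used as a grouping variable.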

Materials and Methods


In early 2004, email address lists of NREPs were obtained from each state, and an invitation was emailed to 523 NREPs, linking them to a commercial Web-based survey tool called "Zoomerang." After a second email reminder, 224 NREPs had returned completed questionnaires (42% response). A non-respondent survey showed that respondents had characteristics similar to those of non-respondents.


The survey instrument consisted of 26 questions, including two 5-point Likert-scale index questions to measure perceived organizational commitment to evaluation and attitudes about program evaluation. The survey also included open-ended and closed questions relating to NREP characteristics, attitudes towards evaluation, perceptions of organizational commitment to evaluation, and evaluation practices. Other questions asked respondents about the level of evaluation that they conducted most of the time, their sources of information about evaluation, what motivated them to conduct evaluation, their confidence levels in conducting evaluation, skills gaps, and job and personal characteristics. Open-ended questions asked respondents a) types of evaluations they have conducted other than end-of-event questionnaires, b) their perceived barriers to evaluation, and c) recommendations to improve the practice of evaluation in their organizations.

Two survey questions quantified evaluation behavior to enable hypothesis testing. The level of evaluation conducted was measured by asking respondents to select from a list of program evaluation levels indicating what level of evaluation they conducted most of the time and the frequency of evaluating at various levels (scale of 1-7 from "never" to "always"), as per Bennett's Hierarchy (Bennett, 1976).

Blank job performance assessment forms were also collected from individual states to determine if evaluation practice was included as a criterion of job performance. Thirty-nine states provided blank forms.


Using SPSS (Statistical Package for Social Sciences), the following statistical procedures were used to analyze the quantitative data.

  • One-way analysis of variance (ANOVA) and independent sample t-tests were used to identify significant differences among groups of respondents in attitude- and behavior-related variables.

  • Cross tabulations were created to look for trends and patterns between respondent characteristics and other categorical data.

  • Cluster analysis was used to classify the responses into similar groups to identify trends that were not revealed through other procedures.
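To make the cross-tabulation step concrete, the sketch below counts how often two categorical answers occur together. It is an illustration only: the study used SPSS, the records are invented rather than survey data, and the function name `crosstab` is hypothetical.

```python
from collections import Counter

# Hypothetical records: (work location, evaluation level). Not study data.
records = [
    ("county", "reactions"),
    ("county", "kasa"),
    ("campus", "kasa"),
    ("campus", "practice"),
    ("county", "reactions"),
    ("campus", "kasa"),
]


def crosstab(pairs):
    """Count co-occurrences of (row category, column category)."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occur.
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}


table = crosstab(records)
for row, cells in table.items():
    print(row, cells)
```

Each row of the resulting table can then be scanned for patterns, for example whether one group of respondents concentrates at a particular evaluation level.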


Survey results showed that a strong majority (79%) of respondents had conducted some kind of program evaluation during the past year. Of those, 40% had conducted some kind of evaluation in addition to end-of-event questionnaires. Forty-seven percent of respondents were county or region-based, and 50% were campus-based. Three percent were "other" (primarily government agency from Colorado). The average years of experience was 12.7 years, but ranged widely (standard deviation = 9.3 years).

Sixty percent of respondents did not mind doing evaluation, but 27% "prefer to ignore it" or "dread it." A quarter thought it took away from the real work of Extension. The most frequently mentioned motivator was "seeing my evaluation results used in decision making." Half of respondents believed their job performance was assessed on the basis of their evaluation practice. Sixteen percent were unsure. Sixty-two percent felt that evaluation practices should be a job performance criterion.

When blank job performance assessment forms were reviewed, criteria in 19 states (of 39) included vague or no reference to evaluation practices. Some criteria referred generally to "scholarly accomplishment" as a category.

In 16 states, respondents who had similar types of jobs disagreed about whether their job performance was assessed on the basis of their evaluation practices. Respondents were slightly more motivated to do evaluation by external factors than by internal factors, but there was wide variation. About one fifth (21%) felt that their administration did not try to motivate them to conduct program evaluation.

Table 1 shows the frequency and percent of respondents conducting evaluation at various levels of Bennett's Hierarchy most often. There was a sharp decline in frequency between the level "measures changes in knowledge, attitude, skills, and aspirations (KASAs) at end of event" and "recontacting participants after the event."

Table 1.
Levels of Evaluation Evaluated Most Often (based on Bennett 1976) (n = 219)

Levels of Evaluation

Don't evaluate programs

Ask participants for reactions to events

Measure changes in knowledge, attitude, skills, and aspirations at end of event

Re-contact participants after event to measure changes in knowledge, attitude, skills, and aspirations

Measure behavioural/practice changes

Measure long-term conditions

Hypothesis Testing

One-way Analyses of Variance using Scheffe and Duncan Post Hoc tests (alpha = 0.05) were conducted to test the null hypotheses of no significant difference between years of experience, age, attitude, and perceived organizational commitment when level of Bennett's Hierarchy was used as the grouping variable. The analysis showed that "perceptions regarding organizational commitment to evaluation" does not influence program evaluation levels; evaluation behavior appears to be more a function of a person's age, years of experience, and attitude towards evaluation than other factors.
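The F statistic behind a one-way analysis of variance can be sketched in a few lines. This is an illustration under stated assumptions, not the study's SPSS procedure: the groups are invented years-of-experience values, the function name is hypothetical, and the Scheffe and Duncan post hoc tests and the p-value lookup are omitted.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical years of experience, grouped by evaluation level. Not study data.
groups = [[2, 4, 6], [8, 10, 12], [14, 16, 18]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f_stat, df_b, df_w)  # 27.0 2 6
```

A large F relative to the F distribution with the given degrees of freedom indicates that the group means differ more than within-group variation alone would suggest.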

An independent samples t-test showed that there was not a significant difference between tenure-track NREPs who have achieved tenure status and NREPs who have not yet achieved tenure with regard to evaluation behavior (p = 0.954). There was a significant difference between those who believe their job performance is assessed on the basis of program evaluation and those who believe it is not (p = 0.001). The t-test also showed that those who had access to evaluation specialists performed evaluation at higher levels than those who did not (p < 0.001).
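As a rough sketch of the comparison an independent samples t-test makes, the code below computes the pooled-variance t statistic for two groups. The data are invented, the function name is hypothetical, and equal variances are assumed; the article does not state which variant SPSS was asked to report. Turning the statistic into a p-value requires the t distribution, which is left to a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Student's t statistic and degrees of freedom, equal variances assumed."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical evaluation levels for two groups of respondents. Not study data.
with_specialist = [5, 6, 7]
without_specialist = [2, 3, 4]
t_stat, df = pooled_t(with_specialist, without_specialist)
print(round(t_stat, 3), df)  # 3.674 4
```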

Cluster Analysis

A K-means (non-hierarchical) cluster analysis was used as an exploratory approach to classify respondents into homogeneous groups with respect to their evaluation practices and to reveal trends among respondents that the other analyses had not shown. The cluster analysis revealed three natural groupings, which were subsequently labelled "high-levels," "middle-levels," and "lower-levels" because of the patterns of characteristics that accompanied their evaluation practices.
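The idea behind K-means can be sketched with the standard assign-and-update loop (Lloyd's algorithm). This is an illustration only: the points are invented two-variable scores, the starting centroids are chosen by hand, and the article does not report which variables or initial centers the SPSS procedure used.

```python
from math import dist


def k_means(points, centroids, iterations=20):
    """Assign points to the nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for each point.
        labels = [min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
                  for p in points]
        # Update step: mean of each cluster; keep the old centroid if empty.
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical (attitude score, evaluation level) pairs. Not study data.
points = [(1.0, 2.0), (1.5, 1.0), (3.0, 4.0), (3.5, 3.5), (5.0, 6.0), (4.5, 7.0)]
labels, centers = k_means(points, [points[0], points[2], points[4]])
print(labels)  # [0, 0, 1, 1, 2, 2]
```

With three clusters, the resulting groups could then be inspected and given descriptive labels, much as the respondents' clusters were.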

For example, with the exception of a few variables, "high-levels" responded highest on most positive aspects of evaluation, "lower-levels" scored the lowest, and "middle-levels" were in the middle. A higher percentage of "high-levels" had positive attitudes towards evaluation, perceived that their organizations were committed to evaluation, and believed their own job performance assessment included criteria relating to evaluation than the other two groups. "High-levels" felt more confident in conducting program evaluations than either "middle-levels" or "lower-levels." "High-levels" were slightly older and more experienced. Position classification (tenure-track vs. non-tenure track), education level, and tenure status did not follow patterns across the three groups.

A much higher percentage of "high-levels" said that they love doing evaluation than did the other clusters, and a far smaller percentage of "high-levels" preferred to ignore evaluation than did the other two groups. A smaller percentage of "high-levels" said that their motivation to conduct program evaluation was due to external factors than did the other two groups.

"High-levels" listed "seeing their evaluation results used in program decisions" as their top motivating factor. "Middle-levels" and "lower-levels" tended to rely more on external motivators such as promotion and tenure or using evaluation as a criterion of job performance assessments as their top motivators.

Forty percent of "lower-levels" said that they have access to evaluation specialists to help, compared with 72 percent of "high-levels" who said that they had access to evaluation specialists.

Qualitative Results

In the open-ended survey questions, respondents discussed their own barriers, evaluation approaches, and recommendations for improving evaluation capacity and behaviors within their institutions. The qualitative data generally supported the quantitative findings. Table 2 lists some direct quotes from respondents that describe the barriers.

Table 2.
Key Barriers Discussed by Respondents

Key Barriers


Lack of skills

"Most of us working in county programs are master's level. We don't have a deep well of scholarly evaluation methods from which to draw."

Time prioritization issues

"Once a project ends, it's on to another. The scope and complexity of my day-to-day responsibilities limit my ability to devote much time to evaluation."

Lack of funding

"Good evaluations can be expensive. Our resources are constantly being cut. Evaluation sometimes gets dropped."

Methodological difficulties

"Client survey fatigue, inability to interpret data collected, low survey response rate, difficulty getting contact information for clients, and tedious university human subjects review processes."

Lack of organizational system for evaluation

"The organization is aware and wanting us to do more significant evaluations, but not yet giving us all the means (tools) to make the transition."

Lack of rewards/incentives for doing evaluation

"I have seen very little evidence that this is a real priority. It's expected and required but not well supported."

Skepticism about the value of formal evaluation

"So many of the evaluations I have seen others do seem valueless. The idea that we are responsible for our client's actions is politically driven and in some ways unrealistic."


The large number of qualitative responses (over 140 individual comments) on barriers and recommendations showed that NREPs have many ideas and opinions regarding evaluation and how to improve the practice. The largest number of recommendations fell within the categories "standardization/establish evaluation system" and "administrative incentives/rewards/prioritization for evaluation." Many respondents felt that verbal encouragement or requirements for evaluation were not backed up with adequate support systems, guidelines, or administrative procedures. The most frequently mentioned statements related to time prioritization--many respondents spoke of too many other program priorities that took precedence. One respondent said, "taking time for more thorough evaluations would interfere with productivity."


The study described here showed several factors that influence evaluation behavior of natural resource Extension practitioners.

Job performance assessment was a significant factor in program evaluation behavior variability among NREPs. The review of blank job performance assessment forms from across the country revealed a significant gap in clear criteria relating to evaluation in many states. It is notable that respondents with similar types of jobs in 16 states disagreed on whether or not their job performance was assessed, in part, on their evaluation practices.

Another key influence was the availability of evaluation specialists. The data showed that respondents who conducted higher levels of program evaluation had access to evaluation specialists in their state. Nineteen of 55 respondents who provided written recommendations suggested that hiring (or maintaining) an evaluation specialist is or would be an influence on their evaluation practices. Hiring an evaluation specialist has several benefits relating to the ingredients for successful evaluation in an organization: having evaluation specialists communicates organizational commitment to evaluation, can lead to increased confidence levels through training and mentoring, and provides a focal point for improved internal evaluation processes.

The influence of the motivator "Seeing my evaluation results used for decision making" was also a factor in evaluation behavior variability. The data revealed that many respondents, particularly those who conducted high levels of evaluation, were motivated to do evaluation by seeing their evaluation results used in program decision-making. Several respondents recommended that administrators show how evaluation results are being used in higher-level decisions as a way to enhance evaluation practice in their organizations. Belief that evaluation is not being conducted just for the sake of accountability was important to many respondents.

Despite the expectation that tenure status, funding source, and education levels would influence evaluation behavior, there did not appear to be any link between evaluation behavior and these variables.

Implications for Evaluation Capacity Building

These findings about individual motivation alone might lead Extension administrators to believe that if they hire older people who feel confident and have positive attitudes about evaluation, give them access to evaluation specialists, and ensure that their performance is assessed on the basis of their evaluation practices, then their evaluation worries would be over. However, recent literature on organizational evaluation capacity building indicates that an organization's evaluation performance is more complex than examining motivation of employees at one moment in time (Compton, Baizerman, & Stockdill, 2002; Torres & Preskill, 2001; Toulemonde, 1999).

The findings of this study need to be tied to a larger discussion of the dynamics of organizational evaluation capacity building and the factors that make evaluation viable within an organization over time. The literature on evaluation capacity building reveals that there are many interactive factors related not only to individual motivation, but also to organizational systems and processes that influence the capacity of an organization to effectively evaluate its work. Individuals are complex, and they operate in complex social and work situations. The evaluation capacity literature instructs us to look at not only what makes NREPs tick regarding evaluation, but what makes an organization tick regarding evaluation, with the individual as one--but only one--component.

Known by several names (organizational learning, mainstreaming, enculturation, and evaluation capacity building), the concept of multi-faceted intervention to enhance evaluation at all levels of an organization needs to be a continuing focus of Extension organizations at the national, state, and local levels. Extension organizations need to develop internal evaluation processes, support systems, incentive systems, and ways to enhance individual competencies to effect change in evaluation behaviors. The organizations can affect individual motivation through training and mentoring to build confidence, and by demonstrating commitment to evaluation at all levels.

The study described here looked at a specific set of NREP personal and job characteristics; it is possible that other factors or barriers exist or have arisen since the survey was released. It has been suggested, for example, that the university human subjects ethics review process has recently become a significant barrier for some NREPs. Additionally, the study examines factors from the perspective of NREPs. A useful complementary study would be an examination of administrative perspectives and other organizational factors, or a case study of administrative interventions designed to enhance evaluation practices.


AREERA (2004). Agricultural Research, Extension, and Education Reform Act.

Barnette, J.J., & Sanders, J.R. (2003). The mainstreaming of evaluation. Jossey-Bass, San Francisco.

Bennett, C.F. (1976). Analyzing impacts of Extension programs. Washington D.C., U.S. Department of Agriculture Extension Service, No. ESC 575.

Centers for Disease Control and Prevention (2004). Framework for program evaluation in public health. MMWR 1999; 48, No. RR-11.

Compton, D., Baizerman, M., & Stockdill, S. (2002). The art, craft, and science of evaluation capacity building. Jossey-Bass, San Francisco.

Decker, D., & Yerka, B.L. (1990). Organizational philosophy for program evaluation. Journal of Extension [On-line], 28(2). Available at: http://www.joe.org/joe/1990summer/f1.html

GPRA (1993). Government Performance and Results Act. Available at: http://www.whitehouse.gov/omb/mgmt-gpra/gplaw2m.html

Torres, R., & Preskill, H. (2001). Evaluation and organizational learning: Past, present, and future. American Journal of Evaluation, 22(3): 387-395.

Toulemonde, J. (1999). Building effective evaluation capacity: Lessons from practice. Transaction Publishers, New Jersey.

World Bank (2002). Annual report on evaluation capacity development. Washington D.C., Operations Evaluation Department.