The Journal of Extension

December 2010 // Volume 48 // Number 6 // Feature // v48-6a1

Evaluating Multiple Prevention Programs: Methods, Results, and Lessons Learned

Extension faculty and agents/educators are increasingly collaborating with local and state agencies to provide and evaluate multiple, distinct programs, yet there is limited information about measuring outcomes and combining results across similar program types. This article explicates the methods and outcomes of a state-level evaluation of approximately 200 individual programs across eight program types. A three-tiered evaluation design was devised and implemented through a collaborative, iterative process to assess the extent to which common program objectives were met. Results provided evidence of efficacy and impact and offered tools for diverse stakeholders. Lessons learned and implications for Extension professionals are offered.

Francesca Adler-Baeder
Associate Professor and Extension Specialist
Auburn University
Auburn, Alabama

Jennifer Kerpelman
Professor and Extension Specialist
Auburn University
Auburn, Alabama

Melody M. Griffin
Post Doctoral Research Fellow
Auburn University
Auburn, Alabama

David G. Schramm
Assistant Professor and Extension Specialist
University of Missouri
Columbia, Missouri


Extension professionals are increasingly partnering with grassroots groups and state organizations to evaluate external programs. In this article, we describe a method for simultaneously evaluating multiple prevention program types, document program impact, share results, and offer lessons learned from assessing outcomes across multiple, distinct prevention programs.

In 2005, we were awarded a 3-year contract from the Alabama Department of Child Abuse and Neglect Prevention/The Children's Trust Fund (CTF) to conduct an evaluation study of their funded programs. Each year, approximately 200 individual community-based programs are awarded funding through a competitive process in the areas of parent education/support, home visiting, respite care, fatherhood, community awareness, school-based, non school-based/after-school, and mentoring.

The primary challenge for the evaluation study was to develop methods for standardized data collection across individually distinct prevention programs, so that aggregation of outcome results within each of the program types could be conducted. Prior to the study, grantees had determined their own methods for assessing outreach and outcomes, and submitted individual reports that could not be combined within similar program types. The model, results, and "lessons learned" are described to assist Extension professionals with their current or future work with local and state agencies as an evaluation partner.


Program evaluation often serves numerous functions. Roucan-Kane (2008) identified several purposes of evaluation, including reporting results to establish and maintain public relations and demonstrating program impact to funders. Evaluation data presented by Extension educators to businesses, agencies, and organizations can be a strong marketing tool (Stup, 2003). Additionally, evaluations identify the strengths and weaknesses of programs, and results can be used to compare programs within program types. Organizations such as CTF can use this information to decide whether program funding for specific grantees should be continued.

One area in need of documented evaluation involves programs aimed at preventing child abuse and neglect. Child abuse and neglect is widespread and harms children and youth both immediately (e.g., physical aggression, social impairment, low self-esteem, poor academic performance) and later in adulthood (e.g., low self-esteem, depression, abuse/neglect of offspring) (Gross & Keller, 1992; Higgins & McCabe, 2003; McGee, Wolfe, & Wilson, 1997; Solomon & Serres, 1999; Vissing, Straus, Gelles, & Harrop, 1991); its prevalence demands strategic efforts to reduce its occurrence. With Alabama ranking 48th in the composite rankings of child well-being indicators in the national Kids Count ratings, successful efforts to reduce risks for Alabama children are particularly important. Prevention programs are educational in nature; therefore, agencies look to Extension faculty and agents/educators as natural partners.


Evaluation Approach

As the university-based partner, we focused as much on how we implemented the evaluation as on the actual nature of the evaluation tools we used. Our goal was to help grantees understand the value of systematic evaluation methods both for their individual programs and for the grantee group as a whole. It was critical to be sensitive to the apprehension about evaluation often felt by community-based program staff, as well as to the need for evaluation methods that did not significantly interfere with the time needed for program implementation.

While striving to empower grantees, the focus was simultaneously on CTF's need for valid measurement of outreach success and more meaningful documentation of indicators of program impact. Built into the design were methods for "standardizing" data collection across programs within each program type to allow for the aggregation of data across programs. However, some flexibility was allowed because programs within each program type differed in design and goals.

The key elements of the approach included the following:

  1. Valid measurement of participant demographics across all grantee programs (e.g., unduplicated numbers served, characteristics of participants, etc.).

  2. Valid measurement of program impact based upon participation (i.e., documenting positive changes among participants attributable to participation in the program).

  3. Aggregation of program impact evidence across similar types of programs.

  4. Inclusion of "user friendly" methods of data collection and provision of direct support by the evaluation team to grantees.

Evaluation Design

To document scientific evidence of program impact, the evaluation study assessed participants' knowledge, attitudes, and skills both before and after program participation. Before 2005, CTF simply required programs to report "outcome data" about their participants as a group (e.g., the percentage who could use at least three positive parenting practices). Because such assessments included no information about participants before the program, it was impossible to know whether or how program participation had had an impact. Although the CTF programs included in the evaluation study differed in approaches and curricula, the evaluation design was based on similarities across programs' goals and objectives within each program type, identified through an initial systematic review of the programs' funding applications, in which goals and objectives were stated.

Due to the large number of programs funded by CTF and the anticipated volume of data, the evaluation team developed a three-tiered evaluation design.

Tier 1

Tier 1 of the evaluation study consisted of gathering data on perceptions of impact from all programs. We established lists of program objectives for each program type, and each CTF grantee selected from its program type's list the objectives (at least three, at most 12) it expected to be affected through program participation.

The achievement of these objectives was assessed through the Post-test with Retrospective Pre-test Questionnaire method (Pratt, McGuigan, & Katzev, 2000), which consisted of items that were re-statements of program objectives in the first person. This method was efficient and provided meaningful quantitative documentation of the achievement of specific program objectives (Lam & Bengo, 2003).

For all programs, participants completed the questionnaire when program participation ended in order to assess changes in targeted knowledge, commitment, and ability. Due to the very large number of youth served, program staff surveyed a random sample of students (e.g., every 10th class). For each item, participants indicated, in terms of specific objectives, where they stood after they had been exposed to the program and, concurrently, where they stood on the same items before program participation. For example, on the questionnaire for Parent Education/Support programs, one item stated, "My ability to use several forms of positive discipline." Participants in these programs responded to indicate how they rated themselves on this item after participating in the program (response options: is poor, is fair, is good, is excellent) and before participating in the program (response options: was poor, was fair, was good, was excellent). Participants were fully aware that they were assessing the impact of the program in their lives.

This method has value in that participants may have a clearer perspective on the extent to which their knowledge, attitudes, and abilities have changed, whereas before the program, they may under- or, typically, over-estimate their levels. In addition to the program objectives data, participant demographics were gathered at the Tier 1 level.
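As an illustration of how retrospective pre/post item data of this kind can be analyzed, the following is a hypothetical sketch in Python. The ratings are invented for illustration and are not the study's actual data or tooling; the paired comparison shown is the standard paired-samples t statistic.

```python
# Hypothetical sketch: paired comparison of retrospective pre/post ratings.
# Response options are coded 1 = poor, 2 = fair, 3 = good, 4 = excellent.
# All ratings below are invented for illustration.
import math
from statistics import mean, stdev

# Each position is one participant's pair of ratings on a single item,
# e.g., "My ability to use several forms of positive discipline."
before = [2, 1, 2, 3, 2, 1, 2, 2, 3, 1]  # retrospective "before" ratings
after = [3, 3, 3, 4, 3, 2, 3, 4, 4, 2]   # "after" ratings

# Paired-samples t statistic: the mean within-person change divided by
# the standard error of the change scores.
diffs = [a - b for a, b in zip(after, before)]
mean_change = mean(diffs)
t_stat = mean_change / (stdev(diffs) / math.sqrt(len(diffs)))

print(f"mean change = {mean_change:.2f}, t({len(diffs) - 1}) = {t_stat:.2f}")
```

With a large positive t statistic and degrees of freedom equal to the number of participants minus one, the change would be judged against conventional critical values, as in the Tier 1 analyses reported below.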

Tier 2

Tier 2 involved more detailed methods of data collection and was used with 25 to 30 representative programs in three programmatic areas (Home Visiting/Parent Education, Fatherhood, and Youth). Separate pre-program and post-program surveys were used that contained established social science measures of key factors predicting child maltreatment risk (National Research Council [NRC], 1993). These factors included: distress level, child development knowledge, use of nonpunitive parenting practices, perceived parenting efficacy, level of caregiving and involvement, family harmony, and co-parenting quality. Measures differed slightly based on program type. The specific measures for the eight program types are available from the first author. Participants in the Tier 2 programs completed the Pre-Program Survey prior to program participation and the Post-Program Survey after participation. Questionnaires took approximately 20 minutes to complete.
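A routine step in working with established multi-item measures such as these is computing a scale score after reverse-coding oppositely worded items. The sketch below is hypothetical: the item count, 1-5 response coding, and reverse-keyed positions are invented, not the study's actual instruments.

```python
# Hypothetical sketch: scoring a multi-item social science measure.
# The item responses and reverse-keyed positions are invented.

def scale_score(responses, reverse_keyed=(), points=5):
    """Mean item score after reverse-coding the keyed items.

    On a 1..points scale, a reverse-keyed response r becomes points + 1 - r,
    so all items point in the same direction before averaging.
    """
    adjusted = [
        (points + 1 - r) if i in reverse_keyed else r
        for i, r in enumerate(responses)
    ]
    return sum(adjusted) / len(adjusted)

# One participant's answers to an invented six-item distress scale,
# where the items at positions 2 and 5 are worded in the opposite direction.
answers = [4, 5, 2, 4, 3, 1]
print(scale_score(answers, reverse_keyed=(2, 5)))
```

Pre- and post-program scale scores computed this way can then be compared with the same paired methods used for the Tier 1 data.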

Tier 3

In Tier 3, ethnographic methods were employed to collect qualitative data about program experiences from a select number of participants and facilitators. Qualitative data can be used to inform program design (i.e., assist CTF staff and program directors with implementing changes that serve to enhance program impact) and can be used to "tell the stories" of program impact. Qualitative data put a human face on the interpretation of quantitative results.


Results

The quantitative results of the evaluation efforts from program year (PY) 2006-2007 provided evidence of the positive outcomes due to CTF-funded program participation for parents and youth in Alabama. Almost 10,000 adults/parents and over 75,000 children/youth participated in program offerings during PY 2006-2007. The majority of adult participants were between the ages of 19 and 30 (46%). They were predominantly female (81%), and nearly equally Caucasian (48%) and African American (45%). For parent participants over the age of 18, only 41% reported working full-time; 48% reported not working for pay. Additionally, for these parents, one-quarter (26%) reported not completing high school, while 38% reported a high school degree or the GED as their highest level of education. Nearly half (49%) of these parents indicated their gross household income level was less than $14,000. An additional one-third (32%) of participants reported an income between $14,000 and $39,999. Most of the youth participants were in 5th grade or below, with almost half (46%) in grades 3 through 5. Half were boys, and half were girls. They were predominantly African American (55%) and Caucasian (39.5%).

Parents were served through parent education, home visiting, respite care, and fatherhood programs. Youth were served through school-based, non school-based, after-school, and mentoring programs. Evaluation methods considered developmental differences and used separate questionnaires for 5th grade and below and for 6th through 12th grade. For data from Tier 1, paired-sample t-tests revealed statistically significant (p < .001) improvements in average levels of commitment, skill, and knowledge across all program objective areas. For example, see the following.

  1. For Parent Education/Home Visiting (3,294 participants, 24 objectives), key areas of improvement included commitment to seek informal support, ability to use positive discipline, and knowledge of age appropriate activities for their children.

  2. For Fatherhood (414 participants, 24 objectives), key areas involved commitment to co-parenting, commitment to cooperate with Child Support Enforcement, and knowledge of positive parenting strategies.

  3. For youth 5th grade and below (3,901 participants, 13 objectives), key areas included knowledge about good touch and bad touch, how to tell people what they want, and how to control anger.

  4. For youth in grades 6 through 12 (7,035 participants, 24 objectives), key areas included ability to handle anger, ability to communicate intentions, and commitment to tell someone about abuse/neglect situations.

Because significant differences can be easily detected with such large samples, we also made efforts to explicate the meaning and the magnitude of the changes. For stakeholders who are not researchers, we have found that a description of the shift in frequency of responses is meaningful. We included in our reports such descriptions as, "prior to program participation only 28% (n = 95) of fathers reported the quality of their co-parenting relationship was good or excellent; following program participation 62% (n = 205) reported the quality of their co-parenting relationship was good or excellent."
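The frequency-shift style of reporting described above can be sketched as follows. The responses are invented for illustration; they are not the study's data.

```python
# Hypothetical sketch of "shift in frequency" reporting: the share of
# respondents rating an item "good" or "excellent" before vs. after the
# program. All responses below are invented.

def share_good_or_excellent(ratings):
    """Fraction of ratings coded 3 (good) or 4 (excellent)."""
    return sum(1 for r in ratings if r >= 3) / len(ratings)

# 1 = poor, 2 = fair, 3 = good, 4 = excellent (invented responses)
before = [1, 2, 2, 3, 2, 1, 2, 3, 2, 2]
after = [3, 3, 2, 4, 3, 2, 3, 4, 3, 3]

pct_before = 100 * share_good_or_excellent(before)
pct_after = 100 * share_good_or_excellent(after)
print(f"good/excellent: {pct_before:.0f}% before vs. {pct_after:.0f}% after")
```

Framed this way, a result reads as a plain statement of how many participants moved into the top response categories, which non-researcher stakeholders tend to find more meaningful than a t statistic.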

Analyses of Tier 2 program data revealed statistically significant increases (p < .001) in levels of parental involvement, positive parenting practices, parental efficacy, and beliefs about father responsibilities (19 programs, n = 312). They also indicated decreases (p < .001) in level of parental distress. Analyses of data from 6th through 12th grade participants (11 programs, n = 662) showed statistically significant increases in levels of self-esteem, commitment to education, accountability for actions, conflict management skills, and assertiveness skills.

The qualitative data collected through ethnographic methods used in Tier 3 further demonstrated the effectiveness of the programs. For example, parents stated the following.

  • "I take more time to cool off and communicate better with my kids."

  • "[...] kids get into trouble at school by mimicking parents' bad behavior!"

  • "I learned things about being a parent [...] I didn't know how to be a parent originally."

Youth also shared the following.

  • "I learned how not to argue [...] how to 'tune someone out' when getting into a conflict with them [...] I can express myself better."

  • "[The program] helped me complete my homework, perform better on tests, and improve my grades."

These quotes represent just some of the data collected that "give voice" to the quantitative results. Policy-makers typically appreciate both the quantifiable evidence and the "stories" of Alabama citizens.

These results from CTF-funded programs can be interpreted as evidence of a reduced likelihood that participating parents will engage in child abuse/neglect and of an increase in factors associated with positive outcomes for participating youth. Importantly, this evidence comes from systematic documentation of the combined effort and outcomes of program offerings. This kind of evidence, along with even the most basic reports of participant demographics, had never been explicated in the 22-year history of the Children's Trust Fund prior to this evaluation project.

Lessons Learned

In addition to documenting program impact, another goal of this ongoing evaluation study was to assess CTF grantees' responses to the evaluation methods and to use their feedback and lessons learned to improve data management. A further goal is to offer our methods and experiences for use in similar large-scale evaluations of multiple programs.

Attempting to use standard measures across very different program designs is not without challenge. We stress that a highly collaborative approach is essential to implementing an appropriate, efficient, and effective research design. While systematic assessment is critical, it cannot be rigid and inflexible. We worked to incorporate "real world" issues and adjusted methods when needed. For example, when requested by school programs serving large numbers of youth, we allowed random sampling of participants rather than collecting data from every participant. We developed an "assessment & referral" questionnaire for parenting programs whose objectives differed from those of "traditional" parenting programs. We also adjusted the list of objectives and the corresponding items on the retrospective pre/post questionnaire in response to suggestions about nontargeted and additional objectives, removing and adding items each year.

Beyond these measurement-specific lessons, organizational methods were developed for dealing with the complexity of simultaneously implementing evaluation methods with nearly 200 programs across eight program types. We held weekly team meetings to coordinate ongoing project tasks, as well as to keep all team members informed of evaluation activities.

Two Extension specialists served as co-principal investigators and project directors to oversee the project design and output. A post-doctoral research fellow served as project manager, overseeing all aspects of the project and providing technical assistance to CTF grantees. Another full-time team member served as research lab manager, monitoring day-to-day activities of the research lab, which included quarterly data report processing, administration of a sequential, multi-step data management system, and training and overseeing the undergraduate research assistants who entered data. We also employed a site visit manager and found the resulting face-to-face contact with grantees critical to the project's success. We supported two to four part-time graduate research assistants, who gained experience in both research and community liaison work, and we provided research experience for over 75 undergraduate students.

We developed standardized procedures for each area of the project in order to ensure consistency across the length of the project and across team members. It also was important to use systems to track and centralize communication with individual programs because a grantee may speak with several team members across time.


Conclusion

Collectively, our experiences during the first 2 1/2 years of the 3-year project provide valuable information for navigating the challenges of jointly evaluating multiple, diverse community-based programs that have little to no experience documenting outreach and impact. Our experiences may be helpful to other Extension educators engaged in program evaluation work with large audiences and diverse program types. Such efforts are important for providing intermediary groups or agencies, such as CTF of Alabama, with tools for determining the efficacy of providing funding support for programs, for assessing the impact of individual programs in order to make informed re-funding decisions, and for seeking continued or increased resources from funding sources.


Acknowledgments

This study was supported through a grant from the Alabama State Department of Child Abuse and Neglect Prevention (CTF-CFTF-2008-310). The authors wish to thank all state and local partners for their time and efforts in the successful conduct of this project.


References

Gross, A. B., & Keller, H. R. (1992). Long-term consequences of childhood physical and psychological maltreatment. Aggressive Behavior, 18, 171-185.

Higgins, D. J., & McCabe, M. P. (2003). Maltreatment and family dysfunction in childhood and the subsequent adjustment of children and adults. Journal of Family Violence, 18, 107-120.

Lam, T. C., & Bengo, P. (2003). A comparison of three retrospective self-reporting methods of measuring change in instructional practice. American Journal of Evaluation, 24, 65-80.

McGee, R. A., Wolfe, D. A., & Wilson, S. K. (1997). Multiple maltreatment experiences and adolescent behavior problems: Adolescents' perspectives. Development and Psychopathology, 9, 131-149.

National Research Council. (1993). Understanding child abuse and neglect. Washington, DC: National Academy Press.

Pratt, C. C., McGuigan, W. M., & Katzev, A. R. (2000). Measuring program outcomes: Using retrospective pretest methodology. American Journal of Evaluation, 21, 341-349.

Roucan-Kane, M. (2008). Key facts and key resources for program evaluation. Journal of Extension [On-line], 46(1), Article 1TOT2.

Solomon, C. R., & Serres, F. (1999). Effects of verbal aggression on children's self-esteem and school marks. Child Abuse & Neglect, 23, 339-351.

Stup, R. (2003). Program evaluation: Use it to demonstrate value to potential clients. Journal of Extension [On-line], 41(4), Article 4COM1.

Vissing, Y. M., Straus, M. A., Gelles, R. J., & Harrop, J. W. (1991). Verbal aggression by parents and psychosocial problems of children. Child Abuse & Neglect, 15, 223-238.