Chapter III.2 from the new workbook of the Slovenian Association of Evaluators, forthcoming, Spring/Summer 2024.

Summary: Participatory evaluation struggles to aggregate the diverse individual impacts of complex interventions into a summary indicator of impact. Traditional micro-to-macro approaches prove inadequate. This section explores constructivist methods that emphasize meso-level synthesis and focus on intermediate phenomena (such as the authentic middle, the empty middle, relative thirdness…). This conceptual shift from micro-macro to micro-meta-level synthesis unlocks the full potential of participatory evaluation. It fosters collective rationality without requiring a trade-off against inclusivity in evaluation. Meso-matrices and Venn diagrams are suggested as ways to operationalize this new concept of synthesis and collective rationality.

Another core principle of participatory evaluation is to foster collective rationality. This means ensuring that participatory evaluation consistently generates higher and more integrated societal benefits from the given contributions of diverse individuals and groups than other approaches do. The effectiveness of the tools employed in participatory evaluation should therefore be assessed by their ability to address the aggregation problem, that is, by how successfully they integrate diverse perspectives into a coherent collective sense or decision. Restricted, selective, or partial aggregation is associated with incoherent or biased aggregation methods. Such methods lead to suboptimal decisions and resource waste, hinder collective learning, and significantly disadvantage the collective good. This, in turn, erodes trust, causing community disengagement and reluctance to cooperate in communal endeavours.

The terms ‘aggregation’ and ‘synthesis’ are used interchangeably to refer to the process of ‘aggregative synthesis’ (Munn et al., 2014). This process involves accumulating data from multiple sources into a unified dataset (aggregation) followed by interpretation to create a novel, cohesive understanding (synthesis). Synthesis produces cohesive understanding by making sense of the relationships between the individual parts from the viewpoint of the whole.

The logic underpinning aggregative synthesis must align with the conceptualization of the whole within the evaluation object, its scope, and the underlying theory of change (Creswell, 2022). For simple evaluation objects, the methodology of aggregation is simple. It is typically descriptive and quantitative, employing linearly additive, commensurable data. Simple aggregation focuses on summarizing, averaging, or identifying the frequency of specific values or data points within a dataset. Conversely, complex evaluation objects necessitate complex aggregation methods. These methods handle incommensurable inputs and involve comparative, indirect, and non-linear data. The objective is to identify characteristic patterns, contradictions, or novel meanings through a deeper level of interpretive reading.
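As an illustration of the simple case only, the following minimal sketch shows the three operations named above (summing, averaging, counting frequencies) applied to commensurable data. The figures and the variable names are hypothetical and are not taken from any evaluation; the sketch assumes that all findings are already expressed in one common metric (here, euros).

```python
from collections import Counter
from statistics import mean

# Hypothetical micro-level findings, all expressed in the same metric (euros).
benefits_eur = [1200, 800, 1500, 800]

total = sum(benefits_eur)          # linearly additive summary
average = mean(benefits_eur)       # averaging
frequency = Counter(benefits_eur)  # how often each value occurs

print(total, average, frequency.most_common(1))
```

These operations are meaningful only because every input shares a unit. Incommensurable inputs, such as a narrative account alongside a monetary estimate, offer no common scale on which a sum or a mean could be computed, which is why complex evaluation objects call for interpretive rather than arithmetic synthesis.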

Participatory evaluation inherently faces the aggregation problem, which stems from the multifaceted nature of collective choice. For instance, complex interventions can produce diverse effects on various groups, sectors, levels, thematic areas, or domains (Scriven, 2003). This characteristic heterogeneity makes direct comparisons of an intervention’s effects problematic. The core problem lies in how to account effectively for the wide range and diversity of inputs received during participatory evaluation while simultaneously achieving a unified understanding of the collective outcome.

The concept of the whole can be constructed in diverse ways during the synthesis. Multiple conceptualizations exist, such as the hierarchical structure of the MSC and OH, the horizontal network pattern of the CM, or a combined vertical-horizontal structure in the SM. Furthermore, as Scriven (2003) emphasizes, the purpose of an evaluation determines the logic of synthesis. Therefore, the selection of an aggregation approach requires careful justification.

Historically, policy impact evaluation has had a troubled relationship with the need for evaluative synthesis. A central point of contention lies in the absence of a unified perspective on the role and methodology of data aggregation within the evaluation process. This lack of consensus has led to deep divisions among the dominant evaluation traditions, which work at the micro, macro, and meso levels.

Leopold et al. (1971) developed a detailed matrix method for impact assessment at the micro level. They held that aggregating fragmented inputs into policy-relevant conclusions at the macro level must be avoided because it requires value judgments. On this view, the evaluator’s primary task is to inform and comment on specifics rather than to generalize. Refusing to aggregate is seen as necessary for neutral evaluation, as it creates a boundary between the evaluator and the policy-maker, protecting the former from political interference.

Since Leopold, many evaluators have refused to aggregate their findings. The Impact Assessment Board (IAB), which advises the European Commission, found that most evaluation studies fail to provide policymakers with useful information at the macro level. Similar conclusions were reached by Hageboeck et al. (2013) in their meta-evaluation of 340 evaluation reports prepared for the United States Agency for International Development, and by Huitema et al. (2011), who evaluated the quality of synthesis in 259 evaluation studies commissioned for the European Union climate policy.

Conducting detailed assessments can certainly help policy-makers better understand detailed impacts at lower levels of the organisational hierarchy. However, it does not make the decision-making process easier (Diamond, 2005). When findings are not aggregated, evaluation studies produce information overload. They fail to capture the complex reality that policy-makers operate within and provide only banal answers to policy questions (Virtanen and Uusikylä, 2004).

The rejection of summation in evaluation, which shifts the responsibility to policy-makers, program managers, or other key stakeholders, assumes that they can perform the task neutrally. This, however, can be difficult to justify (Stiglitz et al., 2009) due to their bounded rationality (Simon) and vested interests (Crano, 1983). Scriven (1994, p. 378) highlighted that rejecting summation in evaluation means “letting the client down at exactly the moment they need you most”. This is the moment when stakeholders need to correlate opposing findings and make sense of them at the collective level. Scriven saw that rejecting aggregation only exposes evaluation results and their interpretation to manipulation by an included minority.

Some evaluation approaches do aggregate their detailed conclusions, and they do so in diverse ways. The traditional method aggregates fragmented commensurable findings from the micro to the macro level to create a composite indicator of the intervention’s overall success. Here, input data are linked to the aggregate through a common metric, such as euros, tons of CO2 emissions, or number of votes. However, challenges arise when dealing with incommensurable data in qualitative research. Some methods treat the micro and macro levels as non-comparable or incommensurable. The SM and the CM, by contrast, conceptualize micro and macro as extreme points on a continuum, allowing researchers to zoom in or out on specific details or subaggregates within the overall findings.

In complex situations, simply aggregating similar data points is insufficient. Aggregation is “not simply a routine process of drawing general lessons from local projects; one must also take into account how power, ignorance, and framing play a role” (Geels, 2007, p. 646). Different aggregation methods reflect different ethical and political concerns about the diverse values, preferences, and interests of stakeholders, mirroring differing understandings of societal structure and its power dynamics.

Given the documented shortcomings of result-based approaches, the evaluation of complex interventions calls for a constructivist approach. Its epistemology moves beyond a purely evidence-based model, embracing design-based approaches that are inclusive and emphasize the subjective and contextual explanations for complex interventions’ effects.

Emerging within the fourth generation of evaluation approaches (Guba and Lincoln, 1989), constructivist evaluation emphasizes collective sensemaking through dialogue and reflexive processes. Meta-narrative synthesis (Flanagan, 2022), a specific constructivist approach,[1] is explicitly referenced in the CM. Instead of aggregating data points, it synthesises narratives thematically. Rather than seeking uniformity across findings, meta-narrative evaluation explores the relationships and contradictions between them. It acknowledges the influence of research contexts and theoretical perspectives on the data. While identifying contradictions is a crucial step in evaluation, the meta-narrative approach goes further by emphasizing their resolution through synthesis, leading to a more comprehensive understanding.

Consensus-building is another prominent constructivist approach employed in evaluation synthesis (Hill et al., 1997). This approach emphasizes bringing together participants with diverse perspectives to collaboratively discuss a specific issue and arrive at a mutually agreeable understanding. The approach utilizes facilitated dialogue to identify key themes while acknowledging the existence of disparate interpretations. Structured discussions, often organized hierarchically, guide the synthesis process. Voting might be used to reach a final decision, particularly when adhering to the ‘One Person – One Vote’ principle.

Another constructivist approach is qualitative comparative analysis (QCA), which the CM follows. QCA identifies the core themes, issues, and questions emerging from the qualitative data. Researchers interpret the results and draw conclusions about the causal relationships or patterns observed in the data. Synthesis can be achieved retrospectively using the ‘grounded theory’ approach, which entails searching for a theoretical framework that best explains the data after it has been collected and analysed.

The concept of responsive evaluation, introduced by Stake in 1967 and followed by OH (Beardmore et al., 2023), prioritizes stakeholder concerns over predetermined objectives or indicators. It emphasizes the use of methodologies that are sensitive to the cultural backgrounds of participants. During the synthesis phase, the evaluator employs triangulation, a process of corroborating findings from multiple case studies, to identify recurring patterns within the collected data.

Participatory action research (PAR), introduced by Freire in 1970, is followed by the CM (Copestake et al., 2019). This approach emphasizes the active participation of community members in all stages of the research process, including data collection, result synthesis, and meaning interpretation. The evaluator bridges research, policy, and community, acting as a facilitator, communicator, and advocate who fosters a participatory environment that empowers social change.

The OH tool also incorporates principles from utilization-focused evaluation (Patton, 2011). This approach prioritizes identifying and engaging with key stakeholders who will ultimately utilize the evaluation findings. The evaluation is then designed to address their specific information needs and preferences. Synthesis of findings occurs during collaborative meetings or workshops. These sessions facilitate discussion and prioritization among key users, ultimately informing action planning based on the evaluation’s insights.

While none of the four aforementioned deliberative tools explicitly employ the dialectical method, the value of this method in evaluating complex problems is recognized by many prominent evaluation scholars. Patton (2010) highlights the relevance of a dialectical perspective within evaluation methodology, as it acknowledges the inherent uncertainty and indeterminacy of qualitative data. In today’s complex societies, competing demands and conflicting viewpoints are inevitable. The dialectical method offers a critical lens for examining opposing positions, exposing their limitations and inconsistencies (Stake, 1998). Proponents like Dick (2000)[2] posit that dialectical synthesis can identify synergies and lead to win-win solutions for all stakeholders, unlike majority voting which creates winners and losers. Guba and Lincoln (1990) emphasize how the dialectical perspective empowers stakeholders to construct their understanding of reality through a process of cross-sectional synthesis of opposing viewpoints and a dialectical interpretation of the resulting interconnected findings.

Dialectics introduces a meta-theoretical hybrid method, transcending traditional divisions within epistemology. It intersects critical analysis with the constructivist formation of antagonistic viewpoints. This intersectional emphasis is particularly characteristic of dialectical constructivism, an epistemological framework grounded in the concept of cross-sectional or intersectional synthesis. Charles Sanders Peirce’s (1996) work exemplifies this framework. The core ideas of dialectical constructivism can be operationalized through the concept of a meso-matrix: a square matrix spanning a moderate number of independent domains (at least three; Simon), between a small and a large number.

The constructivist approaches to evaluation discussed previously share several key features, most notably their rootedness in meso-level synthesis. Constructivism inherently adopts a meso-level perspective, prioritizing the examination of social phenomena at intermediate levels, including groups, sectors, domains, and thematic areas. This meso-level focus underscores the importance of deliberation, facilitation, interpreting diverse narratives, and investigating the construction of shared meanings within these social contexts. Chelimsky (1997) advocates for a middle-ground approach to evaluation that acknowledges both the specific context of individual programs and the broader organizational or societal environment. Similarly, Scriven (1991) proposed a pragmatic approach situated at the meso level of evaluation, bridging the gap between micro and macro levels. Examples of such methods include network analysis, matrix models, causal models, cluster analysis, and meta-synthesis.

The authors of the MSC stress the importance of evaluating intermediate impacts. The CM and OH also suggest focusing on meso-level evaluation, such as participant-related structures, routines, or sub-maps. Unlike traditional outcome-driven approaches, the OH, MSC, and CM do not aim to identify interventions’ outcomes. They focus on changes within the system, such as behavioural shifts, process drivers, emerging patterns, contextual influences, or modifications to standard procedures. The SM also aligns with this perspective by employing a triadic structure for meso-level evaluation, encompassing three distinct, context-dependent evaluation domains.

While various aggregation procedures exist at the meso level of evaluation, their collective rationality varies significantly. The four tools under consideration all share a lineage within the meso-level tradition. These tools employ a mixed-methods approach, intersecting quantitative and qualitative data and methodologies. They encompass diverse research aspects to ensure comprehensive and impartial coverage across all domains of inquiry. Their purpose is confined to specific topics within narrower groups or thematic areas. For instance, they do not address broader, abstract (but intrinsically mesoscopic) concepts such as ‘sustainability’ or ‘social cohesion’. Traditional constructivist approaches utilize a ‘mid-range’ or ‘multicriteria’ perspective, which falls short of an authentic mesoscopic perspective (Section IV.2).

Authentic mesoscopic theorizing within evaluation necessitates acknowledging the existence of opposing poles (Dopfer, 2013), such as binary constructs A and B. Synthesis of opposing viewpoints can only be achieved through the introduction of a third, intermediary category that integrates elements of both. For instance, socio-economic development serves as an intermediary category between the economic and social domains of sustainable development. An intersectional perspective empowers evaluators to reframe fundamental oppositions from a stance of antagonism to a middle-ground perspective. This middle ground allows opposing viewpoints to coexist while preserving the core aspects of both.

A critical distinction exists between absolute and relative thirdness. The triadic relationship is not constructed from three opposing domains (A, B, and C – absolute thirdness). Instead, it is formed by two opposing domains (A and B) with a third, intermediary category (ab) that is relative to them. Authentic meso-level evaluation hinges on this concept of relative thirdness (A, ab, B). This distinction reflects the difference between divisive (inauthentic) and integrative (authentic) mesoscopic structures. According to philosopher Charles Sanders Peirce, relative thirdness arises between at least three pairs (!) of domains, whereas dyadic oppositions only exist between two individual domains.

Peirce’s philosophy posits that phenomena can be apprehended in three distinct modes: firstness (independent quality), secondness (reaction or opposition), and thirdness (mediation or representation). ‘Firstness’ governs absolute qualities, ‘secondness’ governs relative forces, and ‘thirdness’ governs mediation. None of the three basic forms of logical reasoning has an absolute advantage over the others. Firstness is about the unit, secondness is about a pair, and thirdness is about a multitude of units. Firstness is about the counting of commensurable contents, secondness is about division, and thirdness is about multiplication. All three are indispensable. The monistic method is appropriate for classification, logic, and mathematics; the dualist method is suitable for the application of causality, dialectics, and correlation; and thirdness is prescribed in actions that generate integrated meaning from initially incomplete and inconsistent concerns (Peirce, 1931, 2004).

From this foundation, Peirce elaborated the concept of ‘secondness of thirdness,’ which bridges the divide between two-part and three-part reasoning. It offers a three-dimensional perspective on the dialectical relationships between domains at the meso level. Within this framework, secondness represents opposing poles, while the triadic element functions as a mediator that facilitates their interaction. This intermediary role is critical for enabling the joint construction of collective choices (Radej, 2022). Relative thirdness fosters the most authentic middle-level reasoning.

Secondness and thirdness relationships can be operationalized for evaluation purposes within a meso-matrix. This matrix is a square structure containing at least three evaluation domains (such as the economic, social, and environmental domains of sustainable development), organized in three rows representing complex interventions and three columns representing assessment criteria. Secondness serves to organize and correlate dyadic relationships between these domains. Thirdness, on the other hand, facilitates triangulation between previously correlated pairs. For instance, the economic impact on social criteria is examined alongside the social impact on economic criteria. This double correlation constructs a ‘meta-overlap’ (which aligns with the concept of the ‘Meso 3 sublevel’ in Radej, 2021). The meso-matrix approach assumes a two-phase aggregative synthesis: first, from the micro level of participants to the meso level of evaluation domains, and then from the meso level to the meta-level of overlap between all evaluation domains, which offers a well-justified frame for interpreting complex issues.[3] Alternatively, the concept of secondness of thirdness can be operationalized using a Venn diagram. It depicts three partial overlaps between three circles, culminating in a single meta-overlap situated within these partial overlaps.
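The structure of the meso-matrix described above can be sketched in a few lines of code. This is an illustrative sketch only, not the author’s procedure. The ratings (‘+’, ‘0’, ‘-’), the function names `pairwise_overlaps` and `meta_overlap`, and the choice to represent the meta-overlap simply as the three correlated pairs placed side by side are all assumptions introduced here. No scoring or weighting rule is implied.

```python
from itertools import combinations

DOMAINS = ("economic", "social", "environmental")

# Hypothetical ratings for illustration only ("+", "0", "-").
# Outer key: domain of the intervention (row).
# Inner key: domain of the assessment criterion (column).
meso_matrix = {
    "economic":      {"economic": "+", "social": "-", "environmental": "-"},
    "social":        {"economic": "+", "social": "+", "environmental": "0"},
    "environmental": {"economic": "-", "social": "+", "environmental": "+"},
}


def pairwise_overlaps(matrix, domains=DOMAINS):
    """Secondness: pair each off-diagonal cell with its mirror cell."""
    return {
        (a, b): {f"{a} -> {b}": matrix[a][b], f"{b} -> {a}": matrix[b][a]}
        for a, b in combinations(domains, 2)
    }


def meta_overlap(matrix, domains=DOMAINS):
    """Thirdness: place the three correlated pairs side by side.

    No scoring rule is applied here; the result is only the material
    that evaluators would interpret together.
    """
    return list(pairwise_overlaps(matrix, domains).values())


overlaps = pairwise_overlaps(meso_matrix)
triangulated = meta_overlap(meso_matrix)

for pair, cells in overlaps.items():
    print(pair, cells)
```

The diagonal cells (each domain assessed against its own criteria) are deliberately left out of the overlaps, since only the cross-domain cells express relationships between domains. How the three pairs are then read together remains an interpretive task for the evaluators, which the code does not attempt to automate.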

The objective of mesoscopic synthesis fundamentally differs from both conventional aggregation methods (micro to macro) and inauthentic middle-level methods (as exemplified by Dopfer’s focus on micro-meso-macro synthesis). The ultimate goal of aggregative synthesis in complex conditions is not to reach the macro level, which is inherently incompatible with the mesoscopic nature of complex phenomena due to its logocentric (reason-centered; result-based) character. In contrast, the ultimate aim of authentic mesoscopic synthesis is to reach the meta-level, the highest non-exclusive level of synthesis between overlaps, devoid of logocentric assertions.

The collective rationality of the four evaluation tools hinges especially on their ability to achieve meso-to-meta-level synthesis. Without it, a tool cannot achieve the methodological intersection of relative thirdness with relative void, which fosters the emergence of an empty middle at the meta-level. Standing in the empty middle allows participants to gain a new perspective on issues that are otherwise obscured by bias. Participants who were previously epistemically blind can now see complex things as blindsighted. They do not decide between oppositions, but this does not make them relativists. A blindsighted evaluator in the empty middle promotes only reasonable judgment, free from preconceived notions about which values, knowledge, or preferences should be prioritized in evaluating complex interventions. In participatory evaluation, the most privileged perspectives are those that demonstrate, at a given level of inclusivity, the highest degree of collective rationality (blindsighted).

[1] There are as many ways of constructivist synthesis as there are observers of uncertain things, because everyone sees them differently. For example, Mann collected 1000 different constructs of the concept of sustainable development, designing 1000 different approaches to integrating its domains. Mann S. 2011. Sustainable Lens: A visual guide. NewSplash Studio, Dunedin. 206 p.

[2] Dick B. 2000. Delphi face to face.

[3] The aggregation from the micro to the meso level is no different from conventional approaches to aggregation, since it is based on the commensurability of impacts within each evaluation domain.