StackScore Methodology

The thinking under the measurement.

StackScore™ isn't a maturity model. It's the operationalization of a capability formalism that's been thirty years in the making. Here's what sits under it — and what makes the constraint diagnostic possible.

The capability formalism

Capability = Competence × Enabling Conditions.

Amartya Sen's Development as Freedom (1999) and Martha Nussbaum's Creating Capabilities (2011) defined capability as the substantive freedom to function. Not the skill alone. Not the resources alone. The real opportunity to exercise the skill given the conditions you're actually inside of. The Fusion Architecture under StackScore operationalizes that distinction into one equation, per layer.

When you split each layer into a Competence signal (do you know how, do you practice it) and an Enabling Conditions signal (does your environment let you), you stop conflating two situations that look identical in a single-score model. The brilliantly trained nonprofit with no budget. The well-funded operation with no practice. They produce the same composite. They need opposite interventions. The C-E gap is the structural inequality, measured at the layer level. That is the equity thesis made empirically testable, the move no other adaptive-capacity instrument on the market is making.

For each layer of the Stack we measure two signals and fuse them as a geometric mean: Capability_i = √(C_i · E_i). The geometric mean is theoretically consistent with Sen's framework — it goes to zero whenever either signal is zero (you cannot exercise skill without conditions, and you cannot use conditions without skill) and naturally penalizes divergence. A weighted additive form would have implied substitutability, which Sen explicitly rejects; v1.1 corrected that deviation from theory. Whether the geometric mean produces more diagnostic separation than the strictest interpretation (min(C, E)) is one of the empirical questions Paper 5 addresses.
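A minimal sketch of the fusion step, comparing the geometric mean with the two alternatives named above (a weighted additive form and the strict minimum). The function names, the equal weighting in the additive form, and the example scores are illustrative assumptions, not part of the instrument.

```python
import math


def fuse_geometric(c: float, e: float) -> float:
    """Geometric-mean fusion: Capability_i = sqrt(C_i * E_i)."""
    return math.sqrt(c * e)


def fuse_additive(c: float, e: float, w: float = 0.5) -> float:
    """Weighted additive form (implies C and E can substitute for each other)."""
    return w * c + (1 - w) * e


def fuse_min(c: float, e: float) -> float:
    """Strictest interpretation: the weaker signal caps the layer."""
    return min(c, e)


# Hypothetical (C, E) pairs with the same arithmetic average.
balanced = (3.0, 3.0)
divergent = (5.0, 1.0)

for name, (c, e) in {"balanced": balanced, "divergent": divergent}.items():
    print(
        name,
        round(fuse_additive(c, e), 2),
        round(fuse_geometric(c, e), 2),
        round(fuse_min(c, e), 2),
    )
```

Under these assumptions the additive form scores both pairs identically, while the geometric mean and the minimum both mark the divergent pair lower. How far apart the last two sit in real data is the open calibration question.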

The five layers

Derived, not assembled.

Each layer sits on a distinct literature. Five factors because the construct is theoretically five-factor, not because exploratory analysis surfaced five clusters. The CFA proposed in Paper 5 either confirms the structure or breaks the theory. That's the honest test — and the reason this stands up to peer review when maturity-model competitors don't.

Signal

See what's changing.

The capacity to detect emerging change through systematic and diverse information monitoring. Grounded in environmental scanning (Choo 2001), weak-signal detection (Ansoff 1975), information inequality (van Dijk 2005), and critical signal literacy (Noble 2018). Measured on practices like the regularity of scanning, source diversity, cross-domain awareness, and whether anyone names the bias in the information environment itself.

Sense

Understand what it means.

The capacity to interpret detected signals through structured analytical frameworks, producing coherent accounts of what change means for the organization and its constituents. Grounded in Futures Literacy (Miller 2018), Causal Layered Analysis (Inayatullah 2004), sensemaking in organizations (Weick 1995), and scenario methodology (Schwartz 1991). Measured on scenario practice, implications mapping, temporal assessment, and whether analytical conversations produce conclusions or just continue.

Shape

Decide what you want.

The capacity to articulate preferred futures and commit to strategic direction based on explicit values and priorities. Grounded in normative futures (Bell 2003), epistemic justice (Anderson 2016), and imaginative sovereignty (Benjamin 2019). Measured on values articulation, vision coherence, decision-making effectiveness under uncertainty, stakeholder alignment, and imaginative range — the ability to envision genuinely different futures rather than incremental extensions of the present.

Sprint

Build the first version.

The capacity to translate strategic direction into tangible prototypes, pilots, and testable implementations through rapid experimentation. Grounded in design thinking (Simon 1969), lean methodology (Ries 2011), and build-measure-learn cycles. Measured on prototyping practice, resource availability, experimentation culture, iteration speed, and whether equity is assessed inside what gets built rather than appended afterward.

Sustain

Keep building without external support.

The capacity to embed adaptive practices in organizational systems, culture, and infrastructure such that they persist beyond individual champions and funding cycles. Grounded in systems thinking (Meadows 2008), organizational learning (Argyris & Schön 1978), Emergent Strategy (brown 2017), and the nonprofit starvation cycle critique (Gregory & Howard 2009). Measured on documentation, distributed capacity, feedback mechanisms, funding sustainability, and whether the organization builds its own capacity or stays dependent on external consultants.

The constraint logic

The diagnostic that matters isn't the average. It's the minimum.

The throughput of a system is governed by its weakest station. Eliyahu Goldratt named the principle in The Goal (1984); the Theory of Constraints is the operational expression. The StackScore overall score isn't a five-layer average. It's the minimum-layer capability — because that's what's actually capping the system.

A profile of 4.5 / 4.2 / 4.0 / 1.8 / 3.9 (Signal / Sense / Shape / Sprint / Sustain) is not “generally strong with one weak spot.” It is a Sprint-constrained system producing far less adaptive value than its strong layers should allow. Insight piles up; nothing reaches first version. The Constraint Report leads with the binding layer, not the composite. The recommended actions address the constraint, not the strengths. Investing in the strong layers when one is broken is the most common failure mode of capacity-building work, and it's exactly what an average-based score would tell you to do.
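The constraint logic can be sketched in a few lines, using the example profile above. The layer order (Signal, Sense, Shape, Sprint, Sustain) follows the order the layers are introduced in; the function name and dictionary layout are assumptions for illustration.

```python
LAYERS = ("Signal", "Sense", "Shape", "Sprint", "Sustain")


def binding_constraint(profile: dict) -> tuple:
    """Return (layer, score) for the lowest-scoring layer in the profile."""
    layer = min(profile, key=profile.get)
    return layer, profile[layer]


# Example profile from the text, mapped onto the five layers in order.
profile = dict(zip(LAYERS, (4.5, 4.2, 4.0, 1.8, 3.9)))

layer, overall = binding_constraint(profile)
average = sum(profile.values()) / len(profile)

print(f"Binding layer: {layer} (overall score {overall})")
print(f"Five-layer average, for contrast: {average:.2f}")
```

The average lands well above the binding layer's score, which is the gap between what a composite would report and what is actually capping the system.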

The Fusion 2×2 diagnostic

Not just where you're stuck. Why.

Each layer's C and E scores plot into one of four cells. The cell determines which kind of intervention will actually work.

Structural Constraint

High Competence × Low Conditions. The skill exists; the environment blocks it. This is the equity gap — it tends to characterize community organizations and historically under-resourced institutions. Intervention: resources, access, infrastructure, institutional permission.

Full Capability

High Competence × High Conditions. The layer is genuinely strong. Maintain rather than develop. The constraint sits elsewhere.

Double Deficit

Low Competence × Low Conditions. The hardest quadrant. Requires simultaneous investment in both — typically through embedded programs rather than standalone training or pure funding shifts.

Untapped Potential

Low Competence × High Conditions. The environment is ready but the skill isn't there. Intervention: training, mentorship, practice, exposure. Common in well-resourced institutions that haven't yet developed the practice.

The 2×2 is the diagnostic output. It tells you not just what's capping you but why, and therefore what kind of intervention will actually work. The Constraint Report's recommended actions are routed by the cell, not just the layer.
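A hedged sketch of the cell routing. The text does not say where the high/low boundary sits, so the cutoff of 3.0 (the midpoint of an assumed 1 to 5 scale) is a placeholder, as are the function and dictionary names. The intervention strings are condensed from the cell descriptions above.

```python
def fusion_cell(c: float, e: float, cutoff: float = 3.0) -> str:
    """Place a layer's (Competence, Conditions) pair into one of four cells.

    The cutoff is an assumed placeholder, not a published threshold.
    """
    high_c = c >= cutoff
    high_e = e >= cutoff
    if high_c and high_e:
        return "Full Capability"
    if high_c:
        return "Structural Constraint"
    if high_e:
        return "Untapped Potential"
    return "Double Deficit"


# Intervention families, condensed from the cell descriptions.
INTERVENTIONS = {
    "Structural Constraint": "resources, access, infrastructure, institutional permission",
    "Full Capability": "maintain rather than develop",
    "Double Deficit": "simultaneous investment in both, typically via embedded programs",
    "Untapped Potential": "training, mentorship, practice, exposure",
}

cell = fusion_cell(c=4.4, e=1.9)  # hypothetical layer scores
print(cell, "->", INTERVENTIONS[cell])
```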

The cross-instrument move

Whose conditions are creating whose constraints.

The same five-layer structure applies at individual, organizational, community, and funder scales. The instruments differ by persona because the competencies and conditions differ by context; the formalism is the same. That fractal structure makes systemic diagnosis possible.

When a funder and a representative sample of their grantees both take StackScore, the cohort analysis compares per-layer Conditions scores across roles. If the funder's Sustain-Conditions score is low (they don't fund multi-year general operating support) and the grantees' Sustain-Conditions scores are also low (their adaptive capacity isn't funded through operating budget), the diagnostic emits a structural-causation call: the grantees' Sustain constraint is plausibly downstream of the funder's practice. Gregory & Howard's (2009) nonprofit starvation cycle, made empirical at the layer level.

Four call types render: structural causation (both sides low: the conditions are creating the constraint), upstream-doing-its-part (funder high, grantee low: look elsewhere), downstream-overcoming (grantee high despite funder low: fragile but real), and enabling-pair (both high: the relationship works at this layer). This is the diagnostic move that distinguishes StackScore from every other adaptive-capacity instrument on the market.
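The four calls reduce to a lookup on two high/low judgments per layer. In the sketch below, the 3.0 cutoff and the use of the mean to summarize the grantee sample are assumptions. The text specifies neither, and the function name is hypothetical.

```python
from statistics import mean


def cohort_call(funder_e: float, grantee_es: list, cutoff: float = 3.0) -> str:
    """Map one layer's Conditions scores (funder vs. grantee sample) to a call type.

    Assumptions: grantee scores are summarized by their mean, and the
    high/low boundary is a placeholder cutoff.
    """
    funder_high = funder_e >= cutoff
    grantee_high = mean(grantee_es) >= cutoff
    if not funder_high and not grantee_high:
        return "structural causation"
    if funder_high and not grantee_high:
        return "upstream-doing-its-part"
    if not funder_high and grantee_high:
        return "downstream-overcoming"
    return "enabling-pair"


# Hypothetical Sustain-Conditions scores for one funder and three grantees.
print(cohort_call(funder_e=2.0, grantee_es=[2.5, 1.5, 2.0]))
```

Each call is a diagnostic hypothesis about the relationship at that layer, not a confirmed causal finding.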

What this instrument doesn't yet do

Honest about v1's limits.

StackScore v1.1 is designed to be validatable. The empirical work that turns "designed to be validatable" into "validated" hasn't happened yet. Paper 5 (FCI Measurement Instrument) is the forthcoming work that closes these gaps; Teen Takeover's Summer 2026 cohort generates the N=240+ data points.

What v1.1 does not yet have:

(1) Empirically confirmed factor structure. The five-factor model is theoretically derived, not yet validated by confirmatory factor analysis. The CFA in Paper 5 either confirms the structure or breaks the theory; that's the honest test.

(2) Measured reliability coefficients. Cronbach's α and composite reliability per factor are stated as targets, not measurements. Paper 5 measures them on the N=240+ cohort.

(3) Cross-instrument measurement invariance. The cohort diagnostic compares per-layer Conditions scores across role groups (Funder vs. grantees, Employer vs. employees) on the assumption that the instruments measure the same latent construct comparably. This assumption has not been tested. The structural-causation calls should be read as diagnostic hypotheses worth examining in partnership conversation, not confirmed causal claims. Paper 5 includes configural / metric / scalar invariance analysis.

(4) Criterion validity. Whether higher FCI scores actually predict better adaptive outcomes at Time 2 has not yet been tested. This requires longitudinal data collection and is a multi-year validation arc.

(5) Empirical calibration of the Fusion formula. v1.1 uses the geometric mean, √(C · E), based on theoretical consistency with Sen's framework. Whether the geometric mean produces more diagnostic separation than the minimum (or a calibrated weighted form) is an empirical question the data will answer.

Naming these gaps is a feature, not a bug. A maturity model that claims confident measurement on undefended weights is suspicious to a sophisticated buyer. An instrument that names its v1 limitations explicitly and ships fixes against them is the more trustworthy artifact.

Validity

The instrument earns the right to exist in the .30 to .60 band.

StackScore is designed to be related to but distinguishable from adjacent constructs. Convergent validity targets: Absorptive Capacity (Jansen, Van Den Bosch & Volberda 2005), Dynamic Capabilities (Teece 2007), Organizational Learning (Goh & Richards 1997), Innovation Climate (Ekvall 1996). Each factor should correlate with related constructs in the .30 to .60 range. Below that, the construct is unrelated. Above that, it's redundant.
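As a small illustration of the band logic, the check below sorts an observed correlation into the three readings described above. The function name and the example coefficients are hypothetical, and treating both endpoints as inside the band is an assumption.

```python
def convergent_band(r: float, low: float = 0.30, high: float = 0.60) -> str:
    """Classify a correlation with an adjacent construct against the target band.

    Endpoints are treated as inside the band (an assumption).
    """
    if r < low:
        return "unrelated"
    if r > high:
        return "redundant"
    return "related but distinct"


# Hypothetical correlations between one factor and three adjacent constructs.
for r in (0.18, 0.45, 0.72):
    print(r, convergent_band(r))
```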

Discriminant validity matters more. The five factors should be empirically distinguishable from each other — Signal capacity should be measurably distinct from Sense capacity, and so on. That's the test of the framework's structural claim, planned in the confirmatory factor analysis described in Paper 5 (FCI Measurement Instrument). The Teen Takeover Summer 2026 cohort generates the N=240+ data points for that validation.

Reliability targets are conventional: Cronbach's α ≥ .80 per factor, composite reliability ≥ .70, test-retest stability over a 4–6 week window. Items that fail these thresholds get revised in a subsequent release.
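For readers who want to see what the α target checks, here is a minimal sketch of the standard Cronbach's α computation, α = k/(k−1) · (1 − Σ item variances / variance of total scores). The response data are invented for illustration and are not StackScore results; the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(items: list) -> float:
    """Cronbach's alpha for one factor.

    items[j] holds every respondent's score on item j, so all inner
    lists must have the same length (one entry per respondent).
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent totals
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Invented responses: three items, five respondents.
factor_items = [
    [4, 3, 5, 2, 4],
    [4, 2, 5, 3, 4],
    [5, 3, 4, 2, 3],
]

alpha = cronbach_alpha(factor_items)
print(f"alpha = {alpha:.2f}, meets .80 target: {alpha >= 0.80}")
```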

References

The work it sits on top of.

Anderson, R., & Jones, C. E. (Eds.). (2016). Afrofuturism 2.0: The Rise of Astro-Blackness. Lexington Books.

Ansoff, H. I. (1975). Managing strategic surprise by response to weak signals. California Management Review, 18(2), 21–33.

Argyris, C., & Schön, D. A. (1978). Organizational Learning: A Theory of Action Perspective. Addison-Wesley.

Bell, W. (2003). Foundations of Futures Studies. Transaction Publishers.

Benjamin, R. (2019). Race After Technology: Abolitionist Tools for the New Jim Code. Polity Press.

brown, a. m. (2017). Emergent Strategy: Shaping Change, Changing Worlds. AK Press.

Choo, C. W. (2001). Environmental scanning as information seeking and organizational learning. Information Research, 7(1).

Ekvall, G. (1996). Organizational climate for creativity and innovation. European Journal of Work and Organizational Psychology, 5(1), 105–123.

Goh, S., & Richards, G. (1997). Benchmarking the learning capability of organizations. European Management Journal, 15(5), 575–583.

Goldratt, E. M. (1984). The Goal: A Process of Ongoing Improvement. North River Press.

Gregory, A. G., & Howard, D. (2009). The nonprofit starvation cycle. Stanford Social Innovation Review, 7(4), 49–53.

Inayatullah, S. (2004). The Causal Layered Analysis Reader. Tamkang University Press.

Jansen, J. J. P., Van Den Bosch, F. A. J., & Volberda, H. W. (2005). Managing potential and realized absorptive capacity. Academy of Management Journal, 48(6), 999–1015.

Meadows, D. H. (2008). Thinking in Systems: A Primer. Chelsea Green Publishing.

Miller, R. (Ed.). (2018). Transforming the Future: Anticipation in the 21st Century. Routledge/UNESCO.

Noble, S. U. (2018). Algorithms of Oppression: How Search Engines Reinforce Racism. NYU Press.

Nussbaum, M. C. (2011). Creating Capabilities: The Human Development Approach. Harvard University Press.

Ries, E. (2011). The Lean Startup. Crown Business.

Sen, A. (1999). Development as Freedom. Oxford University Press.

Simon, H. A. (1969). The Sciences of the Artificial. MIT Press.

Teece, D. J. (2007). Explicating dynamic capabilities. Strategic Management Journal, 28(13), 1319–1350.

van Dijk, J. A. G. M. (2005). The Deepening Divide: Inequality in the Information Society. Sage.

Weick, K. E. (1995). Sensemaking in Organizations. Sage Publications.

Try it

The methodology is the diagnostic.

Taking the assessment is the fastest way to see how this works. Free, 10-15 minutes, results unlock immediately.
