Capability = Competence × Enabling Conditions.
Amartya Sen's Development as Freedom (1999) and Martha Nussbaum's Creating Capabilities (2011) defined capability as the substantive freedom to function. Not the skill alone. Not the resources alone. The real opportunity to exercise the skill given the conditions you're actually inside of. The Fusion Architecture under StackScore operationalizes that distinction into one equation, per layer.
When you split each layer into a Competence signal (do you know how, do you practice it) and an Enabling Conditions signal (does your environment let you), you stop conflating two situations that look identical in a single-score model. The brilliantly trained nonprofit with no budget. The well-funded operation with no practice. They produce the same composite. They need opposite interventions. The C-E gap is the structural inequality, measured at the layer level. That is the equity thesis made empirically testable, a move no other adaptive-capacity instrument on the market is making.
For each layer i of the Stack we measure two signals and fuse them as a geometric mean: Capability_i = √(C_i · E_i). The geometric mean is theoretically consistent with Sen's framework: it goes to zero whenever either signal is zero (you cannot exercise skill without conditions, and you cannot use conditions without skill) and naturally penalizes divergence. A weighted additive form would have implied substitutability, which Sen explicitly rejects; v1.1 corrected that deviation from theory. Whether the geometric mean produces more diagnostic separation than the strictest interpretation (min(C, E)) is one of the empirical questions Paper 5 addresses.
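A minimal sketch of the three candidate fusion rules, offered as an illustration rather than the StackScore implementation. The function names, the 0-to-5 score scale, and the equal weights in the additive form are assumptions made for this example.

```python
import math


def fuse_geometric(c: float, e: float) -> float:
    """Geometric-mean fusion: sqrt(C * E). Zero if either signal is zero."""
    return math.sqrt(c * e)


def fuse_min(c: float, e: float) -> float:
    """Strictest interpretation: capability is capped by the weaker signal."""
    return min(c, e)


def fuse_additive(c: float, e: float, w: float = 0.5) -> float:
    """Weighted additive form (assumed equal weights); implies substitutability."""
    return w * c + (1 - w) * e


# Two hypothetical layers with the same arithmetic mean (assumed 0-5 scale).
balanced = (3.0, 3.0)   # competence and conditions aligned
divergent = (5.0, 1.0)  # strong competence, weak conditions

for label, (c, e) in (("balanced", balanced), ("divergent", divergent)):
    print(
        label,
        round(fuse_additive(c, e), 2),
        round(fuse_geometric(c, e), 2),
        round(fuse_min(c, e), 2),
    )
```

The additive form scores both layers identically, while the geometric mean and the minimum both score the divergent layer lower. How much lower is where the two rules differ, which is the separation question left to the data.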
Derived, not assembled.
Each layer sits on a distinct literature. Five factors because the construct is theoretically five-factor, not because exploratory analysis surfaced five clusters. The CFA proposed in Paper 5 either confirms the structure or breaks the theory. That's the honest test — and the reason this stands up to peer review when maturity-model competitors don't.
The diagnostic that matters isn't the average. It's the minimum.
The throughput of a system is governed by its weakest station. Eliyahu Goldratt named the principle in The Goal (1984); the Theory of Constraints is the operational expression. The StackScore overall score isn't a five-layer average. It's the minimum-layer capability — because that's what's actually capping the system.
A profile of 4.5 / 4.2 / 4.0 / 1.8 / 3.9 is not “generally strong with one weak spot.” It is a Sprint-constrained system producing far less adaptive value than its strong layers should allow. Insight piles up; nothing reaches first version. The Constraint Report leads with the binding layer, not the composite. The recommended actions address the constraint, not the strengths. Investing in the strong layers when one is broken is the most common failure mode of capacity-building work — and it's exactly what an average-based score would tell you to do.
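The profile above can be worked through in a few lines. This is a sketch under assumptions: the function name is hypothetical, and only the Sprint label comes from the text; the other four layers are given placeholder names because their order in the profile is not stated here.

```python
from typing import Dict, Tuple


def binding_constraint(capabilities: Dict[str, float]) -> Tuple[str, float]:
    """Return the layer with the lowest capability and its score."""
    layer = min(capabilities, key=capabilities.get)
    return layer, capabilities[layer]


# Placeholder names for every layer except Sprint (an assumption of this sketch).
profile = {
    "Layer 1": 4.5,
    "Layer 2": 4.2,
    "Layer 3": 4.0,
    "Sprint": 1.8,
    "Layer 5": 3.9,
}

constraint_layer, overall_score = binding_constraint(profile)
average_score = sum(profile.values()) / len(profile)

print("binding layer:", constraint_layer)
print("overall (minimum-layer) score:", overall_score)
print("five-layer average, for contrast:", round(average_score, 2))
```

The average comes out near 3.7 and reads as moderately healthy; the minimum reports 1.8 and names the layer that is capping the system.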
Not just where you're stuck. Why.
Each layer's C and E scores plot into one of four cells. The cell determines which kind of intervention will actually work.
The 2×2 is the diagnostic output. It tells you not just what's capping you but why, and therefore what kind of intervention will actually work. The Constraint Report's recommended actions are routed by the cell, not just the layer.
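One way to sketch the cell assignment, assuming a single cut-point separates "high" from "low" on both axes. The threshold of 3.0, the function name, and the descriptive cell labels are all assumptions for illustration; the instrument's own cut-points and cell names are not given in this section.

```python
def classify_cell(c: float, e: float, threshold: float = 3.0) -> str:
    """Place a layer's (C, E) pair into one of four cells.

    `threshold` is a hypothetical cut-point, not a published StackScore value.
    """
    high_c = c >= threshold
    high_e = e >= threshold
    if high_c and high_e:
        return "high C / high E"
    if high_c:
        return "high C / low E"   # skilled, but conditions do not allow it
    if high_e:
        return "low C / high E"   # conditions allow it, but practice is missing
    return "low C / low E"


# The two cases that a single composite score conflates:
trained_no_budget = classify_cell(c=4.5, e=1.5)
funded_no_practice = classify_cell(c=1.5, e=4.5)
print(trained_no_budget)
print(funded_no_practice)
```

Both example layers fuse to the same geometric mean, yet they land in different cells, which is why the recommended actions are routed by the cell rather than by the fused score alone.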
Whose conditions are creating whose constraints.
The same five-layer structure applies at individual, organizational, community, and funder scales. The instruments differ by persona because the competencies and conditions differ by context; the formalism is the same. That fractal structure makes systemic diagnosis possible.
When a funder and a representative sample of their grantees both take StackScore, the cohort analysis compares per-layer Conditions scores across roles. If the funder's Sustain-Conditions score is low (they don't fund multi-year general operating support) AND the grantees' Sustain-Conditions scores are also low (their adaptive capacity isn't funded through operating budget), the diagnostic emits a structural-causation call: the grantees' Sustain constraint is downstream of the funder's practice. Gregory & Howard's (2009) nonprofit starvation cycle, made empirical at the layer level.
Four call types render: structural causation (both sides low: the conditions are creating the constraint), upstream-doing-its-part (funder high, grantee low: look elsewhere), downstream-overcoming (grantee high despite funder low: fragile but real), and enabling-pair (both high: the relationship works at this layer). This is the diagnostic move that distinguishes StackScore from every other adaptive-capacity instrument on the market.
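The four call types can be sketched as a small decision rule over one layer's Conditions scores. The threshold of 3.0, the use of the grantee mean as the cohort summary, and the function name are assumptions of this sketch; the call labels are taken from the text.

```python
from statistics import mean
from typing import Sequence


def cohort_call(
    funder_conditions: float,
    grantee_conditions: Sequence[float],
    threshold: float = 3.0,
) -> str:
    """Emit a call type for one layer from funder and grantee Conditions scores.

    Assumes "high" means at or above `threshold` and summarizes the grantee
    sample by its mean; both choices are hypothetical.
    """
    funder_high = funder_conditions >= threshold
    grantee_high = mean(grantee_conditions) >= threshold
    if funder_high and grantee_high:
        return "enabling-pair"
    if funder_high:
        return "upstream-doing-its-part"
    if grantee_high:
        return "downstream-overcoming"
    return "structural causation"


# Hypothetical Sustain-Conditions scores for one funder and three grantees.
print(cohort_call(2.0, [1.5, 2.5, 2.0]))
```

A call produced this way is a prompt for a partnership conversation, not a causal finding, since the comparability of funder and grantee scores has not yet been tested.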
Honest about v1's limits.
StackScore v1.1 is designed to be validatable. The empirical work that turns "designed to be validatable" into "validated" hasn't happened yet. Paper 5 (FCI Measurement Instrument) is the forthcoming work that closes these gaps; Teen Takeover's Summer 2026 cohort generates the N=240+ data points.
What v1.1 does not yet have:
(1) Empirically confirmed factor structure. The five-factor model is theoretically derived, not yet validated by confirmatory factor analysis. The CFA in Paper 5 either confirms the structure or breaks the theory; that's the honest test.
(2) Measured reliability coefficients. Cronbach's α and composite reliability per factor are stated as targets, not measurements. Paper 5 measures them on the N=240+ cohort.
(3) Cross-instrument measurement invariance. The cohort diagnostic compares per-layer Conditions scores across role groups (Funder vs. grantees, Employer vs. employees) on the assumption that the instruments measure the same latent construct comparably. This assumption has not been tested. The structural-causation calls should be read as diagnostic hypotheses worth examining in partnership conversation, not confirmed causal claims. Paper 5 includes configural / metric / scalar invariance analysis.
(4) Criterion validity. Whether higher FCI scores actually predict better adaptive outcomes at Time 2 has not yet been tested. This requires longitudinal data collection and is a multi-year validation arc.
(5) Empirical calibration of the Fusion formula. v1.1 uses the geometric mean, √(C · E), based on theoretical consistency with Sen's framework. Whether the geometric mean produces more diagnostic separation than the minimum (or a calibrated weighted form) is an empirical question the data will answer.
Naming these gaps is a feature, not a bug. A maturity model that claims confident measurement on undefended weights is suspicious to a sophisticated buyer. An instrument that names its v1 limitations explicitly and ships fixes against them is the more trustworthy artifact.
The instrument earns the right to exist in the .30 to .60 band.
StackScore is designed to be related to but distinguishable from adjacent constructs. Convergent validity targets: Absorptive Capacity (Jansen, Van Den Bosch & Volberda 2005), Dynamic Capabilities (Teece 2007), Organizational Learning (Goh & Richards 1997), Innovation Climate (Ekvall 1996). Each factor should correlate with related constructs in the .30 to .60 range. Below that, the construct is unrelated. Above that, it's redundant.
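The band check itself is simple enough to sketch. The function name is hypothetical, and treating the band as inclusive and applying it to the absolute value of the correlation are assumptions of this example.

```python
def convergent_band(r: float, low: float = 0.30, high: float = 0.60) -> str:
    """Classify a factor-to-construct correlation against the target band.

    Uses |r| and inclusive bounds; both are assumptions of this sketch.
    """
    magnitude = abs(r)
    if magnitude < low:
        return "unrelated"
    if magnitude > high:
        return "redundant"
    return "in band"


# Hypothetical correlations, for illustration only (not measured values).
for r in (0.12, 0.45, 0.78):
    print(r, convergent_band(r))
```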
Discriminant validity matters more. The five factors should be empirically distinguishable from each other — Signal capacity should be measurably distinct from Sense capacity, and so on. That's the test of the framework's structural claim, planned in the confirmatory factor analysis described in Paper 5 (FCI Measurement Instrument). The Teen Takeover Summer 2026 cohort generates the N=240+ data points for that validation.
Reliability targets are conventional: Cronbach's α ≥ .80 per factor, composite reliability ≥ .70, test-retest stability over a 4-6 week window. Items that fail these thresholds get revised in v1.1.
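For readers who want to see what the α target is checked against, here is a sketch of the standard Cronbach's α computation, α = k/(k−1) · (1 − Σ item variances / variance of total scores). The function name and the response data are invented for illustration; no StackScore responses are shown.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for one factor.

    `responses` holds one row per respondent and one column per item.
    Uses sample variances throughout.
    """
    k = len(responses[0])                      # number of items in the factor
    item_columns = list(zip(*responses))       # transpose to one tuple per item
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Toy data (invented): five respondents answering a four-item factor.
toy_responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 5, 4],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
]

alpha = cronbach_alpha(toy_responses)
meets_target = alpha >= 0.80  # the per-factor target stated above
print(round(alpha, 3), meets_target)
```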
The work it sits on top of.
Anderson, R. (2016). Afrofuturism 2.0: The Rise of Astro-Blackness. Lexington Books.
Ansoff, H. I. (1975). Managing strategic surprise by response to weak signals. California Management Review, 18(2), 21–33.
Argyris, C., & Schön, D. A. (1978). Organizational Learning: A Theory of Action Perspective. Addison-Wesley.
Bell, W. (2003). Foundations of Futures Studies. Transaction Publishers.
Benjamin, R. (2019). Race After Technology: Abolitionist Tools for the New Jim Code. Polity Press.
brown, a. m. (2017). Emergent Strategy: Shaping Change, Changing Worlds. AK Press.
Choo, C. W. (2001). Environmental scanning as information seeking and organizational learning. Information Research, 7(1).
Ekvall, G. (1996). Organizational climate for creativity and innovation. European Journal of Work and Organizational Psychology, 5(1), 105–123.
Goh, S., & Richards, G. (1997). Benchmarking the learning capability of organizations. European Management Journal, 15(5), 575–583.
Goldratt, E. M. (1984). The Goal: A Process of Ongoing Improvement. North River Press.
Gregory, A. G., & Howard, D. (2009). The nonprofit starvation cycle. Stanford Social Innovation Review, 7(4), 49–53.
Inayatullah, S. (2004). The Causal Layered Analysis Reader. Tamkang University Press.
Jansen, J. J. P., Van Den Bosch, F. A. J., & Volberda, H. W. (2005). Managing potential and realized absorptive capacity. Academy of Management Journal, 48(6), 999–1015.
Meadows, D. H. (2008). Thinking in Systems: A Primer. Chelsea Green Publishing.
Miller, R. (Ed.). (2018). Transforming the Future: Anticipation in the 21st Century. Routledge/UNESCO.
Noble, S. U. (2018). Algorithms of Oppression: How Search Engines Reinforce Racism. NYU Press.
Nussbaum, M. C. (2011). Creating Capabilities: The Human Development Approach. Harvard University Press.
Ries, E. (2011). The Lean Startup. Crown Business.
Sen, A. (1999). Development as Freedom. Oxford University Press.
Simon, H. A. (1969). The Sciences of the Artificial. MIT Press.
Teece, D. J. (2007). Explicating dynamic capabilities. Strategic Management Journal, 28(13), 1319–1350.
van Dijk, J. A. G. M. (2005). The Deepening Divide: Inequality in the Information Society. Sage.
Weick, K. E. (1995). Sensemaking in Organizations. Sage Publications.