ACQUINT
Methodology · v1.0 · 2026-04-26

Every number on this site is citable.

This page is the authoritative methodology reference for every metric exposed by the Acquint API. Anchored URL fragments (#scope, #overhead-v03, …) are stable; in-product source pills deep-link here. Renaming an anchor is a breaking change.

The dbt semantic layer is the source of truth for SQL; this document is the source of truth for definitional intent. When the two diverge, this page is wrong and gets updated; the SQL is what runs. See the canonical DEFINITIONS.md on GitHub for the markdown source.

Table of contents
  1. Scope
  2. Fiscal year
  3. Overhead (v0.3)
  4. Parent rollup
  5. Cross-channel (civilian-awarded)
  6. Sub-awards
  7. Cascade
  8. DATA Act / OC clause
  9. Reconciliation tolerance
  10. Known data-quality gaps
  11. Source roles
  12. OC compliance trajectory
  13. Caveats summary
  14. Versioning
Section 1

Scope#

The Acquint warehouse covers DoD-funded contract obligations, sub-awards under those contracts, IDV parents, and the document corpus that contextualizes the spend (J-Books, GAO reports, OIG reports, CRS products, Federal Register notices). We distinguish three terms that the field uses interchangeably and we do not:

  • DoD-awarded — the awarding toptier agency is 097. The FPDS prime-contract picture most people mean when they say “DoD spending.” Source: silver.contracts, ingested from the USAspending FPDS monthly archive ZIP for agency 097.
  • DoD-funded — the funding toptier agency is 097, regardless of who awarded the contract. Includes contracts awarded by civilian agencies (DoE national labs, GSA Schedules, Interior GovWorks, NASA) using DoD money. Source: silver.contracts_civilian_awarded, via the USAspending search API.
  • DoD-related — anything in the broader defense-industrial ecosystem (intelligence community, DOE/NNSA, VA medical procurement). We do not cover this. Out of scope.

We ingest exactly five USAspending source slices, registered in pipeline/usaspending/sources.py:

  1. DOD_PRIME_CONTRACTS — DoD-awarded prime contracts (FPDS archive ZIP, agency 097)
  2. DOD_ASSISTANCE — DoD-awarded grants/cooperative agreements
  3. DOD_IDV_PARENTS — DoD-awarded IDV parent records
  4. DOD_SUB_AWARDS — sub-awards under DoD-funded primes (search API)
  5. DOD_FUNDED_CIVILIAN_AWARDED — DoD-funded prime contracts awarded by civilian agencies
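The slice registry can be sketched in Python. This is a minimal sketch, not the real registry in pipeline/usaspending/sources.py: the (channel, description) payloads are illustrative, the channel tags for DOD_ASSISTANCE and DOD_IDV_PARENTS are assumptions, and the helper property is invented for the example.

```python
from enum import Enum

class UsaSpendingSource(Enum):
    """The five USAspending source slices. Member names mirror the list above;
    payloads are illustrative, and the channel tags for DOD_ASSISTANCE and
    DOD_IDV_PARENTS are assumptions."""
    DOD_PRIME_CONTRACTS = ("fpds_archive_zip", "DoD-awarded prime contracts, agency 097")
    DOD_ASSISTANCE = ("fpds_archive_zip", "DoD-awarded grants/cooperative agreements")
    DOD_IDV_PARENTS = ("fpds_archive_zip", "DoD-awarded IDV parent records")
    DOD_SUB_AWARDS = ("search_api", "sub-awards under DoD-funded primes")
    DOD_FUNDED_CIVILIAN_AWARDED = ("search_api", "DoD-funded primes awarded by civilian agencies")

    @property
    def via_search_api(self) -> bool:
        # Search-API slices are subject to the 10K-row page cap.
        return self.value[0] == "search_api"
```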

What we explicitly do not cover: classified spending (no publisher exposes it); foreign-government awards; civilian-agency contracting unrelated to DoD; military pay and civilian salary equivalents; and anything that would require sub-tier supply-chain disclosure beyond level-2 (see #cascade).

Section 2

Fiscal year#

The federal fiscal year runs October 1 of the previous calendar year through September 30 of the named year. FY2024 = 2023-10-01 through 2024-09-30. We use the federal definition exclusively — never calendar year, contractor fiscal year, or congressional appropriations cycles.

FPDS archive ZIPs carry an authoritative fiscal_year column we pass through unchanged. USAspending search-API rows often do not — for sub-awards and civilian-channel awards we derive FY from action_date: if the month is October–December, FY = calendar year + 1; otherwise FY = calendar year. This matches OMB / Treasury / GAO / CBO conventions.
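The derivation is a one-line function. A sketch (the pipeline's actual helper name may differ):

```python
from datetime import date

def fiscal_year(action_date: date) -> int:
    """Derive the federal fiscal year from an action date:
    October-December rows belong to the NEXT calendar year's FY
    (OMB / Treasury / GAO / CBO convention)."""
    return action_date.year + 1 if action_date.month >= 10 else action_date.year
```

So 2023-10-01 and 2024-09-30 both land in FY2024, matching the FY2024 boundary definition above.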

Action date in the future. Cancellations and corrections sometimes carry a future action_date. We do not filter them — they appear in the FY of the published action date. FPDS publishes them; we surface them.

Lag. DoD has a 90-day FPDS publication lag. The most recent fiscal quarter's numbers continue to firm up for ~90 days post-action. Every contract metric carries DOD_LAG_CAVEAT. The monthly snapshot cadence (Spec 002.5 FR-1) preserves the trajectory so a journalist can see how a quarter firmed up.

Section 3

Overhead (v0.3)#

“Overhead” has no single legal or accounting definition in federal contracting. To DCAA and FAR 31.203, it means indirect costs allocated to a cost objective. In the OMB Object Class taxonomy, certain OC codes capture overhead-like spending. Journalists usually mean “money spent on bureaucracy and consultants instead of warfighting.” A defensible product offers multiple definitions, lets the user pick, and explains the difference.

We compute three tiers, all PSC-only as of v0.3 (the OC clause was dropped — see #data-act-clause). The definitions live in dbt/models/silver/contracts.sql:

c.product_or_service_code = 'R408' AS is_overhead_conservative,
c.product_or_service_code IN ('R408','R499','R425','D302','D399') AS is_overhead_mainstream,
starts_with(coalesce(c.product_or_service_code, ''), 'R')
    OR starts_with(coalesce(c.product_or_service_code, ''), 'D') AS is_overhead_aggressive
Conservative
R408 only

Program Management/Support Services. Captures only what GAO, CSIS, and POGO most clearly recognize as program-management overhead. Most defensible against challenge; understates total.

Mainstream (default)
R408, R499, R425, D302, D399

Program management plus professional services plus consulting-coded IT work. Aligns with most journalist and Hill-staff usage. Reconciles to GAO-19-383SP and CSIS Defense Budget Analysis service-contract reports.

Aggressive
Any R or D PSC

Every PSC starting with R (Support) or D (IT). Maximalist; includes legitimate operational support. Approximates the upper bound POGO publishes.
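The three tiers port directly to Python. A sketch with the same semantics as the dbt SQL above (the function name is illustrative):

```python
from typing import Optional

OVERHEAD_MAINSTREAM_PSCS = {"R408", "R499", "R425", "D302", "D399"}

def overhead_flags(psc: Optional[str]) -> dict:
    """Compute the three silver.contracts overhead tier flags for one PSC."""
    code = psc or ""  # mirror the SQL's coalesce(psc, '')
    return {
        "conservative": code == "R408",                  # R408 only
        "mainstream": code in OVERHEAD_MAINSTREAM_PSCS,  # default tier
        "aggressive": code.startswith(("R", "D")),       # any R or D PSC
    }
```

Note the tiers nest: every conservative row is mainstream, and every mainstream row is aggressive.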

dod_overhead_obligations_mainstream

v0.3
/api/v1/metrics/dod_overhead_obligations_mainstream/v0.3

What it measures: Total DoD-awarded prime obligations on PSCs in {R408, R499, R425, D302, D399} for the requested fiscal year.

Formula
SUM(federal_action_obligation)
WHERE is_overhead_mainstream = TRUE
  AND fiscal_year = :fy
Known caveats
  • Overhead is computed using the mainstream PSC-only definition. Other definitions yield different totals.
  • Includes contract obligations only; civilian salary equivalents and military pay are excluded.
  • OC clause was dropped in v0.3 because DATA Act compliance regressed from 85% (FY19) to 28% (FY25).

dod_overhead_obligations_mainstream_combined

v0.1
/api/v1/metrics/dod_overhead_obligations_mainstream_combined/v0.1

What it measures: UNION of DoD-awarded primes plus DoD-funded civilian-channel awards on the mainstream PSC set. Surfaces the consulting flow that DoD-only views miss.

Formula
UNION ALL
  silver.contracts (channel='dod_awarded')
  silver.contracts_civilian_awarded (channel='civilian_awarded')
WHERE is_overhead_mainstream = TRUE
Known caveats
  • Civilian-channel rows lack object_classes_funding_this_award and cannot contribute to OC linkage compliance metrics.
  • FY17 and FY18 GSA chunks have ~50–80% coverage due to the search API's 10K-row page cap.
Section 4

Parent rollup#

Conglomerates register multiple UEIs. Lockheed Martin Corporation has at least 10 active UEIs (LM Aeronautics, LM Space, Sikorsky, etc.). A naive GROUP BY recipient_uei query for “top vendors FY24” will scatter Lockheed across 10 rows. This is the #1 thing every incumbent dashboard gets wrong.

We collapse subsidiary UEIs into their parent UEI via silver.entity_lineage, built from the USAspending recipient API:

COALESCE(el.parent_uei, c.recipient_uei) AS rollup_uei,
COALESCE(el.parent_name, upper(trim(c.recipient_name))) AS rollup_name,

silver.sub_awards mirrors the pattern with subawardee_rollup_uei / subawardee_rollup_name. The _rollup metric variants GROUP BY rollup_uei; non-rollup variants are preserved for backwards compatibility.

Coverage caveat. Backfill is partial — approximately 17% of UEIs have a resolved parent. The COALESCE means unresolved UEIs fall back to their raw UEI (graceful degradation), not NULL. Backfill priority is row-volume-DESC, so Lockheed, Boeing, Raytheon/RTX, Northrop, Booz Allen, and General Dynamics resolve first. Every _rollup metric carries UEI_ROLLUP_PARTIAL_CAVEAT.
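In Python terms, the COALESCE fallback looks like this. A sketch only: the `lineage` dict stands in for the silver.entity_lineage join, and the function name is illustrative.

```python
from typing import Dict, Tuple

def rollup_identity(recipient_uei: str, recipient_name: str,
                    lineage: Dict[str, Tuple[str, str]]) -> Tuple[str, str]:
    """Return (rollup_uei, rollup_name): the parent identity when lineage
    resolved it, otherwise the raw UEI/name (graceful degradation, never NULL)."""
    if recipient_uei in lineage:
        return lineage[recipient_uei]
    # Unresolved UEI: fall back to the raw identity, normalized like the SQL.
    return recipient_uei, recipient_name.strip().upper()
```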

Why it matters: in FY24, the raw-UEI top-vendors view scatters Lockheed across roughly 10 rows totaling ~$50B; the rollup view collapses these into a single ~$50B row, surfacing the actual concentration ranking that matters for HHI and antitrust analysis.

top_vendors_by_obligation_rollup

v0.1
/api/v1/metrics/top_vendors_by_obligation_rollup/v0.1

What it measures: Top-N vendors by total DoD-awarded prime obligations for the requested fiscal year, grouped by rollup_uei (parent UEI when known, raw UEI otherwise).

Formula
SELECT rollup_uei, rollup_name, SUM(federal_action_obligation) AS total
FROM silver.contracts
WHERE fiscal_year = :fy
GROUP BY rollup_uei, rollup_name
ORDER BY total DESC LIMIT :limit
Known caveats
  • Parent-UEI lineage backfill is partial (~17% resolved). Unresolved UEIs fall back to their raw UEI.
  • Backfill prioritizes the largest conglomerates first; long-tail vendors may still appear under their raw UEI.
Section 5

Cross-channel (civilian-awarded)#

A non-trivial fraction of DoD obligations flow through awards made by civilian agencies. DoE national labs (Oak Ridge, LLNL, Los Alamos, Sandia) execute substantial DoD-funded research. GSA awards on behalf of DoD via Multiple Award Schedules. Interior's GovWorks operates as an interagency awarding shop. NASA executes joint programs.

These awards are funded by DoD but not awarded by DoD. They never appear in the DoD FPDS archive ZIP (filtered to awarding_toptier_agency_code = 097). An incumbent dashboard that shows only the DoD archive misses them. We estimate ~$13B/year of DoD obligations flow through these civilian channels in recent fiscal years.

We ingest this slice via the search API with filters.agencies = [{type: funding, name: DoD}] and the awarding-agency filter intentionally not constrained. The result lands in silver.contracts_civilian_awarded with a schema that mirrors silver.contracts exactly so the two UNION cleanly:

WITH combined AS (
    SELECT fiscal_year, federal_action_obligation, 'dod_awarded' AS channel
    FROM main.contracts
    UNION ALL
    SELECT fiscal_year, federal_action_obligation, 'civilian_awarded' AS channel
    FROM main.contracts_civilian_awarded
)

Why this matters editorially: services-coded overhead concentration in the civilian-awarded channel runs 30–50% of obligations, vs ~10–15% in the DoD-awarded channel. Civilian-agency vehicles (especially GSA Schedules) are where consulting flows. A DoD-only view of “overhead” undercounts consulting spend by a multiple.

Section 6

Sub-awards#

Sub-awards are the FFATA File F layer: amounts that prime contractors disburse to lower-tier vendors under the prime award. We ingest via the search API with sub_award_types = [procurement, grant] and agencies.funding = DoD. Output: silver.sub_awards.

PSC codes arrive from the search API as a JSON-blob string like '{"code":"R408","description":"…"}' rather than the bare PSC code FPDS uses. We decode in the silver model (commit 293d48a):

COALESCE(
    json_extract_string(s.product_or_service_code, '$.code'),
    nullif(s.product_or_service_code, '')
) AS psc_code_decoded

After decoding, the same is_overhead_conservative / mainstream / aggressive flags from the prime-contract layer apply, so sub-awards roll up alongside primes in any cross-grain analysis.
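The same decode logic, sketched in Python (a mirror of the SQL above, not the pipeline's actual code):

```python
import json
from typing import Optional

def decode_psc(raw: Optional[str]) -> Optional[str]:
    """Search-API rows carry a JSON blob like '{"code":"R408","description":"..."}';
    FPDS rows carry the bare PSC code. Return the bare code either way."""
    if not raw:
        return None  # mirror the SQL's nullif(psc, '')
    try:
        obj = json.loads(raw)
        if isinstance(obj, dict) and obj.get("code"):
            return obj["code"]
    except json.JSONDecodeError:
        pass
    return raw  # already a bare PSC code
```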

Filters. We drop rows where subaward_amount > $1B as known data-entry errors. Empirical probes found rows of $39B+ that are obviously misformatted. The threshold is generous; legitimate single sub-award actions of that size are vanishingly rare.

top_subawardees_by_obligation

v0.1
/api/v1/metrics/top_subawardees_by_obligation/v0.1

What it measures: Top-N subawardees by sub-award amount for the requested fiscal year, with the rollup variant grouping by subawardee_rollup_uei.

Known caveats
  • GAO has flagged FFATA Sub-award Reporting (FSRS) data quality. Silver layer filters known-bad rows above $1B but data quality remains a published concern.
  • subawardee_parent_name is not exposed by the search API; subawardee parent rollup runs through the same entity_lineage join as primes.
Section 7

Cascade#

The cascade — /api/v1/cascade/{uei} — is the iceberg view, parent-collapsed. For a single vendor (rolled up to its parent UEI), it returns:

  1. dod_direct_prime — DoD-awarded prime contracts where any of the parent's UEIs is the recipient.
  2. civilian_channel_awards — DoD-funded contracts awarded by civilian agencies where any of the parent's UEIs is the recipient.
  3. sub_awards_received — sub-awards where any of the parent's UEIs is the subawardee (money flowing in).
  4. sub_awards_paid_out — sub-awards where any of the parent's UEIs is the prime (money flowing out; includes the top 5 subawardees).

total_dod_funded_exposure_usd = (1) + (2). Sub-award flows (3) and (4) are not added to the total because they are already inside the prime aggregates — adding them double-counts. Surfacing them separately is the point: it shows where the prime's money lands.
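The double-counting rule reduces to a two-term sum. A sketch (the key names mirror the four cascade components; the dict shape is illustrative, not the actual response schema):

```python
from typing import Dict

def total_dod_funded_exposure(cascade: Dict[str, float]) -> float:
    """Exposure = primes + civilian-channel awards. Sub-award flows are
    surfaced separately but excluded: they already sit inside the prime
    aggregates, so adding them would double-count."""
    return cascade["dod_direct_prime"] + cascade["civilian_channel_awards"]
```

For example, a vendor with $40B in primes, $3B in civilian-channel awards, and $12B paid out to subs has $43B of exposure, not $55B.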

The deep-cascade endpoint (/api/v1/cascade/{uei}/deep) drills one more level: for each top-N level-1 subawardee, return that level-1 vendor's total outgoing-sub flow plus their top-N level-2 subs.

Critical caveat: level-2 outgoing flows are NOT scoped to the original prime's funding. FFATA/FSRS does not propagate prime provenance through hops, so once you go from prime → sub, you cannot ask “of this sub's outgoing flow, what fraction traces back to the original prime?” The level-1 sub's TOTAL outgoing is what we surface. The cascade response makes this explicit.

We cap depth at 2 in v0.1 to match data quality. Adding level-3+ would compound the provenance-loss problem and produce numbers that look authoritative but aren't. When a publisher shipping level-3 disclosures emerges, we'll add it.

Section 8

DATA Act / OC clause#

The Digital Accountability and Transparency Act of 2014 (Pub. L. 113-101) requires every federal award to be linked to the financial accounts (Treasury Account Symbol, Object Class) that funded it. The linkage is published on USAspending in the object_classes_funding_this_award field, which carries entries like "25.1: Advisory services; 31.0: Equipment".

DoD's compliance has regressed dramatically:

Fiscal year   Row coverage   Dollar coverage
FY19          ~85%           ~85%
FY25          ~28%           ~28%

dod_oc_linkage_compliance_pct (and the per-sub-agency variant dod_oc_linkage_compliance_by_sub_agency) surface this trajectory directly. The metric is itself a finding, not a footnote. It is the single most legible signal of DATA Act non-compliance currently visible.

This regression is also why v0.3 overhead is PSC-only. The v0.2 mainstream definition included an OC in ('25.1','25.2') clause as a logical OR with the PSC clause. As OC compliance dropped through FY20–FY24, the OC clause's contribution became increasingly noisy: rows that should have included the OC component had NULL object_classes_funding_this_award, so they fell out of “overhead.” We dropped the OC clause in v0.3 (Spec 002 FR-7) so the metric trajectory reflects actual contracting patterns rather than a publisher data-quality regression.

dod_oc_linkage_compliance_pct

v0.1
/api/v1/metrics/dod_oc_linkage_compliance_pct/v0.1

What it measures: Per-fiscal-year share of DoD-awarded prime contract rows (and dollars) that successfully link to an object_classes_funding_this_award entry, per the DATA Act §3 requirement.

Formula
SELECT fiscal_year,
       COUNT(*) FILTER (WHERE object_classes_funding_this_award IS NOT NULL)
         / COUNT(*)::DOUBLE AS row_coverage_pct,
       SUM(federal_action_obligation) FILTER (WHERE object_classes_funding_this_award IS NOT NULL)
         / SUM(federal_action_obligation) AS dollar_coverage_pct
FROM silver.contracts
GROUP BY fiscal_year
Known caveats
  • Compliance regressed from ~85% (FY19) to ~28% (FY25). The metric trajectory IS the finding.
  • Civilian-channel rows lack this field by construction (search API does not expose it) and are excluded from this metric.
Section 9

Reconciliation tolerance#

Every contract topline metric is dbt-tested against a published comparable within ±0.5%. The test name is reconcile_per_fy_topline. Failure blocks the dbt build; a metric whose number cannot be reconciled to within 0.5% of the publisher's own topline does not ship.

The comparable values live in dbt/seeds/gao_topline_dod_contracts.csv:

fiscal_year,gao_topline_usd,source_url,source_note
2024,445100000000,https://www.gao.gov/blog/snapshot-government-wide-contracting-fy-2024-interactive-dashboard,GAO FY2024 Snapshot...
2025,491600000000,https://www.usaspending.gov/agency/department-of-defense,USAspending agency profile FY2025...

The seed is the source of truth. When GAO publishes a new Snapshot or updates an FY estimate, we update the seed and the dbt test re-runs. The Source model on every metric response includes a typed tolerance_test entry pointing back here so a journalist can see which test protects the figure.

The 0.5% tolerance is calibrated to the publisher's own re-statement noise. The current FY24 reconciliation runs at ~0.19% (we report $445.95B against GAO's $445.1B). Anything > 0.5% blocks the build.

The API populates ReconciliationStatus inline on every metric response narrowed to a single FY with a known comparable. The journalist sees our_value, comparable_value, delta_pct, tolerance_pct, tolerance_test, and the publisher URL — all on the response, no extra clicks.
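The ReconciliationStatus computation is simple enough to sketch. A minimal sketch using the field names described above (the function itself and the rounding are illustrative, not the API's implementation):

```python
from typing import Dict

def reconciliation_status(our_value: float, comparable_value: float,
                          tolerance_pct: float = 0.5) -> Dict:
    """Compare our figure to the published comparable and report the delta
    against the ±0.5% build-blocking tolerance."""
    delta_pct = abs(our_value - comparable_value) / comparable_value * 100
    return {
        "our_value": our_value,
        "comparable_value": comparable_value,
        "delta_pct": round(delta_pct, 2),
        "tolerance_pct": tolerance_pct,
        "within_tolerance": delta_pct <= tolerance_pct,
    }
```

Plugging in the FY24 figures above ($445.95B vs GAO's $445.1B) yields a delta of ~0.19%, well inside tolerance.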

dod_total_obligations

v0.1
/api/v1/metrics/dod_total_obligations/v0.1

What it measures: Total DoD-awarded prime contract obligations for the requested fiscal year, reconciled against the GAO/USAspending published topline.

Formula
SELECT fiscal_year, SUM(federal_action_obligation) AS total_obligations_usd
FROM silver.contracts
WHERE fiscal_year = :fy
GROUP BY fiscal_year
Known caveats
  • DoD has a 90-day FPDS publication lag — most-recent-quarter figures continue to firm up.
  • The FPDS key (piid, modification_number, transaction_number) is not unique. We do not deduplicate; the bronze sum is the truth.
Section 10

Known data-quality gaps#

Real ones, named:

Parent-UEI rollup coverage <20%

Lineage backfill is partial. Coalesce-to-raw-UEI fallback means unresolved UEIs appear under their own UEI rather than NULL, but a long-tail vendor may still scatter across multiple subsidiary UEIs until backfill catches its parent. Backfill prioritizes the largest conglomerates first.

Civilian-channel sub-awards lack OC linkage

The USAspending search API does not return object_classes_funding_this_award. silver.contracts_civilian_awarded carries NULL for that field by construction. Civilian-channel rows cannot contribute to OC linkage compliance metrics.

FY17 and FY18 GSA partial coverage

The search API enforces a 10K-row page cap. For high-volume agencies in early fiscal years (notably GSA in FY17 and FY18), some chunks exceeded the cap at our chunking strategy's resolution, and we landed an estimated 50–80% of the rows. Subsequent backfills are tightening this; the gap is documented per-chunk in the manifest.

USAspending search-API 10K-row cap

Drives our chunking strategies. Every search-API ingest is chunked along sub-agency × month or sub-agency × calendar quarter. Chunk boundaries are recorded in the bronze manifest (r2://$R2_BUCKET/raw/usaspending/_manifest.jsonl) so a re-fetch can target the exact slice that needs refresh.
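The chunk generation is mechanical. A sketch of the sub-agency × month strategy (function name and dict shape are illustrative; real chunk records live in the bronze manifest):

```python
from itertools import product
from typing import Dict, List

def month_chunks(sub_agencies: List[str], fiscal_year: int) -> List[Dict]:
    """Emit one chunk spec per (sub-agency, month) pair across a federal FY:
    October of the prior calendar year through September of the named year."""
    months = [(fiscal_year - 1, m) for m in range(10, 13)] + \
             [(fiscal_year, m) for m in range(1, 10)]
    return [{"sub_agency": sa, "year": y, "month": m}
            for sa, (y, m) in product(sub_agencies, months)]
```

Each chunk must stay under the 10K-row cap; when a month still overflows, the fix is a finer slicing, not pagination past the cap.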

DoD's 90-day FPDS publication lag

The most recent fiscal quarter's numbers continue to firm up for ~90 days post-action as agencies report. Every contract metric carries this caveat. Snapshot cadence preserves the trajectory so a journalist can see how a quarter firmed up over time.

FPDS keys are not unique

When (piid, modification_number, transaction_number) repeats in a single archive, the rows represent separate accounting actions that happen to share an identifier (a published FPDS quirk). We do not deduplicate. Empirically: bronze FY17 = $321B; with naive dedupe by last_modified_date, FY17 silver collapsed to $248B (-23%). Bronze sum is the truth.
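The FY17 collapse can be reproduced in miniature: deduping on the repeated key keeps one accounting action per key and silently drops the rest. Toy rows below (the real effect was the -23% swing above); the function is illustrative.

```python
from typing import Dict, List, Tuple

def naive_dedupe_loss(rows: List[Dict]) -> Tuple[float, float]:
    """Return (bronze_total, naive_deduped_total). Repeated FPDS keys are
    separate accounting actions, so one-row-per-key drops real obligations."""
    total = sum(r["obligation"] for r in rows)
    deduped = {(r["piid"], r["mod"], r["txn"]): r for r in rows}  # last row per key wins
    return total, sum(r["obligation"] for r in deduped.values())
```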

Sub-award amount data-entry errors

GAO has flagged FSRS data quality. We filter rows where subaward_amount > $1B (probe found $39B+ rows that are obviously misformatted). The $1B threshold is generous; legitimate single sub-award actions of that size are vanishingly rare.

DATA Act compliance regression

Already covered in #data-act-clause. Worth re-stating: the publisher's own data quality regressed by 57 percentage points FY19–FY25 on the OC linkage. We surface the trajectory; we do not paper over it.

Section 11

Source roles — citation as contract#

Every API response carries a typed Source[] array. Each Source has a role that tells the journalist why we cited it. The roles are typed, finite, and load-bearing: an editor can ask “does this metric have a tolerance_test source?” and the answer is always yes/no, never “sort of.”

primary

The publisher dataset the number was actually computed from. USAspending FPDS archive ZIPs, the search API, the recipient API, FFATA File F. If you want to verify our number, this URL is where you go to pull the raw rows.

reconciliation_comparable

An independently-published topline our number was tested against. GAO Snapshots, USAspending agency-profile FY totals, CSIS Defense Budget Analysis service-contract reports. We reconcile to within ±0.5% or the dbt build fails — see #reconciliation-tolerance.

definition

The authority for the definitional choices baked into the metric. Statutes (DATA Act, FAR 31.203), this methodology page, named GAO/CSIS taxonomies. When the field is contested — and “overhead” is — the definition source names the contest.

tolerance_test

A pointer to the dbt test that protects the figure (reconcile_per_fy_topline, etc.). Click through and you can see which test, what tolerance, and what comparable value gates the build. If a number ships, the test passed.

This is the moat. Incumbent dashboards cite an internal BI cube; we cite the publisher. Incumbent dashboards footnote caveats in a PDF; we ship them on the response. Incumbent dashboards reconcile in a slide deck once a quarter; we reconcile on every dbt build, with the test name and tolerance on the response. Citation as contract — not citation as decoration.

Trajectory

DoD object-class linkage compliance#

DATA Act §3 requires DoD obligations to link to an object-class code. Coverage was ~85% in FY2019 and is ~28% in FY2025 — a 57-point regression in six years. The dod_oc_linkage_compliance_pct metric tracks the trajectory; the per-sub-agency variant dod_oc_linkage_compliance_by_sub_agency shows where the regression is concentrated.

Full chart support lands in a later iteration. Until then, the metric endpoint is queryable directly:

/api/v1/metrics/dod_oc_linkage_compliance_pct/v0.1
/api/v1/metrics/dod_oc_linkage_compliance_by_sub_agency/v0.1

See #data-act-clause for the full mechanism and for why v0.3 overhead dropped the OC clause.

Quick reference

Caveats#

  • DoD has a 90-day FPDS publication lag — most-recent-quarter figures continue to firm up for ~90 days post-action.
  • Civilian-channel rows (DoE, GSA, NASA, etc.) lack object-class metadata; overhead share on that channel is PSC-only.
  • Sub-awards data is FFATA File F via the USAspending search API. GAO has flagged FSRS data-quality concerns; rows above $1B are filtered as data-entry errors.
  • Parent rollup uses USAspending lineage — coverage is partial (~17% resolved). UEIs without a resolved parent fall back to raw UEI as their own rollup.
  • FY17 and FY18 GSA chunks have ~50–80% coverage due to the search API's 10K-row page cap.
  • FPDS (piid, modification_number, transaction_number) is not unique; we do not deduplicate. Bronze sum is the truth.

Each metric response also surfaces its own typed caveat list under known_caveats — these are not footnotes, they ship on the response.

Versioning#

This is v1.0 of the methodology. Changes to active definitions ship as new versions (v1.1, v2.0) — never silent rewrites. Anchored URL fragments (#scope, #overhead-v03, etc.) are stable; renaming an anchor is a breaking change for every downstream consumer that linked to it.

The dbt semantic layer is the source of truth for SQL; this document is the source of truth for definitional intent. When the two diverge, this page is wrong and gets updated; the SQL is what runs.