GEP 10 — Units and Dimensionality#
| Field | Value |
|---|---|
| Author | |
| Status | Draft |
| Type | Standards Track |
| Created | 2026-06-03 |
| Resolution | (none yet) |
Abstract#
This GEP gives every quantity in GETTSIM a unit — Euros, Euros per square meter, etc. — declared on parameters, policy functions, and (optionally) input data. The framework reads those units to do two things:
- Dimensional safety. It checks that the arithmetic combining quantities is sound, so mixing incompatible kinds (say, a monthly amount and a per-square-meter rent) becomes a loud error when the model is defined, not a silent wrong number far downstream.
- Automatic unit conversion. It converts compatible quantities to a common unit. For example, parameters denominated in Deutsche Mark can be converted to Euros at build time, so a parameter’s history can include values in both currencies and the user can run in either one without hand-converting the numbers. Time conversions of flows work the same way. The existing _y/_q/_m/_w/_d suffix convention is preserved.
The engine is pint, and it runs only while the model is built: it checks dimensions and converts units, then steps aside. The numeric runtime is unchanged. As in GEP 9, the checks fire at definition time, catching a whole class of unit bugs before they can reach a result.
Terminology#
- dimension: the basic kind of a quantity, one of [currency], [time], [area], or dimensionless. Counting quantities (children, adults, household members) are dimensionless, following the SI and pint convention.
- unit: a particular way of measuring a dimension, such as Euros for [currency] or years for [time]. A unit carries a conversion factor to the dimension’s base unit, so e.g. 1 month = 1/12 year. The available units are called unit tokens.
Motivation and Scope#
Three long-standing problems motivate this GEP.
- No dimensional safety. The DAG carries quantities of many kinds, but a function body may add, subtract, or compare them freely. betrag_m + miete_pro_qm_m (a monthly amount plus a monthly rent per square meter) is a bug that runs silently today and surfaces, if at all, as an implausible number far downstream.
- Hand-converted historical currency. Every Deutsche-Mark-era parameter is divided by 1.95583 by a maintainer before being written to YAML, with the original value preserved only in a free-text note. There is no machine-checkable provenance and no guard against a transcription error. This is both error-prone and a violation of GETTSIM’s law-to-code approach.
- Hand-written time arithmetic. ttsim/unit_converters.py implements ~50 conversion functions (y_to_m, per_y_to_per_m, …) and their stock/flow duals by hand. The resulting arithmetic has itself been a source of bugs.
Scope. The GEP covers ttsim (the framework) and gettsim (the German currencies
and the policy annotations). GEP 1’s _y/_q/_m/_w/_d suffix automation is
preserved; only the arithmetic behind the conversions moves onto the unit engine.
Usage and Impact#
Units enter the model through its data: every parameter and every input column
carries a unit= declaration. From there the framework works out the unit of whatever a
policy function computes by running the body on its inputs (the dry-run); the function
still restates that unit in unit=, checked against the inferred result so its
declaration is a guard rail, not a new source of truth. Flow tokens (CURRENCY_FLOW, …)
take their period from the GEP 1 name suffix:

    @policy_function(unit=Unit.CURRENCY_FLOW)  # name betrag_m -> resolved CURRENCY/month
    def betrag_m(regelsatz_m: float, anzahl: int) -> float:
        return regelsatz_m * anzahl


    @policy_function(unit=Unit.CURRENCY)  # a stock; a time suffix would be an error
    def vermögen(aktien: float, immobilien: float) -> float:
        return aktien + immobilien

A policy function names no particular currency, so the same body serves a Euro run and a
DM run unchanged; parameters, by contrast, record their legal currency in the token
itself (DM_FLOW, EUR_FLOW). One optional currency argument to main() picks the
currency the model runs in — defaulting to the registered base currency ("EUR" for
GETTSIM) — and every currency-denominated parameter is converted to it at build time.
Tagging input data with units is optional, through a dedicated unit-annotated input
tree; results can likewise be returned as a unit-annotated tree in precise run-currency
units. And every mistake the framework can see — a mistyped token, mixing incompatible
quantities, a unit that does not line up across a DAG edge, a missing declaration —
surfaces when the model is defined (at decoration, load, or environment build), never
as a wrong number at run time. DIMENSIONLESS is a real declaration — it states that
the quantity carries no dimension — not a blank one.
The rest of this GEP is the reference: the token vocabulary, the period sources, the currency model, and exactly what the checks catch.
Backward Compatibility#
- User code shape is unchanged. Bare arrays and the DataFrame/mapper interface keep working; currency defaults to "EUR" and output stays in Euros.
- The unit/reference_period metadata is repurposed. unit becomes one member of the token vocabulary and reference_period becomes functional (it supplies the period for …_FLOW parameters that no name can carry) rather than purely descriptive.
- No blanket opt-out. Unlike the GEP 9 beartype claw, there is no env-var escape hatch that switches the unit check off wholesale; the only opt-out is per-function and body-only (verify_units=False, see below).
- A migration is required. Every node must declare a unit; suffix-less flow parameters are renamed to carry a time suffix (arbeitnehmerpauschbetrag → arbeitnehmerpauschbetrag_y), since the suffix is now the period source wherever a name can carry one; and a bare literal of a real dimension is promoted to a parameter or its function body opts out with verify_units=False.
Detailed Description#
The unit vocabulary#
A declaration is one member of the token vocabulary. Its backbone is a closed core
enumeration — a Unit StrEnum shipped by ttsim, spelled identically in code
(Unit.CURRENCY_FLOW) and in YAML (unit: CURRENCY_FLOW):
| token | resolves to | typical use |
|---|---|---|
| CURRENCY_FLOW | | wages, claims, benefits |
| CURRENCY | | wealth, asset thresholds |
| DIMENSIONLESS | | shares, rates, counts |
| DIMENSIONLESS_FLOW | | Zugangsfaktor per year |
| YEARS | | ages, durations |
| HOURS_FLOW | | working hours |
| | | dwelling size |
| CURRENCY_PER_SQUARE_METER_FLOW | | rent caps |
A token ending in …_FLOW needs a period; every other token is complete as written and
takes no period. So the …_FLOW suffix is the only flow marker — there is no separate
“stock” spelling, a currency stock is the bare CURRENCY token. Tokens are not pint
syntax: each resolves internally to a pint unit (flow tokens after the period is filled
in), but pint expressions never appear in a declaration.
HOURS_FLOW is the one flow token that resolves to a dimensionless quantity: hours
and the period are both [time], so hours per week is a time-over-time ratio. It is
kept as a distinct token so the time-suffix and time-conversion bookkeeping still apply
to working hours, but dimensionally it cannot be told apart from a bare DIMENSIONLESS
quantity. Likewise, a per-period dimensionless quantity is DIMENSIONLESS_FLOW, not
DIMENSIONLESS: the pension Zugangsfaktor moves by a fixed factor per year of earlier
or later retirement (zugangsfaktor_veränderung_y, § 77 SGB VI) — a pure number, but
per year it is 1/year, and multiplied by the gap in YEARS the years cancel to the
dimensionless adjustment.
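The cancellation described above can be made concrete with a toy dimension algebra. This is a minimal sketch and not the framework's implementation, which delegates to pint: a dimension is represented as a mapping from base-dimension name to exponent, and multiplying two quantities adds the exponents. The helper `multiply` and the dictionary representation are assumptions made for illustration only.

```python
def multiply(left: dict, right: dict) -> dict:
    """Multiply two dimensions by adding exponents; zero exponents are dropped."""
    combined = dict(left)
    for base, exponent in right.items():
        combined[base] = combined.get(base, 0) + exponent
    return {base: exp for base, exp in combined.items() if exp != 0}


# Hypothetical stand-ins for what the tokens resolve to at the dimension level.
DIMENSIONLESS = {}
YEARS = {"time": 1}
DIMENSIONLESS_FLOW = {"time": -1}  # a pure number per period, e.g. 1/year

# Hours per week: [time] over [time], so the exponents cancel.
HOURS_FLOW = multiply({"time": 1}, {"time": -1})

# zugangsfaktor_veränderung_y (1/year) times a gap in years.
zugangsfaktor_adjustment = multiply(DIMENSIONLESS_FLOW, YEARS)
```

Both `HOURS_FLOW` and `zugangsfaktor_adjustment` come out as the empty mapping, which is why the period bookkeeping has to live in the token rather than in the dimension.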
Counting quantities, booleans, and identifiers are dimensionless (DIMENSIONLESS),
following SI and pint convention. A per-person parameter declares the same token as any
other amount (EUR_FLOW for a monthly Regelsatz); scaling it by a head count is a plain
multiplication that preserves the unit. A boolean is a {0, 1} value, and an identifier
(p_id, *_id, p_id_*) carries no dimension — both spell that out rather than being
silently waved through.

    beitragssatz:
      unit: DIMENSIONLESS  # a rate is dimensionless
      reference_period: null
      type: scalar
      2024-01-01:
        value: 0.013

There are no exemptions — every active node has a unit; only its source differs.
Most nodes declare it. Derived nodes get one auto-assigned
(see below); the framework-injected date nodes get theirs from the
framework (policy_year is in years, etc.). So UNSET_UNIT has a single meaning — no
declaration was made — which the mandatory-units check always reports as an error, with
no second “legitimately blank” reading to disambiguate.
Beyond the core enumeration, the full vocabulary adds one set of concrete currency
tokens per registered currency (see Currency); the
currency-dimensioned rows of the table above are the agnostic tokens. The core
enumeration lives in ttsim, is shared by all downstream packages, and grows only by an
upstream PR; each package’s params JSON schema stays statically enumerable, listing the
core tokens plus its own currency tokens.
pint runs at build time only#
The foundational constraint is that pint never wraps a live array. A pint.Quantity is
not a JAX pytree and does not trace under jit; wrapping runtime columns would fight
both JAX and the GEP-9 FloatColumn vocabulary. Instead, pint is used in two build-time
roles:
- to compute conversion factors (time and currency), which are baked into the compiled workers as plain numeric constants; and
- to run the dry-run dimensionality check on representative Quantitys.
The numeric runtime path stays pure arrays, single currency, and JAX-safe. Time is a
first-class pint dimension here: the conversion factors are sourced from pint
(Quantity(1, "year").to("month")), while the suffix auto-generation and naming follow
the GEP 1 conventions.
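The idea of baking a factor into a worker can be sketched as a closure. This is an illustration under stated assumptions: the table `PERIODS_PER_YEAR` is a hand-written stand-in for factors that the GEP sources from pint, and `build_flow_converter` is a hypothetical name, not a ttsim function. The point is only that the unit machinery runs once, at build time, and the returned function sees nothing but plain numbers.

```python
# Stand-in for pint-derived factors (assumption: only y/q/m are modelled here).
PERIODS_PER_YEAR = {"y": 1.0, "q": 4.0, "m": 12.0}


def build_flow_converter(source_suffix: str, target_suffix: str):
    """Return a converter for flows, with the factor fixed at build time."""
    # A flow per year is spread over 12 months, so y -> m divides by 12.
    factor = PERIODS_PER_YEAR[source_suffix] / PERIODS_PER_YEAR[target_suffix]

    def convert(values: list) -> list:
        # Runtime path: plain floats and one captured constant, no unit objects.
        return [value * factor for value in values]

    return convert


m_to_y = build_flow_converter("m", "y")
y_to_m = build_flow_converter("y", "m")
yearly = m_to_y([100.0, 250.0])
```

In the real framework the runtime arrays would be JAX or NumPy arrays rather than lists; the list is used here only to keep the sketch dependency-free.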
Units, suffixes, and periods#
A flow token is completed by exactly one period source; complete tokens admit none. The
period comes from the name suffix wherever a name or key can carry one, and from
reference_period only where it cannot:
| node kind | flow period from | reference_period |
|---|---|---|
| column / policy function | name suffix | forbidden |
| scalar parameter / string-keyed dict leaf | name (or key) suffix | forbidden |
| integer-keyed dict leaf | dict-level reference_period | required |
| mapping parameter axis | reference_period | required |
Where the suffix supplies the period it is also mandatory and exclusive: a time suffix
requires a …_FLOW token and a …_FLOW token requires a time suffix, so a complete
token on a suffixed name — or a flow token on an unsuffixed one — fails at build. This
makes the GEP 1 convention machine-checked: a node named …_m whose body
computes a stock cannot be declared. Because reference_period is forbidden there,
there is nothing to reconcile; only where no name carries a suffix (integer keys,
schedule axes) is reference_period functional.
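The "mandatory and exclusive" rule for names that can carry a suffix can be sketched as a small resolver. This is a simplified illustration, not the ttsim implementation: `resolve_period` is a hypothetical helper, and the regular expression only recognises a time suffix at the very end of the name, which is an assumption about the naming convention.

```python
import re
from typing import Optional

PERIOD_BY_SUFFIX = {"y": "year", "q": "quarter", "m": "month", "w": "week", "d": "day"}
TIME_SUFFIX = re.compile(r"_([yqmwd])$")


def resolve_period(name: str, token: str) -> Optional[str]:
    """Return the period for a flow token, or None for a complete token."""
    match = TIME_SUFFIX.search(name)
    is_flow = token.endswith("_FLOW")
    if is_flow and match is None:
        raise ValueError(f"{name}: flow token {token} requires a time suffix")
    if match is not None and not is_flow:
        raise ValueError(f"{name}: a time suffix requires a ..._FLOW token, got {token}")
    return PERIOD_BY_SUFFIX[match.group(1)] if match is not None else None


period_of_betrag = resolve_period("betrag_m", "CURRENCY_FLOW")
period_of_vermögen = resolve_period("vermögen", "CURRENCY")
```

A stock declared on `betrag_m`, or a flow declared on `vermögen`, raises in this sketch, which mirrors the build failure described above.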
Dict parameters with heterogeneous leaves#
A dict parameter whose leaves carry different units declares unit: as a mapping from
leaf keys to tokens (or DIMENSIONLESS for a dimensionless leaf). A flow leaf with a
string key takes its period from the key’s own time suffix; an integer-keyed flow leaf,
which has no suffix to carry, takes it from the dict-level reference_period:

    schedule:
      unit:
        child_amount_y: EUR_FLOW  # string key -> period from its own _y
        max_age: YEARS
      type: dict
      2024-01-01:
        child_amount_y: 3000.0
        max_age: 18

    satz_nach_kindanzahl:
      unit: EUR_FLOW  # uniform: one token for all leaves
      reference_period: Month  # integer keys carry no suffix -> dict-level period
      type: dict
      2024-01-01:
        1: 250.0
        2: 250.0

Where a leaf key carries a suffix and the dict also sets a reference_period, the two
must coincide — there is no precedence order. Mixed-period dicts are legal when each
flow leaf carries its own suffix (base_amount_m next to annual_bonus_y).
Leaves that change name across the parameter’s history. The unit: mapping is a
union over all dated entries: the mandatory-units check looks only at the leaves
present in the value active at the policy date and ignores mapping entries for leaves
that exist only at other dates. So a leaf renamed across a reform is covered by listing
both names (child_amount_y before, base_amount_y after). A value leaf with no entry
in the mapping is a missing declaration and is flagged, so a mistyped key cannot pass
silently. When the renamed leaves share a token, the simpler uniform form — a single
scalar unit: EUR_FLOW with the period read from each leaf’s own suffix — makes the
rename irrelevant; the mapping is only for genuinely heterogeneous leaves. A leaf whose
currency changes across dates is a changeover (see Currency).
In the dry-run, dict parameters become dicts of representative Quantitys (uniform for
a scalar unit:, per-leaf for a mapping), so bodies that subscript them are verifiable.
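The union semantics of the unit mapping can be illustrated with a few lines. This is a sketch under assumptions: `missing_leaf_units` is a hypothetical helper, and the example values are invented for illustration. It flags leaves of the active value that have no mapping entry and ignores mapping entries for leaves that exist only at other dates.

```python
def missing_leaf_units(unit_mapping: dict, active_value: dict) -> list:
    """Leaves present in the active value but absent from the unit mapping."""
    return sorted(key for key in active_value if key not in unit_mapping)


# The mapping is a union over all dated entries: both names of a renamed leaf.
unit_mapping = {
    "child_amount_y": "EUR_FLOW",  # name before the (hypothetical) reform
    "base_amount_y": "EUR_FLOW",   # name after the reform
    "max_age": "YEARS",
}

value_after_reform = {"base_amount_y": 3200.0, "max_age": 18}
value_with_typo = {"base_amuont_y": 3200.0, "max_age": 18}

ok = missing_leaf_units(unit_mapping, value_after_reform)
flagged = missing_leaf_units(unit_mapping, value_with_typo)
```

Here `ok` is empty although `child_amount_y` is unused at that date, while the mistyped key ends up in `flagged`.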
Mapping parameters: one token per axis#
A schedule or lookup table is not a quantity — it is a function between quantities,
with a domain and a codomain. The mapping parameter types (the piecewise_* family, the
lookup tables, the phase-in/out types) therefore declare input_unit: and
output_unit: instead of unit:; a unit: on them is an error, and the JSON schema
enforces the split per type:

    tarif:
      input_unit: EUR_FLOW  # taxable income per year in ...
      output_unit: EUR_FLOW  # ... tax per year out
      reference_period: Year
      type: piecewise_quadratic
      ...

Each axis token follows the same kind rules as a scalar declaration; per-axis
declarations are single tokens (or DIMENSIONLESS), never mappings. The single
reference_period supplies the period of every flow axis; a reference_period that
no flow axis consumes is dangling and fails; a time suffix on the parameter’s name
must coincide with the output axis — the suffix names what the parameter yields.
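The per-axis rules can be sketched as a validator over a parsed parameter specification. This is an illustration only: `check_mapping_declaration` is a hypothetical helper, the specification is assumed to arrive as a plain dict, and the name-suffix rule for the output axis is left out.

```python
def check_mapping_declaration(spec: dict) -> None:
    """Raise ValueError if a mapping parameter's unit declaration is malformed."""
    if "unit" in spec:
        raise ValueError("mapping parameters declare input_unit/output_unit, not unit")
    axes = ("input_unit", "output_unit")
    for axis in axes:
        if axis not in spec:
            raise ValueError(f"missing {axis}")
    has_flow_axis = any(spec[axis].endswith("_FLOW") for axis in axes)
    has_period = spec.get("reference_period") is not None
    if has_flow_axis and not has_period:
        raise ValueError("a flow axis needs a reference_period")
    if has_period and not has_flow_axis:
        raise ValueError("dangling reference_period: no flow axis consumes it")


tarif = {
    "input_unit": "EUR_FLOW",
    "output_unit": "EUR_FLOW",
    "reference_period": "Year",
    "type": "piecewise_quadratic",
}
check_mapping_declaration(tarif)  # passes silently
```

A specification with two complete tokens and a `reference_period`, or with a `unit:` key, is rejected by this sketch.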
Currency#
Currencies live in the framework as a [currency] dimension, with concrete currencies
registered by downstream packages via
register_currency(name, *, base=False, definition=None). gettsim registers EUR
(base) and DM = EUR / 1.95583. Registration does two things: it provides the
conversion factors, with pint as the single source of truth for the rate; and it
derives the currency’s declaration tokens — one concrete variant per
currency-dimensioned core token (DM, DM_FLOW, DM_PER_SQUARE_METER_FLOW, EUR_*,
…) — spelled by replacing the agnostic CURRENCY prefix with the upper-cased currency
name.
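The token derivation amounts to a prefix substitution, which the following sketch makes explicit. It is not the `register_currency` implementation: `concrete_tokens` is a hypothetical helper, and the core token list is an assumption assembled from the tokens named in this document (the agnostic rent-cap token is inferred from its DM variant).

```python
# Assumed subset of the core enumeration, for illustration only.
CORE_TOKENS = (
    "CURRENCY",
    "CURRENCY_FLOW",
    "CURRENCY_PER_SQUARE_METER_FLOW",
    "DIMENSIONLESS",
    "DIMENSIONLESS_FLOW",
    "YEARS",
    "HOURS_FLOW",
)
AGNOSTIC_PREFIX = "CURRENCY"


def concrete_tokens(currency_name: str, core_tokens=CORE_TOKENS) -> list:
    """One concrete variant per currency-dimensioned core token."""
    prefix = currency_name.upper()
    return [
        prefix + token[len(AGNOSTIC_PREFIX):]
        for token in core_tokens
        if token.startswith(AGNOSTIC_PREFIX)
    ]


dm_tokens = concrete_tokens("dm")
eur_tokens = concrete_tokens("EUR")
```

Tokens without a currency dimension (`YEARS`, `DIMENSIONLESS`, and so on) produce no concrete variant, so each registered currency adds exactly as many tokens as there are currency-dimensioned rows in the core table.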
Agnostic and concrete tokens. A currency-agnostic token (CURRENCY,
CURRENCY_FLOW, …) is a placeholder for any registered currency: it declares the unit
of a function or column for which it does not matter which currency the model runs in. A
concrete currency token (DM_FLOW, EUR) names one specific currency; what it adds
over its agnostic counterpart is denomination — it names the currency a parameter’s
stored numbers are written in, which the build-time conversion reads off the
declaration. For every dimensionality check a concrete token means exactly what its
agnostic counterpart means: the dry-run and the edge check compare at the dimension
level and never see a concrete currency, so a DM-denominated parameter feeds a
currency-agnostic function without further ado, while adding Euros to Euros per square
meter is still caught.
Parameters must be concrete; functions must be agnostic. A parameter’s numbers are
written in some currency, so once a concrete currency is registered, an agnostic
CURRENCY_* token on a parameter is a build error — the declaration must name the
denomination (DM_FLOW, not CURRENCY_FLOW). Columns and functions may only declare
agnostic tokens.
The run currency. The currency argument to main() defaults to the registered
base currency; it is the currency the input data is taken to be in and that the outputs
come out in. At environment build, every currency-denominated parameter is converted
from its declared denomination to the run currency: scalar values, dict parameters leaf
by leaf (each currency leaf by its own token), schedules axis by axis, and lookup-table
values. The factors are baked in at build time; the numeric runtime path stays
single-currency.
A changeover within one parameter’s history. A dated entry may restate the unit field(s), overriding the top-level declaration for that entry’s numbers. This is how the DM→Euro switch is written: entries before the reform are denominated in the legacy currency, entries from the reform date in the new one:

    arbeitnehmerpauschbetrag_y:
      unit: DM_FLOW
      type: scalar
      1990-01-01:
        value: 2000
      2002-01-01:
        unit: EUR_FLOW  # the changeover: denominated in Euro from here on
        value: 1044

updates_previous cannot cross a changeover: an entry that restates the unit
declaration must restate the full value, because a merged value would mix numbers
denominated in different currencies.
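The build-time conversion of such a history can be sketched as follows. This is a minimal illustration, not the framework's code: `convert_history` and the `EUR_PER_UNIT` table are hypothetical, the rate table stands in for what the GEP sources from pint, and the sketch only looks at each entry's own unit (how later entries inherit a restated unit is not modelled).

```python
# Value of one unit of each currency in Euros (DM = EUR / 1.95583).
EUR_PER_UNIT = {"EUR": 1.0, "DM": 1 / 1.95583}


def convert_history(history: dict, default_unit: str, run_currency: str) -> dict:
    """Convert every dated value from its declared denomination to the run currency."""
    converted = {}
    for date, entry in history.items():
        token = entry.get("unit", default_unit)  # per-entry override, else top level
        currency = token.split("_")[0]           # "DM_FLOW" -> "DM"
        factor = EUR_PER_UNIT[currency] / EUR_PER_UNIT[run_currency]
        converted[date] = entry["value"] * factor
    return converted


history = {
    "1990-01-01": {"value": 2000},
    "2002-01-01": {"unit": "EUR_FLOW", "value": 1044},
}
in_eur = convert_history(history, default_unit="DM_FLOW", run_currency="EUR")
in_dm = convert_history(history, default_unit="DM_FLOW", run_currency="DM")
```

In a Euro run the 1990 entry becomes roughly 1022.58 and the 2002 entry stays at 1044; in a DM run the 1990 entry stays at 2000 and the 2002 entry becomes roughly 2041.89.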
Build-time checks and boundary conversion#
The checks run in two layers, both at build time, neither needing a fabricated dataset:
| | Layer 1: DAG validity | Layer 2: boundary |
|---|---|---|
| when | | GEP-9 canonicalisation boundary |
| input | none; representative Quantitys | the user’s unit-annotated input tree |
| checks | inferred body unit vs. declaration; edge consistency | tag currency → run currency; period vs. suffix; unknown token rejected; every tag’s dimension vs. resolved unit |
Layer 1 runs each scalar body in NumPy+pint, infers the unit that falls out, and checks it against the declaration; an edge-consistency pass then confirms each producer’s unit equals its consumer’s declared expectation. The mechanics are below.
Layer 2 is offered through the unit-annotated input tree (a sibling of the ordinary
input tree in which every leaf is a pint Quantity). Only the tree interface carries
tags — a DataFrame column has nowhere to hang a per-column unit, so the DataFrame modes
and the bare tree stay tag-free and are taken to already be in the run currency. When
the mode is used every leaf must be tagged, including identifiers and other
dimensionless columns (tagged dimensionless) — a full-coverage contract that is what
lets the dimension check be exhaustive. The dimension check reads the extracted input
units against the resolved environment units; it feeds no node, so it adds no back-edge
to the boundary and needs no declared unit threaded through processed_data.
Symmetrically, the unit-annotated result tree relabels each output leaf with its
precise run-currency unit (euro/month, not the agnostic CURRENCY_FLOW) — pure
naming, since results are already computed in the run currency.
How the dry-run checks one body. The check runs the function body, but with
units in place of numbers. Each input becomes a stand-in carrying its resolved unit
and a throwaway magnitude of 1; pint carries the units through the body’s arithmetic,
and the unit that falls out of the return is compared to the declaration. Because the
magnitude is never used, no real data is needed:

    @policy_function(unit=Unit.CURRENCY_FLOW)  # -> CURRENCY / month
    def betrag_m(
        einkommen_m: float, satz: float, mindestbetrag_m: float, befreit: bool
    ) -> float:
        if befreit:
            return 0.0
        if einkommen_m > mindestbetrag_m:
            return einkommen_m * satz
        return mindestbetrag_m

Here einkommen_m and mindestbetrag_m enter as EUR/month, satz as a dimensionless
1, and befreit as a boolean stand-in. einkommen_m * satz is a flow times a
dimensionless number, so it stays EUR/month — matching the declaration; the
mindestbetrag_m arm matches too.
Every branch is covered, by re-running. To evaluate if befreit: Python needs a
yes/no, but a unit stand-in has no value to compare. So the stand-in intercepts the
truth test itself (Python’s __bool__) and hands it to a small driver — the path
explorer — that decides which way to go, re-running the body and steering the open
branches differently each time (a depth-first walk of the decision tree, in the style of
concolic execution) until every reachable path is driven. The number of runs is the
number of reachable paths, not 2^n: when befreit is true the body returns before
the income test, so that comparison never even becomes a question. Each run’s result is
checked on its own, so a unit slip on a single arm — say, returning a yearly figure
where _m was declared — is caught even though the other arms are clean. A return 0.0
arm yields a dimensionless result and falls back to the declaration, so the ubiquitous
if befreit: return 0.0 guard never raises a false alarm.
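The re-running scheme can be sketched in plain Python. This is a toy model under stated assumptions, not the ttsim path explorer: units are plain strings instead of pint units, and the classes `Explorer`, `OpenBool`, and `StandIn` as well as `explore_paths` are hypothetical names. What it does show is the mechanism described above: the truth test is intercepted in `__bool__`, a scripted prefix of decisions is replayed, and the last open decision is flipped until no path is left.

```python
class Explorer:
    """Replays a scripted prefix of decisions, then answers True."""

    def __init__(self):
        self.script, self.trace = [], []

    def decide(self) -> bool:
        position = len(self.trace)
        choice = self.script[position] if position < len(self.script) else True
        self.trace.append(choice)
        return choice


class OpenBool:
    """A truth value with no magnitude: the explorer decides the branch."""

    def __init__(self, explorer):
        self._explorer = explorer

    def __bool__(self) -> bool:
        return self._explorer.decide()


class StandIn:
    """A unit label with a throwaway magnitude."""

    def __init__(self, unit, explorer):
        self.unit, self._explorer = unit, explorer

    def __mul__(self, other):
        if getattr(other, "unit", "dimensionless") != "dimensionless":
            raise TypeError("toy model: only dimensionless factors are supported")
        return StandIn(self.unit, self._explorer)

    def __gt__(self, other):
        if self.unit != other.unit:
            raise TypeError(f"cannot order {self.unit} against {other.unit}")
        return OpenBool(self._explorer)


def explore_paths(body, make_inputs):
    """Depth-first walk: re-run the body until every reachable path is driven."""
    explorer, results, script = Explorer(), [], []
    while True:
        explorer.script, explorer.trace = script, []
        result = body(**make_inputs(explorer))
        results.append((tuple(explorer.trace), getattr(result, "unit", "dimensionless")))
        script = list(explorer.trace)
        while script and script[-1] is False:  # both arms done: backtrack
            script.pop()
        if not script:
            return results
        script[-1] = False  # flip the deepest decision that was only tried as True


def betrag_m(einkommen_m, satz, mindestbetrag_m, befreit):
    if befreit:
        return 0.0
    if einkommen_m > mindestbetrag_m:
        return einkommen_m * satz
    return mindestbetrag_m


def make_inputs(explorer):
    return {
        "einkommen_m": StandIn("EUR/month", explorer),
        "satz": StandIn("dimensionless", explorer),
        "mindestbetrag_m": StandIn("EUR/month", explorer),
        "befreit": OpenBool(explorer),
    }


DECLARED = "EUR/month"
paths = explore_paths(betrag_m, make_inputs)
mismatches = [p for p in paths if p[1] not in ("dimensionless", DECLARED)]
```

For this body the walk needs three runs rather than four, and the dimensionless `return 0.0` arm is accepted by falling back to the declaration, so `mismatches` stays empty.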
What the dry-run catches:
- a body whose inferred unit disagrees with its declaration, on any reachable branch: a stock times a per-year rate labelled as a stock, or a _m flow returned where _y is declared;
- an addition or subtraction of two non-equivalent quantities: a monthly flow plus a yearly one (betrag_m + freibetrag_y), or a stock plus a flow. At run time the assembled DAG computes on bare arrays with no pint, so such a combination is unit-blind and silently wrong; the dry-run rejects it rather than letting pint’s build-time auto-conversion of same-dimension operands paper over it;
- an ordering comparison (<, <=, >, >=) of two non-equivalent quantities;
- a missing unit, and malformed declarations: a flow token without a period, a currency-agnostic token on a parameter, disagreeing period sources, or a boolean node carrying a concrete unit.
What it cannot catch:
- wrong magnitudes: a coefficient, rate, or constant with the correct unit but the wrong value (a 2.5% rate written as 0.25); units are not values;
- a result that comes out dimensionless: a body inferring a dimensionless value (an early return 0.0, or arithmetic that cancels) falls back to the declaration;
- equality comparisons: == and != are not unit-screened, unlike ordering.
A body the dry-run cannot evaluate must opt out explicitly. The dry-run executes a
scalar body symbolically, so a body it cannot trace must opt out: vectorized functions
(vectorization_strategy="not_required", no scalar form for pint to walk), piecewise
polynomials and lookup tables (evaluated by table machinery), and bodies calling join
or a raw xnp op. Rather than silently trusting such a body, the check rejects it
unless the author marks it verify_units=False on the decorator. The opt-out is of body
inference only: the declared unit still stands and the body’s edges are still checked.
Because the flag is explicit, every un-verified body is greppable — a deliberate choice,
and a ready-made worklist should the dry-run later learn to evaluate these operations.
Auto-generated nodes#
Auto-generated nodes receive auto-assigned units: time-conversion variants inherit the
source’s base token and read the variant’s period off its own suffix; auto-aggregations
derive their token from the source and the aggregation type, paralleling how
GEP 4 resolves their types. SUM/MEAN/MIN/MAX preserve the source
token; COUNT is a head count and is DIMENSIONLESS; ANY/ALL yield a boolean (a
dimensionless quantity) and so auto-assign DIMENSIONLESS (as does a SUM over a
boolean column — a head count). A @group_creation_function group id is auto-assigned
DIMENSIONLESS (it is an identifier, and the decorator exposes no unit=). Where the
source’s token pins down a concrete currency (a parameter), the derived node inherits
the agnostic counterpart.
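The assignment rules for auto-aggregations can be written down as a small lookup. This is a sketch under assumptions: `aggregated_token` and `to_agnostic` are hypothetical helpers, aggregation types are passed as plain strings, and the registered currencies are hard-coded to the two that gettsim registers.

```python
REGISTERED_CURRENCIES = ("EUR", "DM")
UNIT_PRESERVING = {"SUM", "MEAN", "MIN", "MAX"}
DIMENSIONLESS_RESULT = {"COUNT", "ANY", "ALL"}


def to_agnostic(token: str) -> str:
    """Replace a concrete currency prefix with the agnostic CURRENCY prefix."""
    for currency in REGISTERED_CURRENCIES:
        if token == currency or token.startswith(currency + "_"):
            return "CURRENCY" + token[len(currency):]
    return token


def aggregated_token(source_token: str, aggregation: str) -> str:
    """Token auto-assigned to an aggregation of a node with source_token."""
    if aggregation in DIMENSIONLESS_RESULT:
        return "DIMENSIONLESS"  # head counts and booleans carry no dimension
    if aggregation in UNIT_PRESERVING:
        return to_agnostic(source_token)
    raise ValueError(f"unknown aggregation type: {aggregation}")


summed_flow = aggregated_token("CURRENCY_FLOW", "SUM")
counted = aggregated_token("YEARS", "COUNT")
from_parameter = aggregated_token("EUR_FLOW", "MAX")
```

A `SUM` over a boolean column needs no special case here: its source token is already `DIMENSIONLESS`, and a preserving aggregation passes that through.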
Literals#
The dry-run executes a body on representative Quantitys, so a bare numeric literal
combined additively with a unit-carrying value raises (pint refuses to add a
dimensionless number to a currency). A literal that is only a multiplicative factor
(betrag * 0.5) is fine — multiplying by a dimensionless number preserves the unit.
Most apparent cases dissolve once the quantities are declared correctly: an ordinal such
as geburtsmonat (the month 1–12) is DIMENSIONLESS, so geburtsmonat - 1 is
dimensionless arithmetic and needs no tag. For a genuine constant of a real dimension,
either promote it to a parameter (the norm — it then gets the same provenance,
currency conversion, and checking as any other parameter, and the body becomes
dry-runnable), or opt the body out with @policy_function(verify_units=False) for
genuine code-level constants where a parameter would be artificial (the same body-level
opt-out as above).
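The difference between additive and multiplicative literals can be reproduced with a toy representative quantity. This is a sketch, not pint: the class `Rep` is hypothetical, units are plain strings, and products of two dimensioned quantities are deliberately left unmodelled.

```python
class Rep:
    """Representative quantity: a unit label with a throwaway magnitude of 1."""

    def __init__(self, unit: str):
        self.unit = unit

    @staticmethod
    def _unit_of(other) -> str:
        # A bare Python number counts as dimensionless.
        return other.unit if isinstance(other, Rep) else "dimensionless"

    def _same_unit(self, other) -> "Rep":
        if self._unit_of(other) != self.unit:
            raise TypeError(f"cannot combine {self.unit} with {self._unit_of(other)}")
        return Rep(self.unit)

    __add__ = __radd__ = __sub__ = __rsub__ = _same_unit

    def __mul__(self, other) -> "Rep":
        if self._unit_of(other) == "dimensionless":
            return Rep(self.unit)
        if self.unit == "dimensionless":
            return Rep(other.unit)
        raise NotImplementedError("toy model: dimensioned products are not modelled")

    __rmul__ = __mul__


betrag = Rep("EUR/month")
geburtsmonat = Rep("dimensionless")

halved = betrag * 0.5            # multiplicative literal: unit preserved
previous_month = geburtsmonat - 1  # dimensionless arithmetic: fine

try:
    betrag + 100.0               # additive literal on a currency flow
    additive_literal_rejected = False
except TypeError:
    additive_literal_rejected = True
```

The rejected case is the one the text resolves by promoting `100.0` to a parameter or by marking the body `verify_units=False`.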
Implementation#
Delivered as several PRs, with the framework proven on mettsim before any German
annotation. The tracking issues are:
ttsim #117 — framework core + tracer bullet
ttsim #118 — full dimension model
ttsim #119 — mandatory units + edge-consistency
ttsim #120 — currency knob + Layer-2 boundary
ttsim #121 — annotate mettsim, switch check on, CI test
gettsim #1191 — register EUR/DM
gettsim #1192 — gettsim rollout
Each package’s params schema enumerates its own token vocabulary: the core tokens minus
the agnostic currency tokens (the schema governs parameters, which must be concrete)
plus the concrete variants of that package’s registered currencies. It also enforces the
unit: XOR input_unit:/output_unit: split per parameter type and admits the
per-entry overrides in dated entries. The schema shipped with ttsim (listing mettsim’s
CASTAR_*/SILVER_PENNY_* tokens) is the template; the copy at
docs/geps/params-schema.json (the validation target for all German parameter YAMLs) is
migrated together with the YAML files in #1192, adding the DM_*/EUR_* tokens.
Alternatives#
Runtime pint Quantities flowing through the DAG#
Rejected. Quantity is not a JAX pytree, breaks tracing, contradicts the GEP-9 column
vocabulary, and adds hot-path cost. Units in a tax-transfer model are static structural
properties of nodes, not of data, so runtime wrapping buys nothing the build-time check
does not already provide.
Inference-only (no declared units)#
Rejected in favour of mandatory declarations. Inference alone localises a bug only where dimensions clash downstream; a mandatory declared return unit localises it at the offending function and is self-documenting, at the cost of annotation churn the codebase largely already absorbs for types.
Keep hand-written time conversions; use pint only for checks#
Possible, but the stock/flow duality is exactly what a unit engine encodes for free. Sourcing the factors from pint removes a class of hand-maintained arithmetic without touching the naming.
A [count] dimension for head counts#
Considered, prototyped, and rejected. An earlier draft promoted counting quantities to a
custom [count] dimension, making per-person parameters CURRENCY / count and head
counts count. The intended payoff was catching a forgotten per-capita scaling. It was
dropped because:
- the protection is weaker than it looks: a single generic [count] cannot distinguish per-child from per-adult from per-household, so scaling by the wrong count still type-checks; only the forgot-entirely case is caught;
- the annotation tax lands on every per-capita parameter in the system (Regelsätze, Kindergeld, Freibeträge, …), which would read CURRENCY / count where the law and every practitioner say “Euros per month”;
- SI and pint treat counting quantities as dimensionless; deviating surprises anyone who knows either.
The accepted cost is that a missing per-capita scaling is no longer a unit error. If
that bug class accumulates in practice, the closed token vocabulary makes a future
amendment with genuinely distinct dimensions ([person], [child], …) a clean
retrofit.
Make functions time-agnostic#
Rejected. Collapsing betrag_m and betrag_y into one node would erase the law-to-code
correspondence GEP 1 is built on.
Discussion#
(Open. To be resolved on Zulip.) Known points for debate: the strictness of literal
tagging; whether per-capita scaling should ever get dedicated dimensions (see the
rejected [count] alternative — revisit if missing-scale bugs accumulate); and whether
the gettsim rollout should be a single large PR or staged behind a temporary gate.
References and Footnotes#
Copyright#
This document has been placed in the public domain.