ADR-0073: ODDRN is the universal identity for every entity

Every entity carries a stable ODDRN string — the same ODDRN always means the same entity across ingests and producers, which is what makes idempotent ingestion and cross-system lineage possible.

Status

Accepted. Reconstructed from the codebase on 2026-05-31; the decision is live in the source today.

Context

ODD Platform ingests metadata repeatedly and from many sources. The same real-world dataset is reported on every collector run, may be reported by more than one producer, and changes over time. On each payload the platform has to decide whether an incoming entity is new or one it already holds — and it has to decide that without a prior handshake: a collector reporting a table for the first time has never seen the platform's internal row id. Lineage compounds the problem, because an edge can join an entity in one source to an entity in another, so the two endpoints must be nameable independently of any single ingest. Alerts make it sharper still: the Alertmanager webhook arrives from an external system that knows the entity only as a string.

A surrogate database id cannot carry this identity. It is assigned by the platform after first contact, it is not stable across systems, and producers never learn it. The platform needs an identity that is stable, producer-independent, and computable from the entity's real-world coordinates before the platform has ever seen the entity.

Decision

Every entity is identified by an ODDRN — Open Data Discovery Resource Name — a stable string that encodes the entity's data-source family and connection coordinates (for example //postgresql/host/1.2.3.4/databases/ex_database/schemas/public/table/ex_table). The ODDRN is the natural key: data_entity.oddrn is declared UNIQUE and backed by a dedicated unique index, so the database enforces one row per ODDRN. The internal bigserial id exists only for foreign keys inside the platform; the identity that crosses ingests, producers, and systems is the ODDRN.
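To make the shape of the string concrete, here is a minimal sketch of how an ODDRN-like name could be assembled from a data-source family and ordered coordinates. The class OddrnSketch and its build method are hypothetical and written only for illustration; they are not the API of the open-source generator libraries, which producers should use instead of hand-building strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper (not the generator library): joins coordinates into an ODDRN-shaped string. */
final class OddrnSketch {
    private OddrnSketch() {}

    /**
     * Builds "//<family>/<key>/<value>/..." from coordinate pairs.
     * The map must preserve insertion order, because the order of segments is part of the identity.
     */
    static String build(String family, Map<String, String> coordinates) {
        StringJoiner joiner = new StringJoiner("/", "//" + family + "/", "");
        for (Map.Entry<String, String> coordinate : coordinates.entrySet()) {
            joiner.add(coordinate.getKey()).add(coordinate.getValue());
        }
        return joiner.toString();
    }

    /** Rebuilds the example string quoted in the paragraph above. */
    static String exampleTable() {
        Map<String, String> coordinates = new LinkedHashMap<>();
        coordinates.put("host", "1.2.3.4");
        coordinates.put("databases", "ex_database");
        coordinates.put("schemas", "public");
        coordinates.put("table", "ex_table");
        return build("postgresql", coordinates);
    }
}
```

Because the result depends only on the family and the coordinates, two producers that agree on those inputs arrive at the same string without ever consulting the platform.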

The rule everywhere is that the same ODDRN means the same entity:

  • Ingestion is idempotent on the ODDRN. The service keys incoming items by ODDRN, looks them up against existing rows (listByOddrns), and partitions the batch into update (ODDRN already present) and create (ODDRN absent). Re-reporting an entity updates it in place; it does not duplicate it.

  • Lineage edges are ODDRN pairs. The lineage table's primary key is (parent_oddrn, child_oddrn) — there is no surrogate edge id — and the graph is walked by a recursive query that joins on those ODDRN columns, so lineage spans sources and producers by string identity alone.

  • Alerts route by ODDRN. The Alertmanager webhook reads the entity_oddrn label off each external alert and stores it as the alert's target; the alert finds its entity by ODDRN, with no shared database id.
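The update-versus-create split described in the first bullet can be sketched as follows. This is an illustration under stated assumptions, not the code in IngestionServiceImpl: the Incoming record and the existingOddrns set (standing in for the result of a lookup such as listByOddrns) are hypothetical, and the choice to let a later report win when one batch repeats an ODDRN is an assumption of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical sketch of partitioning an ingest batch by ODDRN. */
final class IngestPartitionSketch {
    private IngestPartitionSketch() {}

    /** Hypothetical stand-in for an incoming ingestion item. */
    record Incoming(String oddrn, String name) {}

    /**
     * Returns a map whose TRUE entry holds items to update (ODDRN already stored)
     * and whose FALSE entry holds items to create (ODDRN not yet stored).
     */
    static Map<Boolean, List<Incoming>> partition(List<Incoming> batch, Set<String> existingOddrns) {
        // Key the batch by ODDRN; if a batch repeats an ODDRN, keep the later report (assumption).
        Map<String, Incoming> byOddrn = batch.stream()
                .collect(Collectors.toMap(
                        Incoming::oddrn,
                        Function.identity(),
                        (earlier, later) -> later,
                        LinkedHashMap::new));
        // Split on whether the ODDRN is already present among stored rows.
        return byOddrn.values().stream()
                .collect(Collectors.partitioningBy(item -> existingOddrns.contains(item.oddrn())));
    }
}
```

Running the same batch twice against a store that has absorbed the first run sends every item to the update side on the second run, which is the idempotence the bullet describes.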

ODDRN is universal in both directions. Producers supply it for ingested entities — collectors and push adapters build it with the open-source generator libraries. The platform also mints it for entities it creates itself: Data Entity Groups and Lookup Tables both call the same Java generator to build an //oddplatform/… ODDRN once the row has an id. Every entity, whatever its origin, is addressable by exactly one.

Consequences

  • Idempotent ingestion, cross-system lineage, and external alert routing all work with no prior id handshake — they are downstream of identity being a producer-computable string. The lineage walk itself is realised as a Postgres recursive CTE over the ODDRN pairs (see ADR-0071).

  • The identity burden moves to the producers. Every producer reporting the same entity must emit the exact same ODDRN. Because the ODDRN encodes connection coordinates (host, account, region), two agents that disagree on a hostname or static IP for the same source mint different ODDRNs, and the platform then holds two entities where there is one. The generator libraries exist precisely to make these strings agree; hand-built ODDRNs are where it breaks.

  • Identity is coupled to coordinates. If a source's connection coordinates change — a database moves host — its entities' ODDRNs change, and the platform treats the moved entities as new ones. History does not automatically follow the move.

  • A malformed or unrecognised ODDRN fails quietly, not loudly. An entity whose ODDRN prefix the platform cannot parse as a known data-source family is grouped under the catch-all "other" bucket in the Directory rather than rejected, so a typo in a hand-built ODDRN surfaces as a misfiled entity, not an error.
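The quiet-failure behaviour in the last bullet can be illustrated with a small sketch. Only the constant UNKNOWN_DATASOURCE_TYPE = "other" comes from the codebase as cited in this record; the class DirectoryBucketSketch, the KNOWN_FAMILIES set, and the prefix-parsing logic are assumptions made for illustration and may differ from what DirectoryServiceImpl actually does.

```java
import java.util.Set;

/** Hypothetical sketch of filing an entity under a Directory bucket by ODDRN prefix. */
final class DirectoryBucketSketch {
    private DirectoryBucketSketch() {}

    /** Catch-all bucket name, as cited from OddrnUtils in this record. */
    static final String UNKNOWN_DATASOURCE_TYPE = "other";

    /** Hypothetical subset of recognised data-source families. */
    private static final Set<String> KNOWN_FAMILIES = Set.of("postgresql", "mysql", "snowflake");

    /** Returns the family named by the ODDRN prefix, or the catch-all when it is not recognised. */
    static String bucketOf(String oddrn) {
        if (oddrn == null || !oddrn.startsWith("//")) {
            return UNKNOWN_DATASOURCE_TYPE;
        }
        String rest = oddrn.substring(2);
        int slash = rest.indexOf('/');
        String family = slash < 0 ? rest : rest.substring(0, slash);
        // No exception is raised for an unknown family: the entity is simply misfiled.
        return KNOWN_FAMILIES.contains(family) ? family : UNKNOWN_DATASOURCE_TYPE;
    }
}
```

Under these assumptions, bucketOf("//postgresql/host/1.2.3.4/databases/ex_database") yields "postgresql", while the misspelt "//postgressql/host/1.2.3.4/databases/ex_database" yields "other" without any error being reported.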

Evidence

  • odd-platform-api/.../db/migration/V0_0_1__init.sql:66-71 — the data_entity table; oddrn varchar(255) UNIQUE. :246-247 CREATE UNIQUE INDEX ix_unique_data_entity_oddrn ON data_entity USING btree (oddrn); the database enforces one row per ODDRN.

  • odd-platform-api/.../service/ingestion/IngestionServiceImpl.java:86 — incoming entities are keyed by ODDRN (toMap(DataEntityIngestionDto::getOddrn, identity())); :93-98 listByOddrns(...) then partitioningBy(d -> existingPojoDict.containsKey(d.getOddrn())) splits the batch into update (present) vs create (absent) — idempotent ingest on the ODDRN.

  • odd-platform-api/.../repository/reactive/ReactiveDataEntityRepositoryImpl.java:228-241 — listByOddrns resolves existing rows via DATA_ENTITY.ODDRN.in(oddrns).

  • odd-platform-api/.../db/migration/V0_0_2__add_lineage.sql:1-7 — the lineage table; primary key is (parent_oddrn, child_oddrn) — edges are ODDRN pairs, no surrogate id.

  • odd-platform-api/.../repository/reactive/ReactiveLineageRepositoryImpl.java:121-131,150-176 — getLineageRelations walks the graph with DSL.withRecursive, seeding from …in(oddrns) and recursing by joining LINEAGE on the ODDRN columns.

  • odd-platform-api/.../controller/AlertManagerController.java:21-24 and .../service/AlertServiceImpl.java:178 — the Alertmanager webhook routes each alert to its entity by setDataEntityOddrn(externalAlert.getLabels().get("entity_oddrn")).

  • odd-platform-api/.../service/DataEntityGroupServiceImpl.java:191-195 and .../service/DataEntityLookupTableServiceImpl.java:247-251 — the platform mints ODDRNs for entities it creates itself (oddrnGenerator.generate(ODDPlatform…Path.builder().id(...).build())).

  • odd-platform-api/.../service/DirectoryServiceImpl.java:101-110 and .../utils/OddrnUtils.java:7 — an ODDRN whose prefix cannot be parsed falls back to UNKNOWN_DATASOURCE_TYPE = "other", the Directory catch-all; malformed ODDRNs degrade silently.

See also

Last updated