# ADR-0073: ODDRN is the universal identity for every entity

## Status

**Accepted.** Reconstructed from the codebase on 2026-05-31; the decision is live in the source today.

## Context

ODD Platform ingests metadata repeatedly and from many sources. The same real-world dataset is reported on every collector run, may be reported by more than one producer, and changes over time. On each payload the platform has to decide whether an incoming entity is new or one it already holds — and it has to decide that *without a prior handshake*: a collector reporting a table for the first time has never seen the platform's internal row id. Lineage compounds the problem, because an edge can join an entity in one source to an entity in another, so the two endpoints must be nameable independently of any single ingest. Alerts make it sharper still: the Alertmanager webhook arrives from an external system that knows the entity only as a string.

A surrogate database id cannot carry this identity. It is assigned by the platform *after* first contact, it is not stable across systems, and producers never learn it. The platform needs an identity that is **stable, producer-independent, and computable from the entity's real-world coordinates before the platform has ever seen the entity**.

## Decision

**Every entity is identified by an ODDRN — Open Data Discovery Resource Name — a stable string that encodes the entity's data-source family and connection coordinates** (for example `//postgresql/host/1.2.3.4/databases/ex_database/schemas/public/table/ex_table`). The ODDRN is the natural key: `data_entity.oddrn` is declared `UNIQUE` and backed by a dedicated unique index, so the database enforces one row per ODDRN. The internal `bigserial` id exists only for foreign keys inside the platform; the identity that crosses ingests, producers, and systems is the ODDRN.
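As a rough illustration of "computable from coordinates", the sketch below formats the PostgreSQL example above from its four coordinates. The class `OddrnSketch` and the method `postgresTable` are hypothetical names introduced here; this is not the API of the generator libraries, which remain the supported way to build ODDRNs.

```java
// Hypothetical illustration only: NOT the ODD generator library API.
final class OddrnSketch {
    private OddrnSketch() {}

    // Formats coordinates in the shape of the example ODDRN quoted in this ADR.
    // A real producer should call the generator library instead of concatenating strings.
    static String postgresTable(String host, String database, String schema, String table) {
        return "//postgresql/host/" + host
                + "/databases/" + database
                + "/schemas/" + schema
                + "/table/" + table;
    }
}
```

Two producers that pass the same four coordinates get the same string without ever contacting the platform. Two producers that disagree on `host` (say, a hostname versus a static IP) get two different strings, which the platform would store as two entities.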

The rule everywhere is **same ODDRN means the same entity**:

* **Ingestion is idempotent on the ODDRN.** The service keys incoming items by ODDRN, looks them up against existing rows (`listByOddrns`), and partitions the batch into *update* (ODDRN already present) and *create* (ODDRN absent). Re-reporting an entity updates it in place; it does not duplicate it.
* **Lineage edges are ODDRN pairs.** The `lineage` table's primary key is `(parent_oddrn, child_oddrn)` — there is no surrogate edge id — and the graph is walked by a recursive query that joins on those ODDRN columns, so lineage spans sources and producers by string identity alone.
* **Alerts route by ODDRN.** The Alertmanager webhook reads the `entity_oddrn` label off each external alert and stores it as the alert's target; the alert finds its entity by ODDRN, with no shared database id.
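The update-versus-create split in the first bullet can be sketched on plain strings. This is a simplified stand-in, not the service code: `IngestPartitionSketch` is a hypothetical name, `existingOddrns` plays the role of whatever `listByOddrns` returns, and the `distinct()` call is an assumption made here to keep the sketch tolerant of repeated input.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch of idempotent ingest keyed on the ODDRN.
final class IngestPartitionSketch {
    private IngestPartitionSketch() {}

    // Key true  -> ODDRN already stored: update the existing row in place.
    // Key false -> ODDRN not stored yet: create a new row.
    static Map<Boolean, List<String>> partition(Collection<String> incomingOddrns,
                                                Set<String> existingOddrns) {
        return incomingOddrns.stream()
                .distinct() // assumption: collapse repeats within one batch
                .collect(Collectors.partitioningBy(existingOddrns::contains));
    }
}
```

Running the same batch twice moves every ODDRN from the create side to the update side on the second run, which is what makes re-reporting safe.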

ODDRN is *universal* in both directions. Producers supply it for ingested entities: collectors and push adapters build it with the open-source generator libraries. The platform also **mints** it for entities it creates itself: Data Entity Groups and Lookup Tables both call the same Java generator to build an `//oddplatform/…` ODDRN once the row has an id. Every entity, whatever its origin, is addressable by exactly one ODDRN.

## Consequences

* Idempotent ingestion, cross-system lineage, and external alert routing all work with no prior id handshake, because the identity is a string the producer can compute on its own. The lineage walk itself is realised as a Postgres recursive CTE over the ODDRN pairs (see ADR-0071).
* **The identity burden moves to the producers.** Every producer reporting the same entity must emit the *exact same* ODDRN. Because the ODDRN encodes connection coordinates (host, account, region), two agents that disagree on a hostname or static IP for the same source mint different ODDRNs, and the platform then holds two entities where there is one. The generator libraries exist precisely to make these strings agree; hand-built ODDRNs are where it breaks.
* **Identity is coupled to coordinates.** If a source's connection coordinates change (for example, a database moves to a new host), its entities' ODDRNs change, and the platform treats the moved entities as new ones. History does not automatically follow the move.
* **A malformed or unrecognised ODDRN fails quietly, not loudly.** An entity whose ODDRN prefix the platform cannot parse as a known data-source family is grouped under a catch-all `other` bucket in the Directory rather than rejected — so a typo in a hand-built ODDRN surfaces as a misfiled entity, not an error.
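The quiet failure in the last bullet can be pictured with a small sketch. Only the constant `UNKNOWN_DATASOURCE_TYPE = "other"` comes from the codebase; the class name, the `knownFamilies` set, and the rule of reading the first path segment after `//` are assumptions made for illustration, since this ADR does not show how the platform parses the prefix.

```java
import java.util.Set;

// Hypothetical sketch of the Directory catch-all; parsing rule is an assumption.
final class DirectoryBucketSketch {
    static final String UNKNOWN_DATASOURCE_TYPE = "other";

    private DirectoryBucketSketch() {}

    // Returns the data-source family for a recognised prefix, or "other" otherwise.
    // Nothing is thrown: an unparseable ODDRN is misfiled rather than rejected.
    static String bucketFor(String oddrn, Set<String> knownFamilies) {
        if (oddrn == null || !oddrn.startsWith("//")) {
            return UNKNOWN_DATASOURCE_TYPE;
        }
        String rest = oddrn.substring(2);
        int slash = rest.indexOf('/');
        String family = slash < 0 ? rest : rest.substring(0, slash);
        return knownFamilies.contains(family) ? family : UNKNOWN_DATASOURCE_TYPE;
    }
}
```

Under these assumptions, a one-character typo such as `//postgresq1/…` lands in `other` with no error raised, so the mistake is only visible to someone browsing the Directory.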

## Evidence

* `odd-platform-api/.../db/migration/V0_0_1__init.sql:66-71` — the `data_entity` table; `oddrn varchar(255) UNIQUE`. `:246-247` — `CREATE UNIQUE INDEX ix_unique_data_entity_oddrn ON data_entity USING btree (oddrn)`; the database enforces one row per ODDRN.
* `odd-platform-api/.../service/ingestion/IngestionServiceImpl.java:86` — incoming entities are keyed by ODDRN (`toMap(DataEntityIngestionDto::getOddrn, identity())`); `:93-98` — `listByOddrns(...)` then `partitioningBy(d -> existingPojoDict.containsKey(d.getOddrn()))` splits the batch into update (present) vs create (absent) — idempotent ingest on the ODDRN.
* `odd-platform-api/.../repository/reactive/ReactiveDataEntityRepositoryImpl.java:228-241` — `listByOddrns` resolves existing rows via `DATA_ENTITY.ODDRN.in(oddrns)`.
* `odd-platform-api/.../db/migration/V0_0_2__add_lineage.sql:1-7` — the `lineage` table; primary key is `(parent_oddrn, child_oddrn)` — edges are ODDRN pairs, no surrogate id.
* `odd-platform-api/.../repository/reactive/ReactiveLineageRepositoryImpl.java:121-131,150-176` — `getLineageRelations` walks the graph with `DSL.withRecursive`, seeding from `…in(oddrns)` and recursing by joining `LINEAGE` on the ODDRN columns.
* `odd-platform-api/.../controller/AlertManagerController.java:21-24` and `.../service/AlertServiceImpl.java:178` — the Alertmanager webhook routes each alert to its entity by `setDataEntityOddrn(externalAlert.getLabels().get("entity_oddrn"))`.
* `odd-platform-api/.../service/DataEntityGroupServiceImpl.java:191-195` and `.../service/DataEntityLookupTableServiceImpl.java:247-251` — the platform mints ODDRNs for entities it creates itself (`oddrnGenerator.generate(ODDPlatform…Path.builder().id(...).build())`).
* `odd-platform-api/.../service/DirectoryServiceImpl.java:101-110` and `.../utils/OddrnUtils.java:7` — an ODDRN whose prefix cannot be parsed falls back to `UNKNOWN_DATASOURCE_TYPE = "other"`, the Directory catch-all; malformed ODDRNs degrade silently.
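The recursive lineage walk cited above runs in SQL through `DSL.withRecursive`. As an in-memory analogue only, the sketch below follows `(parent_oddrn, child_oddrn)` pairs breadth-first from a seed ODDRN. `LineageWalkSketch`, `downstream`, and the two-element array encoding of an edge are hypothetical, and the cycle guard is a choice made for this sketch rather than a statement about the repository code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory analogue of a recursive walk over ODDRN pairs.
final class LineageWalkSketch {
    private LineageWalkSketch() {}

    // Each edge is {parentOddrn, childOddrn}. Returns every ODDRN reachable
    // downstream of rootOddrn, in discovery order, excluding the root itself.
    static Set<String> downstream(String rootOddrn, List<String[]> edges) {
        Map<String, List<String>> childrenByParent = new HashMap<>();
        for (String[] edge : edges) {
            childrenByParent.computeIfAbsent(edge[0], k -> new ArrayList<>()).add(edge[1]);
        }
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(rootOddrn);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String child : childrenByParent.getOrDefault(current, Collections.<String>emptyList())) {
                // The visited set stops the walk from looping on cyclic edges.
                if (!child.equals(rootOddrn) && visited.add(child)) {
                    queue.add(child);
                }
            }
        }
        return visited;
    }
}
```

The walk never consults a database id: the strings on either side of an edge are the only join key, which is why an edge can connect entities reported by different producers.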

## See also

* [Main Concepts — ODDRN](/introduction/main-concepts.md#oddrn) — the canonical definition, format, generator libraries, and the same-string-same-entity limitation, written for the operator.
* [Data Lineage](/features/data-lineage.md) — the cross-system graph that ODDRN identity makes possible.
* [ADR-0071 — PostgreSQL is the only required runtime dependency](/developer-guides/architecture-decision-log/adr-0071-postgres-only-runtime-dependency.md) — the lineage graph keyed on ODDRN is walked by a Postgres recursive CTE, one instance of that posture.

