Custom metadata
Operator-curated metadata on data entities — a paired surface combining a deployment-wide field catalogue with per-entity value assignments. Surfaced on the entity's Overview tab.
Every data entity in the catalog carries a Custom metadata panel — a key/value list operators populate to capture per-entity facts the source system does not provide (cost-centre allocation, regulated-PII flag, downstream-consumer team, freshness SLA, anything you want to attach as a typed annotation). Two halves sit behind the panel: a field catalogue that defines the available keys for the deployment, and a per-entity value set that binds each key to a value for one specific entity.
This page covers both halves — how the catalogue is populated, how per-entity values are authored, how the two origins (operator-curated INTERNAL vs collector-ingested EXTERNAL) differ, the permission gates, and the load-bearing caveats: the silent-no-op write path, the dropped active flag, the absence of server-side type validation, the API path that overwrites EXTERNAL values, the unauthenticated catalogue enumeration, and the absence of activity-feed events on metadata mutations.
Where to find it
Open any data entity's detail page → Overview tab. The Metadata panel renders in the main column below the description and attachments panels. The panel shows a combined list — operator-curated fields and collector-ingested fields rendered side by side, distinguished by the field's Origin badge.
Operators with the DATA_ENTITY_CUSTOM_METADATA_CREATE permission see an Add affordance for assigning a new field value to the entity; the affordance opens an autocomplete picker over the deployment's field catalogue (with the option to type a new field name on miss — see the catalogue side-channel caveat below). Operators with _UPDATE see an in-place edit affordance on each existing value; operators with _DELETE see a remove affordance.
The two halves
Field catalogue (deployment-scoped vocabulary). A metadata_field table holds one row per field name + type pair the deployment knows about. The catalogue is shared across every entity in the deployment — the same cost_centre field name resolves to the same metadata_field.id for every entity that uses it. Reading the catalogue is what powers the autocomplete picker when an operator adds a new value to an entity, and the catalogue read is the only read of the metadata surface that does not go through the per-entity endpoint.
Per-entity value set (entity-scoped binding). A metadata_field_value table holds one row per (data_entity_id, metadata_field_id) pair — the binding that says "this entity has this field set to this value." Operators author each row through the per-entity write endpoints listed below; the read side ships the bound values back as part of the entity's detail-page payload.
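A minimal in-memory sketch may help fix the relationship between the two halves. The class, attribute, and function names below (MetadataField, MetadataFieldValue, Catalogue, bind) are illustrative assumptions, not the platform's code; only the table names and the (data_entity_id, metadata_field_id) key come from the description above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataField:
    """One row of the deployment-wide field catalogue (hypothetical shape)."""
    id: int
    name: str
    type: str
    origin: str


@dataclass
class MetadataFieldValue:
    """One per-entity binding, keyed by (data_entity_id, metadata_field_id)."""
    data_entity_id: int
    metadata_field_id: int
    value: str  # kept as a JSON-encoded string regardless of declared type


class Catalogue:
    """Toy stand-in for metadata_field: one row per name + type pair."""

    def __init__(self):
        self._by_key = {}

    def resolve_or_mint(self, name, type_):
        key = (name, type_)  # exact, case-sensitive match
        if key not in self._by_key:
            new_id = len(self._by_key) + 1
            self._by_key[key] = MetadataField(new_id, name, type_, "INTERNAL")
        return self._by_key[key]


catalogue = Catalogue()
values = {}  # (data_entity_id, metadata_field_id) -> MetadataFieldValue


def bind(entity_id, name, type_, value):
    """Bind a value to one entity, reusing the shared catalogue row if present."""
    field = catalogue.resolve_or_mint(name, type_)
    values[(entity_id, field.id)] = MetadataFieldValue(
        entity_id, field.id, json.dumps(value)
    )
    return field


# Two entities using the same field name share one catalogue row.
first = bind(101, "cost_centre", "STRING", "CC-42")
second = bind(202, "cost_centre", "STRING", "CC-77")
```

In this sketch both calls resolve to the same catalogue id while producing two separate value rows, which is the sharing behaviour the two paragraphs above describe.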
Field types and origin
Fields carry two pieces of metadata beyond the name: a type (the value's shape) and an origin (who owns the field's existence in the catalogue).
Supported field types — the type is set when the field is first minted in the catalogue and is immutable thereafter. Seven types are selectable when authoring a field:
STRING: free text.
INTEGER, FLOAT: numeric.
BOOLEAN: true/false.
DATETIME: ISO-8601 timestamp.
ARRAY: a list of strings (each element rendered as a chip in the value display).
JSON: arbitrary JSON document (rendered as a collapsible tree in the value display).
The platform's internal type enum carries one more value, UNKNOWN, beyond the seven above (the public API MetadataFieldType enum exposes only the seven). UNKNOWN is a defensive fallback the ingestion parser assigns when it cannot classify a collector-supplied value into one of the seven shapes — it is not offered when authoring a field and is not a type an operator chooses.
The per-entity value side stores the value as a JSON-encoded string regardless of declared type; the type drives the value-editor's input shape and the display formatter. The API does not enforce the declared type on write — see the caveat below.
Origin — two values, mutually exclusive per field:
INTERNAL — operator-curated. The field was minted by a catalog user authoring a value on an entity (see the auto-create-on-miss side-channel below) or by an explicit catalogue mutation. INTERNAL fields are the only ones surfaced in the autocomplete picker on the Add-value affordance.
EXTERNAL — collector-ingested. The field came in attached to an entity via the ingestion pipeline (a collector's adapter mapped a source-side property into ODD's metadata schema). EXTERNAL fields render alongside INTERNAL fields in the entity's Metadata panel but cannot be edited or added from the UI — they are owned by the source system and refreshed on every ingestion pass. The API does not enforce that boundary, though — see the EXTERNAL-origin caveat below.
When an operator views an entity's Metadata panel, both origins render in the same list. The UI distinguishes them with an inline origin marker; the Add affordance only writes INTERNAL.
Field naming is case-sensitive
Field names in the catalogue are case-sensitive. cost_centre and Cost_centre are two distinct rows in metadata_field; an operator who types Cost_centre into the autocomplete picker, sees no match, and accepts the auto-create-on-miss side-channel (below) mints a parallel field that operators searching for cost_centre will not find. The autocomplete query uses a case-insensitive substring match for the suggestion list, but the resolution against the catalogue is by exact-string match — autocomplete saves a few keystrokes; it does not protect against case-drift duplicates.
Treat field names as a controlled vocabulary that benefits from a documented naming convention (snake_case is the most common in deployments we have seen). A naming-convention drift is the most common source of "I added this field on entity X yesterday but I can't find it in the autocomplete on entity Y today" reports.
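The gap between suggestion matching and catalogue resolution can be sketched as follows. This is an illustration under stated assumptions, not the platform's implementation: the function names are hypothetical, and the catalogue is modelled as a plain list of names. The case_drift_candidates helper is an extra a writer-side tool could run before accepting an auto-create.

```python
def suggest(catalogue_names, query):
    """Case-insensitive substring match, as used for the suggestion list."""
    q = query.casefold()
    return [name for name in catalogue_names if q in name.casefold()]


def resolve(catalogue_names, name):
    """Exact-string match, as used when resolving against the catalogue."""
    return name if name in catalogue_names else None


def case_drift_candidates(catalogue_names, name):
    """Existing names that differ from `name` only by letter case."""
    folded = name.casefold()
    return [n for n in catalogue_names if n != name and n.casefold() == folded]


names = ["cost_centre", "freshness_sla"]
suggestions = suggest(names, "Cost")               # suggestion list finds a match
resolved = resolve(names, "Cost_centre")           # exact resolution does not
drift = case_drift_candidates(names, "Cost_centre")
```

A non-empty drift list is a signal to reuse the existing spelling rather than mint a parallel field.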
Authoring per-entity values
Three operations exist on the per-entity side, each gated by a distinct permission:
Create one or more new field values on an entity (each carrying the field name + type + value): POST /api/dataentities/{id}/metadata, gated by DATA_ENTITY_CUSTOM_METADATA_CREATE.
Update an existing field value on an entity (by field id): PUT /api/dataentities/{id}/metadata/{metadata_field_id}, gated by DATA_ENTITY_CUSTOM_METADATA_UPDATE.
Delete a field value from an entity (by field id): DELETE /api/dataentities/{id}/metadata/{metadata_field_id}, gated by DATA_ENTITY_CUSTOM_METADATA_DELETE.
The Create path takes a list of field objects (name, type, value) in the request body — each entry either resolves against an existing INTERNAL field in the catalogue (matched by exact name + type) or mints a new INTERNAL field in the catalogue on the spot. The Update and Delete paths are by field id and operate on the per-entity value row only — they never touch the catalogue.
There is no operator-facing catalogue-maintenance UI: INTERNAL field rows are created as a side effect of the Create-value path, and the catalogue read endpoint (GET /api/metadata/fields) returns the full INTERNAL set with optional substring-filter parameter for autocomplete.
Permissions
Three permissions gate the per-entity surface; the catalogue read is not gated by a custom-metadata permission (see the unauthenticated-enumeration caveat below).
DATA_ENTITY_CUSTOM_METADATA_CREATE: the Add affordance + POST /api/dataentities/{id}/metadata. Includes the auto-create-on-miss side-channel that mints new INTERNAL fields in the catalogue.
DATA_ENTITY_CUSTOM_METADATA_UPDATE: the edit affordance on each value row + PUT /api/dataentities/{id}/metadata/{metadata_field_id}.
DATA_ENTITY_CUSTOM_METADATA_DELETE: the remove affordance on each value row + DELETE /api/dataentities/{id}/metadata/{metadata_field_id}.
All three are scoped to the data entity in the URL — granting _CREATE on entity X does not grant it on entity Y. The catalogue read (GET /api/metadata/fields) is reachable by every authenticated caller and, under auth.type=DISABLED, by every anonymous caller (see DISABLED authentication).
Activity trail
Custom-metadata mutations emit no Activity Feed event today. The ActivityEventTypeDto enum carries CUSTOM_METADATA_CREATED, CUSTOM_METADATA_UPDATED, and CUSTOM_METADATA_DELETED values, but no code path emits any of them — they are dead enum entries. Operators looking at an entity's Activity tab will see description updates, tag updates, owner updates, and term assignments, but not metadata-value changes. See the caveat in the next section for the forensic-silence implications.
Known limitations and operator caveats
The Update path is a silent no-op when the value row does not pre-exist. The platform's PUT /api/dataentities/{id}/metadata/{metadata_field_id} endpoint declares its operation as upsertDataEntityMetadataFieldValue in the OpenAPI spec, but the repository call behind it is a pure SQL UPDATE against metadata_field_value keyed on (data_entity_id, metadata_field_id) — no INSERT ... ON CONFLICT fallback. If the row does not exist (the field has never been assigned a value on this entity), the UPDATE matches zero rows, returns nothing, and the controller propagates an empty Mono. The HTTP response is 200 OK with an empty body; the UI toast reads "Metadata successfully updated." even though nothing was written.
What this means in practice. Reconciliation pipelines that issue PUTs assuming upsert semantics (the operationId implies replace-or-create) silently lose writes for any field not previously assigned on the target entity. The bootstrap path that does work is POST /api/dataentities/{id}/metadata (the Create endpoint), which mints the field on the catalogue and binds the value to the entity in one call.
Mitigation today. Issue a GET preflight against the entity's metadata before any PUT; if the field id is not in the response, switch to a POST. The platform-side fix (true upsert semantics, or rejecting the PUT with a meaningful error when no row matches) is on the roadmap; until it lands, the preflight is the operator-side workaround.
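The preflight routing can be sketched as a small function. This is a sketch under assumptions: the three callables are hypothetical wrappers around whatever HTTP client the integration uses, the field dict shape is invented for illustration, and the POST body is assumed to be a bare list of name/type/value objects.

```python
from typing import Callable, Iterable


def write_field_value(
    entity_id: int,
    field: dict,
    get_assigned_field_ids: Callable[[int], Iterable[int]],
    put_value: Callable[[int, int, str], None],
    post_values: Callable[[int, list], None],
) -> str:
    """Route a write to PUT or POST depending on a GET preflight.

    `field` is a hypothetical dict with keys "id" (may be None),
    "name", "type", and "value". Returns the verb used, for logging.
    """
    assigned = set(get_assigned_field_ids(entity_id))  # the GET preflight
    field_id = field.get("id")
    if field_id is not None and field_id in assigned:
        # A value row exists, so the UPDATE behind PUT will match it.
        put_value(entity_id, field_id, field["value"])
        return "PUT"
    # No row yet: PUT would be a silent no-op, so use the Create path.
    body = [{key: field[key] for key in ("name", "type", "value")}]
    post_values(entity_id, body)
    return "POST"
```

The check and the write are two separate calls, so a concurrent delete between them can still turn the PUT into a no-op; a re-read after the write is the only way to confirm the value landed.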
Every successful Update silently sets the active column on the value row to NULL. The service layer constructs the persistence pojo without calling setActive(...); the Java Boolean field stays null. The repository's UPDATE writes the null verbatim into the row's active column, overwriting whatever value the row carried previously. The column's database DEFAULT TRUE only fires on INSERT — it does not protect UPDATE — so every edited row ends up with active IS NULL.
What this means in practice. Any downstream code (in the platform, in a future feature, or in an external query) that filters WHERE active = TRUE will silently drop edited rows. The currently-shipping platform code does not appear to filter on active for the value rows, so this caveat is latent rather than user-visible today — but it is a foot-gun for anyone querying the table directly, building an external analytics view over it, or relying on future platform code to honour the column. Use WHERE active IS DISTINCT FROM FALSE (treats null as active) rather than WHERE active = TRUE when querying the table outside the platform's own service code.
The platform-side fix is to either call setActive(true) on the service-layer pojo before the UPDATE, or to exclude the active column from the UPDATE's SET clause entirely.
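The difference between the two predicates in this caveat comes down to how SQL treats NULL. The sketch below models that behaviour in plain Python, with None standing in for NULL; the row data is invented for illustration.

```python
def matches_equals_true(active):
    """Model of `WHERE active = TRUE`: NULL = TRUE is NULL, so the row is dropped."""
    return active is True


def matches_is_distinct_from_false(active):
    """Model of `WHERE active IS DISTINCT FROM FALSE`: NULL counts as active."""
    return active is not False


rows = [
    {"id": 1, "active": True},   # never edited, inserted with DEFAULT TRUE
    {"id": 2, "active": None},   # edited through the Update path
    {"id": 3, "active": False},  # explicitly inactive
]

strict = [r["id"] for r in rows if matches_equals_true(r["active"])]
null_safe = [r["id"] for r in rows if matches_is_distinct_from_false(r["active"])]
```

The strict filter loses the edited row, while the null-safe filter keeps it and still excludes the explicitly inactive one.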
The API does not validate a value against its field's declared type. The declared type (INTEGER, BOOLEAN, DATETIME, and so on) drives the UI value editor and the display formatter, but the write endpoints store whatever string the request body carries — there is no server-side type check. A POST or PUT can persist "not a number" on an INTEGER field or "maybe" on a BOOLEAN field, and the platform accepts it with a 200.
What this means in practice. The UI editor is the only thing enforcing type shape; an SDK client, a curl, or a reconciliation pipeline writing directly to the API can land type-violating values that then render through a formatter expecting the declared type. Validate the value shape on the writer side before the call — the platform will not reject a mismatch for you.
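A writer-side check might look like the following sketch. It assumes values travel as strings and picks one reasonable rule per type; the rules, the CHECKS table, and the is_valid name are illustrative choices, not the UI editor's actual validation logic.

```python
import json
import math
import re
from datetime import datetime

_INTEGER = re.compile(r"[+-]?[0-9]+")


def _parses(parser, raw):
    try:
        parser(raw)
        return True
    except ValueError:
        return False


def _is_finite_float(raw):
    try:
        return math.isfinite(float(raw))  # rejects "nan" and "inf"
    except ValueError:
        return False


def _is_iso_datetime(raw):
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    # and accepts only a subset of ISO-8601 in older versions.
    candidate = raw[:-1] + "+00:00" if raw.endswith("Z") else raw
    return _parses(datetime.fromisoformat, candidate)


def _is_string_list(raw):
    try:
        doc = json.loads(raw)
    except ValueError:
        return False
    return isinstance(doc, list) and all(isinstance(x, str) for x in doc)


CHECKS = {
    "STRING": lambda raw: True,
    "INTEGER": lambda raw: _INTEGER.fullmatch(raw) is not None,
    "FLOAT": _is_finite_float,
    "BOOLEAN": lambda raw: raw in ("true", "false"),
    "DATETIME": _is_iso_datetime,
    "ARRAY": _is_string_list,
    "JSON": lambda raw: _parses(json.loads, raw),
}


def is_valid(field_type, raw):
    """Return True if the string `raw` fits the declared field type."""
    if field_type not in CHECKS:
        raise ValueError(f"unsupported field type: {field_type}")
    return isinstance(raw, str) and CHECKS[field_type](raw)
```

Calling is_valid("INTEGER", "not a number") or is_valid("BOOLEAN", "maybe") returns False, so a pipeline can skip or log the write instead of persisting a value the formatter cannot render.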
The API lets an operator overwrite an EXTERNAL (collector-ingested) value; only the UI hides it. The UI suppresses edit affordances on EXTERNAL fields, but the per-entity write endpoints (POST /api/dataentities/{id}/metadata and PUT /api/dataentities/{id}/metadata/{metadata_field_id}) do not check the field's origin. An operator with DATA_ENTITY_CUSTOM_METADATA_UPDATE can write a value onto an EXTERNAL field through the API directly.
What this means in practice. The overwrite is not durable — the next ingestion pass for that entity replaces the collector-owned value again, so a hand-edited EXTERNAL value silently reverts on the next collector run. Treat EXTERNAL fields as read-only in any integration even though the API does not enforce it; if a value needs to change permanently, change it at the source the collector ingests from.
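An integration can enforce the read-only convention itself with a guard along these lines. The sketch assumes each field object the client holds carries an "origin" key with the value INTERNAL or EXTERNAL; that key name and the ExternalFieldError class are hypothetical.

```python
class ExternalFieldError(Exception):
    """Raised when an integration tries to write a collector-owned field."""


def assert_writable(field: dict) -> dict:
    """Return the field unchanged if it is operator-curated, else raise.

    Anything that is not explicitly INTERNAL is refused, so a missing
    origin is treated as unsafe rather than writable.
    """
    origin = field.get("origin")
    if origin != "INTERNAL":
        raise ExternalFieldError(
            f"refusing to write {field.get('name')!r}: origin is {origin!r}"
        )
    return field
```

Running the guard immediately before every POST or PUT keeps a hand-edit from landing on a value the next collector run would revert anyway.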
The catalogue read has no permission gate (and no authentication at all under auth.type=DISABLED) and is unbounded, and the per-entity Create path mints new INTERNAL fields visible to every authenticated user. Two compounding shapes here:
GET /api/metadata/fields has no entry in the platform's security rules; the path falls through to the default .authenticated() matcher, so every authenticated caller can list the catalogue. Under auth.type=DISABLED (no authentication required at all), the endpoint is reachable by every anonymous caller too. The response carries every INTERNAL field name in the deployment.
The same endpoint's SQL has no LIMIT, no OFFSET, and no ORDER BY clause. Every call returns the entire catalogue. The response's PageInfo is theatre: total is computed as items.size() on every call (so it always equals the response length, not the catalogue size) and hasNext is hardcoded false. SDK clients written from the OpenAPI spec build "load more" infinite-scroll workflows that never fire; the catalogue ships as a single response per call.
What this means in practice. Deployments with operator-named field schemas (finance_cost_centre, marketing_attribution, pii_redaction_rule, aml_review_status, anything that names team-internal taxonomy in the field name) leak the full vocabulary to every authenticated user — and to every anonymous user under DISABLED. A user with DATA_ENTITY_CUSTOM_METADATA_CREATE on a single entity can mint a new INTERNAL field through the Create-value path that becomes visible to every other user on their next autocomplete keystroke.
Production deployments with 10K+ INTERNAL field rows pay a 1–2 MB response on every autocomplete keystroke (the endpoint accepts a query parameter, but the filter is applied server-side after fetching the unbounded result set; there is no DB-level early termination).
Mitigation today. Treat custom metadata field names as deployment-public. If a field name itself encodes sensitive information about a team's workflow or taxonomy, do not put it in custom metadata — author it in a system the platform does not enumerate. Grant DATA_ENTITY_CUSTOM_METADATA_CREATE only to operators trusted to mint new vocabulary; the auto-create-on-miss side-channel makes the permission an indirect grant of catalogue-write access. The platform-side fix (introducing a CUSTOM_METADATA_FIELD_READ permission + adding a security rule for the catalogue endpoint + paginating the SQL + computing real total and hasNext) is on the roadmap.
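For SDK clients, one way to live with the single-response behaviour is to fetch the catalogue once, ignore PageInfo, and filter locally. The sketch below is a client-side suggestion rather than a documented platform feature; CatalogueCache, the fetch_all callable, and the five-minute default TTL are all assumptions.

```python
import time
from typing import Callable, List, Optional


class CatalogueCache:
    """Caches the full field-name list and answers suggestions locally."""

    def __init__(
        self,
        fetch_all: Callable[[], List[str]],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_all = fetch_all  # hypothetical wrapper around the catalogue GET
        self._ttl = ttl_seconds
        self._clock = clock
        self._names: List[str] = []
        self._loaded_at: Optional[float] = None

    def names(self) -> List[str]:
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            # One full fetch per TTL window instead of one per keystroke.
            self._names = list(self._fetch_all())
            self._loaded_at = now
        return self._names

    def suggest(self, query: str) -> List[str]:
        q = query.casefold()
        return [name for name in self.names() if q in name.casefold()]
```

The trade-off is staleness: a field minted by another user will not appear in suggestions until the TTL expires, so a short TTL suits deployments where vocabulary changes often.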
Custom-metadata mutations leave no audit trail in the Activity Feed. Three dead enum values exist in the platform's ActivityEventTypeDto — CUSTOM_METADATA_CREATED, CUSTOM_METADATA_UPDATED, CUSTOM_METADATA_DELETED — but no code path emits any of them. The entity's Activity tab shows other mutations (description, tags, owners, terms) but not metadata-value writes or deletes. Same forensic-silence pattern as the DEG-membership write paths and the DATA_ENTITY_RELATION_UPDATED dead enum.
For compliance teams that need a who-changed-what-when trail on custom metadata, instrument it externally: an API-gateway access log records the authenticated POST / PUT / DELETE calls, and PostgreSQL audit logging via pgaudit records the metadata_field_value row writes. See Audit trail scope for the compensating-controls catalogue across every silent-mutation surface the platform carries today.
Where to next
Entity description — the sibling per-entity Overview surface; same Add / Edit affordance shape but a single free-text Markdown field rather than a typed key/value catalogue. Carries its own load-bearing caveat (no write-time HTML sanitisation across six Markdown surfaces).
Data entity detail page — the parent container for the Metadata panel; covers how the panel composes with the rest of the Overview tab.
Activity Feed — the audit trail for entity-level mutations, and the canonical home for the forensic-silence framing that custom-metadata writes share with DEG-membership writes.
Audit trail scope — the compliance-facing summary of what the platform audits today and what it does not, including the compensating controls for the silent-mutation surfaces.
Permissions: the canonical home for the three DATA_ENTITY_CUSTOM_METADATA_* permissions and the full per-resource gating story.