Data Entity Statuses
Data Entity Statuses — UNASSIGNED / DRAFT / STABLE / DEPRECATED / DELETED lifecycle on every catalogued entity, plus the soft-delete TTL configured by the platform's housekeeping job.
Every catalogued entity carries a status that signals where it is in its lifecycle — newly ingested, in active use, deprecated for a planned removal, or soft-deleted pending permanent purge. The status is operator-set, surfaces on the Catalog page as a Search facet, and drives the Activity Feed's DATA_ENTITY_STATUS_UPDATED event.
The five statuses
UNASSIGNED
The default. When metadata collectors ingest data entities into the platform's database, every new entity lands as UNASSIGNED until an operator changes it.
Set by: platform default; an operator can override it from the entity detail page.
DRAFT
The data entity is a draft or test entity in the data source and is not yet ready for downstream consumption. The UI lets operators set a time period after which the status auto-transitions to DELETED.
Set by: operator.
STABLE
The data entity is stable and fully operational, safe for downstream pipelines and BI reports.
Set by: operator.
DEPRECATED
A warning marker: the entity is deprecated for planned removal. The UI lets operators set a time period after which the status auto-transitions to DELETED.
Set by: operator.
DELETED
Soft-deleted. The entity is hidden from the Management → Datasources view by default and only reachable through filter-with-deleted-items.
Set by: operator (manual), or auto-transition from DRAFT / DEPRECATED once the operator-set time period expires.
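The lifecycle rules above can be sketched as a small model. This is an illustration only, not the platform's implementation: the Entity class and the set_status and apply_scheduled_switch helpers are hypothetical names, and only the five status names and the status_switch_time field come from this page.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class Status(Enum):
    UNASSIGNED = "UNASSIGNED"
    DRAFT = "DRAFT"
    STABLE = "STABLE"
    DEPRECATED = "DEPRECATED"
    DELETED = "DELETED"


# Only these two statuses offer the "switch to DELETED after a time period" option.
AUTO_DELETE_SOURCES = {Status.DRAFT, Status.DEPRECATED}


@dataclass
class Entity:
    """Hypothetical stand-in for a catalogued entity."""
    name: str
    status: Status = Status.UNASSIGNED  # every newly ingested entity starts here
    status_switch_time: Optional[datetime] = None


def set_status(entity: Entity, new_status: Status,
               switch_at: Optional[datetime] = None) -> None:
    """Operator-driven status change, optionally scheduling a later flip to DELETED."""
    if switch_at is not None and new_status not in AUTO_DELETE_SOURCES:
        raise ValueError("a scheduled switch is only offered for DRAFT and DEPRECATED")
    entity.status = new_status
    entity.status_switch_time = switch_at


def apply_scheduled_switch(entity: Entity, now: datetime) -> bool:
    """Flip the entity to DELETED if its scheduled switch time is in the past."""
    overdue = (
        entity.status in AUTO_DELETE_SOURCES
        and entity.status_switch_time is not None
        and entity.status_switch_time < now
    )
    if overdue:
        entity.status = Status.DELETED
        entity.status_switch_time = None
    return overdue
```

In this sketch STABLE and UNASSIGNED never move to DELETED on their own; an operator has to set DELETED by hand.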


The soft-delete TTL
DELETED is soft delete. The platform's housekeeping job permanently deletes a DELETED entity (and every cascading row attached to it — metadata values, ownerships, lineage, tags, terms, alerts, messages, metrics, attachment files including objects in S3 / MinIO storage, task runs, group relations, and dataset structure / enum values for datasets) only after the entity's status-update timestamp is older than housekeeping.ttl.data_entity_delete_days. The default window is 30 days.
During the soft-delete window the operator can flip the status back (DELETED → STABLE or any non-deleted status) and the entity reappears in the catalog with its history intact. After the TTL expires the next housekeeping run hard-deletes the row; once that happens, recovery requires re-ingest from the source.
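The retention arithmetic can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, the status is passed as a plain string, and the comparison assumes "older than" means strictly older than the configured number of days.

```python
from datetime import datetime, timedelta, timezone

# Documented default for housekeeping.ttl.data_entity_delete_days.
DEFAULT_DELETE_DAYS = 30


def purge_deadline(status_updated_at: datetime,
                   delete_days: int = DEFAULT_DELETE_DAYS) -> datetime:
    """Moment after which a DELETED entity becomes eligible for hard delete."""
    return status_updated_at + timedelta(days=delete_days)


def is_purge_eligible(status: str, status_updated_at: datetime, now: datetime,
                      delete_days: int = DEFAULT_DELETE_DAYS) -> bool:
    """True when the next housekeeping run would be expected to hard-delete the entity."""
    return status == "DELETED" and now > purge_deadline(status_updated_at, delete_days)


deleted_at = datetime(2024, 3, 1, tzinfo=timezone.utc)
print(purge_deadline(deleted_at).date())  # 2024-03-31
```

An entity soft-deleted on 1 March is still restorable on 30 March under the default window. Once the deadline has passed, the actual purge happens whenever the housekeeping job next runs.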
To change the retention window, see Housekeeping Settings Configuration → housekeeping.ttl.data_entity_delete_days. Raise the value before any planned bulk deprecation if the audit trail matters for compliance.
The 30-day purge is permanent and cascades to object storage — DELETED is not long-term parking. On a default install (housekeeping.enabled: true, housekeeping.ttl.data_entity_delete_days: 30), once a DELETED entity's status-update timestamp is older than the configured window, the next housekeeping run hard-deletes it: the data_entity row, its cascading rows across the child tables (lineage, metadata, ownerships, tags, terms, alerts, messages, metrics, task runs, group relations, and dataset structure / enum values), and its attachment files in object storage — including objects in S3 / MinIO. There is no restore path once this runs; recovering the entity means re-ingesting it from its source.
Do not treat DELETED as an archive or a recycle bin. The restore-by-flipping-status escape hatch works only within the retention window — after the window elapses the entity and its attachments are gone. To retire entities but keep them recoverable for longer, raise housekeeping.ttl.data_entity_delete_days (or disable the housekeeping job) before parking anything in DELETED. See Housekeeping Settings Configuration for the key and the partial-override risk that can shrink the window to zero.
Known defect — the status_updated_at timestamp is not refreshed on non-DELETED status transitions. A guard in the platform's status mapper (DataEntityMapperImpl.applyStatus) tests the new status id against a value it has just overwritten with that same id, so the branch that re-stamps status_updated_at never runs. The timestamp is therefore left unchanged when an entity moves between DRAFT / STABLE / DEPRECATED or is restored from DELETED. This is cosmetic and does not affect the soft-delete retention clock above: the transition to DELETED runs through a separate soft-delete code path that stamps status_updated_at correctly, so the 30-day purge always measures retention from the actual deletion time.
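The guard pattern described above is easier to see in code. The following Python sketch is an illustration of that class of bug, not the platform's Java source: the dictionary-based entity and both function names are hypothetical, and the "compare first" variant only shows the ordering that would let the branch run, not a confirmed fix.

```python
from datetime import datetime, timezone


def apply_status_overwrite_first(entity: dict, new_status_id: int, now: datetime) -> dict:
    """Buggy ordering: the field is overwritten before it is compared."""
    entity["status_id"] = new_status_id
    if entity["status_id"] != new_status_id:  # always False after the overwrite
        entity["status_updated_at"] = now     # never reached
    return entity


def apply_status_compare_first(entity: dict, new_status_id: int, now: datetime) -> dict:
    """Comparing against the old value first lets the re-stamp branch run."""
    changed = entity["status_id"] != new_status_id
    entity["status_id"] = new_status_id
    if changed:
        entity["status_updated_at"] = now
    return entity


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
t1 = datetime(2024, 2, 1, tzinfo=timezone.utc)
stale = apply_status_overwrite_first({"status_id": 2, "status_updated_at": t0}, 3, t1)
print(stale["status_updated_at"] == t0)  # True: the timestamp was not refreshed
```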
Known limitation — the scheduled DRAFT / DEPRECATED → DELETED auto-flip has no per-tick batch cap. When an operator schedules a future status switch (the auto-transition to DELETED after N days affordance on DRAFT and DEPRECATED), the platform records status_switch_time on the entity and a background job picks up every entity whose status_switch_time is in the past and flips it. The job fires every 10 minutes under a 9-minute distributed lock (ShedLock), and the underlying query selecting candidate entities has no LIMIT — every overdue entity is processed in a single transaction.
A bulk operation that schedules (say) 5,000 entities for a near-future flip triggers one transaction that processes all 5,000 entity records plus their cascade fan-out. If the transaction runs longer than the 9-minute lock window, the lock releases mid-flight, the next tick acquires its own lock, and the effective cadence drifts from "every 10 minutes" to "as fast as one large transaction can finish." There is no per-batch metric or operator-side dial to throttle it.
When scripting bulk DRAFT / DEPRECATED assignments with auto-transition, stagger status_switch_time values across multiple windows (e.g. spread 5,000 entities across 50 ten-minute windows of 100 each) to avoid the burst. A platform-side per-tick LIMIT is on the roadmap.
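A staggering plan like the one suggested above could be computed as follows. This is a sketch, not a platform utility: stagger_switch_times and its parameters are hypothetical, and how the resulting status_switch_time values are submitted to the platform is deliberately left out.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Hashable, Iterable


def stagger_switch_times(entity_ids: Iterable[Hashable],
                         first_window: datetime,
                         per_window: int = 100,
                         window: timedelta = timedelta(minutes=10)) -> Dict[Hashable, datetime]:
    """Assign each entity a status_switch_time, with at most per_window ids per window."""
    if per_window < 1:
        raise ValueError("per_window must be at least 1")
    return {
        entity_id: first_window + (index // per_window) * window
        for index, entity_id in enumerate(entity_ids)
    }


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
plan = stagger_switch_times(range(5000), start)
print(len(set(plan.values())))  # 50 distinct windows of 100 entities each
```

With the defaults, 5,000 entities spread over 50 ten-minute windows, so the last batch becomes due 490 minutes after the first. Smaller batches keep each tick's transaction well inside the lock window.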
Status changes propagate to data sources
If a data entity and its parent data source are both DELETED and an operator then flips the entity status back to a visible state, the data source itself also reverts to its original (non-deleted) state. The platform mirrors the entity-level status change up to the data-source row when the two are in lockstep.
DELETED-state read-only surface
When an entity's status flips to DELETED, the platform makes it intentionally read-only in the UI. Several edit affordances disappear from the detail page in the same render — silently, without an info banner — until the status is flipped back to a non-deleted state.
The affordances hidden in DELETED state:
Add / Edit business name: detail-page header, next to the entity name.
Edit group (manually-created Data Entity Groups only): detail-page header, right-hand action strip.
Edit tags: Overview tab → Tags sidebar panel.
Add to group: Overview tab → Groups sidebar panel.
The entity name, the class / type badges, the status badge, the Share to Slack button (when Data Collaboration is enabled), the read-only data on every panel, and every tab below the header all continue to render normally.
To edit a DELETED entity, restore it first. Soft-deleted entities are intentionally read-only — the platform hides the edit affordances rather than letting an operator modify a record that is queued for permanent deletion. Flip the status back to STABLE, DRAFT, or DEPRECATED from the status badge (the badge itself is always interactive for users with the DATA_ENTITY_STATUS_UPDATE permission, even in DELETED state); the hidden affordances reappear in the next render.
The read-only treatment lasts for the whole soft-delete window (see The soft-delete TTL above). Once the TTL elapses and the housekeeping job hard-deletes the row, the detail page itself returns a 404; restoring is only possible during the soft-delete window.
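The visibility rules in this section reduce to a small decision function. The sketch below only mirrors what the text states: both function names are hypothetical, and any additional permission checks the UI may apply to individual edit affordances are not modelled.

```python
from typing import List

# The four edit affordances listed above as hidden in DELETED state.
EDIT_AFFORDANCES = (
    "Add / Edit business name",
    "Edit group",
    "Edit tags",
    "Add to group",
)


def visible_edit_affordances(status: str, manually_created_group: bool = False) -> List[str]:
    """Edit affordances rendered on the detail page for the given entity status."""
    if status == "DELETED":
        return []  # soft-deleted entities are read-only
    return [
        name for name in EDIT_AFFORDANCES
        # "Edit group" only exists for manually-created Data Entity Groups.
        if name != "Edit group" or manually_created_group
    ]


def status_badge_interactive(has_status_update_permission: bool) -> bool:
    """The badge depends only on DATA_ENTITY_STATUS_UPDATE, not on the current status."""
    return has_status_update_permission
```

Because the badge check ignores the status, a permitted operator can always restore a DELETED entity from the badge, after which the edit affordances return.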
Detail-header authoring caveats
A few smaller behaviours on the detail-page header that are non-obvious on first encounter:
The business-name button toggles its label between "Add business name" and "Edit". Both labels open the same dialog. If the entity has no internalName set, the button reads Add business name; once one is set, it reads Edit. Train teammates to look for either label rather than only Add business name. The Overview tab's Tags row follows the same pattern (Add tags vs Edit tags).
The status badge renders identically for users without DATA_ENTITY_STATUS_UPDATE, but clicking it does nothing. The badge has no lock icon, no cursor: not-allowed, and no "you can't change this" tooltip; non-permission holders see the same chip everyone else does and discover they can't interact with it by clicking. If your deployment uses strict RBAC, grant DATA_ENTITY_STATUS_UPDATE explicitly to the roles that should change status (see Permissions).
The "Share" button is silently absent unless Data Collaboration is enabled. The Slack share affordance is gated by the DATA_COLLABORATION feature toggle (default off). When the toggle is off the button is unmounted entirely: no "feature disabled" placeholder, no install hint. After enabling Data Collaboration, the button appears on every entity detail page in the next page load.
Group statuses
Status assignments apply to entity groups as well as individual entities. When an operator changes a status on a Data Entity Group, the UI offers the option to apply the change to the group as a whole or to cascade it to every member entity. Use the group-level status when you want to deprecate or retire an entire pipeline; use member-level when only specific members are affected.
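The two apply-to choices can be sketched as follows. This is an illustration under assumptions: the dictionary shape and the apply_group_status name are hypothetical, and the sketch assumes that cascading also updates the group's own status.

```python
from typing import Dict


def apply_group_status(group: Dict, new_status: str, cascade_to_members: bool) -> Dict:
    """Set a status on a group and, optionally, on every member entity."""
    group["status"] = new_status
    if cascade_to_members:
        for member in group["members"]:
            member["status"] = new_status
    return group


pipeline = {
    "name": "nightly_orders",
    "status": "STABLE",
    "members": [{"name": "orders_raw", "status": "STABLE"},
                {"name": "orders_clean", "status": "STABLE"}],
}
apply_group_status(pipeline, "DEPRECATED", cascade_to_members=True)
print([m["status"] for m in pipeline["members"]])  # ['DEPRECATED', 'DEPRECATED']
```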


Where the status surfaces
Catalog page filter: Statuses is one of the seven Search facets. Selecting one or more statuses narrows the result set to entities matching any of them.
Figure: Statuses facet on the Catalog page
Activity Feed: every status change emits a DATA_ENTITY_STATUS_UPDATED event. The feed is the audit trail for who set what status and when.
Figure: Status-update event on the Activity feed
Entity detail page: the current status is shown next to the entity name, with the edit affordance gated by the DATA_ENTITY_STATUS_UPDATE permission (see Permissions).
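The match-any behaviour of the Statuses facet can be expressed in a few lines. This is a sketch of the semantics only, not the platform's search code: filter_by_status is a hypothetical name, and treating an empty selection as "no filtering" is an assumption.

```python
from typing import Dict, Iterable, List


def filter_by_status(entities: Iterable[Dict], selected_statuses: Iterable[str]) -> List[Dict]:
    """Keep entities whose status matches any of the selected statuses."""
    selected = set(selected_statuses)
    if not selected:
        return list(entities)  # assumption: nothing selected means no narrowing
    return [entity for entity in entities if entity["status"] in selected]


catalog = [
    {"name": "orders", "status": "STABLE"},
    {"name": "orders_tmp", "status": "DRAFT"},
    {"name": "orders_v1", "status": "DEPRECATED"},
]
print([e["name"] for e in filter_by_status(catalog, ["DRAFT", "DEPRECATED"])])
# ['orders_tmp', 'orders_v1']
```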
RBAC
Operators need the DATA_ENTITY_STATUS_UPDATE permission to change a data entity's status. The permission is part of the Data entity permissions group; see Permissions for the full list and how to compose roles around it.
Where to next
Search and Filtering: the Statuses facet that surfaces these on the Catalog page.
Activity Feed: the audit trail of every status change (DATA_ENTITY_STATUS_UPDATED).
Data Entity Groups & Domains: the group-vs-member status apply-to choice.
Data entity detail page: the composition of the detail page that hosts every header affordance documented above.
Housekeeping Settings Configuration: the soft-delete TTL operator-side reference.
Permissions: the DATA_ENTITY_STATUS_UPDATE permission.
Data Discovery overview: the bucket landing this page sits under.