# Manual Object Tagging

Tags are the platform's lightweight labelling mechanism — apply them to tables, datasets, columns, and quality tests to drive faceted search, organise the catalog by domain or stage, and signal special handling (`PII`, `Important`, `Deprecated`).

This page is the canonical home for the **day-to-day side** of tagging: how operators apply, browse, and filter by tags. The vocabulary side (curating the tag list, setting the Important flag, managing namespace-scoped tags) lives at [Management → Tags](/features/management.md).

## What tags are

A tag is an operator-curated label that can be attached to any data entity or to any individual column. Tags drive:

* **Discovery.** The Tag facet on [Search and Filtering](/features/data-discovery/search.md) is one of the seven catalog filters; selecting one or more tags narrows the search to entities carrying any of them.
* **Organisation.** Tags are how operators encode lightweight, cross-cutting groupings that do not justify their own [Data Entity Group](/features/data-discovery/groups-domains.md).
* **Catalog Overview surfacing.** The most-used tags surface as the **Top tags** chip strip on the Catalog Overview home page — one-click filter into the catalog.
* **Important-flag visibility.** A tag flagged as **Important** in [Management → Tags](/features/management.md) is rendered visually distinct on entity pages and result rows, surfacing high-priority labels (`PII`, `Restricted`, `Deprecated`) without requiring operators to scan every tag chip.

## Applying tags

Apply tags both to data assets as a whole and to individual columns of datasets.

![](/files/3r4fQBsjG3S6HpZxYNY4)

The same UI flow applies at both granularities — open the entity (or column) detail surface, click the tag-management control, and pick from the existing tag vocabulary or create a new tag inline.

The platform exposes three `TAG_*` RBAC permissions:

| Permission   | Action                                      |
| ------------ | ------------------------------------------- |
| `TAG_CREATE` | Create a new tag in the catalog vocabulary. |
| `TAG_UPDATE` | Edit a tag's name or its Important flag.    |
| `TAG_DELETE` | Remove a tag from the catalog vocabulary.   |

Separately, the platform emits a cross-cutting `TAG_ASSIGNMENT_UPDATED` activity event whenever tag assignments on an entity change. This is **not** an RBAC permission: it is an entry on the `ActivityEventType` enum, surfaced on the [Activity Feed](/features/active-platform-features/activity-feed.md) for the affected entity, and it does not gate who can mutate tags.

For the platform-wide permission catalog and how to compose roles around these permissions, see [Permissions](/configuration-and-deployment/enable-security/authorization/permissions.md).

## Tag-driven discovery

Once tags are applied, three discovery paths rely on them:

* [**Search → Tag facet**](/features/data-discovery/search.md) — multi-select filter; results match entities carrying any of the selected tags.
* **Catalog Overview → Top tags** — one-click chip strip filtering into the catalog by the most-used tags across the deployment, rendered on the home page.
* **Tag-based per-entity badges** — tags appear on entity detail pages and in search result rows; Important-flagged tags render visually distinct.
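
The "any of the selected tags" rule of the Tag facet can be illustrated with a minimal sketch. It is not the platform's implementation: the `catalog` dictionary, the function names, and the treatment of an empty selection as "no filter" are all assumptions made for illustration.

```python
def matches_tag_facet(entity_tags, selected_tags):
    """Return True when the entity carries at least one selected tag (OR semantics)."""
    if not selected_tags:
        return True  # assumption: an empty selection applies no tag filter
    return not set(entity_tags).isdisjoint(selected_tags)


def filter_catalog(catalog, selected_tags):
    """Return the sorted names of entities that pass the tag facet."""
    return sorted(
        name for name, tags in catalog.items()
        if matches_tag_facet(tags, selected_tags)
    )


# Hypothetical catalog: entity name -> tags attached to it.
catalog = {
    "orders": ["PII", "finance"],
    "clicks": ["marketing"],
    "legacy_users": ["Deprecated", "PII"],
    "untagged_table": [],
}

# Selecting two tags widens the result set; it does not narrow it.
print(filter_catalog(catalog, ["PII", "marketing"]))
```

Because the comparison is an exact string match, selecting `Finance` in this sketch would not return `orders`, which is tagged `finance`.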

## Operator workflow

The full lifecycle of a tag spans three steps across two surfaces by design: vocabulary management (step 1) and day-to-day catalog use (steps 2 and 3).

1. **Author the vocabulary** — go to [Management → Tags](/features/management.md) to create the canonical tag list, set the Important flag where appropriate, and govern the vocabulary across teams.
2. **Apply tags** — on entity detail pages, attach tags from the curated vocabulary to specific entities and columns.
3. **Narrow searches** — use the Tag facet on the Catalog page to find tagged entities.

The split follows the user action. This page covers **applying tags to entities and finding entities by tag**. The [Management → Tags](/features/management.md) page is where operators **create and edit the tag vocabulary itself**: renaming tags, deleting them, and marking them as `Important` for higher list ordering.

## Known limitations and operator caveats

A few behaviours of the tagging surface are non-obvious from the UI alone. Each item below states what an operator might assume, what actually happens, and what to do today.

{% hint style="info" %}
**Fixed in 0.28.0: "Top tags" and the Tag-facet seed list now rank by true popularity.** Releases up to 0.27.x truncated the tag directory to the requested page size **before** computing per-tag usage, with the window ordered by `tag.id`. Once the directory exceeded the page size, the strip showed the oldest tags re-ranked among themselves, and younger, more-used tags never appeared. In the empirical case of 35 tags with `size=30`, the 5 youngest were absent regardless of usage. As of 0.28.0 the platform aggregates usage over the full directory first, orders by usage count with tag id as a deterministic tiebreak, and only then paginates, so the endpoint's "sorted by popularity" promise holds past one page and page boundaries are stable. No operator action is needed; the pre-0.28.0 workaround (querying tag-to-entity relations directly for governance reviews) is no longer necessary.
{% endhint %}
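
The difference between the two orderings can be reproduced with a small sketch. The function names and the synthetic usage counts are assumptions chosen to mirror the 35-tag, `size=30` case described above; this is not the platform's query.

```python
def top_tags_pre_028(tag_ids, usage, size):
    """Pre-0.28.0 behaviour: truncate by tag id first, then rank the window by usage."""
    window = sorted(tag_ids)[:size]
    return sorted(window, key=lambda tag_id: (-usage[tag_id], tag_id))


def top_tags_fixed(tag_ids, usage, size, page=0):
    """0.28.0 behaviour: rank the full directory by usage (tag id as tiebreak), then paginate."""
    ranked = sorted(tag_ids, key=lambda tag_id: (-usage[tag_id], tag_id))
    return ranked[page * size:(page + 1) * size]


# Synthetic directory: 35 tags, where younger tags (higher ids) are used more.
tag_ids = list(range(1, 36))
usage = {tag_id: tag_id for tag_id in tag_ids}

old_page = top_tags_pre_028(tag_ids, usage, size=30)
new_page = top_tags_fixed(tag_ids, usage, size=30)

# The five youngest tags never reach the old result, whatever their usage.
print(sorted(set(tag_ids) - set(old_page)))
# The fixed ordering leads with the most-used tag.
print(new_page[0])
```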

{% hint style="warning" %}
**Five paths mint new tags into the global tag directory, not only `TAG_CREATE`.** An operator restricting `TAG_CREATE` to "vocabulary stewards" might assume that closes the directory to free-form additions. It does not. Every surface listed in the table that follows this note silently creates a new tag row for any name that does not already exist in the catalog.

The three `*_TAGS_UPDATE` permissions and the collector ingestion path all call the platform's shared `getOrCreateTagsByName` helper, which creates rows for any novel names before attaching them to the target entity. Any user holding per-entity / per-term / per-column tag-update on a single entity (or any collector ingestion) can therefore enlarge the global tag vocabulary visible to every user via `GET /api/tags`, the Top-Tags strip, and the Tag-facet seed list.

**Mitigation today:** if vocabulary governance matters in your deployment, withhold the `*_TAGS_UPDATE` permissions from rank-and-file users; do not rely on `TAG_CREATE` alone. The collector ingestion path is not gated by RBAC and cannot be locked down through permissions — restrict it via the upstream collector configuration or by reviewing ingested tags periodically.
{% endhint %}

| Surface                                                      | Permission gating the surface           | Effect on the tag directory                                                              |
| ------------------------------------------------------------ | --------------------------------------- | ---------------------------------------------------------------------------------------- |
| `POST /api/tags`                                             | `TAG_CREATE`                            | The documented path.                                                                     |
| `PUT /api/dataentities/{id}/tags`                            | `DATA_ENTITY_TAGS_UPDATE`               | A novel tag name on an entity mints a new tag in the directory.                          |
| `PUT /api/terms/{id}/tags`                                   | `TERM_TAGS_UPDATE`                      | A novel tag name on a term mints a new tag in the directory.                             |
| `PUT /api/datasetfields/{id}/tags`                           | `DATASET_FIELD_TAGS_UPDATE`             | A novel tag name on a column mints a new tag in the directory.                           |
| Collector ingestion (`ExternalTagIngestionRequestProcessor`) | Collector token (no per-tag permission) | An ingested entity carrying tag names that do not yet exist mints them in the directory. |
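
The get-or-create pattern behind these side channels can be modelled in a few lines. The sketch below is an in-memory stand-in, not the platform's `getOrCreateTagsByName` code: the `TagDirectory` class, its method names, and the `update_entity_tags` function are hypothetical.

```python
class TagDirectory:
    """In-memory stand-in for the global tag directory (hypothetical model)."""

    def __init__(self):
        self._ids_by_name = {}

    def get_or_create_by_name(self, names):
        """Return ids for the given names, creating a row for every unknown name."""
        ids = []
        for name in names:
            if name not in self._ids_by_name:
                # No permission check happens here: the caller was only
                # authorised for the tag-update action on its target.
                self._ids_by_name[name] = len(self._ids_by_name) + 1
            ids.append(self._ids_by_name[name])
        return ids

    def names(self):
        return list(self._ids_by_name)


def update_entity_tags(directory, entity_tags, entity_id, names):
    """Model of a tag-update call: attach tags by name to one entity."""
    entity_tags[entity_id] = directory.get_or_create_by_name(names)


directory = TagDirectory()
directory.get_or_create_by_name(["PII"])  # the curated vocabulary

entity_tags = {}
update_entity_tags(directory, entity_tags, entity_id=7, names=["PII", "my-adhoc-tag"])

# The directory grew even though no explicit tag-creation call was made.
print(directory.names())
```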

{% hint style="info" %}
**Tag names are case-sensitive — `finance` and `Finance` are two separate tags.** The platform stores tag names verbatim. Two tags with names that differ only in capitalisation are distinct rows; entities tagged with one are not surfaced by a Tag-facet filter on the other. When seeding the catalog vocabulary on Management → Tags, settle a casing convention up front (uniform lowercase, Title-case, or all-uppercase) and audit `GET /api/tags` periodically for accidental near-duplicates — particularly after a collector ingestion run, which often emits framework-specific casing different from the operator-curated style.
{% endhint %}
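
A periodic audit for casing near-duplicates could look like the sketch below. It assumes the tag names have already been extracted from `GET /api/tags` into a plain list of strings; the response shape is not shown here, and the function name is hypothetical.

```python
from collections import defaultdict


def find_case_collisions(tag_names):
    """Group tag names that differ only in capitalisation."""
    groups = defaultdict(set)
    for name in tag_names:
        groups[name.casefold()].add(name)
    # Keep only the groups that contain more than one spelling.
    return {
        key: sorted(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }


names = ["finance", "Finance", "PII", "pii", "Deprecated"]
for key, variants in find_case_collisions(names).items():
    print(f"{key}: {variants}")
```

Each reported group is a candidate for consolidation under the casing convention chosen for the deployment.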

{% hint style="warning" %}
**Tag names are stored verbatim — there is no server-side trim, length cap, or character-set restriction on any write path.** None of the tag write paths normalises the incoming name: the create form's OpenAPI schema (`TagFormData.name`) is a bare string with no `maxLength` or `pattern`, and the shared service helper writes the raw name straight to the directory row. Two consequences beyond the casing caveat above:

* **Leading / trailing whitespace mints a distinct row.** Because matching is exact-string, `' tag '` (with surrounding spaces) and `'tag'` are two separate tags — the same trap as `finance` vs `Finance`, but harder to spot.
* **The global tag directory is a pollution / DoS surface.** Arbitrarily long or arbitrary-character names are accepted, and an over-long name or a flood of near-identical whitespace variants lands in `GET /api/tags`, the Catalog **Top tags** strip, and the Search Tag-facet seed list — surfaces every user sees, with no cap to bound them.

**Mitigation today:** settle a naming + casing + no-surrounding-whitespace convention up front, and audit `GET /api/tags` periodically for whitespace / over-long near-duplicates (especially after a collector ingestion run). The upstream platform fix is a server-side trim + length cap + a database `CHECK` constraint; a separate, sibling input-validation gap on the dataset-statistics ingestion endpoint is tracked independently.
{% endhint %}
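
Until a server-side check exists, the whitespace and length audit can be done client-side. The sketch below is one possible shape for it: the 64-character limit is an arbitrary assumption rather than a platform value, and the function names are hypothetical.

```python
MAX_TAG_LENGTH = 64  # hypothetical local convention, not a platform limit


def audit_tag_name(name, max_length=MAX_TAG_LENGTH):
    """Return the list of convention violations for one tag name."""
    problems = []
    if name != name.strip():
        problems.append("surrounding whitespace")
    if len(name) > max_length:
        problems.append("over-long")
    if any(not ch.isprintable() for ch in name):
        problems.append("non-printable character")
    return problems


def audit_directory(tag_names, max_length=MAX_TAG_LENGTH):
    """Map each offending tag name to its problems; clean names are omitted."""
    report = {}
    for name in tag_names:
        problems = audit_tag_name(name, max_length)
        if problems:
            report[name] = problems
    return report


print(audit_directory(["tag", " tag ", "x" * 80]))
```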

{% hint style="warning" %}
**The audit trail for tag changes is non-uniform across the three tag-assign endpoints.** The three nominally symmetric tag-assign actions listed in the table that follows this note emit three different things to the [Activity Feed](/features/active-platform-features/activity-feed.md).

Entity and dataset-field tag changes are both fully audited: each event carries the before-and-after tag lists, under two different event types (`TAG_ASSIGNMENT_UPDATED` and `DATASET_FIELD_TAGS_UPDATED`). The gap is the term path: term tag changes are not in the Activity Feed at all and are observable only by polling the term's current tag list and diffing externally. Compliance / audit workflows depending on tag-change history must instrument the term path separately until the platform-side fix lands.
{% endhint %}

| Action                                                                    | Audit-feed event                                                                                      |
| ------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------- |
| Tagging a **data entity** (`PUT /api/dataentities/{id}/tags`)             | Emits a `TAG_ASSIGNMENT_UPDATED` event scoped to the entity, capturing the before-and-after tag list. |
| Tagging a **dataset field** / column (`PUT /api/datasetfields/{id}/tags`) | Emits a `DATASET_FIELD_TAGS_UPDATED` event capturing the before-and-after tag list.                   |
| Tagging a **term** (`PUT /api/terms/{id}/tags`)                           | Emits **no** activity event today.                                                                    |
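
For the term path, the external "poll and diff" instrumentation could be built around a snapshot comparison such as the one below. This is a sketch under assumptions: how the term's current tag list is fetched is left out, and the function names and the shape of the history record are hypothetical.

```python
def diff_tags(before, after):
    """Compare two snapshots of a term's tag names."""
    before_set, after_set = set(before), set(after)
    return {
        "added": sorted(after_set - before_set),
        "removed": sorted(before_set - after_set),
    }


def record_term_tag_change(history, term_id, before, after, observed_at):
    """Append an audit record only when the tag list actually changed."""
    change = diff_tags(before, after)
    if change["added"] or change["removed"]:
        history.append({"term_id": term_id, "observed_at": observed_at, **change})
    return history


history = []
record_term_tag_change(history, 12, ["PII"], ["PII", "finance"], "2024-01-01T00:00:00Z")
record_term_tag_change(history, 12, ["PII", "finance"], ["PII", "finance"], "2024-01-01T01:00:00Z")
print(history)
```

A poll-based record only shows that a change happened between two observations. Unlike an Activity Feed event, it does not show who made the change or exactly when.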

## Where to next

* [Data entity detail page](/features/data-discovery/entity-detail-page.md) — the per-entity surface where the sidebar Tags panel lives and Important-flagged tags render visually distinct on entity rows.
* [Search and Filtering](/features/data-discovery/search.md) — where the Tag facet narrows the catalog.
* [Data Entity Groups & Domains](/features/data-discovery/groups-domains.md) — the heavier-weight grouping mechanism for related entities (datasets, transformers, quality tests).
* [Management](/features/management.md) — the operator-mutating side: tag vocabulary curation, Important flag, namespace scoping.
* [Activity Feed](/features/active-platform-features/activity-feed.md) — the audit trail for `TAG_ASSIGNMENT_UPDATED` + `DATASET_FIELD_TAGS_UPDATED` events (read the audit-asymmetry caveat above before relying on it).
* [Permissions](/configuration-and-deployment/enable-security/authorization/permissions.md) — the platform-wide permission catalog, including the three `TAG_*` rows plus the three `*_TAGS_UPDATE` side-channel rows.