# Data Entity Groups & Domains

A **Data Entity Group (DEG)** is the platform's logical-grouping primitive: a catalog-level container that gathers related data entities (datasets, transformers, quality tests, consumers) under one umbrella and carries its own metadata, owners, tags, and terms. A **Domain** is a particular use of a DEG: a DEG flagged as a domain appears in the Domains section of the Catalog Overview home page as a top-level discovery surface.

This page covers both framings: DEGs as the underlying primitive, and Domains as DEGs that an operator has flagged for home-page discovery.

## What a DEG is

Create groups to gather similar entities (datasets, transformers, quality tests, etc.). Each group can be enriched with specific metadata, owners, and [terms](/features/data-glossary/business-glossary.md).

**Example.** An organisation has ingested metadata related to its finances into the ODD Platform. All the entities are united into the Finance **Namespace** by default. To categorise entities, one creates Revenue and Payrolls DEGs.

![](/files/IhwfVgWbYEYX0a7C2D4O)

A DEG is itself a Data Entity (of type `DATA_ENTITY_GROUP`) — it has a detail page, an ODDRN, and participates in [lineage](/features/data-lineage/data-objects.md) (Group lineage returns the union of the group's children's lineage).

## The Domain framing

Flagging a DEG as a **domain** unlocks one extra surface: the Catalog Overview page renders a **Domains** section listing all domain-flagged DEGs as quick-jump tiles. This makes domain-flagged DEGs the platform's first-class discovery axis on the home page.

Operationally:

1. Create or open a DEG.
2. Use the DEG's edit surface to flag it as a domain.
3. The DEG now appears on the Catalog Overview's Domains section (which is conditional — the section appears only when at least one domain-flagged DEG exists).

Domain-flagged DEGs are also reachable through the [Search](/features/data-discovery/search.md) Groups facet, where they appear alongside non-domain DEGs.

## DEG metadata

A DEG carries the same metadata model as any other Data Entity — owners, tags, terms, descriptions, statuses — at the group level. The convention is:

* **Owners**: set on the DEG to mark stewardship at the group level instead of duplicating ownership on each child.
* **Tags**: tags applied at the DEG level surface the group in Tag-faceted searches.
* **Terms**: link [glossary terms](/features/data-glossary/business-glossary.md) to the DEG to capture meaning.
* **Description**: narrative text describing what the group represents.

Children of the DEG keep their own metadata; the DEG-level metadata supplements it rather than replacing it.

## Relationship to ML Experiments

In ODD, an **ML experiment** is a Data Entity Group of class `ML_EXPERIMENT` that collects the entities produced by a training run — input datasets, feature tables, training jobs, model instances, and resulting model artifacts — into one logical container. Lineage, ownership, tags, and alerts follow through at the experiment level instead of being scattered across each child entity.

ML experiments are DEGs of a particular shape — the same primitive, used for a specific workflow.

{% hint style="info" %}
ML experiments in ODD are a **catalog view** over the assets that participated in the run. The platform does not track metrics, compare runs, or select a "best" model — it has no experiment-tracking UI or API of its own. For tracking, keep using MLflow, Weights & Biases, Comet, or your tool of choice, and push the resulting entities (datasets, models, runs) into ODD through a push adapter or the [ODD Specification](/introduction/main-concepts.md#odd-specification) so the experiment and its lineage are browsable alongside the rest of your data platform.
{% endhint %}

Each experiment's training-run inputs and outputs participate in the catalog-wide [lineage graph](/features/data-lineage/data-objects.md) — an operator opening an ML experiment's Lineage tab sees its dataset / model / run-level edges alongside the rest of the data platform.

## Managing DEG Membership

Membership of a manually-created DEG is mutated through two `Data Entity` controller endpoints; the surface is bound to the **child entity**, not the parent group.

| Method   | Path                                                              | Permission                      |
| -------- | ----------------------------------------------------------------- | ------------------------------- |
| `POST`   | `/api/dataentities/{data_entity_id}/data_entity_group`            | `DATA_ENTITY_ADD_TO_GROUP`      |
| `DELETE` | `/api/dataentities/{data_entity_id}/data_entity_group/{group_id}` | `DATA_ENTITY_DELETE_FROM_GROUP` |

Both permissions are scoped against the **child** `data_entity_id` in the URL — the platform's authorisation rules do **not** consult the parent DEG. A caller holding `DATA_ENTITY_ADD_TO_GROUP` against entity X can place X into any manually-created DEG in the catalog, regardless of who owns that DEG.
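
The two calls in the table can be sketched as plain request builders. This is a minimal sketch, not an official client: `BASE_URL` is a placeholder, the JSON body shape (`{"data_entity_group_id": ...}`) is an assumption based on the parameter name used on this page, and authentication headers are omitted.

```python
import json
from urllib.request import Request

# Assumption: placeholder address; replace with your ODD Platform URL.
BASE_URL = "http://localhost:8080"


def build_add_request(data_entity_id: int, group_id: int, base_url: str = BASE_URL) -> Request:
    """Build the POST that places a child entity into a manually created DEG."""
    # Assumption: the body is a JSON object carrying `data_entity_group_id`.
    body = json.dumps({"data_entity_group_id": group_id}).encode("utf-8")
    return Request(
        url=f"{base_url}/api/dataentities/{data_entity_id}/data_entity_group",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_remove_request(data_entity_id: int, group_id: int, base_url: str = BASE_URL) -> Request:
    """Build the DELETE that removes a child entity from a DEG."""
    return Request(
        url=f"{base_url}/api/dataentities/{data_entity_id}/data_entity_group/{group_id}",
        method="DELETE",
    )
```

In both builders the child's `data_entity_id` sits in the URL path, which is the id the permission check is scoped against. Passing either `Request` to `urllib.request.urlopen` would send it.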

{% hint style="danger" %}
**DEG membership is write-collaborative — there is no per-DEG authorisation today.** The two permissions above are bound to the child entity in the URL; the parent group id is not consulted by the authorisation layer. In a multi-team deployment that uses DEGs (or the Domain flag) to model organisational boundaries (`Finance Domain`, `Marketing Domain`, `Engineering Reference`), any caller with `DATA_ENTITY_ADD_TO_GROUP` against an entity they own can place that entity into another team's DEG — including domain-flagged DEGs visible on the Catalog Overview home page. There is no DEG-side gate, no per-DEG owner check, no notification to the DEG's stewards.

**Operator mitigation today:** treat DEG membership as **collaborative by design** rather than as private-namespace-style isolation. Use a naming convention (`finance-internal-…`) and operator policy (a wiki page, a team-charter section) rather than relying on the platform's RBAC to enforce DEG-membership ownership. A per-DEG permission scope is tracked as an upstream platform fix; until then, the platform's contract is "anyone with `DATA_ENTITY_ADD_TO_GROUP` on a child can place it into any DEG."
{% endhint %}
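
The rule described above can be modelled in a few lines. This is an illustrative model only, not the platform's authorisation code; the `Grant` tuple and the `may_add_to_group` helper are hypothetical names introduced here.

```python
from typing import Set, Tuple

# Hypothetical representation of a grant: (caller, data_entity_id, permission).
Grant = Tuple[str, int, str]


def may_add_to_group(grants: Set[Grant], caller: str, data_entity_id: int, group_id: int) -> bool:
    """Model of the documented rule: only the child entity is checked."""
    # `group_id` is deliberately unused: the parent DEG plays no part in the decision.
    return (caller, data_entity_id, "DATA_ENTITY_ADD_TO_GROUP") in grants
```

With a single grant for entity `42`, the model allows that entity to be placed into any group id, which is why a naming convention and team policy have to carry the boundary.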

{% hint style="warning" %}
**DEG-membership changes emit no Activity Feed event today, so auditors cannot trace "who added entity X to DEG Y."** Both `addDataEntityToDEG` and `deleteDataEntityFromDEG` are transactional but **not annotated with `@ActivityLog`**. The Activity Feed event type `DATA_ENTITY_RELATION_UPDATED` exists in the platform's enum but is a **dead value**: it is listed in the event-type enumeration, yet no code path emits it today (see [Activity Feed → Scope](/features/active-platform-features/activity-feed.md) for the structural framing). Per-entity Activity tabs and the global Activity page show no record of membership changes.

**Operator mitigation today:** if you need an audit trail of DEG membership, instrument it externally: PostgreSQL `pgaudit` on the membership tables (`group_entity_relations`-shaped writes), or an API-gateway log on the two endpoints above. Platform-side activity events for membership changes are on the roadmap but have not shipped.
{% endhint %}
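
For the API-gateway option, the filter that picks out membership calls might look like the sketch below. It assumes numeric ids and a gateway that exposes method, path, caller, and response status to a hook; `audit_record` and its field names are hypothetical. The sketch also assumes the target group id of a `POST` travels in the request body, so it is not recoverable from the path alone.

```python
import re
from typing import Optional

# Path shapes taken from the endpoint table on this page; numeric ids are an assumption.
_ADD = re.compile(r"^/api/dataentities/(?P<entity>\d+)/data_entity_group/?$")
_REMOVE = re.compile(
    r"^/api/dataentities/(?P<entity>\d+)/data_entity_group/(?P<group>\d+)/?$"
)


def audit_record(method: str, path: str, caller: str, status: int) -> Optional[dict]:
    """Return an audit record for a DEG-membership call, or None for any other request."""
    method = method.upper()
    if method == "POST" and (m := _ADD.match(path)):
        # Assumption: the group id is in the request body, so it is unknown at this point.
        return {"action": "add", "entity_id": int(m["entity"]), "group_id": None,
                "caller": caller, "status": status}
    if method == "DELETE" and (m := _REMOVE.match(path)):
        return {"action": "remove", "entity_id": int(m["entity"]),
                "group_id": int(m["group"]), "caller": caller, "status": status}
    return None
```

A gateway hook would call `audit_record` per request and ship any non-`None` result to the log store of your choice; capturing the `POST` body alongside it would fill in the missing group id.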

{% hint style="warning" %}
**Add and Delete are asymmetric on idempotence — reconciliation scripts must branch.** Calling `POST` on an entity already in the target DEG raises **HTTP 400** with the body `Data entity is already in this DEG`. Calling `DELETE` on an entity that is not in the target DEG silently returns **HTTP 204 No Content** — the no-op success path. A reconciliation script that idempotently asserts "entity X is in DEG Y" cannot use the `POST` endpoint blindly (it will see 400s on every re-run); the safe pattern is a `GET` preflight on the entity's current group membership followed by `POST` only when missing. The `DELETE` side is naturally idempotent and does not need the preflight.

Combined with the forensic silence above, the caller cannot tell from either the HTTP response or the Activity Feed whether a `DELETE` call actually changed anything: the no-op and the genuine removal both return the same `204`.
{% endhint %}
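
The preflight pattern described above can be sketched as follows. The page does not name the `GET` endpoint that returns an entity's current groups, so the transport is injected: `fetch_group_ids`, `add_to_group`, and `remove_from_group` are hypothetical callables you would back with your own HTTP client. The success status of `POST` is also not stated here, so the sketch treats any status below 400 as success.

```python
from typing import Callable, Iterable


def ensure_member(
    entity_id: int,
    group_id: int,
    fetch_group_ids: Callable[[int], Iterable[int]],
    add_to_group: Callable[[int, int], int],
) -> str:
    """Assert "entity is in group" without tripping the 400 on re-runs."""
    if group_id in set(fetch_group_ids(entity_id)):
        return "already-member"  # skip the POST: it would answer HTTP 400
    status = add_to_group(entity_id, group_id)
    if status >= 400:
        raise RuntimeError(f"POST failed with HTTP {status}")
    return "added"


def ensure_not_member(
    entity_id: int,
    group_id: int,
    remove_from_group: Callable[[int, int], int],
) -> str:
    """DELETE needs no preflight: 204 covers both real removals and no-ops."""
    status = remove_from_group(entity_id, group_id)
    if status != 204:
        raise RuntimeError(f"DELETE failed with HTTP {status}")
    return "removed-or-absent"
```

The preflight and the `POST` are not atomic, so a concurrent writer can still cause a 400 between the two calls; a script that runs in parallel with other writers should be prepared to re-check membership when that happens.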

A few smaller behaviours worth knowing before scripting DEG membership:

* **The optional `data_entity_group_id` parameter is not validated.** Missing or malformed values produce the generic `id null` error message rather than a typed `BadRequest`. Direct API callers should defensively validate the parameter before submitting.
* **Empty DEGs persist after the last member is removed.** Deleting the only member of a DEG leaves an empty group entity in the catalog — there is no automatic cleanup. Operators retiring a DEG must explicitly delete it through the DEG's own entity-detail surface after emptying its membership.
* **Under `auth.type=DISABLED`, both endpoints are reachable anonymously.** The membership endpoints inherit the platform's DISABLED-mode no-auth posture (see [DISABLED authentication](/configuration-and-deployment/enable-security/authentication/disabled-authentication.md)). Don't run DISABLED in production if DEG membership matters for organisational boundaries.
* **The 400 response on `POST` conflates three failure modes.** "Entity already in DEG", "target is not a manually-created DEG", and "invalid request body shape" all produce the same generic 400 with the same human-readable message. Operators debugging a failing `POST` should check the entity's current membership first (most common cause), then the target's `type` (must be a manually-created DEG, not an ingested one), then the request body shape.
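
Two of the behaviours above lend themselves to small client-side helpers. The sketch below assumes group ids are positive integers, which this page does not state; `validate_group_id` and `triage_post_400` are hypothetical names, and the triage inputs are facts the caller has to look up separately.

```python
def validate_group_id(value: object) -> int:
    """Reject missing or malformed ids client-side, before the generic `id null` error."""
    if value is None:
        raise ValueError("data_entity_group_id is required")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, str)):
        raise ValueError(f"data_entity_group_id must be an integer, got {value!r}")
    try:
        group_id = int(value)
    except ValueError:
        raise ValueError(f"data_entity_group_id must be an integer, got {value!r}") from None
    if group_id <= 0:  # assumption: ids are positive
        raise ValueError(f"data_entity_group_id must be positive, got {group_id}")
    return group_id


def triage_post_400(already_member: bool, target_is_manual_deg: bool) -> str:
    """Walk the three causes of a 400 on POST in the order recommended above."""
    if already_member:
        return "entity already in DEG"
    if not target_is_manual_deg:
        return "target is not a manually created DEG"
    return "check request body shape"
```

Calling `validate_group_id` before building the request turns a vague server-side error into a typed local one, and `triage_post_400` simply encodes the debugging order so that scripts and runbooks agree.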

## Group lineage

The dedicated [Group lineage](/features/data-lineage/data-objects.md#group-lineage) endpoint returns the lineage graph for the DEG's *children*, not the DEG itself. This is what an operator usually wants when reasoning about a domain or pipeline group — *"what does the Finance domain depend on, and what depends on it?"* is a question about the union of the children's edges.
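
The "union of the children's edges" idea can be illustrated with plain sets. This is a conceptual model, not the endpoint's implementation: the `Edge` type, the `group_lineage` helper, and the ODDRN-like strings in the usage example are all hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical edge shape: (source identifier, target identifier).
Edge = Tuple[str, str]


def group_lineage(children: Iterable[str], lineage_by_entity: Dict[str, Set[Edge]]) -> Set[Edge]:
    """Union the lineage edges of every child; the group itself contributes none."""
    edges: Set[Edge] = set()
    for child in children:
        edges |= lineage_by_entity.get(child, set())
    return edges


# Usage with made-up identifiers: two children sharing one upstream edge.
lineage = {
    "//example/revenue": {("//example/raw_orders", "//example/revenue")},
    "//example/payrolls": {
        ("//example/raw_orders", "//example/revenue"),
        ("//example/hr_feed", "//example/payrolls"),
    },
}
finance_edges = group_lineage(["//example/revenue", "//example/payrolls"], lineage)
```

Because the result is a set, an edge shared by several children appears once, which matches the intuition of asking what the whole domain depends on.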

## Where to next

* [Data entity detail page](/features/data-discovery/entity-detail-page.md) — the per-entity surface where the sidebar Groups panel lives, plus the visible-window truncation caveat on Groups membership.
* [Search and Filtering → Groups facet](/features/data-discovery/search.md) — narrow the catalog to entities that are members of selected DEGs.
* [Data Lineage → Data Objects → Group lineage](/features/data-lineage/data-objects.md#group-lineage) — the dedicated endpoint that returns DEG-children lineage.
* [Manual Object Tagging](/features/data-discovery/tagging.md) — the lighter-weight labelling counterpart for cross-cutting groupings that do not justify a DEG.
* [Main Concepts → Terms & Aliases — Data Entity Group](/introduction/main-concepts.md#terms-aliases) — the canonical-term reference.

