# Audit trail scope

ODD Platform's audit posture is **architecturally bifurcated**. There are two recorded streams — the data-entity activity feed and the owner-association request log — and **everything else** (RBAC mutations, owner lifecycle, term lifecycle, namespace lifecycle, datasource lifecycle, collector lifecycle) leaves **no recoverable trace** on the platform side. The gap is not a missing-annotation oversight; it is rooted in the platform's `activity` table schema and would require a coordinated schema migration to close.

This page is the single reference for compliance auditors, security reviewers, and operators with SOX / HIPAA / GDPR / SOC2 obligations who need to know what the platform records, what it does not record, and what compensating controls to apply when the platform's recorded scope does not match an audit requirement.

The [Activity Feed](/features/active-platform-features/activity-feed.md) page documents the operational UI for the positive-half audit stream; this page covers the scope of both halves — including the negative half — so an operator can make informed decisions before depending on the platform's audit trail for compliance reporting.

## What IS audited (positive half)

| Subject                                                                                                                      | Where it is stored                                                                       | Event vocabulary                                                                                                                                                                                   | Operator surface                                                                                                 |
| ---------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------- |
| Data entity changes (descriptions, tags, terms, ownership, metadata, alerts, custom-metadata, statuses, dataset-field edits) | `activity` table — partitioned by `created_at`, every row scoped to one `data_entity_id` | 27-value `ActivityEventTypeDto` enum (see the [Activity Feed → Event types](/features/active-platform-features/activity-feed.md#event-types) section for the full list with operator descriptions) | Per-entity Activity tab and the global [Activity Feed](/features/active-platform-features/activity-feed.md) page |
| Owner-association request workflow — user submits an association request, admin accepts or declines, sync events             | `owner_association_request_activity` table (dedicated)                                   | 5-value typed enum (`REQUEST_CREATED`, `REQUEST_APPROVED`, `REQUEST_DECLINED`, `REQUEST_MANUALLY_APPROVED`, `REQUEST_MANUALLY_DECLINED`)                                                           | Per-request audit on Management → Associations                                                                   |

Both streams are immutable from the API surface — the platform does not expose a "delete activity row" endpoint. They are append-only at the application tier.

## What is NOT audited (negative half — schema-rooted)

The following surfaces have **no recoverable audit trace anywhere on the platform**. None of the corresponding service implementations emit an activity row, an application log line via `@Slf4j`, or any other persistent record. The reason is architectural — see the next section.

* **RBAC mutations.** Creating, updating, or deleting a Policy or a Role. Renaming a Role. Re-binding a Policy to a different Role. Granting or revoking a Permission. The Policy and Role service implementations carry no audit emission and no logging annotation.
* **Owner lifecycle.** Owner CRUD (create / rename / soft-delete). Owner-to-Role binding changes. The Owner service implementation does not emit on these operations even though `OwnerService.update` is the operator's primary lever for changing who is bound to which Role bundle.
* **Term lifecycle.** Business-glossary Term CRUD — creation, rename, definition edits, deletion. (Term *assignments* on data entities ARE audited; the Term entity itself is not.)
* **Namespace lifecycle.** Namespace CRUD. The namespace identifier is referenced across the catalog (scoping tags, datasources, owners), so renaming or deleting a namespace changes downstream visibility without leaving a trace.
* **Datasource lifecycle.** Datasource creation, configuration edits, deletion. (Ingestion events from a datasource are surfaced through the data-entity-scoped activity stream on each entity the datasource produced; the datasource entity itself is not audited.)
* **Collector lifecycle.** Collector registration, configuration edits, deletion, token rotation. Token rotation in particular leaves no audit record — see the [`COLLECTOR_TOKEN_REGENERATE` permission](/configuration-and-deployment/enable-security/authorization/permissions.md#management-permissions) caveat.

## Why this gap exists (architectural framing)

The `activity` table's schema requires a `data_entity_id` foreign key on every row — the column is declared `NOT NULL` and constrained to reference an existing row in the `data_entity` table. The table physically cannot store an event that is not scoped to a single data entity. Adding an "RBAC change" event to the existing audit stream would require either a schema migration making `data_entity_id` nullable plus a discriminator column to identify the alternative subject, or a separate per-subject audit table modelled on the existing `owner_association_request_activity` pattern. Neither change exists today.
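The constraint described above can be reproduced in isolation. The following is a minimal sketch, not the platform's actual DDL: it uses SQLite rather than PostgreSQL so that it runs anywhere, the two tables are reduced to hypothetical stand-ins containing only the columns under discussion, and the event-type strings are illustrative.

```python
import sqlite3

# Hypothetical, simplified stand-in for the real schema: only the columns
# needed to show the constraint. The real platform uses PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE data_entity (id INTEGER PRIMARY KEY);
    CREATE TABLE activity (
        id INTEGER PRIMARY KEY,
        event_type TEXT NOT NULL,
        data_entity_id INTEGER NOT NULL REFERENCES data_entity (id)
    );
    INSERT INTO data_entity (id) VALUES (1);
""")


def try_insert(event_type, data_entity_id):
    """Return None on success, or the constraint error message on rejection."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO activity (event_type, data_entity_id) VALUES (?, ?)",
                (event_type, data_entity_id),
            )
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


# An event scoped to an existing data entity is accepted.
entity_scoped = try_insert("DESCRIPTION_UPDATED", 1)
# An event with no data-entity subject violates NOT NULL.
no_subject = try_insert("ROLE_UPDATED", None)  # hypothetical event type
# An event pointing at a non-existent entity violates the foreign key.
dangling = try_insert("ROLE_UPDATED", 999)
```

Only the first insert succeeds. The other two are rejected at the storage tier, which is why no amount of service-tier emission can place a non-entity event in this table.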

The 27-value `ActivityEventTypeDto` enum is similarly data-entity-scoped: every event type names a data-entity attribute (ownership, metadata, schema, tag assignment, status, etc.). Extending the enum with RBAC / Owner / Term / Namespace lifecycle values without the schema change would have no effect, because the `NOT NULL` foreign key on `data_entity_id` would still reject the row.

The positive-half pattern — the dedicated `owner_association_request_activity` table — demonstrates the architectural escape hatch. Closing the negative-half gap would require analogous tables (or one consolidated `platform_event` table with a subject discriminator) for each missing subject, plus matching service-tier emission and an operator-surface UI for browsing the new streams.
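To make the consolidated-table option concrete, here is one possible shape for a `platform_event` table with a subject discriminator. This is a hypothetical sketch only: no such table exists in the platform today, every column name and subject-type value below is an assumption made for illustration, and SQLite stands in for PostgreSQL so the example is runnable.

```python
import sqlite3

# Hypothetical discriminator values, one per subject the platform does not audit.
SUBJECT_TYPES = (
    "ROLE", "POLICY", "OWNER", "TERM", "NAMESPACE", "DATASOURCE", "COLLECTOR",
)

conn = sqlite3.connect(":memory:")
allowed = ", ".join(f"'{s}'" for s in SUBJECT_TYPES)
conn.execute(f"""
    CREATE TABLE platform_event (
        id INTEGER PRIMARY KEY,
        subject_type TEXT NOT NULL CHECK (subject_type IN ({allowed})),
        subject_id INTEGER NOT NULL,   -- id within the subject's own table
        event_type TEXT NOT NULL,
        created_by TEXT NOT NULL,      -- operator identity
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    )
""")


def record_event(subject_type, subject_id, event_type, created_by):
    """Append one event; rows are only ever inserted, never updated."""
    with conn:
        conn.execute(
            "INSERT INTO platform_event "
            "(subject_type, subject_id, event_type, created_by) "
            "VALUES (?, ?, ?, ?)",
            (subject_type, subject_id, event_type, created_by),
        )


def events_for(subject_type, subject_id):
    """Return (event_type, created_by) pairs for one subject, oldest first."""
    return conn.execute(
        "SELECT event_type, created_by FROM platform_event "
        "WHERE subject_type = ? AND subject_id = ? ORDER BY id",
        (subject_type, subject_id),
    ).fetchall()
```

The discriminator pair (`subject_type`, `subject_id`) replaces the single `data_entity_id` foreign key, at the cost of losing database-enforced referential integrity for the subject reference. The per-subject-table alternative keeps that integrity but multiplies the number of tables.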

The architectural change is tracked upstream. Until it ships, the compensating controls below are the operator's remaining options.

## Compliance implications and compensating controls

For deployments with audit obligations (SOX, HIPAA, GDPR, SOC2, internal change-management policies), the negative-half gap means the platform's own logs are **insufficient** for reporting on RBAC, Owner, Term, Namespace, Datasource, and Collector changes. Two compensating controls are practical today; pick whichever matches your existing audit infrastructure. The third item below is a caveat, not a control:

* **Database-level audit.** Enable PostgreSQL's `pgaudit` extension (or your managed Postgres provider's equivalent) and configure it to capture all mutations against the `policy`, `role`, `owner`, `term`, `namespace`, `data_source`, and `collector` tables. This is the most surgical compensating control — it captures the actual write at the storage tier regardless of which API path triggered it. Be aware that pgaudit captures SQL statements; correlating them to operator identity requires the platform to surface the operator's username on the database connection, which depends on your connection-pool topology.
* **Ingress / service-mesh request audit.** If the platform runs in a Kubernetes cluster behind an authenticating ingress (mTLS, ingress-side OAuth proxy), enable request logging at the ingress / service-mesh layer. This captures the authenticated HTTP call to the platform with operator identity attached, but does not capture which row of which table actually changed. Note that the Kubernetes API server's own audit log does not help here: it records calls to the Kubernetes API, not HTTP calls to the platform.
* **Application-level logs are NOT a substitute.** The RBAC, Owner, Term, Namespace, Datasource, and Collector service implementations do not emit `@Slf4j` log lines on mutations, so raising application-side log levels at runtime does not surface a trail for these subjects. Restrict admin-tier access tightly and rely on database-level or ingress-level audit instead.
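For the database-level control, the pgaudit entries still have to be filtered down to the tables of interest. The sketch below shows one way to do that. It assumes each entry sits on a single log line, follows pgaudit's comma-separated `AUDIT:` format, and has the relation name populated (object audit logging, or session logging with `pgaudit.log_relation` enabled). The function names are hypothetical, and the log-line prefix in front of `AUDIT:` depends on your `log_line_prefix` setting.

```python
import csv

# Tables named in the database-level audit recommendation above.
WATCHED_TABLES = {
    "policy", "role", "owner", "term", "namespace", "data_source", "collector",
}

# Field order of a pgaudit entry after the "AUDIT: " marker.
FIELDS = (
    "audit_type", "statement_id", "substatement_id", "class", "command",
    "object_type", "object_name", "statement", "parameter",
)


def parse_audit_line(line):
    """Parse one log line into a dict, or return None if it is not an audit entry."""
    marker = "AUDIT: "
    start = line.find(marker)
    if start == -1:
        return None
    payload = line[start + len(marker):].rstrip("\n")
    values = next(csv.reader([payload]), [])
    if len(values) < len(FIELDS):
        return None
    return dict(zip(FIELDS, values))  # extra trailing fields are ignored


def watched_writes(lines):
    """Return (command, table, statement) for every write to a watched table."""
    hits = []
    for line in lines:
        entry = parse_audit_line(line)
        if entry is None or entry["class"] != "WRITE":
            continue
        table = entry["object_name"].rsplit(".", 1)[-1]  # drop the schema prefix
        if table in WATCHED_TABLES:
            hits.append((entry["command"], table, entry["statement"]))
    return hits
```

Statements that span several log lines would need to be re-joined before parsing. The sketch also does nothing to recover operator identity, which remains subject to the connection-pool caveat above.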

**Operator-side hardening recommendations:**

* **Reserve admin-tier permissions narrowly.** The combination of "no audit on RBAC mutations" with the platform's seeded-Role re-creation behaviour means an attacker with brief admin access can delete the seeded `Administrator` role, recreate it with attacker-chosen policy bindings, and leave the deployment in a compromised state with no platform-side trace.
* **Snapshot RBAC state out-of-band.** Periodically export the contents of the `policy`, `role`, `owner_to_role`, and `permission` tables via a read-only database account; diff the snapshots over time. This gives you a manual change log when no audit stream exists.
* **Document the bifurcation in your runbook.** If your incident-response procedure refers to "the platform's audit log", make the runbook explicit about which subjects are covered by the platform's recorded stream and which require a compensating control to investigate.
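The snapshot-and-diff recommendation above can be sketched as a small pure function. It assumes each snapshot has already been exported as a list of row dictionaries. The function name and the `key` parameter are hypothetical, and the key columns for each table must be chosen to match the actual schema (a join table such as `owner_to_role` would likely need a composite key).

```python
def diff_snapshots(old_rows, new_rows, key=("id",)):
    """Compare two exports of one table and report added, removed, and changed rows.

    `key` names the column(s) that identify a row; this is an assumption
    the caller must match to the real table definition.
    """
    def index(rows):
        return {tuple(row[col] for col in key): row for row in rows}

    old, new = index(old_rows), index(new_rows)

    added = [new[k] for k in sorted(new.keys() - old.keys())]
    removed = [old[k] for k in sorted(old.keys() - new.keys())]

    changed = []
    for k in sorted(old.keys() & new.keys()):
        before, after = old[k], new[k]
        columns = sorted(set(before) | set(after))
        delta = {
            col: (before.get(col), after.get(col))
            for col in columns
            if before.get(col) != after.get(col)
        }
        if delta:
            changed.append((k, delta))

    return {"added": added, "removed": removed, "changed": changed}
```

A role that was deleted and recreated under a new identifier would show up as one `removed` row plus one `added` row rather than as a `changed` row, so both lists are worth reviewing. The diff shows what changed between snapshots, not who changed it or when within the interval.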

A platform-side audit-scope expansion (schema migration, enum extension, service-tier emission, and an operator-surface UI for the new streams) is tracked upstream. The caveats on this page continue to apply until that work lands.

