# Enable security

ODD Platform has **two independent authentication surfaces**, each governed by its own configuration flag. Enabling one does not protect the other.

| Surface              | What it protects                                                            | Configuration                                                                                                                                                                                                                    |
| -------------------- | --------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| User interface / API | Human users browsing the catalog and programmatic clients calling `/api/**` | `auth.type` (DISABLED / LOGIN\_FORM / OAUTH2 / LDAP) — see [Authentication](/configuration-and-deployment/enable-security/authentication.md) and [Authorization](/configuration-and-deployment/enable-security/authorization.md) |
| Ingestion            | Collectors and push adapters calling `/ingestion/**`                        | `auth.ingestion.filter.enabled` (default `false`) — see below                                                                                                                                                                    |

A platform with OAuth2 enabled for the UI but the ingestion filter disabled is a platform with a protected catalog UI and an open write endpoint. Operators must configure both.

## Ingestion authentication

The `/ingestion/**` namespace is whitelisted in Spring Security (`SecurityConstants.WHITELIST_PATHS`), so under `auth.type` = `OAUTH2` or `LDAP` it never traverses the UI authentication chain (`LOGIN_FORM` is the exception, covered in the last row of the table below). Instead, two dedicated `WebFilter`s protect specific ingestion paths.

| Endpoint                                                                                                                                         | Filter                        | Active when                                     | Behavior                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| ------------------------------------------------------------------------------------------------------------------------------------------------ | ----------------------------- | ----------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `POST /ingestion/datasources`                                                                                                                    | `IngestionDataSourceFilter`   | **always** (unconditional)                      | Requires `Authorization: Bearer <token>`; looks up the collector by token; responds 401 if the token is missing or unknown                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| `POST /ingestion/entities`                                                                                                                       | `IngestionDataEntitiesFilter` | only when `auth.ingestion.filter.enabled: true` | Requires `Authorization: Bearer <token>`; validates the token against the datasource's stored token (falls back to the collector's token)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| All other `/ingestion/**` paths (e.g. `/ingestion/alert/alertmanager`, `/ingestion/entities/degs/{degOddrn}/children`, `/ingestion/entities/datasets/stats`, `/ingestion/metrics`) | none | n/a | **Unauthenticated under `auth.type` = `DISABLED`, `OAUTH2`, or `LDAP`**: the `/ingestion/**` glob in `SecurityConstants.WHITELIST_PATHS` carries all sibling paths through `permitAll` (under `DISABLED`, every exchange is permitted regardless). Under `auth.type=LOGIN_FORM`, sibling paths are instead **session-gated** by the catch-all `pathMatchers("/**").authenticated()` rule, but [LOGIN\_FORM is documented as dev-only](/configuration-and-deployment/enable-security/authentication/login-form.md), so this distinction does not change the operational guidance below. The filters above match exact path patterns; they do not cover sibling paths. |

{% hint style="danger" %}
**`auth.ingestion.filter.enabled` defaults to `false`.** With the default in place and the platform reachable on the network, any caller who can reach the ingestion API can `POST /ingestion/entities` with a spec-valid `DataEntityList`. If the `data_source_oddrn` in the payload matches an existing datasource, fake entities, schemas, lineage edges, owners, and tags are upserted into the catalog and rendered to every user as authoritative metadata. Because the payload is spec-aligned, this is not an exploit in the usual sense: it is the documented ingestion flow being called by an unauthenticated party. ODDRN values follow predictable patterns (`//postgresql/host:port/databases/...`), so guessing a valid one is feasible.
{% endhint %}

Enable the ingestion filter for **any** deployment where the platform is reachable from an untrusted network — which, in practice, is any non-local-dev deployment. Collectors using `odd-collector-sdk` already attach `Authorization: Bearer <token>` on every call, so turning the flag on does not require collector-side changes as long as the collector's token is registered in the platform.

{% tabs %}
{% tab title="YAML" %}

```yaml
auth:
  ingestion:
    filter:
      enabled: true
```

{% endtab %}

{% tab title="Environment variables" %}

```
AUTH_INGESTION_FILTER_ENABLED=true
```

{% endtab %}
{% endtabs %}
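
As a minimal sketch of what a push client has to send once the flag is on, the Python snippet below builds (without sending) a `POST /ingestion/entities` request that carries the Bearer header. The base URL, token value, and payload contents are hypothetical placeholders, and `build_ingestion_request` is an illustrative helper, not part of `odd-collector-sdk`.

```python
import json
import urllib.request


def build_ingestion_request(base_url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /ingestion/entities request with a Bearer token."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ingestion/entities",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values for illustration only; substitute your own platform URL,
# a token registered in the platform, and a spec-valid DataEntityList.
request = build_ingestion_request(
    "http://localhost:8080",
    "example-collector-token",
    {"data_source_oddrn": "//postgresql/host/localhost/databases/example", "items": []},
)
```

Passing `request` to `urllib.request.urlopen` would send it. A call without the header, or with a token the platform does not know, is expected to be rejected by the filter once `auth.ingestion.filter.enabled` is `true`.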

### Ingestion paths the filter does not cover

`IngestionDataEntitiesFilter` uses an exact path matcher (`/ingestion/entities`, POST). Sibling endpoints under `/ingestion/*` — `/ingestion/alert/alertmanager`, `/ingestion/entities/degs/{degOddrn}/children`, `/ingestion/entities/datasets/stats`, `/ingestion/metrics` — remain outside the ingestion filter's coverage even when `auth.ingestion.filter.enabled: true` is set. The flag protects **one** of the two `WebFilter`-covered endpoints; it does not extend to siblings.
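
The exact-match behaviour can be illustrated with a small Python model. This is a sketch of the coverage rules as described on this page, not the platform's Java implementation; `covering_filter` and the two dictionaries are hypothetical names.

```python
from typing import Optional

# Exact (method, path) pairs matched by the two WebFilters, as described on this page.
ALWAYS_ON = {("POST", "/ingestion/datasources"): "IngestionDataSourceFilter"}
FLAG_GATED = {("POST", "/ingestion/entities"): "IngestionDataEntitiesFilter"}


def covering_filter(method: str, path: str, filter_enabled: bool) -> Optional[str]:
    """Return the name of the WebFilter that covers the request, or None."""
    key = (method.upper(), path)
    if key in ALWAYS_ON:
        return ALWAYS_ON[key]
    if filter_enabled and key in FLAG_GATED:
        return FLAG_GATED[key]
    # Exact matching only: sibling paths are never covered, whatever the flag says.
    return None


covered = covering_filter("POST", "/ingestion/entities", filter_enabled=True)
sibling = covering_filter("POST", "/ingestion/entities/datasets/stats", filter_enabled=True)
```

Here `covered` is `"IngestionDataEntitiesFilter"` and `sibling` is `None`, which is the gap this section describes.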

Whether these uncovered paths are reachable to an unauthenticated caller depends on `auth.type`:

* Under `auth.type=DISABLED`, `OAUTH2`, or `LDAP`, the `/ingestion/**` glob in `SecurityConstants.WHITELIST_PATHS` permits every sibling path before the UI auth chain runs — the endpoints are anonymously reachable on the network.
* Under `auth.type=LOGIN_FORM`, the catch-all `.pathMatchers("/**").authenticated()` rule session-gates the siblings: an anonymous caller is redirected to `/login`. This also blocks collectors and push clients, which do not carry a UI session, so LOGIN\_FORM is rarely the right mode for production ingestion.

The full per-endpoint × per-auth-config picture is enumerated in the [Deployment matrix](#deployment-matrix-per-endpoint-per-auth-config) below.

The AlertManager webhook is the most operationally relevant of these — see the warning on the [Configure ODD Platform](/configuration-and-deployment/odd-platform.md#authentication) page, `Prometheus AlertManager Integration → Authentication` section. Apply perimeter controls (network segmentation, authenticating reverse proxy, mTLS) for any deployment where these endpoints are reachable from outside the trusted network.

A platform-side fix to broaden the ingestion filter's coverage is tracked upstream.

## Statistics endpoint — write shape and replay behaviour

The `POST /ingestion/entities/datasets/stats` endpoint listed above carries a write contract that operators need to understand before exposing it on any network — the endpoint is uncovered by the ingestion filter under every `auth.type` value, and the platform does not validate the payload's parent-child consistency.

{% hint style="danger" %}
**Cross-dataset write surface.** The platform resolves field statistics writes **by field ODDRN only**, with no JOIN to the declared parent dataset. A payload of the shape

```json
{
  "items": [
    {
      "datasetOddrn": "//A/...",
      "fields": { "<oddrn-of-field-in-dataset-B>": { "stats": ... } }
    }
  ]
}
```

writes the `stats` value to dataset **B**'s field row while triggering full-text-search recompute on dataset **A** (the declared parent). Any caller on the network who knows a field's ODDRN can poison that field's statistics through any other dataset's parent declaration. Downstream consumers (the Quality Dashboard rings, the Dataset Structure tab, BI tools reading `dataset_field.stats`) render the attacker-controlled values without any indicator that the write originated through an unrelated parent.

Combined with the endpoint being unauthenticated under `DISABLED`, `OAUTH2`, and `LDAP` (and accessible to any user with a UI session under `LOGIN_FORM`), this is a trivial data-integrity attack on any dataset's field statistics. Until the upstream fix lands, this documented caveat is the only platform-side safeguard, so apply the perimeter controls described above (network segmentation, authenticating reverse proxy, mTLS).

**The statistics write leaves no Activity Feed trace.** Unlike a field's internal-name change or a field's tag change — both of which emit an Activity Feed event — the statistics write is not recorded in the activity stream. A caller who poisons a field's statistics through this endpoint produces no entry an operator can find after the fact: there is no "who changed these stats, and when" record to audit. Treat detection of this write as a perimeter-and-monitoring concern (reverse-proxy access logs, network controls), not something the platform's own audit surface will show you.
{% endhint %}
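
One possible perimeter-side check, sketched in Python, is to reject stats payloads whose field ODDRNs do not sit under the declared `datasetOddrn`. This rests on an assumption that is not guaranteed by the platform: that a field's ODDRN always begins with its parent dataset's ODDRN followed by `/`. Verify this for your data sources before relying on it. `find_cross_dataset_fields` and the sample ODDRNs are hypothetical.

```python
from typing import List, Tuple


def find_cross_dataset_fields(payload: dict) -> List[Tuple[str, str]]:
    """Return (datasetOddrn, fieldOddrn) pairs where the field is not under the declared dataset.

    Assumes (unverified) that field ODDRNs are prefixed by their dataset's ODDRN.
    """
    mismatches = []
    for item in payload.get("items", []):
        dataset_oddrn = item.get("datasetOddrn", "")
        prefix = dataset_oddrn.rstrip("/") + "/"
        for field_oddrn in item.get("fields", {}):
            if not field_oddrn.startswith(prefix):
                mismatches.append((dataset_oddrn, field_oddrn))
    return mismatches


# Hypothetical payload: the second field belongs to a different dataset.
sample = {
    "items": [
        {
            "datasetOddrn": "//A/tables/t1",
            "fields": {
                "//A/tables/t1/columns/c1": {},
                "//B/tables/t9/columns/c1": {},
            },
        }
    ]
}
suspicious = find_cross_dataset_fields(sample)
```

A reverse proxy or gateway hook could refuse the request whenever `suspicious` is non-empty. This narrows the cross-dataset surface only; it does not authenticate the caller.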

**Replay-with-fewer-tags destroys the absent ones.** The endpoint accepts a tags list per field; the platform compares the incoming list against the existing `EXTERNAL_STATISTICS`-origin tag relations for that field and **deletes** every relation absent from the new payload before creating the new ones. There is no merge-semantics opt-in and no `replace_tags` flag — re-POSTing a stats payload with a shorter tags list silently removes the tags missing from the second call. Treat the endpoint as a destructive replace, not an additive update.
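
The replace semantics amount to a set difference. The Python sketch below models the effect for a single field; it is an illustration of the behaviour described above, not the platform's code, and it assumes that tags already present are simply kept rather than recreated. `plan_tag_replace` and the tag names are hypothetical.

```python
from typing import Set, Tuple


def plan_tag_replace(existing: Set[str], incoming: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (to_delete, to_create) for one field's EXTERNAL_STATISTICS-origin tags."""
    to_delete = existing - incoming   # relations absent from the new payload are removed
    to_create = incoming - existing   # assumption: only genuinely new tags are created
    return to_delete, to_create


# First call attached three tags; the replay carries only one of them.
to_delete, to_create = plan_tag_replace({"pii", "finance", "gold"}, {"pii"})
```

After the replay, `to_delete` is `{"finance", "gold"}` and `to_create` is empty. A client that wants additive behaviour therefore has to resend the full tag list on every call.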

## Deployment matrix — per-endpoint × per-auth-config

The matrix below is the authoritative answer for "which ingestion endpoints are reachable on my deployment?" The per-property bullets on the [Configure ODD Platform](/configuration-and-deployment/odd-platform.md) page describe one knob each, not their interaction. When a per-property description and this matrix disagree, this matrix wins — it is derived from the runtime code, not the property name.

The columns assume `auth.s2s.enabled` is `false` (the default). When S2S is enabled and a request carries a valid `X-API-Key`, every endpoint accepts the request regardless of `auth.type` or the ingestion filter — see [Server-to-server (S2S) authentication](/configuration-and-deployment/enable-security/authentication/s2s.md).

`AUTH-token` means the endpoint requires an `Authorization: Bearer <token>` header that matches a collector or datasource token registered in the platform; the `WebFilter` rejects the request if the token is missing or wrong. `OPEN` means an anonymous request is accepted and acted on. `SESSION-gated` means an anonymous request is redirected to `/login`, while a request from a caller with a UI session reaches the handler (subject to the `WebFilter`, if any).

| Endpoint                                                                      | `auth.type=DISABLED`              | `auth.type=OAUTH2` / `LDAP`       | `auth.type=LOGIN_FORM`                                       |
| ----------------------------------------------------------------------------- | --------------------------------- | --------------------------------- | ------------------------------------------------------------ |
| `POST /ingestion/datasources`                                                 | **AUTH-token** (filter always on) | **AUTH-token** (filter always on) | **AUTH-token** (filter always on)                            |
| `POST /ingestion/entities` — `auth.ingestion.filter.enabled: false` (default) | **OPEN**                          | **OPEN**                          | **SESSION-gated**                                            |
| `POST /ingestion/entities` — `auth.ingestion.filter.enabled: true`            | **AUTH-token** (filter applies)   | **AUTH-token** (filter applies)   | **AUTH-token** (filter applies; session gate also satisfied) |
| `POST /ingestion/entities/datasets/stats`                                     | **OPEN**                          | **OPEN**                          | **SESSION-gated**                                            |
| `POST /ingestion/metrics`                                                     | **OPEN**                          | **OPEN**                          | **SESSION-gated**                                            |
| `POST /ingestion/alert/alertmanager`                                          | **OPEN**                          | **OPEN**                          | **SESSION-gated**                                            |
| `GET /ingestion/entities/degs/{degOddrn}/children`                            | **OPEN**                          | **OPEN**                          | **SESSION-gated**                                            |

**Reading the matrix as an operator:** an `OPEN` cell on a production deployment is a place an unauthenticated caller can act. To bring those cells to a controlled state, enable `auth.ingestion.filter.enabled` (which closes only the `POST /ingestion/entities` row) and/or `auth.s2s.enabled`, or apply perimeter controls (network segmentation, authenticating reverse proxy, mTLS). Switching `auth.type` to `LOGIN_FORM` also removes the `OPEN` cells, but it is rarely practical because collectors have no UI sessions.
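
For auditing a deployment, the matrix can be encoded as a small lookup. The Python sketch below mirrors the table above under the same assumption that `auth.s2s.enabled` is `false`; the function and constant names are hypothetical, and the platform's runtime behaviour remains the source of truth.

```python
OPEN, TOKEN, SESSION = "OPEN", "AUTH-token", "SESSION-gated"

ENDPOINTS = [
    ("POST", "/ingestion/datasources"),
    ("POST", "/ingestion/entities"),
    ("POST", "/ingestion/entities/datasets/stats"),
    ("POST", "/ingestion/metrics"),
    ("POST", "/ingestion/alert/alertmanager"),
    ("GET", "/ingestion/entities/degs/{degOddrn}/children"),
]


def reachability(method: str, path: str, auth_type: str, ingestion_filter_enabled: bool = False) -> str:
    """Return the matrix cell for one ingestion endpoint (S2S assumed disabled)."""
    if not path.startswith("/ingestion/"):
        raise ValueError(f"not an ingestion path: {path}")
    endpoint = (method.upper(), path)
    if endpoint == ("POST", "/ingestion/datasources"):
        return TOKEN  # filter is always on
    if endpoint == ("POST", "/ingestion/entities") and ingestion_filter_enabled:
        return TOKEN  # filter applies only when the flag is true
    mode = auth_type.upper()
    if mode == "LOGIN_FORM":
        return SESSION
    if mode in {"DISABLED", "OAUTH2", "LDAP"}:
        return OPEN
    raise ValueError(f"unknown auth.type: {auth_type}")


def open_endpoints(auth_type: str, ingestion_filter_enabled: bool) -> list:
    """List the endpoints an anonymous caller can reach for a given configuration."""
    return [
        f"{method} {path}"
        for method, path in ENDPOINTS
        if reachability(method, path, auth_type, ingestion_filter_enabled) == OPEN
    ]


still_open = open_endpoints("OAUTH2", ingestion_filter_enabled=True)
```

With OAuth2 and the ingestion filter both enabled, `still_open` still lists the four sibling endpoints, which is why perimeter controls remain necessary.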

## Authentication and authorization

For details on the UI authentication options and the authorization model that governs what authenticated users can do, see the [Authentication](/configuration-and-deployment/enable-security/authentication.md) and [Authorization](/configuration-and-deployment/enable-security/authorization.md) sections. For the cross-cutting question "which user becomes ADMIN under each auth mode and provider", see the unified [Admin promotion across providers](/configuration-and-deployment/enable-security/admin-promotion.md) reference.

