Metrics Ingestion
Push time-series metrics via `/ingestion/metrics` and surface them on each entity's Metrics tab — covers payload shape, two storage backends (PostgreSQL + Prometheus), and operator caveats.
The Metrics Ingestion surface accepts time-series metrics pushed into the platform by collectors or custom integrations and surfaces them on each affected data entity's Metrics tab. The platform supports the OpenMetrics metric model — five metric types (COUNTER, GAUGE, HISTOGRAM, SUMMARY, GAUGE_HISTOGRAM) carrying labels and observation values — and stores them either in its own PostgreSQL database or in an external Prometheus instance, depending on the deployment's metrics.storage configuration.
Operators turning on metrics ingestion typically have one of two workflows in mind:
Catalog-side cardinality metrics — row counts, on-disk size, freshness gauges emitted by a collector that scrapes a source system and pushes the result to ODD. These appear as cards on the entity's Metrics tab so an operator opening the entity sees the current row count alongside its schema, ownership, and lineage.
Custom-framework metrics — an in-house pipeline (a daily ETL job, an external profiling tool, a side-process emitting per-entity health signals) pushes structured metrics into the catalog so the platform becomes the single source of truth for "what numbers describe this entity right now."
This page covers the inbound push surface (POST /ingestion/metrics), the entity-side read surface (the Metrics tab), and the five operator caveats that the configuration reference does not surface in full.
Where to find it
Two surfaces consume the same data:
POST /ingestion/metrics: the inbound push endpoint. Collectors and custom integrations call this with a MetricSetList payload (one or more MetricSets, each anchored to a data entity's ODDRN and carrying a list of MetricFamily time series).
Per-entity Metrics tab: on a data entity's detail page, the Metrics tab renders the ingested time series as charts. The platform reads from GET /api/dataentities/{id}/metrics, which fans out to either the internal PostgreSQL store or the configured Prometheus instance depending on metrics.storage.
For the platform-side configuration of the two storage backends (metrics.storage, metrics.prometheus-host, the Prometheus remote-write requirement), see Configure ODD Platform → Enable Metrics. The configuration reference is the canonical home for the storage knobs; this page is the surface description and the operator-caveat list.
The endpoint
POST /ingestion/metrics
Content-Type: application/json
{
  "items": [
    {
      "oddrn": "<data-entity-oddrn>",
      "metric_families": [
        {
          "name": "<metric-family-name>",
          "type": "COUNTER", /* or GAUGE / HISTOGRAM / SUMMARY / GAUGE_HISTOGRAM */
          "metrics": [
            { "labels": {"<key>": "<value>", ...}, "metric_points": [ { "value": 123, ... } ] }
          ]
        }
      ]
    }
  ]
}
Method
POST
Path
/ingestion/metrics
Request body
MetricSetList — one or more MetricSets; each MetricSet carries an oddrn (the target data entity) and a list of MetricFamily entries.
Body size cap
20 MB. The Spring WebFlux codec max-in-memory-size is set to 20MB in application.yml. A request larger than that does not get a clean rejection — the platform raises an internal buffer-limit error during body decoding and returns HTTP 500 (not a 413 "Payload Too Large"), with no body identifying the cap as the cause. Chunk large pushes into multiple MetricSetList calls to stay under the limit; a 500 from this endpoint on a large push is the symptom to look for.
Response
201 Created (no body). The platform does not return a per-metric acceptance result. See the empty-payload caveat below — an empty MetricSetList also returns 201, so a 201 is not by itself proof that any metric was written.
Failure on misconfigured storage
If metrics.storage is set to an invalid value (anything other than INTERNAL_POSTGRES or PROMETHEUS), the platform fails to start with NoSuchBeanDefinitionException because the storage-backed IngestionMetricsService bean cannot be wired. Misconfiguration is caught at boot, not at first request.
The exact OpenAPI shape (per-field schemas, the five metric-type discriminators) lives in the platform's ingestion contract; consult the ODD Specification for the field-level reference.
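The payload shape and the 20 MB body cap described above can be combined into a small client-side batching helper. The following is a minimal sketch, not the platform's own client: the function names, the 10% safety margin, and the reading of "20MB" as 20 × 1024 × 1024 bytes are assumptions made for illustration, and only the fields shown in the request skeleton above are populated.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List

# Assumption: the 20MB codec limit is read as 20 * 1024 * 1024 bytes.
MAX_BODY_BYTES = 20 * 1024 * 1024
# Assumption: keep 10% headroom below the cap; tune for your deployment.
DEFAULT_BUDGET = int(MAX_BODY_BYTES * 0.9)


def metric_set(oddrn: str, family: str, metric_type: str,
               labels: Dict[str, str], value: float) -> Dict[str, Any]:
    """Build one MetricSet holding a single family, series, and point."""
    return {
        "oddrn": oddrn,
        "metric_families": [{
            "name": family,
            "type": metric_type,
            "metrics": [{"labels": labels,
                         "metric_points": [{"value": value}]}],
        }],
    }


def chunk_metric_sets(items: Iterable[Dict[str, Any]],
                      budget: int = DEFAULT_BUDGET) -> Iterator[Dict[str, Any]]:
    """Yield MetricSetList bodies whose JSON encoding stays within budget.

    A single MetricSet larger than the budget is still yielded on its own,
    because this sketch does not split inside a MetricSet.
    """
    envelope = len(json.dumps({"items": []}).encode("utf-8"))
    batch: List[Dict[str, Any]] = []
    size = envelope
    for item in items:
        # +2 accounts for the ", " separator json.dumps puts between items.
        item_size = len(json.dumps(item).encode("utf-8")) + 2
        if batch and size + item_size > budget:
            yield {"items": batch}
            batch, size = [], envelope
        batch.append(item)
        size += item_size
    if batch:
        yield {"items": batch}
```

Each yielded dictionary is intended to be serialized with `json.dumps` and sent as the body of one POST /ingestion/metrics call. The size estimate is slightly conservative, so batches land a few bytes under the budget rather than over it.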
Known operator caveats
Several behaviours of the metrics ingestion surface are non-obvious from the configuration reference alone. Each item below states what an operator might assume, what the platform actually does, and what to do today.
Tenant isolation works only when metrics.storage=PROMETHEUS. The default INTERNAL_POSTGRES backend has no tenant column — two deployments sharing the same Postgres conflate metrics into one stream. The odd.tenant-id configuration key is only consulted on the Prometheus write / query path, where it is appended as a tenant_id={value} label on every series. On INTERNAL_POSTGRES, the tables backing metric storage (metric_series, metric_point, metric_entity) have no tenant_id column — the platform writes every metric to the shared schema without any tenant discriminator. Two ODD Platform deployments writing to the same Postgres instance see each other's metrics on every Metrics tab; there is no platform-side filter, and no operator-side mitigation short of running separate Postgres instances per deployment.
Operator workflow today. If your deployment needs multi-tenant metric isolation:
Switch metrics.storage to PROMETHEUS and set odd.tenant-id on every deployment so each one writes (and reads) only its own tenant-labeled series.
Or run each deployment against its own dedicated PostgreSQL instance (or its own dedicated schema within an instance) so the data path is physically separate.
The "Ignored when metrics.storage=INTERNAL_POSTGRES" framing in the configuration reference is technically correct but understates the operator consequence: choosing the default storage backend forfeits tenant isolation entirely, not just one minor labelling convenience. The same class of silent-default risk that previously affected attachment storage on container restart applies here — read the storage section before adopting the default in any multi-tenant context.
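The difference between the two backends can be pictured with a toy model of series identity. This is an illustrative sketch only, not platform code: the function names and the tuple-based key are assumptions. It encodes just the one behaviour stated above, namely that the Prometheus path adds a tenant_id label while the INTERNAL_POSTGRES path has no tenant discriminator.

```python
from typing import Dict, Tuple

Labels = Dict[str, str]
SeriesKey = Tuple[str, str, Tuple[Tuple[str, str], ...]]


def prometheus_series_key(oddrn: str, name: str, labels: Labels,
                          tenant_id: str) -> SeriesKey:
    """Toy identity on the Prometheus path: tenant_id joins the label set."""
    merged = {**labels, "tenant_id": tenant_id}
    return (oddrn, name, tuple(sorted(merged.items())))


def postgres_series_key(oddrn: str, name: str, labels: Labels,
                        tenant_id: str) -> SeriesKey:
    """Toy identity on INTERNAL_POSTGRES: tenant_id is accepted but unused."""
    return (oddrn, name, tuple(sorted(labels.items())))
```

In this model, two deployments pushing the same metric for the same ODDRN produce distinct keys on the Prometheus path and an identical key on the PostgreSQL path, which is the conflation the caveat describes.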
POST /ingestion/metrics is unauthenticated under every auth.type value today. The platform's Spring Security configuration whitelists the entire /ingestion/** namespace, and the optional auth.ingestion.filter.enabled filter only matches the exact path /ingestion/entities — it does not cover /ingestion/metrics. Any caller with network reach to the platform can POST a MetricSetList carrying any oddrn they can guess; the platform writes the metrics to the configured backend and surfaces them on the named entity's Metrics tab. Under auth.type=DISABLED, the same POST is reachable anonymously; under LOGIN_FORM / OAUTH2 / LDAP the whitelist still applies.
Operator workflow today. Deploy the platform behind a reverse proxy (an authenticating ingress, a NetworkPolicy in Kubernetes restricting which pods can reach /ingestion/**, an mTLS-terminating load balancer) that performs the authentication you require on the /ingestion/metrics path before forwarding the request. The full picture of which /ingestion/* paths are covered by which filter under which auth.type is on Enable Security → Ingestion. The upstream platform fix adds a dedicated metrics-ingestion filter mirroring IngestionDataEntitiesFilter; until it lands, perimeter authentication is the only protection.
Switching metrics.storage after a deployment has been live is one-way — historical data does not migrate. The two storage backends (INTERNAL_POSTGRES and PROMETHEUS) are independent data stores; the platform writes to whichever is configured at any given moment and reads from the same one. There is no operator tooling to migrate metric history from one backend to the other:
INTERNAL_POSTGRES → PROMETHEUS: historical metric points in metric_series / metric_point remain in the PostgreSQL tables but become unreadable from the platform UI / API after the switch (the read path queries Prometheus). The Metrics tab on each entity shows only data points written after the switch; everything older is dark until a manual re-ingest or a direct SQL query against the Postgres tables outside the platform.
PROMETHEUS → INTERNAL_POSTGRES: the symmetric case. Historical points remain in Prometheus but are no longer visible through the platform; the Metrics tab starts fresh on the PostgreSQL side.
Operator workflow today. Treat the storage choice as a long-term commitment for any deployment that has been live long enough to accumulate historical metric data. If you must switch (for example to gain tenant isolation per the first caveat above), plan the cutover as a one-time event with a documented "history before this date is queryable from <old backend> directly" annotation in the platform's runbook. The platform does not surface the cutover boundary in the UI.
An empty MetricSetList returns 201 Created and writes nothing — a 201 does not confirm a metric landed. Unlike POST /ingestion/entities, which rejects an empty payload with 400 Bad Request (Ingestion payload is empty), the metrics endpoint accepts an empty items: [] body as a successful no-op and returns 201. If you use this endpoint as a liveness or smoke-test probe, a 201 only tells you the endpoint is reachable and your auth/proxy layer let the request through — it does not tell you that any series was actually persisted to the configured backend. To verify a real write, push at least one MetricSet with a known oddrn and then read it back from that entity's Metrics tab (GET /api/dataentities/{id}/metrics).
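Because the endpoint accepts an empty body as a successful no-op, a collector can guard against this on its own side before sending. The sketch below is one possible pre-flight check, assuming the payload is held as a plain dictionary in the shape shown in the request skeleton; the function names are hypothetical and nothing here is part of the platform.

```python
from typing import Any, Dict


def count_metric_points(payload: Dict[str, Any]) -> int:
    """Count metric points across every MetricSet, family, and series."""
    total = 0
    for metric_set in payload.get("items", []):
        for family in metric_set.get("metric_families", []):
            for metric in family.get("metrics", []):
                total += len(metric.get("metric_points", []))
    return total


def require_non_empty(payload: Dict[str, Any]) -> int:
    """Raise before sending if the payload would be a silent no-op."""
    points = count_metric_points(payload)
    if points == 0:
        raise ValueError(
            "MetricSetList carries no metric points; a 201 would not "
            "mean anything was written"
        )
    return points
```

Calling `require_non_empty(payload)` just before the POST turns an accidental empty push into a local error. It does not replace the read-back check, which remains the only way to confirm that a series was persisted.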
The endpoint does not check that a MetricSet.oddrn belongs to a real catalog entity — any caller can mint metric series for ODDRNs that do not exist. When a MetricSet arrives, the platform records its oddrn in a metric-entity table that has no foreign key to the data-entity catalog and performs no existence check; it then writes the series. A caller can therefore push metrics under arbitrary, fabricated ODDRNs and the platform will create rows (in INTERNAL_POSTGRES) or series (in PROMETHEUS) for every distinct one. Combined with the unauthenticated-endpoint caveat above, this is a cardinality / storage-exhaustion risk: an attacker (or a buggy collector emitting malformed ODDRNs) with network reach can pollute the metric store or the Prometheus series space without ever touching a real entity.
Operator workflow today. The same perimeter authentication that protects the unauthenticated endpoint (see the caveat above) is the only control — there is no platform-side ODDRN validation to lean on. If you operate the PROMETHEUS backend, additionally bound the blast radius with Prometheus-side series limits (--storage.tsdb.max-* / per-tenant limits) so a cardinality flood degrades gracefully rather than exhausting the time-series database. Validate ODDRNs in your collector before pushing so a misconfigured source cannot silently fan out junk series.
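Collector-side ODDRN validation can be as simple as an allowlist check before the push. The following sketch assumes the collector already holds the set of ODDRNs it is entitled to report on (for example, from its own inventory of the source system); the function name and the sample ODDRN strings in the usage note are illustrative, not taken from the platform.

```python
from typing import Any, Dict, Iterable, List, Set, Tuple

MetricSet = Dict[str, Any]


def partition_by_known_oddrn(
    items: Iterable[MetricSet], known_oddrns: Set[str]
) -> Tuple[List[MetricSet], List[MetricSet]]:
    """Split MetricSets into (accepted, rejected) by ODDRN allowlist.

    A MetricSet with a missing or unknown oddrn is rejected, so it is
    never sent to the ingestion endpoint.
    """
    accepted: List[MetricSet] = []
    rejected: List[MetricSet] = []
    for item in items:
        if item.get("oddrn") in known_oddrns:
            accepted.append(item)
        else:
            rejected.append(item)
    return accepted, rejected
```

A collector would push only the accepted list and log or alert on the rejected one, for instance `accepted, rejected = partition_by_known_oddrn(batch, {"//demo/tables/orders"})`. This limits junk series from a buggy source but does nothing against a hostile caller, which is why perimeter authentication is still required.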
Where to next
Configure ODD Platform → Enable Metrics: metrics.storage, metrics.prometheus-host, the OTLP export channel, the Prometheus tenant label.
Enable Security → Ingestion: the per-auth.type reachability matrix for every /ingestion/* path (including /ingestion/metrics).
Notifications: the sibling subsystem that moves alerts out of the platform; the same WAL-driven pipeline reads ingested events.
API Reference — the per-feature HTTP-endpoint index. The metrics push side lives in the ODD Specification ingestion contract; the read side is covered by the per-entity API.