Configure ODD Platform
This section defines how to configure ODD Platform in order to leverage all of its functionality and features.
This page is the post-deployment configuration reference for the running Platform — every application.yml key the Platform consumes. For the deployment path itself (Docker Compose, Helm, AWS EKS, build from source), start at Deployment Options.
Configuration approaches
There are two ways to configure the Platform:
Environment variables are convenient for simple entries.
YAML is convenient when you need to define a complex configuration block (e.g., OAuth2 authentication or logging levels).
YAML entries vs. environment variables
The following example shows a configuration block defined in YAML and the same block expressed as environment variables.
YAML:
spring:
  datasource:
    url: URL
    username: USERNAME
    password: PASSWORD
  custom-datasource:
    url: URL
    username: USERNAME
    password: PASSWORD
To configure the Platform using environment variables, join the nested keys with underscores (hyphens also become underscores) and uppercase every word, like so:
SPRING_DATASOURCE_URL=URL
SPRING_DATASOURCE_USERNAME=USERNAME
SPRING_DATASOURCE_PASSWORD=PASSWORD
SPRING_CUSTOM_DATASOURCE_URL=URL
SPRING_CUSTOM_DATASOURCE_USERNAME=USERNAME
SPRING_CUSTOM_DATASOURCE_PASSWORD=PASSWORD
Connect your database
ODD Platform uses PostgreSQL, and only PostgreSQL, for all of its features. Define these variables to connect ODD Platform to the database:
spring.datasource.url: JDBC string of your PostgreSQL database. Default value is jdbc:postgresql://127.0.0.1:5432/odd-platform
spring.datasource.username: your PostgreSQL user's name. Default value is odd-platform
spring.datasource.password: your PostgreSQL user's password. Default value is odd-platform-password
These variables are optional and will be used to connect to PostgreSQL and store Lookup Tables. Each of the three keys is declared in R2DBCConfiguration as @Value("${spring.custom-datasource.X:}") — the trailing colon with no value means the @Value default is the empty string, not the JDBC URL / username / password values listed below. When a key is unset (or blank), the bean factory falls back to the corresponding primary spring.datasource.* value at startup. The values below are therefore the fallback an operator observes with a default deployment, not the spring.custom-datasource.* keys' own defaults — so overriding spring.datasource.url will also change what spring.custom-datasource.url resolves to:
spring.custom-datasource.url: JDBC string of the PostgreSQL database where Lookup Tables are stored. Falls back to spring.datasource.url when unset; the platform's primary spring.datasource.url default is jdbc:postgresql://127.0.0.1:5432/odd-platform. Note: you can specify any {database_host}, {database_port} or {database_name}, but the schema where Lookup Tables are stored is always lookup_tables_schema.
spring.custom-datasource.username: your PostgreSQL user's name for custom-datasource. Falls back to spring.datasource.username when unset; the platform's primary spring.datasource.username default is odd-platform.
spring.custom-datasource.password: your PostgreSQL user's password for custom-datasource. Falls back to spring.datasource.password when unset; the platform's primary spring.datasource.password default is odd-platform-password.
Your database connection block would then look like this:
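A minimal sketch of such a block, assuming the default values listed above are acceptable. The custom-datasource entries are shown only for illustration; omitting them makes the platform fall back to the spring.datasource.* values.

```yaml
spring:
  datasource:
    url: jdbc:postgresql://127.0.0.1:5432/odd-platform
    username: odd-platform
    password: odd-platform-password
  # Optional. When omitted (or blank), each key falls back to the
  # matching spring.datasource.* value above.
  custom-datasource:
    url: jdbc:postgresql://127.0.0.1:5432/odd-platform
    username: odd-platform
    password: odd-platform-password
```

In a real deployment, replace the host, database name, and credentials with your own.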
Security
Please follow the Enable security section for enabling security in ODD Platform.
Select session provider
ODD Platform stores HTTP session state in one of three places: the platform JVM (in-memory), the platform's PostgreSQL database, or an external Redis data store. The provider is selected with session.provider (SESSION_PROVIDER env var) and accepts one of three values:
IN_MEMORY: sessions live in a ConcurrentHashMap inside the JVM. ODD Platform defaults to this value.
INTERNAL_POSTGRESQL: sessions are persisted to the platform's PostgreSQL database (SPRING_SESSION / SPRING_SESSION_ATTRIBUTES tables).
REDIS: sessions are persisted to an external Redis data store via Spring Session's @EnableRedisWebSession.
Quick selection guidance:
Single-instance deployment where logging users out on restart is acceptable → IN_MEMORY
Multi-instance deployment, or persistence across restarts is required → INTERNAL_POSTGRESQL (no extra infrastructure) or REDIS (if you already operate Redis or need sub-millisecond session reads)
Each provider has operator-visible characteristics that affect sizing, multi-instance behavior, and connection wiring. Read the relevant subsection before deploying.
IN_MEMORY (default)
Sessions are kept in a ConcurrentHashMap inside the platform JVM, wrapped by Spring Session's ReactiveMapSessionRepository. Suitable for local development and single-instance evaluations where session loss on restart is acceptable.
Characteristics & caveats
Sessions are lost on every platform restart. The session map lives in heap; any restart (deploy, crash, container recycle) clears it and forces every authenticated user to log in again.
No multi-instance support. Two ODD Platform instances behind a load balancer each maintain a separate session map. A request that lands on a different instance than the one that authenticated the user appears unauthenticated.
Eviction is by Spring Session expiry only. The repository wraps a raw ConcurrentHashMap with no secondary eviction policy (no LRU, no max-entries cap). A long-running platform with many short-lived sessions accumulates map entries until each entry's TTL elapses; high-traffic deployments running with the shipped default spring.session.timeout: -1 (no timeout) accumulate sessions indefinitely. Set a finite spring.session.timeout (see Session lifetime below) to bound the in-memory footprint.
INTERNAL_POSTGRESQL
Sessions are persisted in the platform's own PostgreSQL database, in the SPRING_SESSION and SPRING_SESSION_ATTRIBUTES tables. ODD Platform implements a custom JOOQ-based reactive JooqSessionRepository for this provider, so the standard spring.session.jdbc.* Spring Session keys do not apply. Connection settings reuse the existing platform spring.datasource.* configuration; no additional database wiring is required.
Characteristics & caveats
Sessions survive platform restarts. Authenticated users remain logged in across deploys (until their session row's TTL has passed).
Multi-instance support. All ODD Platform instances point at the same database, share the session tables, and can serve requests for any authenticated user regardless of which instance answered the original login.
Expired-session cleanup runs hourly and is not configurable. A @Scheduled(fixedRate = 1, timeUnit = HOURS) housekeeping job (PostgreSQLSessionHousekeepingJobHandler.deleteExpiredSessions) deletes rows whose EXPIRY_TIME is in the past from both SPRING_SESSION and SPRING_SESSION_ATTRIBUTES. Expired session rows therefore remain in the tables for up to one hour past their TTL before being cleaned. The cadence is hardcoded; there is no config key to tune it.
Sizing implication. When sizing the database (connection pool, disk, vacuum schedule), assume the session tables hold the high-water-mark count of authenticated users plus up to one hour of post-expiry stragglers. For high-cardinality / short-TTL deployments (many users, short spring.session.timeout), the post-expiry overhang can dominate steady-state row count.
REDIS
Sessions are persisted to an external Redis data store via Spring Session's @EnableRedisWebSession. Suitable for multi-instance deployments that already operate Redis, or that need sub-millisecond session reads. ODD Platform does not bundle Redis; the operator must provide a Redis 6+ instance and supply its connection settings under the spring.data.redis.* namespace (Spring Boot 3.x; the legacy spring.redis.* prefix from Spring Boot 2.x has been removed and will not bind).
Characteristics & caveats
Sessions survive platform restarts and span instances: the same persistence behavior as INTERNAL_POSTGRESQL, but reads and writes happen against Redis directly.
Connection wiring is operator-supplied. Unlike INTERNAL_POSTGRESQL (which reuses the platform's existing PostgreSQL connection), Redis settings must be configured separately. ODD Platform's application.yml ships no Redis defaults, so every operator deploying with REDIS must set at least the host and port, plus credentials and TLS for any production deployment.
TLS, pool sizing, and command timeouts inherit Spring Data Redis defaults unless explicitly overridden. For managed Redis providers (AWS ElastiCache, Redis Cloud, Azure Cache for Redis) and any TLS-required Redis deployment, set spring.data.redis.ssl.enabled: true. For high-concurrency deployments, tune the Lettuce connection pool with spring.data.redis.lettuce.pool.*.
Eviction is delegated to Redis. ODD Platform does not run a housekeeping job for Redis-stored sessions; the Redis server's own per-key TTL and maxmemory-policy govern session eviction. Configure your Redis instance accordingly.
Required and optional connection keys (Spring Boot 3.x, spring.data.redis.*)
spring.data.redis.host: Redis host. Defaults to localhost.
spring.data.redis.port: Redis port. Defaults to 6379.
spring.data.redis.username: Redis ACL username. Optional; omit for password-only or no-auth Redis.
spring.data.redis.password: Redis password. Optional but recommended for any production deployment.
spring.data.redis.database: Redis logical database index. Defaults to 0.
spring.data.redis.ssl.enabled: enable TLS for the Redis connection. Boolean, defaults to false. Set to true for any managed-Redis or TLS-terminated Redis deployment.
spring.data.redis.timeout: command timeout. Duration string (for example 5s). Defaults to Spring Data Redis's internal default.
spring.data.redis.lettuce.pool.*: Lettuce connection-pool sizing (max-active, max-idle, min-idle, max-wait). Optional; tune for high-concurrency deployments.
ODD Platform does not extend or override Spring Boot's Redis property catalogue — the full set of keys recognized under spring.data.redis.* in your Spring Boot version applies as-is.
spring.redis.* (the Spring Boot 2.x prefix) is silently ignored. Spring Boot 3.x removed the spring.redis.* namespace and relocated all Redis properties under spring.data.redis.*. Configuration written against the older prefix will not bind, the platform falls back to localhost:6379 defaults, and the symptom is connection failures against your real Redis instance with no obvious "wrong key" error. Migrate any pre-3.x configuration to spring.data.redis.* (and SPRING_DATA_REDIS_* for env vars).
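As a sketch of the wiring described in this subsection, a REDIS-backed session setup might look like the following. The host name and the REDIS_PASSWORD environment variable are placeholders chosen for illustration, not values shipped with the platform.

```yaml
session:
  provider: REDIS
spring:
  data:
    redis:
      host: redis.example.internal   # placeholder host
      port: 6379
      password: ${REDIS_PASSWORD}    # placeholder; resolved from the environment
      database: 0
      ssl:
        enabled: true                # needed for managed or TLS-only Redis
      timeout: 5s
```

The equivalent environment variables follow the usual naming rule, for example SESSION_PROVIDER=REDIS and SPRING_DATA_REDIS_HOST=redis.example.internal.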
Session lifetime (spring.session.timeout)
Spring Session's timeout controls how long an authenticated session remains valid between requests. ODD Platform's shipped default is -1, which means sessions never expire.
spring.session.timeout: -1 means sessions never expire. A user who logs in once remains authenticated until their session record is explicitly invalidated (logout, cache eviction, or — for IN_MEMORY — platform restart). For any deployment that is internet-facing or serves multiple users, set spring.session.timeout to a finite duration so stolen cookies and forgotten sessions eventually lapse.
spring.session.timeout: session idle timeout. Duration string (for example 30m, 8h, 1d). Defaults to -1 (no timeout). Applies to all three providers (IN_MEMORY, INTERNAL_POSTGRESQL, REDIS).
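For illustration, a multi-instance deployment that persists sessions in PostgreSQL and bounds their lifetime could combine the two settings as sketched below. The 8h value is an arbitrary example, not a recommendation from the platform.

```yaml
session:
  provider: INTERNAL_POSTGRESQL
spring:
  session:
    timeout: 8h   # example idle timeout; pick a value that fits your security policy
```

With this provider, expired rows are still removed by the hourly housekeeping job described above, so they may linger for up to an hour after the timeout elapses.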
Enable Metrics
ODD Platform can represent some of the metadata it ingests as time-series charts — for example, row counts on a MySQL table or the on-disk size of a Redshift database. Metrics handling splits into two independent concerns that share the metrics.* config namespace but do different jobs:
Storage (metrics.storage): the storage tier the platform uses for ingested metrics. This selects where the platform writes metric points as they arrive from collectors and where it reads them back when rendering UI charts. Both directions hit the same backend; you cannot write to one and read from another.
Export (metrics.export.*): where the platform pushes metrics out as OpenTelemetry telemetry, for long-term retention and dashboarding in your observability stack.
Configure the two independently; it is valid (and common) to run with INTERNAL_POSTGRES storage and no OTLP export, or with PROMETHEUS storage and OTLP export disabled, or any other combination.
Metric storage backend
metrics.storage selects the storage tier for metric writes and reads:
INTERNAL_POSTGRES (default): metrics are written to and read from the ODD Platform's own PostgreSQL database (metric_series / metric_point tables). Zero additional infrastructure; suitable for most single-cluster deployments.
PROMETHEUS: metrics are remote-written to an external Prometheus instance (via the Prometheus remote-write protocol at /api/v1/write, using Snappy-compressed Protobuf-encoded write requests) and queried from the same instance (via the instant-query API at /api/v1/query). Suitable when you already run Prometheus for observability and want to avoid storing duplicate metric data in ODD's PostgreSQL.
metrics.prometheus-host is the base URL of the Prometheus instance and is only consulted when metrics.storage=PROMETHEUS. Both /api/v1/write and /api/v1/query are called on this single host. Defaults to http://localhost:9090.
metrics.storage=PROMETHEUS requires metrics.prometheus-host to be set. The platform validates this at startup — if metrics.prometheus-host is empty (or unset) while metrics.storage=PROMETHEUS, ODD Platform fails to start with IllegalStateException: Prometheus host is not defined. Set it to the Prometheus base URL (for example http://prometheus:9090) in the same configuration change that flips the storage backend.
The Prometheus instance must accept remote-write AND queries on the same endpoint. ODD Platform does not support splitting read and write paths across different hosts.
Prometheus server flag: --web.enable-remote-write-receiver must be enabled on the Prometheus process. It is disabled by default in Prometheus v2.33+; without it, every ODD Platform metric write returns 404 Not Found and is silently dropped. The ingestion API still returns 200 to the collector because the remote-write happens downstream of the HTTP acknowledgement, so collector logs will not surface the failure. The symptom is empty charts in the UI.
Endpoint must support both paths: POST /api/v1/write (for writes) and GET /api/v1/query (for reads) must both resolve to the same Prometheus-compatible host.
Read-only Prometheus-compatible backends do not work. A Thanos querier, Mimir in query-only mode, or any other backend that exposes /api/v1/query but rejects /api/v1/write cannot be used as a metrics.storage=PROMETHEUS target. Point metrics.prometheus-host at the write-accepting Prometheus instance itself (or at a Mimir distributor that terminates both paths).
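A minimal sketch of switching the storage backend to Prometheus, assuming a Prometheus instance reachable at the placeholder address http://prometheus:9090 and started with --web.enable-remote-write-receiver:

```yaml
metrics:
  storage: PROMETHEUS
  # Base URL only; the platform calls /api/v1/write and /api/v1/query on it.
  prometheus-host: http://prometheus:9090
```

Both keys should change together, because the platform refuses to start when the storage is PROMETHEUS and the host is empty.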
Metric export to OTLP
Independent of where metrics are stored, ODD Platform can push metrics as OpenTelemetry telemetry to an OTLP collector. Downstream you can forward that stream to Prometheus, New Relic, or any backend that accepts OTLP exporters.
metrics.export.enabled: must be set to true to build and wire the OTLP exporter bean. Defaults to false.
metrics.export.otlp-endpoint: OTLP collector endpoint (gRPC). Defaults to http://localhost:4317.
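As an illustration, enabling the export toward a collector might look like this. The otel-collector host name is a placeholder for your own OTLP gRPC endpoint.

```yaml
metrics:
  export:
    enabled: true
    otlp-endpoint: http://otel-collector:4317   # placeholder collector address
```

This setting is independent of metrics.storage, so it can be combined with either storage backend.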
Enable Alert Notifications
Any alert created inside the platform can be sent via webhook, Slack incoming webhook, and/or email notifications (via Google SMTP, AWS SMTP, etc.). Such notifications contain information such as:
Name of the entity for which the alert was created
Data source and namespace of the entity
Owners of the entity
Possibly affected entities
ODD Platform uses the PostgreSQL replication mechanism so that a notification can still be sent after network lag or a Platform crash. To enable this functionality, the underlying PostgreSQL database needs to be configured as well.
For the user-facing description of the alerting feature — alert types, the per-entity alert tabs, the lifecycle, and per-entity halt configuration — see Active platform features → Alerting. For the user-facing description of the outbound notification channels (Slack incoming webhook, email, generic webhook) and the Prometheus AlertManager inbound webhook, see Active platform features → Notifications.
Slack here is the outgoing alert webhook, not the Discussions Slack app. The alert-notifications integration is a one-way Slack incoming webhook — the platform POSTs alert messages to a channel via notifications.receivers.slack.url. It is distinct from the full Slack app used by Data Collaboration for in-app per-entity discussion threads (OAuth + Events API; bidirectional). Each integration is configured separately: enabling the alert webhook does not surface the Discussions tab on data-entity pages, and enabling Data Collaboration does not route alerts. See Main Concepts → Terms & Aliases for the side-by-side comparison.
PostgreSQL Configuration
The PostgreSQL database must be configured to support the Platform's replication mechanism, and the database user must be granted replication permissions.
Database settings
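To configure the database, add entries for wal_level (which must be logical for logical replication through a publication), max_wal_senders, and max_replication_slots (each at least 1) to the postgresql.conf file.
If the replication mechanism is already configured, just increment the max_wal_senders and max_replication_slots values.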
Database user permissions
The ODD Platform database user must be granted replication permissions, for example with the command ALTER ROLE {your_database_username} WITH REPLICATION;
User permissions and database configuration may vary from one on-demand/cloud provider to another.
For instance, in AWS RDS, PostgreSQL instances are managed services where certain aspects of replication management are automated. This is done to minimize the risk of misconfiguration. Due to this managed nature, some settings are either not exposed or are altered differently compared to a standard PostgreSQL setup. To enable notifications in such an environment, follow these steps (only differences are mentioned):
1. Alter the rds.logical_replication parameter in your database instance's Parameter Group by setting it to 1, instead of directly modifying the wal_level parameter.
2. Ensure the ODD user connecting to the database has the rds_replication role. The Master username of the database typically already has this role by default. If using a different username, you may need to assign the necessary role using the command GRANT rds_replication TO {your_database_username};
3. If you changed max_wal_senders to 5 (listed as the minimal value in the Parameter Group) and keep getting messages like "The parameter max_wal_senders was set to a value incompatible with replication. It has been adjusted from 5 to 55" in the events list of the database instance, consider setting the parameter in the Parameter Group to the value named in the message, so that RDS no longer changes it automatically.
ODD Platform configuration
The following variables need to be defined:
notifications.enabled: must be set to true. Defaults to false
notifications.message.downstream-entities-depth: limits how many lineage-graph levels are traversed when fetching affected data entities. Defaults to 1
notifications.wal.advisory-lock-id: ODD Platform uses a PostgreSQL advisory lock to make sure that, when the Platform is scaled horizontally, only one instance processes alert messages. This setting defines the advisory lock id. Defaults to 100
notifications.wal.replication-slot-name: PostgreSQL replication slot name. The slot is created if it doesn't exist yet. Defaults to odd_platform_replication_slot
notifications.wal.publication-name: PostgreSQL publication name. The publication is created if it doesn't exist yet. Defaults to odd_platform_publication_alert
notifications.receivers.slack.url: Slack incoming webhook URL. The clickable links rendered inside Slack messages use odd.platform-base-url; there is no notifications.receivers.slack.* base-URL setting.
notifications.receivers.webhook.url: Generic webhook URL
notifications.receivers.email.host: the SMTP server.
notifications.receivers.email.port: the port used for the email protocol (SMTP, IMAP, or POP3)
notifications.receivers.email.protocol: the email protocol (e.g., SMTP, SMTPS, IMAP, IMAPS, POP3, POP3S)
notifications.receivers.email.smtp.auth: a boolean value (true or false) indicating whether the SMTP server requires authentication
notifications.receivers.email.smtp.starttls: a boolean indicating whether to use STARTTLS, a security protocol that upgrades an unencrypted connection to an encrypted one
notifications.receivers.email.password: the password used for email authentication
notifications.receivers.email.sender: the email address sending the notifications
notifications.receivers.email.notification.emails: the list of recipients for the email notifications
odd.platform-base-url
ODD Platform URL exposed to three internal consumers: the Slack-notification sender, the email-notification sender, and the integration-parameter substitution context. The two notification senders use it to build clickable links inside alert messages. The generic webhook receiver does not consume this key; it gets the full alert payload directly and is expected to construct any URLs it needs from that payload. The platform also substitutes the resolved value as the platform_url parameter in integration configurations, which is how Airflow plugins, dbt artifacts, and similar integrations resolve their reference to the ODD platform URL at runtime. Defaults are inconsistent across consumers: the notification senders default to http://localhost:8080, while the integration-substitution context defaults to the placeholder string http://your.odd.platform. Both defaults are unreachable from outside the host machine; set this key to your real deployment URL (for example https://odd.your-domain.com) in any non-local environment.
Operators deploying integrations must set ODD_PLATFORM_BASE_URL even if alert notifications are disabled. The integration-parameter substitution context reads the same key to populate the platform_url parameter exposed to integration configurations. If the key is unset, integrations that reference platform_url receive the literal string http://your.odd.platform — a placeholder that will not connect to anything — and the integration will fail in confusing ways at runtime with no error from ODD Platform itself.
ODD Platform configuration would look like this:
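A sketch of such a configuration, built only from the keys and defaults listed above. The Slack and webhook URLs and the base URL are placeholders, and only the receivers you actually use need to be present.

```yaml
odd:
  platform-base-url: https://odd.your-domain.com   # used for links in Slack and email messages
notifications:
  enabled: true
  message:
    downstream-entities-depth: 1
  wal:
    advisory-lock-id: 100
    replication-slot-name: odd_platform_replication_slot
    publication-name: odd_platform_publication_alert
  receivers:
    slack:
      url: https://hooks.slack.com/services/XXX/YYY/ZZZ   # placeholder incoming-webhook URL
    webhook:
      url: https://alerts.example.com/odd                 # placeholder generic webhook
```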
Example: Gmail SMTP
A minimal, working configuration for Gmail's SMTP over STARTTLS. Gmail requires an app password (generated from your Google account with 2-Step Verification enabled) — your regular account password will not work.
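The sketch below assumes Gmail's public SMTP endpoint (smtp.gmail.com on port 587 with STARTTLS). The sender and recipient addresses and the GMAIL_APP_PASSWORD environment variable are placeholders, and the comma-separated recipient format is an assumption about how the list is bound.

```yaml
notifications:
  enabled: true
  receivers:
    email:
      host: smtp.gmail.com
      port: 587
      protocol: smtp
      smtp:
        auth: true
        starttls: true
      sender: alerts@example.com          # placeholder: the Gmail account sending alerts
      password: ${GMAIL_APP_PASSWORD}     # placeholder: a Google app password, not the account password
      notification:
        emails: data-team@example.com,oncall@example.com   # placeholder recipients
```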
Known limitations
ODD Platform builds its JavaMailSender with only the keys documented above. The JavaMail session inherits defaults for every other SMTP parameter, and several of those defaults are operator-hostile in production deployments. None of the following is currently exposed as an ODD configuration key — where a workaround exists it is noted, but the limitations are real and should drive your choice of SMTP relay.
SMTP timeouts are unset — an unreachable SMTP server will hang notification delivery. The JavaMail defaults for mail.smtp.connectiontimeout, mail.smtp.timeout (read), and mail.smtp.writetimeout are infinite. If the configured SMTP host is unreachable, slow, or stalls mid-response, the notification thread blocks until the TCP stack eventually tears the connection down — there is no application-level timeout to cut it short. Use an SMTP relay you control (or a trusted managed service) and monitor its availability separately from ODD Platform.
Only STARTTLS is supported — implicit-TLS ports (e.g. Gmail port 465, many corporate relays) will not work. ODD Platform exposes notifications.receivers.email.smtp.starttls but does not expose mail.smtp.ssl.enable, which is the JavaMail flag required to open an implicit-TLS connection. If your SMTP server only accepts connections on an implicit-TLS port, you must front it with a STARTTLS-capable relay (port 587 is the common choice). Gmail over port 587 with STARTTLS (the example above) works; Gmail over port 465 does not.
Self-signed or internal-CA SMTP certificates require a JVM-level workaround. mail.smtp.ssl.trust is not exposed as an ODD configuration key. If your SMTP relay presents a certificate signed by a private CA, the connection will fail certificate validation unless you either (a) add the CA to the JVM truststore of the ODD Platform container ($JAVA_HOME/lib/security/cacerts or a -Djavax.net.ssl.trustStore=... override) before starting the process, or (b) use an SMTP relay with a publicly-trusted certificate. There is no configuration-file path to this.
Non-ASCII subjects and bodies may be mangled. The MIME message is built without an explicit charset, so JavaMail falls back to the JVM default. Containers that do not set file.encoding or LANG explicitly can end up with US-ASCII defaults, which corrupt non-Latin alert content. If your alert text includes non-ASCII characters, set JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF-8 on the ODD Platform container.
Silent partial delivery: if one recipient fails, subsequent recipients are skipped. EmailNotificationSender iterates over the recipient list in notifications.receivers.email.notification.emails and calls the SMTP server once per recipient. If recipient N fails (bad address, mailbox full, server-side policy rejection), the exception is wrapped as a RuntimeException and the loop terminates — recipients N+1, N+2, … never receive the alert. There is no retry and no partial-failure metric. Keep the recipient list short, use distribution lists on the SMTP side for fan-out, and validate addresses before adding them to the list.
Cleaning up
ODD Platform doesn't clean up the replication slot it has created. If you need to disable the Alert Notification functionality, perform the following steps in addition to disabling the feature on the ODD Platform side.
To remove the replication slot and publication, run SQL queries against the database that drop the slot (PostgreSQL's pg_drop_replication_slot function, called with <replication_slot_name>) and the publication (DROP PUBLICATION <publication_name>), where
<replication_slot_name> is the name of the replication slot defined in the ODD Platform. Default is odd_platform_replication_slot
<publication_name> is the name of the publication defined in the ODD Platform. Default is odd_platform_publication_alert
Prometheus AlertManager Integration
In addition to raising alerts internally (failed jobs, data-quality tests, schema changes, distribution anomalies — see the Alerting feature), ODD Platform exposes an inbound webhook that accepts Prometheus AlertManager notifications. Each inbound alert becomes a Distribution Anomaly alert on the referenced data entity, visible in the Alerts section and on the entity's page.
Endpoint
POST /ingestion/alert/alertmanager
Response: 204 No Content on success. The endpoint consumes the AlertManager webhook body and always returns an empty response.
Payload shape
The platform accepts a subset of the AlertManager webhook schema — specifically alerts[].labels, alerts[].generatorURL, and alerts[].startsAt. Other top-level AlertManager fields (version, status, receiver, groupLabels, commonLabels, …) are accepted and ignored.
The entity_oddrn label is required for the alert to route to a data entity. ODD Platform reads alerts[].labels["entity_oddrn"] to determine which data entity the alert belongs to. An alert submitted without this label is stored with an empty owner, will not appear on any entity's page, and is effectively orphaned. Configure your AlertManager route or your alerting rules to include the target entity's ODDRN as a label.
Example AlertManager receiver configuration
A minimal alertmanager.yml receiver forwarding every alert to ODD Platform:
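A sketch of such a receiver, assuming the platform is reachable from AlertManager at the placeholder address http://odd-platform:8080. The receiver name is arbitrary.

```yaml
route:
  receiver: odd-platform          # send every alert to the receiver below
receivers:
  - name: odd-platform
    webhook_configs:
      - url: http://odd-platform:8080/ingestion/alert/alertmanager
```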
The reference example shipped with the platform is at docker/examples/config/alertmanager.yaml in the odd-platform repo. To make an alert route to a specific entity, attach entity_oddrn as a label in your Prometheus alerting rules — for example:
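A hypothetical Prometheus alerting rule that attaches the label might be sketched as follows. The metric name, the threshold, and the ODDRN value are invented for illustration; use the ODDRN of the real target entity from your platform.

```yaml
groups:
  - name: odd-platform-examples
    rules:
      - alert: OrdersTableRowCountDropped
        expr: example_table_row_count{table="orders"} < 1   # hypothetical metric
        for: 5m
        labels:
          severity: warning
          # Hypothetical ODDRN; this label is what routes the alert to an entity.
          entity_oddrn: "//postgresql/host/db.example.com/databases/shop/schemas/public/tables/orders"
        annotations:
          summary: Row count for the orders table dropped to zero
```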
Authentication
The AlertManager webhook endpoint is not authenticated. ODD Platform whitelists the entire /ingestion/** namespace in Spring Security, and the ingestion auth filter controlled by auth.ingestion.filter.enabled only guards /ingestion/entities (POST) — it does not cover /ingestion/alert/alertmanager. Anyone with network reach to the platform can POST arbitrary AlertManager-shaped payloads and create alerts on any data entity whose ODDRN they can guess. Toggling auth.ingestion.filter.enabled has no effect on this endpoint.
Because no application-level authentication is enforced on this endpoint today, protect it at the perimeter. Any of these approaches works:
Network segmentation: expose ODD Platform only on a private network or VPN; in Kubernetes, keep AlertManager and the platform in the same cluster and use a NetworkPolicy so only the AlertManager pod can reach /ingestion/alert/alertmanager.
Reverse proxy with its own authentication: put an authenticating proxy in front of ODD Platform (for example, nginx with auth_request delegating to an SSO sidecar, or Envoy with ext_authz) and require AlertManager to present a proxy-validated credential on every webhook call.
mTLS termination: require client certificates on /ingestion/alert/alertmanager at the ingress or load balancer layer, and issue a certificate only to the AlertManager pod.
A platform-side fix to extend the ingestion auth filter to cover this endpoint is tracked upstream. Until it ships, apply one of the perimeter controls above for any deployment where the platform's network is not fully trusted.
For the broader ingestion-auth model — what auth.ingestion.filter.enabled does cover and how ingestion API keys are provisioned for /ingestion/entities — see Enable security and Server-to-server (S2S) API keys.
Enable Data Collaboration
The Data Collaboration feature allows users to start a discussion about a specific data entity in a messenger directly from the ODD Platform. Thread replies are tracked and saved by ODD Platform, so users can retrieve a conversation's context and decisions from one place.
For the user-facing description of the feature — the per-entity Discussions tab, how a discussion flows from the platform out to Slack and back, the message-lifecycle model — see Active platform features → Data Collaboration.
At the moment ODD Platform supports only Slack as a target messenger. It uses the Slack APIs to send messages and the Slack Events API to receive thread replies to those messages.
Slack here is the full Slack app for in-app discussions, not the alert webhook. The Data Collaboration integration uses an OAuth-token-driven Slack app (datacollaboration.slack-oauth-token) and the Slack Events API webhook to read replies back into the platform — bidirectional. It is distinct from the outgoing alert webhook used by alert notifications (notifications.receivers.slack.url, one-way write only). Each integration is configured separately: enabling this one does not route alerts, and enabling the alert webhook does not surface the Discussions tab on data-entity pages. See Main Concepts → Terms & Aliases for the side-by-side comparison.
Creating Slack application
Go to the Slack apps website and click on Create New App -> From an app manifest

Select a workspace you want to add an application to and click Next

Enter the following manifest into the YAML section, replace <ODD_PLATFORM_BASE_URL> with the URL of your ODD Platform deployment, and click Next

Review your application's scopes and permissions and click Create

Follow Slack's instructions to install the application into the workspace.
ODD Platform configuration
The following variables need to be defined:
datacollaboration.enabled: must be set to true. Defaults to false
datacollaboration.receive-event-advisory-lock-id: PostgreSQL advisory lock id for the job that translates events from messengers to messages. Defaults to 110
datacollaboration.sender-message-advisory-lock-id: PostgreSQL advisory lock id for the job that sends messages created in the platform to messengers. Defaults to 120
datacollaboration.message-partition-period: time interval in days for a message table partition in PostgreSQL. Defaults to 30
datacollaboration.sending-messages-retry-count: how many times the Platform will attempt to send a message to the provider. Cannot be less than zero. Defaults to 3
datacollaboration.slack-oauth-token: Slack application OAuth token used for communicating with Slack. Can be retrieved in the OAuth & Permissions section of a Slack application.
Retrieving OAuth Token
datacollaboration.message-partition-period (default 30) is read by MessageTablePartitionManager (@Value("${datacollaboration.message-partition-period:30}")) — separate from DataCollaborationProperties, which only carries the two advisory-lock IDs and the retry count. The partition manager creates a new PostgreSQL partition for the messages table every N days; lowering the value increases partition churn, raising it reduces partition count but enlarges each partition.
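As a minimal sketch, the keys above could be combined in application.yml as follows. The values are the documented defaults except for enabled, and the token value is a placeholder (an assumption for illustration) that you would replace with the token from your own Slack application.

```yaml
# Sketch of a Data Collaboration block; all keys are the ones listed above.
datacollaboration:
  enabled: true                          # default is false
  receive-event-advisory-lock-id: 110    # documented default
  sender-message-advisory-lock-id: 120   # documented default
  message-partition-period: 30           # days per message-table partition
  sending-messages-retry-count: 3        # cannot be less than zero
  slack-oauth-token: "<SLACK_OAUTH_TOKEN>"  # placeholder, not a real token
```

Only enabled and slack-oauth-token have to be set explicitly in this sketch; the remaining lines restate defaults and can be omitted.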
API surface
The full HTTP API for Data Collaboration is documented at API Reference → Data Collaboration — 7 routes across three groups (outbound to the provider, per-entity threads & history, inbound webhook from Slack), all gated by @ConditionalOnDataCollaboration and returning 404 Not Found when datacollaboration.enabled=false.
Housekeeping Settings Configuration
ODD Platform runs a background housekeeping job that permanently deletes stale data on a schedule. The job fires every 15 minutes, is guarded by a ShedLock so only one platform instance runs it at a time in a multi-instance deployment, and iterates through three cleanup tasks: resolved alerts, search-facet history, and soft-deleted data entities.
Configuration keys
housekeeping.enabled: enables the background job. Defaults to true. See the caveat below before disabling.
housekeeping.ttl.resolved_alerts_days: how many days an alert in RESOLVED_AUTOMATICALLY status is kept after its status-update timestamp before the housekeeping job permanently deletes it (alongside its chunk records). Integer, days. Defaults to 30. Note: the retention window is intended to apply to both RESOLVED (manual) and RESOLVED_AUTOMATICALLY (system) states, but a known platform bug currently exempts manual resolutions from the retention check: manual RESOLVED alerts are hard-deleted on the next housekeeping run regardless of this value. See Alerting → Auto-cleanup of resolved alerts for the operator-side workaround.
housekeeping.ttl.search_facets_days: how many days a saved search-facet entry is kept past its last_accessed_at timestamp before being deleted. Integer, days. Defaults to 30.
housekeeping.ttl.data_entity_delete_days: how many days a data entity with status DELETED is kept after its status-update timestamp. After this, the entity and its cascading related rows (metadata values, ownerships, lineage, tags, terms, alerts, messages, metrics, attachments, task runs, group relations, and, for datasets, dataset structure and enum values) are permanently deleted. Integer, days. Defaults to 30.
For the user-facing entity lifecycle (how operators set DELETED and the other status states from the UI), see Data entity statuses.
Disabling housekeeping (housekeeping.enabled: false) stops all three cleanup jobs. Resolved alerts, search-facet history, and soft-deleted data entities will accumulate indefinitely and the PostgreSQL database will grow without bound. Leave the job enabled in production; disable only for debugging or offline migrations, and re-enable (or run a manual cleanup) afterwards.
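A minimal sketch of the housekeeping block, using the key names listed in this section. The 90-day value for soft-deleted entities is a hypothetical choice for illustration, not a recommendation from the Platform.

```yaml
# Sketch: keep the job enabled and lengthen retention for soft-deleted entities.
housekeeping:
  enabled: true                   # default; disabling stops all three cleanup tasks
  ttl:
    resolved_alerts_days: 30      # documented default
    search_facets_days: 30        # documented default
    data_entity_delete_days: 90   # hypothetical override (default is 30)
```

Note that these three TTL keys use underscores, unlike most other keys on this page, which use hyphens.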
Platform-level settings (odd.*)
The odd.* namespace groups four platform-wide settings that do not belong to any subsystem: stale-metadata detection, the optional Prometheus tenant label, the Activity-feed partitioning period, and a list of additional navigation links surfaced in the App Info menu. A fifth key in the same namespace, odd.platform-base-url, is documented above in Enable Alert Notifications → odd.platform-base-url. That section is the primary operator-facing context where the key is introduced, but the same key is also consumed by the integration-parameter substitution context, so any non-local deployment must set it regardless of which subsystems (notifications, integrations, or both) are enabled.
Detecting stale metadata
Stale metadata is metadata that has not been refreshed from its source for longer than an operator-defined window. This typically happens when a collector is paused, deactivated, or failing to reach the source system. When the platform judges an entity to be stale, the UI surfaces it with a "Stale" indicator so users can distinguish data whose freshness is uncertain from actively-maintained metadata. For the user-facing surface (where the indicator appears, how the freshness signal differs from runtime alerts), see Stale-metadata indicator.
odd.data-entity-stale-period: number of days after the entity's last successful ingestion before it is labeled "Stale" in the UI and API. Integer, days. Defaults to 7.
Operators running collectors on schedules longer than a week should raise this value to match the collector cadence — otherwise entities that were ingested successfully will be flagged stale between runs.
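For instance, a deployment whose collectors run every two weeks might raise the window as sketched below. The 14-day cadence is an assumed example, not a value taken from the Platform.

```yaml
# Sketch: match the stale window to an assumed 14-day collector schedule.
odd:
  data-entity-stale-period: 14   # days; default is 7
```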
Prometheus tenant label (odd.tenant-id)
When metrics.storage is set to PROMETHEUS, the platform appends tenant_id={value} as a label on every Prometheus instant query it issues. This lets a single shared Prometheus instance serve metric data for multiple ODD Platform deployments without their metric series colliding, because each deployment queries only its own tenant-labeled series.
odd.tenant-id: tenant identifier appended as a Prometheus query label. String, no default (empty means no label is applied, and the Prometheus query returns series across all tenants). Ignored when metrics.storage=INTERNAL_POSTGRES.
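A minimal sketch of a tenant-labeled setup. The tenant name is hypothetical, and other Prometheus connection settings are omitted because they are not covered in this section.

```yaml
# Sketch: tenant label applied to Prometheus queries.
metrics:
  storage: PROMETHEUS        # odd.tenant-id is ignored with INTERNAL_POSTGRES
odd:
  tenant-id: analytics-prod  # hypothetical tenant name; queries carry tenant_id=analytics-prod
```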
Activity-feed partitioning (odd.activity.partition-period)
The ODD Platform activity table is range-partitioned on a rolling date window; odd.activity.partition-period sets the partition width in days. The default creates a new partition every 30 days, which is appropriate for most deployments. Operators running high-volume deployments (millions of activity events per day) can tune this downward to narrow partitions, since smaller partitions speed up vacuum and partition-prune operations on the activity feed.
odd.activity.partition-period: partition width in days for the activity table. Integer, days. Defaults to 30.
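As a sketch, a high-volume deployment could narrow the partitions like this. The 7-day width is an assumed example; the right value depends on your own event volume.

```yaml
# Sketch: weekly partitions for the activity table.
odd:
  activity:
    partition-period: 7   # days; default is 30
```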
Additional navigation links (odd.links)
Operators can attach a list of arbitrary navigation links: pointers to internal wikis, runbooks, dashboards, or any other page teams should reach from inside ODD Platform. The platform UI surfaces them inside the App Info menu (the popup behind the information icon in the top-right toolbar). Each link renders as a menu item showing its title and opens the configured URL in a new tab when clicked.
odd.links: list of link objects. Each entry has two required fields:
title: the menu-item label shown in the App Info menu. String, required.
url: the absolute URL the menu item opens in a new tab. String, required.
Defaults to an empty list — when unset, the App Info menu omits the additional-links section entirely.
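A minimal sketch of a two-entry list. The titles and URLs are hypothetical placeholders; only the title and url field names come from this section.

```yaml
# Sketch: two additional links shown in the App Info menu.
odd:
  links:
    - title: Data team wiki               # hypothetical label
      url: https://wiki.example.com/data  # hypothetical absolute URL
    - title: On-call runbook
      url: https://runbooks.example.com/odd
```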
The links are exposed to the UI through the authenticated GET /api/links endpoint and are visible to every user signed in to the platform. Use them for navigation hints only — do not embed credentials, session tokens, or one-time secrets in link URLs, since any logged-in user can read them.
Attachment Storage Configuration
ODD Platform allows users to attach files and links to data entities from the UI. This section covers the operator-facing configuration for where those uploaded files are stored. For the user-facing upload workflow (what users can attach, the per-entity Attachments tab, the DATA_ENTITY_ATTACHMENT_MANAGE permission), see Attachments and links.
The default LOCAL storage mode is ephemeral. Attachments are written to /tmp/odd/attachments inside the ODD Platform container filesystem. Any container or pod restart — routine deployment, node drain, crash, Kubernetes eviction — permanently deletes all uploaded files.
Use REMOTE (S3 / MinIO) storage for any Kubernetes or Docker deployment where users will actually upload attachments. LOCAL mode is suitable only for single-host evaluations or local development where losing attachments on restart is acceptable.
Configuration keys
attachment.storage: storage backend. One of LOCAL or REMOTE. Defaults to LOCAL.
attachment.max-file-size: maximum size per uploaded file, in megabytes. Defaults to 20. See the hint below if raising this above 20 MB.
attachment.local.path: filesystem directory where attachments are written when storage=LOCAL. Defaults to /tmp/odd/attachments (ephemeral; see warning above).
attachment.remote.url: S3-compatible endpoint URL when storage=REMOTE (for example https://s3.us-east-1.amazonaws.com for AWS S3 or http://minio:9000 for a MinIO service). See the Known limitations (REMOTE mode) subsection below before choosing your endpoint, in particular the us-east-1 restriction for AWS S3 and the chunked-upload staging behavior.
attachment.remote.access-key: access key for the S3-compatible bucket.
attachment.remote.secret-key: secret key for the S3-compatible bucket.
attachment.remote.bucket: bucket name used to store attachment objects. The bucket must already exist; ODD Platform does not create it.
spring.codec.max-in-memory-size: platform-wide cap on the in-memory buffer Spring WebFlux uses when reading a request body. Defaults to 20MB. This is the transport-layer ceiling: attachment.max-file-size cannot effectively exceed it. Accepts a size string (20MB, 100MB, 1GB).
attachment.max-file-size must not exceed spring.codec.max-in-memory-size. Both ship with the same 20 MB default, so the attachment cap is effective out of the box. If you raise attachment.max-file-size to allow larger uploads — for example 100 MB — you must raise spring.codec.max-in-memory-size to at least the same value, otherwise uploads above 20 MB fail at the WebFlux codec layer with DataBufferLimitException before the attachment validation runs.
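The 100 MB case described above could be sketched as follows. Note the different value formats: the attachment cap is a plain number of megabytes, while the codec limit is a size string.

```yaml
# Sketch: raise both limits together so uploads up to 100 MB pass the codec layer.
attachment:
  max-file-size: 100           # megabytes; default is 20
spring:
  codec:
    max-in-memory-size: 100MB  # size string; must be at least the attachment cap
```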
Example: REMOTE storage with S3-compatible backend (MinIO or AWS S3)
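A minimal sketch of a REMOTE configuration, assuming a MinIO service reachable at http://minio:9000. The bucket name and the credential placeholders are hypothetical; substitute your own values.

```yaml
# Sketch: attachments stored in an S3-compatible bucket.
attachment:
  storage: REMOTE
  remote:
    url: http://minio:9000        # or https://s3.us-east-1.amazonaws.com for AWS S3
    access-key: "<ACCESS_KEY>"    # placeholder
    secret-key: "<SECRET_KEY>"    # placeholder
    bucket: odd-attachments       # hypothetical name; the bucket must already exist
```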
Known limitations (REMOTE mode)
ODD Platform builds its MinioAsyncClient with only the endpoint and credentials documented above. The MinIO Java SDK inherits defaults for every other parameter, and the attachment-upload code path carries a small amount of additional behavior that is not configurable. None of the following is currently exposed as an ODD configuration key — plan your deployment around these limits rather than assuming a config flag will fix them.
AWS S3 region pinned to us-east-1. The attachment client is built without an explicit region, so it uses the MinIO Java SDK's default region (us-east-1) for request signing. Against AWS S3 this means only buckets in us-east-1 work — buckets in any other region fail signature validation with errors such as AuthorizationHeaderMalformed or PermanentRedirect. If you need AWS S3 in another region, either host your bucket in us-east-1 or use a MinIO server in front of it. Self-hosted MinIO and most other S3-compatible services ignore the region header and are unaffected.
HTTP client timeouts are the MinIO SDK defaults (~5 minutes), not configurable. ODD Platform does not supply a custom OkHttpClient to the MinIO builder, so the SDK's built-in defaults apply: roughly a 5-minute read/write timeout. A single large upload whose end-to-end wall time (network transfer + S3 ingest) exceeds that limit fails with a socket-timeout error even though the content was being streamed successfully. If your users upload near the attachment.max-file-size limit over a slow link, keep attachment.max-file-size below the size a typical upload can complete inside 5 minutes at your network's real throughput.
Chunked uploads are assembled on the container's local filesystem before they are sent to REMOTE storage — a mid-upload container restart loses the staged chunks. The UI splits large files into chunks and uploads each chunk individually; the platform writes each chunk to a local directory (the same directory family that backs attachment.local.path) and reassembles the full file there before streaming it to the S3-compatible backend. This is true even when attachment.storage=REMOTE. If the ODD Platform container is restarted, evicted, or rescheduled during an in-flight chunked upload, the local directory is wiped and the partial upload is unrecoverable — the user must re-upload from scratch. In Kubernetes deployments, either mount a persistent volume at the chunk-staging directory or limit the maximum upload size so single-request uploads are the norm. The LOCAL-mode ephemeral warning above applies to chunk-staging in REMOTE mode as well.
No retry on transient S3 / MinIO errors. Put, get, and remove operations against the bucket do not retry on transient failures — a single 503 from S3, a connection reset from the network, or a short MinIO outage surfaces as a failed operation with no automatic recovery. If your alerting pipeline treats attachment failures as user-impacting errors, add retry at the infrastructure layer (for example an S3-proxy sidecar with retry) rather than expecting the platform to paper over it.
Example: LOCAL storage (single-host / local evaluation only)
If you keep LOCAL mode, override attachment.local.path to a persistent volume mount rather than the default /tmp/odd/attachments, and confirm the volume is actually persistent across restarts in your deployment topology.
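A minimal sketch of a LOCAL configuration with the path overridden. The directory shown is a hypothetical mount point, and the sketch assumes a persistent volume is actually mounted there.

```yaml
# Sketch: LOCAL storage pointed at a persistent mount instead of /tmp.
attachment:
  storage: LOCAL
  local:
    path: /var/lib/odd/attachments   # hypothetical persistent volume mount
```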
Logging Settings Configuration
Logs provide detailed information about errors in the application, helping its users quickly identify and fix problems. Setting up logging is recommended for ensuring operational excellence, system reliability, effective monitoring and troubleshooting. Here is a code snippet for setting up logs in ODD Platform:
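A minimal sketch of such a block, assuming the standard Spring Boot logging.level properties. The package-level logger name is an assumption for illustration; adjust it to the packages you want to control.

```yaml
# Sketch: info-level logging via Spring Boot's logging.level keys.
logging:
  level:
    root: info                    # baseline for all loggers
    org.opendatadiscovery: info   # assumed package name for Platform classes
```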
Setting the logging level to info lets you see useful messages about the platform’s functioning without the excess detail of trace or debug, and without missing important events, as can happen at warn or higher levels.
However, feel free to adjust the logging level as needed to get more or less information based on your specific requirements.
GenAI Configuration
The platform can proxy natural-language questions to an external AI service via three keys under the genai prefix (@ConfigurationProperties("genai") per GenAIProperties.java). The feature is disabled by default and is API-only today (no in-app UI affordance calls the endpoint).
genai.enabled (boolean, env GENAI_ENABLED): feature toggle. Default false (set explicitly at application.yml line 18). When false, POST /api/genai/ask returns HTTP 400 with the message "Gen AI is disabled".
genai.url (string, env GENAI_URL): base URL of the external AI service. The platform's genAiWebClient is built at startup with this as baseUrl and POSTs each request to {genai.url}/query_data. No @ConfigurationProperties default: the field has no initializer in GenAIProperties.java, so its Java default is null. The example in application.yml line 19 (# url: http://localhost:5000) is commented out, not a default.
genai.request_timeout (integer, env GENAI_REQUEST_TIMEOUT): outbound response timeout, in minutes. Wired into WebClientConfiguration.java:23 as Duration.ofMinutes(genAIProperties.getRequestTimeout()). No @ConfigurationProperties default: the Java primitive int default is 0, which means immediate timeout. The example in application.yml line 20 (# request_timeout: 2) is commented out, not a default.
Setting only genai.enabled=true will silently misconfigure the feature. With url defaulting to null and request_timeout defaulting to 0, the WebClient is built with no baseUrl and a Duration.ofMinutes(0) timeout — every POST /api/genai/ask will fail before the external service has a chance to respond. Always set all three keys when enabling.
WebClientConfiguration reads genai.url and genai.request_timeout once at startup when constructing the genAiWebClient Spring bean. Changing those values requires a Platform restart.
A working configuration block:
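A minimal sketch that sets all three keys, reusing the values from the commented-out examples in application.yml. The localhost URL assumes an AI service listening on port 5000 of the same host; replace it with the address of your own service.

```yaml
# Sketch: all three genai keys set together, as required when enabling the feature.
genai:
  enabled: true
  url: http://localhost:5000   # requests are POSTed to {genai.url}/query_data
  request_timeout: 2           # minutes; leaving this unset yields 0 (immediate timeout)
```

The equivalent environment variables are GENAI_ENABLED, GENAI_URL, and GENAI_REQUEST_TIMEOUT.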
The platform sends no authentication to the external AI service and does not retry. See the dedicated GenAI assistant page for the external service contract (POST /query_data with JSON {"question": "..."}), the platform's /api/genai/ask request/response schemas, and the per-error behavior.
Machine-to-Machine (M2M) Tokens Configuration
ODD Platform supports a static API-key authentication mode for non-UI callers (CI/CD jobs, ingestion pipelines, automation scripts) — also referred to as Machine-to-Machine (M2M) tokens. It is disabled by default.
For the full configuration keys, the header contract, the curl example, and security considerations (token rotation, HTTPS, blast radius), see Server-to-server (S2S) authentication.