Configure ODD Platform
This section describes how to configure ODD Platform so that you can use all of its features.
Configuration approaches
There are two ways to configure the Platform:
Environment variables are suitable for simple entries.
YAML configuration comes in handy when you need to define a complex configuration block (e.g. OAuth2 authentication or logging levels).
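As a rough illustration of the two approaches, the same simple entry could be written either way. This sketch assumes Spring Boot's usual relaxed binding, where the environment variable is the property name in upper case with dots replaced by underscores; the chosen value is only an example.

```yaml
# YAML form of a simple entry (session.provider is described later on this page)
session:
  provider: INTERNAL_POSTGRESQL

# Assumed environment-variable equivalent under Spring Boot relaxed binding:
#   SESSION_PROVIDER=INTERNAL_POSTGRESQL
```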
Connect your database
ODD Platform uses PostgreSQL, and only PostgreSQL, for all of its features. Define the following variables to connect ODD Platform to the database:
spring.datasource.url
: JDBC string of your PostgreSQL database. Default value is jdbc:postgresql://127.0.0.1:5432/odd-platform
spring.datasource.username
: your PostgreSQL user's name. Default value is odd-platform
spring.datasource.password
: your PostgreSQL user's password. Default value is odd-platform-password
The following variables are optional (by default, they take the same values as their spring.datasource counterparts) and are used to connect to the PostgreSQL database that stores Lookup Tables:
spring.custom-datasource.url
: JDBC string of the PostgreSQL database where Lookup Tables are stored. Default value is jdbc:postgresql://127.0.0.1:5432/odd-platform. Note: you can specify any {database_host}, {database_port} or {database_name}, but the schema where Lookup Tables are stored is always lookup_tables_schema.
spring.custom-datasource.username
: your PostgreSQL user's name for custom-datasource. Default value is odd-platform
spring.custom-datasource.password
: your PostgreSQL user's password for custom-datasource. Default value is odd-platform-password
The resulting database connection block would look like this:
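A minimal sketch of such a block, using the default values listed above and assuming the property names map onto nested YAML keys in the usual Spring Boot way:

```yaml
spring:
  datasource:
    url: jdbc:postgresql://127.0.0.1:5432/odd-platform
    username: odd-platform
    password: odd-platform-password
  # Optional: when omitted, the values of spring.datasource are used.
  custom-datasource:
    url: jdbc:postgresql://127.0.0.1:5432/odd-platform
    username: odd-platform
    password: odd-platform-password
```

In a real deployment the host, port, database name and credentials would be replaced with your own.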
Security
To enable security in ODD Platform, follow the Enable security section.
Select session provider
ODD Platform can keep users' sessions in memory, in the PostgreSQL database or in Redis. Set the session provider via the session.provider variable, which accepts the following values:
IN_MEMORY
: Local in-memory storage. ODD Platform defaults to this value
INTERNAL_POSTGRESQL
: Underlying PostgreSQL database
REDIS
: Redis data store
If you run only one instance of ODD Platform and can tolerate users being logged out each time the Platform restarts, the best choice is IN_MEMORY
If you already have Redis in your infrastructure or are willing to install it, the best choice is REDIS
Otherwise, INTERNAL_POSTGRESQL is the best pick
In memory (default)
Internal PostgreSQL
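A minimal sketch of selecting this provider, assuming the session.provider variable maps onto nested YAML keys. No extra connection settings are shown, on the assumption that the Platform reuses its own database connection:

```yaml
session:
  provider: INTERNAL_POSTGRESQL
```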
Redis
To connect to Redis, define the following variables:
spring.redis.host
: Redis host
spring.redis.port
: Redis port
spring.redis.username
: Redis user's name
spring.redis.password
: Redis user's password
spring.redis.database
: Redis database index
YAML for Redis session provider
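A minimal sketch built from the variables listed above. The host, credentials and database index are hypothetical placeholders, and 6379 is simply Redis' conventional default port:

```yaml
session:
  provider: REDIS
spring:
  redis:
    host: localhost          # hypothetical host
    port: 6379               # conventional Redis port
    username: odd-user       # hypothetical user
    password: odd-password   # hypothetical password
    database: 0              # database index
```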
Enable Metrics
Some of the metadata that ODD Platform ingests can be conveniently represented as a time-series chart, for example the amount of data in a MySQL table or the physical size of a Redshift database. ODD Platform pushes this metadata to an OTLP collector as telemetry, so that charts can be built in Prometheus, New Relic or any other backend that supports OTLP exporters. Set the following variables to use this functionality:
metrics.export.enabled
: Must be set to true
metrics.export.otlp-endpoint
: OTLP Collector endpoint
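A minimal sketch of these two settings in YAML form. The endpoint is a hypothetical local collector address; 4317 is the conventional OTLP/gRPC port, and your collector may listen elsewhere:

```yaml
metrics:
  export:
    enabled: true
    otlp-endpoint: http://localhost:4317   # hypothetical collector address
```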
Enable Alert Notifications
Any alert created inside the Platform can be sent via a generic webhook, a Slack incoming webhook and/or email (via Google SMTP, AWS SMTP, etc.). Such notifications contain information such as:
Name of the entity for which the alert has been created
Data source and namespace of the entity
Owners of the entity
Possibly affected entities
ODD Platform uses the PostgreSQL replication mechanism so that a notification is still sent even if network lag occurs or the Platform crashes. To enable this functionality, the underlying PostgreSQL database needs to be configured as well.
PostgreSQL Configuration
The PostgreSQL database must be configured to support the replication mechanism used by the Platform, and the database user must be granted replication permissions.
Database settings
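To configure the database, edit the postgresql.conf file: set wal_level so that logical replication is allowed, and make sure max_wal_senders and max_replication_slots leave room for the Platform.
If the replication mechanism is already configured, just increment the max_wal_senders and max_replication_slots numbers.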
Database user permissions
The ODD Platform database user must be granted replication permissions (in standard PostgreSQL, the REPLICATION role attribute).
User permissions and database configuration may vary from one on-demand/cloud provider to another.
For instance, in AWS RDS, PostgreSQL instances are managed services where certain aspects of replication management are automated to minimize the risk of misconfiguration. Because of this, some settings are either not exposed or are changed differently than in a standard PostgreSQL setup. To enable notifications in such an environment, follow these steps (only the differences are mentioned):
1. Set the rds.logical_replication parameter to 1 in your database instance's Parameter Group, instead of directly modifying the wal_level parameter.
2. Ensure the ODD user connecting to the database has the rds_replication role. The Master username of the database typically already has this role by default. If you use a different username, you may need to assign the role with the command GRANT rds_replication TO {your_database_username};
3. If you changed max_wal_senders to 5 (the minimal value mentioned in the Parameter Group) and then keep getting messages like "The parameter max_wal_senders was set to a value incompatible with replication. It has been adjusted from 5 to 55" in the events list of the database instance, consider setting the parameter in the Parameter Group to the value mentioned in the message, so that RDS no longer changes it automatically.
ODD Platform configuration
The following variables need to be defined:
notifications.enabled
: must be set to true. Defaults to false
notifications.message.downstream-entities-depth
: limits how many lineage graph levels are traversed when fetching affected data entities. Defaults to 1
notifications.wal.advisory-lock-id
: ODD Platform uses a PostgreSQL advisory lock to make sure that, in the case of horizontal scaling, only one instance of the Platform processes alert messages. This setting defines the advisory lock id. Defaults to 100
notifications.wal.replication-slot-name
: PostgreSQL replication slot name. The slot will be created if it doesn't exist yet. Defaults to odd_platform_replication_slot
notifications.wal.publication-name
: PostgreSQL publication name. The publication will be created if it doesn't exist yet. Defaults to odd_platform_publication_alert
notifications.receivers.slack.url
: Slack incoming webhook URL
notifications.receivers.webhook.url
: Generic webhook URL
notifications.receivers.email.host
: the SMTP server
notifications.receivers.email.port
: the port used for the email protocol (SMTP, IMAP, or POP3)
notifications.receivers.email.protocol
: the email protocol (e.g., SMTP, SMTPS, IMAP, IMAPS, POP3, POP3S)
notifications.receivers.email.smtp.auth
: a boolean value (true or false) indicating whether the SMTP server requires authentication
notifications.receivers.email.smtp.starttls
: a boolean indicating whether to use STARTTLS, a security protocol that upgrades an unencrypted connection to an encrypted one
notifications.receivers.email.password
: the password used for email authentication
notifications.receivers.email.sender
: the email address sending the notifications
notifications.receivers.email.notification.emails
: the list of recipients for the email notifications
odd.platform-base-url
: ODD Platform URL to be used in alert messages' hyperlinks.
The ODD Platform configuration would look like this:
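A minimal sketch of such a configuration, assuming the variables above map onto nested YAML keys. The first five values are the documented defaults (with the feature switched on); every URL, address and credential is a hypothetical placeholder, and the comma-separated form of the recipient list is an assumption:

```yaml
notifications:
  enabled: true
  message:
    downstream-entities-depth: 1
  wal:
    advisory-lock-id: 100
    replication-slot-name: odd_platform_replication_slot
    publication-name: odd_platform_publication_alert
  receivers:
    slack:
      url: "https://hooks.slack.com/services/<your-webhook-path>"  # placeholder
    webhook:
      url: "https://example.com/odd-alerts"                        # placeholder
    email:
      host: smtp.example.com          # placeholder SMTP server
      port: 587                       # common SMTP submission port
      protocol: smtp
      smtp:
        auth: true
        starttls: true
      password: "<smtp-password>"     # placeholder
      sender: odd-alerts@example.com  # placeholder
      notification:
        emails: first@example.com,second@example.com  # assumed list format
odd:
  platform-base-url: "https://odd.example.com"        # placeholder
```

Only the receivers you actually use need to be configured.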
Cleaning up
ODD Platform doesn't clean up the replication slot it has created. If you need to disable the Alert Notifications functionality, perform the following steps in addition to disabling the feature on the ODD Platform side.
To remove the replication slot and the publication, drop both with SQL queries run against the database (in standard PostgreSQL, the pg_drop_replication_slot function and the DROP PUBLICATION command).
The replication slot name is the one defined in the ODD Platform. Default is odd_platform_replication_slot
The publication name is the one defined in the ODD Platform. Default is odd_platform_publication_alert
Enable Data Collaboration
The data collaboration feature allows users to start a discussion about a specific data entity in a messenger directly from ODD Platform. Thread replies are tracked and saved by ODD Platform, allowing users to retrieve a conversation's context and decisions in one place.
At the moment ODD Platform supports only Slack as a target messenger. It uses Slack APIs to send messages and the Slack Events API to receive thread replies.
Creating Slack application
Go to the Slack apps website and click Create New App -> From an app manifest
Select the workspace you want to add the application to and click Next
Enter the following manifest into the YAML section, replace <ODD_PLATFORM_BASE_URL> with the URL of your ODD Platform deployment and click Next
Review your application's scopes and permissions and click Create
Follow Slack's instructions on how to install the application into the workspace and you should be good to go.
ODD Platform configuration
The following variables need to be defined:
datacollaboration.enabled
: must be set to true. Defaults to false
datacollaboration.receive-event-advisory-lock-id
: PostgreSQL advisory lock id for the job that translates events from messengers into messages. Defaults to 110
datacollaboration.sender-message-advisory-lock-id
: PostgreSQL advisory lock id for the job that sends messages created in the platform to messengers. Defaults to 120
datacollaboration.message-partition-period
: time interval in days for a message table partition in PostgreSQL. Defaults to 30
datacollaboration.sending-messages-retry-count
: how many times the Platform will attempt to send a message to the provider. Cannot be less than zero. Defaults to 3
datacollaboration.slack-oauth-token
: Slack application OAuth token used for communicating with Slack. Can be retrieved in the OAuth & Permissions section of a Slack application.
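A minimal sketch of these settings in YAML form, assuming the variables map onto nested YAML keys. All numeric values are the documented defaults; the token is a placeholder for the value copied from your Slack application:

```yaml
datacollaboration:
  enabled: true
  receive-event-advisory-lock-id: 110
  sender-message-advisory-lock-id: 120
  message-partition-period: 30      # days
  sending-messages-retry-count: 3
  slack-oauth-token: "<your-slack-oauth-token>"   # placeholder
```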
Housekeeping Settings Configuration
The Housekeeping module is enabled (enabled: true), allowing for automated maintenance tasks. The Time-To-Live (TTL) settings define the retention period for the following data categories:
Resolved Alerts
: data related to resolved alerts will be retained for 30 days.
Search Facets
: historical search facets data will be maintained for 30 days.
Data Entity Deletion
: information about deleted data entities will be preserved for 30 days.
These settings ensure that unnecessary or stale data is automatically cleaned up after the specified time periods. Adjusting these TTL values allows for customization based on specific business needs and data retention policies.
Detecting Stale Metadata
Stale metadata is metadata that has become outdated within ODD Platform. It indicates that no updates have arrived from the source, which can happen when a collector is not functioning as planned, when the collector has been deactivated, or when other issues in the source data system make the metadata unavailable.
By default, the refresh period is set to 7 days in the configuration file.
This means that if the platform last received information from the source more than 7 days ago, the item is labeled as "Stale" within the platform. ODD users can adjust this period, shorter or longer, to better suit their needs.
Logging Settings Configuration
Logs provide detailed information about errors in the application helping its users quickly identify and fix problems. Setting up logging is recommended for ensuring operational excellence, system reliability, effective monitoring and troubleshooting. Here is a code snippet for setting up logs in ODD Platform:
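A minimal sketch, assuming ODD Platform honours the standard Spring Boot logging.level properties. Configuring the root logger is an assumption made for illustration; individual packages can be given their own levels in the same way:

```yaml
logging:
  level:
    root: info   # trace or debug give more detail; warn and above give less
```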
Setting the logging level to info allows you to see useful messages about the platform’s functioning without being overwhelmed by too much detail, as with trace or debug, and without missing important issues, as with warn or a higher level.
However, feel free to adjust the logging level as needed to get more or less information based on your specific requirements.
Machine-to-Machine (M2M) Tokens Configuration
For M2M communication with the API, a secret is provided to the ODD Platform before deployment. This allows the platform to bypass identity providers. When a request is sent to the API with the correct secret, the API will respond without any issues.
This functionality is not the preferred method and is disabled by default, but it can be enabled and configured when needed.