Overview
This page is the one-page index of ODD Platform's most important features: a quick-scan surface for discovering what the platform does. For step-by-step walkthroughs and configuration detail, follow the cross-links from each feature's section to its dedicated page. Each feature below points at three places: (a) its canonical home, which is one of the six governance pillars (Data Discovery, Data Modelling, Master Data Management, Data Quality, Data Lineage, Data Glossary) for read-oriented features, the Management section for operator-mutating UI workflows, or the Active platform features section for the platform's event-driven, opt-in behaviours (alerts, notifications, activity tracking, data collaboration, GenAI); (b) its API surface under the API Reference hub; and (c) its operator-configuration keys under Configure ODD Platform. For the broader vocabulary and the Data Governance map, see Main Concepts.
- Metadata Storage
- Search and Filtering
- End-to-end Data Objects Lineage
- End-to-end Microservices Lineage
- Data Quality Test Results Import
- Alerting
- ML Experiments
- Manual Object Tagging
- Data Entity Groups & Domains
- Catalog Overview page
- Directory
- Dictionary terms
- Activity Feed for Monitoring Changes
- Data Collaboration
- Dataset Quality Statuses (SLA)
- Dataset schema diff
- Associating Terms with Data Entities through Descriptive Information
- Adding Business Names for Data Entities and Dataset Fields
- Integrating Vector Store Metadata
- GenAI assistant
- Data Modelling
- Query Examples
- Relationships and ERDs
- Data Quality Dashboard
- Filters to Include and Exclude Objects from Ingest
- Data Entity Statuses
- Alternative Secrets Backend
- Lookup Tables
- Integration Wizards
- Data Entity Attachments
- "Recommended" panel on the main page
- Custom navigation links
- Metadata stale
- Machine-to-Machine (M2M) tokens
Looking for the HTTP API? Every Platform endpoint is documented at API Reference — grouped by feature area, with operation IDs, paths, and back-links to each feature page.
Metadata Storage
The platform's metadata storage is a single PostgreSQL database that holds every catalog entity, every lineage edge, every term, and the full-text search index — no extra Elasticsearch / Solr / Neo4j services to deploy. Metadata is processed near-real-time as collectors and push adapters report it; storage capacity scales with the underlying PostgreSQL cluster.
For the deployment topology and how metadata flows from a source system to the catalog screen, see Architecture → Data flow; for the operator-side configuration of the database itself, see Configure ODD Platform → PostgreSQL Configuration.
Search and Filtering
The catalog's query-oriented entry path — type a term, narrow by seven facets (Datasource / Type / Namespace / Owner / Tag / Groups / Statuses), and find data entities across names and metadata in seconds. Search is available across the Catalog, Query Examples, Master Data, Management, and Dictionary tabs.
See the dedicated Search and Filtering page under Data Discovery for the seven facets, the per-result information / question icons, and the indexing / ranking technical detail.
End-to-end Data Objects Lineage
Upstream and downstream lineage across the full ODD entity model — datasets, transformers, transformer runs, quality tests, consumers, data inputs, data entity groups (including ML experiments), and entity relationships — rendered as a per-entity Lineage tab plus the dedicated Group lineage endpoint that returns the union of a Data Entity Group's children's lineage.
See the dedicated Data Objects Lineage page under Data Lineage for the entity-class participation table, the lineage_depth and expanded_entity_ids query parameters, and the Group lineage endpoint.
End-to-end Microservices Lineage
Microservice call lineage rendered alongside data-object lineage — sourced from OpenTelemetry traces ingested through odd-tracing-gateway (the platform's only standalone gateway push adapter today).
See the dedicated Microservices Lineage page under Data Lineage for the OpenTelemetry-to-ODD path and the gateway's role in the architecture chain.
Data Quality Test Results Import
The Platform ingests test results from Great Expectations and dbt tests (both push-clients in the ODD ecosystem), plus statistical profiles generated by odd-collector-profiler (which uses Capital One's DataProfiler under the hood). Custom frameworks can push their own test results through the POST /ingestion/entities/datasets/stats endpoint of the ODD Specification.
See the dedicated Test Results Import page under Data Quality for the per-integration setup paths and the custom-framework escape hatch.
Alerting
The platform watches each catalogued entity for failed jobs, failed data-quality tests, backwards-incompatible schema changes, and externally-injected distribution anomalies — and tracks every alert through an OPEN → RESOLVED lifecycle with per-entity halt configuration and three navigation views (All / My Objects / Dependents). For the alert types and what triggers each, the lifecycle and auto-cleanup rules, the halt-notification UI and its Distribution anomaly caveat, the manual-resolve-deletion bug, and the API surface, see the dedicated Alerting page under Active platform features. For how alerts get out of the platform — Slack, email, generic webhook, plus the Prometheus AlertManager inbound webhook — see Notifications.
ML Experiments
In ODD, an ML experiment is a Data Entity Group of class ML_EXPERIMENT that collects the entities produced by a training run — input datasets, feature tables, training jobs, model instances, and resulting model artifacts — into one logical container. ML Experiments are not a separate feature surface; they reuse the DEG primitive for a specific workflow.
See Data Entity Groups & Domains → Relationship to ML Experiments under Data Discovery for the framing, the catalog-view-not-experiment-tracker positioning, and the lineage cross-link.
Manual Object Tagging
Lightweight labelling for data entities and columns — apply tags to drive faceted search, surface Important-flagged labels visually, and feed the Catalog Overview's Top tags chip strip.
See the dedicated Manual Object Tagging page under Data Discovery for the tag application workflow, the three TAG_* RBAC permissions, and the read-side / Management-side split (this page is read-side; vocabulary curation lives at Management → Tags).
Data Entity Groups & Domains
Logical containers that gather related entities (datasets, transformers, quality tests, consumers) under one umbrella with their own metadata, owners, tags, and terms. Flagging a DEG as a domain surfaces it on the Catalog Overview page's Domains section as a top-level discovery surface.
See the dedicated Data Entity Groups & Domains page under Data Discovery for the DEG metadata model, the Domain framing, the relationship to ML Experiments, and the Group lineage cross-link.
Catalog Overview page
The Overview page is the catalog's home page — a unified surface that combines Search, the Directory level-1 cards, Top tags, Domains, the per-class Entities report, the Recommended quick-jumps, and (when authentication is on) an Owner-association request.
See the dedicated Catalog Overview page under Data Discovery for the per-section walkthrough, the Recommended panel sub-surface, and the disambiguation between the catalog's Overview page and a data entity's Overview tab.
Directory
The Directory is the catalog's browse-oriented entry point. It complements Search: where Search is query-driven, the Directory walks down a four-level hierarchy — data source types → data sources → entity types → entities — so an operator can drill into the catalog without typing a query. Reach for it when you know the kind of source you want to explore (PostgreSQL, Snowflake, Kafka, ...) but not the specific entity, or when you want a per-source coverage view.
See the dedicated Directory page under Data Discovery for the level-by-level walkthrough, the four backing API endpoints (/api/directory, /api/directory/datasources, /api/directory/datasources/{id}/types, /api/directory/datasources/{id}), and the relationship to the Catalog Overview page (which surfaces the Directory's level-1 cards inline on the home page).
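As a rough illustration, the four endpoint paths listed above can be expanded into concrete URLs with a small helper. This is a sketch only: the base URL and the `resource_id` argument are hypothetical, and the Directory page remains the reference for what each `{id}` identifies and what each endpoint returns.

```python
# Endpoint path templates copied from the list above.
DIRECTORY_PATHS = (
    "/api/directory",
    "/api/directory/datasources",
    "/api/directory/datasources/{id}/types",
    "/api/directory/datasources/{id}",
)


def directory_urls(base_url: str, resource_id: int) -> list[str]:
    """Fill the {id} placeholder and prefix each path with the platform base URL."""
    base = base_url.rstrip("/")  # avoid a double slash when joining
    return [base + path.format(id=resource_id) for path in DIRECTORY_PATHS]


if __name__ == "__main__":
    # Hypothetical local deployment and id, for illustration only.
    for url in directory_urls("http://localhost:8080", 42):
        print(url)
```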
Dictionary terms
Operator-curated term entities that name and describe the concepts your data represents. Terms are first-class catalog citizens with their own descriptions, owners, namespaces, RBAC, and links to data entities (description-text mentions and direct term-to-term links).
See the dedicated Business Glossary page under Data Glossary for the full term-creation flow, the seven TERM_* permissions, term-to-term linking modes, the term-to-entity descriptive walkthrough, and the API surface.
Activity Feed for Monitoring Changes
The platform records every metadata change as a typed event on a global Activity page and on each entity's own Activity tab — entity lifecycle transitions, ownership changes, tag and term assignments, dataset-field edits, data-entity-group changes, and the alert-state transitions described above. The feed is the catalog's audit trail and its change-driven discovery surface.
See the dedicated Activity Feed page under Active platform features for the seven facets on the global filter panel, the full event-type catalogue (grouped by the metadata area each event describes), and the configuration entry for retention partitioning.
Data Collaboration
ODD Platform's Data Collaboration feature lets users start in-app discussion threads anchored to specific data entities, with replies tracked back from a Slack workspace via OAuth + the Slack Events API. Conversations stay attached to the entity that anchored them, so an operator returning to a dataset months later can read the original threads — context, decisions, and follow-ups — without leaving the catalog. The feature is disabled by default (datacollaboration.enabled=false); for the per-entity Discussions-tab visibility caveat when disabled, the message-flow model, the disambiguation between this Slack app and the Slack alert webhook, and the operator-side setup, see the dedicated Data Collaboration page under Active platform features.
Dataset Quality Statuses (SLA)
Operator-set Minor / Major / Critical severities on dataset test results, aggregated into a single dataset-level SLA colour (Green / Yellow / Red) that downstream BI reports import directly via the /api/datasets/{id}/sla endpoint.
See the dedicated Dataset Quality Statuses (SLA) page under Data Quality for the severity-setting workflow, the BI-report URL pattern, and the actual SLA-colour computation logic (which is not a direct severity-to-colour mapping — colours come from SLACalculator based on aggregate severity weights).
Dataset schema diff
The platform compares each dataset's metadata between revisions and surfaces every change — added columns, removed columns, type changes, renames — as a visual side-by-side diff on the dataset's Structure tab. Backwards-incompatible changes additionally raise a Backwards-incompatible schema change alert.
See the dedicated Dataset schema diff page under Data Discovery for the revision history walkthrough, the diff illustrations, and the link to the alert rule.
Associating Terms with Data Entities through Descriptive Information
A walkthrough in the style of a Wikipedia About page: adding business terms to the Dictionary, designating term owners, authoring rich descriptions, linking terms inline using the required format, and using the reverse-search capability that surfaces every entity and column linked to a term.
See the dedicated Business Glossary → Term-to-entity associations section under Data Glossary for the full step-by-step walkthrough with screenshots.
Adding Business Names for Data Entities and Dataset Fields
Operators can assign business names to data entities and to individual dataset fields — alternative human-readable labels that surface alongside the original technical names everywhere the entity is rendered.
See the dedicated Business names for data entities and dataset fields page under Data Discovery for the per-entity and per-field workflows, the activity-feed audit-trail event, and the RBAC permissions.
Integrating Vector Store Metadata
The platform recognises vector-typed datasets as a first-class catalog primitive — a dedicated Vector Store dataset type plus a Vector column data type — so vector tables sit alongside relational ones in search, lineage, and ownership. Today the recognition is wired up for PostgreSQL pgvector columns via odd-collector; other adapters can emit the same types by following the specification.
See the dedicated Vector Store metadata page under Data Discovery for the specification source, the adapter coverage, and how the type surfaces in the catalog.
GenAI assistant
The platform ships an opt-in GenAI assistant that proxies natural-language questions to an external AI service the operator runs separately. It is disabled by default (genai.enabled=false); when turned on, it exposes a single platform endpoint, POST /api/genai/ask, that forwards each question to POST {genai.url}/query_data and returns the answer. The assistant is API-only today: no in-app UI affordance currently calls it.
See the dedicated GenAI assistant page under Active platform features for the configuration keys, the external AI service contract (POST /query_data with JSON {"question": "..."}), and the platform's /api/genai/ask request / response schemas. Note the operator-relevant defaults caveat: genai.url is empty and genai.request_timeout is 0 by default, so setting only enabled=true leaves the feature silently misconfigured.
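The defaults caveat and the external service contract can be sketched in a few lines. This is an illustrative sketch, not the platform's implementation: the `GenAIConfig` class and both helper functions are hypothetical names, and only the three genai.* keys, their defaults, the /query_data path, and the {"question": ...} body come from the description above.

```python
import json
from dataclasses import dataclass


@dataclass
class GenAIConfig:
    # Fields mirror genai.enabled, genai.url and genai.request_timeout,
    # with the defaults described in the caveat above.
    enabled: bool = False
    url: str = ""
    request_timeout: int = 0


def config_problems(cfg: GenAIConfig) -> list[str]:
    """List the settings that are still missing when the feature is enabled."""
    if not cfg.enabled:
        return []  # feature is off, nothing to validate
    problems = []
    if not cfg.url:
        problems.append("genai.url is empty")
    if cfg.request_timeout <= 0:
        problems.append("genai.request_timeout is not positive")
    return problems


def query_data_request(cfg: GenAIConfig, question: str) -> tuple[str, bytes]:
    """Build the URL and JSON body for POST {genai.url}/query_data."""
    url = cfg.url.rstrip("/") + "/query_data"
    body = json.dumps({"question": question}).encode("utf-8")
    return url, body
```

A check like `config_problems` could run in a deployment script before flipping genai.enabled to true, so that the empty URL and zero timeout are caught before the first question is asked.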
Data Modelling
The Data Modelling section of the platform houses operator-curated artefacts that describe how data is intended to be used: canonical query examples and the entity-to-entity relationships that collectors extract or that operators define. It opens from the Data Modelling item in the top-level navigation and exposes two sub-surfaces: Query Examples (described in the next section) and Relationships / ERDs.
See the Data Modelling overview for the section's structure, the UI entry points, and the RBAC permissions that gate each surface.
Query Examples
Query Examples are operator-curated SQL / KQL / Spark snippets attached to data entities and glossary terms — the canonical "how this dataset is used" surface. Snippets carry a description that doubles as a prompt-style explanation of intent, link to one or more datasets, and link to terms; a dedicated faceted search and a per-entity / per-term lookup surface them across the catalog.
See the dedicated Query Examples page under Data Modelling for the UI walkthrough, the seven QUERY_EXAMPLE_* RBAC permissions, the 16-endpoint API surface, and the term-linking workflow.
Relationships and ERDs
ODD Platform tracks entity-to-entity relationships as first-class catalog objects: ERD edges between table-class entities (derived from foreign-key constraints in the source) and graph edges between graph-store nodes. The dedicated Data Modelling → Relationships surface lists every relationship across all data sources, with a tab strip for filtering by ERD vs graph and a search input scoped to relationship names.
See the dedicated Relationships page under Data Modelling for the cardinality model (ONE_TO_EXACTLY_ONE / ONE_TO_ZERO_OR_ONE / ONE_TO_ONE_OR_MORE / ONE_TO_ZERO_ONE_OR_MORE), the API surface, and the per-adapter ingestion coverage (PostgreSQL and Snowflake adapters surface ERD today; no adapter currently emits graph relationships).
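One way to read the four cardinality values is as lower and upper bounds on how many related rows one row may have. The sketch below encodes that reading; the bounds are inferred from the value names alone and the `allows` helper is hypothetical, so treat the Relationships page as the authority on the exact semantics.

```python
from enum import Enum


class Cardinality(Enum):
    # Value is (minimum, maximum) related rows; None means unbounded.
    # Bounds are an assumption inferred from the names.
    ONE_TO_EXACTLY_ONE = (1, 1)
    ONE_TO_ZERO_OR_ONE = (0, 1)
    ONE_TO_ONE_OR_MORE = (1, None)
    ONE_TO_ZERO_ONE_OR_MORE = (0, None)


def allows(cardinality: Cardinality, related_count: int) -> bool:
    """Check whether a count of related rows fits within the cardinality bounds."""
    low, high = cardinality.value
    return related_count >= low and (high is None or related_count <= high)
```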
Data Quality Dashboard
Catalog-wide quality view at /data-quality — three breakdown rings (Table Health / Test Results / Monitored Tables), six anomaly-class metrics (Assertion Tests / Column Values Anomalies / Freshness Anomalies / Schema Changes / Unknown Category / Volume Anomalies), and per-side filter sets (one for tables, one for tests; AND-only conjunction).
See the dedicated Quality Dashboard page under Data Quality for the per-anomaly-class definitions, the monitored-vs-unmonitored breakdown, and the two-side filtering model.
Filters to Include and Exclude Objects from Ingest
Pull adapters in ODD's collectors ingest everything they can see by default. Per-plugin ingestion filters scope a plugin to a regex-defined slice — schemas_filter (PostgreSQL, Snowflake), filename_filter (S3, Azure Blob, GCS), datasets_filter (BigQuery), pipeline_filter (Azure Data Factory), and similar — so the catalog stays focused on what teams care about.
See the dedicated Ingestion filters page under Integrations for the include / exclude shape, the worked PostgreSQL example, and the per-adapter coverage matrix.
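To make the include / exclude idea concrete, here is a minimal sketch of regex-based scoping. The semantics are assumptions for illustration: a name is kept when it fully matches at least one include pattern and no exclude pattern, and the default include matches everything. The actual matching rules of each plugin are documented on the Ingestion filters page.

```python
import re


def passes_filter(name: str, include=(".*",), exclude=()) -> bool:
    """Keep a name that matches any include pattern and no exclude pattern."""
    included = any(re.fullmatch(pattern, name) for pattern in include)
    excluded = any(re.fullmatch(pattern, name) for pattern in exclude)
    return included and not excluded


if __name__ == "__main__":
    # Hypothetical schema names seen by a PostgreSQL plugin.
    schemas = ["public", "sales", "pg_temp_1"]
    print([s for s in schemas if passes_filter(s, exclude=[r"pg_.*"])])
```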
Data Entity Statuses
Every catalogued entity carries a status — UNASSIGNED (default), DRAFT, STABLE, DEPRECATED, or DELETED — that signals where it sits in its lifecycle. Statuses surface as a Search facet, drive an Activity-feed event, and trigger a soft-delete TTL handled by the platform's housekeeping job.
See the dedicated Data Entity Statuses page under Data Discovery for the per-status semantics, the operator workflow, the soft-delete TTL configuration, and the RBAC permission.
Alternative Secrets Backend
Store collector configuration — the Platform token, per-plugin database passwords, cloud-provider credentials — in an external secrets backend instead of plaintext in collector_config.yaml. Currently the AWS Systems Manager Parameter Store provider is supported; values from the backend override values set in the local YAML file.
This feature is also known as the collector secrets backend. For the configuration reference, the region-resolution order, a worked SSM example, the required IAM permissions, and the known limitations (10-parameter pagination cap, no custom SSM endpoint, no timeout / retry override), see Collector secrets backend.
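The override rule (backend values win over the local YAML file) amounts to a merge with a fixed precedence. The sketch below shows that precedence on plain dictionaries; the function name and the example keys are hypothetical, and it performs only a shallow, top-level merge, which may differ from how the collector resolves nested plugin settings.

```python
def merge_collector_config(local_yaml: dict, backend_values: dict) -> dict:
    """Return the local settings with backend-provided values taking precedence."""
    merged = dict(local_yaml)      # copy so the caller's dict is not mutated
    merged.update(backend_values)  # backend values override local ones
    return merged


if __name__ == "__main__":
    # Hypothetical keys and values, for illustration only.
    local = {"token": "plaintext-token", "default_pulling_interval": 10}
    backend = {"token": "token-from-parameter-store"}
    print(merge_collector_config(local, backend))
```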
Lookup Tables
Lookup Tables are operator-curated reference tables that live inside the ODD Platform itself rather than in an external source system — managed end-to-end (schema, data, versioning, RBAC), exposed in the catalog as standard Data Entities, and reachable both through the platform API and directly via PostgreSQL's lookup_tables_schema. Find them in the platform UI under the Master Data top-level tab → Lookup Tables.
See the dedicated Lookup Tables page under Master Data Management for the creation flow, supported field types, the Data tab walkthrough, the 9 LOOKUP_TABLE_* RBAC permissions, and the full /api/referencedata/ API surface.
Integration Wizards
The Integration Wizard is an in-app UI under Management → Integrations that helps operators bootstrap collector_config.yaml. Pick an integration, fill in a handful of source-specific fields, copy the rendered YAML snippet into your collector config — the wizard is a template generator, not an installer.
See the dedicated Integration Wizard page under Integrations for the per-card flow, the META-INF/wizard/*.yaml registry that backs it, the API surface (GET /api/integrations, GET /api/integrations/{integration_id}), the static-parameter substitution context (today only platform_url, sourced from odd.platform-base-url), and the wizard-vs-collector_config.yaml boundary.
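The "template generator, not an installer" framing can be illustrated with a plain string substitution. Everything specific here is an assumption: the `${platform_url}` placeholder syntax, the snippet's field names, and the example URL are illustrative and do not reflect the real wizard templates under META-INF/wizard/. Only the idea of substituting platform_url into a YAML snippet comes from the description above.

```python
from string import Template

# Hypothetical snippet template; the field names are illustrative only.
SNIPPET = Template(
    "platform_host_url: ${platform_url}\n"
    "token: <paste your collector token here>\n"
)


def render_snippet(template: Template, platform_url: str) -> str:
    """Substitute the single static parameter and return YAML text to copy."""
    return template.substitute(platform_url=platform_url)


if __name__ == "__main__":
    print(render_snippet(SNIPPET, "https://odd.example.com"))
```

The output is text for the operator to paste into collector_config.yaml; nothing is installed or started by the rendering step.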
Data Entity Attachments
Operators and users can attach files (images, PDFs, CSVs, TXT) and links (remote URLs) to any data entity for additional context — runbook PDFs, sample CSVs, dashboard screenshots, ticketing references. Attachments persist across re-ingests and can be edited or deleted at any time.
See the dedicated Data Entity Attachments page under Data Discovery for the upload workflow, the storage-backend caveat (LOCAL is ephemeral; use REMOTE S3 / MinIO in production), and the Attachment Storage Configuration operator reference.
"Recommended" panel on the main page
The Recommended panel is a sub-surface of the Catalog Overview page that surfaces personalised quick-jumps for the signed-in user — recently-ingested owned entities, lineage neighbours, and the catalog's most-popular entities.
See Catalog Overview page → Recommended under Data Discovery for the four-column layout, the freshness indicator, the disambiguation from the Alerts → My Objects tab, and the User-owner association prerequisite.
Custom navigation links
Operators can populate the App Info menu (the popup behind the information icon in the top-right toolbar) with their own links — runbooks, internal wikis, support channels, anything that helps users navigate from the catalog into the rest of the platform's surrounding ecosystem. Links are configured once on the platform side via the odd.links[] setting and surface to every signed-in user.
For the configuration detail (YAML and env-var form, including the visibility caveat that every signed-in user can read the configured URLs), see Configure ODD Platform → Additional navigation links. Operationally part of the Management section.
Metadata stale
Entities that have not been re-ingested for longer than odd.data-entity-stale-period (integer days; default 7) are flagged with an orange clock icon next to their name everywhere they appear in the catalog — a discovery-time prompt that the metadata's freshness is uncertain.
See the dedicated Metadata stale page under Data Discovery for the freshness window's meaning, what the indicator does (and doesn't) signal, and the operator-side reference for tuning the value.
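The staleness rule reduces to a single date comparison. The sketch below assumes the check is based on the time of the last ingestion and uses a strict "longer than" comparison as worded above; the function and argument names are hypothetical, and how the platform actually records ingestion time is covered on the Metadata stale page.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_STALE_PERIOD_DAYS = 7  # default of odd.data-entity-stale-period


def is_stale(
    last_ingested_at: datetime,
    now: datetime,
    stale_period_days: int = DEFAULT_STALE_PERIOD_DAYS,
) -> bool:
    """True when the entity has gone longer than the stale period without re-ingestion."""
    return now - last_ingested_at > timedelta(days=stale_period_days)


if __name__ == "__main__":
    now = datetime(2024, 1, 10, tzinfo=timezone.utc)
    print(is_stale(datetime(2024, 1, 1, tzinfo=timezone.utc), now))  # 9 days old
    print(is_stale(datetime(2024, 1, 5, tzinfo=timezone.utc), now))  # 5 days old
```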
Machine-to-Machine (M2M) tokens
ODD Platform supports server-to-server (S2S) API-key authentication, a single shared static token presented in the X-API-Key header, for non-UI programmatic callers such as CI/CD jobs, ingestion pipelines, and automation scripts. It is disabled by default, is designed for trusted non-human callers, and grants ADMIN-role access.
For configuration, the header contract, the curl example, and security considerations (token rotation, HTTPS, blast radius), see Server-to-server (S2S) authentication. Operationally part of the Management section.
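For callers that prefer Python over curl, the header contract can be sketched with the standard library. The host name and token value are hypothetical; /api/integrations is used only because it is an endpoint named elsewhere on this page. The sketch builds the request without sending it.

```python
from urllib.request import Request


def s2s_request(base_url: str, path: str, api_key: str) -> Request:
    """Build a GET request that presents the shared token in the X-API-Key header."""
    url = base_url.rstrip("/") + path
    # urllib normalises header capitalisation; HTTP header names are case-insensitive.
    return Request(url, headers={"X-API-Key": api_key}, method="GET")


if __name__ == "__main__":
    # Hypothetical host and token; read the real token from a secret store.
    req = s2s_request("https://odd.example.com", "/api/integrations", "example-token")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. Because the token grants ADMIN-role access, use an HTTPS base URL and keep the token out of source control.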