Vector Store metadata

Vector Store cataloguing — datasets containing vector-typed columns recognised as a first-class dataset type, with per-column metadata reflecting the vector data type.

ODD treats vector-typed datasets as first-class catalog entities by providing a dedicated dataset type and a dedicated column data type. Vector tables (Postgres with pgvector, dedicated vector databases) therefore sit alongside relational tables in search, lineage, and ownership without bespoke handling.

The platform ships two complementary additions for vector data:

Dataset type — Vector Store

The platform exposes a dedicated dataset type literal, Vector Store, that an adapter can emit when the source is a vector-oriented store. Vector Stores appear in Catalog search, the Directory, and entity lists with a Vector Store type label, and the Type filter facet has a dedicated value for them.

A Vector Store does not carry a distinct entity badge. The coloured class badge on a catalog row is keyed by the entity class, and Vector Store is a type within the Dataset class — so it shows the same Dataset badge as a relational table. The way to tell a Vector Store apart is the type label on the entity and the Type facet on Search, not the badge colour or icon.

A Vector Store recognised in the ODD catalog

The classification is a useful signal for operators. A Vector Store row in the catalog is the same primitive as a relational dataset: it has fields, owners, tags, and lineage edges. The workload it serves is different (similarity search, retrieval-augmented generation, embeddings storage), and surfacing the type makes that difference explicit.

Column data type — Vector

Each column on a dataset carries a data type. The platform's data-type taxonomy includes Vector as a recognised primitive — adapters that ingest a column declared as a vector type (e.g., pgvector's vector(N) column type) emit it as Vector rather than as a generic Array or Unknown type. The Structure tab renders Vector columns explicitly, and downstream surfaces (lineage, schema diff, search filtering by data type) recognise them.

A Vector-typed column on a dataset's Structure tab
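
The mapping described above, from a declared source column type to a catalog data type, can be sketched roughly as follows. This is an illustrative sketch, not the adapter's actual code: the function name `to_catalog_type` is hypothetical, and the returned strings are stand-ins for whatever literals the specification defines. Only the pgvector `vector` / `vector(N)` declaration syntax is taken from the text.

```python
import re

# Matches pgvector declarations such as "vector" or "vector(1536)".
# Anchored, so unrelated types like "tsvector" do not match.
_VECTOR_RE = re.compile(r"^vector(\(\d+\))?$", re.IGNORECASE)


def to_catalog_type(source_type: str) -> str:
    """Map a declared column type to an illustrative catalog type label.

    The labels are placeholders for this sketch; a real adapter would
    handle many more source types and use the specification's literals.
    """
    declared = source_type.strip()
    if _VECTOR_RE.match(declared):
        return "Vector"   # recognised vector column
    if declared.endswith("[]"):
        return "Array"    # PostgreSQL array notation, e.g. "integer[]"
    return "Unknown"      # everything else falls through in this sketch
```

The point of the sketch is the ordering: the vector check runs first, so a vector column is never reported as a generic Array or Unknown type.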

Specification source

Both additions — the Vector Store dataset type and the Vector column data type — are defined in the opendatadiscovery-specification repo's entities.yaml. Adapters speak the specification and emit either type when applicable; the platform ingests whatever the adapter declares.

Adapter coverage

The first adapter to surface vector cataloguing was the PostgreSQL adapter in odd-collector. When the adapter encounters a PostgreSQL table containing at least one column with a vector data type (pgvector extension), it classifies the table as Vector Store during ingestion.
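
The classification rule above ("at least one vector column makes the table a Vector Store") can be illustrated with a minimal sketch. The names `Column`, `is_vector_column`, and `classify_table` are hypothetical, as is the fallback label `"Table"`. The real logic lives in the PostgreSQL adapter in odd-collector and may differ in detail.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Column:
    name: str
    declared_type: str  # e.g. "bigint", "text", "vector(384)"


def is_vector_column(column: Column) -> bool:
    """True when the declared type is pgvector's vector, with or without (N)."""
    base = column.declared_type.strip().lower().split("(", 1)[0]
    return base == "vector"


def classify_table(columns: Iterable[Column]) -> str:
    """Return an illustrative dataset type label for a table.

    A single vector column is enough to classify the table as a
    Vector Store; otherwise the hypothetical default label is used.
    """
    if any(is_vector_column(c) for c in columns):
        return "Vector Store"
    return "Table"
```

For example, a table with columns `id bigint` and `embedding vector(384)` would be classified as a Vector Store under this rule, while a table with only `id bigint` would not.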

Other adapters that ingest vector-storing systems can emit the same dataset type by following the specification. Coverage grows as more adapters add vector recognition; consult per-adapter pages under Integrations for the latest matrix.
