# GitHub organization overview

**odd-platform** [\[link\]](https://github.com/opendatadiscovery/odd-platform)

The `odd-platform` repository hosts the core application of the Open Data Discovery initiative: a data discovery and data observability platform. It is the central hub of the ODD ecosystem, where data across an organization is discovered, observed, and monitored.

**documentation** [\[link\]](https://github.com/opendatadiscovery/documentation)

The source repository for the published technical manual at `docs.opendatadiscovery.org`. Authored in GitBook-flavoured markdown; pull requests against `main` trigger a rebuild of the live site. File documentation bugs and contribute new pages here.

**odd-models-package** [\[link\]](https://github.com/opendatadiscovery/odd-models-package)

This project generates Pydantic-based Python models and API clients from the ODD specification. Whenever the specification is updated, a GitHub Action builds the package and publishes it to PyPI.

**opendatadiscovery-specification** [\[link\]](https://github.com/opendatadiscovery/opendatadiscovery-specification)

The OpenAPI specification for the Ingestion API and the ODD metadata model. Foundational to every collector and every push adapter — payloads conform to this spec, and `odd-models-package` regenerates its Pydantic models from it on each update. The same contract is what `odd-tracing-gateway` exposes through its standard adapter-contract `GET /entities` endpoint, so the gateway is consumable by ODD pull collectors as if it were any other source. The repo also defines [ODDRN](/introduction/main-concepts.md#oddrn) as the canonical entity-identifier format the spec relies on.

**oddrn-generator** [\[link\]](https://github.com/opendatadiscovery/oddrn-generator)

A collection of helper classes for generating unique ODDRN identifiers for data source entities. ODDRN (Open Data Discovery Resource Name) is a standardized naming convention for identifying data resources. The helper classes keep identifiers consistent and unique across different data sources and their associated entities.
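
To make the idea of a standardized resource name concrete, here is a minimal sketch of composing an identifier from a source type and ordered key/value segments. The `build_oddrn` function, the `//source/key/value/...` layout, and the host and table names are assumptions for illustration. This is not the `oddrn-generator` API; consult the repository for the actual classes.

```python
from urllib.parse import quote


def build_oddrn(source: str, *pairs: tuple[str, str]) -> str:
    """Join a source type and ordered (key, value) pairs into one identifier.

    The layout used here, `//source/key/value/...`, is an assumption for
    illustration, not the generator's documented output.
    """
    if not source:
        raise ValueError("source must be non-empty")
    segments = [source]
    for key, value in pairs:
        if not key or not value:
            raise ValueError("keys and values must be non-empty")
        # Percent-encode values so a "/" inside a name cannot split a segment.
        segments += [key, quote(value, safe="")]
    return "//" + "/".join(segments)


table = build_oddrn(
    "postgresql",
    ("host", "db.example.internal"),
    ("databases", "shop"),
    ("schemas", "public"),
    ("tables", "orders"),
)
print(table)
```

Because the segments are ordered and the values are encoded, the same entity always yields the same string, which is the property a shared naming convention depends on.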

**odd-collectors** [\[link\]](https://github.com/opendatadiscovery/odd-collectors)

A collection of collectors, grouped by the type of data source they connect to:

1. **odd-collector** [\[link\]](https://github.com/opendatadiscovery/odd-collectors/tree/main/odd-collector)

   A general-purpose collector for data sources such as databases, BI tools, and APIs.
2. **odd-collector-aws** [\[link\]](https://github.com/opendatadiscovery/odd-collectors/tree/main/odd-collector-aws)

   A collector for AWS cloud services.
3. **odd-collector-gcp** [\[link\]](https://github.com/opendatadiscovery/odd-collectors/tree/main/odd-collector-gcp)

   A collector for GCP cloud services.
4. **odd-collector-azure** [\[link\]](https://github.com/opendatadiscovery/odd-collectors/tree/main/odd-collector-azure)

   A collector for Azure cloud services.
5. **odd-collector-sdk** [\[link\]](https://github.com/opendatadiscovery/odd-collectors/tree/main/odd-collector-sdk)

   A package of common classes shared by the collectors. The key class is `Collector`, which reads the `collector-config.yaml` file, dynamically imports adapter modules, configures the scheduler, and runs the adapters. Centralizing these steps keeps collector setup and operation consistent.
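
The responsibilities described for the `Collector` class (read a config, import adapter modules dynamically, run them) can be sketched with the standard library alone. Everything named here is hypothetical: the config keys, the `<type>_adapter` module naming, the `Adapter` class, and `get_data_entities`. The config is a plain dict standing in for a parsed `collector-config.yaml`, and this is not the SDK's actual implementation.

```python
import importlib
import sys
import types

# Stand-in for a parsed collector-config.yaml. Every key below is a
# hypothetical example, not the SDK's actual schema.
CONFIG = {
    "default_pulling_interval": 10,  # minutes between runs (assumed unit)
    "plugins": [
        {"type": "demo_source", "name": "demo", "tables": ["orders", "users"]},
    ],
}


def _register_demo_adapter() -> None:
    """Create an in-memory module so the example runs without extra files."""
    module = types.ModuleType("demo_source_adapter")

    class Adapter:
        def __init__(self, plugin: dict) -> None:
            self.plugin = plugin

        def get_data_entities(self) -> list[str]:
            return [f"{self.plugin['name']}.{t}" for t in self.plugin["tables"]]

    module.Adapter = Adapter
    sys.modules[module.__name__] = module


def load_adapters(config: dict) -> list:
    """Import one adapter module per plugin entry, chosen by its `type`."""
    adapters = []
    for plugin in config["plugins"]:
        module = importlib.import_module(f"{plugin['type']}_adapter")
        adapters.append(module.Adapter(plugin))
    return adapters


def run_once(adapters: list) -> dict:
    """Execute every adapter once; a scheduler would call this on an interval."""
    return {a.plugin["name"]: a.get_data_entities() for a in adapters}


_register_demo_adapter()
result = run_once(load_adapters(CONFIG))
print(result)
```

The scheduling step is reduced to a single `run_once` call here. In a long-running process, the same function would be invoked repeatedly at the configured interval.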

**odd-collector-profiler** [\[link\]](https://github.com/opendatadiscovery/odd-collector-profiler)

This repository uses DataProfiler to generate tags and run statistical analysis on a dataset, so that the profiled data can be evaluated and categorized.

**odd-cli** [\[link\]](https://github.com/opendatadiscovery/odd-cli)

A command-line interface (CLI) for ODD. Its commands include retrieving metadata for local files and generating collector tokens, among others.

**odd-dbt** [\[link\]](https://github.com/opendatadiscovery/odd-dbt)

Fetches and ingests metadata about dbt tests and model lineage, so you can track the quality and lineage of your data models.

**odd-great-expectations** [\[link\]](https://github.com/opendatadiscovery/odd-great-expectations)

A custom action for Great Expectations that captures test results and ingests them into OpenDataDiscovery, so that the outcomes of data quality tests are available in the platform.

**odd-spark-adapter** [\[link\]](https://github.com/opendatadiscovery/odd-spark-adapter)

The push adapter for Apache Spark, distributed as a JVM JAR. Runs as a [Spark Listener](https://spark.apache.org/docs/latest/api/scala/org/apache/spark/scheduler/SparkListener.html) attached to the driver, captures lineage from each job's read / write operations (RDD, JDBC, Kafka batch, Snowflake, S3 Delta) and pushes the resulting metadata to the ODD Platform. v0.0.1 supports Spark 3.3.1 only; Spark Structured Streaming is on the roadmap. See the [odd-spark-adapter integration page](/integrations/integrations/odd-spark-adapter.md) for requirements, configuration, and known limitations.

**odd-airflow-2** [\[link\]](https://github.com/opendatadiscovery/odd-airflow-2)

The push adapter for Apache Airflow 2.5.1 and later. Captures DAG, task, and task-run metadata via Airflow Listeners, with lineage derived from each task's `inlets` and `outlets`. This is the canonical Airflow integration today; deploy by `pip install odd-airflow2-integration` into the same environment as the Airflow scheduler. See the [odd-airflow-2 integration page](/integrations/integrations/odd-airflow-2.md) for installation, configuration, and known limitations.
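
As a rough illustration of how lineage can be derived from `inlets` and `outlets`, the sketch below turns each task's declared inputs and outputs into upstream/downstream edges. The `Task` dataclass is a hypothetical stand-in rather than an Airflow class, the dataset names are invented, and treating inlets and outlets as plain lists of identifier strings is an assumption. It does not reproduce the adapter's actual logic.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for an Airflow task; not an Airflow class."""

    task_id: str
    inlets: list[str] = field(default_factory=list)
    outlets: list[str] = field(default_factory=list)


def lineage_edges(tasks: list[Task]) -> list[tuple[str, str]]:
    """Return (upstream, downstream) pairs: inlet -> task and task -> outlet."""
    edges = []
    for task in tasks:
        edges += [(inlet, task.task_id) for inlet in task.inlets]
        edges += [(task.task_id, outlet) for outlet in task.outlets]
    return edges


tasks = [
    Task("load_orders", inlets=["raw.orders"], outlets=["staging.orders"]),
    Task("build_report", inlets=["staging.orders"], outlets=["mart.report"]),
]
edges = lineage_edges(tasks)
for upstream, downstream in edges:
    print(f"{upstream} -> {downstream}")
```

Because `staging.orders` is an outlet of one task and an inlet of the next, the two tasks are connected through the dataset even though neither references the other directly.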

**odd-airflow** (legacy — Airflow 1.x only) [\[link\]](https://github.com/opendatadiscovery/odd-airflow-adapter)

The earlier integration for Apache Airflow 1.x (up to 1.10.15). Apache Airflow 1.x has reached end-of-life upstream; for any Airflow 2.x deployment use `odd-airflow-2` above. This repo remains available for the narrow case of an unmigrated Airflow 1.x cluster and is not on the same release cadence as the rest of the ODD ecosystem.

**odd-tracing-gateway** [\[link\]](https://github.com/opendatadiscovery/odd-tracing-gateway)

An optional standalone Java service that bridges OpenTelemetry distributed-tracing infrastructure into the ODD catalog. Operator microservices export OpenTelemetry traces to the gateway over OTLP/gRPC; the gateway processes the spans (HTTP, JDBC, Kafka, gRPC, AWS SDK) to infer the services involved and their dependencies, caches the result, and exposes the inferred entities through the standard ODD adapter-contract entities API for the Platform to pull. Ships as a Docker image (`ghcr.io/opendatadiscovery/odd-tracing-gateway`) and as a Helm chart at [`charts/odd-tracing-gateway`](https://github.com/opendatadiscovery/charts/tree/main/charts/odd-tracing-gateway). For the operator-facing setup and configuration, see [`odd-tracing-gateway`](/integrations/integrations/odd-tracing-gateway.md) under Integrations.
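
The idea of inferring service dependencies from spans can be illustrated with a small sketch: whenever a span's parent belongs to a different service, the parent's service is recorded as calling the child's service. The span dictionaries, their field names, and the service names are simplified assumptions. Real OpenTelemetry spans carry many more fields, and the gateway itself is a Java service whose logic is not reproduced here.

```python
from collections import defaultdict

# Hypothetical, simplified spans; real OpenTelemetry spans carry many more fields.
SPANS = [
    {"span_id": "a", "parent_id": None, "service": "checkout"},
    {"span_id": "b", "parent_id": "a", "service": "checkout"},
    {"span_id": "c", "parent_id": "b", "service": "payments"},
    {"span_id": "d", "parent_id": "a", "service": "inventory"},
    {"span_id": "e", "parent_id": "c", "service": "payments"},
]


def infer_dependencies(spans: list[dict]) -> dict[str, set[str]]:
    """Map each calling service to the set of services it calls.

    An edge is recorded whenever a span's parent belongs to a different service.
    """
    service_by_span = {s["span_id"]: s["service"] for s in spans}
    deps: dict[str, set[str]] = defaultdict(set)
    for span in spans:
        parent_service = service_by_span.get(span["parent_id"])
        if parent_service and parent_service != span["service"]:
            deps[parent_service].add(span["service"])
    return dict(deps)


deps = infer_dependencies(SPANS)
print({caller: sorted(callees) for caller, callees in deps.items()})
```

Spans whose parent is in the same service (such as `b` and `e` above) produce no edge, so only cross-service calls appear in the result.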

**charts** [\[link\]](https://github.com/opendatadiscovery/charts)

The canonical Helm chart repository for the ODD ecosystem. Hosts the production [`odd-platform`](https://github.com/opendatadiscovery/charts/tree/main/charts/odd-platform) chart, the [`odd-collector`](https://github.com/opendatadiscovery/charts/tree/main/charts/odd-collector) chart, the all-in-one [`odd-quicklaunch`](https://github.com/opendatadiscovery/charts/tree/main/charts/odd-quicklaunch) chart, and the [`odd-tracing-gateway`](https://github.com/opendatadiscovery/charts/tree/main/charts/odd-tracing-gateway) chart. Operators using a Helm-based deployment add the repo with `helm repo add opendatadiscovery https://opendatadiscovery.github.io/charts`; see the [Deployment page](/configuration-and-deployment/deployment.md) for the per-path install steps.

**odd-examples** [\[link\]](https://github.com/opendatadiscovery/odd-examples)

A collection of Docker Compose files for different use cases, such as setting up development environments or orchestrating multi-container applications.


