- Metadata Storage
- End-to-end Data Objects Lineage
- End-to-end Microservices Lineage
- Data Quality Test Results Import
- Pipeline Monitoring and Alerting
- ML Experiment Logging
- Manual Object Tagging
- Data Entity Groups
- Data Entity Report
- Dictionary Terms
- Activity Feed for Monitoring Changes
- Dataset Quality Statuses (SLA)
Metadata Storage is a data catalog that gathers metadata from your sources. Data is processed in near real time, and storage space is not limited. ODD and PostgreSQL together handle metadata storage, lineage graphs and full-text search, so extra integrations (Elasticsearch, Solr, Neo4j, etc.) are not required.
In your Platform account you may find any metadata element using the following options:
- Full-text search
- Filtering by datasources, owners and tags
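The two search options can be combined. The sketch below is only an illustration of that idea in plain Python: the entity fields (`name`, `description`, `datasource`, `owners`, `tags`) and the `search` function are hypothetical, and simple substring matching stands in for real full-text search.

```python
# Hypothetical catalog entries; field names are assumptions for this sketch.
ENTITIES = [
    {"name": "payments", "description": "Card payments per day",
     "datasource": "postgres", "owners": ["ann"], "tags": ["finance"]},
    {"name": "payroll", "description": "Monthly payroll runs",
     "datasource": "snowflake", "owners": ["bob"], "tags": ["finance", "hr"]},
    {"name": "clicks", "description": "Raw web click events",
     "datasource": "postgres", "owners": ["ann"], "tags": ["web"]},
]


def search(entities, query="", datasource=None, owner=None, tag=None):
    """Return names of entities that match the text query and every given filter."""
    query = query.lower()
    hits = []
    for entity in entities:
        text = f"{entity['name']} {entity.get('description', '')}".lower()
        if query and query not in text:
            continue
        if datasource and entity["datasource"] != datasource:
            continue
        if owner and owner not in entity["owners"]:
            continue
        if tag and tag not in entity["tags"]:
            continue
        hits.append(entity["name"])
    return hits


finance_hits = search(ENTITIES, tag="finance")
ann_postgres_hits = search(ENTITIES, query="pay", datasource="postgres", owner="ann")
```

Here `finance_hits` contains both finance-tagged entities, while the combined query narrows the result to the single entity that satisfies all conditions.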
The Platform provides a lineage diagram, so you can easily track how your data entities move and change. ODD supports the following data objects:
- Data providers (third-party integrations)
- ETL and ML training jobs
- ML model artifacts and BI dashboards
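Conceptually, lineage is a directed graph in which an edge points from a source object to the object derived from it. The following minimal sketch shows how upstream and downstream objects could be found in such a graph; the `LINEAGE` mapping, the entity names and both functions are hypothetical and do not reflect how the Platform stores lineage internally.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the entities listed as its value.
LINEAGE = {
    "crm_api": ["raw_customers"],
    "raw_customers": ["etl_clean_customers"],
    "etl_clean_customers": ["customers"],
    "customers": ["train_churn_model", "sales_dashboard"],
    "train_churn_model": ["churn_model"],
}


def downstream(graph, start):
    """Return every entity reachable from `start`, in breadth-first order."""
    seen = []
    queue = deque(graph.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.append(node)
        queue.extend(graph.get(node, []))
    return seen


def upstream(graph, target):
    """Return every entity that `target` depends on, directly or indirectly."""
    reversed_graph = {}
    for source, targets in graph.items():
        for derived in targets:
            reversed_graph.setdefault(derived, []).append(source)
    return downstream(reversed_graph, target)


affected = downstream(LINEAGE, "customers")
origins = upstream(LINEAGE, "customers")
```

In this example `affected` lists the training job, the dashboard and the model built from the `customers` dataset, and `origins` lists the ETL job, the raw table and the data provider behind it.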
This feature helps you trace the data provenance of your microservice-based app. ODD represents microservices as objects and shows their lineage in a regular lineage diagram. The picture below shows the process of metadata ingestion.
Monitor test suite results in the Platform without having to mask or remove sensitive data. Your datasets don't migrate to your ODD Platform installation; it gathers test results only. The Platform is compatible with Pandas and Great Expectations.
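To illustrate why sensitive data stays put, here is a minimal sketch in which checks run next to the data and only a summary of each check is produced. It uses plain Python rather than Pandas or Great Expectations, and the `run_check` helper, the result fields and the `SUCCESS`/`FAILED` labels are assumptions made for this example, not the Platform's actual result format.

```python
# Hypothetical dataset rows; in this sketch they never leave the local process.
rows = [
    {"email": "a@example.com", "salary": 5200},
    {"email": "b@example.com", "salary": None},
    {"email": "c@example.com", "salary": 4100},
]


def run_check(name, rows, predicate):
    """Run one check locally and return only its summary, never the rows."""
    failed = sum(1 for row in rows if not predicate(row))
    return {
        "test": name,
        "status": "FAILED" if failed else "SUCCESS",
        "rows_checked": len(rows),
        "rows_failed": failed,
    }


# Only these summaries would be sent to a catalog; no field values are included.
results = [
    run_check("salary_not_null", rows, lambda row: row["salary"] is not None),
    run_check("email_has_at_sign", rows, lambda row: "@" in row["email"]),
]
```

The `results` list carries test names, statuses and counts, so nothing in it needs masking.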
Running your pipelines is easier with the manageable parameters available in ODD. For example, you may track modifications of your dataset using the revision history option. The Platform also shows metadata of your entities, such as table structure, field types, descriptions and versions.
In the Platform you may find two types of alerts:
- Notifications for cases when something goes wrong with entities you are assigned to as an owner
- Notifications for upstream and downstream items
Dataset alerts and job alerts detect backward-incompatible changes of schemas and source targets.
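As a rough illustration of what a backward-incompatible schema change can look like, the sketch below compares two versions of a schema. It assumes that removing a field or changing its type breaks consumers while adding a field does not; that rule, the `breaking_changes` function and the sample schemas are assumptions for this example, since the exact rules the Platform applies are not described here.

```python
def breaking_changes(old_schema, new_schema):
    """Compare two {field_name: field_type} mappings and list breaking changes."""
    changes = []
    for field, old_type in old_schema.items():
        if field not in new_schema:
            changes.append(f"field removed: {field}")
        elif new_schema[field] != old_type:
            changes.append(
                f"type changed: {field} ({old_type} -> {new_schema[field]})"
            )
    # Fields that exist only in new_schema are treated as compatible additions.
    return changes


old = {"id": "integer", "amount": "numeric", "note": "varchar"}
new = {"id": "integer", "amount": "varchar", "created_at": "timestamp"}

alerts = breaking_changes(old, new)
```

For these two versions, `alerts` reports the changed type of `amount` and the removed `note` field, while the new `created_at` field raises nothing.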
You may configure alert notifications using Slack or a webhook. This lets the Platform send notifications to third-party services when alerts appear or are resolved.
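For orientation, a webhook notification is an HTTP POST with a JSON body. The sketch below builds such a request with the Python standard library. Slack incoming webhooks accept a JSON body with a `text` field; the message wording, the `build_alert_request` function and the placeholder URL are assumptions for this example and do not describe the messages the Platform actually sends.

```python
import json
import urllib.request


def build_alert_request(webhook_url, entity, alert_type, resolved=False):
    """Build (but do not send) a POST request describing one alert."""
    state = "resolved" if resolved else "open"
    payload = {"text": f"[{state}] {alert_type} on {entity}"}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL, not a real endpoint.
request = build_alert_request(
    "https://hooks.example.com/placeholder",
    entity="customers",
    alert_type="backward-incompatible schema change",
)
# urllib.request.urlopen(request) would send it; omitted to avoid side effects.
```

Building the request separately from sending it keeps the payload easy to inspect before anything goes over the network.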
The Platform helps track and compare your experiments. It enables you to:
- Explore a list of your experiment's entities (tables, datasets, jobs and models)
- Log the most successful experiments
Manage your metadata by tagging tables, datasets and quality tests. Tags provide easy filtering and searching.
You may apply tags to metadata entities or use labels to mark elements of these entities.
Example: an organization has ingested metadata related to its finances into the ODD Platform. By default, all the entities are united into the Finance namespace. To categorize the entities, the organization creates the Revenue and Payrolls groups.
A report on the main page of the Platform collects statistical information about data entities. It shows:
- Total number of entities
- Unfilled entities that have only titles and lack metadata, owners, tags, related terms and other descriptive information
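The two figures above can be pictured with a short sketch. The entity records, the `DESCRIPTIVE_FIELDS` tuple and the `entity_report` function are hypothetical; the Platform may consider more descriptive fields than the four used here.

```python
# Hypothetical descriptive fields; an entity with none of them set is "unfilled".
DESCRIPTIVE_FIELDS = ("description", "owners", "tags", "terms")

entities = [
    {"title": "payments", "description": "Card payments", "owners": ["ann"]},
    {"title": "tmp_export"},
    {"title": "payroll", "tags": ["finance"]},
    {"title": "legacy_dump", "description": "", "owners": [], "tags": []},
]


def entity_report(entities):
    """Count all entities and collect the titles of those lacking any details."""
    unfilled = [
        entity["title"]
        for entity in entities
        if not any(entity.get(field) for field in DESCRIPTIVE_FIELDS)
    ]
    return {"total": len(entities), "unfilled": unfilled}


report = entity_report(entities)
```

Empty strings and empty lists count as missing in this sketch, so `legacy_dump` is reported as unfilled alongside `tmp_export`.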
Give extra information about your data entities by creating terms that define these entities or the processes related to them. You may see all terms connected to a data entity on its overview page. All created terms are gathered in the Dictionary tab.
Track changes of your data entities by monitoring the Activity page or the Activity tabs on the pages of individual data entities. To find specific changes, you may filter events by datasource, namespace, user and date.
UPDATED – an existing data entity or a descriptive field related to a data entity was edited.
ASSIGNED – an existing tag or term was linked to a data entity.
1. Go to the dataset main page and select the Test reports tab.
2. Click on a job and then, on the right panel, select a status.
3. To add the status into your BI report, use the following URL:
Result: statuses are displayed in the BI report as color indicators (Minor = green, Major = yellow, Critical = red).
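The status-to-color mapping above can be written down as a small lookup, for instance when reproducing the indicators in a custom report. The `color_for` function is a hypothetical helper for this sketch; only the three status names and their colors come from the description above.

```python
# Status-to-color mapping as described for the BI report indicators.
STATUS_COLORS = {
    "minor": "green",
    "major": "yellow",
    "critical": "red",
}


def color_for(status):
    """Return the indicator color for a status name, ignoring letter case."""
    key = status.strip().lower()
    if key not in STATUS_COLORS:
        raise ValueError(f"unknown status: {status!r}")
    return STATUS_COLORS[key]


colors = [color_for(s) for s in ("Minor", "Major", "Critical")]
```

Unknown status names raise an error in this sketch instead of silently falling back to a default color.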