# Visibility for Data Quality Engineer

**Key words**: data quality metrics, Great Expectations, dbt tests, DataProfiler, custom DQ frameworks.

### Challenge

As a Data Quality Engineer, I cannot manually cover all data quality monitoring activities. I know that some book orders can be mapped to the wrong dimensions or even arrive with crucial order fields missing. I want to automate the DQ monitoring process and give my team and our users a single place to check pipeline health on any given day.

### Solution

The ODD Platform ingests test results from [Great Expectations](https://github.com/opendatadiscovery/odd-great-expectations) and [dbt tests](https://github.com/opendatadiscovery/odd-dbt) (both push-clients), plus statistical profiles from [odd-collector-profiler](https://github.com/opendatadiscovery/odd-collector-profiler) (powered by Capital One's [DataProfiler](https://github.com/capitalone/DataProfiler)). Teams with a custom DQ framework can push results through the `POST /ingestion/entities/datasets/stats` endpoint of the [ODD Specification](/main-concepts.md#odd-specification). See the [Data Quality Test Results Import](/features.md#data-quality-test-results-import) feature for the platform-side view.
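
For teams on the custom-framework path, the push itself is just one HTTP call. Below is a minimal sketch, assuming a locally running ODD Platform, the Python `requests` library, and a collector token for authorization; the ODDRN and payload fields are illustrative placeholders, and the authoritative payload layout lives in the ODD Specification linked above.

```python
import requests

# Assumptions: the Platform runs locally and ingestion is authorized with
# a collector token -- both values below are placeholders.
ODD_PLATFORM_URL = "http://localhost:8080"
COLLECTOR_TOKEN = "<collector-token>"

# Illustrative payload: the real layout is defined by the ODD Specification,
# so treat this shape as a sketch, not a contract.
payload = {
    "datasets": [
        {
            # Hypothetical ODDRN of the dataset the statistics belong to.
            "dataset_oddrn": "//postgresql/host/warehouse/databases/sales"
                             "/schemas/public/tables/book_orders",
            "fields": [],  # per-column statistics would go here
        }
    ]
}

response = requests.post(
    f"{ODD_PLATFORM_URL}/ingestion/entities/datasets/stats",
    json=payload,
    headers={"Authorization": f"Bearer {COLLECTOR_TOKEN}"},
    timeout=30,
)
response.raise_for_status()
```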

### Scenario

1. My team’s pipeline processes more than two billion book orders daily, sourcing from two OLTP systems and ten dimensional tables.
2. I want to check the following DQ KPIs, based on six DQ dimensions:

* **Timeliness**: how much time does it take for an order to become available in my product?
* **Completeness**: do I have any missing values in the most crucial fields, e.g. date, book ID, amount?
* **Uniqueness**: do I have any duplicated book orders in my dataset?
* **Validity**: do the values comply with the expected format, e.g. does a book ISBN have the expected number of digits?
* **Consistency**: when I look up a book name in a dimensional table, is every book ID covered?
* **Accuracy**: does my sales data reconcile with other sources?

3. I cover the Timeliness, Completeness, Uniqueness and Validity dimensions with Great Expectations test suites (a minimal suite is sketched right after this list) and statistical profiles produced by `odd-collector-profiler`, both of which land in ODD alongside every other dataset's metadata.
4. For the Consistency and Accuracy dimensions I need to compare several profiles across datasets, which the out-of-the-box frameworks don't cover. I write a small SQL script (a second sketch follows this list), run it on a schedule, and push its results through the `POST /ingestion/entities/datasets/stats` endpoint so the custom KPIs show up next to the framework-produced ones.
5. With the framework-produced results and my custom test suite results flowing into the same place, I can expose all my DQ KPIs in the ODD Platform and share them with my stakeholders: both my team and our users.
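
To make step 3 concrete, here is what a minimal Great Expectations suite covering the Completeness, Uniqueness and Validity dimensions might look like, written against the classic pandas-flavoured GE API; the file name and the column names (`order_date`, `book_id`, `amount`, `order_id`, `isbn`) are hypothetical.

```python
import great_expectations as ge

# Hypothetical extract of the orders table; in practice the suite would
# run against the warehouse through a configured GE datasource.
df = ge.read_csv("book_orders_sample.csv")

# Completeness: the most crucial fields must never be null.
for column in ("order_date", "book_id", "amount"):
    df.expect_column_values_to_not_be_null(column)

# Uniqueness: no duplicated book orders.
df.expect_column_values_to_be_unique("order_id")

# Validity: an ISBN-13 is exactly 13 digits.
df.expect_column_values_to_match_regex("isbn", r"^\d{13}$")

# Validate and collect results; the odd-great-expectations push client
# is what forwards results like these to the ODD Platform.
results = df.validate()
print(results.success)
```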
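
The custom Consistency check from step 4 can likewise be a single referential-integrity query. A sketch assuming a PostgreSQL warehouse, `psycopg2`, and hypothetical `book_orders` and `dim_books` tables:

```python
import psycopg2

# Hypothetical warehouse tables: book_orders (facts) and dim_books (dimension).
CONSISTENCY_SQL = """
    SELECT COUNT(*) AS orphan_orders
    FROM book_orders o
    LEFT JOIN dim_books b ON o.book_id = b.book_id
    WHERE b.book_id IS NULL
"""

with psycopg2.connect("dbname=warehouse user=dq_engineer") as conn:
    with conn.cursor() as cur:
        cur.execute(CONSISTENCY_SQL)
        orphan_orders = cur.fetchone()[0]

# A consistent dataset has zero orders that miss their dimension row.
# This KPI is then pushed through the same ingestion call shown in the
# Solution section above.
print(f"Orders without a matching book dimension: {orphan_orders}")
```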

**Result**: I provide a transparent and accessible way to monitor pipeline health, and I can use the same feature to assess the reliability of other data sources I'm interested in.


