Test Run History
Per-test history of every individual run — every status, every duration, every status-reason diagnostic. The drill-in counterpart to the Quality Dashboard's tests-by-latest-status summary.
The Test Run History surface lists every individual run of a single Data Quality test — the full timeline of pass / fail / skip / abort / broken outcomes, the upstream framework's status_reason diagnostic text on each run, and the per-run start time, end time, and duration. It is the drill-in for any test where the catalog-wide Quality Dashboard's tests-by-latest-status summary is not enough — the dashboard tells you a test is currently failing; this surface tells you how many times in a row, when each failure happened, and what the upstream framework reported as the reason.
Where to find it
Two places surface per-run history:
/dataentities/{id}/history: full history with infinite scroll. Open a Quality Test entity's detail page and navigate to the History tab. The list shows every run, paginated 100 at a time, ordered most-recent-first. A status filter at the top of the table narrows the list to a single status (Show all statuses, Success, Failed, Skipped, Broken, Aborted, Unknown).
/test-report: the first 10 runs as a preview. A Quality Test's main report surface includes a recent-runs strip, which calls the same endpoint scoped to the first 10 runs. Use it as a quick "did this test just flip" glance; use the History tab for everything else.
The History tab is hidden when the test entity's status is DELETED — the route redirects to the entity's Overview. Restore the entity via the status badge to access its run history (see Data entity statuses).
The endpoint
Both UI surfaces consume the same endpoint:
GET /api/dataentities/{data_entity_id}/runs?page={page}&size={size}&status={status}
Parameters:
data_entity_id (required): The Quality Test's data-entity id (numeric). The same id that appears in the test's catalog URL.
page (optional): 1-based page number. Defaults to the platform's standard pagination default if omitted.
size (optional): Page size. The UI passes 100 on /history and 10 on /test-report.
status (optional): One of SUCCESS / FAILED / SKIPPED / BROKEN / ABORTED / UNKNOWN. Omit (or pass null) for all statuses.
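As a rough illustration of how the query parameters above combine, the following sketch builds the request URL. The helper name build_runs_url and the host odd.example.com are hypothetical; only the path and the parameter names come from the endpoint description.

```python
from typing import Optional
from urllib.parse import urlencode

# The six status values the endpoint accepts, as listed above.
VALID_STATUSES = {"SUCCESS", "FAILED", "SKIPPED", "BROKEN", "ABORTED", "UNKNOWN"}


def build_runs_url(
    base_url: str,
    data_entity_id: int,
    page: Optional[int] = None,
    size: Optional[int] = None,
    status: Optional[str] = None,
) -> str:
    """Build the run-history URL, leaving out any parameter that is None."""
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"unsupported status: {status}")
    params = {"page": page, "size": size, "status": status}
    # Omitting status entirely is how "all statuses" is requested.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{base_url.rstrip('/')}/api/dataentities/{data_entity_id}/runs"
    return f"{url}?{query}" if query else url


# Hypothetical host; mirrors what the History tab requests for failed runs.
print(build_runs_url("https://odd.example.com", 42, page=1, size=100, status="FAILED"))
```

Rejecting an unknown status on the client side is a design choice of this sketch, not documented platform behaviour.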
The response is a DataEntityRunList — a PageInfo (total, hasNext) plus an items array of DataEntityRun objects (each carrying id, oddrn, startTime, endTime, status, statusReason).
status_reason is a free-form string that the upstream DQ framework — Great Expectations, dbt, odd-collector-profiler, or any custom framework that pushes to the Test Results Import endpoint — populates. Common framework behaviour: Great Expectations writes a JSON summary of the failed expectation including a sample of failing rows; dbt writes the failing test's compiled SQL and the row count above the threshold; odd-collector-profiler writes the metric and the observed value. The platform does not enforce a schema on this field — it is rendered verbatim on the History tab as the "Status reason" column.
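A minimal sketch of reading that response shape into typed objects follows. The field names id, oddrn, startTime, endTime, status, statusReason, total, hasNext, and items come from the description above; the top-level key "pageInfo" and the camelCase casing on the wire are assumptions (the text also spells the field status_reason), so adjust the keys to whatever the deployed API actually returns.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class DataEntityRun:
    id: int
    oddrn: str
    start_time: Optional[str]
    end_time: Optional[str]       # None while a run has not finished
    status: str
    status_reason: Optional[str]  # free-form text, no schema enforced


@dataclass
class DataEntityRunList:
    total: int
    has_next: bool
    items: List[DataEntityRun]


def parse_run_list(payload: Dict[str, Any]) -> DataEntityRunList:
    """Convert a decoded JSON body into a DataEntityRunList."""
    page_info = payload.get("pageInfo", {})  # assumed key name
    items = [
        DataEntityRun(
            id=raw["id"],
            oddrn=raw["oddrn"],
            start_time=raw.get("startTime"),
            end_time=raw.get("endTime"),
            status=raw["status"],
            status_reason=raw.get("statusReason"),
        )
        for raw in payload.get("items", [])
    ]
    return DataEntityRunList(
        total=page_info.get("total", 0),
        has_next=page_info.get("hasNext", False),
        items=items,
    )
```

The status_reason value is kept as an opaque string because the platform does not enforce a schema on it.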
Sort, pagination, status filter
The list is sorted by end-time, most recent first — the run that finished last is at the top. Pagination is server-side; the /history tab uses an infinite-scroll renderer that calls the next page when the user scrolls to the end of the visible rows.
The status dropdown above the list is independent of the search facets used on the Catalog page; it only narrows the runs on this test. Selecting a status fires a new request with the status query param.
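The page-by-page loading described above can be approximated by a client that keeps requesting pages until hasNext is false. This is a sketch, not the UI's implementation: fetch_page is a hypothetical callable supplied by the caller (for example a thin wrapper around an HTTP client), and the "pageInfo" key is an assumed name for the PageInfo object.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# fetch_page(page, size, status) must return the decoded JSON body of one page.
FetchPage = Callable[[int, int, Optional[str]], Dict[str, Any]]


def iter_runs(
    fetch_page: FetchPage,
    size: int = 100,
    status: Optional[str] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield runs in server order, requesting the next page only when needed."""
    page = 1  # page numbers are 1-based
    while True:
        payload = fetch_page(page, size, status)
        yield from payload.get("items", [])
        if not payload.get("pageInfo", {}).get("hasNext", False):
            break
        page += 1
```

Because this is a generator, a caller that stops iterating early never triggers the remaining requests, which is roughly what an infinite-scroll list does when the user stops scrolling.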
Known limitations and operator caveats
A few behaviours on this surface are non-obvious from the UI alone. Each item below states what an operator might assume, what the platform actually does, and what to do today.
The page returns HTTP 500 while a test is currently running. The platform's database stores seven run-status values (SUCCESS, FAILED, SKIPPED, BROKEN, ABORTED, RUNNING, UNKNOWN), but the API's wire schema declares only six (no RUNNING). The endpoint's response mapper performs a strict string-to-enum conversion that throws IllegalArgumentException the moment a RUNNING row appears in the result set; the platform's controller-advice maps the exception to HTTP 500.
Consequence. The History tab is unavailable exactly when an operator most wants to consult it — during an in-flight test execution. The error page on the UI shows a generic "Something went wrong" message; there is no signal that the cause is "a row in your DB has a status the wire format does not understand." The same 500 affects the /test-report first-10-runs preview.
Workaround until the upstream platform fix lands: refresh the page after the in-flight run completes — once the run's end_time populates and the row's status transitions out of RUNNING, the result set no longer contains the unmappable value and the History tab loads normally. If the test runs frequently enough that the page is always "in flight," there is no operator-side workaround beyond network-layer scraping of the raw endpoint and parsing the JSON before any wire-enum mapping.
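The failure mode is easier to see in miniature. The sketch below is a Python analogue of a strict string-to-enum conversion, not the platform's actual mapper: Python's Enum lookup raises ValueError where the text above describes an IllegalArgumentException, and the names WireRunStatus, to_wire_strict, and unmappable are hypothetical.

```python
from enum import Enum
from typing import Iterable, List


class WireRunStatus(Enum):
    """The six statuses the wire schema declares (RUNNING is absent)."""
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"
    SKIPPED = "SKIPPED"
    BROKEN = "BROKEN"
    ABORTED = "ABORTED"
    UNKNOWN = "UNKNOWN"


def to_wire_strict(stored_values: Iterable[str]) -> List[WireRunStatus]:
    """Strict conversion: a single unknown value fails the whole result set."""
    return [WireRunStatus(value) for value in stored_values]


def unmappable(stored_values: Iterable[str]) -> List[str]:
    """List the stored values that the strict conversion would reject."""
    known = {status.value for status in WireRunStatus}
    return [value for value in stored_values if value not in known]


rows = ["SUCCESS", "RUNNING", "FAILED"]
print(unmappable(rows))
try:
    to_wire_strict(rows)
except ValueError as exc:
    # One RUNNING row is enough to lose the entire page of results.
    print(f"conversion failed: {exc}")
```

The point of the illustration is that the conversion is all-or-nothing: the completed runs in the same result set are lost along with the in-flight one.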
The endpoint is reachable to any authenticated user — there is no per-owner gate on read. GET /api/dataentities/{id}/runs is not enumerated in the platform's authorization rules; the catch-all "any authenticated user" rule covers it. Authenticated callers under LOGIN_FORM, OAUTH2, or LDAP read every Quality Test's full run history regardless of dataset ownership; under auth.type=DISABLED the same reads are reachable anonymously.
This matters for the status_reason field specifically. The free-form diagnostic text upstream DQ frameworks emit on failure routinely contains team-confidential information:
Great Expectations writes samples of failing data values into the failed-expectation summary (a row's PII column value, a customer id, a transaction amount).
dbt writes the failing test's compiled SQL, which exposes column names, table names, and the test thresholds.
Custom-framework pipelines often append free-form text — internal table aliases, environment identifiers, ticket numbers.
In a multi-tenant deployment, any signed-in user from one team reads every other team's per-run diagnostic text indefinitely (run history is retained at least as long as the underlying test entity exists). Operators planning for cross-team data-shape isolation should treat status_reason as a catalog-read-collaborative field — same posture as Owner / Namespace / Datasource directories — and either configure upstream DQ frameworks not to emit failing-row samples (the GE / dbt / profiler configuration knobs are framework-side; consult their docs), or enforce isolation at the network perimeter (reverse-proxy rules on /api/dataentities/*/runs). The platform's RBAC layer does not gate this read today.
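For the perimeter option, the matching logic behind a rule on /api/dataentities/*/runs might look like the sketch below. It is an assumption-laden illustration: the function names and the idea of a per-caller allow-list of entity ids are hypothetical, and a real deployment would express the same match in its reverse proxy's own configuration language.

```python
import re
from typing import Optional, Set

# Matches /api/dataentities/{id}/runs with an optional trailing slash.
RUNS_PATH = re.compile(r"^/api/dataentities/(?P<entity_id>[^/]+)/runs/?$")


def run_history_entity_id(path: str) -> Optional[str]:
    """Return the entity id if the path is a run-history read, else None."""
    path_only = path.split("?", 1)[0]  # ignore the query string
    match = RUNS_PATH.match(path_only)
    return match.group("entity_id") if match else None


def may_forward(path: str, allowed_entity_ids: Set[str]) -> bool:
    """Decide whether a request should be passed through to the platform."""
    entity_id = run_history_entity_id(path)
    if entity_id is None:
        return True  # not a run-history read, so this rule does not apply
    return entity_id in allowed_entity_ids
```

Keying the rule on the entity id keeps other catalog reads untouched; how the allow-list is derived (team membership, ownership, namespace) is left to the operator.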
In-flight runs (when they don't trigger the 500) appear at the top of the list with an empty Duration column. The default sort is end_time DESC. Rows whose end_time is NULL (in-flight runs that have started but not finished) sort to the top under Postgres' default NULLS FIRST ordering for DESC. The UI labels each row by startTime and computes the Duration column from endTime - startTime; with endTime null, the Duration column renders empty.
An operator scanning the History tab from top to bottom can see one or more "undated-looking" rows at the top with no Duration, no Status-reason text, and a status that the wire enum recognises (if the framework writes a non-RUNNING status before finishing). Treat any row with an empty Duration as in-flight; the platform will repopulate the row when the run completes.
This behaviour interacts with the HTTP 500 caveat above — if the in-flight run's row carries a RUNNING status, the History tab is unavailable entirely; if the in-flight row carries any of the six wire-enum values and a NULL end_time, the page loads with the row at the top.
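The ordering and the empty Duration column can be reproduced with a small sketch. It assumes runs arrive as dictionaries with ISO 8601 startTime and endTime strings carrying an explicit UTC offset (for example +00:00); the helper names are hypothetical, and the sort only mimics the end_time DESC, nulls-first behaviour described above.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List, Optional, Tuple


def run_duration(run: Dict[str, Any]) -> Optional[timedelta]:
    """endTime minus startTime, or None when either timestamp is missing."""
    start, end = run.get("startTime"), run.get("endTime")
    if start is None or end is None:
        return None
    return datetime.fromisoformat(end) - datetime.fromisoformat(start)


def is_in_flight(run: Dict[str, Any]) -> bool:
    """Treat a run without an end time as still executing."""
    return run.get("endTime") is None


def _history_key(run: Dict[str, Any]) -> Tuple[bool, float]:
    end = run.get("endTime")
    if end is None:
        return (True, 0.0)  # nulls rank highest, so they lead under DESC
    return (False, datetime.fromisoformat(end).timestamp())


def sort_like_history(runs: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Order runs by end time descending, with unfinished runs first."""
    return sorted(runs, key=_history_key, reverse=True)
```

A monitoring script could use is_in_flight to skip unfinished rows instead of reporting a missing duration as an error.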
Where to next
Quality Dashboard — the catalog-wide tests-by-latest-status summary that this page drills into.
Dataset Quality Statuses (SLA) — operator-set severities on test results that feed the dataset-level SLA colour; the History page is where you would investigate why a given test's latest run drove the SLA flip.
Test Results Import: the push-client integrations (Great Expectations, dbt, odd-collector-profiler, custom frameworks) that produce the runs surfaced on this page.
Alerting: DQ-test-failed alerts (the alert lifecycle that fires on the same run-completed event the History page reads from).
Activity Feed — the audit trail of every test-result import that this History list reflects.