# ADR-0028: High-volume tables are range-partitioned ahead of need

## Status

**Accepted.** Reconstructed from the codebase on 2026-05-30; the decision is live in the source today.

## Context

The activity audit trail and the data-collaboration message table are append-heavy and grow without bound. PostgreSQL range partitioning keeps them manageable, but only if a partition already exists for the date a row lands on; an insert that finds no matching partition fails. So partitions must be created *ahead* of need, must be created once across a multi-replica cluster (not once per replica), and must be created for several tables without the job hard-coding each one. There are also two distinct moments at which coverage can lapse: right after a deployment or scale-up (a fresh replica may start against a schema whose partitions have not been extended), and on the rolling daily boundary.

## Decision

**The same partition-creation job runs at two triggers — at application boot and nightly — each serialised so only one creator runs at a time across the cluster, but by a different single-runner mechanism appropriate to each trigger.**

* **At boot,** a `@PostConstruct init()` acquires a **Postgres advisory lock** (`partition.advisory-lock-id`, default `90`) via the leader-election manager, then creates any missing partitions. The advisory lock serialises concurrent replica boots, and running at startup means a freshly deployed or scaled replica gets coverage immediately rather than waiting for the next nightly run.
* **Nightly,** a `@Scheduled(cron = "0 1 0 * * *")` `run()`, which fires at 00:01 every day, is guarded by **ShedLock** (`@SchedulerLock(name = "partitionCreationJob")` plus a `LockAssert.assertLocked()` check), so exactly one replica performs the daily extension even though every replica's scheduler fires.
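
Both triggers can share the same creation logic because creation is serialised and only fills in what is missing. The following is a minimal in-process analogy, not the platform's code: a `ReentrantLock` stands in for the Postgres advisory lock, a `Set` stands in for the partitions in the shared database, and `bootReplica`, `simulate`, and the partition name are hypothetical. It assumes the boot lock blocks until acquired, which the text above implies ("serialises") but does not spell out.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

final class SerialisedBootSketch {
    // Stand-in for the Postgres advisory lock shared by all replicas (assumption).
    private final ReentrantLock advisoryLock = new ReentrantLock();
    // Stand-in for the partitions that exist in the shared database.
    private final Set<String> partitions = new HashSet<>();
    // Counts how many replicas actually had to create something.
    private final AtomicInteger creations = new AtomicInteger();

    /** What one replica does at boot: take the lock, create only what is missing. */
    void bootReplica(String requiredPartition) {
        advisoryLock.lock();
        try {
            if (partitions.add(requiredPartition)) {
                creations.incrementAndGet();
            }
        } finally {
            advisoryLock.unlock();
        }
    }

    /** Boots the given number of replicas concurrently and returns the number of creations. */
    static int simulate(int replicas) {
        SerialisedBootSketch cluster = new SerialisedBootSketch();
        ExecutorService pool = Executors.newFixedThreadPool(replicas);
        CountDownLatch startSignal = new CountDownLatch(1);
        for (int i = 0; i < replicas; i++) {
            pool.submit(() -> {
                startSignal.await(); // hold every replica until all are queued
                cluster.bootReplica("activity_next_period");
                return null;
            });
        }
        startSignal.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("replicas did not finish booting");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cluster.creations.get();
    }
}
```

However many replicas boot at once, each one passes through the lock in turn, and every replica after the first finds nothing left to create.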

Both triggers call the same per-table creation logic, which uses **double-width, single-cadence forward coverage**: each created partition spans `2 × partitionDaysPeriod` days while a new partition is appended every `partitionDaysPeriod` days, walking from the last existing partition up to `now + partitionDaysPeriod`. The 2:1 width-to-cadence ratio guarantees a partition always exists for any near-future insert — there is no boundary window where a row has nowhere to land.
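
The walk described above can be sketched in plain Java. This is an illustration reconstructed from the description, not the source of `AbstractPartitionManager`: `Range`, `plan`, and the parameter names are hypothetical, and the sketch only computes the date ranges without touching a database.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

final class PartitionWalkSketch {

    /** Hypothetical value type: the date bounds of one planned partition. */
    static final class Range {
        final LocalDate from;
        final LocalDate to;

        Range(LocalDate from, LocalDate to) {
            this.from = from;
            this.to = to;
        }
    }

    /**
     * Plans partitions starting at lastPartitionDate and stepping by periodDays
     * until the cursor reaches now + periodDays. Each range is 2 * periodDays wide.
     */
    static List<Range> plan(LocalDate lastPartitionDate, LocalDate now, int periodDays) {
        LocalDate bufferDate = now.plusDays(periodDays);
        List<Range> planned = new ArrayList<>();
        LocalDate cursor = lastPartitionDate;
        while (cursor.isBefore(bufferDate)) {
            // Double width: the range spans two periods.
            planned.add(new Range(cursor, cursor.plusDays(periodDays * 2L)));
            // Single cadence: the cursor advances by one period.
            cursor = cursor.plusDays(periodDays);
        }
        return planned;
    }
}
```

With a 30-day period, a last partition date of 2026-01-01, and a current date of 2026-01-15, the loop stops once the cursor reaches or passes 2026-02-14. It plans two ranges, starting on 2026-01-01 and 2026-01-31, each 60 days wide. If the last partition date is already at or beyond the buffer date, nothing is planned.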

Two further choices complete the design:

* **List-injection extensibility.** The job consumes `List<PartitionManager>` — every Spring `@Component` extending `AbstractPartitionManager` is discovered automatically. `ActivityTablePartitionManager` registers unconditionally; `MessageTablePartitionManager` registers only when Data Collaboration is enabled (`@ConditionalOnDataCollaboration`). Adding a partitioned table is an add-a-class change.
* **Continue-on-failure across tables.** A failure creating one table's partitions is caught per-manager, logged at ERROR, and the loop proceeds to the next table — the job maximises coverage across all tables rather than aborting on the first failure.
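
The two choices above combine into a simple loop over whatever managers are present. The sketch below is a framework-free approximation under stated assumptions: the nested `PartitionManager` interface and its two methods are hypothetical stand-ins (the real contract is not shown in this ADR), the list is passed in by hand rather than injected by Spring, and returning the failed table names is an addition for illustration, since the job is described only as logging at ERROR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

final class ContinueOnFailureSketch {
    private static final Logger LOG =
            Logger.getLogger(ContinueOnFailureSketch.class.getName());

    /** Hypothetical stand-in for the per-table manager contract. */
    interface PartitionManager {
        String tableName();

        void createPartitionsIfNotExists() throws Exception;
    }

    /**
     * Runs every manager in order. A failure in one manager is logged and
     * recorded, and the loop moves on to the next table.
     */
    static List<String> runAll(List<PartitionManager> managers) {
        List<String> failedTables = new ArrayList<>();
        for (PartitionManager manager : managers) {
            try {
                manager.createPartitionsIfNotExists();
            } catch (Exception e) {
                // SEVERE is java.util.logging's closest level to ERROR.
                LOG.log(Level.SEVERE,
                        "Partition creation failed for " + manager.tableName(), e);
                failedTables.add(manager.tableName());
            }
        }
        return failedTables;
    }
}
```

Because the loop only depends on the interface, a new partitioned table needs a new manager implementation and no change to the loop itself.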

## Consequences

* Provided the job runs successfully, inserts into the activity and message tables always find a partition: coverage is created ahead of need at both boot and the nightly run, and the 2× overlap absorbs the gap between runs. A freshly deployed/scaled replica is covered at startup, not only after the next nightly cron.
* Two single-runner mechanisms coexist by design — an **advisory lock** at boot (every replica boots and contends; the lock picks one) and **ShedLock** for the scheduled run (cluster-wide election for the cron). They use distinct primitives because the triggers differ; the advisory-lock id (`90`) lives in its own per-subsystem namespace.
* The job **only creates partitions; it never drops them.** There is no retention or DROP path here, so partitions and their data accumulate until something else removes them. Reclaiming space is a separate concern: the housekeeping subsystem drops *empty* past partitions (see [ADR-0045](/developer-guides/architecture-decision-log/adr-0045-housekeeping-partition-separation.md)), and dropping *non-empty* aged partitions for retention is left to the operator.
* Continue-on-failure means a single table's partition-creation failure surfaces only in ERROR logs, not as a request failure — coverage for the failing table can silently lapse until a later successful run. (A connection-level failure in the boot `init()` does throw and fail startup; a per-table failure inside the loop is caught and skipped.)

## Evidence

* `odd-platform-api/.../partition/PostgreSQLPartitionCreationJob.java:26-27` — `@Value("${partition.advisory-lock-id}") private long activityLockId;`; `:29-38` — `@PostConstruct init()` opens `leaderElectionManager.acquire(activityLockId, false)` and creates partitions at boot under the advisory lock.
* `odd-platform-api/.../partition/PostgreSQLPartitionCreationJob.java:40-43` — `@Scheduled(cron = "0 1 0 * * *")` + `@SchedulerLock(name = "partitionCreationJob", lockAtLeastFor = "10m", lockAtMostFor = "10m")` + `LockAssert.assertLocked()`: the nightly ShedLock-guarded run.
* `odd-platform-api/.../partition/PostgreSQLPartitionCreationJob.java:22` — `private final List<PartitionManager> partitionManagers;`; `:53-61` — `createPartitionIfNotExists(...)` wraps each manager in `try { … } catch (Exception e) { log.error(...); }` (continue-on-failure).
* `odd-platform-api/.../partition/manager/AbstractPartitionManager.java:35` — `new TablePartition(lastPartitionDate, lastPartitionDate.plusDays(partitionDaysPeriod * 2L))` (double width); `:30` `bufferDate = baseline.plusDays(partitionDaysPeriod)`, `:33` `while (lastPartitionDate.isBefore(bufferDate))`, `:37` `lastPartitionDate = lastPartitionDate.plusDays(partitionDaysPeriod)` (single cadence).
* `odd-platform-api/.../partition/manager/ActivityTablePartitionManager.java:9-13` (`@Component`, reads `${odd.activity.partition-period:30}`) and `.../MessageTablePartitionManager.java:16-21` (`@Component @ConditionalOnDataCollaboration`, reads `${datacollaboration.message-partition-period:30}`), both extending `AbstractPartitionManager`.
* `odd-platform-api/src/main/resources/application.yml:197-198` — `partition: advisory-lock-id: 90`; `:212-213` — `odd.activity.partition-period: 30`.

## See also

* [ADR-0045: Housekeeping is a separate subsystem from partition management](/developer-guides/architecture-decision-log/adr-0045-housekeeping-partition-separation.md): where partition *cleanup* (empty-partition drop) lives, deliberately apart from this *creation* path.
* [Activity Feed](/features/active-platform-features/activity-feed.md): the audit trail whose table this job partitions.

