ADR-0028: High-volume tables are range-partitioned ahead of need
ODD Platform creates range partitions ahead of need — one job runs at boot (Postgres advisory lock) and nightly (ShedLock), making double-width partitions per high-volume table.
Status
Accepted. Reconstructed from the codebase on 2026-05-30; the decision is live in the source today.
Context
The activity audit trail and the data-collaboration message table grow without bound and are append-heavy. PostgreSQL range partitioning keeps them manageable — but only if a partition always exists for the date a row lands on; a missing partition is an insert failure. So partitions must be created ahead of need, must be created once across a multi-replica cluster (not once per replica), and must be created for several tables without the job hard-coding each one. There are also two distinct moments coverage can lapse: right after a deployment or scale-up (a fresh replica may start against a schema whose partitions have not been extended), and on the rolling daily boundary.
Decision
The same partition-creation job runs at two triggers — at application boot and nightly — each serialised so only one creator runs at a time across the cluster, but by a different single-runner mechanism appropriate to each trigger.
At boot, a @PostConstruct init() acquires a Postgres advisory lock (partition.advisory-lock-id, default 90) via the leader-election manager, then creates any missing partitions. The advisory lock serialises concurrent replica boots, and running at startup means a freshly deployed or scaled replica gets coverage immediately rather than waiting for the next nightly run.
Nightly, a @Scheduled(cron = "0 1 0 * * *") run() is guarded by ShedLock (@SchedulerLock(name = "partitionCreationJob") plus a LockAssert.assertLocked() check), so exactly one replica performs the daily extension even though every replica's scheduler fires.
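Why serialised boots are safe can be illustrated in-process. The following is a minimal sketch under loud assumptions, not the platform's code: a ReentrantLock stands in for the Postgres advisory lock, threads stand in for replicas, and an in-memory set stands in for the catalogue of existing partitions. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical in-process model of several replicas booting at once.
final class SerialisedBootSketch {
    // Assumption: stands in for the cluster-wide advisory lock.
    private final ReentrantLock advisoryLockStandIn = new ReentrantLock();
    // Assumption: stands in for the set of partitions that already exist.
    private final Set<String> existingPartitions = new HashSet<>();
    private final AtomicInteger createCalls = new AtomicInteger();

    // Each "replica" runs this at boot: take the lock, then create only if missing.
    void onBoot(String partitionName) {
        advisoryLockStandIn.lock();
        try {
            // Set.add returns true only for the first caller, so later
            // replicas find the partition present and do nothing.
            if (existingPartitions.add(partitionName)) {
                createCalls.incrementAndGet();
            }
        } finally {
            advisoryLockStandIn.unlock();
        }
    }

    // Boots the given number of replicas concurrently and returns how many
    // of them actually performed a creation.
    static int simulate(int replicas, String partitionName) {
        SerialisedBootSketch cluster = new SerialisedBootSketch();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < replicas; i++) {
            Thread t = new Thread(() -> cluster.onBoot(partitionName));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for replicas", e);
            }
        }
        return cluster.createCalls.get();
    }
}
```

In this model every replica still runs the boot step, but the check-then-create happens under the lock, so only the first one through creates anything.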
Both triggers call the same per-table creation logic, which uses double-width, single-cadence forward coverage: each created partition spans 2 × partitionDaysPeriod days while a new partition is appended every partitionDaysPeriod days, walking from the last existing partition up to now + partitionDaysPeriod. The 2:1 width-to-cadence ratio guarantees a partition always exists for any near-future insert — there is no boundary window where a row has nowhere to land.
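The walk described above can be sketched with plain java.time arithmetic. This is a minimal sketch, not the platform's code: the TablePartition record shape, the class name and the plan method are hypothetical; only the loop shape (buffer date, double-width range, single-period step) follows the lines cited under Evidence.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the platform's TablePartition: a pair of bound dates.
record TablePartition(LocalDate from, LocalDate to) {}

final class PartitionWalkSketch {
    // Plans the partitions to create, starting at the last existing partition
    // date and stopping once the start date reaches baseline + partitionDaysPeriod.
    static List<TablePartition> plan(LocalDate lastPartitionDate,
                                     LocalDate baseline,
                                     int partitionDaysPeriod) {
        LocalDate bufferDate = baseline.plusDays(partitionDaysPeriod);
        List<TablePartition> planned = new ArrayList<>();
        while (lastPartitionDate.isBefore(bufferDate)) {
            // Double width: each partition spans 2 x partitionDaysPeriod days.
            planned.add(new TablePartition(
                    lastPartitionDate,
                    lastPartitionDate.plusDays(partitionDaysPeriod * 2L)));
            // Single cadence: the next start date is only one period later.
            lastPartitionDate = lastPartitionDate.plusDays(partitionDaysPeriod);
        }
        return planned;
    }
}
```

With a 30-day period, a last partition date of 2026-01-01 and a baseline of 2026-03-01, the sketch plans three partitions starting on 2026-01-01, 2026-01-31 and 2026-03-02. If the last partition date is already at or past the buffer date, it plans nothing.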
Two further choices complete the design:
List-injection extensibility. The job consumes List<PartitionManager>: every Spring @Component extending AbstractPartitionManager is discovered automatically. ActivityTablePartitionManager registers unconditionally; MessageTablePartitionManager registers only when Data Collaboration is enabled (@ConditionalOnDataCollaboration). Adding a partitioned table is an add-a-class change.
Continue-on-failure across tables. A failure creating one table's partitions is caught per-manager, logged at ERROR, and the loop proceeds to the next table; the job maximises coverage across all tables rather than aborting on the first failure.
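The per-manager catch can be sketched without Spring. This is a minimal illustration, not the platform's code: the PartitionManager interface shown here is a hypothetical minimal shape (the real one may differ), java.util.logging stands in for the project's logger, and returning the list of failed tables is an addition made only so the behaviour is observable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical minimal shape of a partition manager.
interface PartitionManager {
    String tableName();
    void createMissingPartitions() throws Exception;
}

final class ContinueOnFailureSketch {
    private static final Logger LOG =
            Logger.getLogger(ContinueOnFailureSketch.class.getName());

    // Runs every manager in order. A failure in one manager is logged and
    // recorded, and the loop moves on to the next table instead of aborting.
    static List<String> createAll(List<PartitionManager> managers) {
        List<String> failedTables = new ArrayList<>();
        for (PartitionManager manager : managers) {
            try {
                manager.createMissingPartitions();
            } catch (Exception e) {
                LOG.log(Level.SEVERE,
                        "Partition creation failed for " + manager.tableName(), e);
                failedTables.add(manager.tableName());
            }
        }
        return failedTables;
    }
}
```

In a Spring context the list would be injected rather than built by hand, so a new table only needs a new manager class; the loop itself does not change.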
Consequences
Inserts into the activity and message tables always find a partition: coverage is created ahead of cadence at both boot and nightly, and the 2× overlap absorbs the gap between runs. A freshly deployed/scaled replica is covered at startup, not only after the next nightly cron.
Two single-runner mechanisms coexist by design: an advisory lock at boot (every replica boots and contends; the lock picks one) and ShedLock for the scheduled run (cluster-wide election for the cron). They use distinct primitives because the triggers differ; the advisory-lock id (90) lives in its own per-subsystem namespace.
The job only creates partitions; it never drops them. There is no retention or DROP path here, so partitions and their data accumulate until something else removes them. Reclaiming space is a separate concern: the housekeeping subsystem drops empty past partitions (see ADR-0045), and dropping non-empty aged partitions for retention is an operator action.
Continue-on-failure means a single table's partition-creation failure surfaces only in ERROR logs, not as a request failure, so coverage for the failing table can silently lapse until a later successful run. (A connection-level failure in the boot init() does throw and fail startup; a per-table failure inside the loop is caught and skipped.)
Evidence
odd-platform-api/.../partition/PostgreSQLPartitionCreationJob.java:26-27 - @Value("${partition.advisory-lock-id}") private long activityLockId; :29-38 - @PostConstruct init() opens leaderElectionManager.acquire(activityLockId, false) and creates partitions at boot under the advisory lock.
odd-platform-api/.../partition/PostgreSQLPartitionCreationJob.java:40-43 - @Scheduled(cron = "0 1 0 * * *") + @SchedulerLock(name = "partitionCreationJob", lockAtLeastFor = "10m", lockAtMostFor = "10m") + LockAssert.assertLocked(): the nightly ShedLock-guarded run.
odd-platform-api/.../partition/PostgreSQLPartitionCreationJob.java:22 - private final List<PartitionManager> partitionManagers; :53-61 - createPartitionIfNotExists(...) wraps each manager in try { … } catch (Exception e) { log.error(...); } (continue-on-failure).
odd-platform-api/.../partition/manager/AbstractPartitionManager.java:35 - new TablePartition(lastPartitionDate, lastPartitionDate.plusDays(partitionDaysPeriod * 2L)) (double width); :30 bufferDate = baseline.plusDays(partitionDaysPeriod), :33 while (lastPartitionDate.isBefore(bufferDate)), :37 lastPartitionDate = lastPartitionDate.plusDays(partitionDaysPeriod) (single cadence).
odd-platform-api/.../partition/manager/ActivityTablePartitionManager.java:9-13 (@Component, reads ${odd.activity.partition-period:30}) and .../MessageTablePartitionManager.java:16-21 (@Component @ConditionalOnDataCollaboration, reads ${datacollaboration.message-partition-period:30}), both extending AbstractPartitionManager.
odd-platform-api/src/main/resources/application.yml:197-198 - partition: advisory-lock-id: 90; :212-213 - odd.activity.partition-period: 30.
See also
ADR-0045 — Housekeeping is a separate subsystem from partition management — where partition cleanup (empty-partition drop) lives, deliberately apart from this creation path.
Activity Feed — the audit trail this partitions.