# ADR-0020: Outbound Slack delivery is decoupled via a Postgres queue

## Status

**Accepted.** Reconstructed from the codebase on 2026-05-30; the decision is live in the source today.

## Context

When a user posts a Data Collaboration message to Slack, the platform calls an external API (`chat.postMessage`) that can be slow, rate-limited, or temporarily down. Doing that call inline on the request thread would couple the user's HTTP latency to Slack's, and would lose the message if the request failed mid-call. The platform also runs in multi-replica deployments, so naive background delivery would risk two replicas sending the same message twice.

The platform needed durable, decoupled, exactly-once-per-message delivery — ideally without adding a message broker to its runtime dependencies (ODD's posture is Postgres-as-only-runtime-dependency).

## Decision

**The post-message request persists the message and returns `202 Accepted`; a background worker delivers it later under a Postgres advisory lock.** The controller does not call Slack inline — it creates the message row and responds `202` (`ResponseEntity.status(HttpStatus.ACCEPTED)`), signalling "accepted, delivery is asynchronous."
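The accept-then-deliver shape described above can be sketched in plain Java without Spring. This is a minimal illustration, not the platform's controller: `AcceptThenDeliverSketch`, `MessageRow`, `Response`, and the in-memory list standing in for the Postgres message table are all hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.List;

final class AcceptThenDeliverSketch {
    static final int HTTP_ACCEPTED = 202;

    /** Hypothetical message row; the real table is assumed to have more columns. */
    static final class MessageRow {
        final long id;
        final String text;
        boolean delivered = false;

        MessageRow(long id, String text) {
            this.id = id;
            this.text = text;
        }
    }

    /** Minimal stand-in for an HTTP response: status code plus body. */
    static final class Response {
        final int status;
        final MessageRow body;

        Response(int status, MessageRow body) {
            this.status = status;
            this.body = body;
        }
    }

    // In-memory list standing in for the Postgres message table (assumption).
    private final List<MessageRow> messageTable = new ArrayList<>();

    /** Persists the message and answers 202; no provider call happens here. */
    Response postMessage(String text) {
        MessageRow row = new MessageRow(messageTable.size() + 1L, text);
        messageTable.add(row);
        return new Response(HTTP_ACCEPTED, row);
    }

    /** Rows the background worker would still need to deliver. */
    List<MessageRow> pending() {
        List<MessageRow> result = new ArrayList<>();
        for (MessageRow row : messageTable) {
            if (!row.delivered) {
                result.add(row);
            }
        }
        return result;
    }
}
```

The point of the sketch is the ordering: the row exists before the response is returned, so the caller's latency depends only on the write, and the message survives even if delivery is delayed.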

Delivery is handled by `DataCollaborationMessageSenderJob`, which **acquires a Postgres advisory lock** (`leaderElectionManager.acquire(senderMessageAdvisoryLockId, true)`, the blocking form) before draining the queue. The advisory lock is the cross-replica coordination primitive: only the lock-holding replica sends, so a message is delivered once across the whole deployment. The lock id is an operator-tunable property (`datacollaboration.sender-message-advisory-lock-id`, default `120`), drawn from a disjoint per-subsystem namespace so the platform's several single-leader workers don't collide.
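For readers unfamiliar with the primitive, the sketch below shows how a session-level advisory lock can be taken over JDBC. `pg_advisory_lock` and `pg_try_advisory_lock` are standard PostgreSQL functions; everything else (`AdvisoryLockSketch`, its method names, and the idea that `leaderElectionManager.acquire` works this way internally) is an assumption made for illustration, not a description of the platform's code.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class AdvisoryLockSketch {
    private AdvisoryLockSketch() {}

    /**
     * Chooses the PostgreSQL function: the blocking form waits until the lock
     * is free, the non-blocking form returns true or false immediately.
     */
    static String lockSql(boolean blocking) {
        return blocking ? "SELECT pg_advisory_lock(?)" : "SELECT pg_try_advisory_lock(?)";
    }

    /**
     * Takes a session-level advisory lock on the given connection.
     * The lock stays held until it is released or the session ends, so the
     * caller must keep this connection open for as long as it acts as leader.
     */
    static boolean acquire(Connection connection, long lockId, boolean blocking)
            throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(lockSql(blocking))) {
            statement.setLong(1, lockId);
            try (ResultSet result = statement.executeQuery()) {
                result.next();
                // pg_advisory_lock returns void: getting here means the lock is held.
                return blocking || result.getBoolean(1);
            }
        }
    }
}
```

With the default property value, a replica would call something like `AdvisoryLockSketch.acquire(connection, 120L, true)` and only start draining once the call returns; other replicas block on the same id until the holder's session ends.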

The worker drains candidates one at a time and calls the provider client for each. On success it records the provider message timestamp. On failure it increments a per-message try-count and retries, up to `datacollaboration.sending-messages-retry-count` (default `3`), before marking the message failed. Choosing a Postgres advisory lock over Redis/Kafka/SQS is the load-bearing decision: it keeps delivery coordination inside the database the platform already requires.
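The drain-and-retry behaviour can be modelled in a few lines of plain Java. This is a simplified sketch under stated assumptions: the in-memory queue replaces the database poll, `SenderLoopSketch`, `Message`, `State`, and `ProviderClient` are hypothetical names, and the `tryCount < retryCount` comparison is a guess at how the retry bound is applied rather than a copy of the real `shouldRetry`.

```java
import java.util.Queue;

final class SenderLoopSketch {
    enum State { PENDING, SENT, FAILED }

    /** Hypothetical in-memory view of a queued message. */
    static final class Message {
        final String text;
        State state = State.PENDING;
        int tryCount = 0;
        String providerTimestamp;

        Message(String text) {
            this.text = text;
        }
    }

    /** Stand-in for the Slack client; returns the provider message timestamp. */
    interface ProviderClient {
        String postMessage(String text) throws Exception;
    }

    /** Assumed bound: retry while fewer than retryCount retries were used. */
    static boolean shouldRetry(Message message, int retryCount) {
        return message.tryCount < retryCount;
    }

    /** Processes candidates one at a time until the queue is empty. */
    static void drain(Queue<Message> queue, ProviderClient client, int retryCount) {
        Message message;
        while ((message = queue.poll()) != null) {
            try {
                message.providerTimestamp = client.postMessage(message.text);
                message.state = State.SENT;
            } catch (Exception e) {
                if (shouldRetry(message, retryCount)) {
                    message.tryCount++;   // record the failed attempt
                    queue.add(message);   // leave it eligible for another pass
                } else {
                    message.state = State.FAILED;
                }
            }
        }
    }
}
```

Under this assumed comparison, a message that always fails with a retry count of `3` is attempted four times (one initial attempt plus three retries) before it is marked failed. The real job may count attempts differently.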

## Consequences

* The user's request latency is decoupled from Slack's — the post returns as soon as the message is durably queued, and a transient Slack outage delays delivery rather than failing the request.
* Delivery is **once-per-message cluster-wide** without a broker: the advisory lock serialises sending to a single replica. The same single-leader-via-Postgres-advisory-lock mechanism coordinates the notifications WAL consumer (ADR-0043); the two use distinct lock ids from the shared namespace.
* Because sending is single-leader, adding replicas does **not** increase Slack delivery throughput — outbound delivery is intentionally serialised, not horizontally scaled.
* A caller that received `202` cannot observe final delivery success from that response; terminal state lives on the message row (delivered, or failed after the retry budget). The absence of a user-facing signal for failures that happen after the `202` is a known limitation of the decoupled model in general, not something specific to this decision.

## Evidence

* `odd-platform-api/.../datacollaboration/controller/DataCollaborationController.java:34-39` — `postMessageInSlack` creates the message and returns `ResponseEntity.status(HttpStatus.ACCEPTED).body(message)`; no inline Slack call.
* `odd-platform-api/.../datacollaboration/job/DataCollaborationMessageSenderJob.java:93-95` — `acquireLeaderElectionConnection()` calls `leaderElectionManager.acquire(dataCollaborationProperties.getSenderMessageAdvisoryLockId(), true)` (blocking) before the drain loop.
* `odd-platform-api/.../datacollaboration/job/DataCollaborationMessageSenderJob.java:36-67` — the drain loop: poll `getSendingCandidate()`, `postMessage(...)`, and on exception retry (`incrementMessageTryCount`) or `markMessageAsFailed`; `:89-91` — `shouldRetry` bounds retries by `getSendingMessagesRetryCount()`.
* `odd-platform-api/src/main/resources/application.yml:202,204` — `sender-message-advisory-lock-id: 120` and `sending-messages-retry-count: 3` as operator-tunable properties.

## See also

* [Data Collaboration](/features/active-platform-features/data-collaboration.md) — the feature and its message lifecycle.
* [ADR-0019 — Data Collaboration ships disabled by default](/developer-guides/architecture-decision-log/adr-0019-data-collaboration-disabled-by-default.md) — the feature must be enabled before this delivery path runs.
* [ADR-0043 — Notification WAL consumer is a leader-elected singleton](/developer-guides/architecture-decision-log/adr-0043-notification-wal-single-leader.md) — the same Postgres-advisory-lock single-leader mechanism, applied to notification delivery.

