Using coldfront
ColdFront offers two operating modes. Pick one per table; both can coexist in the same database. The two modes compare as follows:
| Tiered (hot + cold) | Decoupled (iceberg-only) | |
|---|---|---|
| Where rows live | Hot in PG heap (recent), cold in Iceberg (archived) | Everything in Iceberg |
| Setup | Create a partitioned table; let the archiver convert it on first run | One SQL call: coldfront.create_iceberg_table(...) |
| Archiver | Required (cron, moves old partitions to cold) | Not used |
| Best when | Workload has a recent-row OLTP part that benefits from PG indexes + transactional ergonomics | Pure analytic / append-mostly; you want zero PG storage and stateless compute |
Once the table exists, the SQL surface is identical: SELECT,
INSERT, UPDATE, DELETE all work normally against the relation name
(e.g. events).
Prerequisites (both modes)
The stack must already be running with PG + pg_duckdb + coldfront + Lakekeeper + S3-compatible storage: three services - PostgreSQL + pg_duckdb, Lakekeeper, and any S3-compatible object store (SeaweedFS, MinIO, GCS, etc.). The one-time setup below brings it up and bootstraps it.
One-time setup
Bring up the end-user stack (the example uses SeaweedFS; host ports are
published so the localhost commands below work directly). For the image
build itself, see installation.md:
docker compose up -d --build
Then bootstrap Lakekeeper, create the warehouse, and pre-create the Iceberg namespace:
# 1. Bootstrap Lakekeeper
curl -X POST http://localhost:8181/management/v1/bootstrap \
-H "Content-Type: application/json" -d '{"accept-terms-of-use":true}'
# 2. Create warehouse (adjust endpoint/credentials for your S3 store)
curl -X POST http://localhost:8181/management/v1/warehouse \
-H "Content-Type: application/json" -d '{
"warehouse-name": "wh",
"storage-profile": {
"type": "s3", "bucket": "iceberg", "region": "us-east-1",
"endpoint": "http://seaweedfs:8333", "path-style-access": true,
"flavor": "s3-compat", "sts-enabled": false,
"remote-signing-enabled": false
},
"storage-credential": {
"type": "s3", "credential-type": "access-key",
"aws-access-key-id": "admin", "aws-secret-access-key": "adminsecret"
}
}'
# 3. Create the Iceberg namespace in the new warehouse.
# REQUIRED for decoupled (iceberg-only) mode: the Iceberg CREATE SCHEMA is
# deferred to transaction COMMIT but CREATE TABLE is POSTed eagerly, so
# coldfront.create_iceberg_table — which runs both in one
# transaction — would 404 against a cold warehouse. Pre-creating the
# namespace here (its own committed REST call) makes the function's in-txn
# CREATE SCHEMA IF NOT EXISTS a no-op so the table create succeeds. The
# archiver (tiered mode) creates the namespace itself and does not need this.
WID=$(curl -s http://localhost:8181/management/v1/warehouse \
| grep -oE '"warehouse-id":"[^"]+"' | head -1 | cut -d'"' -f4)
curl -X POST "http://localhost:8181/catalog/v1/$WID/namespaces" \
-H "Content-Type: application/json" -d '{"namespace": ["default"]}'
Then install the extensions and set the cold-tier credentials, once per database:
CREATE EXTENSION IF NOT EXISTS pg_duckdb;
CREATE EXTENSION IF NOT EXISTS coldfront;
SELECT coldfront.set_storage_secret('admin', 'adminsecret', 'seaweedfs:8333');
The secret is stored in the coldfront.storage_secret table (excluded
from pg_dump, replicated by value across a Spock mesh) and materialized
as a DuckDB PERSISTENT SECRET that loads at instance init. There is no
per-session setup: the Iceberg catalog ice attaches lazily by the
coldfront C hook on the first query that touches a tiered/decoupled view
(read or write).
For a real cloud-S3 setup, see object_store.md.
Mode 1 - Tiered (hot + cold)
Create a partitioned table normally:
CREATE TABLE events (
id bigint GENERATED ALWAYS AS IDENTITY,
ts timestamptz NOT NULL,
status text,
data jsonb,
PRIMARY KEY (id, ts)
) PARTITION BY RANGE (ts);
CREATE TABLE p_2026_04 PARTITION OF events
FOR VALUES FROM ('2026-04-01') TO ('2026-05-01');
List it in the archiver config:
# config.yaml
archiver:
tables:
- source_table: events
partition_period: monthly
hot_period: 1 month # tier hot PG → cold Iceberg past this age
# retention_period: 5 years # optional: DROP cold data past this age
# # (must exceed hot_period; omit = keep forever)
Run the archiver (typically via cron):
./bin/archiver --config config.yaml
The first run renames events → _events, creates the unified view
events, and registers it. From then on every cycle (1) tiers
partitions older than hot_period from hot PG to cold Iceberg and
advances the watermark, and (2) if retention_period is set, drops cold
Iceberg rows older than it. The data lifecycle is hot → hot_period →
cold → retention_period → gone; omit retention_period to keep cold
data forever.
2-level (LIST → RANGE) tiered tables
A table partitioned LIST (region) → RANGE (ts) can be tiered too - the
same sub_partition block as the standalone partition manager (Mode 3),
so a partition-manager-managed table can be "upgraded" to tiered by
pointing the archiver at it:
- source_table: regional
partition_column: ts # the RANGE (time) column — required for 2-level
partition_period: monthly
hot_period: 1 month
# retention_period: 5 years # optional cold expiry (region-agnostic, by ts)
sub_partition:
values_source: "SELECT region FROM regions"
One Iceberg table holds every region (region is just a column); the
archiver tiers leaves a whole ts period at a time across all
regions before advancing the shared hot/cold watermark, so a period only
becomes cold once it is cold for every region. id mode is not supported
in tiered mode (the cold tier is time-keyed).
Mode 2 - Decoupled (iceberg-only)
Single call:
SELECT coldfront.create_iceberg_table(
p_schema => 'public',
p_table => 'events',
p_columns => '[
{"name":"id", "type":"bigint"},
{"name":"ts", "type":"timestamptz"},
{"name":"status", "type":"text"},
{"name":"data", "type":"jsonb"}
]'::jsonb
);
That single statement provisions:
ice.default.eventson the attached Iceberg catalog- a PG-side wrapper view
public.eventswith proper PG-typed columns - a
coldfront.tiered_viewsregistry row - every INSERT, UPDATE, and DELETE on the view is intercepted by the coldfront C hook and rewritten to a singleduckdb.raw_query(...)againstice.default.events
Spock's ddl_sql repset replicates the CREATE VIEW and the registry
row, so the helper only needs to run on one node - peers pick up the
wrapper view, the coldfront.tiered_views row, and the
coldfront.claims repset registration automatically.
Mode 3 - Standalone partition manager (no cold tier)
You don't need Iceberg at all. If automated PostgreSQL partition
maintenance is all you want - declarative time- or id-based RANGE
partitioning with a premade forward window and automatic age-out of old
partitions - ColdFront's partitioner binary is the whole product: stock
PostgreSQL (or a Spock mesh), no cold tier, no DuckDB, no Iceberg, nothing
to preload. Each invocation makes one reconcile pass per managed table:
premake the forward window, ensure the partition covering now exists,
and detach-then-drop partitions past retention (DETACH ...
CONCURRENTLY, never a bare DROP of attached data). Build it with make
build and run it from cron:
go build -o bin/partitioner ./cmd/partitioner # or: make build
./bin/partitioner --config config.yaml
A partition-only config omits the iceberg: and s3: sections entirely
(supply either all of them or none - a half-filled cold config is
rejected):
postgres:
dsn: "host=localhost dbname=mydb"
archiver:
tables:
- source_table: events
partition_column: ts
partition_period: monthly # monthly | daily
retention_period: 12 months # any PostgreSQL interval (12 months, 90 days, 1 year)
future_partitions: 3 # premake window kept ahead of now
expiration_strategy: drop # drop (DETACH+DROP, destroy; default)
# | detach (DETACH only, keep as a
# standalone table — data preserved)
Operating it
Schedule one pass per period or more often - a cron line, or a systemd
oneshot service plus timer (a failed pass then surfaces as a failed
unit, so alerting is free):
17 * * * * postgres /usr/local/bin/partitioner --config /etc/coldfront/partitioner.yaml >> /var/log/coldfront-partitioner.log 2>&1
Keep the following operational behaviour in mind when scheduling it:
- Exit codes.
0= every table reconciled (a self-healed behind condition still exits0); non-zero = at least one table failed (N table(s) failed), each logged with its[schema.table]prefix. Alert on non-zero. - Behind-detection. If the table already has a past partition but
none covers now at the start of a pass (a lagging cron - live inserts
had no home), the pass heals it (creates the current partition), logs a
WARNING (self-healed)line, and still exits0. Monitor for that warning in the log rather than via the exit code, and widenfuture_partitionsor run more often. A fresh table (only just-premade future partitions) is not behind: its first reconcile succeeds cleanly. - Retention strategy. With the default
expiration_strategy: drop, expiry isDETACH CONCURRENTLY+DROP TABLE- the data is gone, so back up before shrinkingretention_period. Setexpiration_strategy: detachto instead leave the expired partition as a standalone table (detached from the parent, data preserved) and reclaim it yourself. A partition is expired only once its entire range is older thannow − retention_period, computed with calendar-accurate PostgreSQL interval arithmetic (a real month, leap years correct).detachis partition-only - the tiered archiver always drops after exporting to cold.
Primary keys on time-partitioned tables (id mode)
PostgreSQL requires a unique index to include the partition key, so a
plain PRIMARY KEY (id) is impossible on a table partitioned by a
separate ts column. Partition instead by RANGE on a time-ordered id
and the partition key is the key - a single-column PRIMARY KEY (id):
- source_table: events
partition_column: id
partition_period: monthly
retention_period: 12 months
part_mode: id
id_scheme: snowflake # snowflake | uuidv7
uuidv7 reads the RFC 9562 leading-millisecond timestamp; snowflake
decodes the pgEdge snowflake extension's layout. Either way the manager
computes the month/day partition bounds as id values, so id-order equals
time-order and the forward/retention schedule is unchanged.
Two-level (LIST → RANGE) sub-partitioning
For a table partitioned by LIST (region) whose children are themselves
RANGE-partitioned by time, add a sub_partition block. values_source
is a query returning the current level-1 values; the manager provisions
and maintains a RANGE sub-tree under each, creating sub-trees for newly
appearing values automatically:
- source_table: events
partition_column: ts # the level-2 RANGE column
partition_period: monthly
retention_period: 12 months
sub_partition:
values_source: "SELECT code FROM regions"
Managing partitioned tables (CLI)
ColdFront splits configuration into two kinds. Connection config -
the Postgres DSN, and (tiered archiver only) the Iceberg/S3 connection -
stays in a small per-node YAML and is never replicated. Per-table
lifecycle lives in coldfront.partition_config, a name-keyed table
that replicates by value across a Spock mesh (like
tiered_views/archive_watermark), so every node reads identical config
- no per-node file syncing. Manage it with the CLI below (both
partitioner and archiver expose these subcommands; with no subcommand
they do their normal reconcile/archive run).
The data lifecycle is hot PG → hot_period → cold Iceberg →
retention_period → dropped (tiered) or hot PG → retention_period
→ dropped (partition-only). Setting hot_period makes a table tiered;
omitting it makes it partition-only.
hot_period and retention_period are native PostgreSQL intervals -
use any interval syntax (1 month, 90 days, 1 year 2 mons, 5
years). Expiry boundaries are computed with calendar-accurate interval
arithmetic (now() - period: real months, leap years), and
retention_period must exceed hot_period (validated at register/set
time).
The CLI exposes the following subcommands:
| Command | Purpose |
|---|---|
register |
add/adopt a table - validates the PRIMARY KEY covers the partition key |
list |
show managed tables and their lifecycle |
set |
change fields, or --disable/--enable a table |
remove |
stop managing a table (the table itself is left intact) |
import |
seed partition_config from a YAML's archiver.tables (migration) |
export |
dump the active (enabled) config to YAML or SQL - a git-reviewable copy |
# Partition-only: keep 3 future partitions, drop those older than 12 months.
partitioner register --config cf.yaml --table events --period monthly --retention "12 months"
# Partition-only, but DETACH (preserve) expired partitions instead of dropping them.
partitioner register --config cf.yaml --table events --period monthly \
--retention "12 months" --strategy detach
# Tiered: tier to cold Iceberg after 1 month, then drop cold data after 5 years.
archiver register --config cf.yaml --table events --period monthly \
--hot-period "1 month" --retention "5 years"
# id mode — a real single-column PRIMARY KEY (id) on a snowflake-keyed table.
partitioner register --config cf.yaml --table events --period monthly \
--column id --part-mode id --id-scheme snowflake --retention "1 year"
# 2-level LIST(region) → RANGE(ts), tiered; region values come from a table.
archiver register --config cf.yaml --table regional --period monthly --column ts \
--hot-period "1 month" --sub-values-source "SELECT region FROM regions"
partitioner list --config cf.yaml # what's managed
partitioner set --config cf.yaml --table events --retention "24 months"
partitioner set --config cf.yaml --table events --disable # pause (keeps the row)
partitioner remove --config cf.yaml --table events # unregister, keep the table
partitioner import --config legacy.yaml # migrate a YAML's tables
partitioner export --config cf.yaml > managed.yaml # active config, git-reviewable (--format sql for INSERTs)
Run partitioner (or archiver) with no arguments, help, or --help
for the command overview; every subcommand has detailed --help with
worked examples. The write commands accept --print-sql (emit the SQL
without running it - review/commit it) and --dry-run. set
--enable/--disable (mutually exclusive) pause/resume a table without
removing it; a disabled table is skipped by reconcile and omitted from
export. Per table, only the cadence and a destroy boundary are
required; partition_column is auto-detected from pg_catalog for flat
tables (required for 2-level). register writes a row whose CHECK
constraints enforce the lifecycle rules at write time.
YAML archiver.tables still works as a deprecation bridge: when
partition_config is empty the binaries fall back to a YAML table list.
Move off it with import.
Storage backends
Configure exactly one cold-store backend:
- S3 - any S3-compatible store (SeaweedFS, MinIO). Set
endpoint,use_ssl: truefor a TLS endpoint, andurl_style: path(default) orvhost. - Virtual-hosted cloud S3 (AWS S3 is the canonical one) - omit
endpoint(and theendpointarg toset_storage_secret) so DuckDB uses the native per-Region virtual-hosted + HTTPS addressing; just setregionto your bucket's Region. This is required for Regions launched after 2019-03-20 (e.g.ap-south-2), whose DNS does not route path-style requests and returns HTTP 400. The Lakekeeper warehouse profile must be a virtual-hosteds3profile (flavor: aws,path-style-access: false, no custom endpoint); the full walkthrough is object_store.md. - Google Cloud Storage - not a separate backend: use
s3:pointed at GCS's S3-interoperability endpoint with an HMAC key pair (endpoint: storage.googleapis.com,use_ssl: true,access_key/secret_key= the HMAC id/secret). Lakekeeper's warehouse uses ans3profile (flavor: s3-compat,path-style) at the same endpoint. Verified end-to-end (iceberg read+write over interop). Lakekeeper's nativegcsprofile is service-account only and is not used. - Azure ADLS Gen2 - a supported cold-store backend; the access key
rides inside
connection_string.
For an Azure ADLS Gen2 cold tier, set the credential with
set_storage_secret_azure() instead of set_storage_secret() - it takes
a CONFIG-provider connection string. The storage-account access key rides
inside AccountKey=…; the DuckDB azure secret has no separate
account-key parameter, so shared-key auth lives entirely in the
connection string:
SELECT coldfront.set_storage_secret_azure(
'DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>;EndpointSuffix=core.windows.net');
It writes the same coldfront.storage_secret row (replicated,
pg_dump-excluded) and materializes a TYPE azure PERSISTENT SECRET.
The Azure cold tier is subject to the soft-delete / change-feed
restriction in Gotchas.
Reading + writing (identical for both modes)
The same SQL works against the relation name in either mode, as the following examples show:
-- Reads — pg_duckdb handles the iceberg side; PG handles the heap side
SELECT count(*) FROM events;
SELECT id, status, data->>'k' FROM events WHERE ts >= '2026-04-01';
-- Inserts, updates, and deletes all go through the coldfront C hook,
-- which rewrites the query into one PG-set-based statement (hot side)
-- + one duckdb.raw_query (cold side). For iceberg-only mode every write
-- goes cold; for tiered mode the hook splits by ts vs the watermark.
INSERT INTO events (ts, status, data) VALUES (now(), 'ok', '{"k":1}');
UPDATE events SET status = 'fixed' WHERE id = 123;
DELETE FROM events WHERE ts < '2025-01-01';
-- Bulk INSERT shapes are all set-based — no per-row work:
INSERT INTO events (ts, status, data) VALUES (...), (...), (...);
INSERT INTO events (ts, status, data) SELECT ts, status, data FROM staging;
INSERT INTO events (ts, status, data) SELECT now() + i*'1s'::interval, 'ok', '{}'
FROM generate_series(1, 1000) i;
-- Transactions work; ROLLBACK undoes Iceberg writes too
BEGIN;
UPDATE events SET status = 'pending' WHERE id = 1;
SELECT status FROM events WHERE id = 1; -- sees 'pending' (read-your-own-write)
ROLLBACK;
SELECT status FROM events WHERE id = 1; -- back to whatever it was
Supported column types
The following PostgreSQL column types are supported:
bigint · integer · smallint · real · double precision ·
boolean · timestamp with time zone · timestamp without time zone ·
date · time without time zone · uuid · text · varchar(N) ·
char(N) · bytea · oid · numeric(P,S) (P ≤ 38) · jsonb / json ·
interval
Anything else (unbounded numeric, xml, tsvector, range/multirange
types, custom enums, arrays, composite types) is rejected at
table-creation time. We refuse silent fallback to varchar - losing
precision/identity is worse than no support.
jsonb, json and interval are stored as varchar in Iceberg (no
native primitive) and view-cast back to the rich PG type on read. Queries
like data->>'key' work; jsonb-only operators (?, @>) need an
explicit data::jsonb cast.
inet/cidr are not supported: pg_duckdb cannot process PG inet
(Oid 869) in any Iceberg-backed query, and every cross-tier read is
planned by pg_duckdb - so no cast makes them readable. Store IP data as
text (you can still index/compare it; cast to inet in your own
queries on the hot side only if needed).
Gotchas
Keep the following caveats in mind when running either mode:
jsonbreads: surface asjson, notjsonb. Most operators work; the binary-only ones don't.- Cross-tier isolation: a long-running
SELECTthat touches the Iceberg side multiple times within one transaction may see writes from other sessions interleaved between scans. PG's repeatable-read does not extend across the pg_duckdb boundary. Read-your-own-write within one tx works (verified) - it's only cross-statement consistency vs. concurrent writers that's weaker. - Crash-mid-commit (decoupled mode): if a backend crashes between Iceberg snapshot commit and PG commit, S3 objects can be orphaned. Iceberg housekeeping reclaims them - not corrupting, but a real failure mode.
- Concurrent writes from multiple PG nodes (decoupled mode):
serialized PG-side by the bakery protocol - every iceberg-only INSERT
goes through
coldfront._exec_iceberg_with_claim, which holds a globally-ordered snowflake ticket and waits for its turn before committing to Lakekeeper. No 409 conflicts, no app-level retry. The protocol is Lamport-1978 mutex with the Ricart-Agrawala (1981) deferred-reply optimisation; claims and acks replicate as Spock rows and it stays safe under Spock's asymmetric apply (modelled in docs/formal/Bakery_v2.tla). The bakery requires thedblink+snowflakeextensions, thecoldfront.dblink_selfGUC, and a one-timeSELECT coldfront._ensure_claims_replicated()call on every node after spock mesh setup; see architecture_decoupled.md. Sync-rep is not required. The throughput ceiling is Lakekeeper's commit rate, not the writer count. - Direct table access:
_eventsis the hot heap (tiered mode only).ice.default.<name>is the Iceberg table - only addressable viaiceberg_scan(...)orduckdb.raw_query('… ice.… …'), never via PG-native 3-part names. - Tiered INSERT with omitted IDENTITY column (e.g.
INSERT INTO events (ts, status, data) VALUES …whereidisGENERATED ALWAYS AS IDENTITY): the cold side falls back to a plpgsql cursor loop that callsnextval()per row so cold ids share the hot side's sequence. Correctness is full; throughput is lower than the set-based fast path. Either supplyidexplicitly in the INSERT, or use a partition-column predicate that proves the rows are all hot, to stay on the fast path. For very large historical seeds (mostly-cold), prefer iceberg-only mode where ids come from your source data.
Distributed setup (3-node mesh, decoupled mode)
For multi-writer iceberg workloads, run coldfront on N PG nodes in a Spock mesh against the same Lakekeeper + S3. The bakery serializes commits PG-side so writers never collide at the catalog.
Per-node postgresql.conf
The configuration below applies to each node in the mesh:
wal_level = logical
shared_preload_libraries = 'snowflake,spock,pg_duckdb,coldfront'
# Keep pg_stat_replication.reply_time fresh on every walsender. PG
# default is 10 s; with the bakery's 5 s liveness window that would
# false-positive every idle peer as "dead" on the first claim after a
# quiet period. 1 s leaves comfortable margin.
wal_receiver_status_interval = 1s
# Per-node — distinct integer 1..1023. MUST equal
# (hashtext(<spock node_name>) & 1023); the bakery alignment check raises
# at first claim if these disagree.
snowflake.node = 1
# DSN used by the bakery's autonomous-tx claim INSERT/DELETE via dblink.
# Unix socket avoids TCP overhead. The claim session only touches the
# coldfront.claims/claim_acks heap tables, so it never attaches the
# Iceberg catalog (the lazy catalog-attach hook fires only on a tiered
# view). application_name=coldfront_dblink marks the session as bakery
# traffic.
coldfront.dblink_self = 'host=/tmp dbname=coldfront user=coldfront application_name=coldfront_dblink'
coldfront.warehouse = 'wh'
coldfront.lakekeeper_endpoint = 'http://lakekeeper:8181/catalog'
The bakery has no peer-ack timeout. R-A's only failure mode is a dead
peer (would wait forever), closed by a liveness check inside the
wait-loop: a peer whose pg_stat_replication.reply_time is older than
coldfront.peer_alive_window_ms (default 5000) is implicitly treated
as already-acked. Raise this on slow/lossy WAN links if false-positive
dead-peer rulings become a problem. An alive peer that hasn't acked is
either deferring legitimately (R-A's defer rule) or about to ack - either
way, waiting is correct, not a failure.
Sync-rep (synchronous_standby_names) is not required by the bakery
- the R-A ack barrier replaces it. You can still enable it cluster-wide
if you want stronger durability for non-bakery writes, but it's no longer
load-bearing for iceberg-commit serialisation.
One-time mesh setup - must be done in this order on every node,
because coldfront._ensure_claims_replicated() calls
spock.repset_add_table and so requires the local spock node to already
exist:
-- 1. Extensions, in dependency order. dblink + snowflake are bakery
-- prereqs (R-A's autonomous-tx claim/ack INSERTs go through dblink;
-- claim tickets come from snowflake.nextval).
CREATE EXTENSION IF NOT EXISTS dblink;
CREATE EXTENSION IF NOT EXISTS snowflake;
CREATE EXTENSION IF NOT EXISTS spock;
CREATE EXTENSION IF NOT EXISTS pg_duckdb;
CREATE EXTENSION IF NOT EXISTS coldfront;
-- 2. Spock node + full-mesh subscriptions (each node has N-1 subs).
SELECT spock.node_create('n<i>', 'host=<this_node_priv_ip> user=coldfront dbname=coldfront port=5432');
-- on n1 (n2 and n3 symmetric):
SELECT spock.sub_create('sub_n1_from_n2', 'host=<n2> user=coldfront dbname=coldfront port=5432',
ARRAY['default','default_insert_only','ddl_sql'],
false, false, '{}', '0', false);
SELECT spock.sub_create('sub_n1_from_n3', 'host=<n3> user=coldfront dbname=coldfront port=5432',
ARRAY['default','default_insert_only','ddl_sql'],
false, false, '{}', '0', false);
-- 3. **Required** on every node, after spock setup: register the R-A
-- bakery tables (coldfront.claims + coldfront.claim_acks) in the local
-- node's default repset. Without this on a peer, the peer's ack INSERTs
-- never replicate back to the originating writer and every claim
-- ack-waits to timeout.
SELECT coldfront._ensure_claims_replicated();
synchronize_structure := false, synchronize_data := false - tables
already exist on every node from the coldfront extension; no initial copy
needed.
Verify before benching - insert a sentinel claim on each node and
read it back from every other node. All N×(N-1) directions must show the
row before traffic starts. ci/journey.sh story_mesh_substrate is a
copyable reference.
Tuning knobs
The following GUCs adjust write behaviour and execution; tune them as needed:
coldfront.allow_mixed_writes(bool, defaulton) - controls what happens for tiered-mode UPDATE/DELETE whose WHERE can't be proven to target one tier.onemits a dual-tier CTE;offrejects with an error and a hint. Not relevant in decoupled mode (every write is single-tier by definition).duckdb.force_execution- bench it before flipping: on a mixed workload it helpscount(distinct)and similar but regresses index lookups, top-K with PK ordering, and JSON access. Default off.
Going deeper
For deeper detail on each mode's internals, see the architecture references:
- Tiered architecture, watermark, archiver, transparent UPDATE/DELETE, concurrency: architecture.md.
- Decoupled mode internals, ACID model, distributed scaling: architecture_decoupled.md.