Best Practices (ACE)
ACE (Active Consistency Engine) helps keep nodes in a pgEdge Distributed Postgres cluster consistent by detecting and repairing drift in data, schema, and Spock configuration.
General Guidance
Validate your prerequisites.
- Ensure every table has a primary key (ACE requires it for block/range partitioning).
- Confirm credentials and connection details in your pg_service.conf file.
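One way to check the primary-key prerequisite is to query the Postgres catalogs on each node. The sketch below is illustrative rather than part of ACE: it assumes psql is installed and that pg_service.conf defines one service per node. The service names n1, n2, and n3, the PSQL_BIN variable, and the function name are hypothetical.

```shell
#!/bin/sh
# List user tables that have no primary key on a given node.
# Assumption: the node has a matching [service] entry in pg_service.conf.

PSQL_BIN="${PSQL_BIN:-psql}"

# Catalog query: ordinary ('r') and partitioned ('p') tables that have
# no constraint of type 'p' (primary key) in pg_constraint.
NO_PK_SQL="
SELECT n.nspname || '.' || c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('r', 'p')
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
  AND NOT EXISTS (
        SELECT 1 FROM pg_constraint con
        WHERE con.conrelid = c.oid AND con.contype = 'p')
ORDER BY 1;"

tables_without_pk() {
  # $1 = service name from pg_service.conf
  # -A: unaligned output, -t: tuples only, so each line is one schema.table name.
  "$PSQL_BIN" "service=$1" -At -c "$NO_PK_SQL"
}

# Usage (hypothetical service names):
#   for node in n1 n2 n3; do echo "== $node"; tables_without_pk "$node"; done
```

An empty result for every node suggests the prerequisite is met. Any table that is listed would need a primary key before ACE can compare it.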
Scope deliberately. Use targeted schema-diff, repset-diff, or table-diff with --table-filter instead of cluster-wide runs when diagnosing a known issue.
- table-diff: Performs a deep dive on a specific table (fastest to iterate).
- schema-diff: Inventories and compares objects (tables/views/functions/indexes) and optionally runs per-table diffs.
- repset-diff: Sweeps all tables in a replication set.
- spock-diff: Validates Spock metadata across nodes.
Start safe, then add options.
- Begin with --output json and --quiet off; add --output html for human review.
- Use --dry-run when performing repairs.
Control Resource Use When Possible.
- Adjust --block-size, --concurrency-factor, and --compare-unit-size to match each host’s capacity. For Merkle operations, consider lowering or raising --max-cpu-ratio as needed.
Keep your Statistics Fresh.
- Run ANALYZE on large or cold tables before heavy comparisons, since ACE relies on probabilistic sampling (TABLESAMPLE) to speed things up. This is especially useful when using Merkle trees.
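Refreshing statistics can be scripted per node ahead of a heavy comparison. This is a minimal sketch, assuming psql is available and pg_service.conf has one service entry per node. The service names, the table name public.orders, the PSQL_BIN variable, and both function names are placeholders.

```shell
#!/bin/sh
PSQL_BIN="${PSQL_BIN:-psql}"

# Run ANALYZE for one table on one node.
analyze_table() {
  # $1 = service name from pg_service.conf, $2 = schema-qualified table name.
  # The table name is interpolated into SQL as-is, so pass trusted names only.
  "$PSQL_BIN" "service=$1" -c "ANALYZE $2;"
}

# Refresh statistics for the same table on every listed node,
# stopping at the first node that fails.
analyze_on_nodes() {
  # $1 = table name, remaining arguments = service names
  table="$1"; shift
  for node in "$@"; do
    analyze_table "$node" "$table" || return 1
  done
}

# Usage (hypothetical names):
#   analyze_on_nodes public.orders n1 n2 n3
```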
Use a Connection Pooler.
- Point ACE at pgBouncer or pgCat for stable, efficient connections.
Automate checks.
- Use --schedule --every=<duration> for quick loops or configure jobs in ace.yaml and run ./ace start. See Scheduling ACE Runs for examples.
Adopting a Safe Repair Workflow
It's a good practice to schedule diff jobs (schema/repset/table) during low-traffic windows and after maintenance events.
- Detect Differences Often: Run a diff and review the JSON/HTML report.
- Perform a Repair Dry-run: Run table-repair --dry-run to preview actions.
- Perform Conservative Repairs when Possible: Prefer --upsert-only or --insert-only on critical tables where deletes are risky.
- Verify the Repair: Run table-rerun using the original diff to confirm resolution.
- Iterate in Batches: If the number of diffs is large, narrow the scope with --table-filter or segment work by replication set/schema.
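The workflow above can be sketched as a set of thin shell wrappers that pin the safe flag for each step and pass every other argument through unchanged. Only subcommands and flags named on this page are hard-coded. The wrapper names and the ACE_BIN variable are hypothetical, and the remaining arguments (cluster, table, diff file, and so on) are deliberately not shown; take them from the ACE command reference. Placing options before the other arguments is also an assumption.

```shell
#!/bin/sh
# Path to the ACE binary; override ACE_BIN if it lives elsewhere.
ACE_BIN="${ACE_BIN:-./ace}"

# Step 1: detect differences, with JSON output for later review.
ace_diff() { "$ACE_BIN" table-diff --output json "$@"; }

# Step 2: preview the repair without applying any changes.
ace_repair_preview() { "$ACE_BIN" table-repair --dry-run "$@"; }

# Step 3: conservative repair for tables where deletes are risky.
ace_repair_upsert() { "$ACE_BIN" table-repair --upsert-only "$@"; }

# Step 4: re-check using the original diff to confirm resolution.
ace_verify() { "$ACE_BIN" table-rerun "$@"; }

# Usage: call each wrapper with the same arguments you would normally
# give the underlying ACE command, for example:
#   ace_repair_preview <your table-repair arguments>
```

Keeping the dry run as its own wrapper makes it harder to skip the preview, because the command that applies changes has to be typed separately and deliberately.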