ACE Getting Started
ACE is a powerful tool designed to ensure and maintain consistency across nodes in a pgEdge Distributed Postgres cluster. ACE helps identify and resolve data inconsistencies, schema differences, and replication configuration mismatches across nodes.
Key features of ACE include:
- Table-level data comparison and repair
- Replication-set-level verification
- Diff-driven repair workflows
- Schema comparison
- Spock configuration validation
ACE Use Cases
In an eventually consistent multi-master system, nodes may diverge due to replication issues, network partitions, or node failures. ACE helps restore correctness by performing efficient, controlled comparisons and targeted repairs across nodes.
Node Failures (Planned/Unplanned)
- Problem: A node rejoins the cluster but is out-of-sync.
- Approach: table-diff (or repset-diff / schema-diff) to assess drift; table-repair with --dry-run, then apply.
Network Partitions / Link Degradation
- Problem: Cross-region clusters experience Spock exceptions and partial replication.
- Approach: Identify impacted rows precisely with table-diff; repair with --upsert-only or --insert-only where appropriate to minimize risk.
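To illustrate why the restricted repair modes lower risk, the following Python sketch models two nodes as dictionaries keyed by primary key. It assumes that an insert-only repair adds only the rows missing from the target, and that an upsert-only repair also overwrites rows that differ, with neither mode deleting anything. The function plan_repair, the "full" mode name, and the sample data are hypothetical; this is a conceptual model, not ACE's implementation.

```python
def plan_repair(source, target, mode="full"):
    """Return the changes needed to move `target` toward `source`.

    Both arguments map primary key -> row. The mode semantics below are
    assumptions made for illustration, not taken from ACE's code.
    """
    if mode not in ("insert-only", "upsert-only", "full"):
        raise ValueError(f"unknown mode: {mode}")

    # Rows present on the source but missing on the target.
    inserts = {pk: row for pk, row in source.items() if pk not in target}
    # Rows present on both nodes whose contents differ.
    updates = {pk: row for pk, row in source.items()
               if pk in target and target[pk] != row}
    # Rows present only on the target.
    deletes = [pk for pk in target if pk not in source]

    if mode == "insert-only":
        return {"insert": inserts, "update": {}, "delete": []}
    if mode == "upsert-only":
        return {"insert": inserts, "update": updates, "delete": []}
    return {"insert": inserts, "update": updates, "delete": deletes}


n1 = {1: "alice", 2: "bob", 3: "carol"}   # node treated as the source of truth
n2 = {2: "bobby", 3: "carol", 4: "dave"}  # node that drifted

for mode in ("insert-only", "upsert-only", "full"):
    print(mode, plan_repair(n1, n2, mode))
```

In this model the restricted modes never remove row 4 from the drifted node, which is the property that makes them the more conservative choice when it is unclear which node holds the correct data.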
Planned Maintenance Windows
- Problem: Nodes fall behind during upgrades or maintenance.
- Approach: Run diffs and perform bulk table-repair to re-synchronize.
Post-Repair Verification
- Problem: Need to confirm remediation success.
- Approach: Use table-rerun --diff-file=<original-diff> to verify that discrepancies no longer exist.
Spock Configuration Validation
- Problem: Metadata/config drift can cause replication anomalies.
- Approach: Use spock-diff to compare Spock state across nodes; correct differences before they cause data drift.
Large-Scale Integrity Checks
- Problem: Very large tables make full scans impractical.
- Approach: Use Merkle trees:
  - Initialize (mtree init) and build the tree (mtree build) once for each table, then use mtree table-diff for faster comparisons.
  - Optionally keep trees current with mtree listen, which updates them in real time and saves time during mtree table-diff.
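The reason a Merkle tree speeds up comparisons can be shown with a small Python sketch. It is a generic illustration of the technique, assuming each leaf is the hash of one block of rows; the functions build_tree and diff_leaves are hypothetical names, and the layout ACE actually uses for its trees is not described here.

```python
import hashlib


def sha(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return the tree as a list of levels: leaf hashes first, root last."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Hash adjacent pairs; an unpaired last node is hashed on its own.
        levels.append([
            sha(prev[i] + (prev[i + 1] if i + 1 < len(prev) else b""))
            for i in range(0, len(prev), 2)
        ])
    return levels


def diff_leaves(a, b):
    """Walk two same-shaped trees from the root, descending only into mismatches."""
    if len(a[0]) != len(b[0]):
        raise ValueError("trees must cover the same number of blocks")
    if not a[0]:
        return []
    suspects = [0]  # node indexes at the current level, starting at the root
    for level in range(len(a) - 1, 0, -1):
        children = []
        for i in suspects:
            if a[level][i] != b[level][i]:
                children += [c for c in (2 * i, 2 * i + 1)
                             if c < len(a[level - 1])]
        suspects = children
    return [i for i in suspects if a[0][i] != b[0][i]]


# Eight blocks of rows per node; block 5 has drifted on the second node.
blocks_n1 = [f"block-{i}".encode() for i in range(8)]
blocks_n2 = list(blocks_n1)
blocks_n2[5] = b"block-5-stale"

tree_n1 = build_tree([sha(b) for b in blocks_n1])
tree_n2 = build_tree([sha(b) for b in blocks_n2])
print(diff_leaves(tree_n1, tree_n2))  # [5]
```

When the root hashes match, the comparison ends after a single check. When they differ, only the subtrees that contain a mismatch are visited, so the work grows with the amount of drift rather than with the size of the table.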
Simplifying ACE Operations
- Schedule ACE via CLI timers or the built-in scheduler to perform periodic checks (see Scheduling ACE Runs).
- Segment and target by schema, by repset, or with table-filter to keep runs predictable.
- Store ACE-generated JSON/HTML reports for complete audit trails.
Known Limitations
- ACE cannot be used on a table without a primary key, because primary keys are the basis for range partitioning, hash calculations, and other critical functions in ACE.
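The dependency on primary keys can be made concrete with a short Python sketch. It assumes integer primary keys and a fixed list of range boundaries, and it hashes each key range independently on both nodes; range_hashes, the boundaries, and the sample rows are hypothetical and only illustrate the general idea, not ACE's internals.

```python
import hashlib


def range_hashes(rows, boundaries):
    """Hash rows per primary-key range [lo, hi), visiting rows in key order."""
    hashes = {}
    for lo, hi in zip(boundaries, boundaries[1:]):
        digest = hashlib.sha256()
        # The primary key defines both range membership and a stable row order.
        for pk in sorted(k for k in rows if lo <= k < hi):
            digest.update(repr((pk, rows[pk])).encode())
        hashes[(lo, hi)] = digest.hexdigest()
    return hashes


n1 = {pk: f"row-{pk}" for pk in range(1, 13)}
n2 = dict(n1)
n2[7] = "row-7-stale"  # a row that differs between nodes
del n2[11]             # a row missing from the second node

boundaries = [1, 5, 9, 13]
h1 = range_hashes(n1, boundaries)
h2 = range_hashes(n2, boundaries)

mismatched = [r for r in h1 if h1[r] != h2[r]]
print(mismatched)  # [(5, 9), (9, 13)]
```

Without a primary key there is no stable way to assign a row to a range or to feed rows into the hash in the same order on every node, so matching data could produce different hashes and the comparison would be unreliable.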