This documentation is sourced from a third-party project and is not maintained by pgEdge.

Iterative Index Scans

Added in 0.8.0

With approximate indexes, queries with filtering can return fewer results than requested, since filtering is applied after the index is scanned. Starting with 0.8.0, you can enable iterative index scans, which automatically scan more of the index until enough results are found (or the scan reaches hnsw.max_scan_tuples or ivfflat.max_probes).

Iterative scans can use strict or relaxed ordering.

Strict ensures results are in the exact order by distance

SET hnsw.iterative_scan = strict_order;

Relaxed allows results to be slightly out of order by distance, but provides better recall

SET hnsw.iterative_scan = relaxed_order;
-- or
SET ivfflat.iterative_scan = relaxed_order;
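To make the effect on a filtered query concrete, here is a minimal end-to-end sketch. The table definition (three-dimensional vectors, an integer category_id) and the choice of an HNSW index with L2 distance are assumptions made for illustration; only the column names come from the queries in this section.

```sql
-- Assumed schema for illustration; adjust dimensions and types to your data
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE items (
    id bigserial PRIMARY KEY,
    category_id integer,
    embedding vector(3)
);
CREATE INDEX ON items USING hnsw (embedding vector_l2_ops);

-- Enable iterative scans for this session, keeping exact distance order
SET hnsw.iterative_scan = strict_order;

-- The category filter is applied after the index scan; with iterative scans
-- enabled, the index is scanned further until 5 matching rows are found
-- or a scan limit is reached
SELECT id
FROM items
WHERE category_id = 123
ORDER BY embedding <-> '[1,2,3]'
LIMIT 5;
```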

With relaxed ordering, you can use a materialized CTE to get strict ordering

WITH relaxed_results AS MATERIALIZED (
    SELECT id, embedding <-> '[1,2,3]' AS distance FROM items WHERE category_id = 123 ORDER BY distance LIMIT 5
) SELECT * FROM relaxed_results ORDER BY distance;

For queries that filter by distance, use a materialized CTE and place the distance filter outside of it for best performance (due to the current behavior of the Postgres executor)

WITH nearest_results AS MATERIALIZED (
    SELECT id, embedding <-> '[1,2,3]' AS distance FROM items ORDER BY distance LIMIT 5
) SELECT * FROM nearest_results WHERE distance < 5 ORDER BY distance;

Note: Place any other filters inside the CTE
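As a sketch of how the two kinds of filter can be combined, the query below keeps the category_id filter (reused from the earlier example in this section) inside the CTE and applies only the distance filter outside it. The threshold and limit are the same illustrative values used above.

```sql
WITH nearest_results AS MATERIALIZED (
    -- Non-distance filters go inside the CTE, alongside the index scan
    SELECT id, embedding <-> '[1,2,3]' AS distance
    FROM items
    WHERE category_id = 123
    ORDER BY distance
    LIMIT 5
)
-- The distance filter is applied to the materialized results
SELECT * FROM nearest_results WHERE distance < 5 ORDER BY distance;
```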

Iterative Scan Options

Since scanning a large portion of an approximate index is expensive, there are options to control when a scan ends.

HNSW

Specify the max number of tuples to visit (20,000 by default)

SET hnsw.max_scan_tuples = 20000;

Note: This is approximate and does not affect the initial scan

Specify the max amount of memory to use, as a multiple of work_mem (1 by default)

SET hnsw.scan_mem_multiplier = 2;

Note: Try increasing this if increasing hnsw.max_scan_tuples does not improve recall
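If you only want these limits to apply to a single query rather than the whole session, one option is the standard Postgres SET LOCAL form inside a transaction. The sketch below assumes the items table from the earlier examples; the specific values are illustrative, not recommendations.

```sql
BEGIN;
-- SET LOCAL settings revert automatically at the end of the transaction
SET LOCAL hnsw.iterative_scan = relaxed_order;
SET LOCAL hnsw.max_scan_tuples = 40000;   -- illustrative value
SET LOCAL hnsw.scan_mem_multiplier = 2;   -- illustrative value

SELECT id
FROM items
WHERE category_id = 123
ORDER BY embedding <-> '[1,2,3]'
LIMIT 5;
COMMIT;
```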

IVFFlat

Specify the max number of probes

SET ivfflat.max_probes = 100;

Note: If this is lower than ivfflat.probes, ivfflat.probes will be used
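Putting the IVFFlat settings together, a session might look like the following sketch. It assumes the items table from the earlier examples already has an IVFFlat index on embedding; the probe counts are illustrative values.

```sql
-- Number of lists probed on the initial scan (illustrative value)
SET ivfflat.probes = 10;

-- Allow the scan to continue to further lists when the filter removes too many rows
SET ivfflat.iterative_scan = relaxed_order;

-- Upper bound on the total number of lists probed (illustrative value)
SET ivfflat.max_probes = 100;

SELECT id
FROM items
WHERE category_id = 123
ORDER BY embedding <-> '[1,2,3]'
LIMIT 5;
```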