Version:

This documentation is sourced from a third-party project and is not maintained by pgEdge.

Performance

Tuning

Use a tool like PgTune to set initial values for Postgres server parameters. For instance, shared_buffers should typically be 25% of the server’s memory. You can find the config file with:

SHOW config_file;

And check individual settings with:

SHOW shared_buffers;

Be sure to restart Postgres for changes to take effect.

Loading

Use COPY for bulk loading data (example).

COPY items (embedding) FROM STDIN WITH (FORMAT BINARY);

Add any indexes after loading the initial data for best performance.

Indexing

See index build time for HNSW and IVFFlat.

In production environments, create indexes concurrently to avoid blocking writes.

CREATE INDEX CONCURRENTLY ...

Querying

Use EXPLAIN (ANALYZE, BUFFERS) to debug performance.

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;

Exact Search

To speed up queries without an index, increase max_parallel_workers_per_gather.

SET max_parallel_workers_per_gather = 4;

If vectors are normalized to length 1 (like OpenAI embeddings), use inner product for best performance.

SELECT * FROM items ORDER BY embedding <#> '[3,1,2]' LIMIT 5;

Approximate Search

To speed up queries with an IVFFlat index, increase the number of inverted lists (at the expense of recall).

CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 1000);

Vacuuming

Vacuuming can take a while for HNSW indexes. Speed it up by reindexing first.

REINDEX INDEX CONCURRENTLY index_name;
VACUUM table_name;