Performance
Tuning
Use a tool like PgTune to set initial values for Postgres server parameters. For instance, shared_buffers should typically be 25% of the server’s memory. You can find the config file with:
SHOW config_file;
And check individual settings with:
SHOW shared_buffers;
Be sure to restart Postgres for changes to take effect.
Loading
Use COPY for bulk loading data (example).
COPY items (embedding) FROM STDIN WITH (FORMAT BINARY);
Add any indexes after loading the initial data for best performance.
Indexing
See index build time for HNSW and IVFFlat.
In production environments, create indexes concurrently to avoid blocking writes.
CREATE INDEX CONCURRENTLY ...
Querying
Use EXPLAIN (ANALYZE, BUFFERS) to debug performance.
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;
Exact Search
To speed up queries without an index, increase max_parallel_workers_per_gather.
SET max_parallel_workers_per_gather = 4;
If vectors are normalized to length 1 (like OpenAI embeddings), use inner product for best performance.
SELECT * FROM items ORDER BY embedding <#> '[3,1,2]' LIMIT 5;
Approximate Search
To speed up queries with an IVFFlat index, increase the number of inverted lists (at the expense of recall).
CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 1000);
Vacuuming
Vacuuming can take a while for HNSW indexes. Speed it up by reindexing first.
REINDEX INDEX CONCURRENTLY index_name;
VACUUM table_name;