cache_query
Store a query result with its vector embedding in the cache.
Signature
semantic_cache.cache_query(
    query_text text,
    query_embedding text,
    result_data jsonb,
    ttl_seconds integer DEFAULT 3600,
    tags text[] DEFAULT NULL
) RETURNS bigint
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| query_text | text | required | The original query text (for reference and pattern matching) |
| query_embedding | text | required | Vector embedding as text (e.g., '[0.1, 0.2, ...]') |
| result_data | jsonb | required | The query result to cache (must be JSONB) |
| ttl_seconds | integer | 3600 | Time-to-live in seconds (NULL for no expiration) |
| tags | text[] | NULL | Optional tags for organization and bulk invalidation |
Returns
- bigint: The ID of the newly cached entry
Description
This function stores a query result along with its vector embedding for future semantic retrieval. The embedding should be generated by your preferred embedding model (OpenAI, Cohere, Sentence Transformers, etc.) before calling this function.
Behavior
- Automatically calculates result size in bytes
- Sets creation and expiration timestamps
- Generates a query hash for duplicate detection
- Initializes access counters (access_count = 0)
- Updates cache metadata statistics
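The query hash used for duplicate detection is computed server-side. As an illustration of the idea only, a client-side sketch might hash whitespace- and case-normalized query text (the extension's actual hashing scheme may differ):

```python
import hashlib

def query_hash(query_text: str) -> str:
    """Illustrative analogue of duplicate-detection hashing: trivially
    different spellings of the same query map to the same key."""
    normalized = " ".join(query_text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Two formattings of the same query produce the same hash
h1 = query_hash("SELECT COUNT(*) FROM users  WHERE active = true")
h2 = query_hash("select count(*) from users where active = true")
```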
Vector Dimensions
The embedding must match the configured vector dimension:
SELECT semantic_cache.get_vector_dimension(); -- Check current dimension
If dimensions don't match, you'll get an error. To change dimensions, see Configuration.
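A client-side guard can catch a dimension mismatch before the round trip to the database. A minimal sketch, assuming the embedding arrives as a plain Python list (the function name and error text here are illustrative, not part of the extension):

```python
def validate_embedding(embedding, expected_dim):
    """Check the embedding length before calling cache_query, and render it
    in the '[x, y, ...]' text format the function expects."""
    if len(embedding) != expected_dim:
        raise ValueError(
            f"expected vector with dimension {expected_dim}, got {len(embedding)}"
        )
    return "[" + ", ".join(repr(x) for x in embedding) + "]"
```

Fetch the expected dimension once via `SELECT semantic_cache.get_vector_dimension()` and reuse it for every call.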
Examples
Basic Caching
-- Cache a simple query result
SELECT semantic_cache.cache_query(
    query_text := 'SELECT COUNT(*) FROM users WHERE active = true',
    query_embedding := '[0.123, 0.456, 0.789, ...]'::text,
    result_data := '{"count": 1542}'::jsonb,
    ttl_seconds := 3600, -- 1 hour
    tags := NULL
);
Returns: 1 (entry ID)
Caching with Tags
-- Cache with organizational tags
SELECT semantic_cache.cache_query(
    query_text := 'SELECT * FROM orders WHERE status = ''pending''',
    query_embedding := '[0.234, 0.567, ...]'::text,
    result_data := '{
        "total": 42,
        "orders": [
            {"id": 1, "amount": 100},
            {"id": 2, "amount": 200}
        ]
    }'::jsonb,
    ttl_seconds := 1800, -- 30 minutes
    tags := ARRAY['orders', 'pending', 'dashboard']
);
Tags enable:
- Organization by category
- Bulk invalidation: invalidate_cache(tag := 'dashboard')
- Analysis: SELECT * FROM semantic_cache.cache_by_tag
Caching LLM Responses
-- Cache an expensive AI API call
SELECT semantic_cache.cache_query(
    query_text := 'What was our Q4 2024 revenue and how does it compare to Q3?',
    query_embedding := '[0.891, 0.234, ...]'::text,
    result_data := '{
        "answer": "Q4 2024 revenue was $2.4M, up 23% from Q3 2024 at $1.95M. This represents strong growth driven by new customer acquisition.",
        "model": "gpt-4",
        "tokens": 87,
        "sources": ["financial_db", "sales_report_q4.pdf"],
        "confidence": 0.95
    }'::jsonb,
    ttl_seconds := 7200, -- 2 hours
    tags := ARRAY['llm', 'revenue', 'financial-analysis']
);
Caching with No Expiration
-- Cache static reference data (no expiration)
SELECT semantic_cache.cache_query(
    query_text := 'SELECT * FROM country_codes',
    query_embedding := '[0.445, 0.778, ...]'::text,
    result_data := '{"countries": [...]}'::jsonb,
    ttl_seconds := NULL, -- Never expires
    tags := ARRAY['reference-data', 'static']
);
Generating Embeddings (Example Pattern)
In production, you'd generate embeddings from an external service:
# Python example with the pre-1.0 OpenAI SDK
import openai
import psycopg2
from psycopg2.extras import Json

def cache_with_embedding(query_text, result_data):
    # Generate the embedding for the query text
    response = openai.Embedding.create(
        model="text-embedding-ada-002",
        input=query_text
    )
    embedding = response['data'][0]['embedding']
    embedding_str = str(embedding)  # '[0.1, 0.2, ...]' is valid vector text

    # Cache in PostgreSQL; Json() adapts the dict to jsonb
    conn = psycopg2.connect("dbname=mydb")
    try:
        with conn, conn.cursor() as cur:
            cur.execute("""
                SELECT semantic_cache.cache_query(
                    %s, %s, %s, 3600, ARRAY['api']
                )
            """, (query_text, embedding_str, Json(result_data)))
            entry_id = cur.fetchone()[0]
    finally:
        conn.close()
    return entry_id
Common Patterns
Dynamic TTL Based on Query Type
-- Shorter TTL for real-time data
SELECT semantic_cache.cache_query(
    'SELECT stock_price FROM stocks WHERE symbol = ''AAPL''',
    '[...]'::text,
    '{"price": 178.50}'::jsonb,
    300, -- 5 minutes
    ARRAY['stocks', 'realtime']
);

-- Longer TTL for analytical data
SELECT semantic_cache.cache_query(
    'SELECT AVG(price) FROM stocks WHERE date > NOW() - INTERVAL ''1 year''',
    '[...]'::text,
    '{"avg_price": 165.23}'::jsonb,
    86400, -- 24 hours
    ARRAY['analytics', 'historical']
);
Programmatic Caching Wrapper
CREATE OR REPLACE FUNCTION my_app.smart_cache(
    query_text TEXT,
    query_embedding TEXT,
    ttl INTEGER DEFAULT 3600
) RETURNS JSONB AS $$
DECLARE
    cached RECORD;
    result JSONB;
BEGIN
    -- Try the cache first
    SELECT * INTO cached FROM semantic_cache.get_cached_result(
        query_embedding, 0.95
    );
    IF cached.found THEN
        RETURN cached.result_data;
    END IF;

    -- Execute the actual query (simplified)
    result := '{"computed": true}'::jsonb;

    -- Cache the result
    PERFORM semantic_cache.cache_query(
        query_text, query_embedding, result, ttl, ARRAY['app']
    );
    RETURN result;
END;
$$ LANGUAGE plpgsql;
Performance Considerations
- Embedding Size: Larger dimensions (3072 vs 1536) increase storage and lookup time
- Result Size: Large JSONB results consume more memory; consider size limits
- TTL Strategy: Shorter TTLs mean more frequent evictions and cache misses
- Tags: Many tags per entry add minimal overhead
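To act on the result-size consideration above, the payload can be measured client-side before caching. A sketch with an illustrative 256 KB limit (the limit and function name are assumptions, not something the extension enforces):

```python
import json

MAX_RESULT_BYTES = 256 * 1024  # illustrative cap; tune for your workload

def result_size_ok(result_data):
    """Cheap client-side estimate of the JSONB payload size, mirroring
    the byte count that cache_query records server-side."""
    size = len(json.dumps(result_data, separators=(",", ":")).encode("utf-8"))
    return size <= MAX_RESULT_BYTES
```

Skip caching (or trim the result) when the check fails, rather than filling the cache with oversized entries.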
Error Conditions
Dimension Mismatch
-- Configured for 1536 dimensions
-- But passing 768-dimensional vector
ERROR: expected vector with dimension 1536, got 768
Solution: Match embedding dimension to configuration:
SELECT semantic_cache.get_vector_dimension(); -- Check
SELECT semantic_cache.set_vector_dimension(768); -- Change if needed
SELECT semantic_cache.rebuild_index(); -- Apply
Invalid Vector Format
-- Malformed vector string
ERROR: invalid input syntax for type vector
Solution: Ensure proper format: '[0.1, 0.2, 0.3]'
NULL Required Parameters
-- Missing required parameter
ERROR: null value in column "query_text" violates not-null constraint
Solution: Provide non-NULL values for query_text, query_embedding, and result_data.
See Also
- get_cached_result - Retrieve cached results
- invalidate_cache - Remove cached entries
- Configuration - Set vector dimensions
- Use Cases - Integration examples