
cache_query

Store a query result with its vector embedding in the cache.

Signature

semantic_cache.cache_query(
    query_text text,
    query_embedding text,
    result_data jsonb,
    ttl_seconds integer DEFAULT 3600,
    tags text[] DEFAULT NULL
) RETURNS bigint

Parameters

Parameter        Type      Default   Description
query_text       text      required  The original query text (for reference and pattern matching)
query_embedding  text      required  Vector embedding as text (e.g., '[0.1, 0.2, ...]')
result_data      jsonb     required  The query result to cache (must be JSONB)
ttl_seconds      integer   3600      Time-to-live in seconds (NULL for no expiration)
tags             text[]    NULL      Optional tags for organization and bulk invalidation

Returns

  • bigint: The ID of the newly cached entry

Description

This function stores a query result along with its vector embedding for future semantic retrieval. The embedding should be generated by your preferred embedding model (OpenAI, Cohere, Sentence Transformers, etc.) before calling this function.

Behavior

  • Automatically calculates result size in bytes
  • Sets creation and expiration timestamps
  • Generates a query hash for duplicate detection
  • Initializes access counters (access_count = 0)
  • Updates cache metadata statistics

Vector Dimensions

The embedding must match the configured vector dimension:

SELECT semantic_cache.get_vector_dimension();  -- Check current dimension

If the dimensions don't match, the call fails with an error (see Error Conditions below). To change dimensions, see Configuration.
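A client-side pre-check can catch a mismatch before the database round trip. A minimal Python sketch (`parse_embedding` and `check_dimension` are illustrative helpers, not part of the extension):

```python
def parse_embedding(embedding_str: str) -> list:
    # Parse the '[0.1, 0.2, ...]' text format into a list of floats.
    return [float(x) for x in embedding_str.strip("[] \n").split(",")]

def check_dimension(embedding_str: str, expected_dim: int) -> None:
    # Raise early if the embedding would be rejected by the server.
    dim = len(parse_embedding(embedding_str))
    if dim != expected_dim:
        raise ValueError(
            f"expected vector with dimension {expected_dim}, got {dim}"
        )

check_dimension("[0.1, 0.2, 0.3]", 3)  # passes silently
```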

Examples

Basic Caching

-- Cache a simple query result
SELECT semantic_cache.cache_query(
    query_text := 'SELECT COUNT(*) FROM users WHERE active = true',
    query_embedding := '[0.123, 0.456, 0.789, ...]'::text,
    result_data := '{"count": 1542}'::jsonb,
    ttl_seconds := 3600,  -- 1 hour
    tags := NULL
);

Returns: 1 (entry ID)

Caching with Tags

-- Cache with organizational tags
SELECT semantic_cache.cache_query(
    query_text := 'SELECT * FROM orders WHERE status = ''pending''',
    query_embedding := '[0.234, 0.567, ...]'::text,
    result_data := '{
        "total": 42,
        "orders": [
            {"id": 1, "amount": 100},
            {"id": 2, "amount": 200}
        ]
    }'::jsonb,
    ttl_seconds := 1800,  -- 30 minutes
    tags := ARRAY['orders', 'pending', 'dashboard']
);

Tags enable:

  • Organization by category
  • Bulk invalidation: invalidate_cache(tag := 'dashboard')
  • Analysis: SELECT * FROM semantic_cache.cache_by_tag
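The matching rule behind tag-based bulk invalidation can be illustrated in plain Python: an entry is affected when the tag appears anywhere in its tags array, and untagged entries are never matched (a sketch of the assumed semantics, not the extension's code):

```python
# Hypothetical cache entries with their tags arrays
entries = [
    {"id": 1, "tags": ["orders", "dashboard"]},
    {"id": 2, "tags": ["llm", "revenue"]},
    {"id": 3, "tags": None},  # untagged entry, never matched
]

def invalidate_by_tag(entries, tag):
    # Return the ids of entries whose tags array contains the tag.
    return [e["id"] for e in entries if e["tags"] and tag in e["tags"]]
```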

Caching LLM Responses

-- Cache an expensive AI API call
SELECT semantic_cache.cache_query(
    query_text := 'What was our Q4 2024 revenue and how does it compare to Q3?',
    query_embedding := '[0.891, 0.234, ...]'::text,
    result_data := '{
        "answer": "Q4 2024 revenue was $2.4M, up 23% from Q3 2024 at $1.95M. This represents strong growth driven by new customer acquisition.",
        "model": "gpt-4",
        "tokens": 87,
        "sources": ["financial_db", "sales_report_q4.pdf"],
        "confidence": 0.95
    }'::jsonb,
    ttl_seconds := 7200,  -- 2 hours
    tags := ARRAY['llm', 'revenue', 'financial-analysis']
);

Caching with No Expiration

-- Cache static reference data (no expiration)
SELECT semantic_cache.cache_query(
    query_text := 'SELECT * FROM country_codes',
    query_embedding := '[0.445, 0.778, ...]'::text,
    result_data := '{"countries": [...]}'::jsonb,
    ttl_seconds := NULL,  -- Never expires
    tags := ARRAY['reference-data', 'static']
);

Generating Embeddings (Example Pattern)

In production, you'd generate embeddings from an external service:

# Python example with OpenAI (openai>=1.0 client)
import openai
import psycopg2
from psycopg2.extras import Json

def cache_with_embedding(query_text, result_data):
    # Generate the embedding from the query text
    client = openai.OpenAI()
    response = client.embeddings.create(
        model="text-embedding-ada-002",
        input=query_text,
    )
    embedding = response.data[0].embedding
    embedding_str = str(embedding)  # '[0.1, 0.2, ...]' text format

    # Cache in PostgreSQL; Json() adapts the dict to jsonb
    conn = psycopg2.connect("dbname=mydb")
    try:
        with conn, conn.cursor() as cur:
            cur.execute("""
                SELECT semantic_cache.cache_query(
                    %s, %s, %s, 3600, ARRAY['api']
                )
            """, (query_text, embedding_str, Json(result_data)))
            entry_id = cur.fetchone()[0]
    finally:
        conn.close()
    return entry_id

Common Patterns

Dynamic TTL Based on Query Type

-- Shorter TTL for real-time data
SELECT semantic_cache.cache_query(
    'SELECT stock_price FROM stocks WHERE symbol = ''AAPL''',
    '[...]'::text,
    '{"price": 178.50}'::jsonb,
    300,  -- 5 minutes
    ARRAY['stocks', 'realtime']
);

-- Longer TTL for analytical data
SELECT semantic_cache.cache_query(
    'SELECT AVG(price) FROM stocks WHERE date > NOW() - INTERVAL ''1 year''',
    '[...]'::text,
    '{"avg_price": 165.23}'::jsonb,
    86400,  -- 24 hours
    ARRAY['analytics', 'historical']
);

Programmatic Caching Wrapper

CREATE OR REPLACE FUNCTION my_app.smart_cache(
    query_text TEXT,
    query_embedding TEXT,
    ttl INTEGER DEFAULT 3600
) RETURNS JSONB AS $$
DECLARE
    cached RECORD;
    result JSONB;
BEGIN
    -- Try cache first
    SELECT * INTO cached FROM semantic_cache.get_cached_result(
        query_embedding, 0.95
    );

    IF cached.found THEN
        RETURN cached.result_data;
    END IF;

    -- Execute actual query (simplified)
    result := '{"computed": true}'::jsonb;

    -- Cache result
    PERFORM semantic_cache.cache_query(
        query_text, query_embedding, result, ttl, ARRAY['app']
    );

    RETURN result;
END;
$$ LANGUAGE plpgsql;
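The same cache-aside flow can be sketched client-side. The ToyCache below is a pure in-memory stand-in that illustrates threshold-based semantic matching (0.95 cosine similarity, as in the wrapper above); it is not how the extension stores or indexes data:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

class ToyCache:
    """In-memory stand-in for the extension, for illustration only."""

    def __init__(self, threshold=0.95):
        self.entries = []  # list of (embedding, result) pairs
        self.threshold = threshold

    def get(self, emb):
        # Return the first result whose embedding is similar enough.
        for stored, result in self.entries:
            if cosine(stored, emb) >= self.threshold:
                return result
        return None

    def put(self, emb, result):
        self.entries.append((emb, result))

cache = ToyCache()
cache.put([1.0, 0.0], {"count": 42})
hit = cache.get([0.99, 0.01])   # nearly identical direction -> hit
miss = cache.get([0.0, 1.0])    # orthogonal -> miss
```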

Performance Considerations

  • Embedding Size: Larger dimensions (3072 vs 1536) increase storage and lookup time
  • Result Size: Large JSONB results consume more memory; consider size limits
  • TTL Strategy: Shorter TTLs mean more frequent evictions and cache misses
  • Tags: Many tags per entry add minimal overhead

Error Conditions

Dimension Mismatch

-- Configured for 1536 dimensions
-- But passing 768-dimensional vector
ERROR:  expected vector with dimension 1536, got 768

Solution: Match embedding dimension to configuration:

SELECT semantic_cache.get_vector_dimension();  -- Check
SELECT semantic_cache.set_vector_dimension(768);  -- Change if needed
SELECT semantic_cache.rebuild_index();  -- Apply

Invalid Vector Format

-- Malformed vector string
ERROR:  invalid input syntax for type vector

Solution: Ensure proper format: '[0.1, 0.2, 0.3]'
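When building the string from a list of floats in application code, a small helper keeps the format correct (illustrative, not part of the extension):

```python
def to_vector_text(embedding):
    # Render a Python list of numbers in the '[x, y, z]' text format
    # accepted by the vector type.
    return "[" + ", ".join(repr(float(x)) for x in embedding) + "]"

s = to_vector_text([0.1, 0.2, 0.3])
```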

NULL Required Parameters

-- Missing required parameter
ERROR:  null value in column "query_text" violates not-null constraint

See Also