Docker Deployment

This guide explains how to deploy pgEdge RAG Server using Docker and Docker Compose.

Prerequisites

Before deploying with Docker, ensure you have:

Docker Engine 20.10 or later
Docker Compose V2 or later
API keys for your chosen LLM providers (see Managing API Keys)
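
The version requirements above can be checked from a script before you deploy. The following is a minimal sketch and not part of the repository: version_ge is a hypothetical helper that compares dotted numeric versions, and the commented usage lines assume the docker CLI is on your PATH.

```shell
# version_ge A B: succeed when dotted numeric version A is >= B.
# Hypothetical helper for illustration; suffixes such as "-rc1" are not handled.
version_ge() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        fa=${a%%.*}
        fb=${b%%.*}
        if [ "${fa:-0}" -gt "${fb:-0}" ]; then return 0; fi
        if [ "${fa:-0}" -lt "${fb:-0}" ]; then return 1; fi
        # Drop the field just compared; an exhausted version counts as zeros.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
}

# Usage against a running Docker installation:
#   version_ge "$(docker version --format '{{.Server.Version}}')" 20.10
#   version_ge "$(docker compose version --short)" 2.0
```

Each call returns a zero exit status when the installed version meets the minimum, so the checks can be chained with && ahead of docker compose up.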

Quick Start

1. Clone the Repository

git clone https://github.com/pgedge/pgedge-rag-server.git
cd pgedge-rag-server

2. Configure Environment Variables

Copy the example environment file and configure your API keys:

cp docker.env.example .env

Edit the .env file and add your API keys:

# Required: Add your API keys
OPENAI_API_KEY=sk-your-openai-key-here
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key-here

# Optional: Customize ports and credentials
POSTGRES_PASSWORD=your-secure-password
RAG_SERVER_PORT=8080
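
A quick way to catch a forgotten key is to scan the .env file before starting the services. The sketch below is an illustration under stated assumptions: check_env and env_value are hypothetical helpers, the file is assumed to use plain KEY=VALUE lines, and a value is treated as unset when it is empty or still contains the placeholder text from the example above.

```shell
# env_value FILE KEY: print the last value assigned to KEY in a KEY=VALUE file.
env_value() {
    sed -n "s/^$2=//p" "$1" | tail -n 1
}

# check_env FILE KEY...: fail if any KEY is missing, empty, or still a placeholder.
check_env() {
    file=$1
    shift
    status=0
    for key in "$@"; do
        value=$(env_value "$file" "$key")
        case $value in
            ''|*your-*key-here*)
                echo "$key is not set in $file" >&2
                status=1 ;;
        esac
    done
    return $status
}

# Usage: list only the keys for the providers you actually use.
#   check_env .env OPENAI_API_KEY ANTHROPIC_API_KEY
```

The key names are passed as arguments because only the providers referenced in your configuration need a key.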

3. Configure the RAG Server

The repository includes a sample configuration file, pgedge-rag-server.yaml. Review and customize it for your needs:

Update database credentials if you changed them in .env
Configure your pipelines with appropriate tables and columns
Select your preferred embedding and LLM models

See Creating a Configuration File for detailed configuration options.

4. Start the Services

docker compose up -d

This command will:

Pull the PostgreSQL with pgvector image
Build the RAG server Docker image
Start both services
Initialize the database with pgvector extension and sample schema

5. Verify the Deployment

Check that the services are running:

docker compose ps

Test the RAG server (replace 8080 if you changed RAG_SERVER_PORT in .env):

curl http://localhost:8080/v1/pipelines
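
The server may need a few seconds after docker compose up -d before it accepts requests, so a scripted check usually retries. The following sketch shows one way to do that; retry is a hypothetical helper, and the attempt count and delay in the usage lines are arbitrary starting values.

```shell
# retry ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds,
# sleeping DELAY seconds between tries, and fail after ATTEMPTS tries.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Usage: wait up to about a minute for each service.
#   retry 30 2 docker compose exec -T postgres pg_isready -U postgres
#   retry 30 2 curl -fsS http://localhost:8080/v1/pipelines
```

The -f flag makes curl return a non-zero status on HTTP errors, which is what lets retry tell a failing response from a successful one.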

Using Pre-built Images

Instead of building locally, you can use pre-built images from GitHub Container Registry:

docker pull ghcr.io/pgedge/rag-server:latest

Update your docker-compose.yml to use the pre-built image:

services:
    rag-server:
        image: ghcr.io/pgedge/rag-server:latest
        # Remove the 'build' section

Database Initialization

The included init-db.sql script automatically:

Enables the pgvector extension
Creates a sample documents table with vector columns
Creates indexes for vector similarity search
Creates indexes for BM25 text search (hybrid mode)

Customizing the Schema

To customize the database schema:

Edit init-db.sql to match your data structure
Adjust vector dimensions based on your embedding model:
OpenAI text-embedding-3-small: 1536 dimensions
OpenAI text-embedding-3-large: 3072 dimensions
Voyage AI models: 1024 or 1536 dimensions
Update pgedge-rag-server.yaml to reference your table and column names
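Recreate the services: docker compose down && docker compose up -d. The PostgreSQL image runs its initialization scripts only when the data directory is empty, so, assuming init-db.sql is applied through that mechanism, edits to it take effect only on a fresh volume. If the database was already initialized, either apply the changes manually with psql or remove the volume with docker compose down -v, which deletes all stored data.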
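
The dimension of the vector column has to match the model exactly, so it can help to derive the column definition from the model name instead of typing the number by hand. This is a minimal sketch: embedding_dim and vector_column are hypothetical helpers, only the two OpenAI models listed above are covered, and the default column name embedding is an assumption rather than something taken from init-db.sql.

```shell
# embedding_dim MODEL: print the vector size for a model listed in this guide.
embedding_dim() {
    case $1 in
        text-embedding-3-small) echo 1536 ;;
        text-embedding-3-large) echo 3072 ;;
        *) echo "unknown model: $1 (check the provider's documentation)" >&2
           return 1 ;;
    esac
}

# vector_column MODEL [NAME]: print a pgvector column definition.
# The default column name "embedding" is an assumption for illustration.
vector_column() {
    dim=$(embedding_dim "$1") || return 1
    printf '%s vector(%s)\n' "${2:-embedding}" "$dim"
}

# Usage:
#   vector_column text-embedding-3-small
```

The printed definition can be pasted into the CREATE TABLE statement in init-db.sql. Models not covered by the lookup fail loudly rather than guessing a size.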

Populating Your Database

After starting the services, populate your database with content:

# Connect to the PostgreSQL container
docker compose exec postgres psql -U postgres -d ragdb

-- At the psql prompt, insert sample documents (adjust columns to match your schema)
INSERT INTO documents (content, title, source) VALUES
('Your document content here', 'Document Title', 'source-name');

For production use, you'll typically:

Generate embeddings using your embedding model
Insert both the text content and embeddings into the database
Ensure the vector dimensions match your embedding model
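
The three steps above can be sketched as a small shell pipeline. Treat this as an illustration under stated assumptions, not the project's ingestion tooling: sql_quote, insert_stmt, and embed are hypothetical helpers, the vector column is assumed to be named embedding, embed calls the OpenAI embeddings endpoint and needs curl, jq, and OPENAI_API_KEY in the environment, and the model must match the dimension declared in your schema.

```shell
# sql_quote TEXT: print TEXT as a single-quoted SQL string literal.
sql_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

# insert_stmt CONTENT TITLE SOURCE VECTOR: print an INSERT statement.
# VECTOR is a pgvector text literal such as [0.1,0.2,0.3].
# The "embedding" column name is an assumption; adjust it to your schema.
insert_stmt() {
    printf 'INSERT INTO documents (content, title, source, embedding) VALUES (%s, %s, %s, %s);\n' \
        "$(sql_quote "$1")" "$(sql_quote "$2")" "$(sql_quote "$3")" "$(sql_quote "$4")"
}

# embed TEXT: request an embedding from the OpenAI API and print it as a
# compact JSON array, which pgvector also accepts as a vector literal.
embed() {
    jq -n --arg input "$1" '{model: "text-embedding-3-small", input: $input}' |
        curl -fsS https://api.openai.com/v1/embeddings \
            -H "Authorization: Bearer $OPENAI_API_KEY" \
            -H 'Content-Type: application/json' \
            -d @- |
        jq -c '.data[0].embedding'
}

# Usage: embed one document and pipe the statement into psql.
#   text='Your document content here'
#   insert_stmt "$text" 'Document Title' 'source-name' "$(embed "$text")" |
#       docker compose exec -T postgres psql -U postgres -d ragdb
```

Quoting is handled by doubling single quotes, which is sufficient for literals under PostgreSQL's default settings. For bulk loads, a client library with parameterized queries is usually a better fit than generated SQL text.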

Managing the Deployment

View Logs

# All services
docker compose logs -f

# Specific service
docker compose logs -f rag-server
docker compose logs -f postgres

Stop Services

docker compose stop

Restart Services

docker compose restart

Remove Services and Data

# Stop and remove containers (preserves data volumes)
docker compose down

# Remove everything including data volumes
docker compose down -v

Update the RAG Server

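To update to a new version when you run the pre-built image:

# Pull the latest image
docker compose pull rag-server

# Recreate the service with the new image
docker compose up -d rag-server

If you build the image locally instead, pull the updated source and rebuild: git pull && docker compose up -d --build rag-server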

Production Considerations

For production deployments, consider:

Security

Never commit .env files to version control
Use strong passwords for PostgreSQL
Enable TLS/HTTPS for the RAG server (see Configuration)
Restrict network access, for example by keeping PostgreSQL on an internal Docker network and publishing only the ports you need
Use secrets management (Docker Secrets, Kubernetes Secrets, etc.)
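
The first item in the list above can be verified mechanically. The following sketch assumes the project directory is a Git working tree; env_is_ignored is a hypothetical helper built on git check-ignore.

```shell
# env_is_ignored [DIR]: succeed when DIR/.env is covered by a gitignore rule.
env_is_ignored() {
    git -C "${1:-.}" check-ignore -q .env
}

# Usage: add the rule if it is missing, then restrict the file to its owner.
#   env_is_ignored || echo '.env' >> .gitignore
#   chmod 600 .env
```

A file that is already tracked stays tracked even after an ignore rule is added, so a previously committed .env also needs to be removed from the index and its keys rotated.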

Data Persistence

The Docker Compose setup stores PostgreSQL data in a named Docker volume:

volumes:
    postgres_data:
        driver: local

For production:

Use named volumes or bind mounts to specific host paths
Implement regular backup strategies
Consider using managed PostgreSQL services
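
As a starting point for the backup item above, a logical dump can be taken from the running container with pg_dump. This is a minimal sketch, not a complete backup strategy: backup_name and backup_db are hypothetical helpers, and the service name postgres, user postgres, and database ragdb are taken from the commands used elsewhere in this guide.

```shell
# backup_name DB [STAMP]: print a dump filename; STAMP defaults to the current UTC time.
backup_name() {
    printf '%s-%s.sql\n' "$1" "${2:-$(date -u +%Y%m%dT%H%M%SZ)}"
}

# backup_db [DB] [DIR]: write a plain-SQL dump of DB into DIR and print its path.
backup_db() {
    db=${1:-ragdb}
    dir=${2:-.}
    out="$dir/$(backup_name "$db")"
    # -T disables TTY allocation so the dump can be redirected cleanly.
    if docker compose exec -T postgres pg_dump -U postgres "$db" > "$out"; then
        echo "$out"
    else
        rm -f "$out"   # do not leave a truncated dump behind
        return 1
    fi
}

# Usage:
#   backup_db ragdb ./backups
# Restore into an empty database:
#   docker compose exec -T postgres psql -U postgres -d ragdb < ./backups/FILE.sql
```

Timestamped filenames keep successive dumps from overwriting each other. Scheduling, retention, and off-host storage are left to whatever tooling you already use.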

Resource Limits

Add resource constraints to your docker-compose.yml:

services:
    rag-server:
        deploy:
            resources:
                limits:
                    cpus: '2'
                    memory: 2G
                reservations:
                    cpus: '1'
                    memory: 1G

High Availability

For high availability:

Deploy multiple RAG server instances behind a load balancer
Use PostgreSQL replication for database redundancy
Consider orchestration platforms like Kubernetes

Troubleshooting

Service Won't Start

Check logs for errors:

docker compose logs rag-server

Common issues:

Missing or invalid API keys in .env
Configuration file syntax errors
Port conflicts (8080 or 5432 already in use)

Cannot Connect to PostgreSQL

Verify the database is ready:

docker compose exec postgres pg_isready -U postgres

If the database isn't ready, wait a few moments for initialization to complete.

pgvector Extension Not Found

Ensure you're using the pgvector/pgvector Docker image, which includes the extension pre-installed.

Configuration Changes Not Applied

After modifying pgedge-rag-server.yaml:

docker compose restart rag-server

The configuration file is mounted into the container read-only, so edits made on the host are picked up when the service restarts; no image rebuild is needed.