pgEdge RAG Server
A simple API server that performs Retrieval-Augmented Generation (RAG) over text content stored in a PostgreSQL database using pgvector.
What is RAG?
Retrieval-Augmented Generation combines information retrieval with generative AI to produce accurate, grounded responses. Instead of relying solely on an LLM's training data, RAG:
- Retrieves relevant documents from a knowledge base
- Provides those documents as context to the LLM
- Generates an answer based on the retrieved information
This approach reduces hallucinations and keeps responses current with your data.
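The three steps above can be sketched in a few lines of Python. This is a toy illustration only, not how the server is implemented: `retrieve` ranks documents by simple word overlap as a stand-in for vector search, and `llm` is a hypothetical callable that stands in for a real provider client.

```python
import re
from typing import Callable


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], top_n: int = 2) -> list[str]:
    """Step 1: rank documents by word overlap with the query.

    Toy stand-in for vector search; documents with no overlap are dropped.
    """
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    scored = [(score, doc) for score, doc in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_n]]


def build_prompt(query: str, context: list[str]) -> str:
    """Step 2: place the retrieved documents in the prompt as context."""
    context_block = "\n".join(f"- {doc}" for doc in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query}"
    )


def answer(query: str, documents: list[str], llm: Callable[[str], str]) -> str:
    """Step 3: ask the (hypothetical) LLM callable to answer from the context."""
    return llm(build_prompt(query, retrieve(query, documents)))
```

Passing `llm=lambda prompt: prompt` returns the assembled prompt unchanged, which is a convenient way to inspect what context the model would receive.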
Features
- Multiple Pipelines - Configure separate RAG pipelines for different data sources, each with its own database, embedding model, and LLM
- Hybrid Search - Combines vector similarity (semantic) and BM25 (keyword) search using Reciprocal Rank Fusion for better results
- Multiple LLM Providers - Support for OpenAI, Anthropic, Voyage, and Ollama
- Token Budget Management - Automatically manages context size to control LLM costs
- Streaming Responses - Optional real-time streaming via Server-Sent Events
- TLS Support - Built-in HTTPS support for production deployments
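To make the Hybrid Search item more concrete, here is a minimal sketch of Reciprocal Rank Fusion in Python. It is an illustration under assumptions, not the server's code: the function name is hypothetical, and the constant `k = 60` is the value commonly used in the RRF literature, which may differ from what the server uses.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in
    (rank starts at 1), and documents are ordered by their summed score.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first; ties keep first-seen order.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Example: one ranking from vector similarity, one from BM25.
vector_hits = ["a", "b", "c"]
bm25_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)  # ['b', 'a', 'd', 'c']
```

Documents that rank well in both lists ("a" and "b") rise to the top, while documents found by only one method still appear lower in the fused result.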
Quick Start
Prerequisites
- Go 1.22 or later
- PostgreSQL with pgvector extension
- API keys for your chosen LLM providers
Installation
# Clone the repository
git clone https://github.com/pgEdge/pgedge-rag-server.git
cd pgedge-rag-server
# Build the binary
make build
# Run the server
./bin/pgedge-rag-server -config config.yaml
Basic Configuration
Create a config.yaml file:
pipelines:
  - name: "my-docs"
    description: "Search my documentation"
    database:
      host: "localhost"
      database: "mydb"
    tables:
      - table: "documents"
        text_column: "content"
        vector_column: "embedding"
    embedding_llm:
      provider: "openai"
      model: "text-embedding-3-small"
    rag_llm:
      provider: "openai"
      model: "gpt-4o-mini"
Set your API key:
export OPENAI_API_KEY="sk-..."
Query the Server
# List available pipelines
curl http://localhost:8080/v1/pipelines
# Ask a question
curl -X POST http://localhost:8080/v1/pipelines/my-docs \
  -H "Content-Type: application/json" \
  -d '{"query": "How do I get started?"}'
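The same query can be sent from Python using only the standard library. This is a minimal sketch: the URL path and the `query` field come from the curl example above, while the helper names `build_query_request` and `ask` are hypothetical. The response schema is not described here, so `ask` assumes the server replies with a JSON body and returns it decoded as-is.

```python
import json
import urllib.request
from urllib.parse import quote


def build_query_request(base_url: str, pipeline: str, query: str) -> urllib.request.Request:
    """Build the POST request equivalent to the curl example."""
    url = f"{base_url.rstrip('/')}/v1/pipelines/{quote(pipeline, safe='')}"
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url: str, pipeline: str, query: str, timeout: float = 30.0):
    """Send the query and return the decoded JSON response.

    Assumes a JSON response body; raises urllib.error.HTTPError
    on a non-2xx status.
    """
    request = build_query_request(base_url, pipeline, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the server from the Quick Start running, `ask("http://localhost:8080", "my-docs", "How do I get started?")` sends the same request as the curl command.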
Documentation
- Configuration Reference - Complete configuration options
- API Reference - REST API documentation
- Architecture - How the server works internally
Development
# Run all checks (format, lint, test, build)
make all
# Run tests only
make test
# Run linter only
make lint
# Format code
make fmt
License
See LICENSE.md for license information.