HTTP + Ollama Chatbot
A chatbot example that uses Ollama for local LLM inference and connects to the pgEdge Natural Language Agent via HTTP to answer questions about your PostgreSQL database using natural language.
Overview
This example demonstrates:
- Connecting to the MCP server via HTTP REST API
- Using Ollama for local, privacy-focused LLM inference
- Tool discovery and execution via HTTP
- Multi-turn conversations with context preservation
Prerequisites
- Python 3.10+
- Ollama installed (available from https://ollama.com/)
- The pgEdge Natural Language Agent binary built and available
- A PostgreSQL database (connections can be configured via environment variable or through the chatbot)
Installation
1. Install and configure Ollama:
First, install Ollama from ollama.com, then pull a model:
ollama pull gpt-oss:20b
Start the Ollama service (it usually runs automatically after installation):
ollama serve
2. Navigate to the example directory:
cd examples/http-ollama-chatbot
3. Install Python dependencies:
pip install -r requirements.txt
The requirements.txt contains:
httpx==0.27.0
ollama==0.3.3
4. Build and start the MCP server in HTTP mode:
cd ../..
go build -o bin/pgedge-postgres-mcp ./cmd/pgedge-pg-mcp-svr
# Start the server in HTTP mode (with authentication disabled for this example)
./bin/pgedge-postgres-mcp -http -addr :8080 -no-auth
Note: This example uses -no-auth for simplicity. For production deployments, use token-based or user-based authentication (see the Authentication Guide).
5. Set up environment (in a new terminal):
# Required: MCP server URL
export PGEDGE_MCP_SERVER_URL="http://localhost:8080/mcp/v1"
# Optional: Ollama configuration (defaults shown)
export OLLAMA_BASE_URL="http://localhost:11434"
export OLLAMA_MODEL="gpt-oss:20b"
Note: The MCP server must be configured with database connection information before starting. See "Need to configure a database connection?" in the Troubleshooting section below.
Running the Chatbot
python chatbot.py
You should see:
Using Ollama model: gpt-oss:20b
MCP Server: http://localhost:8080/mcp/v1
✓ Connected to pgEdge Natural Language Agent
PostgreSQL Chatbot (type 'quit' or 'exit' to stop)
============================================================
Example questions:
- How many tables do I have?
- Show me the 10 most recent orders
- What's the total revenue from last month?
- Which customers have placed more than 5 orders?
Example Interaction
You: What tables are in my database?
→ Executing tool: get_schema_info
Assistant: You have 8 tables in your database:
- users
- products
- orders
- order_items
- categories
- reviews
- inventory
- shipments
You: quit
Goodbye!
Code Walkthrough
1. HTTP Connection with JSON-RPC
The chatbot connects to the MCP server via HTTP using the JSON-RPC 2.0 protocol:
async def _jsonrpc_request(self, method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
"""Send a JSON-RPC request to the MCP server."""
request = {
"jsonrpc": "2.0",
"id": self._get_next_id(),
"method": method,
"params": params or {}
}
response = await self.http_client.post(
self.mcp_server_url, # http://localhost:8080/mcp/v1
json=request,
headers={"Content-Type": "application/json"}
)
response.raise_for_status()
return response.json().get("result", {})
async def list_available_tools(self) -> List[Dict[str, Any]]:
"""Retrieve available tools from the MCP server."""
result = await self._jsonrpc_request("tools/list")
return result.get("tools", [])
async def call_tool(self, tool_name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
"""Call a tool on the MCP server."""
return await self._jsonrpc_request("tools/call", {
"name": tool_name,
"arguments": arguments
})
This uses HTTP with JSON-RPC to communicate with the MCP server instead of stdio, allowing the server and client to run on different machines.
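To see the transport in isolation, here is a minimal standalone sketch of the same tools/list call using only httpx. It mirrors the curl command in the Troubleshooting section and assumes the server is running locally with -no-auth; it is not the exact code from chatbot.py:

import asyncio
import httpx

async def main():
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "http://localhost:8080/mcp/v1",
            json={"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}},
            headers={"Content-Type": "application/json"},
        )
        response.raise_for_status()
        # Print each discovered tool's name and description
        for tool in response.json().get("result", {}).get("tools", []):
            print(f"{tool['name']}: {tool.get('description', '')}")

asyncio.run(main())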
2. Ollama Integration
The chatbot uses Ollama for local LLM inference:
from ollama import AsyncClient
self.ollama = AsyncClient(host=self.ollama_base_url)
# Later, to chat:
response = await self.ollama.chat(
model=self.ollama_model,
messages=full_messages
)
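The full_messages value passed above combines the system prompt with the accumulated conversation history, which is how multi-turn context is preserved. A minimal sketch follows; self.messages is an assumed attribute holding the history, not necessarily the exact name used in chatbot.py:

# Assumed attribute: self.messages holds the running conversation history
self.messages.append({"role": "user", "content": user_input})
full_messages = [{"role": "system", "content": system_message}] + self.messages

response = await self.ollama.chat(
    model=self.ollama_model,
    messages=full_messages
)
# Remember the assistant's reply so follow-up questions have context
self.messages.append({"role": "assistant", "content": response["message"]["content"]})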
Rather than relying on native tool calling, which only some Ollama models support and which differs from Claude's built-in tool use, the chatbot uses a prompt engineering approach:
3. Tool Calling via Prompts
The chatbot formats tools for Ollama and instructs it to respond with JSON when it needs to use a tool:
system_message = f"""You are a helpful PostgreSQL database assistant. You have access to the following tools:
{tools_context}
IMPORTANT INSTRUCTIONS:
1. When you need to use a tool, respond with ONLY a JSON object - no other text before or after:
{{
"tool": "tool_name",
"arguments": {{
"param1": "value1",
"param2": "value2"
}}
}}
2. After calling a tool, you will receive actual results from the database.
3. You MUST base your response ONLY on the actual tool results provided - never make up or guess data.
4. If you receive tool results, format them clearly for the user.
5. Only use tools when necessary to answer the user's question.
"""
The client then attempts to parse the response as JSON. If successful and the JSON contains a "tool" field, it executes the tool. Otherwise, it treats the response as the final answer to the user.
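That parsing step can be as simple as the following sketch (a hypothetical helper, not the exact code in chatbot.py):

import json

def parse_tool_call(response_text: str):
    """Return (tool_name, arguments) if the model emitted a tool-call
    JSON object; return None if the text is a plain final answer."""
    try:
        data = json.loads(response_text.strip())
    except json.JSONDecodeError:
        return None  # not JSON, so treat it as the final answer
    if isinstance(data, dict) and "tool" in data:
        return data["tool"], data.get("arguments", {})
    return None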
4. Agentic Loop
Similar to the stdio example, but adapted for Ollama (a sketch follows the list):
- Send user query to Ollama with tool definitions in system message
- Check if Ollama responded with a tool call (JSON format)
- Execute the tool via HTTP
- Return results to Ollama
- Repeat until Ollama provides a final answer
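Put together, the loop might look like the sketch below. It reuses the hypothetical parse_tool_call helper from above and assumes json is imported; method and attribute names are illustrative, not the exact code from chatbot.py. The max_iterations cap guards against models that keep emitting tool calls without ever answering:

async def process_query(self, user_input: str, max_iterations: int = 5) -> str:
    self.messages.append({"role": "user", "content": user_input})
    for _ in range(max_iterations):
        response = await self.ollama.chat(
            model=self.ollama_model,
            messages=[{"role": "system", "content": self.system_message}] + self.messages,
        )
        text = response["message"]["content"]
        tool_call = parse_tool_call(text)
        if tool_call is None:
            # Plain text means the model produced its final answer
            self.messages.append({"role": "assistant", "content": text})
            return text
        tool_name, arguments = tool_call
        result = await self.call_tool(tool_name, arguments)  # HTTP round trip
        # Feed the actual results back so the model answers from real data
        self.messages.append({"role": "assistant", "content": text})
        self.messages.append(
            {"role": "user", "content": f"Tool results: {json.dumps(result)}"}
        )
    return "Stopped after reaching the tool-call limit without a final answer."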
Environment Variables
Client Environment Variables:
- PGEDGE_MCP_SERVER_URL (required): URL of the MCP server running in HTTP mode
- OLLAMA_BASE_URL (optional): URL of the Ollama service (default: http://localhost:11434)
- OLLAMA_MODEL (optional): Ollama model to use (default: gpt-oss:20b)
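On the client side, these can be read once at startup. A minimal sketch, matching the defaults above and the error message shown under Troubleshooting:

import os

# Required: fail fast with the same message shown under Troubleshooting
mcp_server_url = os.environ.get("PGEDGE_MCP_SERVER_URL")
if not mcp_server_url:
    raise SystemExit("PGEDGE_MCP_SERVER_URL environment variable is required")

# Optional, with the documented defaults
ollama_base_url = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434")
ollama_model = os.environ.get("OLLAMA_MODEL", "gpt-oss:20b")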
Server Database Configuration: See "Need to configure a database connection?" in the Troubleshooting section below.
Available Ollama Models
You can use any model that Ollama supports. Popular choices:
- gpt-oss:20b (default, recommended)
- llama3
- llama3.1
- mistral
- mixtral
- codellama
To pull a different model:
ollama pull mistral
export OLLAMA_MODEL="mistral"
Troubleshooting
"PGEDGE_MCP_SERVER_URL environment variable is required":
Make sure the MCP server is running in HTTP mode and that the URL is exported in the chatbot's environment:
./bin/pgedge-postgres-mcp -http -addr :8080 -no-auth
export PGEDGE_MCP_SERVER_URL="http://localhost:8080/mcp/v1"
"Connection refused" or similar HTTP errors:
- Check that the MCP server is running:
curl -X POST http://localhost:8080/mcp/v1 -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'
- Verify the URL matches your PGEDGE_MCP_SERVER_URL (it should end with /mcp/v1)
- Check server logs for errors
"Ollama not found" or model errors:
- Install Ollama from https://ollama.com/
- Pull a model:
ollama pull gpt-oss:20b
- Verify Ollama is running:
curl http://localhost:11434/api/tags
Advantages of HTTP + Ollama
- Privacy: All LLM inference happens locally - no data sent to external APIs
- No API costs: Free to use after initial hardware investment
- Distributed architecture: Server and client can run on different machines
- Offline capable: Works without internet connection
- Scalable: Can distribute load across multiple HTTP servers
Disadvantages
- Performance: Depends on local hardware (may be slower than cloud GPUs)
- Model quality: Ollama models may not match Claude's performance for complex tasks
- Complexity: Requires running both MCP server in HTTP mode and Ollama
- Tool calling: Relies on prompt engineering rather than native tool support, which means models may not consistently follow the JSON format or may hallucinate responses
Source Code
The complete source code for this example is available in the examples/http-ollama-chatbot directory.
Next Steps
- Try the Stdio + Anthropic Claude Chatbot for a simpler setup with a more powerful LLM
- Explore authentication options for securing your HTTP endpoint
- Learn about available tools you can use