Alerter Configuration

The pgEdge AI DBA Workbench Alerter supports configuration through YAML files and command-line flags.

Configuration Precedence

The alerter applies configuration settings in the following order; later sources override earlier ones:

1. Default values built into the application.
2. Configuration file settings (YAML format).
3. Command-line flag overrides.
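
As a minimal sketch of this layering, assume a configuration file that sets only the datastore host. The host name is hypothetical, and the mapping of the -db-host flag onto datastore.host is an assumption based on the flag's description rather than something this page states.

```yaml
# File layer: only the host is set.
datastore:
  # Hypothetical host; replaces the built-in default, localhost.
  host: db.internal.example.com
  # port is not set, so the built-in default (5432) still applies.
```

Under that assumption, starting the alerter with -db-host db.example.com would take precedence over the host value in the file, because command-line flags are applied last.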

Configuration File

The alerter searches for its configuration file in the following order:

1. The path specified via the -config flag.
2. The per-user config directory at ~/.config/pgedge/ai-dba-alerter.yaml on Linux (honouring $XDG_CONFIG_HOME), ~/Library/Application Support/pgedge/ai-dba-alerter.yaml on macOS, and %AppData%\pgedge\ai-dba-alerter.yaml on Windows.
3. /etc/pgedge/ai-dba-alerter.yaml (system-wide).

If -config is set and the file is missing, the alerter exits with an error. If -config is not set and none of the default locations contains a configuration file, the alerter silently falls back to its built-in defaults. The alerter no longer searches the binary directory or the current working directory. Discovery runs again on every SIGHUP reload, so a configuration file installed at a default location after startup is picked up on the next signal.

A complete example configuration file is available at ai-dba-alerter.yaml in the project repository.

Command-Line Flags

The alerter accepts the following command-line flags:

Flag	Description	Default
`-config`	Path to configuration file	Auto-detected
`-debug`	Enable debug logging	`false`
`-db-host`	Database host	None
`-db-port`	Database port	None
`-db-name`	Database name	None
`-db-user`	Database user	None
`-db-password`	Database password	None
`-db-sslmode`	Database SSL mode	None

In the following example, the alerter starts with debug logging and a custom configuration file:

./ai-dba-alerter -debug \
    -config /etc/ai-workbench/alerter.yaml

In the following example, the alerter connects to a specific database without a configuration file:

./ai-dba-alerter \
    -db-host db.example.com \
    -db-name ai_workbench \
    -db-user alerter \
    -db-password secret

Configuration File Reference

The configuration file uses YAML format. The following sections describe all available options.

Datastore Connection (`datastore`)

The datastore section configures the connection to the AI DBA Workbench PostgreSQL datastore.

Option	Type	Default	Description
`host`	string	`localhost`	Server hostname
`hostaddr`	string	None	Server IP address
`database`	string	`ai_workbench`	Database name
`username`	string	`postgres`	Database username
`password`	string	None	Database password
`password_file`	string	None	Path to password file
`port`	integer	`5432`	Server port
`sslmode`	string	`prefer`	SSL connection mode
`sslcert`	string	None	Client certificate path
`sslkey`	string	None	Client private key path
`sslrootcert`	string	None	CA certificate path

The sslmode option accepts the following values:

`disable` disables SSL encryption.
`allow` attempts non-SSL first and falls back to SSL.
`prefer` attempts SSL first and falls back to non-SSL.
`require` requires SSL without certificate verification.
`verify-ca` requires SSL and verifies the server certificate.
`verify-full` requires SSL and verifies the certificate and hostname.

In the following example, the datastore section configures a secure connection with certificate verification:

datastore:
  host: db.example.com
  database: ai_workbench
  username: ai_workbench
  password_file: /etc/ai-workbench/password.txt
  port: 5432
  sslmode: verify-full
  sslcert: /etc/ai-workbench/client-cert.pem
  sslkey: /etc/ai-workbench/client-key.pem
  sslrootcert: /etc/ai-workbench/ca-cert.pem

Connection Pool (`pool`)

The pool section configures the database connection pool.

Option	Type	Default	Description
`max_connections`	integer	`10`	Max concurrent connections
`max_idle_seconds`	integer	`300`	Idle connection timeout
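
In the following sketch, the pool section allows more concurrent connections and closes idle connections sooner than the defaults. The values are illustrative, not tuning recommendations.

```yaml
pool:
  max_connections: 20     # illustrative; the default is 10
  max_idle_seconds: 120   # illustrative; the default is 300
```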

Threshold Evaluation (`threshold`)

The threshold section configures threshold-based alert evaluation.

Option	Type	Default	Description
`evaluation_interval_seconds`	integer	`60`	Evaluation interval

Anomaly Detection (`anomaly`)

The anomaly section configures the tiered anomaly detection system.

Option	Type	Default	Description
`enabled`	boolean	`true`	Enable anomaly detection

Tier 1: Statistical Analysis

The anomaly.tier1 section configures z-score-based statistical detection.

Option	Type	Default	Description
`enabled`	boolean	`true`	Enable tier 1
`default_sensitivity`	float	`3.0`	Z-score threshold
`evaluation_interval_seconds`	integer	`60`	Evaluation interval
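
In the following sketch, the anomaly.tier1 section raises the z-score threshold above its default. The value is illustrative; the sketch assumes that a higher threshold means a metric must deviate further from its baseline before tier 1 flags it.

```yaml
anomaly:
  enabled: true
  tier1:
    enabled: true
    default_sensitivity: 4.0          # illustrative; the default is 3.0
    evaluation_interval_seconds: 60   # default value, shown for completeness
```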

Tier 1: Variance Floor, Warmup, and Z-Score Cap

Three additional blocks under anomaly.tier1 prevent the detector from firing on baselines that have not yet stabilised. These blocks are most relevant on young datastores; on a datastore with one or two days of metric history, a baseline's stored standard deviation can collapse far below the metric's natural variation, producing z-scores in the thousands. The variance floor and warmup gate suppress this failure mode, and the z-score cap acts as defence in depth.

The anomaly.tier1.max_z_score option clamps the absolute z-score symmetrically around zero before the sensitivity comparison. The default is 100.0; any genuine outlier sits well below this value, and the cap simply prevents a runaway divisor from generating multi-thousand-sigma scores. Setting max_z_score: 0 disables the cap.

Option	Type	Default	Description
`max_z_score`	float	`100.0`	Z-score clamp; `0` disables

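The anomaly.tier1.variance_floor block enforces a minimum divisor on the z-score calculation. The effective standard deviation is the larger of the raw stored value and a hybrid floor; the floor itself is the larger of relative_pct times the absolute baseline mean and absolute_floor. The relative term dominates for non-zero metrics, while the absolute term acts as a safety net when the mean approaches zero. For example, with the default settings and a baseline mean of 200, the floor is max(0.05 × 200, 0.001) = 10; a stored standard deviation of 0.2 is therefore replaced by 10 in the divisor, and, assuming the usual (value - mean) / stddev form, a reading of 210 scores z = 1 rather than z = 50. Setting both relative_pct: 0 and absolute_floor: 0 disables the floor entirely; the detector then falls back to the existing stddev == 0 guard.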

Option	Type	Default	Description
`relative_pct`	float	`0.05`	Floor as fraction of abs(mean)
`absolute_floor`	float	`0.001`	Absolute minimum stddev

The anomaly.tier1.warmup block suppresses detection for baselines that have not accumulated enough samples or enough wall-clock observation time. Each period_type (all, hourly, and daily) has its own pair of thresholds, and a baseline is considered warm only when both are met. The all baseline defaults require roughly one day of operation at the 60-second collection interval; the hourly and daily defaults require enough span for each time bucket to have been observed multiple times. Setting both min_samples: 0 and min_span_hours: 0 for a given period_type disables warmup suppression for that type.

Option	Type	Default	Description
`all.min_samples`	integer	`100`	Minimum sample count
`all.min_span_hours`	integer	`24`	Minimum span in hours
`hourly.min_samples`	integer	`5`	Minimum sample count
`hourly.min_span_hours`	integer	`120`	Minimum span in hours
`daily.min_samples`	integer	`3`	Minimum sample count
`daily.min_span_hours`	integer	`336`	Minimum span in hours

In the following example, the anomaly.tier1 section raises the variance floor above its defaults and doubles the all warmup window:

anomaly:
  tier1:
    max_z_score: 100.0
    variance_floor:
      relative_pct: 0.10
      absolute_floor: 0.005
    warmup:
      all:
        min_samples: 200
        min_span_hours: 48
      hourly:
        min_samples: 5
        min_span_hours: 120
      daily:
        min_samples: 3
        min_span_hours: 336
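
For comparison, the following sketch switches off the z-score cap, the variance floor, and warmup suppression for the all period type, using the zero values described above. It is likely suitable only for testing, and it assumes that the hourly and daily blocks, which are omitted, keep their built-in defaults.

```yaml
anomaly:
  tier1:
    max_z_score: 0          # 0 disables the z-score cap
    variance_floor:
      relative_pct: 0       # both values at 0 disable the floor
      absolute_floor: 0
    warmup:
      all:
        min_samples: 0      # both values at 0 disable warmup for "all"
        min_span_hours: 0
```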

Warmup suppressions are recorded at debug log level only; the detector does not write a candidate row or an alert when it skips a cold baseline. On the long-term test host, enable debug logging in the alerter and inspect recent suppressions with sudo journalctl -u ai-workbench-alerter.service --since "10 minutes ago". The log line names the connection, metric, period type, and sample count, which is enough to confirm whether a missing alert reflects warmup suppression or a genuinely quiet metric.

Tier 2: Embedding Similarity

The anomaly.tier2 section configures pgvector-based similarity search for pattern matching.

Option	Type	Default	Description
`enabled`	boolean	`true`	Enable tier 2
`suppression_threshold`	float	`0.85`	Suppression threshold
`similarity_threshold`	float	`0.3`	Similarity threshold

Tier 3: LLM Classification

The anomaly.tier3 section configures LLM-based classification for complex anomalies.

Option	Type	Default	Description
`enabled`	boolean	`true`	Enable tier 3
`timeout_seconds`	integer	`30`	LLM API timeout
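
In the following sketch, the anomaly section keeps tier 2 at its default thresholds and turns tier 3 off, for instance on a host where no reasoning provider is reachable. How the alerter treats anomalies that would otherwise reach tier 3 is not covered here.

```yaml
anomaly:
  tier2:
    enabled: true
    suppression_threshold: 0.85   # default value
    similarity_threshold: 0.3     # default value
  tier3:
    enabled: false                # skip LLM classification
    timeout_seconds: 30           # default value; unused while tier 3 is off
```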

Baseline Calculation (`baselines`)

The baselines section configures baseline metric calculation for anomaly detection.

Option	Type	Default	Description
`refresh_interval_seconds`	integer	`3600`	Refresh interval
`lookback_days`	integer	`7`	Historical lookback in days
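
In the following sketch, the baselines section refreshes baselines every 30 minutes and widens the lookback to two weeks. The values are illustrative, not recommendations.

```yaml
baselines:
  refresh_interval_seconds: 1800   # illustrative; the default is 3600
  lookback_days: 14                # illustrative; the default is 7
```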

Correlation (`correlation`)

The correlation section configures alert correlation across metrics.

Option	Type	Default	Description
`window_seconds`	integer	`120`	Correlation time window

LLM Providers (`llm`)

The llm section configures LLM providers for tier 3 anomaly detection and embedding generation.

Option	Type	Default	Description
`embedding_provider`	string	`ollama`	Embedding provider
`reasoning_provider`	string	`ollama`	Classification provider

Ollama Configuration

The llm.ollama section configures the local Ollama provider.

Option	Type	Default	Description
`base_url`	string	`http://localhost:11434`	Ollama server URL
`embedding_model`	string	`nomic-embed-text`	Embedding model
`reasoning_model`	string	`qwen2.5:7b-instruct`	Reasoning model
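
In the following sketch, the llm.ollama section points the alerter at an Ollama server on another machine while keeping the default models. The host name is hypothetical.

```yaml
llm:
  embedding_provider: ollama
  reasoning_provider: ollama
  ollama:
    # Hypothetical host; the default is http://localhost:11434.
    base_url: http://ollama.internal.example.com:11434
    embedding_model: nomic-embed-text
    reasoning_model: qwen2.5:7b-instruct
```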

OpenAI Configuration

The llm.openai section configures the OpenAI provider.

Option	Type	Default	Description
`api_key_file`	string	None	Path to API key file
`base_url`	string	`https://api.openai.com/v1`	OpenAI base URL
`embedding_model`	string	`text-embedding-3-small`	Embedding model
`reasoning_model`	string	`gpt-4o-mini`	Reasoning model

The openai provider works with any server that implements an OpenAI-compatible API. To use a local inference server, set base_url to that server's endpoint. The API key is optional when using a custom base URL.

The following local inference servers are compatible:

Docker Model Runner uses http://localhost:12434/engines/llama.cpp/v1 as the default endpoint.
llama.cpp uses http://localhost:8080/v1 as the default endpoint.
LM Studio uses http://localhost:1234/v1 as the default endpoint.
EXO uses http://localhost:52415/v1 as the default endpoint.

In the following example, the llm.openai section configures a local llama.cpp server:

llm:
  reasoning_provider: openai
  openai:
    base_url: http://localhost:8080/v1
    reasoning_model: my-local-model

Anthropic Configuration

The llm.anthropic section configures the Anthropic provider.

Option	Type	Default	Description
`api_key_file`	string	None	Path to API key file
`base_url`	string	`https://api.anthropic.com/v1`	Anthropic base URL
`reasoning_model`	string	`claude-3-5-haiku-20241022`	Reasoning model
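
In the following sketch, the llm section selects Anthropic for classification. The provider value anthropic is an assumption that mirrors the ollama and openai values used elsewhere on this page, and the key file path is hypothetical. Because the llm.anthropic section has no embedding model option, the embedding provider is left at its default.

```yaml
llm:
  reasoning_provider: anthropic   # assumed provider name
  anthropic:
    # Hypothetical path to a file that contains the API key.
    api_key_file: /etc/ai-workbench/anthropic-api-key.txt
    reasoning_model: claude-3-5-haiku-20241022
```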

Gemini Configuration

The llm.gemini section configures the Google Gemini provider.

Option	Type	Default	Description
`api_key_file`	string	None	Path to API key file
`base_url`	string	`https://generativelanguage.googleapis.com`	Gemini base URL
`reasoning_model`	string	`gemini-2.0-flash`	Reasoning model

Voyage Configuration

The llm.voyage section configures the Voyage provider for embeddings.

Option	Type	Default	Description
`api_key_file`	string	None	Path to API key file
`base_url`	string	`https://api.voyageai.com/v1/embeddings`	Voyage base URL
`embedding_model`	string	`voyage-3-lite`	Embedding model
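
In the following sketch, the llm section selects Voyage for embeddings and leaves the reasoning provider at its default. The provider value voyage is an assumption that mirrors the ollama and openai values used elsewhere on this page, and the key file path is hypothetical.

```yaml
llm:
  embedding_provider: voyage   # assumed provider name
  voyage:
    # Hypothetical path to a file that contains the API key.
    api_key_file: /etc/ai-workbench/voyage-api-key.txt
    embedding_model: voyage-3-lite
```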

Notifications (`notifications`)

The notifications section configures the notification delivery system for sending alerts through external channels.

Option	Type	Default	Description
`enabled`	boolean	`false`	Enable notifications
`secret_file`	string	None	Path to server secret
`process_interval_seconds`	integer	`30`	Processing interval
`reminder_check_interval_minutes`	integer	`60`	Reminder check interval
`max_retry_attempts`	integer	`3`	Max retry attempts
`retry_backoff_minutes`	list	`[5, 15, 60]`	Retry backoff schedule
`http_timeout_seconds`	integer	`30`	HTTP request timeout
`http_max_idle_conns`	integer	`10`	Max idle HTTP connections

The secret_file option specifies a file containing the same plain-text secret that the server component uses. The alerter needs this secret to decrypt the notification channel credentials that the server encrypted, so the alerter and the server must reference the same secret file.

In the following example, the notifications section enables delivery with custom retry settings:

notifications:
  enabled: true
  secret_file: /etc/ai-workbench/ai-dba-server.secret
  process_interval_seconds: 30
  max_retry_attempts: 5
  retry_backoff_minutes: [5, 15, 30]
  http_timeout_seconds: 60

API Key Management

Store API keys for LLM providers in files with restricted permissions. The alerter reads API keys from the paths specified in the api_key_file options.

In the following example, the commands create an API key file with secure permissions:

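# Create the file with owner-only permissions before writing the key,
# so the key is never readable by other users.
install -m 600 /dev/null /etc/ai-workbench/openai-api-key.txt
echo "sk-your-api-key-here" \
    > /etc/ai-workbench/openai-api-key.txt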

The corresponding configuration references the key file:

llm:
  embedding_provider: openai
  reasoning_provider: openai
  openai:
    api_key_file: /etc/ai-workbench/openai-api-key.txt
    embedding_model: text-embedding-3-small
    reasoning_model: gpt-4o

Signal Handling

The alerter responds to Unix signals for operational control:

SIGINT and SIGTERM trigger a graceful shutdown.
SIGHUP reloads the configuration file without restarting the process.

In the following example, the kill command reloads the configuration:

kill -HUP $(pidof ai-dba-alerter)