bakery-ia/docs/SIGNOZ_COMPLETE_CONFIGURATION_GUIDE.md
Urtzi Alfaro 7ef85c1188 Add comprehensive SigNoz configuration guide and monitoring setup
Documentation includes:

1. OpAMP Root Cause Analysis:
   - Explains OpAMP (Open Agent Management Protocol) functionality
   - Documents how OpAMP was overwriting config with "nop" receivers
   - Provides two solution paths:
     * Option 1: Disable OpAMP (current solution)
     * Option 2: Fix OpAMP server configuration (recommended for prod)
   - References: SigNoz architecture and OTel collector docs

2. Database Receivers Configuration:
   - PostgreSQL: Complete setup for 21 database instances
     * SQL commands to create monitoring users
     * Proper pg_monitor role permissions
     * Environment variable configuration
   - Redis: Configuration with/without TLS
     * Uses existing redis-secrets
     * Optional TLS certificate generation
   - RabbitMQ: Management API setup
     * Uses existing rabbitmq-secrets
     * Port 15672 management interface

3. Automation Script:
   - create-pg-monitoring-users.sh
   - Creates monitoring user in all 21 PostgreSQL databases
   - Generates secure random password
   - Verifies permissions
   - Provides next-step commands

Resources Referenced:
- PostgreSQL: https://signoz.io/docs/integrations/postgresql/
- Redis: https://signoz.io/blog/redis-opentelemetry/
- RabbitMQ: https://signoz.io/blog/opentelemetry-rabbitmq-metrics-monitoring/
- OpAMP: https://signoz.io/docs/operate/configuration/
- OTel Config: https://signoz.io/docs/opentelemetry-collection-agents/opentelemetry-collector/configuration/

Current Infrastructure Discovered:
- 21 PostgreSQL databases (all services have dedicated DBs)
- 1 Redis instance (password in redis-secrets)
- 1 RabbitMQ instance (credentials in rabbitmq-secrets)

Next Implementation Steps:
1. Run create-pg-monitoring-users.sh script
2. Create Kubernetes secrets for monitoring credentials
3. Update signoz-values-dev.yaml with receivers
4. Enable receivers in metrics pipeline
5. Test and verify metric collection

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-09 12:15:58 +01:00


SigNoz Complete Configuration Guide

Root Cause Analysis and Solutions

This document provides a comprehensive analysis of the SigNoz telemetry collection issues and the proper configuration for all receivers.


Problem 1: OpAMP Configuration Corruption

Root Cause

What is OpAMP? OpAMP (Open Agent Management Protocol) is a protocol for remotely managing the configuration of OpenTelemetry Collectors. The SigNoz backend runs an OpAMP server that dynamically configures log pipelines in the SigNoz OTel collector.

The Issue:

  • OpAMP was successfully connecting to the SigNoz backend and receiving remote configuration
  • The remote configuration contained only nop (no-operation) receivers and exporters
  • This overwrote the local collector configuration at runtime
  • Result: The collector appeared healthy but couldn't receive or export any data

Why This Happened:

  1. The SigNoz backend's OpAMP server was pushing an invalid/incomplete configuration
  2. The collector's --manager-config flag pointed to OpAMP configuration
  3. OpAMP's --copy-path=/var/tmp/collector-config.yaml overwrote the good config
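One way to confirm this failure mode is to inspect the file that OpAMP writes and look for nop components. The sketch below is a heuristic, not a YAML parser: it reads a config on stdin and succeeds if any line is a `nop:` key. The function name `config_has_nop` is hypothetical, and the usage line assumes the collector image ships `cat`, which may not hold for minimal images.

```shell
#!/bin/bash
# Heuristic check: succeeds if the collector config on stdin declares a
# "nop" component at any indentation. It does not parse YAML.
config_has_nop() {
  grep -Eq '^[[:space:]]*nop:'
}

# Usage (assumes `cat` exists in the collector image):
#   kubectl exec -n bakery-ia deployment/signoz-otel-collector -- \
#     cat /var/tmp/collector-config.yaml | config_has_nop \
#     && echo "effective config contains nop components"
```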

Solution Options

Option 1: Disable OpAMP (Current Solution)

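Because OpAMP was pushing a bad configuration and a working static configuration exists, we disabled it by replacing the collector's arguments so that --manager-config and --copy-path are no longer passed: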

kubectl patch deployment -n bakery-ia signoz-otel-collector --type=json -p='[
  {
    "op": "replace",
    "path": "/spec/template/spec/containers/0/args",
    "value": [
      "--config=/conf/otel-collector-config.yaml",
      "--feature-gates=-pkg.translator.prometheus.NormalizeName"
    ]
  }
]'

Important: This patch must be applied after every helm install or helm upgrade because the Helm chart doesn't support disabling OpAMP via values.
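Because the patch is easy to forget, it may help to wrap the upgrade and the patch in one function. This is a minimal sketch: the release name `signoz` and the chart reference `signoz/signoz` are assumptions, so adjust them to your installation. The values path is the one used later in this guide.

```shell
#!/bin/bash
# Hypothetical wrapper: RELEASE and CHART are assumptions, not taken from this repo.
RELEASE="${RELEASE:-signoz}"
CHART="${CHART:-signoz/signoz}"
VALUES="${VALUES:-infrastructure/helm/signoz-values-dev.yaml}"

# Same JSON patch as above: start the collector with the static config only.
OPAMP_PATCH='[{"op":"replace","path":"/spec/template/spec/containers/0/args","value":["--config=/conf/otel-collector-config.yaml","--feature-gates=-pkg.translator.prometheus.NormalizeName"]}]'

upgrade_signoz() {
  # Stop at the first failing step so a broken upgrade is not patched over.
  helm upgrade --install "$RELEASE" "$CHART" -n bakery-ia -f "$VALUES" || return 1
  kubectl patch deployment -n bakery-ia signoz-otel-collector \
    --type=json -p="$OPAMP_PATCH" || return 1
  kubectl rollout status -n bakery-ia deployment/signoz-otel-collector --timeout=180s
}
```

Calling `upgrade_signoz` then replaces a bare `helm upgrade` in day-to-day use.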

Option 2: Fix OpAMP Server Configuration (Recommended for Production)

To use OpAMP properly:

  1. Check SigNoz Backend Configuration:

    • Verify the SigNoz service is properly configured to serve OpAMP
    • Check logs: kubectl logs -n bakery-ia statefulset/signoz
    • Look for OpAMP-related errors
  2. Configure OpAMP Server Settings: According to SigNoz configuration documentation, set these environment variables in the SigNoz statefulset:

    signoz:
      env:
        OPAMP_ENABLED: "true"
        OPAMP_SERVER_ENDPOINT: "ws://signoz:4320/v1/opamp"
    
  3. Verify OpAMP Configuration File:

    kubectl get configmap -n bakery-ia signoz-otel-collector -o yaml
    

    Should contain:

    otel-collector-opamp-config.yaml: |
      server_endpoint: "ws://signoz:4320/v1/opamp"
    
  4. Monitor OpAMP Status:

    kubectl logs -n bakery-ia deployment/signoz-otel-collector | grep opamp
    

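References

  • OpAMP: https://signoz.io/docs/operate/configuration/
  • OTel Config: https://signoz.io/docs/opentelemetry-collection-agents/opentelemetry-collector/configuration/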


Problem 2: Database and Infrastructure Receivers Configuration

Overview

You have the following infrastructure requiring monitoring:

  • 21 PostgreSQL databases (auth, inventory, orders, forecasting, production, etc.)
  • 1 Redis instance (caching layer)
  • 1 RabbitMQ instance (message queue)

All receivers were disabled because they lacked proper credentials and configuration.


PostgreSQL Receiver Configuration

Prerequisites

Per the SigNoz PostgreSQL Integration Guide, each PostgreSQL instance needs a monitoring user with read access to the statistics views.

Step 1: Create Monitoring Users

For each PostgreSQL database, create a dedicated monitoring user:

For PostgreSQL 10 and newer:

CREATE USER monitoring WITH PASSWORD 'your_secure_password';
GRANT pg_monitor TO monitoring;
GRANT SELECT ON pg_stat_database TO monitoring;

For PostgreSQL 9.6 (the pg_monitor role was introduced in version 10):

CREATE USER monitoring WITH PASSWORD 'your_secure_password';
GRANT SELECT ON pg_stat_database TO monitoring;
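The version split above can be automated by branching on `server_version_num`, which PostgreSQL reports as an integer (for example 90624 for 9.6.24 and 150004 for 15.4). The following is a sketch with a hypothetical helper name, `monitoring_sql`. It only prints the statements and leaves running them to you.

```shell
#!/bin/bash
# Prints the SQL for the monitoring user, chosen by server_version_num.
# $1: server_version_num, $2: password (must not contain single quotes).
monitoring_sql() {
  local version_num="$1" password="$2"
  echo "CREATE USER monitoring WITH PASSWORD '${password}';"
  # pg_monitor only exists on PostgreSQL 10 and newer.
  if [ "$version_num" -ge 100000 ]; then
    echo "GRANT pg_monitor TO monitoring;"
  fi
  echo "GRANT SELECT ON pg_stat_database TO monitoring;"
}

# Usage: fetch the version first, then generate the statements.
#   v="$(kubectl exec -n bakery-ia deployment/auth-db -- \
#        psql -U postgres -tAc 'SHOW server_version_num')"
#   monitoring_sql "$v" "$MONITORING_PASSWORD"
```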

Step 2: Create Monitoring User for All Databases

Run this script to create monitoring users in all PostgreSQL databases:

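#!/bin/bash
# File: infrastructure/scripts/create-pg-monitoring-users.sh
# Creates a "monitoring" role in each PostgreSQL deployment listed below.
# NOTE: the list has 19 entries while this guide counts 21 databases;
# add any missing deployments before running.

DATABASES=(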
  "auth-db"
  "inventory-db"
  "orders-db"
  "ai-insights-db"
  "alert-processor-db"
  "demo-session-db"
  "distribution-db"
  "external-db"
  "forecasting-db"
  "notification-db"
  "orchestrator-db"
  "pos-db"
  "procurement-db"
  "production-db"
  "recipes-db"
  "sales-db"
  "suppliers-db"
  "tenant-db"
  "training-db"
)

MONITORING_PASSWORD="monitoring_secure_pass_$(openssl rand -hex 16)"

echo "Creating monitoring users with password: $MONITORING_PASSWORD"
echo "Save this password for your SigNoz configuration!"

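for db in "${DATABASES[@]}"; do
  echo "Processing $db..."
  # Each -c runs as its own transaction, so the GRANTs still apply when
  # CREATE USER fails because the role already exists. An existing role
  # keeps its old password in that case.
  kubectl exec -n bakery-ia "deployment/$db" -- psql -U postgres \
    -c "CREATE USER monitoring WITH PASSWORD '$MONITORING_PASSWORD';" \
    -c "GRANT pg_monitor TO monitoring;" \
    -c "GRANT SELECT ON pg_stat_database TO monitoring;" \
    2>&1 | grep -v "already exists" || true
done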

echo ""
echo "Monitoring users created!"
echo "Password: $MONITORING_PASSWORD"
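Rather than maintaining the database list by hand, the deployments could be discovered from the cluster. This sketch assumes every PostgreSQL deployment follows the `<service>-db` naming seen in the script. The function name `list_db_deployments` is hypothetical.

```shell
#!/bin/bash
# Lists deployments in a namespace whose names end in "-db".
# Assumption: all PostgreSQL deployments follow the "<service>-db" convention.
list_db_deployments() {
  local ns="${1:-bakery-ia}"
  # `-o name` prints e.g. "deployment.apps/auth-db"; strip the resource prefix.
  kubectl get deployments -n "$ns" -o name | sed 's|^.*/||' | grep -- '-db$'
}

# Usage (bash 4+): fill the array and compare the count with what you expect.
#   mapfile -t DATABASES < <(list_db_deployments)
#   echo "Found ${#DATABASES[@]} database deployments"
```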

Step 3: Store Credentials in Kubernetes Secret

kubectl create secret generic -n bakery-ia postgres-monitoring-secrets \
  --from-literal=POSTGRES_MONITOR_USER=monitoring \
  --from-literal=POSTGRES_MONITOR_PASSWORD='<password-from-script>'
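`kubectl create secret` fails if the secret already exists, which gets in the way when the password is rotated. A create-or-update variant, sketched below with a hypothetical function name, renders the manifest client-side and pipes it to `kubectl apply`.

```shell
#!/bin/bash
# Creates the monitoring secret, or updates it if it already exists.
# $1: the monitoring password produced by the script above.
apply_pg_monitoring_secret() {
  local password="$1"
  kubectl create secret generic -n bakery-ia postgres-monitoring-secrets \
    --from-literal=POSTGRES_MONITOR_USER=monitoring \
    --from-literal=POSTGRES_MONITOR_PASSWORD="$password" \
    --dry-run=client -o yaml | kubectl apply -f -
}

# Usage:
#   apply_pg_monitoring_secret "$MONITORING_PASSWORD"
```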

Step 4: Configure PostgreSQL Receivers in SigNoz

Update infrastructure/helm/signoz-values-dev.yaml:

otelCollector:
  config:
    receivers:
      # PostgreSQL receivers for database metrics
      postgresql/auth:
        endpoint: auth-db-service.bakery-ia:5432
        username: ${env:POSTGRES_MONITOR_USER}
        password: ${env:POSTGRES_MONITOR_PASSWORD}
        databases:
          - auth_db
        collection_interval: 60s
        tls:
          insecure: true  # Set to false if using TLS

      postgresql/inventory:
        endpoint: inventory-db-service.bakery-ia:5432
        username: ${env:POSTGRES_MONITOR_USER}
        password: ${env:POSTGRES_MONITOR_PASSWORD}
        databases:
          - inventory_db
        collection_interval: 60s
        tls:
          insecure: true

      # Add all other databases...
      postgresql/orders:
        endpoint: orders-db-service.bakery-ia:5432
        username: ${env:POSTGRES_MONITOR_USER}
        password: ${env:POSTGRES_MONITOR_PASSWORD}
        databases:
          - orders_db
        collection_interval: 60s
        tls:
          insecure: true

    # Update metrics pipeline
    service:
      pipelines:
        metrics:
          receivers:
            - otlp
            - postgresql/auth
            - postgresql/inventory
            - postgresql/orders
            # Add all PostgreSQL receivers
          processors: [memory_limiter, batch, resourcedetection]
          exporters: [signozclickhousemetrics]
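Writing the remaining receivers by hand is repetitive, so the stanzas could be generated instead. The sketch below follows the pattern of the three receivers above and makes two assumptions that should be checked against the cluster: each Service is named `<name>-db-service`, and each database is named `<name>_db` with hyphens turned into underscores. The function name `pg_receiver_stanza` is hypothetical.

```shell
#!/bin/bash
# Emits one postgresql receiver stanza, indented to sit under
# otelCollector.config.receivers in the Helm values file.
# Assumptions: Service "<name>-db-service", database "<name>_db".
pg_receiver_stanza() {
  local name="$1"
  local dbname="${name//-/_}_db"
  cat <<EOF
      postgresql/${name}:
        endpoint: ${name}-db-service.bakery-ia:5432
        username: \${env:POSTGRES_MONITOR_USER}
        password: \${env:POSTGRES_MONITOR_PASSWORD}
        databases:
          - ${dbname}
        collection_interval: 60s
        tls:
          insecure: true
EOF
}

# Usage:
#   for name in auth inventory orders; do pg_receiver_stanza "$name"; done
```

Each generated receiver name still has to be added to the metrics pipeline's receivers list.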

Step 5: Add Environment Variables to OTel Collector Deployment

The Helm chart needs to inject these environment variables. Modify your Helm values:

otelCollector:
  env:
    - name: POSTGRES_MONITOR_USER
      valueFrom:
        secretKeyRef:
          name: postgres-monitoring-secrets
          key: POSTGRES_MONITOR_USER
    - name: POSTGRES_MONITOR_PASSWORD
      valueFrom:
        secretKeyRef:
          name: postgres-monitoring-secrets
          key: POSTGRES_MONITOR_PASSWORD

References

  • PostgreSQL: https://signoz.io/docs/integrations/postgresql/


Redis Receiver Configuration

Current Infrastructure

  • Service: redis-service.bakery-ia:6379
  • Password: Available in secret redis-secrets
  • TLS: Currently not configured

Step 1: Check if Redis Requires TLS

kubectl exec -n bakery-ia deployment/redis -- redis-cli CONFIG GET tls-port

If TLS is not configured (tls-port is 0 or empty), keep insecure: true in the receiver's tls block so the collector connects in plaintext.
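The check above can be scripted. When its output is piped, `redis-cli` prints the reply as two plain lines, the parameter name and then its value. The sketch below reads that reply on stdin and succeeds only if the value is a non-zero port. The function name `redis_tls_enabled` is hypothetical.

```shell
#!/bin/bash
# Reads the output of `redis-cli CONFIG GET tls-port` on stdin.
# Succeeds if the second line (the value) is non-empty and not 0.
redis_tls_enabled() {
  local port
  port="$(sed -n '2p' | tr -d '[:space:]')"
  [ -n "$port" ] && [ "$port" != "0" ]
}

# Usage:
#   kubectl exec -n bakery-ia deployment/redis -- redis-cli CONFIG GET tls-port \
#     | redis_tls_enabled && echo "TLS is enabled" || echo "TLS is not enabled"
```

If Redis requires a password, the `redis-cli` call needs `-a` as well, otherwise the reply is an authentication error rather than the two expected lines.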

Step 2: Configure Redis Receiver

Update infrastructure/helm/signoz-values-dev.yaml:

otelCollector:
  config:
    receivers:
      # Redis receiver for cache metrics
      redis:
        endpoint: redis-service.bakery-ia:6379
        password: ${env:REDIS_PASSWORD}
        collection_interval: 60s
        transport: tcp
        tls:
          insecure: true  # Change to false if using TLS
        metrics:
          redis.maxmemory:
            enabled: true
          redis.cmd.latency:
            enabled: true

    # service belongs under config, next to receivers
    service:
      pipelines:
        metrics:
          receivers: [otlp, redis, ...]

  env:
    - name: REDIS_PASSWORD
      valueFrom:
        secretKeyRef:
          name: redis-secrets
          key: REDIS_PASSWORD

Optional: Configure TLS for Redis

If you want to enable TLS for Redis (recommended for production):

  1. Generate TLS Certificates:
# Create CA
openssl genrsa -out ca-key.pem 4096
openssl req -new -x509 -days 3650 -key ca-key.pem -out ca-cert.pem

# Create Redis server certificate
openssl genrsa -out redis-key.pem 4096
openssl req -new -key redis-key.pem -out redis.csr
openssl x509 -req -days 3650 -in redis.csr -CA ca-cert.pem -CAkey ca-key.pem -CAcreateserial -out redis-cert.pem

# Create Kubernetes secret
kubectl create secret generic -n bakery-ia redis-tls \
  --from-file=ca-cert.pem=ca-cert.pem \
  --from-file=redis-cert.pem=redis-cert.pem \
  --from-file=redis-key.pem=redis-key.pem
  2. Mount Certificates in OTel Collector:
otelCollector:
  volumes:
    - name: redis-tls
      secret:
        secretName: redis-tls

  volumeMounts:
    - name: redis-tls
      mountPath: /etc/redis-tls
      readOnly: true

  config:
    receivers:
      redis:
        tls:
          insecure: false
          cert_file: /etc/redis-tls/redis-cert.pem
          key_file: /etc/redis-tls/redis-key.pem
          ca_file: /etc/redis-tls/ca-cert.pem
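One caveat with the certificate steps above: Go-based clients such as the OTel Collector verify the server hostname against the certificate's subjectAltName, not its Common Name, so a server certificate without a SAN is likely to be rejected. The sketch below is a non-interactive variant that adds a SAN and verifies the chain. The function name, the CA subject, and the `san.ext` file name are all hypothetical.

```shell
#!/bin/bash
# Generates a CA and a Redis server certificate with a subjectAltName.
# $1: output directory, $2: DNS name clients connect to, $3: RSA key size.
make_redis_certs() {
  local dir="$1" host="$2" bits="${3:-4096}"
  openssl genrsa -out "$dir/ca-key.pem" "$bits" 2>/dev/null || return 1
  openssl req -new -x509 -days 3650 -key "$dir/ca-key.pem" \
    -subj "/CN=bakery-ia-redis-ca" -out "$dir/ca-cert.pem" || return 1
  openssl genrsa -out "$dir/redis-key.pem" "$bits" 2>/dev/null || return 1
  openssl req -new -key "$dir/redis-key.pem" -subj "/CN=$host" \
    -out "$dir/redis.csr" || return 1
  # The SAN must match the hostname in the receiver's endpoint.
  printf 'subjectAltName=DNS:%s\n' "$host" > "$dir/san.ext"
  openssl x509 -req -days 3650 -in "$dir/redis.csr" \
    -CA "$dir/ca-cert.pem" -CAkey "$dir/ca-key.pem" -CAcreateserial \
    -extfile "$dir/san.ext" -out "$dir/redis-cert.pem" 2>/dev/null || return 1
  # Prints "<path>: OK" when the server certificate chains to the CA.
  openssl verify -CAfile "$dir/ca-cert.pem" "$dir/redis-cert.pem"
}

# Usage:
#   make_redis_certs . redis-service.bakery-ia
```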

References

  • Redis: https://signoz.io/blog/redis-opentelemetry/


RabbitMQ Receiver Configuration

Current Infrastructure

  • Service: rabbitmq-service.bakery-ia
    • Port 5672: AMQP protocol
    • Port 15672: Management API (required for metrics)
  • Credentials:
    • Username: bakery
    • Password: Available in secret rabbitmq-secrets

Step 1: Enable RabbitMQ Management Plugin

kubectl exec -n bakery-ia deployment/rabbitmq -- rabbitmq-plugins enable rabbitmq_management

Step 2: Verify Management API Access

kubectl port-forward -n bakery-ia svc/rabbitmq-service 15672:15672
# In browser: http://localhost:15672
# Login with: bakery / <password>
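The same check can be done from the command line, which is closer to what the receiver does: it polls the management API over HTTP with basic auth. This sketch assumes the port-forward above is running and uses a hypothetical function name.

```shell
#!/bin/bash
# Queries the management API overview endpoint with basic auth.
# -f makes curl exit non-zero on HTTP errors such as 401.
# $1: user, $2: password, $3: base URL (defaults to the port-forward).
check_rabbitmq_api() {
  local user="$1" password="$2" base="${3:-http://localhost:15672}"
  curl -fsS -u "${user}:${password}" "${base}/api/overview"
}

# Usage:
#   check_rabbitmq_api bakery "$RABBITMQ_PASSWORD" > /dev/null && echo "API reachable"
```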

Step 3: Configure RabbitMQ Receiver

Update infrastructure/helm/signoz-values-dev.yaml:

otelCollector:
  config:
    receivers:
      # RabbitMQ receiver via management API
      rabbitmq:
        endpoint: http://rabbitmq-service.bakery-ia:15672
        username: ${env:RABBITMQ_USER}
        password: ${env:RABBITMQ_PASSWORD}
        collection_interval: 30s

    # service belongs under config, next to receivers
    service:
      pipelines:
        metrics:
          receivers: [otlp, rabbitmq, ...]

  env:
    - name: RABBITMQ_USER
      valueFrom:
        secretKeyRef:
          name: rabbitmq-secrets
          key: RABBITMQ_USER
    - name: RABBITMQ_PASSWORD
      valueFrom:
        secretKeyRef:
          name: rabbitmq-secrets
          key: RABBITMQ_PASSWORD

References

  • RabbitMQ: https://signoz.io/blog/opentelemetry-rabbitmq-metrics-monitoring/


Complete Implementation Plan

Phase 1: Enable Basic Infrastructure Monitoring (No TLS)

  1. Create PostgreSQL monitoring users (all 21 databases)
  2. Create Kubernetes secrets for credentials
  3. Update Helm values with receiver configurations
  4. Configure environment variables in OTel Collector
  5. Apply Helm upgrade and OpAMP patch
  6. Verify metrics collection

Phase 2: Enable TLS (Optional, Production-Ready)

  1. Generate TLS certificates for Redis
  2. Configure Redis TLS in deployment
  3. Update Redis receiver with TLS settings
  4. Configure PostgreSQL TLS if required
  5. Test and verify secure connections

Phase 3: Enable OpAMP (Optional, Advanced)

  1. Fix SigNoz OpAMP server configuration
  2. Test remote configuration in dev environment
  3. Gradually enable OpAMP after validation
  4. Monitor for configuration corruption

Verification Commands

Check Collector Metrics

kubectl port-forward -n bakery-ia svc/signoz-otel-collector 8888:8888
curl http://localhost:8888/metrics | grep "otelcol_receiver_accepted"
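The raw grep output is hard to scan once many receivers are enabled. The sketch below sums accepted metric points per receiver from the Prometheus text output. It assumes each series carries a `receiver="..."` label and ends with its value, and it matches the metric name with or without a `_total` suffix. The function name is hypothetical.

```shell
#!/bin/bash
# Reads Prometheus text-format metrics on stdin and prints
# "<receiver> <accepted metric points>" per receiver, sorted by name.
sum_accepted_metric_points() {
  awk '
    /^otelcol_receiver_accepted_metric_points/ {
      if (match($0, /receiver="[^"]*"/)) {
        # Skip the 10 characters of receiver=" and drop the closing quote.
        name = substr($0, RSTART + 10, RLENGTH - 11)
        total[name] += $NF
      }
    }
    END { for (name in total) printf "%s %d\n", name, total[name] }
  ' | sort
}

# Usage:
#   curl -s http://localhost:8888/metrics | sum_accepted_metric_points
```

A receiver that is configured but missing from this output has not delivered any metric points yet.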

Check Database Connectivity

kubectl exec -n bakery-ia deployment/signoz-otel-collector -- \
  /bin/sh -c "nc -zv auth-db-service 5432"

Check RabbitMQ Management API

kubectl exec -n bakery-ia deployment/signoz-otel-collector -- \
  /bin/sh -c "wget -O- http://rabbitmq-service:15672/api/overview"

Check Redis Connectivity

kubectl exec -n bakery-ia deployment/signoz-otel-collector -- \
  /bin/sh -c "nc -zv redis-service 6379"

Troubleshooting

PostgreSQL Connection Refused

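  • Verify monitoring user exists: kubectl exec -n bakery-ia deployment/auth-db -- psql -U postgres -c "\du"
  • Check that the user can log in: kubectl exec -n bakery-ia deployment/auth-db -- psql -U monitoring -d postgres -c "SELECT 1" (without -d, psql tries to connect to a database named monitoring)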
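Logging in only shows that the user exists. To confirm that it also holds the pg_monitor role, PostgreSQL's `pg_has_role` function can be queried. The helper below is a sketch with a hypothetical name and applies to PostgreSQL 10 and newer, where pg_monitor exists.

```shell
#!/bin/bash
# Succeeds if the "monitoring" role is a member of pg_monitor in the
# given database deployment. $1: deployment name, e.g. auth-db.
monitoring_has_pg_monitor() {
  local db="$1" result
  result="$(kubectl exec -n bakery-ia "deployment/${db}" -- \
    psql -U postgres -tAc "SELECT pg_has_role('monitoring', 'pg_monitor', 'member')")"
  # psql -tA prints a bare "t" or "f"; errors leave the result empty.
  [ "$result" = "t" ]
}

# Usage:
#   monitoring_has_pg_monitor auth-db || echo "auth-db: pg_monitor grant missing"
```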

Redis Authentication Failed

  • Verify password: kubectl get secret -n bakery-ia redis-secrets -o jsonpath='{.data.REDIS_PASSWORD}' | base64 -d
  • Test connection: kubectl exec -n bakery-ia deployment/redis -- redis-cli -a <password> PING

RabbitMQ Management API Not Available

  • Check plugin status: kubectl exec -n bakery-ia deployment/rabbitmq -- rabbitmq-plugins list
  • Enable plugin: kubectl exec -n bakery-ia deployment/rabbitmq -- rabbitmq-plugins enable rabbitmq_management

Summary

Current Status:

  • OTel Collector receiving traces (97+ spans)
  • ClickHouse authentication fixed
  • OpAMP disabled (preventing config corruption)
  • PostgreSQL receivers not configured (no monitoring users)
  • Redis receiver not configured (missing in pipeline)
  • RabbitMQ receiver not configured (missing in pipeline)

Next Steps:

  1. Create PostgreSQL monitoring users across all 21 databases
  2. Configure Redis receiver with existing credentials
  3. Configure RabbitMQ receiver with existing credentials
  4. Test and verify all metrics are flowing
  5. Optionally enable TLS for production
  6. Optionally fix and re-enable OpAMP for dynamic configuration