# SigNoz Complete Configuration Guide

## Root Cause Analysis and Solutions

This document provides a comprehensive analysis of the SigNoz telemetry collection issues and the proper configuration for all receivers.

---

## Problem 1: OpAMP Configuration Corruption

### Root Cause

**What is OpAMP?**

[OpAMP (Open Agent Management Protocol)](https://signoz.io/docs/operate/configuration/) is a protocol for remote configuration management in OpenTelemetry Collectors. In SigNoz, OpAMP runs a server that dynamically configures log pipelines in the SigNoz OTel collector.

**The Issue:**

- OpAMP was successfully connecting to the SigNoz backend and receiving remote configuration
- The remote configuration contained only `nop` (no-operation) receivers and exporters
- This overwrote the local collector configuration at runtime
- Result: The collector appeared healthy but couldn't receive or export any data

**Why This Happened:**

1. The SigNoz backend's OpAMP server was pushing an invalid/incomplete configuration
2. The collector's `--manager-config` flag pointed to OpAMP configuration
3. OpAMP's `--copy-path=/var/tmp/collector-config.yaml` overwrote the good config

### Solution Options

#### Option 1: Disable OpAMP (Current Solution)

Since OpAMP is pushing bad configuration and we have a working static configuration, we disabled it:

```bash
kubectl patch deployment -n bakery-ia signoz-otel-collector --type=json -p='[
  {
    "op": "replace",
    "path": "/spec/template/spec/containers/0/args",
    "value": [
      "--config=/conf/otel-collector-config.yaml",
      "--feature-gates=-pkg.translator.prometheus.NormalizeName"
    ]
  }
]'
```

**Important:** This patch must be applied after every `helm install` or `helm upgrade` because the Helm chart doesn't support disabling OpAMP via values.

#### Option 2: Fix OpAMP Configuration (Recommended for Production)

To properly use OpAMP:

1. **Check SigNoz Backend Configuration:**
   - Verify the SigNoz service is properly configured to serve OpAMP
   - Check logs: `kubectl logs -n bakery-ia statefulset/signoz`
   - Look for OpAMP-related errors

2. **Configure OpAMP Server Settings:**

   According to [SigNoz configuration documentation](https://signoz.io/docs/operate/configuration/), set these environment variables in the SigNoz statefulset:

   ```yaml
   signoz:
     env:
       OPAMP_ENABLED: "true"
       OPAMP_SERVER_ENDPOINT: "ws://signoz:4320/v1/opamp"
   ```

3. **Verify OpAMP Configuration File:**

   ```bash
   kubectl get configmap -n bakery-ia signoz-otel-collector -o yaml
   ```

   Should contain:

   ```yaml
   otel-collector-opamp-config.yaml: |
     server_endpoint: "ws://signoz:4320/v1/opamp"
   ```

4. **Monitor OpAMP Status:**

   ```bash
   kubectl logs -n bakery-ia deployment/signoz-otel-collector | grep opamp
   ```

### References

- [SigNoz Architecture](https://signoz.io/docs/architecture/)
- [OpenTelemetry Collector Configuration](https://signoz.io/docs/opentelemetry-collection-agents/opentelemetry-collector/configuration/)
- [SigNoz Helm Chart](https://github.com/SigNoz/charts)

---

## Problem 2: Database and Infrastructure Receivers Configuration

### Overview

You have the following infrastructure requiring monitoring:

- **21 PostgreSQL databases** (auth, inventory, orders, forecasting, production, etc.)
- **1 Redis instance** (caching layer)
- **1 RabbitMQ instance** (message queue)

All receivers were disabled because they lacked proper credentials and configuration.

---

## PostgreSQL Receiver Configuration

### Prerequisites

Based on [SigNoz PostgreSQL Integration Guide](https://signoz.io/docs/integrations/postgresql/), each PostgreSQL instance needs a monitoring user with proper permissions.

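
Because the grants differ between PostgreSQL 10+ and 9.6, it may help to check each instance's version before creating users. The sketch below is an illustration, not part of the SigNoz guide: `grants_for_version` is a hypothetical helper that takes the integer reported by `SHOW server_version_num` (for example `100000` for 10.0, `906xx` for 9.6.x) and prints the matching `GRANT` statements. Versions older than 9.6 are treated as unsupported only because this guide does not cover them.

```shell
#!/bin/sh
# grants_for_version (hypothetical helper): print the GRANT statements that
# match a PostgreSQL server_version_num value, e.g. 150004 or 90624.
grants_for_version() {
  ver="$1"
  if [ "$ver" -ge 100000 ]; then
    # PostgreSQL 10 and newer: the built-in pg_monitor role is available.
    printf '%s\n' \
      'GRANT pg_monitor TO monitoring;' \
      'GRANT SELECT ON pg_stat_database TO monitoring;'
  elif [ "$ver" -ge 90600 ]; then
    # PostgreSQL 9.6: no pg_monitor role, grant the statistics view directly.
    printf '%s\n' 'GRANT SELECT ON pg_stat_database TO monitoring;'
  else
    # Older versions are outside the scope of this guide.
    echo "unsupported PostgreSQL version: $ver" >&2
    return 1
  fi
}
```

One possible way to obtain the version number for a given instance, assuming the `postgres` superuser used elsewhere in this guide, is `kubectl exec -n bakery-ia deployment/auth-db -- psql -U postgres -Atc 'SHOW server_version_num'`; its output can then be passed to `grants_for_version`.
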
### Step 1: Create Monitoring Users

For each PostgreSQL database, create a dedicated monitoring user:

**For PostgreSQL 10 and newer:**

```sql
CREATE USER monitoring WITH PASSWORD 'your_secure_password';
GRANT pg_monitor TO monitoring;
GRANT SELECT ON pg_stat_database TO monitoring;
```

**For PostgreSQL 9.6:**

```sql
CREATE USER monitoring WITH PASSWORD 'your_secure_password';
GRANT SELECT ON pg_stat_database TO monitoring;
```

### Step 2: Create Monitoring User for All Databases

Run this script to create monitoring users in all PostgreSQL databases:

```bash
#!/bin/bash
# File: infrastructure/scripts/create-pg-monitoring-users.sh

# Deployment names of the PostgreSQL instances.
# NOTE: 19 deployments are listed here; add any that are missing.
DATABASES=(
  "auth-db"
  "inventory-db"
  "orders-db"
  "ai-insights-db"
  "alert-processor-db"
  "demo-session-db"
  "distribution-db"
  "external-db"
  "forecasting-db"
  "notification-db"
  "orchestrator-db"
  "pos-db"
  "procurement-db"
  "production-db"
  "recipes-db"
  "sales-db"
  "suppliers-db"
  "tenant-db"
  "training-db"
)

# One random password shared by all monitoring users.
MONITORING_PASSWORD="monitoring_secure_pass_$(openssl rand -hex 16)"

echo "Creating monitoring users with password: $MONITORING_PASSWORD"
echo "Save this password for your SigNoz configuration!"

for db in "${DATABASES[@]}"; do
  echo "Processing $db..."
  # Hide "already exists" errors so the script can be re-run safely.
  kubectl exec -n bakery-ia "deployment/$db" -- psql -U postgres -c "
    CREATE USER monitoring WITH PASSWORD '$MONITORING_PASSWORD';
    GRANT pg_monitor TO monitoring;
    GRANT SELECT ON pg_stat_database TO monitoring;
  " 2>&1 | grep -v "already exists" || true
done

echo ""
echo "Monitoring users created!"
echo "Password: $MONITORING_PASSWORD"
```

### Step 3: Store Credentials in Kubernetes Secret

Replace `<monitoring-password>` with the password printed by the Step 2 script:

```bash
kubectl create secret generic -n bakery-ia postgres-monitoring-secrets \
  --from-literal=POSTGRES_MONITOR_USER=monitoring \
  --from-literal=POSTGRES_MONITOR_PASSWORD='<monitoring-password>'
```

### Step 4: Configure PostgreSQL Receivers in SigNoz

Update `infrastructure/helm/signoz-values-dev.yaml`:

```yaml
otelCollector:
  config:
    receivers:
      # PostgreSQL receivers for database metrics
      postgresql/auth:
        endpoint: auth-db-service.bakery-ia:5432
        username: ${env:POSTGRES_MONITOR_USER}
        password: ${env:POSTGRES_MONITOR_PASSWORD}
        databases:
          - auth_db
        collection_interval: 60s
        tls:
          insecure: true  # Set to false if using TLS

      postgresql/inventory:
        endpoint: inventory-db-service.bakery-ia:5432
        username: ${env:POSTGRES_MONITOR_USER}
        password: ${env:POSTGRES_MONITOR_PASSWORD}
        databases:
          - inventory_db
        collection_interval: 60s
        tls:
          insecure: true

      # Add all other databases...
      postgresql/orders:
        endpoint: orders-db-service.bakery-ia:5432
        username: ${env:POSTGRES_MONITOR_USER}
        password: ${env:POSTGRES_MONITOR_PASSWORD}
        databases:
          - orders_db
        collection_interval: 60s
        tls:
          insecure: true

    # Update metrics pipeline
    service:
      pipelines:
        metrics:
          receivers:
            - otlp
            - postgresql/auth
            - postgresql/inventory
            - postgresql/orders
            # Add all PostgreSQL receivers
          processors: [memory_limiter, batch, resourcedetection]
          exporters: [signozclickhousemetrics]
```

### Step 5: Add Environment Variables to OTel Collector Deployment

The Helm chart needs to inject these environment variables.
Modify your Helm values:

```yaml
otelCollector:
  env:
    - name: POSTGRES_MONITOR_USER
      valueFrom:
        secretKeyRef:
          name: postgres-monitoring-secrets
          key: POSTGRES_MONITOR_USER
    - name: POSTGRES_MONITOR_PASSWORD
      valueFrom:
        secretKeyRef:
          name: postgres-monitoring-secrets
          key: POSTGRES_MONITOR_PASSWORD
```

### References

- [PostgreSQL Monitoring with OpenTelemetry | SigNoz](https://signoz.io/blog/opentelemetry-postgresql-metrics-monitoring/)
- [PostgreSQL Integration | SigNoz](https://signoz.io/docs/integrations/postgresql/)

---

## Redis Receiver Configuration

### Current Infrastructure

- **Service**: `redis-service.bakery-ia:6379`
- **Password**: Available in secret `redis-secrets`
- **TLS**: Currently not configured

### Step 1: Check if Redis Requires TLS

```bash
kubectl exec -n bakery-ia deployment/redis -- redis-cli CONFIG GET tls-port
```

If TLS is not configured (`tls-port` is 0 or empty), you can use `insecure: true`.

### Step 2: Configure Redis Receiver

Update `infrastructure/helm/signoz-values-dev.yaml`:

```yaml
otelCollector:
  config:
    receivers:
      # Redis receiver for cache metrics
      redis:
        endpoint: redis-service.bakery-ia:6379
        password: ${env:REDIS_PASSWORD}
        collection_interval: 60s
        transport: tcp
        tls:
          insecure: true  # Change to false if using TLS
        metrics:
          redis.maxmemory:
            enabled: true
          redis.cmd.latency:
            enabled: true

    service:
      pipelines:
        metrics:
          receivers: [otlp, redis]  # ...plus your other receivers

  env:
    - name: REDIS_PASSWORD
      valueFrom:
        secretKeyRef:
          name: redis-secrets
          key: REDIS_PASSWORD
```

### Optional: Configure TLS for Redis

If you want to enable TLS for Redis (recommended for production):

1. **Generate TLS Certificates:**

   ```bash
   # Create CA
   openssl genrsa -out ca-key.pem 4096
   openssl req -new -x509 -days 3650 -key ca-key.pem -out ca-cert.pem

   # Create Redis server certificate
   openssl genrsa -out redis-key.pem 4096
   openssl req -new -key redis-key.pem -out redis.csr
   openssl x509 -req -days 3650 -in redis.csr -CA ca-cert.pem -CAkey ca-key.pem -CAcreateserial -out redis-cert.pem

   # Create Kubernetes secret
   kubectl create secret generic -n bakery-ia redis-tls \
     --from-file=ca-cert.pem=ca-cert.pem \
     --from-file=redis-cert.pem=redis-cert.pem \
     --from-file=redis-key.pem=redis-key.pem
   ```

2. **Mount Certificates in OTel Collector:**

   ```yaml
   otelCollector:
     volumes:
       - name: redis-tls
         secret:
           secretName: redis-tls
     volumeMounts:
       - name: redis-tls
         mountPath: /etc/redis-tls
         readOnly: true
     config:
       receivers:
         redis:
           tls:
             insecure: false
             cert_file: /etc/redis-tls/redis-cert.pem
             key_file: /etc/redis-tls/redis-key.pem
             ca_file: /etc/redis-tls/ca-cert.pem
   ```

### References

- [Redis Monitoring with OpenTelemetry | SigNoz](https://signoz.io/blog/redis-opentelemetry/)
- [Redis Monitoring 101 | SigNoz](https://signoz.io/blog/redis-monitoring/)

---

## RabbitMQ Receiver Configuration

### Current Infrastructure

- **Service**: `rabbitmq-service.bakery-ia`
  - Port 5672: AMQP protocol
  - Port 15672: Management API (required for metrics)
- **Credentials**:
  - Username: `bakery`
  - Password: Available in secret `rabbitmq-secrets`

### Step 1: Enable RabbitMQ Management Plugin

```bash
kubectl exec -n bakery-ia deployment/rabbitmq -- rabbitmq-plugins enable rabbitmq_management
```

### Step 2: Verify Management API Access

```bash
kubectl port-forward -n bakery-ia svc/rabbitmq-service 15672:15672
# In browser: http://localhost:15672
# Login with: bakery / <password>
```

### Step 3: Configure RabbitMQ Receiver

Update `infrastructure/helm/signoz-values-dev.yaml`:

```yaml
otelCollector:
  config:
    receivers:
      # RabbitMQ receiver via management API
      rabbitmq:
        endpoint: http://rabbitmq-service.bakery-ia:15672
        username: ${env:RABBITMQ_USER}
        password: ${env:RABBITMQ_PASSWORD}
        collection_interval: 30s

    service:
      pipelines:
        metrics:
          receivers: [otlp, rabbitmq]  # ...plus your other receivers

  env:
    - name: RABBITMQ_USER
      valueFrom:
        secretKeyRef:
          name: rabbitmq-secrets
          key: RABBITMQ_USER
    - name: RABBITMQ_PASSWORD
      valueFrom:
        secretKeyRef:
          name: rabbitmq-secrets
          key: RABBITMQ_PASSWORD
```

### References

- [RabbitMQ Monitoring with OpenTelemetry | SigNoz](https://signoz.io/blog/opentelemetry-rabbitmq-metrics-monitoring/)
- [OpenTelemetry Receivers | SigNoz](https://signoz.io/docs/userguide/otel-metrics-receivers/)

---

## Complete Implementation Plan

### Phase 1: Enable Basic Infrastructure Monitoring (No TLS)

1. **Create PostgreSQL monitoring users** (all 21 databases)
2. **Create Kubernetes secrets** for credentials
3. **Update Helm values** with receiver configurations
4. **Configure environment variables** in OTel Collector
5. **Apply Helm upgrade** and OpAMP patch
6. **Verify metrics collection**

### Phase 2: Enable TLS (Optional, Production-Ready)

1. **Generate TLS certificates** for Redis
2. **Configure Redis TLS** in deployment
3. **Update Redis receiver** with TLS settings
4. **Configure PostgreSQL TLS** if required
5. **Test and verify** secure connections

### Phase 3: Enable OpAMP (Optional, Advanced)

1. **Fix SigNoz OpAMP server configuration**
2. **Test remote configuration** in dev environment
3. **Gradually enable** OpAMP after validation
4. **Monitor** for configuration corruption

---

## Verification Commands

### Check Collector Metrics

```bash
kubectl port-forward -n bakery-ia svc/signoz-otel-collector 8888:8888
curl http://localhost:8888/metrics | grep "otelcol_receiver_accepted"
```

### Check Database Connectivity

```bash
kubectl exec -n bakery-ia deployment/signoz-otel-collector -- \
  /bin/sh -c "nc -zv auth-db-service 5432"
```

### Check RabbitMQ Management API

```bash
kubectl exec -n bakery-ia deployment/signoz-otel-collector -- \
  /bin/sh -c "wget -O- http://rabbitmq-service:15672/api/overview"
```

### Check Redis Connectivity

```bash
kubectl exec -n bakery-ia deployment/signoz-otel-collector -- \
  /bin/sh -c "nc -zv redis-service 6379"
```

---

## Troubleshooting

### PostgreSQL Connection Refused

- Verify monitoring user exists: `kubectl exec -n bakery-ia deployment/auth-db -- psql -U postgres -c "\du"`
- Check user permissions: `kubectl exec -n bakery-ia deployment/auth-db -- psql -U monitoring -c "SELECT 1"`

### Redis Authentication Failed

- Verify password: `kubectl get secret -n bakery-ia redis-secrets -o jsonpath='{.data.REDIS_PASSWORD}' | base64 -d`
- Test connection: `kubectl exec -n bakery-ia deployment/redis -- redis-cli -a '<password>' PING`

### RabbitMQ Management API Not Available

- Check plugin status: `kubectl exec -n bakery-ia deployment/rabbitmq -- rabbitmq-plugins list`
- Enable plugin: `kubectl exec -n bakery-ia deployment/rabbitmq -- rabbitmq-plugins enable rabbitmq_management`

---

## Summary

**Current Status:**

- ✅ OTel Collector receiving traces (97+ spans)
- ✅ ClickHouse authentication fixed
- ✅ OpAMP disabled (preventing config corruption)
- ❌ PostgreSQL receivers not configured (no monitoring users)
- ❌ Redis receiver not configured (missing in pipeline)
- ❌ RabbitMQ receiver not configured (missing in pipeline)

**Next Steps:**

1. Create PostgreSQL monitoring users across all 21 databases
2. Configure Redis receiver with existing credentials
3. Configure RabbitMQ receiver with existing credentials
4. Test and verify all metrics are flowing
5. Optionally enable TLS for production
6. Optionally fix and re-enable OpAMP for dynamic configuration
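
The "test and verify all metrics are flowing" step can be partly scripted. The sketch below is a minimal illustration under assumptions: `list_active_receivers` is a hypothetical helper, and it assumes the collector's self-metrics include a counter whose name starts with `otelcol_receiver_accepted_metric_points` and carries a `receiver` label. Exact metric names can vary between collector versions, so check your own `/metrics` output first.

```shell
#!/bin/sh
# list_active_receivers (hypothetical helper): read Prometheus text-format
# metrics on stdin and print each receiver whose accepted metric-point
# counter is greater than zero, one name per line, sorted and de-duplicated.
list_active_receivers() {
  awk '
    /^otelcol_receiver_accepted_metric_points/ {
      # Extract the value of the receiver="..." label.
      if (match($0, /receiver="[^"]*"/)) {
        name = substr($0, RSTART + 10, RLENGTH - 11)
        # The sample value is the last whitespace-separated field.
        if ($NF + 0 > 0) print name
      }
    }
  ' | sort -u
}
```

With the port-forward from the Verification Commands section running, one way to use it is `curl -s http://localhost:8888/metrics | list_active_receivers`. A receiver such as `redis` or `postgresql/auth` that is configured but absent from the output has not yet delivered any metric points.
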