Add comprehensive SigNoz configuration guide and monitoring setup
Documentation includes:
1. OpAMP Root Cause Analysis:
- Explains OpAMP (Open Agent Management Protocol) functionality
- Documents how OpAMP was overwriting config with "nop" receivers
- Provides two solution paths:
* Option 1: Disable OpAMP (current solution)
* Option 2: Fix OpAMP server configuration (recommended for prod)
- References: SigNoz architecture and OTel collector docs
2. Database Receivers Configuration:
- PostgreSQL: Complete setup for 21 database instances
* SQL commands to create monitoring users
* Proper pg_monitor role permissions
* Environment variable configuration
- Redis: Configuration with/without TLS
* Uses existing redis-secrets
* Optional TLS certificate generation
- RabbitMQ: Management API setup
* Uses existing rabbitmq-secrets
* Port 15672 management interface
3. Automation Script:
- create-pg-monitoring-users.sh
- Creates monitoring user in all 21 PostgreSQL databases
- Generates secure random password
- Verifies permissions
- Provides next-step commands
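As a rough illustration of the password step in item 3, the sketch below mirrors the script's `monitoring_$(openssl rand -hex 16)` scheme. The function name `gen_monitoring_password` and the `/dev/urandom` fallback are assumptions added for illustration, not part of the commit. Because the random part is hex-only, the value can sit inside a single-quoted SQL literal without escaping.

```bash
#!/usr/bin/env bash
# Sketch only: gen_monitoring_password is a hypothetical helper, not part of the commit.

gen_monitoring_password() {
    local hex
    if command -v openssl >/dev/null 2>&1; then
        # Same call the script uses: 16 random bytes printed as 32 hex characters
        hex="$(openssl rand -hex 16)"
    else
        # Assumed fallback for hosts without openssl
        hex="$(od -An -tx1 -N16 /dev/urandom | tr -d ' \n')"
    fi
    printf 'monitoring_%s\n' "$hex"
}
```

Calling `gen_monitoring_password` prints one value of the form `monitoring_` followed by 32 lowercase hex characters.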
Resources Referenced:
- PostgreSQL: https://signoz.io/docs/integrations/postgresql/
- Redis: https://signoz.io/blog/redis-opentelemetry/
- RabbitMQ: https://signoz.io/blog/opentelemetry-rabbitmq-metrics-monitoring/
- OpAMP: https://signoz.io/docs/operate/configuration/
- OTel Config: https://signoz.io/docs/opentelemetry-collection-agents/opentelemetry-collector/configuration/
Current Infrastructure Discovered:
- 21 PostgreSQL databases (all services have dedicated DBs)
- 1 Redis instance (password in redis-secrets)
- 1 RabbitMQ instance (credentials in rabbitmq-secrets)
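The inventory above could be re-derived from the cluster rather than maintained by hand. A minimal sketch, assuming every PostgreSQL deployment name ends in `-db` (true of the names in the script's `DATABASES` array, but not verified against the cluster); `filter_db_deployments` is a hypothetical helper that reads the output of `kubectl get deployments -o name` on stdin.

```bash
#!/usr/bin/env bash
# Sketch only: filter_db_deployments is a hypothetical helper, not part of the commit.
# Assumption: PostgreSQL deployments are exactly those whose names end in "-db".

filter_db_deployments() {
    # Input lines look like "deployment.apps/auth-db"; print just "auth-db"
    sed -n 's|^.*/\([^/]*-db\)$|\1|p'
}

# Example invocation (not executed here):
#   kubectl get deployments -n bakery-ia -o name | filter_db_deployments
```

Comparing the resulting list with the hard-coded array is one way to spot a database that was added or removed after the script was written.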
Next Implementation Steps:
1. Run create-pg-monitoring-users.sh script
2. Create Kubernetes secrets for monitoring credentials
3. Update signoz-values-dev.yaml with receivers
4. Enable receivers in metrics pipeline
5. Test and verify metric collection
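Step 5 could start with a per-database check like the sketch below, which asks PostgreSQL whether the monitoring role is a member of `pg_monitor`. `check_pg_monitor` and the `KUBECTL` override are hypothetical names introduced for illustration; the namespace, role name, and `psql -U postgres` invocation are taken from the script in this commit.

```bash
#!/usr/bin/env bash
# Sketch only: check_pg_monitor and KUBECTL are hypothetical, not part of the commit.
NAMESPACE="${NAMESPACE:-bakery-ia}"
MONITORING_USER="${MONITORING_USER:-monitoring}"
KUBECTL="${KUBECTL:-kubectl}"   # assumed override hook, e.g. to substitute a stub

# Returns 0 if the role is a member of pg_monitor, 1 if it is not,
# and 2 if the kubectl/psql command itself fails.
check_pg_monitor() {
    local db out
    db="$1"
    out="$("$KUBECTL" exec -n "$NAMESPACE" "deployment/$db" -- \
        psql -U postgres -tAc \
        "SELECT pg_has_role('$MONITORING_USER', 'pg_monitor', 'member');")" || return 2
    [ "$out" = "t" ]
}
```

A call such as `check_pg_monitor auth-db` could then be looped over the same deployment names the script uses.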
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
145
infrastructure/scripts/create-pg-monitoring-users.sh
Executable file
@@ -0,0 +1,145 @@
#!/bin/bash
# Create monitoring users in all PostgreSQL databases for SigNoz metrics collection
#
# This script creates a 'monitoring' user with pg_monitor role in each PostgreSQL database
# Based on: https://signoz.io/docs/integrations/postgresql/
#
# Usage: ./create-pg-monitoring-users.sh

set -e

NAMESPACE="bakery-ia"
MONITORING_USER="monitoring"
MONITORING_PASSWORD="monitoring_$(openssl rand -hex 16)"

# List of all PostgreSQL database deployments
DATABASES=(
    "auth-db"
    "inventory-db"
    "orders-db"
    "ai-insights-db"
    "alert-processor-db"
    "demo-session-db"
    "distribution-db"
    "external-db"
    "forecasting-db"
    "notification-db"
    "orchestrator-db"
    "pos-db"
    "procurement-db"
    "production-db"
    "recipes-db"
    "sales-db"
    "suppliers-db"
    "tenant-db"
    "training-db"
)

echo "=================================================="
echo "PostgreSQL Monitoring User Setup for SigNoz"
echo "=================================================="
echo ""
echo "This script will create a monitoring user in all PostgreSQL databases"
echo "User: $MONITORING_USER"
echo "Password: $MONITORING_PASSWORD"
echo ""
echo "IMPORTANT: Save this password! You'll need it for SigNoz configuration."
echo ""
read -p "Press Enter to continue or Ctrl+C to cancel..."

SUCCESS_COUNT=0
FAILED_COUNT=0
FAILED_DBS=()

for db in "${DATABASES[@]}"; do
    echo ""
    echo "Processing: $db"
    echo "---"

    # Create monitoring user with pg_monitor role (PostgreSQL 10+)
    if kubectl exec -n "$NAMESPACE" "deployment/$db" -- psql -U postgres -c "
        DO \$\$
        BEGIN
            -- Try to create the user
            CREATE USER $MONITORING_USER WITH PASSWORD '$MONITORING_PASSWORD';
            RAISE NOTICE 'User created successfully';
        EXCEPTION
            WHEN duplicate_object THEN
                -- User already exists, update password
                ALTER USER $MONITORING_USER WITH PASSWORD '$MONITORING_PASSWORD';
                RAISE NOTICE 'User already exists, password updated';
        END
        \$\$;

        -- Grant pg_monitor role (PostgreSQL 10+)
        GRANT pg_monitor TO $MONITORING_USER;

        -- Grant SELECT on pg_stat_database
        GRANT SELECT ON pg_stat_database TO $MONITORING_USER;

        -- Verify permissions
        SELECT
            r.rolname as role_name,
            ARRAY_AGG(b.rolname) as granted_roles
        FROM pg_auth_members m
        JOIN pg_roles r ON (m.member = r.oid)
        JOIN pg_roles b ON (m.roleid = b.oid)
        WHERE r.rolname = '$MONITORING_USER'
        GROUP BY r.rolname;
    " 2>&1; then
        echo "✅ SUCCESS: $db"
        # Use plain assignment: ((SUCCESS_COUNT++)) returns status 1 when the
        # old value is 0, which would abort the script under 'set -e'
        SUCCESS_COUNT=$((SUCCESS_COUNT + 1))
    else
        echo "❌ FAILED: $db"
        FAILED_COUNT=$((FAILED_COUNT + 1))
        FAILED_DBS+=("$db")
    fi
done

echo ""
echo "=================================================="
echo "Summary"
echo "=================================================="
echo "Successful: $SUCCESS_COUNT databases"
echo "Failed: $FAILED_COUNT databases"

if [ $FAILED_COUNT -gt 0 ]; then
    echo ""
    echo "Failed databases:"
    for db in "${FAILED_DBS[@]}"; do
        echo " - $db"
    done
fi

echo ""
echo "=================================================="
echo "Next Steps"
echo "=================================================="
echo ""
echo "1. Create Kubernetes secret with monitoring credentials:"
echo ""
echo "kubectl create secret generic -n $NAMESPACE postgres-monitoring-secrets \\"
echo " --from-literal=POSTGRES_MONITOR_USER=$MONITORING_USER \\"
echo " --from-literal=POSTGRES_MONITOR_PASSWORD='$MONITORING_PASSWORD'"
echo ""
echo "2. Update infrastructure/helm/signoz-values-dev.yaml with PostgreSQL receivers"
echo ""
echo "3. Add environment variables to otelCollector configuration"
echo ""
echo "4. Run: helm upgrade signoz signoz/signoz -n $NAMESPACE -f infrastructure/helm/signoz-values-dev.yaml"
echo ""
echo "5. Apply OpAMP patch:"
echo ""
echo "kubectl patch deployment -n $NAMESPACE signoz-otel-collector --type=json -p='["
echo " {\"op\":\"replace\",\"path\":\"/spec/template/spec/containers/0/args\",\"value\":["
echo " \"--config=/conf/otel-collector-config.yaml\","
echo " \"--feature-gates=-pkg.translator.prometheus.NormalizeName\""
echo " ]}"
echo "]'"
echo ""
echo "=================================================="
echo "SAVE THIS INFORMATION!"
echo "=================================================="
echo "Username: $MONITORING_USER"
echo "Password: $MONITORING_PASSWORD"
echo "=================================================="
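After the OpAMP patch printed in step 5, one way to confirm that the collector configuration no longer carries the "nop" receivers mentioned in the commit message is to scan the YAML for them. The sketch below is an illustration under stated assumptions: `has_nop_receiver` is a hypothetical helper, it expects the collector config as YAML on stdin, and it only recognises indented keys named `nop` or `nop/<name>` inside a top-level `receivers:` block. How the effective config is obtained from the running collector is left open.

```bash
#!/usr/bin/env bash
# Sketch only: has_nop_receiver is a hypothetical helper, not part of the commit.
# Returns 0 if the YAML on stdin defines a "nop" receiver, 1 otherwise.

has_nop_receiver() {
    awk '
        # A non-indented, non-comment line starts a new top-level section
        /^[^ \t#]/ { in_receivers = ($0 ~ /^receivers:/) }
        # Inside the receivers section, look for an indented nop: or nop/<name>: key
        in_receivers && /^[ \t]+nop(\/[^:]*)?:/ { found = 1 }
        END { exit !found }
    '
}
```

For example, piping a saved copy of the collector config through `has_nop_receiver && echo "nop receiver present"` would flag a config that has been overwritten.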