# SigNoz Monitoring Quick Start

Get complete observability (metrics, logs, traces, system metrics) in under 10 minutes using OpenTelemetry.

## What You'll Get

✅ **Distributed Tracing** - Complete request flows across all services
✅ **Application Metrics** - HTTP requests, durations, error rates, custom business metrics
✅ **System Metrics** - CPU usage, memory usage, disk I/O, network I/O per service
✅ **Structured Logs** - Searchable logs correlated with traces
✅ **Unified Dashboard** - Single UI for all telemetry data

**All data pushed via OpenTelemetry OTLP protocol - No Prometheus, no scraping needed!**

## Prerequisites

- Kubernetes cluster running (Kind/Minikube/Production)
- Helm 3.x installed
- kubectl configured

## Step 1: Deploy SigNoz

```bash
# Add Helm repository
helm repo add signoz https://charts.signoz.io
helm repo update

# Create namespace
kubectl create namespace signoz

# Install SigNoz
helm install signoz signoz/signoz \
  -n signoz \
  -f infrastructure/helm/signoz-values-dev.yaml

# Wait for pods to be ready (2-3 minutes)
kubectl wait --for=condition=ready pod -l app=signoz -n signoz --timeout=300s
```

## Step 2: Configure Services

Each service needs OpenTelemetry environment variables. The auth-service is already configured as an example.
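
To illustrate how a service might consume these variables, here is a minimal, standard-library-only sketch. It reads the standard `OTEL_EXPORTER_OTLP_ENDPOINT` and `OTEL_SERVICE_NAME` variables and derives the per-signal OTLP/HTTP URLs by appending `/v1/traces`, `/v1/metrics`, and `/v1/logs` to the base endpoint. The names `TelemetrySettings` and `load_telemetry_settings` are hypothetical and are not taken from the auth-service code; the official OpenTelemetry SDK exporters normally perform this resolution themselves.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class TelemetrySettings:
    """Hypothetical container for resolved telemetry configuration."""

    service_name: str
    traces_url: str
    metrics_url: str
    logs_url: str


def load_telemetry_settings(env: Mapping[str, str] = os.environ) -> TelemetrySettings:
    """Resolve per-signal OTLP/HTTP URLs from the base collector endpoint.

    Falls back to the OpenTelemetry defaults when a variable is unset:
    http://localhost:4318 for the endpoint, "unknown_service" for the name.
    """
    # Strip a trailing slash so the joined URL never contains "//v1/...".
    base = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318").rstrip("/")
    name = env.get("OTEL_SERVICE_NAME", "unknown_service")
    return TelemetrySettings(
        service_name=name,
        traces_url=f"{base}/v1/traces",
        metrics_url=f"{base}/v1/metrics",
        logs_url=f"{base}/v1/logs",
    )


if __name__ == "__main__":
    # Print what the current process environment resolves to.
    print(load_telemetry_settings())
```

Running this inside a pod is a quick way to confirm that the deployment's environment variables resolve to the collector address you expect.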

### Quick Configuration (for remaining services)

Add these environment variables to each service deployment:

```yaml
env:
  # OpenTelemetry Collector endpoint
  - name: OTEL_COLLECTOR_ENDPOINT
    value: "http://signoz-otel-collector.signoz.svc.cluster.local:4318"
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: "http://signoz-otel-collector.signoz.svc.cluster.local:4318"
  - name: OTEL_SERVICE_NAME
    value: "your-service-name"  # e.g., "inventory-service"

  # Enable tracing
  - name: ENABLE_TRACING
    value: "true"

  # Enable logs export
  - name: OTEL_LOGS_EXPORTER
    value: "otlp"

  # Enable metrics export (includes system metrics)
  - name: ENABLE_OTEL_METRICS
    value: "true"
  - name: ENABLE_SYSTEM_METRICS
    value: "true"
```

### Using the Configuration Script

```bash
# Generate configuration patches for all services
./infrastructure/kubernetes/add-monitoring-config.sh

# This creates /tmp/*-otel-patch.yaml files
# Review and manually add to each service deployment
```

## Step 3: Deploy Updated Services

```bash
# Apply updated configurations
kubectl apply -k infrastructure/kubernetes/overlays/dev/

# Or restart services to pick up new env vars
kubectl rollout restart deployment -n bakery-ia

# Wait for rollout
kubectl rollout status deployment -n bakery-ia --timeout=5m
```

## Step 4: Access SigNoz UI

### Via Ingress

```bash
# Add to /etc/hosts if needed
echo "127.0.0.1 monitoring.bakery-ia.local" | sudo tee -a /etc/hosts

# Access UI
open https://monitoring.bakery-ia.local
```

### Via Port Forward

```bash
kubectl port-forward -n signoz svc/signoz-frontend 3301:3301
open http://localhost:3301
```

## Step 5: Explore Your Data

### Traces

1. Go to **Services** tab
2. See all your services listed
3. Click on a service → View traces
4. Click on a trace → See detailed span tree with timing

### Metrics

**HTTP Metrics** (automatically collected):
- `http_requests_total` - Total requests by method, endpoint, status
- `http_request_duration_seconds` - Request latency
- `active_requests` - Current active HTTP requests

**System Metrics** (automatically collected per service):
- `process.cpu.utilization` - Process CPU usage %
- `process.memory.usage` - Process memory in bytes
- `process.memory.utilization` - Process memory %
- `process.threads.count` - Number of threads
- `system.cpu.utilization` - System-wide CPU %
- `system.memory.usage` - System memory usage
- `system.disk.io.read` - Disk bytes read
- `system.disk.io.write` - Disk bytes written
- `system.network.io.sent` - Network bytes sent
- `system.network.io.received` - Network bytes received

**Custom Business Metrics** (if configured):
- User registrations
- Orders created
- Login attempts
- etc.

### Logs

1. Go to **Logs** tab
2. Filter by service: `service_name="auth-service"`
3. Search for specific messages
4. See structured fields (user_id, tenant_id, etc.)

### Trace-Log Correlation

1. Find a trace in **Traces** tab
2. Note the `trace_id`
3. Go to **Logs** tab
4. Filter: `trace_id=""`, placing the ID you noted between the quotes
5. See all logs for that specific request!

## Verification Commands

```bash
# Check if services are sending telemetry
kubectl logs -n bakery-ia deployment/auth-service | grep -i "telemetry\|otel"

# Check SigNoz collector is receiving data
kubectl logs -n signoz deployment/signoz-otel-collector | tail -50

# Test connectivity to collector
kubectl exec -n bakery-ia deployment/auth-service -- \
  curl -v http://signoz-otel-collector.signoz.svc.cluster.local:4318
```

## Common Issues

### No data in SigNoz

```bash
# 1. Verify environment variables are set
kubectl get deployment auth-service -n bakery-ia -o yaml | grep OTEL

# 2. Check collector logs
kubectl logs -n signoz deployment/signoz-otel-collector

# 3. Restart service
kubectl rollout restart deployment/auth-service -n bakery-ia
```

### Services not appearing

```bash
# Check network connectivity
kubectl exec -n bakery-ia deployment/auth-service -- \
  curl http://signoz-otel-collector.signoz.svc.cluster.local:4318

# Any HTTP response (even an error status) means the collector is reachable;
# "connection refused" means it is not
```

## Architecture

```
┌──────────────────────────────────────────────┐
│              Your Microservices              │
│  ┌──────┐  ┌──────┐  ┌──────┐                │
│  │ auth │  │ inv  │  │orders│  ...           │
│  └──┬───┘  └──┬───┘  └──┬───┘                │
│     │         │         │                    │
│     └─────────┴─────────┘                    │
│               │                              │
│           OTLP Push                          │
│    (traces, metrics, logs)                   │
└───────────────┼──────────────────────────────┘
                │
                ▼
┌──────────────────────────────────────────────┐
│        SigNoz OpenTelemetry Collector        │
│          :4317 (gRPC)  :4318 (HTTP)          │
│                                              │
│  Receivers: OTLP only (no Prometheus)        │
│  Processors: batch, memory_limiter           │
│  Exporters: ClickHouse                       │
└───────────────┼──────────────────────────────┘
                │
                ▼
┌──────────────────────────────────────────────┐
│             ClickHouse Database              │
│        Stores: traces, metrics, logs         │
└───────────────┼──────────────────────────────┘
                │
                ▼
┌──────────────────────────────────────────────┐
│              SigNoz Frontend UI              │
│     monitoring.bakery-ia.local or :3301      │
└──────────────────────────────────────────────┘
```

## What Makes This Different

**Pure OpenTelemetry** - No Prometheus involved:
- ✅ All metrics pushed via OTLP (not scraped)
- ✅ Automatic system metrics collection (CPU, memory, disk, network)
- ✅ Unified data model for all telemetry
- ✅ Native trace-metric-log correlation
- ✅ Lower resource usage (no scraping overhead)

## Next Steps

- **Create Dashboards** - Build custom views for your metrics
- **Set Up Alerts** - Configure alerts for errors, latency, resource usage
- **Explore System Metrics** - Monitor CPU, memory per service
- **Query Logs** - Use the log query language to filter and search
- **Correlate Everything** - Jump from traces → logs → metrics

## Need Help?

- [Full Documentation](./MONITORING_SETUP.md) - Detailed setup guide
- [SigNoz Docs](https://signoz.io/docs/) - Official documentation
- [OpenTelemetry Python](https://opentelemetry.io/docs/instrumentation/python/) - Python instrumentation

---

**Metrics You Get Out of the Box:**

| Category | Metrics | Description |
|----------|---------|-------------|
| HTTP | `http_requests_total` | Total requests by method, endpoint, status |
| HTTP | `http_request_duration_seconds` | Request latency histogram |
| HTTP | `active_requests` | Current active requests |
| Process | `process.cpu.utilization` | Process CPU usage % |
| Process | `process.memory.usage` | Process memory in bytes |
| Process | `process.memory.utilization` | Process memory % |
| Process | `process.threads.count` | Thread count |
| System | `system.cpu.utilization` | System CPU % |
| System | `system.memory.usage` | System memory usage |
| System | `system.memory.utilization` | System memory % |
| Disk | `system.disk.io.read` | Disk read bytes |
| Disk | `system.disk.io.write` | Disk write bytes |
| Network | `system.network.io.sent` | Network sent bytes |
| Network | `system.network.io.received` | Network received bytes |