SigNoz Monitoring Quick Start
Get complete observability (metrics, logs, traces, system metrics) in under 10 minutes using OpenTelemetry.
What You'll Get
- ✅ Distributed Tracing - Complete request flows across all services
- ✅ Application Metrics - HTTP requests, durations, error rates, custom business metrics
- ✅ System Metrics - CPU usage, memory usage, disk I/O, network I/O per service
- ✅ Structured Logs - Searchable logs correlated with traces
- ✅ Unified Dashboard - Single UI for all telemetry data
All data is pushed via the OpenTelemetry Protocol (OTLP) - no Prometheus, no scraping needed!
Prerequisites
- Kubernetes cluster running (Kind/Minikube/Production)
- Helm 3.x installed
- kubectl configured
Step 1: Deploy SigNoz
```bash
# Add Helm repository
helm repo add signoz https://charts.signoz.io
helm repo update

# Create namespace
kubectl create namespace signoz

# Install SigNoz
helm install signoz signoz/signoz \
  -n signoz \
  -f infrastructure/helm/signoz-values-dev.yaml

# Wait for pods to be ready (2-3 minutes)
kubectl wait --for=condition=ready pod -l app=signoz -n signoz --timeout=300s
```
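The `signoz-values-dev.yaml` file referenced above is repo-specific. As an illustration only, a dev-sized override file for the SigNoz chart might look like the sketch below - the key names here are assumptions, so verify them against `helm show values signoz/signoz` before use:

```yaml
# Hypothetical dev overrides -- verify key names against
# `helm show values signoz/signoz` before using.
clickhouse:
  persistence:
    size: 10Gi          # small disk for dev
otelCollector:
  resources:
    limits:
      memory: 512Mi     # keep the collector lightweight in dev
frontend:
  ingress:
    enabled: true
    hosts:
      - host: monitoring.bakery-ia.local
        paths:
          - path: /
            pathType: Prefix
```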
Step 2: Configure Services
Each service needs OpenTelemetry environment variables. The auth-service is already configured as an example.
Quick Configuration (for remaining services)
Add these environment variables to each service deployment:
```yaml
env:
  # OpenTelemetry Collector endpoint
  - name: OTEL_COLLECTOR_ENDPOINT
    value: "http://signoz-otel-collector.signoz.svc.cluster.local:4318"
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: "http://signoz-otel-collector.signoz.svc.cluster.local:4318"
  - name: OTEL_SERVICE_NAME
    value: "your-service-name"  # e.g., "inventory-service"
  # Enable tracing
  - name: ENABLE_TRACING
    value: "true"
  # Enable logs export
  - name: OTEL_LOGS_EXPORTER
    value: "otlp"
  # Enable metrics export (includes system metrics)
  - name: ENABLE_OTEL_METRICS
    value: "true"
  - name: ENABLE_SYSTEM_METRICS
    value: "true"
```
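On the application side, a service typically reads these variables at startup to decide which exporters to turn on. A minimal stdlib-only sketch of that contract (the helper name and defaults are illustrative, not taken from the repo):

```python
import os


def read_otel_config(environ=os.environ):
    """Collect the OTel-related settings a service would act on at startup.

    Illustrative helper: mirrors the env vars set in the deployment above.
    """
    def flag(name):
        return environ.get(name, "").lower() in ("1", "true", "yes")

    return {
        "endpoint": environ.get(
            "OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318"
        ),
        "service_name": environ.get("OTEL_SERVICE_NAME", "unknown-service"),
        "tracing": flag("ENABLE_TRACING"),
        "metrics": flag("ENABLE_OTEL_METRICS"),
        "system_metrics": flag("ENABLE_SYSTEM_METRICS"),
        "logs": environ.get("OTEL_LOGS_EXPORTER") == "otlp",
    }


if __name__ == "__main__":
    # Simulate the deployment's environment
    cfg = read_otel_config({
        "OTEL_EXPORTER_OTLP_ENDPOINT":
            "http://signoz-otel-collector.signoz.svc.cluster.local:4318",
        "OTEL_SERVICE_NAME": "inventory-service",
        "ENABLE_TRACING": "true",
        "OTEL_LOGS_EXPORTER": "otlp",
    })
    print(cfg["service_name"], cfg["tracing"], cfg["logs"])
```

In the real services the same decisions are made by the OpenTelemetry SDK setup code; this sketch only shows which switches the deployment manifest controls.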
Using the Configuration Script
```bash
# Generate configuration patches for all services
./infrastructure/kubernetes/add-monitoring-config.sh

# This creates /tmp/*-otel-patch.yaml files
# Review and manually add to each service deployment
```
Step 3: Deploy Updated Services
```bash
# Apply updated configurations
kubectl apply -k infrastructure/kubernetes/overlays/dev/

# Or restart services to pick up new env vars
kubectl rollout restart deployment -n bakery-ia

# Wait for rollout
kubectl rollout status deployment -n bakery-ia --timeout=5m
```
Step 4: Access SigNoz UI
Via Ingress
```bash
# Add to /etc/hosts if needed
echo "127.0.0.1 monitoring.bakery-ia.local" | sudo tee -a /etc/hosts

# Access UI
open https://monitoring.bakery-ia.local
```
Via Port Forward
```bash
kubectl port-forward -n signoz svc/signoz-frontend 3301:3301
open http://localhost:3301
```
Step 5: Explore Your Data
Traces
- Go to Services tab
- See all your services listed
- Click on a service → View traces
- Click on a trace → See detailed span tree with timing
Metrics
HTTP Metrics (automatically collected):
- `http_requests_total` - Total requests by method, endpoint, status
- `http_request_duration_seconds` - Request latency
- `active_requests` - Current active HTTP requests
System Metrics (automatically collected per service):
- `process.cpu.utilization` - Process CPU usage %
- `process.memory.usage` - Process memory in bytes
- `process.memory.utilization` - Process memory %
- `process.threads.count` - Number of threads
- `system.cpu.utilization` - System-wide CPU %
- `system.memory.usage` - System memory usage
- `system.disk.io.read` - Disk bytes read
- `system.disk.io.write` - Disk bytes written
- `system.network.io.sent` - Network bytes sent
- `system.network.io.received` - Network bytes received
Custom Business Metrics (if configured):
- User registrations
- Orders created
- Login attempts
- etc.
Logs
- Go to Logs tab
- Filter by service: `service_name="auth-service"`
- Search for specific messages
- See structured fields (`user_id`, `tenant_id`, etc.)
Trace-Log Correlation
- Find a trace in Traces tab
- Note the `trace_id`
- Go to Logs tab
- Filter: `trace_id="<the-trace-id>"`
- See all logs for that specific request!
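The correlation works because each log record carries the active `trace_id` as a structured field. A stdlib-only sketch of the idea (field names mirror the filters above; a real service would take the id from the OpenTelemetry context rather than pass it by hand):

```python
import json
import logging


class TraceContextFormatter(logging.Formatter):
    """Render records as JSON, attaching trace_id when present."""

    def format(self, record):
        payload = {
            "message": record.getMessage(),
            "level": record.levelname,
            "service_name": "auth-service",  # illustrative, hard-coded here
            "trace_id": getattr(record, "trace_id", None),
        }
        return json.dumps(payload)


def format_with_trace(message, trace_id):
    """Build one log record tagged with a trace id and format it."""
    record = logging.LogRecord(
        "auth", logging.INFO, __file__, 0, message, None, None
    )
    record.trace_id = trace_id
    return TraceContextFormatter().format(record)


print(format_with_trace("login ok", "4bf92f3577b34da6a3ce929d0e0e4736"))
```

Because the field lands in the structured log body, the `trace_id="<the-trace-id>"` filter in the Logs tab matches it directly.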
Verification Commands
```bash
# Check if services are sending telemetry
kubectl logs -n bakery-ia deployment/auth-service | grep -i "telemetry\|otel"

# Check SigNoz collector is receiving data
kubectl logs -n signoz deployment/signoz-otel-collector | tail -50

# Test connectivity to collector
kubectl exec -n bakery-ia deployment/auth-service -- \
  curl -v http://signoz-otel-collector.signoz.svc.cluster.local:4318
```
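For a deeper check than plain connectivity, you can hand-craft a minimal OTLP/JSON payload and POST it to the collector's `/v1/traces` path (the standard OTLP/HTTP route). A stdlib-only sketch - the trace/span ids and timestamps are placeholder values, and the send is left commented out so the snippet runs anywhere:

```python
import json
# from urllib.request import Request, urlopen  # uncomment to actually send


def minimal_otlp_trace(service_name, span_name):
    """Build a minimal OTLP/JSON trace payload (one resource, one span)."""
    return {
        "resourceSpans": [{
            "resource": {"attributes": [{
                "key": "service.name",
                "value": {"stringValue": service_name},
            }]},
            "scopeSpans": [{
                "scope": {"name": "manual-test"},
                "spans": [{
                    "traceId": "5b8aa5a2d2c872e8321cf37308d69df2",  # 32 hex chars
                    "spanId": "5fb397be34d26b51",                   # 16 hex chars
                    "name": span_name,
                    "kind": 1,  # SPAN_KIND_INTERNAL
                    "startTimeUnixNano": "1700000000000000000",
                    "endTimeUnixNano": "1700000001000000000",
                }],
            }],
        }],
    }


body = json.dumps(minimal_otlp_trace("smoke-test", "ping")).encode()
# req = Request(
#     "http://signoz-otel-collector.signoz.svc.cluster.local:4318/v1/traces",
#     data=body, headers={"Content-Type": "application/json"},
# )
# urlopen(req)  # an HTTP 200 means the collector accepted the payload
```

If the POST succeeds, a `smoke-test` service should appear in the Services tab within a minute or so.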
Common Issues
No data in SigNoz
```bash
# 1. Verify environment variables are set
kubectl get deployment auth-service -n bakery-ia -o yaml | grep OTEL

# 2. Check collector logs
kubectl logs -n signoz deployment/signoz-otel-collector

# 3. Restart service
kubectl rollout restart deployment/auth-service -n bakery-ia
```
Services not appearing
```bash
# Check network connectivity
kubectl exec -n bakery-ia deployment/auth-service -- \
  curl http://signoz-otel-collector.signoz.svc.cluster.local:4318
# Any HTTP response (even a 404) means connectivity is fine;
# "connection refused" means it is not
```
Architecture
```
┌──────────────────────────────────────────────┐
│              Your Microservices              │
│        ┌──────┐   ┌──────┐   ┌──────┐        │
│        │ auth │   │ inv  │   │orders│  ...   │
│        └──┬───┘   └──┬───┘   └──┬───┘        │
│           │          │          │            │
│           └──────────┴──────────┘            │
│                      │                       │
│                  OTLP Push                   │
│           (traces, metrics, logs)            │
└──────────────────────┼───────────────────────┘
                       │
                       ▼
┌──────────────────────────────────────────────┐
│        SigNoz OpenTelemetry Collector        │
│         :4317 (gRPC)    :4318 (HTTP)         │
│                                              │
│   Receivers:  OTLP only (no Prometheus)      │
│   Processors: batch, memory_limiter          │
│   Exporters:  ClickHouse                     │
└──────────────────────┼───────────────────────┘
                       │
                       ▼
┌──────────────────────────────────────────────┐
│             ClickHouse Database              │
│        Stores: traces, metrics, logs         │
└──────────────────────┼───────────────────────┘
                       │
                       ▼
┌──────────────────────────────────────────────┐
│              SigNoz Frontend UI              │
│     monitoring.bakery-ia.local or :3301      │
└──────────────────────────────────────────────┘
```
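The collector pipeline in the diagram corresponds to an OpenTelemetry Collector configuration roughly like the following. This is a sketch of the standard config shape only - the SigNoz Helm chart generates its own version, and the SigNoz ClickHouse exporter names and settings may differ by chart version:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 400
  batch: {}

exporters:
  clickhousetraces:        # exporter name/settings are illustrative
    datasource: tcp://clickhouse:9000

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [clickhousetraces]
```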
What Makes This Different
Pure OpenTelemetry - No Prometheus involved:
- ✅ All metrics pushed via OTLP (not scraped)
- ✅ Automatic system metrics collection (CPU, memory, disk, network)
- ✅ Unified data model for all telemetry
- ✅ Native trace-metric-log correlation
- ✅ Lower resource usage (no scraping overhead)
Next Steps
- Create Dashboards - Build custom views for your metrics
- Set Up Alerts - Configure alerts for errors, latency, resource usage
- Explore System Metrics - Monitor CPU, memory per service
- Query Logs - Use powerful log query language
- Correlate Everything - Jump from traces → logs → metrics
Need Help?
- Full Documentation - Detailed setup guide
- SigNoz Docs - Official documentation
- OpenTelemetry Python - Python instrumentation
Metrics You Get Out of the Box:
| Category | Metric | Description |
|---|---|---|
| HTTP | `http_requests_total` | Total requests by method, endpoint, status |
| HTTP | `http_request_duration_seconds` | Request latency histogram |
| HTTP | `active_requests` | Current active requests |
| Process | `process.cpu.utilization` | Process CPU usage % |
| Process | `process.memory.usage` | Process memory in bytes |
| Process | `process.memory.utilization` | Process memory % |
| Process | `process.threads.count` | Thread count |
| System | `system.cpu.utilization` | System CPU % |
| System | `system.memory.usage` | System memory usage |
| System | `system.memory.utilization` | System memory % |
| Disk | `system.disk.io.read` | Disk read bytes |
| Disk | `system.disk.io.write` | Disk write bytes |
| Network | `system.network.io.sent` | Network sent bytes |
| Network | `system.network.io.received` | Network received bytes |