Commit Graph

107 Commits

Author SHA1 Message Date
81bbd7e88a Fix resources issues 20 2026-01-22 19:31:54 +01:00
1af45739f3 Fix resources issues 19 2026-01-22 19:27:28 +01:00
be59cec3ca Fix resources issues 18 2026-01-22 19:01:06 +01:00
0affc247ce Fix resources issues 17 2026-01-22 18:54:09 +01:00
5e0cef6691 Fix resources issues 16 2026-01-22 17:59:42 +01:00
8e6083e5af Fix resources issues 15 2026-01-22 17:53:02 +01:00
741112be63 Fix resources issues 14 2026-01-22 17:43:34 +01:00
4699bcbb7a Fix resources issues 12 2026-01-22 17:41:20 +01:00
179d11968e Fix resources issues 12 2026-01-22 17:38:34 +01:00
be4ad40c3d Fix resources issues 11 2026-01-22 17:29:56 +01:00
ae5571f9ab Fix resources issues 10 2026-01-22 16:24:03 +01:00
7645e184e2 Fix resources issues 9 2026-01-22 15:42:32 +01:00
b17cdc4b47 Fix resources issues 8 2026-01-22 12:31:10 +01:00
6aa3e9424b Fix resources issues 7 2026-01-22 11:41:08 +01:00
56d4aec5c4 Fix resources issues 6 2026-01-22 11:30:36 +01:00
0183f3ab72 Fix resources issues 5 2026-01-22 11:15:11 +01:00
6505044f24 Fix resources issues 4 2026-01-22 10:36:00 +01:00
ac2e8cebf9 Fix resources issues 3 2026-01-22 10:19:01 +01:00
89ec45a7c1 Fix resources issues 2 2026-01-22 10:07:05 +01:00
8dc422e0e5 Fix resources issues 2026-01-22 07:54:56 +01:00
Urtzi Alfaro
aeff6b1537 Add new infra architecture 13 2026-01-21 23:16:19 +01:00
Urtzi Alfaro
66dfd50fbc Add new infra architecture 12 2026-01-21 16:21:24 +01:00
Urtzi Alfaro
2512de4173 Add new infra architecture 11 2026-01-20 22:05:10 +01:00
Urtzi Alfaro
0217ad83be Fix: align ingress base and overlays - single host per environment 2026-01-20 21:42:05 +01:00
Urtzi Alfaro
17508b1eac Fix: remove mail TLS from main ingress (handled by mailu ingress) 2026-01-20 21:38:54 +01:00
Urtzi Alfaro
1f65b7a48e Fix: set includeSelectors=false to avoid immutable selector conflicts 2026-01-20 21:35:12 +01:00
Urtzi Alfaro
dbf74fc1cb Fix kustomization: remove merge conflicts, fix paths, add gateway resource 2026-01-20 21:33:53 +01:00
Urtzi Alfaro
3b81b5f77e Add new infra architecture 10 2026-01-20 10:39:40 +01:00
Urtzi Alfaro
bc00bab061 Add new infra architecture 9 2026-01-20 07:20:56 +01:00
Urtzi Alfaro
52b8abdc0e Add new infra architecture 8 2026-01-19 22:28:53 +01:00
Urtzi Alfaro
7d6845574c Add new infra architecture 6 2026-01-19 16:31:11 +01:00
Urtzi Alfaro
b78399da2c Add new infra architecture 5 2026-01-19 15:15:04 +01:00
Urtzi Alfaro
e96405b828 Add new infra architecture 4 2026-01-19 14:22:07 +01:00
Urtzi Alfaro
9edcc8c231 Add new infra architecture 3 2026-01-19 13:57:50 +01:00
Urtzi Alfaro
8461226a97 Add new infra architecture 2 2026-01-19 12:12:19 +01:00
Urtzi Alfaro
35f164f0cd Add new infra architecture 2026-01-19 11:55:17 +01:00
Urtzi Alfaro
21d35ea92b Add ci/cd and fix multiple pods issues 2026-01-18 09:02:27 +01:00
Urtzi Alfaro
3c4b5c2a06 Add MinIO support and frontend analytics 2026-01-17 22:42:40 +01:00
Urtzi Alfaro
6ddf608d37 Add subscription feature 2026-01-13 22:22:38 +01:00
Urtzi Alfaro
230bbe6a19 Add improvements 2026-01-12 14:24:14 +01:00
Urtzi Alfaro
b66bfda100 Update pilot launch doc 2026-01-11 09:18:17 +01:00
Urtzi Alfaro
b089c216db Improve monitoring 6 2026-01-10 13:43:38 +01:00
Urtzi Alfaro
c05538cafb Improve monitoring 5 2026-01-09 23:14:12 +01:00
Urtzi Alfaro
22dab143ba Improve monitoring 4 2026-01-09 14:48:44 +01:00
Urtzi Alfaro
7ef85c1188 Add comprehensive SigNoz configuration guide and monitoring setup
Documentation includes:

1. OpAMP Root Cause Analysis:
   - Explains OpAMP (Open Agent Management Protocol) functionality
   - Documents how OpAMP was overwriting config with "nop" receivers
   - Provides two solution paths:
     * Option 1: Disable OpAMP (current solution)
     * Option 2: Fix OpAMP server configuration (recommended for prod)
   - References: SigNoz architecture and OTel collector docs

2. Database Receivers Configuration:
   - PostgreSQL: Complete setup for 21 database instances
     * SQL commands to create monitoring users
     * Proper pg_monitor role permissions
     * Environment variable configuration
   - Redis: Configuration with/without TLS
     * Uses existing redis-secrets
     * Optional TLS certificate generation
   - RabbitMQ: Management API setup
     * Uses existing rabbitmq-secrets
     * Port 15672 management interface

3. Automation Script:
   - create-pg-monitoring-users.sh
   - Creates monitoring user in all 21 PostgreSQL databases
   - Generates secure random password
   - Verifies permissions
   - Provides next-step commands
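
The user-creation step described above can be sketched as follows. This is a minimal illustration, not the contents of create-pg-monitoring-users.sh: the role name `otel_monitor` and both function names are hypothetical, and the sketch only builds the SQL text rather than connecting to any of the 21 databases. It assumes PostgreSQL 10 or later, where the predefined `pg_monitor` role is available.

```python
import secrets

MONITOR_ROLE = "otel_monitor"  # hypothetical role name, not taken from the commit


def generate_password(n_bytes: int = 24) -> str:
    """Return a URL-safe random password drawn from the OS CSPRNG."""
    return secrets.token_urlsafe(n_bytes)


def monitoring_sql(role: str, password: str) -> list[str]:
    """Build the statements that create a read-only monitoring role."""
    # Double any single quotes so the password is a valid SQL string literal.
    escaped = password.replace("'", "''")
    return [
        f"CREATE USER {role} WITH PASSWORD '{escaped}';",
        f"GRANT pg_monitor TO {role};",
    ]


if __name__ == "__main__":
    # Print the statements; a real script would run them against each instance.
    for statement in monitoring_sql(MONITOR_ROLE, generate_password()):
        print(statement)
```

Granting `pg_monitor` rather than superuser rights keeps the monitoring account limited to statistics and settings views, which matches the "Proper pg_monitor role permissions" item listed under the database receivers.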

Resources Referenced:
- PostgreSQL: https://signoz.io/docs/integrations/postgresql/
- Redis: https://signoz.io/blog/redis-opentelemetry/
- RabbitMQ: https://signoz.io/blog/opentelemetry-rabbitmq-metrics-monitoring/
- OpAMP: https://signoz.io/docs/operate/configuration/
- OTel Config: https://signoz.io/docs/opentelemetry-collection-agents/opentelemetry-collector/configuration/

Current Infrastructure Discovered:
- 21 PostgreSQL databases (all services have dedicated DBs)
- 1 Redis instance (password in redis-secrets)
- 1 RabbitMQ instance (credentials in rabbitmq-secrets)

Next Implementation Steps:
1. Run create-pg-monitoring-users.sh script
2. Create Kubernetes secrets for monitoring credentials
3. Update signoz-values-dev.yaml with receivers
4. Enable receivers in metrics pipeline
5. Test and verify metric collection

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-09 12:15:58 +01:00
Urtzi Alfaro
1329bae784 Fix SigNoz OTel Collector configuration and disable OpAMP
Root Cause Analysis:
- OTel Collector was starting but OpAMP was overwriting config with "nop" receivers/exporters
- ClickHouse authentication was failing due to missing credentials in DSN strings
- Redis/PostgreSQL/RabbitMQ receivers had missing TLS certs causing startup failures

Changes:
1. Fixed ClickHouse Exporters:
   - Added admin credentials to clickhousetraces datasource
   - Added admin credentials to clickhouselogsexporter dsn
   - Now using: tcp://admin:27ff0399-0d3a-4bd8-919d-17c2181e6fb9@signoz-clickhouse:9000/

2. Disabled Unconfigured Receivers:
   - Commented out PostgreSQL receivers (no monitor users configured)
   - Commented out Redis receiver (TLS certificates not available)
   - Commented out RabbitMQ receiver (credentials not configured)
   - Updated metrics pipeline to use only OTLP receiver

3. OpAMP Disabled:
   - OpAMP was causing collector to use nop exporters/receivers
   - Cannot disable via Helm (extraArgs appends, doesn't replace)
   - Must apply kubectl patch after Helm install:
     kubectl patch deployment signoz-otel-collector --type=json -p='[{"op":"replace","path":"/spec/template/spec/containers/0/args","value":["--config=/conf/otel-collector-config.yaml","--feature-gates=-pkg.translator.prometheus.NormalizeName"]}]'
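
The payload passed to `-p` in the command above is a JSON Patch (RFC 6902) document with a single `replace` operation. As a sketch of how that payload could be generated instead of hand-written, assuming a hypothetical helper named `build_args_patch` (the argument values are copied from the command above):

```python
import json

# Collector arguments copied from the kubectl command in this commit message.
COLLECTOR_ARGS = [
    "--config=/conf/otel-collector-config.yaml",
    "--feature-gates=-pkg.translator.prometheus.NormalizeName",
]


def build_args_patch(args: list[str], container_index: int = 0) -> str:
    """Return a JSON Patch document that replaces one container's args."""
    patch = [
        {
            "op": "replace",
            "path": f"/spec/template/spec/containers/{container_index}/args",
            "value": args,
        }
    ]
    # Compact separators reproduce the single-line form used on the command line.
    return json.dumps(patch, separators=(",", ":"))


if __name__ == "__main__":
    print(build_args_patch(COLLECTOR_ARGS))
```

Because `replace` overwrites the whole `args` array, any OpAMP-related argument the chart added is dropped, which is the effect that appending through `extraArgs` could not achieve.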

Results:
- OTel Collector successfully receiving traces (97+ spans)
- Services connecting without UNAVAILABLE errors
- No ClickHouse authentication failures
- All pipelines active (traces, metrics, logs)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-09 11:51:03 +01:00
Urtzi Alfaro
43a3f35bd1 Improve monitoring 3 2026-01-09 11:18:20 +01:00
Urtzi Alfaro
8ca5d9c100 Improve monitoring 2 2026-01-09 07:26:11 +01:00
Urtzi Alfaro
4af860c010 Improve monitoring 2026-01-09 06:57:18 +01:00
Urtzi Alfaro
e8fda39e50 Improve metrics 2026-01-08 20:48:24 +01:00