Add new infra architecture 5
338
MAILU_DEPLOYMENT_ARCHITECTURE.md
Normal file
@@ -0,0 +1,338 @@
# Mailu Deployment Architecture for Bakery-IA Project

## Executive Summary

This document outlines the recommended architecture for deploying Mailu email services across the development and production environments of the Bakery-IA project. The solution addresses Mailu's DNSSEC validation requirement while keeping the setup consistent across different Kubernetes platforms.

## Environment Overview

### Development Environment
- **Platform**: Kind (Kubernetes in Docker) or Colima
- **Purpose**: Local development and testing
- **Characteristics**: Ephemeral, single-node, resource-constrained

### Production Environment
- **Platform**: MicroK8s on Ubuntu VPS
- **Purpose**: Production email services
- **Characteristics**: Single-node or small cluster, persistent storage, production-grade reliability

## Core Requirements

1. **DNSSEC Validation**: Mailu v1.9+ requires a DNSSEC-validating resolver
2. **Cross-Environment Consistency**: Unified approach for dev and prod
3. **Resource Efficiency**: Optimized for constrained environments
4. **Reliability**: Production-grade availability and monitoring

## Architectural Solution

### Unified DNS Resolution Strategy

**Recommended Approach**: Deploy Unbound as a dedicated DNSSEC-validating resolver pod in both environments

#### Benefits:
- ✅ Consistent behavior across dev and prod
- ✅ Meets Mailu's DNSSEC requirements
- ✅ Privacy-preserving when Unbound is configured for full recursion (it still queries authoritative servers, but not a third-party public resolver)
- ✅ Avoids rate-limiting from public DNS providers
- ✅ Full control over DNS resolution

### Implementation Components

#### 1. Unbound Deployment Manifest

```yaml
# unbound.yaml - Cross-environment compatible
apiVersion: apps/v1
kind: Deployment
metadata:
  name: unbound-resolver
  namespace: mailu
  labels:
    app: unbound
    component: dns
spec:
  replicas: 1  # Scale to 2+ in production with anti-affinity
  selector:
    matchLabels:
      app: unbound
  template:
    metadata:
      labels:
        app: unbound
        component: dns
    spec:
      containers:
        - name: unbound
          image: mvance/unbound:latest
          ports:
            - containerPort: 53
              name: dns-udp
              protocol: UDP
            - containerPort: 53
              name: dns-tcp
              protocol: TCP
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "300m"
              memory: "384Mi"
          readinessProbe:
            exec:
              # drill uses -D to request DNSSEC records; +dnssec is dig syntax
              command: ["drill", "-D", "-p", "53", "@127.0.0.1", "example.org"]
            initialDelaySeconds: 10
            periodSeconds: 30
          securityContext:
            capabilities:
              add: ["NET_BIND_SERVICE"]

---
apiVersion: v1
kind: Service
metadata:
  name: unbound-dns
  namespace: mailu
  labels:
    app: unbound  # lets label-based selectors (e.g. a ServiceMonitor) find this Service
spec:
  selector:
    app: unbound
  ports:
    - name: dns-udp
      port: 53
      targetPort: 53
      protocol: UDP
    - name: dns-tcp
      port: 53
      targetPort: 53
      protocol: TCP
```

#### 2. Mailu Configuration (values.yaml)

```yaml
# Production-tuned Mailu configuration
dnsPolicy: None
dnsConfig:
  nameservers:
    - "10.152.183.x"  # Replace with actual unbound service IP

# Component-specific DNS configuration
admin:
  dnsPolicy: None
  dnsConfig:
    nameservers:
      - "10.152.183.x"

rspamd:
  dnsPolicy: None
  dnsConfig:
    nameservers:
      - "10.152.183.x"

# Environment-specific configurations
persistence:
  enabled: true
  # Development: use default storage class
  # Production: use microk8s-hostpath or longhorn
  storageClass: "standard"

replicas: 1  # Increase in production as needed

# Security settings
secretKey: "generate-strong-key-here"

# Ingress configuration
# Use existing Bakery-IA ingress controller
```
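
The `secretKey` placeholder has to be replaced with a real random value before deployment. The following is a minimal sketch of generating one, assuming a 16-character alphanumeric key is acceptable; that length is an assumption to verify against the Mailu chart version in use, and `SECRET_KEY` is just an illustrative variable name.

```bash
# Draw random bytes from the kernel, keep only alphanumerics, stop at 16 characters.
SECRET_KEY=$(LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 16)

# Print only the length so the key itself does not end up in terminal scrollback.
echo "generated key of length ${#SECRET_KEY}"
```

The generated value can be written into an environment-specific values file that stays out of version control, rather than being committed alongside the shared `values.yaml`.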

### Environment-Specific Adaptations

#### Development (Kind/Colima)

**Optimizations:**
- Use hostPath volumes for persistence
- Reduce resource requests/limits
- Disable or simplify monitoring
- Use NodePort for external access

**Deployment:**
```bash
# Apply unbound
kubectl apply -f unbound.yaml

# Get unbound service IP
UNBOUND_IP=$(kubectl get svc unbound-dns -n mailu -o jsonpath='{.spec.clusterIP}')

# Deploy Mailu with dev-specific values
# (the --set argument is quoted so the shell does not treat [0] as a glob pattern)
helm upgrade --install mailu mailu/mailu \
  --namespace mailu \
  -f values-dev.yaml \
  --set "dnsConfig.nameservers[0]=${UNBOUND_IP}"
```
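
The `kubectl ... -o jsonpath` lookup prints nothing if the Service does not exist yet, and prints `None` for a headless Service; either value would be handed to Helm unchecked. A small guard can catch this before the Helm step. This is a sketch only: `is_ipv4` is a hypothetical helper, the address is a stand-in for the real lookup result, and the pattern assumes an IPv4 cluster.

```bash
# Succeeds only when the argument looks like a dotted-quad IPv4 address.
is_ipv4() {
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Hypothetical value standing in for the output of the kubectl jsonpath query.
UNBOUND_IP="10.96.0.53"

if is_ipv4 "$UNBOUND_IP"; then
  echo "resolver address looks valid: $UNBOUND_IP"
else
  echo "unexpected ClusterIP value: '$UNBOUND_IP'" >&2
fi
```

In a deployment script the `else` branch would normally stop the run instead of continuing to the Helm step.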

#### Production (MicroK8s/Ubuntu)

**Enhancements:**
- Use Longhorn or OpenEBS for storage
- Enable monitoring and logging
- Configure proper ingress with TLS
- Set up backup solutions

**Deployment:**
```bash
# Enable required MicroK8s addons
# (recent MicroK8s releases name the storage addon hostpath-storage,
# and metallb asks for an address range when none is given)
microk8s enable dns storage ingress metallb

# Apply unbound
kubectl apply -f unbound.yaml

# Get unbound service IP
UNBOUND_IP=$(kubectl get svc unbound-dns -n mailu -o jsonpath='{.spec.clusterIP}')

# Deploy Mailu with production values
# (the --set argument is quoted so the shell does not treat [0] as a glob pattern)
helm upgrade --install mailu mailu/mailu \
  --namespace mailu \
  -f values-prod.yaml \
  --set "dnsConfig.nameservers[0]=${UNBOUND_IP}"
```

## Verification Procedures

### DNSSEC Validation Test

```bash
# From within a Mailu pod
kubectl exec -it -n mailu deploy/mailu-admin -- bash

# Test DNSSEC validation (+short is omitted because it hides the header flags)
dig @unbound-dns +dnssec +adflag example.org A

# The ";; flags:" line of the response header should include "ad"
```
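
The AD-flag check can also be scripted instead of read by eye. The sketch below uses a hypothetical helper, `has_ad_flag`, that inspects the `;; flags:` header line of `dig` output; the sample line is illustrative text shaped like a `dig` header, not captured output from this deployment.

```bash
# Reads dig output on stdin; succeeds if the header's flags list contains "ad".
has_ad_flag() {
  grep -Eq '^;; flags:[^;]* ad[ ;]'
}

# Illustrative header line, shaped like dig's output for a validated answer.
sample=';; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1'

if printf '%s\n' "$sample" | has_ad_flag; then
  echo "AD flag present: the resolver validated the answer"
fi
```

Against a live resolver, the same helper would be fed from the `dig` command shown in the test above, for example `dig @unbound-dns +dnssec +adflag example.org A | has_ad_flag`.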

### Service Health Checks

```bash
# Check unbound service
kubectl get pods -n mailu -l app=unbound
kubectl logs -n mailu -l app=unbound

# Check Mailu components
kubectl get pods -n mailu
kubectl logs -n mailu -l app.kubernetes.io/name=mailu
```

## Monitoring and Maintenance

### Production Monitoring Setup

```yaml
# Example monitoring configuration for production.
# Unbound does not serve Prometheus metrics on its DNS port, so this
# assumes a metrics exporter sidecar exposed through a Service port named
# "metrics", and that the unbound-dns Service carries the app: unbound
# label (a ServiceMonitor selects Services by label).
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: unbound-monitor
  namespace: mailu
spec:
  selector:
    matchLabels:
      app: unbound
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics
```

### Backup Strategy

**Production:**
- Daily Velero backups of Mailu namespace
- Weekly database dumps
- Monthly full cluster snapshots

**Development:**
- On-demand backups before major changes
- Volume snapshots for critical data

## Troubleshooting Guide

### Common Issues and Solutions

**Issue: DNSSEC validation failures**
- Verify unbound pod logs
- Check network policies
- Test DNS resolution from within pods
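
When testing resolution from inside a pod, the response code in the `dig` header helps separate a validation problem from a connectivity problem. A validating resolver answers `SERVFAIL` for a zone whose signatures do not verify (the public test zone `dnssec-failed.org` is deliberately broken for this purpose) and `NOERROR` for a correctly signed name such as `example.org`. The sketch below extracts that code with a hypothetical helper, `dig_status`; the two header lines are illustrative samples, not captured output.

```bash
# Reads dig output on stdin and prints the response code from the header line.
dig_status() {
  sed -n 's/^;; ->>HEADER<<- .*status: \([A-Z]*\),.*/\1/p'
}

# Illustrative header lines, shaped like dig's output.
ok=';; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4242'
bad=';; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 4243'

printf '%s\n' "$ok"  | dig_status   # prints NOERROR
printf '%s\n' "$bad" | dig_status   # prints SERVFAIL
```

If the broken test zone comes back as `NOERROR`, the resolver being queried is probably not validating; if every name fails or times out, the network policies and Service reachability are the more likely cause.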

**Issue: Mailu pods failing to start**
- Confirm DNS configuration in values.yaml
- Verify unbound service is reachable
- Check resource availability

**Issue: Performance problems**
- Monitor CPU/memory usage
- Adjust resource limits
- Consider scaling replicas

## Migration Path

### From Development to Production

1. **Configuration Migration**
   - Update storage class from hostPath to production storage
   - Adjust resource requests/limits
   - Enable monitoring and logging

2. **Data Migration**
   - Export development data
   - Import into production environment
   - Verify data integrity

3. **DNS Configuration**
   - Update DNS records to point to production
   - Verify TLS certificates
   - Test email delivery

## Security Considerations

### Production Security Hardening

1. **Network Security**
   - Implement network policies
   - Restrict ingress/egress traffic
   - Use TLS for all external communications

2. **Access Control**
   - Implement RBAC for Mailu namespace
   - Restrict admin access
   - Use strong authentication

3. **Monitoring and Alerting**
   - Set up anomaly detection
   - Configure alert thresholds
   - Implement log retention policies

## Cost Optimization

### Resource Management

**Development:**
- Use minimal resource allocations
- Scale down when not in use
- Clean up unused resources

**Production:**
- Right-size resource requests
- Implement auto-scaling where possible
- Monitor and optimize usage patterns

## Conclusion

This architecture provides a robust, consistent solution for deploying Mailu across development and production environments. By using Unbound as a dedicated DNSSEC-validating resolver, we ensure compliance with Mailu's requirements while maintaining flexibility and reliability across different Kubernetes platforms.

The solution is designed to be:
- **Consistent**: Same core architecture across environments
- **Reliable**: Production-grade availability and monitoring
- **Efficient**: Optimized resource usage
- **Maintainable**: Clear documentation and troubleshooting guides

This approach aligns with the Bakery-IA project's requirements for a secure, reliable email infrastructure that can be consistently deployed across different environments.
6
Tiltfile
@@ -473,13 +473,16 @@ k8s_image_json_path(
|
||||
|
||||
# Redis & RabbitMQ
|
||||
k8s_resource('redis', resource_deps=['security-setup'], labels=['01-infrastructure'])
|
||||
k8s_resource('rabbitmq', labels=['01-infrastructure'])
|
||||
k8s_resource('rabbitmq', resource_deps=['security-setup'], labels=['01-infrastructure'])
|
||||
k8s_resource('nominatim', labels=['01-infrastructure'])
|
||||
|
||||
# MinIO Storage
|
||||
k8s_resource('minio', resource_deps=['security-setup'], labels=['01-infrastructure'])
|
||||
k8s_resource('minio-bucket-init', resource_deps=['minio'], labels=['01-infrastructure'])
|
||||
|
||||
# Unbound DNSSEC Resolver - Infrastructure component for Mailu DNS validation
|
||||
k8s_resource('unbound-resolver', resource_deps=['security-setup'], labels=['01-infrastructure'])
|
||||
|
||||
# Mail Infrastructure (Mailu) - Manual trigger for Helm deployment
|
||||
local_resource(
|
||||
'mailu-helm',
|
||||
@@ -542,6 +545,7 @@ local_resource(
|
||||
auto_init=False, # Manual trigger only
|
||||
)
|
||||
|
||||
|
||||
# =============================================================================
|
||||
# MONITORING RESOURCES - SigNoz (Unified Observability)
|
||||
# =============================================================================
|
||||
|
||||
@@ -433,6 +433,45 @@ microk8s enable prometheus
|
||||
microk8s enable registry
|
||||
```
|
||||
|
||||
### Step 3: Enhanced Infrastructure Components
|
||||
|
||||
**The platform includes additional infrastructure components that enhance security, monitoring, and operations:**
|
||||
|
||||
```bash
|
||||
# The platform includes Mailu for email services
|
||||
# Deploy Mailu via Helm (optional but recommended for production):
|
||||
kubectl create namespace bakery-ia --dry-run=client -o yaml | kubectl apply -f -
|
||||
helm repo add mailu https://mailu.github.io/helm-charts
|
||||
helm repo update
|
||||
helm install mailu mailu/mailu \
|
||||
-n bakery-ia \
|
||||
-f infrastructure/platform/mail/mailu-helm/values.yaml \
|
||||
--timeout 10m \
|
||||
--wait
|
||||
|
||||
# Verify Mailu deployment
|
||||
kubectl get pods -n bakery-ia | grep mailu
|
||||
```
|
||||
|
||||
**For development environments, ensure the prepull-base-images script is run:**
|
||||
```bash
|
||||
# On your local machine, run the prepull script to cache base images
|
||||
cd bakery-ia
|
||||
chmod +x scripts/prepull-base-images.sh
|
||||
./scripts/prepull-base-images.sh
|
||||
```
|
||||
|
||||
**For production environments, ensure CI/CD infrastructure is properly configured:**
|
||||
```bash
|
||||
# Tekton Pipelines for CI/CD (optional - can be deployed separately)
|
||||
kubectl create namespace tekton-pipelines
|
||||
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml
|
||||
kubectl apply -f https://storage.googleapis.com/tekton-releases/triggers/latest/release.yaml
|
||||
|
||||
# Flux CD for GitOps (already enabled in MicroK8s if needed)
|
||||
# flux install --namespace=flux-system --network-policy=false
|
||||
```
|
||||
|
||||
### Step 3: Configure Firewall
|
||||
|
||||
```bash
|
||||
@@ -917,7 +956,34 @@ echo -n "your-value-here" | base64
|
||||
|
||||
**CRITICAL:** Never commit real secrets to git! The secrets.yaml file should be in `.gitignore`.
|
||||
|
||||
### Step 2: Apply Application Secrets
|
||||
### Step 2: CI/CD Secrets Configuration
|
||||
|
||||
**For production CI/CD setup, additional secrets are required:**
|
||||
|
||||
```bash
|
||||
# Create Docker Hub credentials secret (for image pulls)
|
||||
kubectl create secret docker-registry dockerhub-creds \
|
||||
--docker-server=docker.io \
|
||||
--docker-username=YOUR_DOCKERHUB_USERNAME \
|
||||
--docker-password=YOUR_DOCKERHUB_TOKEN \
|
||||
--docker-email=your-email@example.com \
|
||||
-n bakery-ia
|
||||
|
||||
# Create Gitea registry credentials (if using Gitea for CI/CD)
|
||||
kubectl create secret docker-registry gitea-registry-credentials \
|
||||
-n tekton-pipelines \
|
||||
--docker-server=gitea.bakery-ia.local:5000 \
|
||||
--docker-username=your-username \
|
||||
--docker-password=your-password
|
||||
|
||||
# Create Git credentials for Flux (if using GitOps)
|
||||
kubectl create secret generic gitea-credentials \
|
||||
-n flux-system \
|
||||
--from-literal=username=your-username \
|
||||
--from-literal=password=your-password
|
||||
```
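
For context on what these commands store: `kubectl create secret docker-registry` writes a `.dockerconfigjson` document whose `auth` field is the base64 encoding of `username:password`. The sketch below reproduces that one field with placeholder credentials (`REGISTRY_USER` and `REGISTRY_PASS` are illustrative names and values), which also shows that the encoding is reversible and offers no protection if the manifest is committed.

```bash
# Placeholder credentials for illustration only; never use real values here.
REGISTRY_USER="user"
REGISTRY_PASS="pass"

# The auth field is base64("username:password"); printf avoids a trailing newline.
AUTH=$(printf '%s' "${REGISTRY_USER}:${REGISTRY_PASS}" | base64)

echo "$AUTH"   # prints dXNlcjpwYXNz
```

Anyone who can read the Secret manifest can decode this value, so the same care applies to it as to the plain-text password.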
|
||||
|
||||
### Step 3: Apply Application Secrets
|
||||
|
||||
```bash
|
||||
# Copy manifests to VPS (from local machine)
|
||||
@@ -938,7 +1004,30 @@ kubectl get secrets -n bakery-ia
|
||||
|
||||
## Database Migrations
|
||||
|
||||
### Step 0: Deploy SigNoz Monitoring (BEFORE Application)
|
||||
### Step 0: Deploy CI/CD Infrastructure (Optional but Recommended)
|
||||
|
||||
**For production environments, deploy CI/CD infrastructure components:**
|
||||
|
||||
```bash
|
||||
# Deploy Tekton Pipelines for CI/CD (optional but recommended for production)
|
||||
kubectl create namespace tekton-pipelines
|
||||
|
||||
# Install Tekton Pipelines
|
||||
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml
|
||||
|
||||
# Install Tekton Triggers
|
||||
kubectl apply -f https://storage.googleapis.com/tekton-releases/triggers/latest/release.yaml
|
||||
|
||||
# Apply Tekton configurations
|
||||
kubectl apply -f ~/infrastructure/cicd/tekton/tasks/
|
||||
kubectl apply -f ~/infrastructure/cicd/tekton/pipelines/
|
||||
kubectl apply -f ~/infrastructure/cicd/tekton/triggers/
|
||||
|
||||
# Verify Tekton deployment
|
||||
kubectl get pods -n tekton-pipelines
|
||||
```
|
||||
|
||||
### Step 1: Deploy SigNoz Monitoring (BEFORE Application)
|
||||
|
||||
**⚠️ CRITICAL:** SigNoz must be deployed BEFORE the application into the **bakery-ia namespace** because the production kustomization patches SigNoz resources.
|
||||
|
||||
@@ -975,7 +1064,7 @@ kubectl get statefulset -n bakery-ia | grep signoz
|
||||
|
||||
**⚠️ Important:** Do NOT create a separate `signoz` namespace. SigNoz must be in `bakery-ia` namespace for the overlays to work correctly.
|
||||
|
||||
### Step 1: Deploy Application and Databases
|
||||
### Step 2: Deploy Application and Databases
|
||||
|
||||
```bash
|
||||
# On VPS
|
||||
@@ -1271,6 +1360,88 @@ kubectl logs -n bakery-ia deployment/signoz-otel-collector --tail=50 | grep -i "
|
||||
kubectl logs -n bakery-ia deployment/signoz-otel-collector | grep filelog
|
||||
```
|
||||
|
||||
### Step 2: Configure CI/CD Infrastructure (Optional but Recommended)
|
||||
|
||||
If you deployed the CI/CD infrastructure, configure it for your workflow:
|
||||
|
||||
#### Gitea Setup (Git Server + Registry)
|
||||
```bash
|
||||
# Access Gitea at: http://gitea.bakery-ia.local (for dev) or http://gitea.bakewise.ai (for prod)
|
||||
# Make sure to add the appropriate hostname to /etc/hosts or configure DNS
|
||||
|
||||
# Create your repositories for each service
|
||||
# Configure webhook to trigger Tekton pipelines
|
||||
```
|
||||
|
||||
#### Tekton Pipeline Configuration
|
||||
```bash
|
||||
# Verify Tekton pipelines are running
|
||||
kubectl get pods -n tekton-pipelines
|
||||
|
||||
# Create a PipelineRun manually to test:
|
||||
kubectl create -f - <<EOF
|
||||
apiVersion: tekton.dev/v1beta1
|
||||
kind: PipelineRun
|
||||
metadata:
|
||||
name: manual-ci-run
|
||||
namespace: tekton-pipelines
|
||||
spec:
|
||||
pipelineRef:
|
||||
name: bakery-ia-ci
|
||||
workspaces:
|
||||
- name: shared-workspace
|
||||
volumeClaimTemplate:
|
||||
spec:
|
||||
accessModes: ["ReadWriteOnce"]
|
||||
resources:
|
||||
requests:
|
||||
storage: 5Gi
|
||||
- name: docker-credentials
|
||||
secret:
|
||||
secretName: gitea-registry-credentials
|
||||
params:
|
||||
- name: git-url
|
||||
value: "http://gitea.bakery-ia.local/bakery/bakery-ia.git"
|
||||
- name: git-revision
|
||||
value: "main"
|
||||
EOF
|
||||
```
|
||||
|
||||
#### Flux CD Configuration (GitOps)
|
||||
```bash
|
||||
# Verify Flux is running
|
||||
kubectl get pods -n flux-system
|
||||
|
||||
# Set up GitRepository and Kustomization resources for GitOps deployment
|
||||
# Example:
|
||||
cat <<EOF | kubectl apply -f -
|
||||
apiVersion: source.toolkit.fluxcd.io/v1
|
||||
kind: GitRepository
|
||||
metadata:
|
||||
name: bakery-ia
|
||||
namespace: flux-system
|
||||
spec:
|
||||
interval: 1m
|
||||
url: https://github.com/your-org/bakery-ia.git
|
||||
ref:
|
||||
branch: main
|
||||
---
|
||||
apiVersion: kustomize.toolkit.fluxcd.io/v1
|
||||
kind: Kustomization
|
||||
metadata:
|
||||
name: bakery-ia
|
||||
namespace: flux-system
|
||||
spec:
|
||||
interval: 5m
|
||||
sourceRef:
|
||||
kind: GitRepository
|
||||
name: bakery-ia
|
||||
path: ./infrastructure/environments/prod/k8s-manifests
|
||||
prune: true
|
||||
EOF
|
||||
```
|
||||
|
||||
### Step 2: Configure Alerting
|
||||
|
||||
SigNoz includes integrated alerting with AlertManager. Configure it for your team:
|
||||
|
||||
@@ -12,14 +12,15 @@
|
||||
|
||||
1. [Overview](#overview)
|
||||
2. [Monitoring & Observability](#monitoring--observability)
|
||||
3. [Security Operations](#security-operations)
|
||||
4. [Database Management](#database-management)
|
||||
5. [Backup & Recovery](#backup--recovery)
|
||||
6. [Performance Optimization](#performance-optimization)
|
||||
7. [Scaling Operations](#scaling-operations)
|
||||
8. [Incident Response](#incident-response)
|
||||
9. [Maintenance Tasks](#maintenance-tasks)
|
||||
10. [Compliance & Audit](#compliance--audit)
|
||||
3. [CI/CD Operations](#cicd-operations)
|
||||
4. [Security Operations](#security-operations)
|
||||
5. [Database Management](#database-management)
|
||||
6. [Backup & Recovery](#backup--recovery)
|
||||
7. [Performance Optimization](#performance-optimization)
|
||||
8. [Scaling Operations](#scaling-operations)
|
||||
9. [Incident Response](#incident-response)
|
||||
10. [Maintenance Tasks](#maintenance-tasks)
|
||||
11. [Compliance & Audit](#compliance--audit)
|
||||
|
||||
---
|
||||
|
||||
@@ -33,6 +34,8 @@
|
||||
- **Capacity:** 10-tenant pilot (scalable to 100+)
|
||||
- **Security:** TLS encryption, RBAC, audit logging
|
||||
- **Monitoring:** Prometheus, Grafana, AlertManager, SigNoz
|
||||
- **CI/CD:** Tekton Pipelines, Gitea, Flux CD (GitOps)
|
||||
- **Email:** Mailu (integrated email server)
|
||||
|
||||
**Key Metrics (10-tenant baseline):**
|
||||
- **Uptime Target:** 99.5% (3.65 hours downtime/month)
|
||||
@@ -46,11 +49,12 @@
|
||||
|
||||
| Role | Responsibilities |
|
||||
|------|------------------|
|
||||
| **DevOps Engineer** | Deployment, infrastructure, scaling |
|
||||
| **DevOps Engineer** | Deployment, infrastructure, scaling, CI/CD |
|
||||
| **SRE** | Monitoring, incident response, performance |
|
||||
| **Security Admin** | Access control, security patches, compliance |
|
||||
| **Database Admin** | Backups, optimization, migrations |
|
||||
| **On-Call Engineer** | 24/7 incident response (if applicable) |
|
||||
| **CI/CD Admin** | Pipeline management, GitOps workflows |
|
||||
|
||||
---
|
||||
|
||||
@@ -73,18 +77,6 @@ SigNoz is a comprehensive, open-source observability platform that provides:
|
||||
- **Database Monitoring** - All 18 PostgreSQL databases + Redis + RabbitMQ
|
||||
- **Kubernetes Monitoring** - Cluster, node, pod, and container metrics
|
||||
|
||||
**Port Forwarding (if ingress not available):**
|
||||
```bash
|
||||
# SigNoz Frontend (Main UI)
|
||||
kubectl port-forward -n bakery-ia svc/signoz 8080:8080
|
||||
|
||||
# SigNoz AlertManager
|
||||
kubectl port-forward -n bakery-ia svc/signoz-alertmanager 9093:9093
|
||||
|
||||
# OTel Collector (for debugging)
|
||||
kubectl port-forward -n bakery-ia svc/signoz-otel-collector 4317:4317 # gRPC
|
||||
kubectl port-forward -n bakery-ia svc/signoz-otel-collector 4318:4318 # HTTP
|
||||
```
|
||||
|
||||
### Key SigNoz Dashboards and Features
|
||||
|
||||
@@ -340,6 +332,116 @@ kubectl logs -n bakery-ia deployment/signoz-otel-collector | grep k8sattributes
|
||||
|
||||
---
|
||||
|
||||
## CI/CD Operations
|
||||
|
||||
### CI/CD Infrastructure Overview
|
||||
|
||||
The platform includes a complete CI/CD pipeline using:
|
||||
- **Gitea** - Git server and container registry
|
||||
- **Tekton** - Pipeline automation
|
||||
- **Flux CD** - GitOps deployment
|
||||
|
||||
### Access CI/CD Systems
|
||||
|
||||
**Gitea (Git Server):**
|
||||
- URL: http://gitea.bakery-ia.local (development) or http://gitea.bakewise.ai (production)
|
||||
- Admin panel: http://gitea.bakery-ia.local/admin
|
||||
|
||||
**Tekton Dashboard:**
|
||||
```bash
|
||||
# Port forward to access Tekton dashboard
|
||||
kubectl port-forward -n tekton-pipelines svc/tekton-dashboard 9097:9097
|
||||
# Access at: http://localhost:9097
|
||||
```
|
||||
|
||||
**Flux Status:**
|
||||
```bash
|
||||
# Check Flux status
|
||||
flux check
|
||||
kubectl get gitrepository -n flux-system
|
||||
kubectl get kustomization -n flux-system
|
||||
```
|
||||
|
||||
### CI/CD Monitoring
|
||||
|
||||
**Check pipeline status:**
|
||||
```bash
|
||||
# List all PipelineRuns
|
||||
kubectl get pipelineruns -n tekton-pipelines
|
||||
|
||||
# Check Tekton controller logs
|
||||
kubectl logs -n tekton-pipelines -l app=tekton-pipelines-controller
|
||||
|
||||
# Check Tekton dashboard logs
|
||||
kubectl logs -n tekton-pipelines -l app=tekton-dashboard
|
||||
```
|
||||
|
||||
**Monitor GitOps synchronization:**
|
||||
```bash
|
||||
# Check GitRepository status
|
||||
kubectl get gitrepository -n flux-system -o wide
|
||||
|
||||
# Check Kustomization status
|
||||
kubectl get kustomization -n flux-system -o wide
|
||||
|
||||
# Get reconciliation history
|
||||
kubectl get events -n flux-system --sort-by='.lastTimestamp'
|
||||
```
|
||||
|
||||
### CI/CD Troubleshooting
|
||||
|
||||
**Pipeline not triggering:**
|
||||
```bash
|
||||
# Check Tekton Triggers controller logs (incoming Gitea webhooks are handled by Tekton Triggers)
|
||||
kubectl logs -n tekton-pipelines -l app=tekton-triggers-controller
|
||||
|
||||
# Verify EventListener pods are running
|
||||
kubectl get pods -n tekton-pipelines -l app=tekton-triggers-eventlistener
|
||||
|
||||
# Check TriggerBinding configuration
|
||||
kubectl get triggerbinding -n tekton-pipelines
|
||||
```
|
||||
|
||||
**Build failures:**
|
||||
```bash
|
||||
# Check Kaniko logs for build errors
|
||||
kubectl logs -n tekton-pipelines -l tekton.dev/task=kaniko-build
|
||||
|
||||
# Verify Dockerfile paths are correct
|
||||
kubectl describe taskrun -n tekton-pipelines
|
||||
```
|
||||
|
||||
**Flux not applying changes:**
|
||||
```bash
|
||||
# Check GitRepository status
|
||||
kubectl describe gitrepository -n flux-system
|
||||
|
||||
# Check Kustomization reconciliation
|
||||
kubectl describe kustomization -n flux-system
|
||||
|
||||
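# Check Flux logs (source-controller fetches Git; kustomize-controller applies manifests)
kubectl logs -n flux-system deployment/source-controller
kubectl logs -n flux-system deployment/kustomize-controller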
|
||||
```
|
||||
|
||||
### CI/CD Maintenance Tasks
|
||||
|
||||
**Daily Tasks:**
|
||||
- [ ] Check for failed pipeline runs
|
||||
- [ ] Verify GitOps synchronization status
|
||||
- [ ] Clean up old PipelineRun resources
|
||||
|
||||
**Weekly Tasks:**
|
||||
- [ ] Review pipeline performance metrics
|
||||
- [ ] Update pipeline definitions if needed
|
||||
- [ ] Rotate CI/CD secrets
|
||||
|
||||
**Monthly Tasks:**
|
||||
- [ ] Update Tekton and Flux versions
|
||||
- [ ] Review and optimize pipeline performance
|
||||
- [ ] Audit CI/CD access permissions
|
||||
|
||||
---
|
||||
|
||||
## Security Operations
|
||||
|
||||
### Security Posture Overview
|
||||
@@ -1210,6 +1312,8 @@ kubectl exec -n bakery-ia deployment/auth-db -- \
|
||||
- [TLS Configuration](./tls-configuration.md) - Certificate management
|
||||
- [RBAC Implementation](./rbac-implementation.md) - Access control configuration
|
||||
- [Monitoring Stack README](../infrastructure/kubernetes/base/components/monitoring/README.md) - Detailed monitoring documentation
|
||||
- [CI/CD Infrastructure README](../infrastructure/cicd/README.md) - Gitea, Tekton, and Flux CD setup and operations
|
||||
- [SigNoz Monitoring README](../infrastructure/monitoring/signoz/README.md) - SigNoz deployment and configuration
|
||||
|
||||
**External Resources:**
|
||||
- Kubernetes: https://kubernetes.io/docs
|
||||
|
||||
@@ -28,6 +28,7 @@ metadata:
|
||||
note: "Registry credentials for pushing images"
|
||||
type: kubernetes.io/dockerconfigjson
|
||||
stringData:
|
||||
{{- if and .Values.secrets.registry.registryUrl .Values.secrets.registry.username .Values.secrets.registry.password }}
|
||||
.dockerconfigjson: |
|
||||
{
|
||||
"auths": {
|
||||
@@ -37,6 +38,9 @@ stringData:
|
||||
}
|
||||
}
|
||||
}
|
||||
{{- else }}
|
||||
.dockerconfigjson: '{"auths":{}}'
|
||||
{{- end }}
|
||||
---
|
||||
# Secret for Git credentials (used by pipeline to push GitOps updates)
|
||||
apiVersion: v1
|
||||
|
||||
@@ -83,6 +83,10 @@ images:
|
||||
- name: bitnami/kubectl
|
||||
newName: localhost:5000/bitnami_kubectl_latest
|
||||
newTag: latest
|
||||
# DNS resolver
|
||||
- name: mvance/unbound
|
||||
newName: localhost:5000/mvance_unbound_latest
|
||||
newTag: latest
|
||||
# Alpine variants
|
||||
- name: alpine
|
||||
newName: localhost:5000/alpine_3.19
|
||||
|
||||
@@ -221,6 +221,9 @@ images:
|
||||
newTag: latest
|
||||
- name: bitnami/kubectl
|
||||
newTag: latest
|
||||
# DNS resolver
|
||||
- name: mvance/unbound
|
||||
newTag: latest
|
||||
# Alpine variants
|
||||
- name: alpine
|
||||
newTag: "3.19"
|
||||
|
||||
@@ -10,844 +10,3 @@ global:
|
||||
clusterName: "bakery-ia-dev"
|
||||
domain: "monitoring.bakery-ia.local"
|
||||
# Docker Hub credentials - applied to all sub-charts (including Zookeeper, ClickHouse, etc)
|
||||
imagePullSecrets:
|
||||
- dockerhub-creds
|
||||
|
||||
# Docker Hub credentials for pulling images (root level for SigNoz components)
|
||||
imagePullSecrets:
|
||||
- dockerhub-creds
|
||||
|
||||
# SigNoz Main Component (includes frontend and query service)
|
||||
signoz:
|
||||
replicaCount: 1
|
||||
|
||||
service:
|
||||
type: ClusterIP
|
||||
port: 8080
|
||||
|
||||
# DISABLE built-in ingress - using unified bakery-ingress instead
|
||||
# Route configured in infrastructure/kubernetes/overlays/dev/dev-ingress.yaml
|
||||
ingress:
|
||||
enabled: false
|
||||
|
||||
resources:
|
||||
requests:
|
||||
cpu: 100m # Combined frontend + query service
|
||||
memory: 256Mi
|
||||
limits:
|
||||
cpu: 1000m
|
||||
memory: 1Gi
|
||||
|
||||
# Environment variables (new format - replaces configVars)
|
||||
env:
|
||||
signoz_telemetrystore_provider: "clickhouse"
|
||||
dot_metrics_enabled: "true"
|
||||
signoz_emailing_enabled: "false"
|
||||
signoz_alertmanager_provider: "signoz"
|
||||
# Retention for dev (7 days)
|
||||
signoz_traces_ttl_duration_hrs: "168"
|
||||
signoz_metrics_ttl_duration_hrs: "168"
|
||||
signoz_logs_ttl_duration_hrs: "168"
|
||||
# OpAMP Server Configuration - DISABLED for dev (causes gRPC instability)
|
||||
signoz_opamp_server_enabled: "false"
|
||||
# signoz_opamp_server_endpoint: "0.0.0.0:4320"
|
||||
|
||||
persistence:
|
||||
enabled: true
|
||||
size: 5Gi
|
||||
storageClass: "standard"
|
||||
|
||||
# AlertManager Configuration
|
||||
alertmanager:
|
||||
replicaCount: 1
|
||||
image:
|
||||
repository: signoz/alertmanager
|
||||
tag: 0.23.5
|
||||
pullPolicy: IfNotPresent
|
||||
|
||||
service:
|
||||
type: ClusterIP
|
||||
port: 9093
|
||||
|
||||
resources:
|
||||
requests:
|
||||
cpu: 25m # Reduced for local dev
|
||||
memory: 64Mi # Reduced for local dev
|
||||
limits:
|
||||
cpu: 200m
|
||||
memory: 256Mi
|
||||
|
||||
persistence:
|
||||
enabled: true
|
||||
size: 2Gi
|
||||
storageClass: "standard"
|
||||
|
||||
config:
|
||||
global:
|
||||
resolve_timeout: 5m
|
||||
route:
|
||||
group_by: ['alertname', 'cluster', 'service']
|
||||
group_wait: 10s
|
||||
group_interval: 10s
|
||||
repeat_interval: 12h
|
||||
receiver: 'default'
|
||||
receivers:
|
||||
- name: 'default'
|
||||
# Add email, slack, webhook configs here
|
||||
|
||||
# ClickHouse Configuration - Time Series Database
|
||||
# Minimal resources for local development on constrained Kind cluster
|
||||
clickhouse:
|
||||
enabled: true
|
||||
installCustomStorageClass: false
|
||||
|
||||
image:
|
||||
registry: docker.io
|
||||
repository: clickhouse/clickhouse-server
|
||||
tag: 25.5.6 # Official recommended version
|
||||
|
||||
# Reduce ClickHouse resource requests for local dev
|
||||
clickhouse:
|
||||
resources:
|
||||
requests:
|
||||
cpu: 200m # Reduced from default 500m
|
||||
memory: 512Mi
|
||||
limits:
|
||||
cpu: 1000m
|
||||
memory: 1Gi
|
||||
|
||||
persistence:
|
||||
enabled: true
|
||||
size: 20Gi
|
||||
|
||||
# Zookeeper Configuration (required by ClickHouse)
|
||||
zookeeper:
|
||||
enabled: true
|
||||
replicaCount: 1 # Single replica for dev
|
||||
|
||||
image:
|
||||
tag: 3.7.1 # Official recommended version
|
||||
|
||||
resources:
|
||||
requests:
|
||||
cpu: 100m
|
||||
memory: 256Mi
|
||||
limits:
|
||||
cpu: 500m
|
||||
memory: 512Mi
|
||||
|
||||
persistence:
|
||||
enabled: true
|
||||
size: 5Gi
|
||||
|
||||
# OpenTelemetry Collector - Data ingestion endpoint for all telemetry
|
||||
otelCollector:
|
||||
enabled: true
|
||||
replicaCount: 1
|
||||
|
||||
image:
|
||||
repository: signoz/signoz-otel-collector
|
||||
tag: v0.129.12 # Latest recommended version
|
||||
|
||||
# OpAMP Configuration - DISABLED for development
|
||||
# OpAMP is designed for production with remote config management
|
||||
# In dev, it causes gRPC instability and collector reloads
|
||||
# We use static configuration instead
|
||||
|
||||
# Init containers for the Otel Collector pod
|
||||
initContainers:
|
||||
fix-postgres-tls:
|
||||
enabled: true
|
||||
image:
|
||||
registry: docker.io
|
||||
repository: busybox
|
||||
tag: 1.35
|
||||
pullPolicy: IfNotPresent
|
||||
command:
|
||||
- sh
|
||||
- -c
|
||||
- |
|
||||
echo "Fixing PostgreSQL TLS file permissions..."
|
||||
cp /etc/postgres-tls-source/* /etc/postgres-tls/
|
||||
chmod 600 /etc/postgres-tls/server-key.pem
|
||||
chmod 644 /etc/postgres-tls/server-cert.pem
|
||||
chmod 644 /etc/postgres-tls/ca-cert.pem
|
||||
echo "PostgreSQL TLS permissions fixed"
|
||||
volumeMounts:
|
||||
- name: postgres-tls-source
|
||||
mountPath: /etc/postgres-tls-source
|
||||
readOnly: true
|
||||
- name: postgres-tls-fixed
|
||||
mountPath: /etc/postgres-tls
|
||||
readOnly: false
|
||||
|
||||
# Service configuration - expose both gRPC and HTTP endpoints
|
||||
service:
|
||||
type: ClusterIP
|
||||
ports:
|
||||
# gRPC receivers
|
||||
- name: otlp-grpc
|
||||
port: 4317
|
||||
targetPort: 4317
|
||||
protocol: TCP
|
||||
# HTTP receivers
|
||||
- name: otlp-http
|
||||
port: 4318
|
||||
targetPort: 4318
|
||||
protocol: TCP
|
||||
# Prometheus remote write
|
||||
- name: prometheus
|
||||
port: 8889
|
||||
targetPort: 8889
|
||||
protocol: TCP
|
||||
# Metrics
|
||||
- name: metrics
|
||||
port: 8888
|
||||
targetPort: 8888
|
||||
protocol: TCP
|
||||
|
||||
resources:
|
||||
requests:
|
||||
cpu: 50m # Reduced from 100m
|
||||
memory: 128Mi # Reduced from 256Mi
|
||||
limits:
|
||||
cpu: 500m
|
||||
memory: 512Mi
|
||||
|
||||
# Additional environment variables for receivers
|
||||
additionalEnvs:
|
||||
POSTGRES_MONITOR_USER: "monitoring"
|
||||
POSTGRES_MONITOR_PASSWORD: "monitoring_369f9c001f242b07ef9e2826e17169ca"
|
||||
REDIS_PASSWORD: "OxdmdJjdVNXp37MNC2IFoMnTpfGGFv1k"
|
||||
RABBITMQ_USER: "bakery"
|
||||
RABBITMQ_PASSWORD: "forecast123"
|
||||
|
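The `additionalEnvs` block above places credentials directly in the values file. One possible alternative, sketched here under assumptions, keeps them in a Kubernetes Secret and references them through the standard `valueFrom.secretKeyRef` form. The Secret name `otel-monitoring-credentials` and its keys are hypothetical, and whether this chart's `additionalEnvs` accepts map-valued entries of this shape should be checked against the chart version in use.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: otel-monitoring-credentials   # hypothetical name
  namespace: bakery-ia
type: Opaque
stringData:
  # Placeholders only; real values would be supplied out of band.
  postgres-monitor-password: "<set-out-of-band>"
  redis-password: "<set-out-of-band>"
  rabbitmq-password: "<set-out-of-band>"
---
# Fragment (not a complete manifest): container-level env entry in
# plain Kubernetes Pod-spec form, reading one key from the Secret above.
env:
  - name: POSTGRES_MONITOR_PASSWORD
    valueFrom:
      secretKeyRef:
        name: otel-monitoring-credentials
        key: postgres-monitor-password
```

The receivers would keep reading `${env:POSTGRES_MONITOR_PASSWORD}` unchanged; only the source of the environment variable differs.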
||||
# Mount TLS certificates for secure connections
|
||||
extraVolumes:
|
||||
- name: redis-tls
|
||||
secret:
|
||||
secretName: redis-tls-secret
|
||||
- name: postgres-tls
|
||||
secret:
|
||||
secretName: postgres-tls
|
||||
- name: postgres-tls-fixed
|
||||
emptyDir: {}
|
||||
- name: varlogpods
|
||||
hostPath:
|
||||
path: /var/log/pods
|
||||
|
||||
extraVolumeMounts:
|
||||
- name: redis-tls
|
||||
mountPath: /etc/redis-tls
|
||||
readOnly: true
|
||||
- name: postgres-tls
|
||||
mountPath: /etc/postgres-tls-source
|
||||
readOnly: true
|
||||
- name: postgres-tls-fixed
|
||||
mountPath: /etc/postgres-tls
|
||||
readOnly: false
|
||||
- name: varlogpods
|
||||
mountPath: /var/log/pods
|
||||
readOnly: true
|
||||
|
||||
# Disable OpAMP - use static configuration only
|
||||
# Use 'args' instead of 'extraArgs' so the default arguments are replaced rather than extended
|
||||
command:
|
||||
name: /signoz-otel-collector
|
||||
args:
|
||||
- --config=/conf/otel-collector-config.yaml
|
||||
- --feature-gates=-pkg.translator.prometheus.NormalizeName
|
||||
|
||||
# OpenTelemetry Collector configuration
|
||||
config:
|
||||
# Connectors - bridge between pipelines
|
||||
connectors:
|
||||
signozmeter:
|
||||
dimensions:
|
||||
- name: service.name
|
||||
- name: deployment.environment
|
||||
- name: host.name
|
||||
metrics_flush_interval: 1h
|
||||
|
||||
receivers:
|
||||
# OTLP receivers for traces, metrics, and logs from applications
|
||||
# All application telemetry is pushed via OTLP protocol
|
||||
otlp:
|
||||
protocols:
|
||||
grpc:
|
||||
endpoint: 0.0.0.0:4317
|
||||
http:
|
||||
endpoint: 0.0.0.0:4318
|
||||
cors:
|
||||
allowed_origins:
|
||||
- "*"
|
||||
|
||||
# Filelog receiver for Kubernetes pod logs
|
||||
# Collects container stdout/stderr from /var/log/pods
|
||||
filelog:
|
||||
include:
|
||||
- /var/log/pods/*/*/*.log
|
||||
exclude:
|
||||
# Exclude SigNoz's own logs to avoid recursive collection
|
||||
- /var/log/pods/bakery-ia_signoz-*/*/*.log
|
||||
include_file_path: true
|
||||
include_file_name: false
|
||||
operators:
|
||||
# Parse CRI-O / containerd log format
|
||||
- type: regex_parser
|
||||
regex: '^(?P<time>[^ ]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) (?P<log>.*)$'
|
||||
timestamp:
|
||||
parse_from: attributes.time
|
||||
layout: '%Y-%m-%dT%H:%M:%S.%LZ'
|
||||
# Fix timestamp parsing - extract from the parsed time field
|
||||
- type: move
|
||||
from: attributes.time
|
||||
to: attributes.timestamp
|
||||
# Extract Kubernetes metadata from file path
|
||||
- type: regex_parser
|
||||
id: extract_metadata_from_filepath
|
||||
regex: '^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[^\/]+)\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$'
|
||||
parse_from: attributes["log.file.path"]
|
||||
# Move metadata to resource attributes
|
||||
- type: move
|
||||
from: attributes.namespace
|
||||
to: resource["k8s.namespace.name"]
|
||||
- type: move
|
||||
from: attributes.pod_name
|
||||
to: resource["k8s.pod.name"]
|
||||
- type: move
|
||||
from: attributes.container_name
|
||||
to: resource["k8s.container.name"]
|
||||
- type: move
|
||||
from: attributes.log
|
||||
to: body
|
||||
|
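To make the operator chain above easier to follow, here is a sketch of where each captured field should end up for a single log line, based only on reading the operators in order. The file path, pod name, UID, and message are hypothetical, and the exact record layout may differ between collector versions.

```yaml
# Hypothetical input, not taken from a real cluster:
#   file: /var/log/pods/bakery-ia_orders-service-7c9d_0a1b2c3d/orders-service/0.log
#   line: 2024-01-15T10:30:00.123Z stdout F order created
body: "order created"                     # moved from attributes.log
resource:
  k8s.namespace.name: bakery-ia           # from the <namespace> capture group
  k8s.pod.name: orders-service-7c9d       # from <pod_name>
  k8s.container.name: orders-service      # from <container_name>
attributes:
  stream: stdout
  logtag: F
  timestamp: "2024-01-15T10:30:00.123Z"   # moved from attributes.time
  uid: 0a1b2c3d                           # captured but never moved
  restart_count: "0"                      # captured but never moved
  log.file.path: /var/log/pods/bakery-ia_orders-service-7c9d_0a1b2c3d/orders-service/0.log
```

Note that the path regex relies on `_` separating namespace, pod name, and UID, so it assumes none of those three contain an underscore themselves.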
||||
# Kubernetes Cluster Receiver - Collects cluster-level metrics
|
||||
# Provides information about nodes, namespaces, pods, and other cluster resources
|
||||
k8s_cluster:
|
||||
collection_interval: 30s
|
||||
node_conditions_to_report:
|
||||
- Ready
|
||||
- MemoryPressure
|
||||
- DiskPressure
|
||||
- PIDPressure
|
||||
- NetworkUnavailable
|
||||
allocatable_types_to_report:
|
||||
- cpu
|
||||
- memory
|
||||
- pods
|
||||
|
||||
|
||||
|
||||
# PostgreSQL receivers for database metrics
|
||||
# ENABLED: Monitor users configured and credentials stored in secrets
|
||||
# Collects metrics directly from PostgreSQL databases with proper TLS
|
||||
postgresql/auth:
|
||||
endpoint: auth-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- auth_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/inventory:
|
||||
endpoint: inventory-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- inventory_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/orders:
|
||||
endpoint: orders-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- orders_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/ai-insights:
|
||||
endpoint: ai-insights-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- ai_insights_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/alert-processor:
|
||||
endpoint: alert-processor-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- alert_processor_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/distribution:
|
||||
endpoint: distribution-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- distribution_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/external:
|
||||
endpoint: external-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- external_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/forecasting:
|
||||
endpoint: forecasting-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- forecasting_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/notification:
|
||||
endpoint: notification-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- notification_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/orchestrator:
|
||||
endpoint: orchestrator-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- orchestrator_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/pos:
|
||||
endpoint: pos-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- pos_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/procurement:
|
||||
endpoint: procurement-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- procurement_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/production:
|
||||
endpoint: production-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- production_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/recipes:
|
||||
endpoint: recipes-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- recipes_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/sales:
|
||||
endpoint: sales-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- sales_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/suppliers:
|
||||
endpoint: suppliers-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- suppliers_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/tenant:
|
||||
endpoint: tenant-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- tenant_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/training:
|
||||
endpoint: training-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- training_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
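The eighteen `postgresql/*` receivers above differ only in endpoint and database name; the `tls` block is identical in each. As a sketch of one way to reduce the repetition, a YAML anchor can define the block once and aliases can reuse it. The anchor name `postgres_tls` is hypothetical, and this assumes every tool that reads or rewrites the values file resolves YAML aliases (they are expanded at parse time and do not survive a re-serialization).

```yaml
receivers:
  postgresql/auth:
    endpoint: auth-db-service.bakery-ia:5432
    username: ${env:POSTGRES_MONITOR_USER}
    password: ${env:POSTGRES_MONITOR_PASSWORD}
    databases:
      - auth_db
    collection_interval: 60s
    tls: &postgres_tls            # define the shared TLS settings once
      insecure: false
      cert_file: /etc/postgres-tls/server-cert.pem
      key_file: /etc/postgres-tls/server-key.pem
      ca_file: /etc/postgres-tls/ca-cert.pem

  postgresql/inventory:
    endpoint: inventory-db-service.bakery-ia:5432
    username: ${env:POSTGRES_MONITOR_USER}
    password: ${env:POSTGRES_MONITOR_PASSWORD}
    databases:
      - inventory_db
    collection_interval: 60s
    tls: *postgres_tls            # reuse the anchored block
```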
||||
# Redis receiver for cache metrics
|
||||
# ENABLED: Using existing credentials from redis-secrets with TLS
|
||||
redis:
|
||||
endpoint: redis-service.bakery-ia:6379
|
||||
password: ${env:REDIS_PASSWORD}
|
||||
collection_interval: 60s
|
||||
transport: tcp
|
||||
tls:
|
||||
insecure_skip_verify: false
|
||||
cert_file: /etc/redis-tls/redis-cert.pem
|
||||
key_file: /etc/redis-tls/redis-key.pem
|
||||
ca_file: /etc/redis-tls/ca-cert.pem
|
||||
metrics:
|
||||
redis.maxmemory:
|
||||
enabled: true
|
||||
redis.cmd.latency:
|
||||
enabled: true
|
||||
|
||||
# RabbitMQ receiver via management API
|
||||
# ENABLED: Using existing credentials from rabbitmq-secrets
|
||||
rabbitmq:
|
||||
endpoint: http://rabbitmq-service.bakery-ia:15672
|
||||
username: ${env:RABBITMQ_USER}
|
||||
password: ${env:RABBITMQ_PASSWORD}
|
||||
collection_interval: 30s
|
||||
|
||||
# Prometheus Receiver - Scrapes metrics from Kubernetes API
|
||||
# Simplified configuration using only Kubernetes API metrics
|
||||
prometheus:
|
||||
config:
|
||||
scrape_configs:
|
||||
- job_name: 'kubernetes-nodes-cadvisor'
|
||||
scrape_interval: 30s
|
||||
scrape_timeout: 10s
|
||||
scheme: https
|
||||
tls_config:
|
||||
insecure_skip_verify: true
|
||||
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
|
||||
kubernetes_sd_configs:
|
||||
- role: node
|
||||
relabel_configs:
|
||||
- action: labelmap
|
||||
regex: __meta_kubernetes_node_label_(.+)
|
||||
- target_label: __address__
|
||||
replacement: kubernetes.default.svc:443
|
||||
- source_labels: [__meta_kubernetes_node_name]
|
||||
regex: (.+)
|
||||
target_label: __metrics_path__
|
||||
replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
|
||||
- job_name: 'kubernetes-apiserver'
|
||||
scrape_interval: 30s
|
||||
scrape_timeout: 10s
|
||||
scheme: https
|
||||
tls_config:
|
||||
insecure_skip_verify: true
|
||||
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
|
||||
kubernetes_sd_configs:
|
||||
- role: endpoints
|
||||
relabel_configs:
|
||||
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
|
||||
action: keep
|
||||
regex: default;kubernetes;https
|
||||
|
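Both scrape jobs above set `insecure_skip_verify: true`, which disables verification of the API server certificate. A common alternative, sketched below for the `kubernetes-apiserver` job, is to verify against the cluster CA that Kubernetes normally mounts next to the service account token. This assumes the collector pod has the default service account volume mounted.

```yaml
- job_name: 'kubernetes-apiserver'
  scrape_interval: 30s
  scrape_timeout: 10s
  scheme: https
  tls_config:
    # Cluster CA bundle from the projected service account volume.
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
    - role: endpoints
  relabel_configs:
    - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: default;kubernetes;https
```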
||||
processors:
|
||||
# Batch processor for better performance (optimized for high throughput)
|
||||
batch:
|
||||
timeout: 1s
|
||||
send_batch_size: 10000 # Increased from 1024 for better performance
|
||||
send_batch_max_size: 10000
|
||||
|
||||
# Batch processor for meter data
|
||||
batch/meter:
|
||||
timeout: 1s
|
||||
send_batch_size: 20000
|
||||
send_batch_max_size: 25000
|
||||
|
||||
# Memory limiter to prevent OOM
|
||||
memory_limiter:
|
||||
check_interval: 1s
|
||||
limit_mib: 400
|
||||
spike_limit_mib: 100
|
||||
|
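The fixed `limit_mib: 400` above sits just under the collector's 512Mi container limit, so the two values have to be kept in step by hand. As a sketch of an alternative, the `memory_limiter` processor also accepts percentage-based settings that follow the container limit. The specific percentages are illustrative assumptions, chosen to land near the current values.

```yaml
processors:
  memory_limiter:
    check_interval: 1s
    # With a 512Mi container limit: 80% is about 410 MiB (hard limit)
    limit_percentage: 80
    # 20% is about 102 MiB, giving a soft limit near 307 MiB
    spike_limit_percentage: 20
```

Percentage mode relies on the collector detecting the cgroup memory limit, so it is worth confirming the behaviour on the target runtime before relying on it.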
||||
# Resource detection
|
||||
resourcedetection:
|
||||
detectors: [env, system, docker]
|
||||
timeout: 5s
|
||||
|
||||
# Kubernetes attributes processor - CRITICAL for logs
|
||||
# Enriches telemetry with pod, namespace, and container metadata looked up from the Kubernetes API
|
||||
k8sattributes:
|
||||
auth_type: "serviceAccount"
|
||||
passthrough: false
|
||||
extract:
|
||||
metadata:
|
||||
- k8s.pod.name
|
||||
- k8s.pod.uid
|
||||
- k8s.deployment.name
|
||||
- k8s.namespace.name
|
||||
- k8s.node.name
|
||||
- k8s.container.name
|
||||
labels:
|
||||
- tag_name: "app"
|
||||
- tag_name: "pod-template-hash"
|
||||
annotations:
|
||||
- tag_name: "description"
|
||||
|
||||
# SigNoz span metrics processor with delta aggregation (recommended)
|
||||
# Generates RED metrics (Rate, Error, Duration) from trace spans
|
||||
signozspanmetrics/delta:
|
||||
aggregation_temporality: AGGREGATION_TEMPORALITY_DELTA
|
||||
metrics_exporter: signozclickhousemetrics
|
||||
latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 50ms, 100ms, 250ms, 500ms, 1000ms, 1400ms, 2000ms, 5s, 10s, 20s, 40s, 60s]
|
||||
dimensions_cache_size: 100000
|
||||
dimensions:
|
||||
- name: service.namespace
|
||||
default: default
|
||||
- name: deployment.environment
|
||||
default: default
|
||||
- name: signoz.collector.id
|
||||
|
||||
exporters:
|
||||
# ClickHouse exporter for traces
|
||||
clickhousetraces:
|
||||
datasource: tcp://admin:27ff0399-0d3a-4bd8-919d-17c2181e6fb9@signoz-clickhouse:9000/?database=signoz_traces
|
||||
timeout: 10s
|
||||
retry_on_failure:
|
||||
enabled: true
|
||||
initial_interval: 5s
|
||||
max_interval: 30s
|
||||
max_elapsed_time: 300s
|
||||
|
||||
# ClickHouse exporter for metrics
|
||||
signozclickhousemetrics:
|
||||
dsn: "tcp://admin:27ff0399-0d3a-4bd8-919d-17c2181e6fb9@signoz-clickhouse:9000/signoz_metrics"
|
||||
timeout: 10s
|
||||
retry_on_failure:
|
||||
enabled: true
|
||||
initial_interval: 5s
|
||||
max_interval: 30s
|
||||
max_elapsed_time: 300s
|
||||
|
||||
# ClickHouse exporter for meter data (usage metrics)
|
||||
signozclickhousemeter:
|
||||
dsn: "tcp://admin:27ff0399-0d3a-4bd8-919d-17c2181e6fb9@signoz-clickhouse:9000/signoz_meter"
|
||||
timeout: 45s
|
||||
sending_queue:
|
||||
enabled: false
|
||||
|
||||
# ClickHouse exporter for logs
|
||||
clickhouselogsexporter:
|
||||
dsn: tcp://admin:27ff0399-0d3a-4bd8-919d-17c2181e6fb9@signoz-clickhouse:9000/?database=signoz_logs
|
||||
timeout: 10s
|
||||
retry_on_failure:
|
||||
enabled: true
|
||||
initial_interval: 5s
|
||||
max_interval: 30s
|
||||
|
||||
# Metadata exporter for service metadata
|
||||
metadataexporter:
|
||||
dsn: "tcp://admin:27ff0399-0d3a-4bd8-919d-17c2181e6fb9@signoz-clickhouse:9000/signoz_metadata"
|
||||
timeout: 10s
|
||||
cache:
|
||||
provider: in_memory
|
||||
|
||||
# Debug exporter for debugging (optional)
|
||||
debug:
|
||||
verbosity: detailed
|
||||
sampling_initial: 5
|
||||
sampling_thereafter: 200
|
||||
|
||||
service:
|
||||
pipelines:
|
||||
# Traces pipeline - exports to ClickHouse and signozmeter connector
|
||||
traces:
|
||||
receivers: [otlp]
|
||||
processors: [memory_limiter, batch, signozspanmetrics/delta, resourcedetection]
|
||||
exporters: [clickhousetraces, metadataexporter, signozmeter]
|
||||
|
||||
# Metrics pipeline
|
||||
metrics:
|
||||
receivers: [otlp,
|
||||
postgresql/auth, postgresql/inventory, postgresql/orders,
|
||||
postgresql/ai-insights, postgresql/alert-processor, postgresql/distribution,
|
||||
postgresql/external, postgresql/forecasting, postgresql/notification,
|
||||
postgresql/orchestrator, postgresql/pos, postgresql/procurement,
|
||||
postgresql/production, postgresql/recipes, postgresql/sales,
|
||||
postgresql/suppliers, postgresql/tenant, postgresql/training,
|
||||
redis, rabbitmq, k8s_cluster, prometheus]
|
||||
processors: [memory_limiter, batch, resourcedetection]
|
||||
exporters: [signozclickhousemetrics]
|
||||
|
||||
# Meter pipeline - receives from signozmeter connector
|
||||
metrics/meter:
|
||||
receivers: [signozmeter]
|
||||
processors: [batch/meter]
|
||||
exporters: [signozclickhousemeter]
|
||||
|
||||
# Logs pipeline - includes both OTLP and Kubernetes pod logs
|
||||
logs:
|
||||
receivers: [otlp, filelog]
|
||||
processors: [memory_limiter, batch, resourcedetection, k8sattributes]
|
||||
exporters: [clickhouselogsexporter]
|
||||
|
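In the pipelines above, `batch` runs before `k8sattributes` and `resourcedetection`. The upstream collector guidance generally suggests `memory_limiter` first and `batch` after processors that depend on per-request context, such as `k8sattributes`. The fragment below sketches that ordering for the logs pipeline only; it is a suggestion under that assumption, not a confirmed fix for this deployment.

```yaml
service:
  pipelines:
    logs:
      receivers: [otlp, filelog]
      # memory_limiter first, enrichment next, batching last
      processors: [memory_limiter, k8sattributes, resourcedetection, batch]
      exporters: [clickhouselogsexporter]
```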
||||
# ClusterRole configuration for Kubernetes monitoring
|
||||
# CRITICAL: Required for k8s_cluster receiver to access Kubernetes API
|
||||
# Without these permissions, k8s metrics will not appear in SigNoz UI
|
||||
clusterRole:
|
||||
create: true
|
||||
name: "signoz-otel-collector-bakery-ia"
|
||||
annotations: {}
|
||||
# Complete RBAC rules required by k8sclusterreceiver
|
||||
# Based on OpenTelemetry and SigNoz official documentation
|
||||
rules:
|
||||
# Core API group - fundamental Kubernetes resources
|
||||
- apiGroups: [""]
|
||||
resources:
|
||||
- "events"
|
||||
- "namespaces"
|
||||
- "nodes"
|
||||
- "nodes/proxy"
|
||||
- "nodes/metrics"
|
||||
- "nodes/spec"
|
||||
- "pods"
|
||||
- "pods/status"
|
||||
- "replicationcontrollers"
|
||||
- "replicationcontrollers/status"
|
||||
- "resourcequotas"
|
||||
- "services"
|
||||
- "endpoints"
|
||||
verbs: ["get", "list", "watch"]
|
||||
# Apps API group - modern workload controllers
|
||||
- apiGroups: ["apps"]
|
||||
resources: ["deployments", "daemonsets", "statefulsets", "replicasets"]
|
||||
verbs: ["get", "list", "watch"]
|
||||
# Batch API group - job management
|
||||
- apiGroups: ["batch"]
|
||||
resources: ["jobs", "cronjobs"]
|
||||
verbs: ["get", "list", "watch"]
|
||||
# Autoscaling API group - HPA metrics (CRITICAL)
|
||||
- apiGroups: ["autoscaling"]
|
||||
resources: ["horizontalpodautoscalers"]
|
||||
verbs: ["get", "list", "watch"]
|
||||
# Extensions API group - legacy support
|
||||
- apiGroups: ["extensions"]
|
||||
resources: ["deployments", "daemonsets", "replicasets"]
|
||||
verbs: ["get", "list", "watch"]
|
||||
# Metrics API group - resource metrics
|
||||
- apiGroups: ["metrics.k8s.io"]
|
||||
resources: ["nodes", "pods"]
|
||||
verbs: ["get", "list", "watch"]
|
||||
clusterRoleBinding:
|
||||
annotations: {}
|
||||
name: "signoz-otel-collector-bakery-ia"
|
||||
|
||||
# Additional Configuration
|
||||
serviceAccount:
|
||||
create: true
|
||||
annotations: {}
|
||||
name: "signoz-otel-collector"
|
||||
|
||||
# Security Context
|
||||
securityContext:
|
||||
runAsNonRoot: true
|
||||
runAsUser: 1000
|
||||
fsGroup: 1000
|
||||
|
||||
# Network Policies (disabled for dev)
|
||||
networkPolicy:
|
||||
enabled: false
|
||||
|
||||
# Monitoring SigNoz itself
|
||||
selfMonitoring:
|
||||
enabled: true
|
||||
serviceMonitor:
|
||||
enabled: false
|
||||
|
||||
@@ -10,989 +10,3 @@ global:
|
||||
clusterName: "bakery-ia-prod"
|
||||
domain: "monitoring.bakewise.ai"
|
||||
# Docker Hub credentials - applied to all sub-charts (including Zookeeper, ClickHouse, etc.)
|
||||
imagePullSecrets:
|
||||
- dockerhub-creds
|
||||
|
||||
# Docker Hub credentials for pulling images (root level for SigNoz components)
|
||||
imagePullSecrets:
|
||||
- dockerhub-creds
|
||||
|
||||
# SigNoz Main Component (unified frontend + query service)
|
||||
# BREAKING CHANGE: v0.89.0+ uses unified component instead of separate frontend/queryService
|
||||
signoz:
|
||||
replicaCount: 2
|
||||
|
||||
image:
|
||||
repository: signoz/signoz
|
||||
tag: v0.106.0 # Latest stable version
|
||||
pullPolicy: IfNotPresent
|
||||
|
||||
service:
|
||||
type: ClusterIP
|
||||
port: 8080 # HTTP/API port
|
||||
internalPort: 8085 # Internal gRPC port
|
||||
|
||||
# DISABLE built-in ingress - using unified bakery-ingress-prod instead
|
||||
# Route configured in infrastructure/kubernetes/overlays/prod/prod-ingress.yaml
|
||||
ingress:
|
||||
enabled: false
|
||||
|
||||
resources:
|
||||
requests:
|
||||
cpu: 500m
|
||||
memory: 1Gi
|
||||
limits:
|
||||
cpu: 2000m
|
||||
memory: 4Gi
|
||||
|
||||
# Pod Anti-affinity for HA
|
||||
affinity:
|
||||
podAntiAffinity:
|
||||
preferredDuringSchedulingIgnoredDuringExecution:
|
||||
- weight: 100
|
||||
podAffinityTerm:
|
||||
labelSelector:
|
||||
matchLabels:
|
||||
app.kubernetes.io/component: query-service
|
||||
topologyKey: kubernetes.io/hostname
|
||||
|
||||
# Environment variables (new format - replaces configVars)
|
||||
env:
|
||||
signoz_telemetrystore_provider: "clickhouse"
|
||||
dot_metrics_enabled: "true"
|
||||
signoz_emailing_enabled: "true"
|
||||
signoz_alertmanager_provider: "signoz"
|
||||
# Retention configuration (30 days for prod)
|
||||
signoz_traces_ttl_duration_hrs: "720"
|
||||
signoz_metrics_ttl_duration_hrs: "720"
|
||||
signoz_logs_ttl_duration_hrs: "720"
|
||||
# OpAMP Server Configuration
|
||||
# WARNING: OpAMP can cause gRPC instability and collector reloads
|
||||
# Only enable if you have a stable OpAMP backend server
|
||||
signoz_opamp_server_enabled: "false"
|
||||
# signoz_opamp_server_endpoint: "0.0.0.0:4320"
|
||||
# SMTP configuration for email alerts - now using Mailu as SMTP server
|
||||
signoz_smtp_enabled: "true"
|
||||
signoz_smtp_host: "mailu-postfix.bakery-ia.svc.cluster.local"
|
||||
signoz_smtp_port: "587"
|
||||
signoz_smtp_from: "alerts@bakewise.ai"
|
||||
signoz_smtp_username: "alerts@bakewise.ai"
|
||||
# Password should be set via secret: signoz_smtp_password
|
||||
|
||||
persistence:
|
||||
enabled: true
|
||||
size: 20Gi
|
||||
storageClass: "standard"
|
||||
|
||||
# Horizontal Pod Autoscaler
|
||||
autoscaling:
|
||||
enabled: true
|
||||
minReplicas: 2
|
||||
maxReplicas: 5
|
||||
targetCPUUtilizationPercentage: 70
|
||||
targetMemoryUtilizationPercentage: 80
|
||||
|
||||
# AlertManager Configuration
|
||||
alertmanager:
|
||||
enabled: true
|
||||
replicaCount: 2
|
||||
|
||||
image:
|
||||
repository: signoz/alertmanager
|
||||
tag: 0.23.5
|
||||
pullPolicy: IfNotPresent
|
||||
|
||||
service:
|
||||
type: ClusterIP
|
||||
port: 9093
|
||||
|
||||
resources:
|
||||
requests:
|
||||
cpu: 100m
|
||||
memory: 128Mi
|
||||
limits:
|
||||
cpu: 500m
|
||||
memory: 512Mi
|
||||
|
||||
# Pod Anti-affinity for HA
|
||||
affinity:
|
||||
podAntiAffinity:
|
||||
preferredDuringSchedulingIgnoredDuringExecution:
|
||||
- weight: 100
|
||||
podAffinityTerm:
|
||||
labelSelector:
|
||||
matchExpressions:
|
||||
- key: app
|
||||
operator: In
|
||||
values:
|
||||
- signoz-alertmanager
|
||||
topologyKey: kubernetes.io/hostname
|
||||
|
||||
persistence:
|
||||
enabled: true
|
||||
size: 5Gi
|
||||
storageClass: "standard"
|
||||
|
||||
config:
|
||||
global:
|
||||
resolve_timeout: 5m
|
||||
smtp_smarthost: 'mailu-postfix.bakery-ia.svc.cluster.local:587'
|
||||
smtp_from: 'alerts@bakewise.ai'
|
||||
smtp_auth_username: 'alerts@bakewise.ai'
|
||||
smtp_auth_password: '${SMTP_PASSWORD}'
|
||||
smtp_require_tls: true
|
||||
|
||||
route:
|
||||
group_by: ['alertname', 'cluster', 'service', 'severity']
|
||||
group_wait: 10s
|
||||
group_interval: 10s
|
||||
repeat_interval: 12h
|
||||
receiver: 'critical-alerts'
|
||||
routes:
|
||||
- match:
|
||||
severity: critical
|
||||
receiver: 'critical-alerts'
|
||||
continue: true
|
||||
- match:
|
||||
severity: warning
|
||||
receiver: 'warning-alerts'
|
||||
|
||||
receivers:
|
||||
- name: 'critical-alerts'
|
||||
email_configs:
|
||||
- to: 'critical-alerts@bakewise.ai'
|
||||
headers:
|
||||
Subject: '[CRITICAL] {{ .GroupLabels.alertname }} - Bakery IA'
|
||||
# Slack webhook for critical alerts
|
||||
slack_configs:
|
||||
- api_url: '${SLACK_WEBHOOK_URL}'
|
||||
channel: '#alerts-critical'
|
||||
title: '[CRITICAL] {{ .GroupLabels.alertname }}'
|
||||
text: '{{ range .Alerts }}{{ .Annotations.description }}{{ end }}'
|
||||
|
||||
- name: 'warning-alerts'
|
||||
email_configs:
|
||||
- to: 'oncall@bakewise.ai'
|
||||
headers:
|
||||
Subject: '[WARNING] {{ .GroupLabels.alertname }} - Bakery IA'
|
||||
|
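The routes above use the older `match` syntax. Upstream Alertmanager has since introduced `matchers`, which expresses the same conditions as a list of label matchers. The sketch below rewrites the two routes in that form; it assumes the `signoz/alertmanager` image in use supports `matchers`, which should be verified for this fork and version.

```yaml
route:
  group_by: ['alertname', 'cluster', 'service', 'severity']
  receiver: 'critical-alerts'
  routes:
    - matchers:
        - severity="critical"
      receiver: 'critical-alerts'
      continue: true
    - matchers:
        - severity="warning"
      receiver: 'warning-alerts'
```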
||||
# ClickHouse Configuration - Time Series Database
|
||||
clickhouse:
|
||||
enabled: true
|
||||
installCustomStorageClass: false
|
||||
|
||||
image:
|
||||
registry: docker.io
|
||||
repository: clickhouse/clickhouse-server
|
||||
tag: 25.5.6 # Updated to official recommended version
|
||||
pullPolicy: IfNotPresent
|
||||
|
||||
# ClickHouse resources (nested config)
|
||||
clickhouse:
|
||||
resources:
|
||||
requests:
|
||||
cpu: 1000m
|
||||
memory: 2Gi
|
||||
limits:
|
||||
cpu: 4000m
|
||||
memory: 8Gi
|
||||
|
||||
# Pod Anti-affinity for HA
|
||||
affinity:
|
||||
podAntiAffinity:
|
||||
requiredDuringSchedulingIgnoredDuringExecution:
|
||||
- labelSelector:
|
||||
matchExpressions:
|
||||
- key: app
|
||||
operator: In
|
||||
values:
|
||||
- signoz-clickhouse
|
||||
topologyKey: kubernetes.io/hostname
|
||||
|
||||
persistence:
|
||||
enabled: true
|
||||
size: 100Gi
|
||||
storageClass: "standard"
|
||||
|
||||
# Cold storage configuration for better disk space management
|
||||
coldStorage:
|
||||
enabled: true
|
||||
defaultKeepFreeSpaceBytes: 10737418240 # Keep 10GB free
|
||||
ttl:
|
||||
deleteTTLDays: 30 # Delete data older than 30 days (in line with the 30-day retention set above)
|
||||
|
||||
# Zookeeper Configuration (required by ClickHouse for coordination)
|
||||
zookeeper:
|
||||
enabled: true
|
||||
replicaCount: 3 # CRITICAL: Always use 3 replicas for production HA
|
||||
|
||||
image:
|
||||
tag: 3.7.1 # Official recommended version
|
||||
|
||||
resources:
|
||||
requests:
|
||||
cpu: 100m
|
||||
memory: 256Mi
|
||||
limits:
|
||||
cpu: 500m
|
||||
memory: 512Mi
|
||||
|
||||
persistence:
|
||||
enabled: true
|
||||
size: 10Gi
|
||||
storageClass: "standard"
|
||||
|
||||
# OpenTelemetry Collector - Integrated with SigNoz
|
||||
otelCollector:
|
||||
enabled: true
|
||||
replicaCount: 2
|
||||
|
||||
image:
|
||||
repository: signoz/signoz-otel-collector
|
||||
tag: v0.129.12 # Updated to latest recommended version
|
||||
pullPolicy: IfNotPresent
|
||||
|
||||
# Init containers for the Otel Collector pod
|
||||
initContainers:
|
||||
fix-postgres-tls:
|
||||
enabled: true
|
||||
image:
|
||||
registry: docker.io
|
||||
repository: busybox
|
||||
tag: 1.35
|
||||
pullPolicy: IfNotPresent
|
||||
command:
|
||||
- sh
|
||||
- -c
|
||||
- |
|
||||
echo "Fixing PostgreSQL TLS file permissions..."
|
||||
cp /etc/postgres-tls-source/* /etc/postgres-tls/
|
||||
chmod 600 /etc/postgres-tls/server-key.pem
|
||||
chmod 644 /etc/postgres-tls/server-cert.pem
|
||||
chmod 644 /etc/postgres-tls/ca-cert.pem
|
||||
echo "PostgreSQL TLS permissions fixed"
|
||||
volumeMounts:
|
||||
- name: postgres-tls-source
|
||||
mountPath: /etc/postgres-tls-source
|
||||
readOnly: true
|
||||
- name: postgres-tls-fixed
|
||||
mountPath: /etc/postgres-tls
|
||||
readOnly: false
|
||||
|
||||
service:
|
||||
type: ClusterIP
|
||||
ports:
|
||||
- name: otlp-grpc
|
||||
port: 4317
|
||||
targetPort: 4317
|
||||
protocol: TCP
|
||||
- name: otlp-http
|
||||
port: 4318
|
||||
targetPort: 4318
|
||||
protocol: TCP
|
||||
- name: prometheus
|
||||
port: 8889
|
||||
targetPort: 8889
|
||||
protocol: TCP
|
||||
- name: metrics
|
||||
port: 8888
|
||||
targetPort: 8888
|
||||
protocol: TCP
|
||||
|
||||
resources:
|
||||
requests:
|
||||
cpu: 500m
|
||||
memory: 512Mi
|
||||
limits:
|
||||
cpu: 2000m
|
||||
memory: 2Gi
|
||||
|
||||
# Additional environment variables for receivers
|
||||
additionalEnvs:
|
||||
POSTGRES_MONITOR_USER: "monitoring"
|
||||
POSTGRES_MONITOR_PASSWORD: "monitoring_369f9c001f242b07ef9e2826e17169ca"
|
||||
REDIS_PASSWORD: "OxdmdJjdVNXp37MNC2IFoMnTpfGGFv1k"
|
||||
RABBITMQ_USER: "bakery"
|
||||
RABBITMQ_PASSWORD: "forecast123"
|
||||
|
||||
# Mount TLS certificates for secure connections
|
||||
extraVolumes:
|
||||
- name: redis-tls
|
||||
secret:
|
||||
secretName: redis-tls-secret
|
||||
- name: postgres-tls
|
||||
secret:
|
||||
secretName: postgres-tls
|
||||
- name: postgres-tls-fixed
|
||||
emptyDir: {}
|
||||
- name: varlogpods
|
||||
hostPath:
|
||||
path: /var/log/pods
|
||||
|
||||
extraVolumeMounts:
|
||||
- name: redis-tls
|
||||
mountPath: /etc/redis-tls
|
||||
readOnly: true
|
||||
- name: postgres-tls
|
||||
mountPath: /etc/postgres-tls-source
|
||||
readOnly: true
|
||||
- name: postgres-tls-fixed
|
||||
mountPath: /etc/postgres-tls
|
||||
readOnly: false
|
||||
- name: varlogpods
|
||||
mountPath: /var/log/pods
|
||||
readOnly: true
|
||||
|
||||
# Enable OpAMP for dynamic configuration management
|
||||
command:
|
||||
name: /signoz-otel-collector
|
||||
extraArgs:
|
||||
- --config=/conf/otel-collector-config.yaml
|
||||
- --manager-config=/conf/otel-collector-opamp-config.yaml
|
||||
- --feature-gates=-pkg.translator.prometheus.NormalizeName
|
||||
|
||||
# Full OTEL Collector Configuration
|
||||
config:
|
||||
# Connectors - bridge between pipelines
|
||||
connectors:
|
||||
signozmeter:
|
||||
dimensions:
|
||||
- name: service.name
|
||||
- name: deployment.environment
|
||||
- name: host.name
|
||||
metrics_flush_interval: 1h
|
||||
|
||||
extensions:
|
||||
health_check:
|
||||
endpoint: 0.0.0.0:13133
|
||||
zpages:
|
||||
endpoint: 0.0.0.0:55679
|
||||
|
||||
receivers:
|
||||
otlp:
|
||||
protocols:
|
||||
grpc:
|
||||
endpoint: 0.0.0.0:4317
|
||||
max_recv_msg_size_mib: 32 # Increased for larger payloads
|
||||
http:
|
||||
endpoint: 0.0.0.0:4318
|
||||
cors:
|
||||
allowed_origins:
|
||||
- "https://monitoring.bakewise.ai"
|
||||
- "https://*.bakewise.ai"
|
||||
|
||||
# Filelog receiver for Kubernetes pod logs
|
||||
# Collects container stdout/stderr from /var/log/pods
|
||||
filelog:
|
||||
include:
|
||||
- /var/log/pods/*/*/*.log
|
||||
exclude:
|
||||
# Exclude SigNoz's own logs to avoid recursive collection
|
||||
- /var/log/pods/bakery-ia_signoz-*/*/*.log
|
||||
include_file_path: true
|
||||
include_file_name: false
|
||||
operators:
|
||||
# Parse CRI-O / containerd log format
|
||||
- type: regex_parser
|
||||
regex: '^(?P<time>[^ ]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) (?P<log>.*)$'
|
||||
timestamp:
|
||||
parse_from: attributes.time
|
||||
layout: '%Y-%m-%dT%H:%M:%S.%LZ'
|
||||
# Fix timestamp parsing - extract from the parsed time field
|
||||
- type: move
|
||||
from: attributes.time
|
||||
to: attributes.timestamp
|
||||
# Extract Kubernetes metadata from file path
|
||||
- type: regex_parser
|
||||
id: extract_metadata_from_filepath
|
||||
regex: '^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[^\/]+)\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$'
|
||||
parse_from: attributes["log.file.path"]
|
||||
# Move metadata to resource attributes
|
||||
- type: move
|
||||
from: attributes.namespace
|
||||
to: resource["k8s.namespace.name"]
|
||||
- type: move
|
||||
from: attributes.pod_name
|
||||
to: resource["k8s.pod.name"]
|
||||
- type: move
|
||||
from: attributes.container_name
|
||||
to: resource["k8s.container.name"]
|
||||
- type: move
|
||||
from: attributes.log
|
||||
to: body
|
||||
|
||||
# Kubernetes Cluster Receiver - Collects cluster-level metrics
|
||||
# Provides information about nodes, namespaces, pods, and other cluster resources
|
||||
k8s_cluster:
|
||||
collection_interval: 30s
|
||||
node_conditions_to_report:
|
||||
- Ready
|
||||
- MemoryPressure
|
||||
- DiskPressure
|
||||
- PIDPressure
|
||||
- NetworkUnavailable
|
||||
allocatable_types_to_report:
|
||||
- cpu
|
||||
- memory
|
||||
- pods
|
||||
|
||||
# Prometheus receiver for scraping metrics
|
||||
prometheus:
|
||||
config:
|
||||
scrape_configs:
|
||||
- job_name: 'kubernetes-nodes-cadvisor'
|
||||
scrape_interval: 30s
|
||||
scrape_timeout: 10s
|
||||
scheme: https
|
||||
tls_config:
|
||||
insecure_skip_verify: true
|
||||
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
|
||||
kubernetes_sd_configs:
|
||||
- role: node
|
||||
relabel_configs:
|
||||
- action: labelmap
|
||||
regex: __meta_kubernetes_node_label_(.+)
|
||||
- target_label: __address__
|
||||
replacement: kubernetes.default.svc:443
|
||||
- source_labels: [__meta_kubernetes_node_name]
|
||||
regex: (.+)
|
||||
target_label: __metrics_path__
|
||||
replacement: /api/v1/nodes/$${1}/proxy/metrics/cadvisor # "$$" escapes "$" so the collector does not expand the capture group as an env var
|
||||
- job_name: 'kubernetes-apiserver'
|
||||
scrape_interval: 30s
|
||||
scrape_timeout: 10s
|
||||
scheme: https
|
||||
tls_config:
|
||||
insecure_skip_verify: true
|
||||
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
|
||||
kubernetes_sd_configs:
|
||||
- role: endpoints
|
||||
relabel_configs:
|
||||
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
|
||||
action: keep
|
||||
regex: default;kubernetes;https
|
||||
|
||||
# Redis receiver for cache metrics
|
||||
# ENABLED: Using existing credentials from redis-secrets with TLS
|
||||
redis:
|
||||
endpoint: redis-service.bakery-ia:6379
|
||||
password: ${env:REDIS_PASSWORD}
|
||||
collection_interval: 60s
|
||||
transport: tcp
|
||||
tls:
|
||||
insecure_skip_verify: false
|
||||
cert_file: /etc/redis-tls/redis-cert.pem
|
||||
key_file: /etc/redis-tls/redis-key.pem
|
||||
ca_file: /etc/redis-tls/ca-cert.pem
|
||||
metrics:
|
||||
redis.maxmemory:
|
||||
enabled: true
|
||||
redis.cmd.latency:
|
||||
enabled: true
|
||||
|
||||
# RabbitMQ receiver via management API
|
||||
# ENABLED: Using existing credentials from rabbitmq-secrets
|
||||
rabbitmq:
|
||||
endpoint: http://rabbitmq-service.bakery-ia:15672
|
||||
username: ${env:RABBITMQ_USER}
|
||||
password: ${env:RABBITMQ_PASSWORD}
|
||||
collection_interval: 30s
|
||||
|
||||
# PostgreSQL receivers for database metrics
|
||||
# Monitor all databases with proper TLS configuration
|
||||
postgresql/auth:
|
||||
endpoint: auth-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- auth_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/inventory:
|
||||
endpoint: inventory-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- inventory_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/orders:
|
||||
endpoint: orders-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- orders_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/ai-insights:
|
||||
endpoint: ai-insights-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- ai_insights_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/alert-processor:
|
||||
endpoint: alert-processor-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- alert_processor_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/distribution:
|
||||
endpoint: distribution-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- distribution_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/external:
|
||||
endpoint: external-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- external_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/forecasting:
|
||||
endpoint: forecasting-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- forecasting_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/notification:
|
||||
endpoint: notification-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- notification_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/orchestrator:
|
||||
endpoint: orchestrator-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- orchestrator_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/pos:
|
||||
endpoint: pos-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- pos_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/procurement:
|
||||
endpoint: procurement-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- procurement_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/production:
|
||||
endpoint: production-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- production_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/recipes:
|
||||
endpoint: recipes-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- recipes_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/sales:
|
||||
endpoint: sales-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- sales_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/suppliers:
|
||||
endpoint: suppliers-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- suppliers_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/tenant:
|
||||
endpoint: tenant-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- tenant_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
postgresql/training:
|
||||
endpoint: training-db-service.bakery-ia:5432
|
||||
username: ${env:POSTGRES_MONITOR_USER}
|
||||
password: ${env:POSTGRES_MONITOR_PASSWORD}
|
||||
databases:
|
||||
- training_db
|
||||
collection_interval: 60s
|
||||
tls:
|
||||
insecure: false
|
||||
cert_file: /etc/postgres-tls/server-cert.pem
|
||||
key_file: /etc/postgres-tls/server-key.pem
|
||||
ca_file: /etc/postgres-tls/ca-cert.pem
|
||||
|
||||
processors:
|
||||
# High-performance batch processing (official recommendation)
|
||||
batch:
|
||||
timeout: 1s # Reduced from 10s for faster processing
|
||||
send_batch_size: 50000 # Increased from 2048 (official recommendation for traces)
|
||||
send_batch_max_size: 50000
|
||||
|
||||
# Batch processor for meter data
|
||||
batch/meter:
|
||||
timeout: 1s
|
||||
send_batch_size: 20000
|
||||
send_batch_max_size: 25000
|
||||
|
||||
memory_limiter:
|
||||
check_interval: 1s
|
||||
limit_mib: 1500 # ~73% of the container memory limit (2Gi = 2048Mi)
|
||||
spike_limit_mib: 300
|
||||
|
||||
# Resource detection for K8s
|
||||
resourcedetection:
|
||||
detectors: [env, system, docker]
|
||||
timeout: 5s
|
||||
|
||||
# Add resource attributes
|
||||
resource:
|
||||
attributes:
|
||||
- key: deployment.environment
|
||||
value: production
|
||||
action: upsert
|
||||
- key: cluster.name
|
||||
value: bakery-ia-prod
|
||||
action: upsert
|
||||
|
||||
# Kubernetes attributes processor - CRITICAL for logs
|
||||
# Extracts pod, namespace, container metadata from log attributes
|
||||
k8sattributes:
|
||||
auth_type: "serviceAccount"
|
||||
passthrough: false
|
||||
extract:
|
||||
metadata:
|
||||
- k8s.pod.name
|
||||
- k8s.pod.uid
|
||||
- k8s.deployment.name
|
||||
- k8s.namespace.name
|
||||
- k8s.node.name
|
||||
- k8s.container.name
|
||||
labels:
|
||||
- tag_name: "app"
|
||||
- tag_name: "pod-template-hash"
|
||||
- tag_name: "version"
|
||||
annotations:
|
||||
- tag_name: "description"
|
||||
|
||||
# SigNoz span metrics processor with delta aggregation (recommended)
|
||||
# Generates RED metrics (Rate, Error, Duration) from trace spans
|
||||
signozspanmetrics/delta:
|
||||
aggregation_temporality: AGGREGATION_TEMPORALITY_DELTA
|
||||
metrics_exporter: signozclickhousemetrics
|
||||
latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 50ms, 100ms, 250ms, 500ms, 1000ms, 1400ms, 2000ms, 5s, 10s, 20s, 40s, 60s]
|
||||
dimensions_cache_size: 100000
|
||||
dimensions:
|
||||
- name: service.namespace
|
||||
default: default
|
||||
- name: deployment.environment
|
||||
default: production
|
||||
- name: signoz.collector.id
|
||||
|
||||
exporters:
|
||||
# ClickHouse exporter for traces
|
||||
clickhousetraces:
|
||||
datasource: tcp://admin:27ff0399-0d3a-4bd8-919d-17c2181e6fb9@signoz-clickhouse:9000/?database=signoz_traces
|
||||
timeout: 10s
|
||||
retry_on_failure:
|
||||
enabled: true
|
||||
initial_interval: 5s
|
||||
max_interval: 30s
|
||||
max_elapsed_time: 300s
|
||||
|
||||
# ClickHouse exporter for metrics
|
||||
signozclickhousemetrics:
|
||||
dsn: "tcp://admin:27ff0399-0d3a-4bd8-919d-17c2181e6fb9@signoz-clickhouse:9000/signoz_metrics"
|
||||
timeout: 10s
|
||||
retry_on_failure:
|
||||
enabled: true
|
||||
initial_interval: 5s
|
||||
max_interval: 30s
|
||||
max_elapsed_time: 300s
|
||||
|
||||
# ClickHouse exporter for meter data (usage metrics)
|
||||
signozclickhousemeter:
|
||||
dsn: "tcp://admin:27ff0399-0d3a-4bd8-919d-17c2181e6fb9@signoz-clickhouse:9000/signoz_meter"
|
||||
timeout: 45s
|
||||
sending_queue:
|
||||
enabled: false
|
||||
|
||||
# ClickHouse exporter for logs
|
||||
clickhouselogsexporter:
|
||||
dsn: tcp://admin:27ff0399-0d3a-4bd8-919d-17c2181e6fb9@signoz-clickhouse:9000/?database=signoz_logs
|
||||
timeout: 10s
|
||||
retry_on_failure:
|
||||
enabled: true
|
||||
initial_interval: 5s
|
||||
max_interval: 30s
|
||||
|
||||
# Metadata exporter for service metadata
|
||||
metadataexporter:
|
||||
dsn: "tcp://admin:27ff0399-0d3a-4bd8-919d-17c2181e6fb9@signoz-clickhouse:9000/signoz_metadata"
|
||||
timeout: 10s
|
||||
cache:
|
||||
provider: in_memory
|
||||
|
||||
# Debug exporter for debugging (optional)
|
||||
debug:
|
||||
verbosity: detailed
|
||||
sampling_initial: 5
|
||||
sampling_thereafter: 200
|
||||
|
||||
service:
|
||||
extensions: [health_check, zpages]
|
||||
pipelines:
|
||||
# Traces pipeline - exports to ClickHouse and signozmeter connector
|
||||
traces:
|
||||
receivers: [otlp]
|
||||
processors: [memory_limiter, batch, signozspanmetrics/delta, resourcedetection, resource]
|
||||
exporters: [clickhousetraces, metadataexporter, signozmeter]
|
||||
|
||||
# Metrics pipeline - includes all infrastructure receivers
|
||||
metrics:
|
||||
receivers: [otlp,
|
||||
postgresql/auth, postgresql/inventory, postgresql/orders,
|
||||
postgresql/ai-insights, postgresql/alert-processor, postgresql/distribution,
|
||||
postgresql/external, postgresql/forecasting, postgresql/notification,
|
||||
postgresql/orchestrator, postgresql/pos, postgresql/procurement,
|
||||
postgresql/production, postgresql/recipes, postgresql/sales,
|
||||
postgresql/suppliers, postgresql/tenant, postgresql/training,
|
||||
redis, rabbitmq, k8s_cluster, prometheus]
|
||||
processors: [memory_limiter, batch, resourcedetection, resource]
|
||||
exporters: [signozclickhousemetrics]
|
||||
|
||||
# Meter pipeline - receives from signozmeter connector
|
||||
metrics/meter:
|
||||
receivers: [signozmeter]
|
||||
processors: [batch/meter]
|
||||
exporters: [signozclickhousemeter]
|
||||
|
||||
# Logs pipeline - includes both OTLP and Kubernetes pod logs
|
||||
logs:
|
||||
receivers: [otlp, filelog]
|
||||
processors: [memory_limiter, batch, resourcedetection, resource, k8sattributes]
|
||||
exporters: [clickhouselogsexporter]
|
||||
|
||||
# HPA for OTEL Collector
|
||||
autoscaling:
|
||||
enabled: true
|
||||
minReplicas: 2
|
||||
maxReplicas: 10
|
||||
targetCPUUtilizationPercentage: 70
|
||||
targetMemoryUtilizationPercentage: 80
|
||||
|
||||
# ClusterRole configuration for Kubernetes monitoring
|
||||
# CRITICAL: Required for k8s_cluster receiver to access Kubernetes API
|
||||
# Without these permissions, k8s metrics will not appear in SigNoz UI
|
||||
clusterRole:
|
||||
create: true
|
||||
name: "signoz-otel-collector-bakery-ia"
|
||||
annotations: {}
|
||||
# Complete RBAC rules required by k8sclusterreceiver
|
||||
# Based on OpenTelemetry and SigNoz official documentation
|
||||
rules:
|
||||
# Core API group - fundamental Kubernetes resources
|
||||
- apiGroups: [""]
|
||||
resources:
|
||||
- "events"
|
||||
- "namespaces"
|
||||
- "nodes"
|
||||
- "nodes/proxy"
|
||||
- "nodes/metrics"
|
||||
- "nodes/spec"
|
||||
- "pods"
|
||||
- "pods/status"
|
||||
- "replicationcontrollers"
|
||||
- "replicationcontrollers/status"
|
||||
- "resourcequotas"
|
||||
- "services"
|
||||
- "endpoints"
|
||||
verbs: ["get", "list", "watch"]
|
||||
# Apps API group - modern workload controllers
|
||||
- apiGroups: ["apps"]
|
||||
resources: ["deployments", "daemonsets", "statefulsets", "replicasets"]
|
||||
verbs: ["get", "list", "watch"]
|
||||
# Batch API group - job management
|
||||
- apiGroups: ["batch"]
|
||||
resources: ["jobs", "cronjobs"]
|
||||
verbs: ["get", "list", "watch"]
|
||||
# Autoscaling API group - HPA metrics (CRITICAL)
|
||||
- apiGroups: ["autoscaling"]
|
||||
resources: ["horizontalpodautoscalers"]
|
||||
verbs: ["get", "list", "watch"]
|
||||
# Extensions API group - legacy support
|
||||
- apiGroups: ["extensions"]
|
||||
resources: ["deployments", "daemonsets", "replicasets"]
|
||||
verbs: ["get", "list", "watch"]
|
||||
# Metrics API group - resource metrics
|
||||
- apiGroups: ["metrics.k8s.io"]
|
||||
resources: ["nodes", "pods"]
|
||||
verbs: ["get", "list", "watch"]
|
||||
clusterRoleBinding:
|
||||
annotations: {}
|
||||
name: "signoz-otel-collector-bakery-ia"
|
||||
|
||||
# Schema Migrator - Manages ClickHouse schema migrations
|
||||
schemaMigrator:
|
||||
enabled: true
|
||||
|
||||
image:
|
||||
repository: signoz/signoz-schema-migrator
|
||||
tag: v0.129.12 # Updated to latest version
|
||||
pullPolicy: IfNotPresent
|
||||
|
||||
# Enable Helm hooks for proper upgrade handling
|
||||
upgradeHelmHooks: true
|
||||
|
||||
# Additional Configuration
|
||||
serviceAccount:
|
||||
create: true
|
||||
annotations: {}
|
||||
name: "signoz"
|
||||
|
||||
# Security Context
|
||||
securityContext:
|
||||
runAsNonRoot: true
|
||||
runAsUser: 1000
|
||||
fsGroup: 1000
|
||||
|
||||
# Pod Disruption Budgets for HA
|
||||
podDisruptionBudget:
|
||||
frontend:
|
||||
enabled: true
|
||||
minAvailable: 1
|
||||
queryService:
|
||||
enabled: true
|
||||
minAvailable: 1
|
||||
alertmanager:
|
||||
enabled: true
|
||||
minAvailable: 1
|
||||
clickhouse:
|
||||
enabled: true
|
||||
minAvailable: 1
|
||||
|
||||
# Network Policies for security
|
||||
networkPolicy:
|
||||
enabled: true
|
||||
policyTypes:
|
||||
- Ingress
|
||||
- Egress
|
||||
|
||||
# Monitoring SigNoz itself
|
||||
selfMonitoring:
|
||||
enabled: true
|
||||
serviceMonitor:
|
||||
enabled: true
|
||||
interval: 30s
|
||||
|
||||
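The two `regex_parser` operators in the `filelog` receiver of the values file above can be sanity-checked offline. The following is a minimal Python sketch, assuming Python's `re` module handles these patterns the same way as the collector's regex engine (the named-group syntax is shared, but equivalence is an assumption). The sample log line and pod path are made up for illustration and do not come from this commit.

```python
import re

# Patterns copied from the filelog receiver operators in the values file.
CRI_LINE = re.compile(
    r'^(?P<time>[^ ]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) (?P<log>.*)$'
)
POD_PATH = re.compile(
    r'^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[^\/]+)'
    r'\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$'
)


def parse_cri_line(line: str) -> dict:
    """Split one containerd/CRI-O log line into time, stream, logtag and log."""
    match = CRI_LINE.match(line)
    if match is None:
        raise ValueError(f"not a CRI log line: {line!r}")
    return match.groupdict()


def parse_pod_path(path: str) -> dict:
    """Extract namespace, pod, uid, container and restart count from a pod log path."""
    match = POD_PATH.match(path)
    if match is None:
        raise ValueError(f"not a pod log path: {path!r}")
    return match.groupdict()


if __name__ == "__main__":
    # Hypothetical inputs, shaped like files under /var/log/pods.
    sample_line = "2024-01-15T10:30:00.123456789Z stdout F order 42 created"
    sample_path = (
        "/var/log/pods/bakery-ia_orders-service-6c9d7_0a1b2c3d/orders-service/0.log"
    )
    print(parse_cri_line(sample_line))
    print(parse_pod_path(sample_path))
```

Running the patterns against a handful of real paths from a node before deploying is a cheap way to catch a pattern that silently fails to match, in which case the `move` operators that follow would have nothing to move.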
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: gateway
|
||||
app.kubernetes.io/component: gateway
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
containers:
|
||||
- name: gateway
|
||||
image: bakery/gateway:latest
|
||||
|
||||
@@ -5,3 +5,4 @@ resources:
|
||||
- gateway-service.yaml
|
||||
- nominatim/nominatim.yaml
|
||||
- nominatim/nominatim-init-job.yaml
|
||||
- unbound/unbound.yaml
|
||||
|
||||
@@ -15,8 +15,6 @@ spec:
|
||||
app.kubernetes.io/name: nominatim-init
|
||||
app.kubernetes.io/component: data-init
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
restartPolicy: OnFailure
|
||||
containers:
|
||||
- name: nominatim-import
|
||||
|
||||
81
infrastructure/platform/infrastructure/unbound/unbound.yaml
Normal file
@@ -0,0 +1,81 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: unbound-resolver
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: unbound-resolver
    app.kubernetes.io/component: dns
    app.kubernetes.io/part-of: bakery-ia
spec:
  replicas: 1 # Scale to 2+ in production with anti-affinity
  selector:
    matchLabels:
      app.kubernetes.io/name: unbound-resolver
      app.kubernetes.io/component: dns
  template:
    metadata:
      labels:
        app.kubernetes.io/name: unbound-resolver
        app.kubernetes.io/component: dns
    spec:
      containers:
        - name: unbound
          image: mvance/unbound:latest
          ports:
            - containerPort: 53
              name: dns-udp
              protocol: UDP
            - containerPort: 53
              name: dns-tcp
              protocol: TCP
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "300m"
              memory: "384Mi"
          readinessProbe:
            exec:
              command:
                - sh
                - -c
                # drill sets the DO bit with -D ("+dnssec" is dig syntax); nslookup is the fallback
                - drill -D -p 53 example.org @127.0.0.1 || nslookup -type=A example.org 127.0.0.1
            initialDelaySeconds: 10
            periodSeconds: 30
          livenessProbe:
            exec:
              command:
                - sh
                - -c
                - drill -D -p 53 example.org @127.0.0.1 || nslookup -type=A example.org 127.0.0.1
            initialDelaySeconds: 30
            periodSeconds: 60
          securityContext:
            capabilities:
              add: ["NET_BIND_SERVICE"]

---
apiVersion: v1
kind: Service
metadata:
  name: unbound-dns
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: unbound-resolver
    app.kubernetes.io/component: dns
spec:
  type: ClusterIP
  ports:
    - name: dns-udp
      port: 53
      targetPort: 53
      protocol: UDP
    - name: dns-tcp
      port: 53
      targetPort: 53
      protocol: TCP
  selector:
    app.kubernetes.io/name: unbound-resolver
    app.kubernetes.io/component: dns
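The probes in the Unbound manifest check that the resolver answers queries. Whether it actually validates DNSSEC can be checked separately by sending a query with the EDNS0 DO bit set and looking for the AD (Authenticated Data) flag in the response header. The following is a minimal standard-library Python sketch of that check. The function names, the default server address and the choice of `example.org` as a signed test zone are illustrative assumptions, not part of this commit.

```python
import secrets
import socket
import struct

AD_FLAG = 0x0020  # "Authenticated Data" bit in the DNS header flags
RD_FLAG = 0x0100  # "Recursion Desired" bit in the DNS header flags
DO_BIT = 0x8000   # "DNSSEC OK" bit in the EDNS0 flags field


def build_query(name: str, txid: int, qtype: int = 1) -> bytes:
    """Build a DNS query for `name` (type A by default) with the EDNS0 DO bit set."""
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1 (the OPT record).
    header = struct.pack("!HHHHHH", txid, RD_FLAG, 1, 0, 0, 1)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QTYPE, QCLASS=IN
    # OPT pseudo-record: root name, type 41, UDP payload size 1232,
    # 32-bit "TTL" field carrying the DO bit, and empty RDATA.
    opt = b"\x00" + struct.pack("!HHIH", 41, 1232, DO_BIT, 0)
    return header + question + opt


def has_ad_flag(response: bytes) -> bool:
    """Return True if the AD flag is set in a raw DNS response."""
    if len(response) < 12:
        raise ValueError("truncated DNS message")
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & AD_FLAG)


def resolver_validates(server: str = "127.0.0.1", name: str = "example.org",
                       port: int = 53, timeout: float = 3.0) -> bool:
    """Query `server` over UDP and report whether the answer carries the AD flag."""
    txid = secrets.randbelow(0x10000)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, txid), (server, port))
        response, _ = sock.recvfrom(4096)
    if response[:2] != struct.pack("!H", txid):
        raise ValueError("transaction ID mismatch")
    return has_ad_flag(response)


if __name__ == "__main__":
    # Requires network access to a resolver listening on 127.0.0.1:53.
    print("validating resolver:", resolver_validates())
```

A response without the AD flag for a signed zone suggests the resolver is forwarding answers without validating them, which is the condition the DNSSEC requirement in this architecture is meant to rule out.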
@@ -1,5 +1,20 @@
|
||||
# Dev-specific Mailu Helm values for Bakery-IA
|
||||
# Overrides base configuration for development environment
|
||||
# Development-tuned Mailu configuration
|
||||
global:
  # Address of the Unbound resolver. A Service DNS name is used here;
  # substitute the unbound-dns ClusterIP at deploy time if an IP is required.
  custom_dns_servers: "unbound-dns.bakery-ia.svc.cluster.local"

# Component-specific DNS configuration
admin:
  dnsPolicy: "None"
  dnsConfig:
    nameservers:
      # NOTE: Kubernetes only accepts IP addresses in dnsConfig.nameservers;
      # replace this Service DNS name with the unbound-dns ClusterIP at deploy time.
      - "unbound-dns.bakery-ia.svc.cluster.local"

rspamd:
  dnsPolicy: "None"
  dnsConfig:
    nameservers:
      - "unbound-dns.bakery-ia.svc.cluster.local" # same caveat as for admin: an IP is required
|
||||
|
||||
# Domain configuration for dev
|
||||
domain: "bakery-ia.local"
|
||||
@@ -12,7 +27,64 @@ externalRelay:
|
||||
username: "postmaster@bakery-ia.local"
|
||||
password: "mailgun-api-key-replace-in-production"
|
||||
|
||||
# Ingress configuration for dev - disabled to use with existing ingress
|
||||
# Environment-specific configurations
|
||||
persistence:
|
||||
enabled: true
|
||||
# Development: use default storage class
|
||||
storageClass: "standard"
|
||||
size: "5Gi"
|
||||
|
||||
# Resource optimizations for development
|
||||
resources:
|
||||
admin:
|
||||
requests:
|
||||
cpu: "100m"
|
||||
memory: "128Mi"
|
||||
limits:
|
||||
cpu: "500m"
|
||||
memory: "256Mi"
|
||||
front:
|
||||
requests:
|
||||
cpu: "50m"
|
||||
memory: "64Mi"
|
||||
limits:
|
||||
cpu: "200m"
|
||||
memory: "128Mi"
|
||||
postfix:
|
||||
requests:
|
||||
cpu: "100m"
|
||||
memory: "128Mi"
|
||||
limits:
|
||||
cpu: "300m"
|
||||
memory: "256Mi"
|
||||
dovecot:
|
||||
requests:
|
||||
cpu: "100m"
|
||||
memory: "128Mi"
|
||||
limits:
|
||||
cpu: "300m"
|
||||
memory: "256Mi"
|
||||
rspamd:
|
||||
requests:
|
||||
cpu: "50m"
|
||||
memory: "64Mi"
|
||||
limits:
|
||||
cpu: "200m"
|
||||
memory: "128Mi"
|
||||
clamav:
|
||||
requests:
|
||||
cpu: "100m"
|
||||
memory: "256Mi"
|
||||
limits:
|
||||
cpu: "300m"
|
||||
memory: "512Mi"
|
||||
|
||||
replicaCount: 1 # Single replica for development
|
||||
|
||||
# Security settings
|
||||
secretKey: "generate-strong-key-here-for-development"
|
||||
|
||||
# Ingress configuration for development - disabled to use with existing ingress
|
||||
ingress:
|
||||
enabled: false # Disable chart's Ingress; use existing one
|
||||
tls: false # Disable TLS in chart since ingress handles it
|
||||
@@ -33,6 +105,15 @@ welcomeMessage:
|
||||
# Log level for dev
|
||||
logLevel: "DEBUG"
|
||||
|
||||
# Development-specific overrides
|
||||
env:
|
||||
DEBUG: "true"
|
||||
LOG_LEVEL: "INFO"
|
||||
|
||||
# Disable or simplify monitoring in development
|
||||
monitoring:
|
||||
enabled: false
|
||||
|
||||
# Network Policy for dev
|
||||
networkPolicy:
|
||||
enabled: true
|
||||
|
||||
@@ -1,5 +1,20 @@
|
||||
# Production-specific Mailu Helm values for Bakery-IA
|
||||
# Overrides base configuration for production environment
|
||||
# Production-tuned Mailu configuration
|
||||
global:
  # Address of the Unbound resolver. A Service DNS name is used here;
  # substitute the unbound-dns ClusterIP at deploy time if an IP is required.
  custom_dns_servers: "unbound-dns.bakery-ia.svc.cluster.local"

# Component-specific DNS configuration
admin:
  dnsPolicy: "None"
  dnsConfig:
    nameservers:
      # NOTE: Kubernetes only accepts IP addresses in dnsConfig.nameservers;
      # replace this Service DNS name with the unbound-dns ClusterIP at deploy time.
      - "unbound-dns.bakery-ia.svc.cluster.local"

rspamd:
  dnsPolicy: "None"
  dnsConfig:
    nameservers:
      - "unbound-dns.bakery-ia.svc.cluster.local" # same caveat as for admin: an IP is required
|
||||
|
||||
# Domain configuration for production
|
||||
domain: "bakewise.ai"
|
||||
@@ -12,6 +27,63 @@ externalRelay:
|
||||
username: "postmaster@bakewise.ai"
|
||||
password: "PRODUCTION_MAILGUN_API_KEY" # This should be set via secret
|
||||
|
||||
# Environment-specific configurations
|
||||
persistence:
|
||||
enabled: true
|
||||
# Production: use microk8s-hostpath or longhorn
|
||||
storageClass: "longhorn" # Assuming Longhorn is available in production
|
||||
size: "20Gi" # Larger storage for production email volume
|
||||
|
||||
# Resource allocations for production
|
||||
resources:
|
||||
admin:
|
||||
requests:
|
||||
cpu: "200m"
|
||||
memory: "256Mi"
|
||||
limits:
|
||||
cpu: "1"
|
||||
memory: "512Mi"
|
||||
front:
|
||||
requests:
|
||||
cpu: "100m"
|
||||
memory: "128Mi"
|
||||
limits:
|
||||
cpu: "500m"
|
||||
memory: "256Mi"
|
||||
postfix:
|
||||
requests:
|
||||
cpu: "200m"
|
||||
memory: "256Mi"
|
||||
limits:
|
||||
cpu: "1"
|
||||
memory: "512Mi"
|
||||
dovecot:
|
||||
requests:
|
||||
cpu: "200m"
|
||||
memory: "256Mi"
|
||||
limits:
|
||||
cpu: "1"
|
||||
memory: "512Mi"
|
||||
rspamd:
|
||||
requests:
|
||||
cpu: "100m"
|
||||
memory: "128Mi"
|
||||
limits:
|
||||
cpu: "500m"
|
||||
memory: "256Mi"
|
||||
clamav:
|
||||
requests:
|
||||
cpu: "200m"
|
||||
memory: "512Mi"
|
||||
limits:
|
||||
cpu: "1"
|
||||
memory: "1Gi"
|
||||
|
||||
replicaCount: 1 # Can be increased in production as needed
|
||||
|
||||
# Security settings
|
||||
secretKey: "generate-strong-key-here-for-production"
|
||||
|
||||
# Ingress configuration for production - disabled to use with existing ingress
|
||||
ingress:
|
||||
enabled: false # Disable chart's Ingress; use existing one
|
||||
@@ -40,7 +112,24 @@ antivirus:
|
||||
enabled: true
|
||||
flavor: "clamav"
|
||||
|
||||
# Network Policy for production
|
||||
# Production-specific settings
|
||||
env:
|
||||
DEBUG: "false"
|
||||
LOG_LEVEL: "WARNING"
|
||||
TLS_FLAVOR: "cert"
|
||||
REDIS_PASSWORD: "secure-redis-password"
|
||||
|
||||
# Enable monitoring in production
|
||||
monitoring:
|
||||
enabled: true
|
||||
|
||||
# Production-specific security settings
|
||||
securityContext:
|
||||
runAsNonRoot: true
|
||||
runAsUser: 1000
|
||||
fsGroup: 1000
|
||||
|
||||
# Network policies for production
|
||||
networkPolicy:
|
||||
enabled: true
|
||||
ingressController:
|
||||
|
||||
@@ -1,6 +1,11 @@
|
||||
# Base Mailu Helm values for Bakery-IA
|
||||
# Preserves critical configurations from the original Kustomize setup
|
||||
|
||||
# Global DNS configuration for DNSSEC validation
|
||||
global:
  # Address of the Unbound resolver. A Service DNS name is used here;
  # substitute the unbound-dns ClusterIP at deploy time if an IP is required.
  custom_dns_servers: "unbound-dns.bakery-ia.svc.cluster.local"
|
||||
|
||||
# Domain configuration
|
||||
domain: "DOMAIN_PLACEHOLDER"
|
||||
hostnames:
|
||||
@@ -203,4 +208,18 @@ networkPolicy:
|
||||
matchLabels:
|
||||
app.kubernetes.io/name: ingress-nginx
|
||||
app.kubernetes.io/instance: ingress-nginx
|
||||
app.kubernetes.io/component: controller
|
||||
app.kubernetes.io/component: controller
|
||||
|
||||
# DNS Policy Configuration for DNSSEC validation
|
||||
# These settings ensure Mailu components use the Unbound DNS resolver
|
||||
dnsPolicy: "None"
|
||||
dnsConfig:
|
||||
nameservers:
|
||||
- "unbound-dns.bakery-ia.svc.cluster.local" # Points to the Unbound service in the bakery-ia namespace
|
||||
options:
|
||||
- name: ndots
|
||||
value: "5"
|
||||
- name: timeout
|
||||
value: "5"
|
||||
- name: attempts
|
||||
value: "3"
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: {{SERVICE_NAME}}-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
containers:
|
||||
- name: postgres
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: redis
|
||||
app.kubernetes.io/component: cache
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 999 # redis group
|
||||
initContainers:
|
||||
|
||||
@@ -1,65 +0,0 @@
|
||||
#!/bin/bash
|
||||
|
||||
# Setup Docker Hub image pull secrets for all namespaces
|
||||
# This script creates docker-registry secrets for pulling images from Docker Hub
|
||||
|
||||
set -e
|
||||
|
||||
# Docker Hub credentials
|
||||
DOCKER_SERVER="docker.io"
|
||||
DOCKER_USERNAME="uals"
|
||||
DOCKER_PASSWORD="dckr_pat_zzEY5Q58x1S0puraIoKEtbpue3A"
|
||||
DOCKER_EMAIL="ualfaro@gmail.com"
|
||||
SECRET_NAME="dockerhub-creds"
|
||||
|
||||
# List of namespaces used in the project
|
||||
NAMESPACES=(
|
||||
"bakery-ia"
|
||||
"bakery-ia-dev"
|
||||
"bakery-ia-prod"
|
||||
"default"
|
||||
)
|
||||
|
||||
echo "Setting up Docker Hub image pull secrets..."
|
||||
echo "==========================================="
|
||||
echo ""
|
||||
|
||||
for namespace in "${NAMESPACES[@]}"; do
|
||||
echo "Processing namespace: $namespace"
|
||||
|
||||
# Create namespace if it doesn't exist
|
||||
if ! kubectl get namespace "$namespace" >/dev/null 2>&1; then
|
||||
echo " Creating namespace: $namespace"
|
||||
kubectl create namespace "$namespace"
|
||||
fi
|
||||
|
||||
# Delete existing secret if it exists
|
||||
if kubectl get secret "$SECRET_NAME" -n "$namespace" >/dev/null 2>&1; then
|
||||
echo " Deleting existing secret in namespace: $namespace"
|
||||
kubectl delete secret "$SECRET_NAME" -n "$namespace"
|
||||
fi
|
||||
|
||||
# Create the docker-registry secret
|
||||
echo " Creating Docker Hub secret in namespace: $namespace"
|
||||
kubectl create secret docker-registry "$SECRET_NAME" \
|
||||
--docker-server="$DOCKER_SERVER" \
|
||||
--docker-username="$DOCKER_USERNAME" \
|
||||
--docker-password="$DOCKER_PASSWORD" \
|
||||
--docker-email="$DOCKER_EMAIL" \
|
||||
-n "$namespace"
|
||||
|
||||
echo " ✓ Secret created successfully in namespace: $namespace"
|
||||
echo ""
|
||||
done
|
||||
|
||||
echo "==========================================="
|
||||
echo "Docker Hub secrets setup completed!"
|
||||
echo ""
|
||||
echo "The secret '$SECRET_NAME' has been created in all namespaces:"
|
||||
for namespace in "${NAMESPACES[@]}"; do
|
||||
echo " - $namespace"
|
||||
done
|
||||
echo ""
|
||||
echo "Next steps:"
|
||||
echo "1. Apply Kubernetes manifests with imagePullSecrets configured"
|
||||
echo "2. Verify pods can pull images: kubectl get pods -A"
|
||||
@@ -1,67 +0,0 @@
|
||||
#!/bin/bash
|
||||
|
||||
# Setup GitHub Container Registry (GHCR) image pull secrets for all namespaces
|
||||
# This script creates docker-registry secrets for pulling images from GHCR
|
||||
|
||||
set -e
|
||||
|
||||
# GitHub Container Registry credentials
|
||||
# Note: Use a GitHub Personal Access Token with 'read:packages' scope
|
||||
GHCR_SERVER="ghcr.io"
|
||||
GHCR_USERNAME="uals" # GitHub username
|
||||
GHCR_PASSWORD="ghp_zzEY5Q58x1S0puraIoKEtbpue3A" # GitHub Personal Access Token
|
||||
GHCR_EMAIL="ualfaro@gmail.com"
|
||||
SECRET_NAME="ghcr-creds"
|
||||
|
||||
# List of namespaces used in the project
|
||||
NAMESPACES=(
|
||||
"bakery-ia"
|
||||
"bakery-ia-dev"
|
||||
"bakery-ia-prod"
|
||||
"default"
|
||||
)
|
||||
|
||||
echo "Setting up GitHub Container Registry image pull secrets..."
|
||||
echo "=========================================================="
|
||||
echo ""
|
||||
|
||||
for namespace in "${NAMESPACES[@]}"; do
|
||||
echo "Processing namespace: $namespace"
|
||||
|
||||
# Create namespace if it doesn't exist
|
||||
if ! kubectl get namespace "$namespace" >/dev/null 2>&1; then
|
||||
echo " Creating namespace: $namespace"
|
||||
kubectl create namespace "$namespace"
|
||||
fi
|
||||
|
||||
# Delete existing secret if it exists
|
||||
if kubectl get secret "$SECRET_NAME" -n "$namespace" >/dev/null 2>&1; then
|
||||
echo " Deleting existing secret in namespace: $namespace"
|
||||
kubectl delete secret "$SECRET_NAME" -n "$namespace"
|
||||
fi
|
||||
|
||||
# Create the docker-registry secret for GHCR
|
||||
echo " Creating GHCR secret in namespace: $namespace"
|
||||
kubectl create secret docker-registry "$SECRET_NAME" \
|
||||
--docker-server="$GHCR_SERVER" \
|
||||
--docker-username="$GHCR_USERNAME" \
|
||||
--docker-password="$GHCR_PASSWORD" \
|
||||
--docker-email="$GHCR_EMAIL" \
|
||||
-n "$namespace"
|
||||
|
||||
echo " ✓ Secret created successfully in namespace: $namespace"
|
||||
echo ""
|
||||
done
|
||||
|
||||
echo "=========================================================="
|
||||
echo "GitHub Container Registry secrets setup completed!"
|
||||
echo ""
|
||||
echo "The secret '$SECRET_NAME' has been created in all namespaces:"
|
||||
for namespace in "${NAMESPACES[@]}"; do
|
||||
echo " - $namespace"
|
||||
done
|
||||
echo ""
|
||||
echo "Next steps:"
|
||||
echo "1. Update your Kubernetes manifests to include the GHCR imagePullSecrets"
|
||||
echo "2. Verify pods can pull images from GHCR: kubectl get pods -A"
|
||||
echo "3. Consider updating your CI/CD pipelines to push images to GHCR"
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: ai-insights-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: alert-processor-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: auth-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -21,8 +21,6 @@ spec:
|
||||
app: demo-session-db
|
||||
component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: distribution-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: external-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: forecasting-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: inventory-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: notification-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: orchestrator-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: orders-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: pos-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: procurement-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: production-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: rabbitmq
|
||||
app.kubernetes.io/component: message-broker
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
containers:
|
||||
- name: rabbitmq
|
||||
image: rabbitmq:4.1-management-alpine
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: recipes-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: sales-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: suppliers-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: tenant-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: training-db
|
||||
app.kubernetes.io/component: database
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
securityContext:
|
||||
fsGroup: 70
|
||||
initContainers:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: ai-insights-service
|
||||
app.kubernetes.io/component: microservice
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
# Wait for Redis to be ready
|
||||
- name: wait-for-redis
|
||||
|
||||
@@ -16,8 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: ai-insights-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -16,8 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: alert-processor-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: auth-service
|
||||
app.kubernetes.io/component: microservice
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
# Wait for Redis to be ready
|
||||
- name: wait-for-redis
|
||||
|
||||
@@ -16,9 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: auth-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
- name: ghcr-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -17,8 +17,6 @@ spec:
|
||||
labels:
|
||||
app: demo-cleanup
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
template:
|
||||
metadata:
|
||||
labels:
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
component: background-jobs
|
||||
service: demo-session
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
- name: wait-for-migrations
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -15,8 +15,6 @@ spec:
|
||||
app.kubernetes.io/name: demo-session-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: distribution-service
|
||||
app.kubernetes.io/component: microservice
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
# Wait for Redis to be ready
|
||||
- name: wait-for-redis
|
||||
|
||||
@@ -16,8 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: distribution-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -22,8 +22,6 @@ spec:
|
||||
app: external-service
|
||||
job: data-rotation
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
ttlSecondsAfterFinished: 172800
|
||||
backoffLimit: 2
|
||||
|
||||
|
||||
@@ -23,8 +23,6 @@ spec:
|
||||
app.kubernetes.io/component: microservice
|
||||
version: "2.0"
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
# Wait for Redis to be ready
|
||||
- name: wait-for-redis
|
||||
|
||||
@@ -17,8 +17,6 @@ spec:
|
||||
app: external-service
|
||||
job: data-init
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
restartPolicy: OnFailure
|
||||
|
||||
initContainers:
|
||||
|
||||
@@ -16,9 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: external-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
- name: ghcr-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: forecasting-service
|
||||
app.kubernetes.io/component: microservice
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
# Wait for Redis to be ready
|
||||
- name: wait-for-redis
|
||||
|
||||
@@ -16,8 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: forecasting-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: frontend
|
||||
app.kubernetes.io/component: frontend
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
containers:
|
||||
- name: frontend
|
||||
image: bakery/dashboard:latest
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: inventory-service
|
||||
app.kubernetes.io/component: microservice
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
# Wait for Redis to be ready
|
||||
- name: wait-for-redis
|
||||
|
||||
@@ -16,8 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: inventory-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -16,8 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: notification-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: notification-service
|
||||
app.kubernetes.io/component: microservice
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
# Wait for Redis to be ready
|
||||
- name: wait-for-redis
|
||||
|
||||
@@ -16,8 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: orchestrator-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: orchestrator-service
|
||||
app.kubernetes.io/component: microservice
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
# Wait for Redis to be ready
|
||||
- name: wait-for-redis
|
||||
|
||||
@@ -16,8 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: orders-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: orders-service
|
||||
app.kubernetes.io/component: microservice
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
# Wait for Redis to be ready
|
||||
- name: wait-for-redis
|
||||
|
||||
@@ -16,8 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: pos-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: pos-service
|
||||
app.kubernetes.io/component: microservice
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
# Wait for Redis to be ready
|
||||
- name: wait-for-redis
|
||||
|
||||
@@ -16,8 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: procurement-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: procurement-service
|
||||
app.kubernetes.io/component: microservice
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
# Wait for Redis to be ready
|
||||
- name: wait-for-redis
|
||||
|
||||
@@ -16,8 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: production-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: production-service
|
||||
app.kubernetes.io/component: microservice
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
# Wait for Redis to be ready
|
||||
- name: wait-for-redis
|
||||
|
||||
@@ -16,8 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: recipes-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: recipes-service
|
||||
app.kubernetes.io/component: microservice
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
# Wait for Redis to be ready
|
||||
- name: wait-for-redis
|
||||
|
||||
@@ -16,8 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: sales-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: sales-service
|
||||
app.kubernetes.io/component: microservice
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
# Wait for Redis to be ready
|
||||
- name: wait-for-redis
|
||||
|
||||
@@ -16,8 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: suppliers-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: suppliers-service
|
||||
app.kubernetes.io/component: microservice
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
# Wait for Redis to be ready
|
||||
- name: wait-for-redis
|
||||
|
||||
@@ -16,9 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: tenant-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
- name: ghcr-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -19,9 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: tenant-service
|
||||
app.kubernetes.io/component: microservice
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
- name: ghcr-creds
|
||||
initContainers:
|
||||
# Wait for Redis to be ready
|
||||
- name: wait-for-redis
|
||||
|
||||
@@ -16,8 +16,6 @@ spec:
|
||||
app.kubernetes.io/name: training-migration
|
||||
app.kubernetes.io/component: migration
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
- name: wait-for-db
|
||||
image: postgres:17-alpine
|
||||
|
||||
@@ -19,8 +19,6 @@ spec:
|
||||
app.kubernetes.io/name: training-service
|
||||
app.kubernetes.io/component: microservice
|
||||
spec:
|
||||
imagePullSecrets:
|
||||
- name: dockerhub-creds
|
||||
initContainers:
|
||||
# Wait for Redis to be ready
|
||||
- name: wait-for-redis
|
||||
|
||||
@@ -53,6 +53,8 @@ BASE_IMAGES=(
|
||||
"ghcr.io/mailu/postfix:2024.06"
|
||||
"ghcr.io/mailu/dovecot:2024.06"
|
||||
"ghcr.io/mailu/rspamd:2024.06"
|
||||
# DNS resolver (Unbound)
|
||||
"mvance/unbound:latest"
|
||||
)
|
||||
|
||||
# Local registry configuration
|
||||
|
||||
72
secrets_test.yaml
Normal file
@@ -0,0 +1,72 @@
|
||||
# Secret for Gitea webhook validation
|
||||
# Used by EventListener to validate incoming webhooks
|
||||
apiVersion: v1
|
||||
kind: Secret
|
||||
metadata:
|
||||
name: gitea-webhook-secret
|
||||
namespace: {{ .Values.namespace }}
|
||||
labels:
|
||||
app.kubernetes.io/name: {{ .Values.labels.app.name }}
|
||||
app.kubernetes.io/component: triggers
|
||||
annotations:
|
||||
note: "Webhook secret for validating incoming webhooks"
|
||||
type: Opaque
|
||||
stringData:
|
||||
secretToken: {{ .Values.secrets.webhook.token | quote }}
|
||||
---
|
||||
# Secret for Gitea container registry credentials
|
||||
# Used by Kaniko to push images to Gitea registry
|
||||
apiVersion: v1
|
||||
kind: Secret
|
||||
metadata:
|
||||
name: gitea-registry-credentials
|
||||
namespace: {{ .Values.namespace }}
|
||||
labels:
|
||||
app.kubernetes.io/name: {{ .Values.labels.app.name }}
|
||||
app.kubernetes.io/component: build
|
||||
annotations:
|
||||
note: "Registry credentials for pushing images"
|
||||
type: kubernetes.io/dockerconfigjson
|
||||
stringData:
|
||||
.dockerconfigjson: |
|
||||
{
|
||||
"auths": {
|
||||
{{ .Values.secrets.registry.registryUrl | quote }}: {
|
||||
"username": {{ .Values.secrets.registry.username | quote }},
|
||||
"password": {{ .Values.secrets.registry.password | quote }}
|
||||
}
|
||||
}
|
||||
}
|
||||
---
|
||||
# Secret for Git credentials (used by pipeline to push GitOps updates)
|
||||
apiVersion: v1
|
||||
kind: Secret
|
||||
metadata:
|
||||
name: gitea-git-credentials
|
||||
namespace: {{ .Values.namespace }}
|
||||
labels:
|
||||
app.kubernetes.io/name: {{ .Values.labels.app.name }}
|
||||
app.kubernetes.io/component: gitops
|
||||
annotations:
|
||||
note: "Git credentials for GitOps updates"
|
||||
type: Opaque
|
||||
stringData:
|
||||
username: {{ .Values.secrets.git.username | quote }}
|
||||
password: {{ .Values.secrets.git.password | quote }}
|
||||
---
|
||||
# Secret for Flux GitRepository access
|
||||
# Used by Flux to pull from Gitea repository
|
||||
apiVersion: v1
|
||||
kind: Secret
|
||||
metadata:
|
||||
name: gitea-credentials
|
||||
namespace: {{ .Values.pipeline.deployment.fluxNamespace }}
|
||||
labels:
|
||||
app.kubernetes.io/name: {{ .Values.labels.app.name }}
|
||||
app.kubernetes.io/component: flux
|
||||
annotations:
|
||||
note: "Credentials for Flux GitRepository access"
|
||||
type: Opaque
|
||||
stringData:
|
||||
username: {{ .Values.secrets.git.username | quote }}
|
||||
password: {{ .Values.secrets.git.password | quote }}
|
||||
22
test_secrets.yaml
Normal file
@@ -0,0 +1,22 @@
|
||||
# Test version of the secrets file to isolate the issue
|
||||
apiVersion: v1
|
||||
kind: Secret
|
||||
metadata:
|
||||
name: gitea-registry-credentials
|
||||
namespace: {{ .Values.namespace }}
|
||||
labels:
|
||||
app.kubernetes.io/name: {{ .Values.labels.app.name }}
|
||||
app.kubernetes.io/component: build
|
||||
annotations:
|
||||
note: "Registry credentials for pushing images"
|
||||
type: kubernetes.io/dockerconfigjson
|
||||
stringData:
|
||||
.dockerconfigjson: |
|
||||
{
|
||||
"auths": {
|
||||
{{ .Values.secrets.registry.registryUrl | quote }}: {
|
||||
"username": {{ .Values.secrets.registry.username | quote }},
|
||||
"password": {{ .Values.secrets.registry.password | quote }}
|
||||
}
|
||||
}
|
||||
}
|
||||