Initial commit - production deployment
119
infrastructure/NAMESPACES.md
Normal file
# Bakery-IA Namespace Management

## Overview

This document explains the namespace strategy for the Bakery-IA platform and how to properly manage namespaces during deployment.

## Namespace Architecture

The Bakery-IA platform uses the following namespaces:

### Core Namespaces

1. **`bakery-ia`** - Main application namespace
   - Contains all microservices, databases, and application components
   - Defined in: `infrastructure/namespaces/bakery-ia.yaml`

2. **`tekton-pipelines`** - CI/CD pipeline namespace
   - Contains Tekton pipeline resources, tasks, and triggers
   - Defined in: `infrastructure/namespaces/tekton-pipelines.yaml`

3. **`flux-system`** - GitOps namespace
   - Contains Flux CD components for GitOps deployments
   - Now defined in the Helm chart: `infrastructure/cicd/flux/templates/namespace.yaml`

### Infrastructure Namespaces

Additional namespaces may be created for:

- Monitoring components
- Logging components
- Security components

## Deployment Order

**CRITICAL**: Namespaces must be created BEFORE any resources that depend on them.

### Correct Deployment Sequence

```bash
# 1. Create namespaces first
kubectl apply -f infrastructure/namespaces/

# 2. Apply common configurations (depends on the bakery-ia namespace)
kubectl apply -f infrastructure/environments/common/configs/

# 3. Apply platform components
kubectl apply -f infrastructure/platform/

# 4. Apply CI/CD components (depends on the tekton-pipelines namespace)
kubectl apply -f infrastructure/cicd/

# 5. Apply monitoring components
kubectl apply -f infrastructure/monitoring/
```

## Common Issues and Solutions

### Issue: "namespace not found" errors

**Symptoms**: Errors like:

```
Error from server (NotFound): error when creating "path/to/resource.yaml": namespaces "[namespace-name]" not found
```

**Solutions**:

1. **Ensure namespaces are created first** - Use the deployment script that applies namespaces before other resources
2. **Check for templating issues** - If you see names like `[redacted secret rabbitmq-secrets:RABBITMQ_USER]-ia`, environment variable substitution may be happening incorrectly
3. **Verify namespace YAML files** - Ensure the namespace files exist and are properly formatted

### Issue: Resource conflicts across namespaces

**Solution**: Use proper namespace isolation and RBAC policies to prevent cross-namespace conflicts.

## Best Practices

1. **Namespace Isolation**: Keep resources properly isolated by namespace
2. **RBAC**: Use namespace-specific RBAC roles and bindings
3. **Resource Quotas**: Apply resource quotas per namespace
4. **Network Policies**: Use network policies to control cross-namespace communication
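
The quota and network-policy practices above can be expressed as small per-namespace manifests. The sketch below is illustrative only: the resource names and limit values are assumptions, not taken from this repository.

```yaml
# Illustrative only - names and limits are assumptions, not from the repo.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: bakery-ia-quota
  namespace: bakery-ia
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
---
# Allow traffic between pods in bakery-ia; deny ingress from other namespaces.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: bakery-ia
spec:
  podSelector: {}
  ingress:
    - from:
        - podSelector: {}
```

Selecting all pods with an empty `podSelector` and only allowing `from` the same namespace's pods gives a default-deny posture for cross-namespace traffic.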

## Troubleshooting

### Verify namespaces exist

```bash
kubectl get namespaces
```

### Check namespace labels

```bash
kubectl get namespace bakery-ia --show-labels
```

### View namespace events

```bash
kubectl describe namespace bakery-ia
```

## Migration from Old Structure

If you're migrating from the old structure, where namespace definitions were scattered across different directories:

1. **Remove old namespace files** from:
   - `infrastructure/environments/common/configs/namespace.yaml`
   - `infrastructure/cicd/flux/namespace.yaml`

2. **Update kustomization files** to reference the centralized namespace files

3. **Use the new deployment script**, which follows the correct order

## Future Enhancements

- Add namespace lifecycle management
- Implement namespace cleanup scripts
- Add namespace validation checks to CI/CD pipelines
57
infrastructure/README.md
Normal file
# Bakery-IA Infrastructure

This directory contains all infrastructure-as-code for the Bakery-IA project, organized according to best practices for maintainability and scalability.

## Directory Structure

```
infrastructure/
├── environments/          # Environment-specific configurations
│   ├── dev/               # Development environment
│   │   ├── k8s-manifests/ # Kubernetes manifests for dev
│   │   └── values/        # Environment-specific values
│   ├── staging/           # Staging environment
│   │   ├── k8s-manifests/
│   │   └── values/
│   └── prod/              # Production environment
│       ├── k8s-manifests/
│       ├── terraform/     # Production-specific IaC
│       └── values/
├── platform/              # Platform-level infrastructure
│   ├── cluster/           # Cluster configuration (EKS, Kind)
│   ├── networking/        # Network configuration
│   ├── security/          # Security policies and TLS
│   └── storage/           # Storage configuration
├── services/              # Application services
│   ├── databases/         # Database configurations
│   ├── api-gateway/       # API gateway configuration
│   └── microservices/     # Individual microservice configs
├── monitoring/            # Observability stack
│   └── signoz/            # SigNoz configuration
├── cicd/                  # CI/CD pipeline components
├── security/              # Security configurations
├── scripts/               # Automation scripts
└── docs/                  # Infrastructure documentation
```

## Environments

Each environment (dev, staging, prod) has its own configuration with appropriate isolation and security settings.

## Services

Services are organized by business domain, with clear separation between databases, microservices, and infrastructure components.

## Getting Started

1. **Local Development**: Use `tilt up` to start the development environment
2. **Deployment**: Use `skaffold run` to deploy to your target environment
3. **CI/CD**: Tekton pipelines manage automated deployments

## Security

Security configurations are centralized in the `security/` directory with:

- TLS certificates and rotation scripts
- Network policies
- RBAC configurations
- Compliance checks
298
infrastructure/cicd/README.md
Normal file
# Bakery-IA CI/CD Implementation

This directory contains the configuration for the production-grade CI/CD system for Bakery-IA, built on Gitea, Tekton, and Flux CD.

## Architecture Overview

```mermaid
graph TD
    A[Developer] -->|Push Code| B[Gitea]
    B -->|Webhook| C[Tekton Pipelines]
    C -->|Build/Test| D[Gitea Registry]
    D -->|New Image| E[Flux CD]
    E -->|kubectl apply| F[MicroK8s Cluster]
    F -->|Metrics| G[SigNoz]
```

## Directory Structure

```
infrastructure/ci-cd/
├── gitea/                    # Gitea configuration (Git server + registry)
│   └── values.yaml           # Helm values for Gitea (ingress now in main config)
├── tekton/                   # Tekton CI/CD pipeline configuration
│   ├── tasks/                # Individual pipeline tasks
│   │   ├── git-clone.yaml
│   │   ├── detect-changes.yaml
│   │   ├── kaniko-build.yaml
│   │   └── update-gitops.yaml
│   ├── pipelines/            # Pipeline definitions
│   │   └── ci-pipeline.yaml
│   └── triggers/             # Webhook trigger configuration
│       ├── trigger-template.yaml
│       ├── trigger-binding.yaml
│       ├── event-listener.yaml
│       └── gitlab-interceptor.yaml
├── flux/                     # Flux CD GitOps Helm chart configuration
│   ├── Chart.yaml            # Helm chart definition
│   ├── values.yaml           # Default configuration values
│   ├── templates/            # Kubernetes manifest templates
│   │   ├── gitrepository.yaml
│   │   ├── kustomization.yaml
│   │   └── namespace.yaml
│   └── values/               # Additional value files
├── monitoring/               # Monitoring configuration
│   └── otel-collector.yaml   # OpenTelemetry collector
└── README.md                 # This file
```

## Deployment Instructions

### Phase 1: Infrastructure Setup

1. **Deploy Gitea**:
   ```bash
   # Add Helm repo
   microk8s helm repo add gitea https://dl.gitea.io/charts

   # Create namespace
   microk8s kubectl create namespace gitea

   # Install Gitea
   microk8s helm install gitea gitea/gitea \
     -n gitea \
     -f infrastructure/ci-cd/gitea/values.yaml

   # Note: the Gitea ingress is now included in the main ingress configuration;
   # no separate ingress needs to be applied.
   ```

2. **Deploy Tekton**:
   ```bash
   # Create namespace
   microk8s kubectl create namespace tekton-pipelines

   # Install Tekton Pipelines
   microk8s kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml

   # Install Tekton Triggers
   microk8s kubectl apply -f https://storage.googleapis.com/tekton-releases/triggers/latest/release.yaml

   # Apply Tekton configurations
   microk8s kubectl apply -f infrastructure/ci-cd/tekton/tasks/
   microk8s kubectl apply -f infrastructure/ci-cd/tekton/pipelines/
   microk8s kubectl apply -f infrastructure/ci-cd/tekton/triggers/
   ```

3. **Deploy Flux CD** (already enabled in MicroK8s):
   ```bash
   # Verify Flux installation
   microk8s kubectl get pods -n flux-system

   # Apply Flux configurations using kustomize
   microk8s kubectl apply -k infrastructure/ci-cd/flux/
   ```

### Phase 2: Configuration

1. **Set up the Gitea webhook**:
   - Go to your Gitea repository settings
   - Add a webhook with URL: `http://tekton-triggers.tekton-pipelines.svc.cluster.local:8080`
   - Use the secret from `gitea-webhook-secret`

2. **Configure registry credentials**:
   ```bash
   # Create registry credentials secret
   microk8s kubectl create secret docker-registry gitea-registry-credentials \
     -n tekton-pipelines \
     --docker-server=gitea.bakery-ia.local:5000 \
     --docker-username=your-username \
     --docker-password=your-password
   ```

3. **Configure Git credentials for Flux**:
   ```bash
   # Create Git credentials secret
   microk8s kubectl create secret generic gitea-credentials \
     -n flux-system \
     --from-literal=username=your-username \
     --from-literal=password=your-password
   ```
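
On the cluster side, the webhook configured in step 1 is received by a Tekton `EventListener`. A minimal sketch of what `triggers/event-listener.yaml` might contain is shown below; the resource names, service account, and binding/template refs are illustrative assumptions, not taken from this repository.

```yaml
# Illustrative sketch - names and refs are assumptions, not the repo's actual file.
apiVersion: triggers.tekton.dev/v1beta1
kind: EventListener
metadata:
  name: gitea-listener
  namespace: tekton-pipelines
spec:
  serviceAccountName: tekton-triggers-sa
  triggers:
    - name: gitea-push
      interceptors:
        # Gitea webhooks are close enough to GitLab's format that the
        # GitLab interceptor (see gitlab-interceptor.yaml) can validate them.
        - ref:
            name: gitlab
          params:
            - name: secretRef
              value:
                secretName: gitea-webhook-secret
                secretKey: token
      bindings:
        - ref: gitea-push-binding     # extracts git-url / git-revision
      template:
        ref: bakery-ia-ci-template    # instantiates the PipelineRun
```

The interceptor verifies the webhook signature against `gitea-webhook-secret`, the binding pulls parameters out of the push payload, and the template turns them into a `PipelineRun`.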

### Phase 3: Monitoring

```bash
# Apply OpenTelemetry configuration
microk8s kubectl apply -f infrastructure/ci-cd/monitoring/otel-collector.yaml
```

## Usage

### Triggering a Pipeline

1. **Manual trigger**:
   ```bash
   # Create a PipelineRun manually
   microk8s kubectl create -f - <<EOF
   apiVersion: tekton.dev/v1beta1
   kind: PipelineRun
   metadata:
     name: manual-ci-run
     namespace: tekton-pipelines
   spec:
     pipelineRef:
       name: bakery-ia-ci
     workspaces:
       - name: shared-workspace
         volumeClaimTemplate:
           spec:
             accessModes: ["ReadWriteOnce"]
             resources:
               requests:
                 storage: 5Gi
       - name: docker-credentials
         secret:
           secretName: gitea-registry-credentials
     params:
       - name: git-url
         value: "http://gitea.bakery-ia.local/bakery-admin/bakery-ia.git"
       - name: git-revision
         value: "main"
   EOF
   ```

2. **Automatic trigger**: Push code to the repository and the webhook will trigger the pipeline automatically.

### Monitoring Pipeline Runs

```bash
# List all PipelineRuns
microk8s kubectl get pipelineruns -n tekton-pipelines

# View logs for a specific PipelineRun
microk8s kubectl logs -n tekton-pipelines <pipelinerun-pod> -c <step-name>

# View the Tekton dashboard
microk8s kubectl port-forward -n tekton-pipelines svc/tekton-dashboard 9097:9097
```

## Troubleshooting

### Common Issues

1. **Pipeline not triggering**:
   - Check Gitea webhook logs
   - Verify EventListener pods are running
   - Check the TriggerBinding configuration

2. **Build failures**:
   - Check Kaniko logs for build errors
   - Verify Dockerfile paths are correct
   - Ensure registry credentials are valid

3. **Flux not applying changes**:
   - Check GitRepository status
   - Verify Kustomization reconciliation
   - Check Flux logs for errors

### Debugging Commands

```bash
# Check Tekton controller logs
microk8s kubectl logs -n tekton-pipelines -l app=tekton-pipelines-controller

# Check Flux reconciliation
microk8s kubectl get kustomizations -n flux-system -o yaml

# Check Gitea webhook delivery
microk8s kubectl logs -n tekton-pipelines -l app=tekton-triggers-controller
```
## Security Considerations

1. **Secrets Management**:
   - Use Kubernetes secrets for sensitive data
   - Rotate credentials regularly
   - Use RBAC for namespace isolation

2. **Network Security**:
   - Configure network policies
   - Use internal DNS names
   - Restrict ingress access

3. **Registry Security**:
   - Enable image scanning
   - Use image signing
   - Implement cleanup policies

## Maintenance

### Upgrading Components

```bash
# Upgrade Tekton
microk8s kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml

# Upgrade Flux
microk8s helm upgrade fluxcd fluxcd/flux2 -n flux-system

# Upgrade Gitea
microk8s helm upgrade gitea gitea/gitea -n gitea -f infrastructure/ci-cd/gitea/values.yaml
```

### Backup Procedures

```bash
# Backup Gitea
microk8s kubectl exec -n gitea gitea-0 -- gitea dump -c /data/gitea/conf/app.ini

# Backup Flux configurations
microk8s kubectl get all -n flux-system -o yaml > flux-backup.yaml

# Backup Tekton configurations
microk8s kubectl get all -n tekton-pipelines -o yaml > tekton-backup.yaml
```

## Performance Optimization

1. **Resource Management**:
   - Set appropriate resource limits
   - Limit concurrent builds
   - Use node selectors for build pods

2. **Caching**:
   - Configure the Kaniko cache
   - Use persistent volumes for dependencies
   - Cache Docker layers

3. **Parallelization**:
   - Build independent services in parallel
   - Use matrix builds for different architectures
   - Optimize task dependencies

## Integration with Existing System

The CI/CD system integrates with:

- **SigNoz**: For monitoring and observability
- **MicroK8s**: For cluster management
- **Existing Kubernetes manifests**: In `infrastructure/kubernetes/`
- **Current services**: All 19 microservices in `services/`

## Migration Plan

1. **Phase 1**: Set up infrastructure (Gitea, Tekton, Flux)
2. **Phase 2**: Configure pipelines and triggers
3. **Phase 3**: Test with non-critical services
4. **Phase 4**: Gradual rollout to all services
5. **Phase 5**: Decommission old deployment methods

## Support

For issues with the CI/CD system:

- Check logs and monitoring first
- Review the troubleshooting section
- Consult the original implementation plan
- Refer to component documentation:
  - [Tekton Documentation](https://tekton.dev/docs/)
  - [Flux CD Documentation](https://fluxcd.io/docs/)
  - [Gitea Documentation](https://docs.gitea.io/)
6
infrastructure/cicd/flux/Chart.yaml
Normal file
```yaml
apiVersion: v2
name: flux-cd
description: A Helm chart for deploying Flux CD GitOps toolkit for Bakery-IA
type: application
version: 0.1.0
appVersion: "2.2.3"
```
15
infrastructure/cicd/flux/templates/gitrepository.yaml
Normal file
```yaml
{{- if .Values.gitRepository }}
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: {{ .Values.gitRepository.name }}
  namespace: {{ .Values.gitRepository.namespace }}
spec:
  interval: {{ .Values.gitRepository.interval }}
  url: {{ .Values.gitRepository.url }}
  ref:
    branch: {{ .Values.gitRepository.ref.branch }}
  secretRef:
    name: {{ .Values.gitRepository.secretRef.name }}
  timeout: {{ .Values.gitRepository.timeout }}
{{- end }}
```
43
infrastructure/cicd/flux/templates/kustomization.yaml
Normal file
```yaml
{{- if .Values.kustomization }}
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: {{ .Values.kustomization.name }}
  namespace: {{ .Values.kustomization.namespace }}
  labels:
    app.kubernetes.io/name: bakery-ia
    app.kubernetes.io/component: flux
spec:
  # Wait for the GitRepository to be ready before reconciling
  dependsOn: []
  interval: {{ .Values.kustomization.interval }}
  path: {{ .Values.kustomization.path }}
  prune: {{ .Values.kustomization.prune }}
  sourceRef:
    kind: {{ .Values.kustomization.sourceRef.kind }}
    name: {{ .Values.kustomization.sourceRef.name }}
  targetNamespace: {{ .Values.kustomization.targetNamespace }}
  timeout: {{ .Values.kustomization.timeout }}
  retryInterval: {{ .Values.kustomization.retryInterval }}
  wait: {{ .Values.kustomization.wait }}
  {{- if .Values.kustomization.healthChecks }}
  healthChecks:
    {{- range .Values.kustomization.healthChecks }}
    - apiVersion: {{ .apiVersion }}
      kind: {{ .kind }}
      name: {{ .name }}
      namespace: {{ .namespace }}
    {{- end }}
  {{- end }}
  {{- if .Values.kustomization.postBuild }}
  postBuild:
    substituteFrom:
      {{- range .Values.kustomization.postBuild.substituteFrom }}
      - kind: {{ .kind }}
        name: {{ .name }}
        {{- if .optional }}
        optional: {{ .optional }}
        {{- end }}
      {{- end }}
  {{- end }}
{{- end }}
```
9
infrastructure/cicd/flux/templates/namespace.yaml
Normal file
```yaml
{{- if .Values.createNamespace | default false }}
apiVersion: v1
kind: Namespace
metadata:
  name: {{ .Values.gitRepository.namespace }}
  labels:
    app.kubernetes.io/name: flux
    kubernetes.io/metadata.name: {{ .Values.gitRepository.namespace }}
{{- end }}
```
73
infrastructure/cicd/flux/values.yaml
Normal file
```yaml
# Default values for flux-cd
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

gitRepository:
  name: bakery-ia
  namespace: flux-system
  interval: 1m
  url: http://gitea-http.gitea.svc.cluster.local:3000/bakery-admin/bakery-ia.git
  ref:
    branch: main
  secretRef:
    name: gitea-credentials
  timeout: 60s

kustomization:
  name: bakery-ia-prod
  namespace: flux-system
  interval: 5m
  path: ./infrastructure/environments/prod
  prune: true
  sourceRef:
    kind: GitRepository
    name: bakery-ia
  targetNamespace: bakery-ia
  timeout: 10m
  retryInterval: 1m
  wait: true
  healthChecks:
    # Core Infrastructure
    - apiVersion: apps/v1
      kind: Deployment
      name: gateway
      namespace: bakery-ia
    # Authentication & Authorization
    - apiVersion: apps/v1
      kind: Deployment
      name: auth-service
      namespace: bakery-ia
    - apiVersion: apps/v1
      kind: Deployment
      name: tenant-service
      namespace: bakery-ia
    # Core Business Services
    - apiVersion: apps/v1
      kind: Deployment
      name: inventory-service
      namespace: bakery-ia
    - apiVersion: apps/v1
      kind: Deployment
      name: orders-service
      namespace: bakery-ia
    - apiVersion: apps/v1
      kind: Deployment
      name: pos-service
      namespace: bakery-ia
    # Data Services
    - apiVersion: apps/v1
      kind: Deployment
      name: forecasting-service
      namespace: bakery-ia
    - apiVersion: apps/v1
      kind: Deployment
      name: notification-service
      namespace: bakery-ia
  postBuild:
    substituteFrom:
      - kind: ConfigMap
        name: bakery-ia-config
        optional: true
      - kind: Secret
        name: bakery-ia-secrets
        optional: true
```
151
infrastructure/cicd/gitea/IMPLEMENTATION_SUMMARY.md
Normal file
# Gitea Automatic Repository Creation - Implementation Summary

## Overview

This implementation adds automatic repository creation to the Gitea Helm chart configuration for the Bakery-IA project. When Gitea is installed or upgraded via Helm, it automatically creates a `bakery-ia` repository with the specified configuration.

## Changes Made

### 1. Updated Helm Values (`values.yaml`)

Added the `initialRepositories` configuration under the `gitea:` section:

```yaml
# Initial repositories to create automatically after Gitea installation
# These will be created with the admin user as owner
gitea:
  initialRepositories:
    - name: bakery-ia
      description: "Main repository for Bakery IA project - Automatically created by Helm"
      private: false
      auto_init: true
      default_branch: main
      owner: "{{ .Values.gitea.admin.username }}"
      # Enable issues, wiki, and other features
      enable_issues: true
      enable_wiki: true
      enable_pull_requests: true
      enable_projects: true
```

### 2. Created Setup Script (`setup-gitea-repository.sh`)

A comprehensive bash script that:

- Checks that Gitea is accessible
- Verifies whether the repository exists (and creates it if not)
- Configures the local Git repository
- Pushes the existing code to the new Gitea repository

### 3. Created Test Script (`test-repository-creation.sh`)

A test script that verifies:

- Gitea accessibility
- Repository existence
- Repository configuration (issues, wiki, pull requests)

It also prints detailed repository information.

### 4. Created Documentation

- **README.md**: Complete guide on installation, usage, and troubleshooting
- **IMPLEMENTATION_SUMMARY.md**: This file, summarizing the implementation

## How It Works

### Automatic Repository Creation Flow

1. **Helm Installation**: `helm install` or `helm upgrade` is executed with the updated values
2. **Gitea Initialization**: Gitea starts and creates the admin user
3. **Repository Creation**: Gitea processes the `initialRepositories` configuration and creates the specified repositories
4. **Completion**: The repository is ready for use as soon as Gitea is fully initialized

### Key Features

- **Automatic**: No manual intervention required after Helm installation
- **Idempotent**: Safe to run multiple times (won't duplicate repositories)
- **Configurable**: All repository settings are defined in Helm values
- **Integrated**: Uses native Gitea Helm chart features
## Usage

### Installation

```bash
# Install Gitea with automatic repository creation
helm install gitea gitea/gitea -n gitea \
  -f infrastructure/cicd/gitea/values.yaml \
  --set gitea.admin.password=your-secure-password
```

### Push Existing Code

```bash
export GITEA_ADMIN_PASSWORD="your-secure-password"
./infrastructure/cicd/gitea/setup-gitea-repository.sh
```

### Verify Repository

```bash
export GITEA_ADMIN_PASSWORD="your-secure-password"
./infrastructure/cicd/gitea/test-repository-creation.sh
```

## Repository Configuration

The automatically created repository includes:

| Setting | Value | Description |
|---------|-------|-------------|
| Name | bakery-ia | Main project repository |
| Description | Main repository for Bakery IA project | Clear identification |
| Visibility | Public | Accessible without authentication |
| Auto Init | Yes | Creates initial README.md |
| Default Branch | main | Standard branch naming |
| Issues | Yes | Bug and feature tracking |
| Wiki | Yes | Project documentation |
| Pull Requests | Yes | Code review workflow |
| Projects | Yes | Project management |

## CI/CD Integration

The repository is ready for immediate CI/CD integration:

- **Repository URL**: `https://gitea.bakery-ia.local/bakery-admin/bakery-ia.git`
- **Clone URL**: `https://gitea.bakery-ia.local/bakery-admin/bakery-ia.git`
- **SSH URL**: `git@gitea.bakery-ia.local:bakery-admin/bakery-ia.git`
## Benefits

1. **Automation**: Eliminates the manual repository creation step
2. **Consistency**: Ensures all environments have the same repository structure
3. **Reliability**: Uses Helm's declarative configuration management
4. **Documentation**: Clear repository purpose and features
5. **CI/CD Ready**: Repository is immediately available for pipeline configuration

## Troubleshooting

### Repository Not Created

1. **Check Helm Values**: Ensure the `initialRepositories` section is correctly formatted
2. **Verify Gitea Logs**: `kubectl logs -n gitea -l app.kubernetes.io/name=gitea`
3. **Manual Creation**: Use the setup script to create the repository manually

### Authentication Issues

1. **Verify Password**: Ensure `GITEA_ADMIN_PASSWORD` is correct
2. **Check Accessibility**: Confirm the Gitea service is running and accessible
3. **Network Configuration**: Verify ingress and DNS settings

## Future Enhancements

Potential improvements for future iterations:

1. **Multiple Repositories**: Add more repositories for different components
2. **Webhooks**: Automatically configure webhooks for CI/CD triggers
3. **Teams and Permissions**: Set up teams and access controls
4. **Template Repositories**: Create repository templates with standard files
5. **Backup Configuration**: Add automatic backup configuration

## Conclusion

This implementation provides a robust, automated solution for Gitea repository creation in the Bakery-IA project. It leverages Helm's native capabilities to ensure consistent, reliable repository setup across all environments.
188
infrastructure/cicd/gitea/README.md
Normal file
# Gitea Configuration for Bakery-IA CI/CD

This directory contains the Helm values and scripts for setting up Gitea as the Git server for the Bakery-IA project.

## Features

- **Automatic Admin User**: The admin user is created automatically from a Kubernetes secret
- **Automatic Repository Creation**: The `bakery-ia` repository is created via a Kubernetes Job after Gitea starts
- **Registry Support**: Container registry enabled for storing Docker images
- **Tekton Integration**: Webhook automatically configured if Tekton is installed

## Quick Start

### Development

```bash
# 1. Set up secrets and the init job (uses the default dev password)
./infrastructure/cicd/gitea/setup-admin-secret.sh

# 2. Install Gitea
helm repo add gitea https://dl.gitea.io/charts
helm install gitea gitea/gitea -n gitea -f infrastructure/cicd/gitea/values.yaml

# 3. Wait for everything to be ready
kubectl wait --for=condition=ready pod -n gitea -l app.kubernetes.io/name=gitea --timeout=300s

# 4. Check that the init job completed
kubectl logs -n gitea -l app.kubernetes.io/component=init --tail=50
```

### Production

```bash
# 1. Generate and export a secure password
export GITEA_ADMIN_PASSWORD=$(openssl rand -base64 32)

# 2. Set up secrets with the production flag (requires GITEA_ADMIN_PASSWORD)
./infrastructure/cicd/gitea/setup-admin-secret.sh --production

# 3. Install Gitea with production values
helm repo add gitea https://dl.gitea.io/charts
helm upgrade --install gitea gitea/gitea -n gitea \
  -f infrastructure/cicd/gitea/values.yaml \
  -f infrastructure/cicd/gitea/values-prod.yaml

# 4. Wait for everything to be ready
kubectl wait --for=condition=ready pod -n gitea -l app.kubernetes.io/name=gitea --timeout=300s

# 5. Install Tekton CI/CD (see tekton-helm/README.md for details)
export TEKTON_WEBHOOK_TOKEN=$(openssl rand -hex 32)
helm upgrade --install tekton-cicd infrastructure/cicd/tekton-helm \
  -n tekton-pipelines \
  -f infrastructure/cicd/tekton-helm/values.yaml \
  -f infrastructure/cicd/tekton-helm/values-prod.yaml \
  --set secrets.webhook.token=$TEKTON_WEBHOOK_TOKEN \
  --set secrets.registry.password=$GITEA_ADMIN_PASSWORD \
  --set secrets.git.password=$GITEA_ADMIN_PASSWORD
```

## Files

| File | Description |
|------|-------------|
| `values.yaml` | Helm values for the Gitea chart |
| `values-prod.yaml` | Production Helm values |
| `setup-admin-secret.sh` | Creates secrets and applies the init job |
| `gitea-init-job.yaml` | Kubernetes Job to create the initial repository |
| `setup-gitea-repository.sh` | Helper to push local code to Gitea |
## How It Works

### 1. Admin User Initialization

The Gitea Helm chart automatically creates the admin user on first install. Credentials are read from a Kubernetes secret:

```yaml
gitea:
  admin:
    username: bakery-admin
    email: admin@bakery-ia.local
    existingSecret: gitea-admin-secret  # Secret with username/password keys
    passwordMode: keepUpdated           # Sync password changes from secret
```

The `setup-admin-secret.sh` script creates this secret before Helm install.
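
If you prefer to create the secret by hand (for example, in your own bootstrap scripting), the equivalent of what the script does is a single generic secret with `username` and `password` keys; a minimal sketch, assuming `GITEA_ADMIN_PASSWORD` is already exported:

```shell
# Manually create the admin secret the Helm chart reads
# (equivalent to what setup-admin-secret.sh does)
kubectl create secret generic gitea-admin-secret \
  --namespace gitea \
  --from-literal=username=bakery-admin \
  --from-literal=password="$GITEA_ADMIN_PASSWORD" \
  --dry-run=client -o yaml | kubectl apply -f -
```

The `--dry-run=client -o yaml | kubectl apply` pattern makes the command idempotent, so it can be re-run safely.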

### 2. Repository Initialization

Since the Gitea Helm chart doesn't support automatic repository creation, we use a Kubernetes Job (`gitea-init-job.yaml`) that:

1. Waits for Gitea to be ready
2. Creates the `bakery-ia` repository via the Gitea API
3. Optionally configures a webhook for Tekton CI/CD

The Job is idempotent: it skips creation if the repository already exists.

## Detailed Installation

### Step 1: Create Secrets

```bash
# Using default password (for dev environments)
./infrastructure/cicd/gitea/setup-admin-secret.sh

# Or specify a custom password
./infrastructure/cicd/gitea/setup-admin-secret.sh "your-secure-password"

# Or use environment variable
export GITEA_ADMIN_PASSWORD="your-secure-password"
./infrastructure/cicd/gitea/setup-admin-secret.sh
```

This creates:
- `gitea-admin-secret` in the `gitea` namespace, used by Gitea for admin credentials
- `gitea-registry-secret` in the `bakery-ia` namespace, used for `imagePullSecrets`

It also applies `gitea-init-job.yaml` (ConfigMap + Job)
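
Before moving on, it can help to confirm both secrets actually landed in their namespaces. A quick check, assuming your `kubectl` context points at the target cluster:

```shell
# Verify both secrets exist, and that the registry secret has the expected type
kubectl get secret gitea-admin-secret -n gitea
kubectl get secret gitea-registry-secret -n bakery-ia -o jsonpath='{.type}'
# The second command should print: kubernetes.io/dockerconfigjson
```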

### Step 2: Install Gitea

```bash
helm repo add gitea https://dl.gitea.io/charts
helm repo update

helm install gitea gitea/gitea -n gitea \
  -f infrastructure/cicd/gitea/values.yaml
```

### Step 3: Verify Installation

```bash
# Wait for Gitea pod
kubectl wait --for=condition=ready pod -n gitea -l app.kubernetes.io/name=gitea --timeout=300s

# Check init job logs
kubectl logs -n gitea job/gitea-init-repo

# Verify repository was created (uses the admin password from setup)
curl -u "bakery-admin:$GITEA_ADMIN_PASSWORD" \
  https://gitea.bakery-ia.local/api/v1/repos/bakery-admin/bakery-ia
```

## CI/CD Integration

Repository URL:
```
https://gitea.bakery-ia.local/bakery-admin/bakery-ia.git
```

Internal cluster URL (for pipelines):
```
http://gitea-http.gitea.svc.cluster.local:3000/bakery-admin/bakery-ia.git
```
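
A pipeline task running inside the cluster can clone over the internal service URL without going through the ingress. A hedged sketch (the inline credential form shown here is illustrative, not the project's actual Tekton task definition, which should inject credentials via a secret):

```shell
# Clone via the in-cluster service URL, authenticating with the admin credentials
git clone "http://bakery-admin:${GITEA_ADMIN_PASSWORD}@gitea-http.gitea.svc.cluster.local:3000/bakery-admin/bakery-ia.git"
```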

## Troubleshooting

### Init Job Failed

```bash
# Check job status
kubectl get jobs -n gitea

# View logs
kubectl logs -n gitea job/gitea-init-repo

# Re-run the job
kubectl delete job gitea-init-repo -n gitea
kubectl apply -f infrastructure/cicd/gitea/gitea-init-job.yaml
```

### Repository Not Created

1. Check if Gitea is ready: `kubectl get pods -n gitea`
2. Check the init job logs: `kubectl logs -n gitea job/gitea-init-repo`
3. Manually create the repository via the API, or use `setup-gitea-repository.sh`

### Authentication Issues

1. Verify the secret exists: `kubectl get secret gitea-admin-secret -n gitea`
2. Check the stored password: `kubectl get secret gitea-admin-secret -n gitea -o jsonpath='{.data.password}' | base64 -d`
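
The two checks above can be combined into one round trip that reads the password from the secret and exercises it against the API. A sketch, assuming `kubectl` and `curl` are available and the ingress hostname resolves:

```shell
# Pull the stored password and test it against the Gitea API (expect HTTP 200)
PASS=$(kubectl get secret gitea-admin-secret -n gitea -o jsonpath='{.data.password}' | base64 -d)
curl -s -o /dev/null -w "%{http_code}\n" -u "bakery-admin:$PASS" \
  https://gitea.bakery-ia.local/api/v1/user
```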

## Upgrading

```bash
helm upgrade gitea gitea/gitea -n gitea \
  -f infrastructure/cicd/gitea/values.yaml
```

Repositories and data are preserved during upgrades (stored in PVC).
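
To double-check that data will survive the upgrade, confirm the persistent volume claim is bound before upgrading:

```shell
# The Gitea data PVC should show STATUS "Bound"
kubectl get pvc -n gitea
```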

176
infrastructure/cicd/gitea/gitea-init-job.yaml
Normal file

# Gitea Initialization Job
# This Job runs after Gitea is installed to create the initial repository.
# It uses the same admin credentials from gitea-admin-secret.
#
# Apply after Gitea is ready:
#   kubectl apply -f gitea-init-job.yaml -n gitea
#
# To re-run (if needed):
#   kubectl delete job gitea-init-repo -n gitea
#   kubectl apply -f gitea-init-job.yaml -n gitea
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: gitea-init-script
  namespace: gitea
  labels:
    app.kubernetes.io/name: gitea
    app.kubernetes.io/component: init
data:
  init-repo.sh: |
    #!/bin/sh
    set -e

    GITEA_URL="http://gitea-http.gitea.svc.cluster.local:3000"
    REPO_NAME="bakery-ia"
    MAX_RETRIES=30
    RETRY_INTERVAL=10

    echo "=== Gitea Repository Initialization ==="
    echo "Gitea URL: $GITEA_URL"
    echo "Repository: $REPO_NAME"
    echo "Admin User: $GITEA_ADMIN_USER"

    # Wait for Gitea to be ready
    echo ""
    echo "Waiting for Gitea to be ready..."
    RETRIES=0
    until curl -sf "$GITEA_URL/api/v1/version" > /dev/null 2>&1; do
      RETRIES=$((RETRIES + 1))
      if [ $RETRIES -ge $MAX_RETRIES ]; then
        echo "ERROR: Gitea did not become ready after $MAX_RETRIES attempts"
        exit 1
      fi
      echo "  Attempt $RETRIES/$MAX_RETRIES - Gitea not ready, waiting ${RETRY_INTERVAL}s..."
      sleep $RETRY_INTERVAL
    done
    echo "Gitea is ready!"

    # Check if repository already exists
    echo ""
    echo "Checking if repository '$REPO_NAME' exists..."
    HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" \
      -u "$GITEA_ADMIN_USER:$GITEA_ADMIN_PASSWORD" \
      "$GITEA_URL/api/v1/repos/$GITEA_ADMIN_USER/$REPO_NAME")

    if [ "$HTTP_CODE" = "200" ]; then
      echo "Repository '$REPO_NAME' already exists. Nothing to do."
      exit 0
    fi

    # Create the repository
    echo "Creating repository '$REPO_NAME'..."
    RESPONSE=$(curl -s -w "\n%{http_code}" \
      -u "$GITEA_ADMIN_USER:$GITEA_ADMIN_PASSWORD" \
      -X POST "$GITEA_URL/api/v1/user/repos" \
      -H "Content-Type: application/json" \
      -d '{
        "name": "'"$REPO_NAME"'",
        "description": "Main repository for Bakery IA project - Automatically created",
        "private": false,
        "auto_init": true,
        "default_branch": "main",
        "readme": "Default"
      }')

    HTTP_CODE=$(echo "$RESPONSE" | tail -1)
    BODY=$(echo "$RESPONSE" | sed '$d')

    if [ "$HTTP_CODE" = "201" ]; then
      echo "Repository '$REPO_NAME' created successfully!"
      echo ""
      echo "Repository URL: $GITEA_URL/$GITEA_ADMIN_USER/$REPO_NAME"
      echo "Clone URL: $GITEA_URL/$GITEA_ADMIN_USER/$REPO_NAME.git"
    else
      echo "ERROR: Failed to create repository (HTTP $HTTP_CODE)"
      echo "Response: $BODY"
      exit 1
    fi

    # Configure webhook for Tekton (optional - if Tekton is installed)
    echo ""
    echo "Checking if Tekton EventListener is available..."
    TEKTON_URL="http://el-bakery-ia-listener.tekton-pipelines.svc.cluster.local:8080"
    if curl -sf "$TEKTON_URL" > /dev/null 2>&1; then
      echo "Tekton EventListener found. Creating webhook..."
      WEBHOOK_RESPONSE=$(curl -s -w "\n%{http_code}" \
        -u "$GITEA_ADMIN_USER:$GITEA_ADMIN_PASSWORD" \
        -X POST "$GITEA_URL/api/v1/repos/$GITEA_ADMIN_USER/$REPO_NAME/hooks" \
        -H "Content-Type: application/json" \
        -d '{
          "type": "gitea",
          "config": {
            "url": "'"$TEKTON_URL"'",
            "content_type": "json"
          },
          "events": ["push"],
          "active": true
        }')

      WEBHOOK_CODE=$(echo "$WEBHOOK_RESPONSE" | tail -1)
      if [ "$WEBHOOK_CODE" = "201" ]; then
        echo "Webhook created successfully!"
      else
        echo "Warning: Could not create webhook (HTTP $WEBHOOK_CODE). You may need to configure it manually."
      fi
    else
      echo "Tekton EventListener not available. Skipping webhook creation."
    fi

    echo ""
    echo "=== Initialization Complete ==="
---
apiVersion: batch/v1
kind: Job
metadata:
  name: gitea-init-repo
  namespace: gitea
  labels:
    app.kubernetes.io/name: gitea
    app.kubernetes.io/component: init
  annotations:
    # Helm hook annotations (if used with Helm)
    helm.sh/hook: post-install,post-upgrade
    helm.sh/hook-weight: "10"
    helm.sh/hook-delete-policy: before-hook-creation
spec:
  ttlSecondsAfterFinished: 300
  backoffLimit: 3
  template:
    metadata:
      labels:
        app.kubernetes.io/name: gitea
        app.kubernetes.io/component: init
    spec:
      restartPolicy: OnFailure
      containers:
        - name: init-repo
          image: curlimages/curl:8.5.0
          command: ["/bin/sh", "/scripts/init-repo.sh"]
          env:
            - name: GITEA_ADMIN_USER
              valueFrom:
                secretKeyRef:
                  name: gitea-admin-secret
                  key: username
            - name: GITEA_ADMIN_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: gitea-admin-secret
                  key: password
          volumeMounts:
            - name: init-script
              mountPath: /scripts
          resources:
            limits:
              cpu: 100m
              memory: 64Mi
            requests:
              cpu: 50m
              memory: 32Mi
      volumes:
        - name: init-script
          configMap:
            name: gitea-init-script
            defaultMode: 0755
209
infrastructure/cicd/gitea/setup-admin-secret.sh
Executable file

#!/bin/bash
# Setup Gitea Admin Secret and Initialize Gitea
#
# This script:
#   1. Creates gitea-admin-secret (gitea namespace) - Used by Gitea Helm chart for admin credentials
#   2. Creates gitea-registry-secret (bakery-ia namespace) - Used by pods for imagePullSecrets
#   3. Applies the gitea-init-job.yaml to create the initial repository
#
# Usage:
#   Development:
#     ./setup-admin-secret.sh                  # Uses default dev password
#     ./setup-admin-secret.sh [password]       # Uses provided password
#     ./setup-admin-secret.sh --secrets-only   # Only create secrets, skip init job
#
#   Production:
#     export GITEA_ADMIN_PASSWORD=$(openssl rand -base64 32)
#     ./setup-admin-secret.sh --production
#     ./setup-admin-secret.sh --production --secrets-only
#
# Environment variables:
#   GITEA_ADMIN_PASSWORD - Password to use (required for --production)

set -e

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
KUBECTL="kubectl"
GITEA_NAMESPACE="gitea"
BAKERY_NAMESPACE="bakery-ia"
REGISTRY_HOST="registry.bakery-ia.local"
ADMIN_USERNAME="bakery-admin"
# Default password for dev environment only.
# For PRODUCTION: Always set the GITEA_ADMIN_PASSWORD environment variable.
# Generate a secure password with: openssl rand -base64 32
DEV_DEFAULT_PASSWORD="pvYUkGWJijqc0QfIZEXw"
SECRETS_ONLY=false
IS_PRODUCTION=false

# Check if running in microk8s
if command -v microk8s &> /dev/null; then
  KUBECTL="microk8s kubectl"
fi

# Parse arguments
for arg in "$@"; do
  case $arg in
    --secrets-only)
      SECRETS_ONLY=true
      ;;
    --production)
      IS_PRODUCTION=true
      REGISTRY_HOST="registry.bakewise.ai"
      ;;
    *)
      if [ -z "$ADMIN_PASSWORD" ] && [ "$arg" != "--secrets-only" ] && [ "$arg" != "--production" ]; then
        ADMIN_PASSWORD="$arg"
      fi
      ;;
  esac
done

# Get password from argument, environment variable, or use default (dev only)
if [ -z "$ADMIN_PASSWORD" ]; then
  if [ -n "$GITEA_ADMIN_PASSWORD" ]; then
    ADMIN_PASSWORD="$GITEA_ADMIN_PASSWORD"
    echo "Using password from GITEA_ADMIN_PASSWORD environment variable"
  elif [ "$IS_PRODUCTION" = true ]; then
    echo "ERROR: Production deployment requires GITEA_ADMIN_PASSWORD environment variable"
    echo "Generate a secure password with: openssl rand -base64 32"
    echo ""
    echo "Usage for production:"
    echo "  export GITEA_ADMIN_PASSWORD=\$(openssl rand -base64 32)"
    echo "  ./setup-admin-secret.sh --production"
    exit 1
  else
    ADMIN_PASSWORD="$DEV_DEFAULT_PASSWORD"
    echo "WARNING: Using default dev password. For production, set GITEA_ADMIN_PASSWORD"
  fi
fi

# Validate password strength for production
if [ "$IS_PRODUCTION" = true ] && [ ${#ADMIN_PASSWORD} -lt 16 ]; then
  echo "ERROR: Production password must be at least 16 characters"
  exit 1
fi

# Create namespaces if they don't exist
$KUBECTL create namespace "$GITEA_NAMESPACE" --dry-run=client -o yaml | $KUBECTL apply -f -
$KUBECTL create namespace "$BAKERY_NAMESPACE" --dry-run=client -o yaml | $KUBECTL apply -f -

# 1. Create gitea-admin-secret for Gitea Helm chart
echo "Creating gitea-admin-secret in $GITEA_NAMESPACE namespace..."
$KUBECTL create secret generic gitea-admin-secret \
  --namespace "$GITEA_NAMESPACE" \
  --from-literal=username="$ADMIN_USERNAME" \
  --from-literal=password="$ADMIN_PASSWORD" \
  --dry-run=client -o yaml | $KUBECTL apply -f -

# 2. Create gitea-registry-secret for imagePullSecrets
echo "Creating gitea-registry-secret in $BAKERY_NAMESPACE namespace..."

# Create Docker config JSON for registry authentication.
# Include both external (ingress) and internal (cluster) registry URLs.
AUTH_BASE64=$(echo -n "${ADMIN_USERNAME}:${ADMIN_PASSWORD}" | base64)
INTERNAL_REGISTRY_HOST="gitea-http.gitea.svc.cluster.local:3000"
DOCKER_CONFIG_JSON=$(cat <<EOF
{
  "auths": {
    "${REGISTRY_HOST}": {
      "username": "${ADMIN_USERNAME}",
      "password": "${ADMIN_PASSWORD}",
      "auth": "${AUTH_BASE64}"
    },
    "${INTERNAL_REGISTRY_HOST}": {
      "username": "${ADMIN_USERNAME}",
      "password": "${ADMIN_PASSWORD}",
      "auth": "${AUTH_BASE64}"
    }
  }
}
EOF
)

# Base64 encode the entire config (use -w0 on Linux, no flag needed on macOS)
if [[ "$OSTYPE" == "darwin"* ]]; then
  DOCKER_CONFIG_BASE64=$(echo -n "$DOCKER_CONFIG_JSON" | base64)
else
  DOCKER_CONFIG_BASE64=$(echo -n "$DOCKER_CONFIG_JSON" | base64 -w0)
fi

# Create the registry secret
cat <<EOF | $KUBECTL apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: gitea-registry-secret
  namespace: ${BAKERY_NAMESPACE}
  labels:
    app.kubernetes.io/name: bakery-ia
    app.kubernetes.io/component: registry
    app.kubernetes.io/managed-by: setup-admin-secret
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: ${DOCKER_CONFIG_BASE64}
EOF

echo ""
echo "=========================================="
echo "Gitea secrets created successfully!"
echo "=========================================="
echo ""
echo "Environment: $([ "$IS_PRODUCTION" = true ] && echo "PRODUCTION" || echo "Development")"
echo ""
echo "Credentials:"
echo "  Username: $ADMIN_USERNAME"
if [ "$IS_PRODUCTION" = true ]; then
  echo "  Password: (stored in secret, not displayed for security)"
else
  echo "  Password: $ADMIN_PASSWORD"
fi
echo ""
echo "Secrets created:"
echo "  1. gitea-admin-secret (namespace: $GITEA_NAMESPACE) - For Gitea Helm chart"
echo "  2. gitea-registry-secret (namespace: $BAKERY_NAMESPACE) - For imagePullSecrets"
echo ""
echo "Registry URLs:"
echo "  External: https://$REGISTRY_HOST"
echo "  Internal: $INTERNAL_REGISTRY_HOST"
echo ""

# Apply the init job ConfigMap and Job (but the Job won't complete until Gitea is installed)
if [ "$SECRETS_ONLY" = false ]; then
  INIT_JOB_FILE="$SCRIPT_DIR/gitea-init-job.yaml"
  if [ -f "$INIT_JOB_FILE" ]; then
    echo "Applying Gitea initialization resources..."
    $KUBECTL apply -f "$INIT_JOB_FILE"
    echo ""
    echo "Init job will create the 'bakery-ia' repository once Gitea is ready."
  else
    echo "Warning: gitea-init-job.yaml not found at $INIT_JOB_FILE"
  fi
  echo ""
fi

echo "Next steps:"
if [ "$IS_PRODUCTION" = true ]; then
  echo "  1. Install Gitea for production:"
  echo "     helm upgrade --install gitea gitea/gitea -n gitea \\"
  echo "       -f infrastructure/cicd/gitea/values.yaml \\"
  echo "       -f infrastructure/cicd/gitea/values-prod.yaml"
  echo ""
  echo "  2. Install Tekton CI/CD for production:"
  echo "     export TEKTON_WEBHOOK_TOKEN=\$(openssl rand -hex 32)"
  echo "     helm upgrade --install tekton-cicd infrastructure/cicd/tekton-helm \\"
  echo "       -n tekton-pipelines \\"
  echo "       -f infrastructure/cicd/tekton-helm/values.yaml \\"
  echo "       -f infrastructure/cicd/tekton-helm/values-prod.yaml \\"
  echo "       --set secrets.webhook.token=\$TEKTON_WEBHOOK_TOKEN \\"
  echo "       --set secrets.registry.password=\$GITEA_ADMIN_PASSWORD \\"
  echo "       --set secrets.git.password=\$GITEA_ADMIN_PASSWORD"
else
  echo "  1. Install Gitea (if not already installed):"
  echo "     helm install gitea gitea/gitea -n gitea -f infrastructure/cicd/gitea/values.yaml"
fi
echo ""
echo "  $([ "$IS_PRODUCTION" = true ] && echo "3" || echo "2"). Wait for Gitea to be ready:"
echo "     kubectl wait --for=condition=ready pod -n gitea -l app.kubernetes.io/name=gitea --timeout=300s"
echo ""
echo "  $([ "$IS_PRODUCTION" = true ] && echo "4" || echo "3"). Check init job status:"
echo "     kubectl logs -n gitea -l app.kubernetes.io/component=init --tail=50"
119
infrastructure/cicd/gitea/setup-gitea-repository.sh
Executable file

#!/bin/bash
# Script to set up and push code to the automatically created Gitea repository.
# Run this after Gitea is installed and the repository has been created.

set -e

echo "=== Gitea Repository Setup Script ==="
echo "This script will configure the bakery-ia repository in Gitea"
echo

# Configuration - update these values as needed
GITEA_URL="https://gitea.bakery-ia.local"
GITEA_ADMIN_USER="bakery-admin"
REPO_NAME="bakery-ia"
LOCAL_DIR="/Users/urtzialfaro/Documents/bakery-ia"

# Check if Gitea admin password is set
if [ -z "$GITEA_ADMIN_PASSWORD" ]; then
  echo "Error: GITEA_ADMIN_PASSWORD environment variable is not set"
  echo "Please set it to the admin password you used during Gitea installation"
  exit 1
fi

echo "Checking if Gitea is accessible..."
if ! curl -s -o /dev/null -w "%{http_code}" "$GITEA_URL" | grep -q "200"; then
  echo "Error: Cannot access Gitea at $GITEA_URL"
  echo "Please ensure Gitea is running and accessible"
  exit 1
fi

echo "✓ Gitea is accessible"

echo "Checking if repository $REPO_NAME exists..."
# Discard the body and capture only the status code
REPO_CHECK=$(curl -s -o /dev/null -w "%{http_code}" -u "$GITEA_ADMIN_USER:$GITEA_ADMIN_PASSWORD" \
  "$GITEA_URL/api/v1/repos/$GITEA_ADMIN_USER/$REPO_NAME")

if [ "$REPO_CHECK" != "200" ]; then
  echo "Repository $REPO_NAME does not exist or is not accessible"
  echo "Attempting to create it..."

  # Append the status code on its own line so it can be split from the body
  CREATE_RESPONSE=$(curl -s -w "\n%{http_code}" -u "$GITEA_ADMIN_USER:$GITEA_ADMIN_PASSWORD" \
    -X POST "$GITEA_URL/api/v1/user/repos" \
    -H "Content-Type: application/json" \
    -d '{
      "name": "'"$REPO_NAME"'",
      "description": "Main repository for Bakery IA project",
      "private": false,
      "auto_init": true,
      "default_branch": "main"
    }')

  HTTP_CODE=$(echo "$CREATE_RESPONSE" | tail -1)
  RESPONSE_BODY=$(echo "$CREATE_RESPONSE" | sed '$d')

  if [ "$HTTP_CODE" != "201" ]; then
    echo "Error creating repository: HTTP $HTTP_CODE"
    echo "Response: $RESPONSE_BODY"
    exit 1
  fi

  echo "✓ Repository $REPO_NAME created successfully"
else
  echo "✓ Repository $REPO_NAME already exists"
fi

echo "Configuring Git repository..."
cd "$LOCAL_DIR"

# Check if this is already a git repository
if [ ! -d ".git" ]; then
  echo "Initializing Git repository..."
  git init
  git branch -M main
else
  echo "Git repository already initialized"
fi

# Configure Git user if not already set
if [ -z "$(git config user.name)" ]; then
  git config user.name "$GITEA_ADMIN_USER"
  git config user.email "admin@bakery-ia.local"
  echo "✓ Configured Git user: $GITEA_ADMIN_USER"
fi

# Set the remote URL
GIT_REMOTE_URL="$GITEA_URL/$GITEA_ADMIN_USER/$REPO_NAME.git"

if git remote | grep -q "origin"; then
  CURRENT_REMOTE=$(git remote get-url origin)
  if [ "$CURRENT_REMOTE" != "$GIT_REMOTE_URL" ]; then
    echo "Updating remote origin to: $GIT_REMOTE_URL"
    git remote set-url origin "$GIT_REMOTE_URL"
  else
    echo "Remote origin is already set correctly"
  fi
else
  echo "Setting remote origin to: $GIT_REMOTE_URL"
  git remote add origin "$GIT_REMOTE_URL"
fi

echo "Checking if there are changes to commit..."
if [ -n "$(git status --porcelain)" ]; then
  echo "Committing changes..."
  git add .
  git commit -m "Initial commit - Bakery IA project setup"
  echo "✓ Changes committed"
else
  echo "No changes to commit"
fi

echo "Pushing to Gitea repository..."
git push --set-upstream origin main

echo "✓ Code pushed successfully to Gitea!"

echo "Repository URL: $GIT_REMOTE_URL"
echo "You can now configure your CI/CD pipelines to use this repository."

echo "=== Setup Complete ==="
84
infrastructure/cicd/gitea/test-repository-creation.sh
Executable file

#!/bin/bash
# Test script to verify that the Gitea repository was created successfully

set -e

echo "=== Gitea Repository Creation Test ==="
echo

# Configuration - update these values as needed
GITEA_URL="https://gitea.bakery-ia.local"
GITEA_ADMIN_USER="bakery-admin"
REPO_NAME="bakery-ia"

# Check if Gitea admin password is set
if [ -z "$GITEA_ADMIN_PASSWORD" ]; then
  echo "Error: GITEA_ADMIN_PASSWORD environment variable is not set"
  echo "Please set it to the admin password you used during Gitea installation"
  exit 1
fi

echo "Testing Gitea accessibility..."
if ! curl -s -o /dev/null -w "%{http_code}" "$GITEA_URL" | grep -q "200"; then
  echo "❌ Error: Cannot access Gitea at $GITEA_URL"
  echo "Please ensure Gitea is running and accessible"
  exit 1
fi

echo "✅ Gitea is accessible"

echo "Testing repository existence..."
# Discard the body and capture only the status code
REPO_CHECK=$(curl -s -o /dev/null -w "%{http_code}" -u "$GITEA_ADMIN_USER:$GITEA_ADMIN_PASSWORD" \
  "$GITEA_URL/api/v1/repos/$GITEA_ADMIN_USER/$REPO_NAME")

if [ "$REPO_CHECK" == "200" ]; then
  echo "✅ Repository '$REPO_NAME' exists"

  # Get repository details (requires jq)
  REPO_DETAILS=$(curl -s -u "$GITEA_ADMIN_USER:$GITEA_ADMIN_PASSWORD" \
    "$GITEA_URL/api/v1/repos/$GITEA_ADMIN_USER/$REPO_NAME")

  REPO_DESCRIPTION=$(echo "$REPO_DETAILS" | jq -r '.description')
  REPO_PRIVATE=$(echo "$REPO_DETAILS" | jq -r '.private')
  REPO_DEFAULT_BRANCH=$(echo "$REPO_DETAILS" | jq -r '.default_branch')

  echo "Repository Details:"
  echo "  - Name: $REPO_NAME"
  echo "  - Description: $REPO_DESCRIPTION"
  echo "  - Private: $REPO_PRIVATE"
  echo "  - Default Branch: $REPO_DEFAULT_BRANCH"
  echo "  - URL: $GITEA_URL/$GITEA_ADMIN_USER/$REPO_NAME"
  echo "  - Clone URL: $GITEA_URL/$GITEA_ADMIN_USER/$REPO_NAME.git"

  # Test if repository has issues enabled
  if echo "$REPO_DETAILS" | jq -e '.has_issues == true' > /dev/null; then
    echo "✅ Issues are enabled"
  else
    echo "❌ Issues are not enabled"
  fi

  # Test if repository has wiki enabled
  if echo "$REPO_DETAILS" | jq -e '.has_wiki == true' > /dev/null; then
    echo "✅ Wiki is enabled"
  else
    echo "❌ Wiki is not enabled"
  fi

  # Test if repository has pull requests enabled
  if echo "$REPO_DETAILS" | jq -e '.has_pull_requests == true' > /dev/null; then
    echo "✅ Pull requests are enabled"
  else
    echo "❌ Pull requests are not enabled"
  fi

  echo
  echo "✅ All tests passed! Repository is ready for use."

else
  echo "❌ Repository '$REPO_NAME' does not exist"
  echo "Expected HTTP 200, got: $REPO_CHECK"
  exit 1
fi

echo
echo "=== Test Complete ==="
65
infrastructure/cicd/gitea/values-prod.yaml
Normal file

# Gitea Helm values for the production environment
# This file overrides values.yaml for production deployment
#
# Installation:
#   helm upgrade --install gitea gitea/gitea -n gitea \
#     -f infrastructure/cicd/gitea/values.yaml \
#     -f infrastructure/cicd/gitea/values-prod.yaml

ingress:
  enabled: true
  className: nginx
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "500m"
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
    cert-manager.io/cluster-issuer: "letsencrypt-production"
  hosts:
    - host: gitea.bakewise.ai
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: gitea-tls-cert
      hosts:
        - gitea.bakewise.ai

apiIngress:
  enabled: true
  className: nginx
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "500m"
    cert-manager.io/cluster-issuer: "letsencrypt-production"
  hosts:
    - host: registry.bakewise.ai
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: registry-tls-cert
      hosts:
        - registry.bakewise.ai

gitea:
  admin:
    email: admin@bakewise.ai
  config:
    server:
      DOMAIN: gitea.bakewise.ai
      SSH_DOMAIN: gitea.bakewise.ai
      ROOT_URL: https://gitea.bakewise.ai

# Production resources - adjust based on expected load
resources:
  limits:
    cpu: 1000m
    memory: 1Gi
  requests:
    cpu: 200m
    memory: 512Mi

# Larger storage for production
persistence:
  size: 50Gi
132
infrastructure/cicd/gitea/values.yaml
Normal file

# Gitea Helm values configuration for Bakery-IA CI/CD
# This configuration sets up Gitea with registry support and appropriate storage
#
# Prerequisites:
#   1. Run setup-admin-secret.sh to create the gitea-admin-secret
#   2. Apply the post-install job: kubectl apply -f gitea-init-job.yaml
#
# Installation:
#   helm repo add gitea https://dl.gitea.io/charts
#   helm install gitea gitea/gitea -n gitea -f infrastructure/cicd/gitea/values.yaml
#
# NOTE: The namespace is determined by the -n flag during helm install, not in this file.

# Use the regular Gitea image instead of rootless to ensure registry functionality.
# Rootless images don't support the container registry due to security restrictions.
image:
  rootless: false

service:
  http:
    type: ClusterIP
    port: 3000
  ssh:
    type: ClusterIP
    port: 2222
# NOTE: Gitea's container registry is served on port 3000 (same as HTTP) under /v2/.
# The registry.PORT in the Gitea config is NOT used for external access;
# registry authentication and its API are handled by the main HTTP service.

ingress:
  enabled: true
  className: nginx
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "500m"
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
  hosts:
    - host: gitea.bakery-ia.local
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: bakery-dev-tls-cert
      hosts:
        - gitea.bakery-ia.local
        - registry.bakery-ia.local

persistence:
  enabled: true
  size: 10Gi
  # Use the default storage class (works with Kind's default provisioner).
  # For microk8s: storageClass: "microk8s-hostpath"
  # For Kind: leave empty or use "standard"
  storageClass: ""

# =============================================================================
# ADMIN USER CONFIGURATION
# =============================================================================
# The admin user is automatically created on first install.
# Credentials are read from the 'gitea-admin-secret' Kubernetes secret.
#
# Create the secret BEFORE installing Gitea:
#   ./setup-admin-secret.sh
#
# The secret must contain:
#   - username: admin username (default: bakery-admin)
#   - password: admin password
# =============================================================================
gitea:
  admin:
    username: bakery-admin
    email: admin@bakery-ia.local
    # Use existing secret for admin credentials (created by setup-admin-secret.sh)
    existingSecret: gitea-admin-secret
    # keepUpdated ensures password changes in the secret are applied
    passwordMode: keepUpdated

  config:
    server:
      DOMAIN: gitea.bakery-ia.local
      SSH_DOMAIN: gitea.bakery-ia.local
      SSH_PORT: 2222
      # Use HTTPS for external access; TLS termination happens at the ingress
      ROOT_URL: https://gitea.bakery-ia.local
      HTTP_PORT: 3000
      # Disable built-in HTTPS since the ingress handles TLS
      PROTOCOL: http
    repository:
      ENABLE_PUSH_CREATE_USER: true
      ENABLE_PUSH_CREATE_ORG: true
      DEFAULT_BRANCH: main
    packages:
      ENABLED: true
    webhook:
      ALLOWED_HOST_LIST: "*"
      # Allow internal cluster URLs for the Tekton EventListener
      SKIP_TLS_VERIFY: true
    service:
      DISABLE_REGISTRATION: false
      REQUIRE_SIGNIN_VIEW: false

# Use embedded SQLite for simpler local development.
# For production, enable PostgreSQL.
postgresql:
  enabled: false

# Use the embedded in-memory cache for local dev
redis-cluster:
  enabled: false

# Resource configuration for local development
resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 256Mi

# Init container resources
initContainers:
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 50m
      memory: 64Mi
15
infrastructure/cicd/tekton-helm/Chart.yaml
Normal file
@@ -0,0 +1,15 @@
apiVersion: v2
name: tekton-cicd
description: Tekton CI/CD infrastructure for Bakery-IA
type: application
version: 0.1.0
appVersion: "0.57.0"
maintainers:
  - name: Bakery-IA Team
    email: team@bakery-ia.local
annotations:
  category: Infrastructure
  app.kubernetes.io/name: tekton-cicd
  app.kubernetes.io/instance: tekton-cicd
  app.kubernetes.io/version: "0.57.0"
  app.kubernetes.io/part-of: bakery-ia
145
infrastructure/cicd/tekton-helm/GITEA_SECRET_INTEGRATION.md
Normal file
@@ -0,0 +1,145 @@
# Gitea Admin Secret Integration for Tekton

This document explains how Tekton CI/CD integrates with the existing Gitea admin secret to ensure credential consistency across the system.

## Architecture Overview

```mermaid
graph TD
    A[Gitea Admin Secret] --> B[Tekton Registry Credentials]
    A --> C[Tekton Git Credentials]
    A --> D[Flux Git Credentials]
    B --> E[Kaniko Build Task]
    C --> F[GitOps Update Task]
    D --> G[Flux GitRepository]
```

## How It Works

The system uses Helm's `lookup` function to reference the existing `gitea-admin-secret` from the Gitea namespace, ensuring that:

1. **Single Source of Truth**: All CI/CD components use the same credentials as Gitea
2. **Automatic Synchronization**: When the Gitea admin password changes, all CI/CD components automatically use the new credentials
3. **Reduced Maintenance**: No need to manually update credentials in multiple places

## Secret Reference Flow

```
Gitea Namespace: gitea-admin-secret
  ├── username: bakery-admin
  └── password: [secure-password]

Tekton Namespace:
  ├── gitea-registry-credentials (dockerconfigjson)
  │     └── references gitea-admin-secret.password
  │
  ├── gitea-git-credentials (opaque)
  │     └── references gitea-admin-secret.password
  │
  └── gitea-credentials (opaque) [flux-system namespace]
        └── references gitea-admin-secret.password
```

## Deployment Requirements

### Prerequisites

1. **Gitea must be installed first**: The `gitea-admin-secret` must exist before deploying Tekton
2. **Same username**: All components use `bakery-admin` as the username
3. **Namespace access**: The Tekton service account needs read access to secrets in the Gitea namespace

### Installation Steps

1. **Install Gitea with the admin secret**:

   ```bash
   # Run the setup script to create gitea-admin-secret
   ./infrastructure/cicd/gitea/setup-admin-secret.sh your-secure-password

   # Install the Gitea Helm chart
   helm install gitea gitea/gitea -n gitea -f infrastructure/cicd/gitea/values.yaml
   ```

2. **Install Tekton with secret references**:

   ```bash
   # Install Tekton - it will automatically reference the Gitea admin secret
   helm install tekton-cicd infrastructure/cicd/tekton-helm \
     --namespace tekton-pipelines \
     --set secrets.webhook.token="your-webhook-token"
   ```

## Troubleshooting

### Common Issues

1. **Secret not found error**:
   - Ensure Gitea is installed before Tekton
   - Verify the `gitea-admin-secret` exists in the `gitea` namespace
   - Check that the Tekton service account has RBAC permissions to read Gitea secrets

2. **Authentication failures**:
   - Verify the Gitea admin password is correct
   - Ensure the username is `bakery-admin` (matching the Gitea admin)
   - Check that the password hasn't been manually changed in the Gitea UI

### Debugging Commands

```bash
# Check if gitea-admin-secret exists
kubectl get secret gitea-admin-secret -n gitea

# Verify Tekton secrets were created correctly
kubectl get secret gitea-registry-credentials -n tekton-pipelines -o yaml
kubectl get secret gitea-git-credentials -n tekton-pipelines -o yaml
kubectl get secret gitea-credentials -n flux-system -o yaml

# Check RBAC permissions
kubectl get role,rolebinding,clusterrole,clusterrolebinding -n tekton-pipelines
```

## Security Considerations

### Benefits

1. **Reduced attack surface**: Fewer secrets to manage and rotate
2. **Automatic rotation**: Changing the Gitea admin password automatically updates all CI/CD components
3. **Consistent access control**: Single point for credential management

### Best Practices

1. **Use strong passwords**: Generate secure random passwords for the Gitea admin
2. **Rotate regularly**: Change the Gitea admin password periodically
3. **Limit access**: Restrict who can read the `gitea-admin-secret`
4. **Audit logs**: Monitor access to the admin secret

## Manual Override

If you need to use different credentials for specific components, you can override the values:

```bash
helm install tekton-cicd infrastructure/cicd/tekton-helm \
  --namespace tekton-pipelines \
  --set secrets.webhook.token="your-webhook-token" \
  --set secrets.registry.password="custom-registry-password" \
  --set secrets.git.password="custom-git-password"
```

However, this is **not recommended**, as it breaks the single-source-of-truth principle.

## Helm Template Details

The integration uses Helm's `lookup` function with `b64dec` to decode the base64-encoded password:

```yaml
password: {{ .Values.secrets.git.password | default (lookup "v1" "Secret" "gitea" "gitea-admin-secret").data.password | b64dec | quote }}
```

This means:
1. Look up the `gitea-admin-secret` in the `gitea` namespace
2. Get the `password` field from the secret's `data` section
3. Base64 decode it (Kubernetes stores secret data as base64)
4. Use it as the password value
5. If `.Values.secrets.git.password` is provided, use that instead (for manual override)
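The decode-and-override precedence above can be mirrored outside Helm as a quick sanity check. The following is an illustrative Python sketch, not part of the chart; the secret values and the `resolve_password` helper are made up for demonstration:

```python
import base64

# Fake secret data, shaped the way Kubernetes stores it: base64-encoded strings.
secret_data = {
    "username": base64.b64encode(b"bakery-admin").decode(),
    "password": base64.b64encode(b"s3cret-example").decode(),
}

def resolve_password(override, data):
    """Mirror `.Values... | default (lookup ...).data.password | b64dec`."""
    if override:  # an explicit value wins, like Helm's `default` pipeline
        return override
    return base64.b64decode(data["password"]).decode()  # the b64dec step

print(resolve_password(None, secret_data))      # falls back to the looked-up secret
print(resolve_password("custom", secret_data))  # manual override path
```

Note that `lookup` only works against a live cluster at install/upgrade time; with `helm template` it returns an empty result, which is why the manual override exists.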

## Conclusion

This integration provides a robust, secure way to manage credentials across the CI/CD pipeline while maintaining consistency with Gitea's admin credentials.
83
infrastructure/cicd/tekton-helm/README.md
Normal file
@@ -0,0 +1,83 @@
# Tekton CI/CD Helm Chart

This Helm chart deploys the Tekton CI/CD infrastructure for the Bakery-IA project.

## Prerequisites

- Kubernetes 1.20+
- Tekton Pipelines installed (v0.57.0 or later)
- Helm 3.0+

## Installation

Before installing this chart, Tekton Pipelines must be installed separately:

```bash
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml
```

Then install the chart:

### Development Installation

```bash
helm install tekton-cicd infrastructure/cicd/tekton-helm \
  --namespace tekton-pipelines \
  --create-namespace
```

### Production Installation

**Important**: Never use default secrets in production. Always provide secure credentials.

```bash
# Generate a secure webhook token
export TEKTON_WEBHOOK_TOKEN=$(openssl rand -hex 32)

# Use the same password as the Gitea admin (from GITEA_ADMIN_PASSWORD)
helm upgrade --install tekton-cicd infrastructure/cicd/tekton-helm \
  -n tekton-pipelines \
  -f infrastructure/cicd/tekton-helm/values.yaml \
  -f infrastructure/cicd/tekton-helm/values-prod.yaml \
  --set secrets.webhook.token=$TEKTON_WEBHOOK_TOKEN \
  --set secrets.registry.password=$GITEA_ADMIN_PASSWORD \
  --set secrets.git.password=$GITEA_ADMIN_PASSWORD
```

## Configuration

The following table lists the configurable parameters of the tekton-cicd chart and their default values.

| Parameter | Description | Default |
|-----------|-------------|---------|
| `global.registry.url` | Container registry URL | `"gitea.bakery-ia.local:5000"` |
| `global.git.branch` | Git branch name | `"main"` |
| `global.git.userName` | Git user name | `"bakery-ia-ci"` |
| `global.git.userEmail` | Git user email | `"ci@bakery-ia.local"` |
| `pipeline.build.cacheTTL` | Build cache TTL | `"24h"` |
| `pipeline.build.verbosity` | Build verbosity level | `"info"` |
| `pipeline.test.skipTests` | Skip tests flag | `"false"` |
| `pipeline.test.skipLint` | Skip lint flag | `"false"` |
| `pipeline.deployment.namespace` | Deployment namespace | `"bakery-ia"` |
| `pipeline.deployment.fluxNamespace` | Flux namespace | `"flux-system"` |
| `pipeline.workspace.size` | Workspace size | `"5Gi"` |
| `pipeline.workspace.storageClass` | Workspace storage class | `"standard"` |
| `secrets.webhook.token` | Webhook validation token | `"example-webhook-token-do-not-use-in-production"` |
| `secrets.registry.username` | Registry username | `"example-user"` |
| `secrets.registry.password` | Registry password | `"example-password"` |
| `secrets.registry.registryUrl` | Registry URL | `"gitea.bakery-ia.local:5000"` |
| `secrets.git.username` | Git username | `"example-user"` |
| `secrets.git.password` | Git password | `"example-password"` |
| `namespace` | Namespace for Tekton resources | `"tekton-pipelines"` |

## Uninstallation

To uninstall/delete the `tekton-cicd` release:

```bash
helm delete tekton-cicd --namespace tekton-pipelines
```

## Values

For a detailed list of configurable values, see the `values.yaml` file.
22
infrastructure/cicd/tekton-helm/templates/NOTES.txt
Normal file
@@ -0,0 +1,22 @@
Thank you for installing {{ .Chart.Name }}.

This chart deploys the Tekton CI/CD infrastructure for Bakery-IA.

IMPORTANT: Tekton Pipelines must be installed separately before deploying this chart.

To install Tekton Pipelines, run:
  kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml

To verify Tekton is running:
  kubectl get pods -n tekton-pipelines

After Tekton is installed, this chart will deploy:
  - ConfigMaps with pipeline configuration
  - RBAC resources for triggers and pipelines
  - Secrets for registry and Git credentials
  - Tasks, Pipelines, and Triggers for CI/CD

To check the status of deployed resources:
  kubectl get all -n {{ .Release.Namespace }}

For more information about Tekton, visit: https://tekton.dev/
80
infrastructure/cicd/tekton-helm/templates/clusterroles.yaml
Normal file
@@ -0,0 +1,80 @@
# ClusterRole for Tekton Triggers to create PipelineRuns
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tekton-triggers-role
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: triggers
rules:
  # Ability to create PipelineRuns from triggers
  - apiGroups: ["tekton.dev"]
    resources: ["pipelineruns", "taskruns"]
    verbs: ["create", "get", "list", "watch"]
  # Ability to read pipelines and tasks
  - apiGroups: ["tekton.dev"]
    resources: ["pipelines", "tasks", "clustertasks"]
    verbs: ["get", "list", "watch"]
  # Ability to manage PVCs for workspaces
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["create", "get", "list", "watch", "delete"]
  # Ability to read secrets for credentials
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list", "watch"]
  # Ability to read configmaps
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "list", "watch"]
  # Ability to manage events for logging
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "patch"]
  # Ability to list cluster-scoped trigger resources (needed for the Tekton Triggers controller)
  - apiGroups: ["triggers.tekton.dev"]
    resources: ["clustertriggerbindings", "clusterinterceptors"]
    verbs: ["get", "list", "watch"]
---
# ClusterRole for pipeline execution (needed for git operations and deployments)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tekton-pipeline-role
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: pipeline
rules:
  # Ability to read/update deployments for GitOps
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "patch", "update"]
  # Ability to read secrets for credentials
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list", "watch"]
  # Ability to read configmaps
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "list", "watch"]
  # Ability to read pods and pod logs for build operations
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
---
# Role for the EventListener to access triggers resources
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tekton-triggers-eventlistener-role
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: triggers
rules:
  - apiGroups: ["triggers.tekton.dev"]
    resources: ["eventlisteners", "triggerbindings", "triggertemplates", "triggers", "interceptors"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["configmaps", "secrets"]
    verbs: ["get", "list", "watch"]
32
infrastructure/cicd/tekton-helm/templates/configmap.yaml
Normal file
@@ -0,0 +1,32 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: pipeline-config
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: config
data:
  # Container Registry Configuration
  REGISTRY_URL: "{{ .Values.global.registry.url }}"

  # Git Configuration
  GIT_BRANCH: "{{ .Values.global.git.branch }}"
  GIT_USER_NAME: "{{ .Values.global.git.userName }}"
  GIT_USER_EMAIL: "{{ .Values.global.git.userEmail }}"

  # Build Configuration
  BUILD_CACHE_TTL: "{{ .Values.pipeline.build.cacheTTL }}"
  BUILD_VERBOSITY: "{{ .Values.pipeline.build.verbosity }}"

  # Test Configuration
  SKIP_TESTS: "{{ .Values.pipeline.test.skipTests }}"
  SKIP_LINT: "{{ .Values.pipeline.test.skipLint }}"

  # Deployment Configuration
  DEPLOY_NAMESPACE: "{{ .Values.pipeline.deployment.namespace }}"
  FLUX_NAMESPACE: "{{ .Values.pipeline.deployment.fluxNamespace }}"

  # Workspace Configuration
  WORKSPACE_SIZE: "{{ .Values.pipeline.workspace.size }}"
  WORKSPACE_STORAGE_CLASS: "{{ .Values.pipeline.workspace.storageClass }}"
@@ -0,0 +1,32 @@
# Tekton EventListener for Bakery-IA CI/CD
# This listener receives webhook events and triggers pipelines

apiVersion: triggers.tekton.dev/v1beta1
kind: EventListener
metadata:
  name: bakery-ia-event-listener
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: triggers
spec:
  serviceAccountName: {{ .Values.serviceAccounts.triggers.name }}
  triggers:
    - name: bakery-ia-gitea-trigger
      interceptors:
        - ref:
            name: "cel"
          params:
            - name: "filter"
              value: "has(body.repository) && body.ref.contains('main')"
        - ref:
            name: "bitbucket"
          params:
            - name: "secretRef"
              value:
                secretName: gitea-webhook-secret
                secretKey: secretToken
      bindings:
        - ref: bakery-ia-trigger-binding
      template:
        ref: bakery-ia-trigger-template
9
infrastructure/cicd/tekton-helm/templates/namespace.yaml
Normal file
@@ -0,0 +1,9 @@
{{- if .Values.namespace }}
apiVersion: v1
kind: Namespace
metadata:
  name: {{ .Values.namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: {{ .Values.labels.app.component }}
{{- end }}
164
infrastructure/cicd/tekton-helm/templates/pipeline-ci.yaml
Normal file
@@ -0,0 +1,164 @@
# Main CI Pipeline for Bakery-IA
# This pipeline orchestrates the build, test, and deploy process
# Includes: fetch -> detect changes -> test -> build -> update gitops
# Supports environment-configurable base images for dev/prod flexibility

apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: bakery-ia-ci
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: pipeline
spec:
  workspaces:
    - name: shared-workspace
      description: Shared workspace for source code
    - name: docker-credentials
      description: Docker registry credentials
    - name: git-credentials
      description: Git credentials for pushing GitOps updates
      optional: true
  params:
    - name: git-url
      type: string
      description: Repository URL
    - name: git-revision
      type: string
      description: Git revision/commit hash
    - name: registry
      type: string
      description: Container registry URL for pushing built images
    - name: git-branch
      type: string
      description: Target branch for GitOps updates
      default: "main"
    - name: skip-tests
      type: string
      description: Skip tests if "true"
      default: "false"
    - name: dry-run
      type: string
      description: Dry run mode - don't push changes
      default: "false"
    # Base image configuration for environment-specific builds
    - name: base-registry
      type: string
      description: "Base image registry URL (e.g., docker.io for prod, localhost:5000 for dev)"
      default: "{{ .Values.pipeline.build.baseRegistry }}"
    - name: python-image
      type: string
      description: "Python base image name and tag (e.g., python:3.11-slim for prod)"
      default: "{{ .Values.pipeline.build.pythonImage }}"

  tasks:
    # Stage 1: Fetch source code
    - name: fetch-source
      taskRef:
        name: git-clone
      workspaces:
        - name: output
          workspace: shared-workspace
      params:
        - name: url
          value: $(params.git-url)
        - name: revision
          value: $(params.git-revision)

    # Stage 2: Detect which services changed
    - name: detect-changes
      runAfter: [fetch-source]
      taskRef:
        name: detect-changed-services
      workspaces:
        - name: source
          workspace: shared-workspace

    # Stage 3: Run tests on changed services
    - name: run-tests
      runAfter: [detect-changes]
      taskRef:
        name: run-tests
      when:
        - input: "$(tasks.detect-changes.results.changed-services)"
          operator: notin
          values: ["none", "infrastructure"]
        - input: "$(params.skip-tests)"
          operator: notin
          values: ["true"]
      workspaces:
        - name: source
          workspace: shared-workspace
      params:
        - name: services
          value: $(tasks.detect-changes.results.changed-services)
        - name: skip-tests
          value: $(params.skip-tests)

    # Stage 4: Build and push container images
    - name: build-and-push
      runAfter: [run-tests]
      taskRef:
        name: kaniko-build
      when:
        - input: "$(tasks.detect-changes.results.changed-services)"
          operator: notin
          values: ["none", "infrastructure"]
      workspaces:
        - name: source
          workspace: shared-workspace
        - name: docker-credentials
          workspace: docker-credentials
      params:
        - name: services
          value: $(tasks.detect-changes.results.changed-services)
        - name: registry
          value: $(params.registry)
        - name: git-revision
          value: $(params.git-revision)
        # Environment-configurable base images
        - name: base-registry
          value: $(params.base-registry)
        - name: python-image
          value: $(params.python-image)

    # Stage 5: Update GitOps manifests
    - name: update-gitops-manifests
      runAfter: [build-and-push]
      taskRef:
        name: update-gitops
      when:
        - input: "$(tasks.detect-changes.results.changed-services)"
          operator: notin
          values: ["none", "infrastructure"]
        - input: "$(tasks.build-and-push.results.build-status)"
          operator: in
          values: ["success", "partial"]
      workspaces:
        - name: source
          workspace: shared-workspace
        - name: git-credentials
          workspace: git-credentials
      params:
        - name: services
          value: $(tasks.detect-changes.results.changed-services)
        - name: registry
          value: $(params.registry)
        - name: git-revision
          value: $(params.git-revision)
        - name: git-branch
          value: $(params.git-branch)
        - name: dry-run
          value: $(params.dry-run)

  # Final tasks that run regardless of pipeline success/failure
  finally:
    - name: pipeline-summary
      taskRef:
        name: pipeline-summary
      params:
        - name: changed-services
          value: $(tasks.detect-changes.results.changed-services)
        - name: git-revision
          value: $(params.git-revision)
51
infrastructure/cicd/tekton-helm/templates/rolebindings.yaml
Normal file
@@ -0,0 +1,51 @@
# ClusterRoleBinding for Tekton Triggers
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tekton-triggers-binding
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: triggers
subjects:
  - kind: ServiceAccount
    name: {{ .Values.serviceAccounts.triggers.name }}
    namespace: {{ .Release.Namespace }}
roleRef:
  kind: ClusterRole
  name: tekton-triggers-role
  apiGroup: rbac.authorization.k8s.io
---
# ClusterRoleBinding for pipeline execution
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tekton-pipeline-binding
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: pipeline
subjects:
  - kind: ServiceAccount
    name: {{ .Values.serviceAccounts.pipeline.name }}
    namespace: {{ .Release.Namespace }}
roleRef:
  kind: ClusterRole
  name: tekton-pipeline-role
  apiGroup: rbac.authorization.k8s.io
---
# RoleBinding for the EventListener
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tekton-triggers-eventlistener-binding
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: triggers
subjects:
  - kind: ServiceAccount
    name: {{ .Values.serviceAccounts.triggers.name }}
    namespace: {{ .Release.Namespace }}
roleRef:
  kind: Role
  name: tekton-triggers-eventlistener-role
  apiGroup: rbac.authorization.k8s.io
87
infrastructure/cicd/tekton-helm/templates/secrets.yaml
Normal file
@@ -0,0 +1,87 @@
# Secret for Gitea webhook validation
# Used by the EventListener to validate incoming webhooks
apiVersion: v1
kind: Secret
metadata:
  name: gitea-webhook-secret
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: triggers
  annotations:
    note: "Webhook secret for validating incoming webhooks"
type: Opaque
stringData:
  secretToken: {{ .Values.secrets.webhook.token | quote }}
---
# Secret for Gitea container registry credentials
# Used by Kaniko to push images to the Gitea registry
# References the existing gitea-admin-secret for consistency
{{- $giteaSecret := (lookup "v1" "Secret" "gitea" "gitea-admin-secret") }}
{{- $giteaPassword := "" }}
{{- if and $giteaSecret $giteaSecret.data (index $giteaSecret.data "password") }}
{{- $giteaPassword = index $giteaSecret.data "password" | b64dec }}
{{- end }}
apiVersion: v1
kind: Secret
metadata:
  name: gitea-registry-credentials
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: build
  annotations:
    note: "Registry credentials for pushing images - references gitea-admin-secret"
type: kubernetes.io/dockerconfigjson
stringData:
  {{- $registryPassword := .Values.secrets.registry.password | default $giteaPassword | default "PLACEHOLDER_PASSWORD" }}
  {{- if and .Values.secrets.registry.registryUrl .Values.secrets.registry.username }}
  .dockerconfigjson: |
    {
      "auths": {
        {{ .Values.secrets.registry.registryUrl | quote }}: {
          "username": {{ .Values.secrets.registry.username | quote }},
          "password": {{ $registryPassword | quote }}
        }
      }
    }
  {{- else }}
  .dockerconfigjson: '{"auths":{}}'
  {{- end }}
---
# Secret for Git credentials (used by the pipeline to push GitOps updates)
# References the existing gitea-admin-secret for consistency
apiVersion: v1
kind: Secret
metadata:
  name: gitea-git-credentials
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: gitops
  annotations:
    note: "Git credentials for GitOps updates - references gitea-admin-secret"
type: Opaque
stringData:
  {{- $gitPassword := .Values.secrets.git.password | default $giteaPassword | default "PLACEHOLDER_PASSWORD" }}
  username: {{ .Values.secrets.git.username | quote }}
  password: {{ $gitPassword | quote }}
---
# Secret for Flux GitRepository access
# Used by Flux to pull from the Gitea repository
# References the existing gitea-admin-secret for consistency
apiVersion: v1
kind: Secret
metadata:
  name: gitea-credentials
  namespace: {{ .Values.pipeline.deployment.fluxNamespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: flux
  annotations:
    note: "Credentials for Flux GitRepository access - references gitea-admin-secret"
type: Opaque
stringData:
  {{- $fluxPassword := .Values.secrets.git.password | default $giteaPassword | default "PLACEHOLDER_PASSWORD" }}
  username: {{ .Values.secrets.git.username | quote }}
  password: {{ $fluxPassword | quote }}
@@ -0,0 +1,19 @@
# ServiceAccount for the Tekton Triggers EventListener
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ .Values.serviceAccounts.triggers.name }}
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: triggers
---
# ServiceAccount for pipeline execution
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ .Values.serviceAccounts.pipeline.name }}
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: pipeline
@@ -0,0 +1,87 @@
# Tekton Task to Detect Changed Services
# This task analyzes git changes to determine which services need to be built

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: detect-changed-services
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: detection
spec:
  workspaces:
    - name: source
      description: Workspace containing the source code
  results:
    - name: changed-services
      description: Comma-separated list of changed services
  steps:
    - name: detect-changes
      image: alpine/git
      script: |
        #!/bin/sh
        # alpine/git ships busybox sh only, so this script avoids bash-only
        # features (arrays, [[ ]], process substitution)
        set -e

        cd $(workspaces.source.path)

        # Get the list of changed files
        CHANGED_FILES=$(git diff --name-only HEAD~1 HEAD 2>/dev/null || git diff --name-only $(git rev-parse --abbrev-ref HEAD)@{upstream} HEAD 2>/dev/null || echo "")

        if [ -z "$CHANGED_FILES" ]; then
          # No changes detected, assume all services need building
          echo "No git changes detected, building all services"
          echo "all" > $(results.changed-services.path)
          exit 0
        fi

        # Collect changed services in a space-separated list, skipping duplicates
        changed_services=""
        add_service() {
          case " $changed_services " in
            *" $1 "*) ;;
            *) changed_services="$changed_services $1" ;;
          esac
        }

        # Check for changes in services/ directory
        for service_name in $(echo "$CHANGED_FILES" | grep '^services/' | cut -d'/' -f2 | sort -u); do
          add_service "$service_name"
        done

        # Check for changes in gateway/ directory
        if echo "$CHANGED_FILES" | grep -q '^gateway/'; then
          add_service "gateway"
        fi

        # Check for changes in frontend/ directory
        if echo "$CHANGED_FILES" | grep -q '^frontend/'; then
          add_service "frontend"
        fi

        # Check for changes in shared/ directory (might affect multiple services)
        if echo "$CHANGED_FILES" | grep -q '^shared/'; then
          add_service "shared"
        fi

        # Convert the list to a comma-separated string
        CHANGED_SERVICES=$(echo $changed_services | tr ' ' ',')

        if [ -z "$CHANGED_SERVICES" ]; then
          # Changes are in infrastructure or other non-service files
          echo "infrastructure" > $(results.changed-services.path)
        else
          echo "$CHANGED_SERVICES" > $(results.changed-services.path)
        fi
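The detection logic in the task above can be exercised outside the cluster. A minimal sketch against a hand-written file list (the paths, service names, and expected result here are illustrative, not taken from the repo):

```shell
# Sketch of the change-detection logic, run on a fabricated file list
# instead of `git diff` output.
CHANGED_FILES="services/sales/app.py
services/sales/models.py
gateway/main.py
docs/README.md"

changed=""
add_service() {
  # Append $1 only if it is not already in the list
  case " $changed " in
    *" $1 "*) ;;
    *) changed="$changed $1" ;;
  esac
}

for s in $(echo "$CHANGED_FILES" | grep '^services/' | cut -d'/' -f2 | sort -u); do
  add_service "$s"
done
if echo "$CHANGED_FILES" | grep -q '^gateway/'; then add_service gateway; fi

CHANGED_SERVICES=$(echo $changed | tr ' ' ',')
echo "$CHANGED_SERVICES"   # sales,gateway
```

Note that `docs/README.md` matches none of the prefixes, so it contributes nothing; in the task that case falls through to the `infrastructure` result.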
@@ -0,0 +1,95 @@
# Tekton Git Clone Task for Bakery-IA CI/CD
# This task clones the source code repository

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: git-clone
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: source
spec:
  workspaces:
    - name: output
      description: Workspace to clone the repository into
  params:
    - name: url
      type: string
      description: Repository URL to clone
    - name: revision
      type: string
      description: Git revision to checkout
      default: "main"
    - name: depth
      type: string
      description: Git clone depth (0 for full history)
      default: "1"
  results:
    - name: commit-sha
      description: The commit SHA that was checked out
    - name: commit-message
      description: The commit message
  steps:
    - name: clone
      image: alpine/git:2.43.0
      script: |
        #!/bin/sh
        set -e

        URL="$(params.url)"
        REVISION="$(params.revision)"
        DEPTH="$(params.depth)"
        OUTPUT_PATH="$(workspaces.output.path)"

        echo "============================================"
        echo "Git Clone Task"
        echo "============================================"
        echo "URL: $URL"
        echo "Revision: $REVISION"
        echo "Depth: $DEPTH"
        echo "============================================"

        # Clone shallowly by default for a faster checkout
        if [ "$DEPTH" = "0" ]; then
          echo "Cloning full repository..."
          git clone "$URL" "$OUTPUT_PATH"
        else
          echo "Cloning with depth $DEPTH..."
          git clone --depth "$DEPTH" "$URL" "$OUTPUT_PATH"
        fi

        cd "$OUTPUT_PATH"

        # Fetch the specific revision if it is not a default branch
        if [ "$REVISION" != "main" ] && [ "$REVISION" != "master" ]; then
          echo "Fetching revision: $REVISION"
          git fetch --depth 1 origin "$REVISION" 2>/dev/null || true
        fi

        # Checkout the revision
        echo "Checking out: $REVISION"
        git checkout "$REVISION" 2>/dev/null || git checkout "origin/$REVISION"

        # Get commit info
        COMMIT_SHA=$(git rev-parse HEAD)
        COMMIT_MSG=$(git log -1 --pretty=format:"%s")

        echo ""
        echo "============================================"
        echo "Clone Complete"
        echo "============================================"
        echo "Commit: $COMMIT_SHA"
        echo "Message: $COMMIT_MSG"
        echo "============================================"

        # Write results
        echo -n "$COMMIT_SHA" > $(results.commit-sha.path)
        echo -n "$COMMIT_MSG" > $(results.commit-message.path)
      resources:
        limits:
          cpu: 500m
          memory: 512Mi
        requests:
          cpu: 100m
          memory: 128Mi
103
infrastructure/cicd/tekton-helm/templates/task-kaniko-build.yaml
Normal file
@@ -0,0 +1,103 @@
# Tekton Kaniko Build Task for Bakery-IA CI/CD
# This task builds and pushes container images using Kaniko
# Supports environment-configurable base images via build-args

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: kaniko-build
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: build
spec:
  workspaces:
    - name: source
      description: Workspace containing the source code
    - name: docker-credentials
      description: Docker registry credentials
  params:
    - name: services
      type: string
      description: Comma-separated list of services to build
    - name: registry
      type: string
      description: Container registry URL for pushing built images
    - name: git-revision
      type: string
      description: Git revision to tag images with
    - name: base-registry
      type: string
      description: Base image registry URL (e.g., docker.io, ghcr.io/org)
      default: "gitea-http.gitea.svc.cluster.local:3000/bakery-admin"
    - name: python-image
      type: string
      description: Python base image name and tag
      default: "python_3.11-slim"
  results:
    - name: build-status
      description: Status of the build operation
  steps:
    - name: build-and-push
      # The plain executor image contains no shell; the -debug variant ships
      # busybox, which a script step requires
      image: gcr.io/kaniko-project/executor:v1.15.0-debug
      env:
        - name: DOCKER_CONFIG
          value: /tekton/home/.docker
      script: |
        #!/busybox/sh
        set -e

        echo "==================================================================="
        echo "Kaniko Build Configuration"
        echo "==================================================================="
        echo "Target Registry: $(params.registry)"
        echo "Base Registry: $(params.base-registry)"
        echo "Python Image: $(params.python-image)"
        echo "Git Revision: $(params.git-revision)"
        echo "==================================================================="

        # Build each service from the comma-separated services parameter
        # (busybox sh has no arrays, so iterate over a tr-split list)
        for service in $(echo "$(params.services)" | tr ',' ' '); do
          if [ -n "$service" ] && [ "$service" != "none" ]; then
            echo ""
            echo "Building service: $service"
            echo "-------------------------------------------------------------------"

            # Determine Dockerfile path (services vs gateway vs frontend)
            if [ "$service" = "gateway" ]; then
              DOCKERFILE_PATH="$(workspaces.source.path)/gateway/Dockerfile"
            elif [ "$service" = "frontend" ]; then
              DOCKERFILE_PATH="$(workspaces.source.path)/frontend/Dockerfile.kubernetes"
            else
              DOCKERFILE_PATH="$(workspaces.source.path)/services/$service/Dockerfile"
            fi

            /kaniko/executor \
              --dockerfile="$DOCKERFILE_PATH" \
              --destination="$(params.registry)/$service:$(params.git-revision)" \
              --context="$(workspaces.source.path)" \
              --build-arg="BASE_REGISTRY=$(params.base-registry)" \
              --build-arg="PYTHON_IMAGE=$(params.python-image)" \
              --cache=true \
              --cache-repo="$(params.registry)/cache"

            echo "Successfully built: $(params.registry)/$service:$(params.git-revision)"
          fi
        done

        echo ""
        echo "==================================================================="
        echo "Build completed successfully!"
        echo "==================================================================="
        echo "success" > $(results.build-status.path)
      resources:
        limits:
          cpu: 2000m
          memory: 4Gi
        requests:
          cpu: 500m
          memory: 1Gi
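Each entry in the comma-separated `services` parameter yields one `--destination` tag of the form `<registry>/<service>:<git-revision>`. A small illustration of that naming, with a made-up registry and revision:

```shell
# Reproduces the --destination tag each built service receives; the
# registry, services, and revision values here are illustrative only.
registry="registry.example.com/bakery-admin"
revision="abc1234"
tags=$(for service in $(echo "sales,gateway" | tr ',' ' '); do
  echo "${registry}/${service}:${revision}"
done)
echo "$tags"
```

This prints one fully qualified image reference per service, e.g. `registry.example.com/bakery-admin/sales:abc1234`.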
@@ -0,0 +1,33 @@
# Tekton Task for Pipeline Summary
# This task generates a summary of the pipeline execution

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: pipeline-summary
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: summary
spec:
  params:
    - name: changed-services
      type: string
      description: Services that were changed
    - name: git-revision
      type: string
      description: Git revision being processed
  steps:
    - name: generate-summary
      image: alpine
      script: |
        #!/bin/sh
        set -e

        echo "=== Bakery-IA CI Pipeline Summary ==="
        echo "Git Revision: $(params.git-revision)"
        echo "Changed Services: $(params.changed-services)"
        echo "Pipeline completed successfully"

        # Log summary to stdout for visibility
        echo "Summary generated"
@@ -0,0 +1,86 @@
# Tekton Run Tests Task for Bakery-IA CI/CD
# This task runs tests on the source code

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: run-tests
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: test
spec:
  workspaces:
    - name: source
      description: Workspace containing the source code
  params:
    - name: services
      type: string
      description: Comma-separated list of services to test
    - name: skip-tests
      type: string
      description: Skip tests if "true"
      default: "false"
  steps:
    - name: run-unit-tests
      image: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/python_3.11-slim:latest
      workingDir: $(workspaces.source.path)
      script: |
        #!/bin/bash
        set -e

        echo "============================================"
        echo "Running Unit Tests"
        echo "Services: $(params.services)"
        echo "Skip tests: $(params.skip-tests)"
        echo "============================================"

        if [ "$(params.skip-tests)" = "true" ]; then
          echo "Skipping tests as requested"
          exit 0
        fi

        # Install dependencies if a requirements file exists
        if [ -f "requirements.txt" ]; then
          pip install --no-cache-dir -r requirements.txt
        fi

        # Run unit tests
        python -m pytest tests/unit/ -v

        echo "Unit tests completed successfully"
      resources:
        limits:
          cpu: 1000m
          memory: 2Gi
        requests:
          cpu: 200m
          memory: 512Mi
    - name: run-integration-tests
      image: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/python_3.11-slim:latest
      workingDir: $(workspaces.source.path)
      script: |
        #!/bin/bash
        set -e

        echo "============================================"
        echo "Running Integration Tests"
        echo "Services: $(params.services)"
        echo "============================================"

        if [ "$(params.skip-tests)" = "true" ]; then
          echo "Skipping integration tests as requested"
          exit 0
        fi

        # Run integration tests
        python -m pytest tests/integration/ -v

        echo "Integration tests completed successfully"
      resources:
        limits:
          cpu: 1000m
          memory: 2Gi
        requests:
          cpu: 200m
          memory: 512Mi
@@ -0,0 +1,153 @@
# Tekton Update GitOps Task for Bakery-IA CI/CD
# This task updates GitOps manifests with new image tags

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: update-gitops
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: gitops
spec:
  workspaces:
    - name: source
      description: Workspace containing the source code
    - name: git-credentials
      description: Git credentials for pushing changes
  params:
    - name: services
      type: string
      description: Comma-separated list of services to update
    - name: registry
      type: string
      description: Container registry URL
    - name: git-revision
      type: string
      description: Git revision to tag images with
    - name: git-branch
      type: string
      description: Git branch to push changes to
    - name: dry-run
      type: string
      description: Dry run mode - don't push changes
      default: "false"
  steps:
    - name: update-manifests
      image: alpine/git:2.43.0
      workingDir: $(workspaces.source.path)
      env:
        - name: GIT_USERNAME
          valueFrom:
            secretKeyRef:
              name: gitea-git-credentials
              key: username
        - name: GIT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: gitea-git-credentials
              key: password
      script: |
        #!/bin/sh
        set -e

        echo "============================================"
        echo "Updating GitOps Manifests"
        echo "Services: $(params.services)"
        echo "Registry: $(params.registry)"
        echo "Revision: $(params.git-revision)"
        echo "Branch: $(params.git-branch)"
        echo "Dry run: $(params.dry-run)"
        echo "============================================"

        # Configure git
        git config --global user.email "ci@bakery-ia.local"
        git config --global user.name "bakery-ia-ci"

        # Clone the main repository (not a separate gitops repo)
        # Use internal cluster DNS which works in all environments
        REPO_URL="https://${GIT_USERNAME}:${GIT_PASSWORD}@gitea-http.gitea.svc.cluster.local:3000/bakery-admin/bakery-ia.git"
        git clone "$REPO_URL" /tmp/gitops

        cd /tmp/gitops

        # Switch to target branch
        git checkout "$(params.git-branch)" || git checkout -b "$(params.git-branch)"

        # Update image tags in Kubernetes manifests
        for service in $(echo "$(params.services)" | tr ',' '\n'); do
          service=$(echo "$service" | xargs) # Trim whitespace
          if [ -n "$service" ] && [ "$service" != "none" ] && [ "$service" != "infrastructure" ] && [ "$service" != "shared" ]; then
            echo "Updating manifest for service: $service"

            # Format service name for directories that use snake_case
            # (e.g. demo-session -> demo_session, alert-processor -> alert_processor)
            formatted_service=$(echo "$service" | sed 's/-/_/g')

            # Gateway and frontend have different directory structures
            if [ "$service" = "gateway" ]; then
              MANIFEST_PATH="infrastructure/platform/gateway/gateway-service.yaml"
              IMAGE_NAME="gateway" # gateway image name is just "gateway"
            elif [ "$service" = "frontend" ]; then
              MANIFEST_PATH="infrastructure/services/microservices/frontend/frontend-service.yaml"
              IMAGE_NAME="dashboard" # frontend service uses "dashboard" as image name
            else
              # For microservices, look in the microservices directory
              # using the kebab-case service name
              service_dir=$(echo "$service" | sed 's/_/-/g')

              # Check the possible manifest file names in order
              if [ -f "infrastructure/services/microservices/$service_dir/deployment.yaml" ]; then
                MANIFEST_PATH="infrastructure/services/microservices/$service_dir/deployment.yaml"
              elif [ -f "infrastructure/services/microservices/$service_dir/${formatted_service}-service.yaml" ]; then
                MANIFEST_PATH="infrastructure/services/microservices/$service_dir/${formatted_service}-service.yaml"
              elif [ -f "infrastructure/services/microservices/$service_dir/${service_dir}-service.yaml" ]; then
                MANIFEST_PATH="infrastructure/services/microservices/$service_dir/${service_dir}-service.yaml"
              else
                # Default to the standard naming pattern
                MANIFEST_PATH="infrastructure/services/microservices/$service_dir/${formatted_service}-service.yaml"
              fi

              # For most services, the image name follows the pattern service-name-service
              IMAGE_NAME="${service_dir}-service"
            fi

            # Update the image tag in the deployment YAML
            if [ -f "$MANIFEST_PATH" ]; then
              # Rewrite bakery/<image>:<tag> to <registry>/<image>:<git-revision>,
              # covering the image name variants that appear in the manifests
              sed -i "s|image: bakery/${IMAGE_NAME}:.*|image: $(params.registry)/${IMAGE_NAME}:$(params.git-revision)|g" "$MANIFEST_PATH"
              sed -i "s|image: bakery/${service}:.*|image: $(params.registry)/${service}:$(params.git-revision)|g" "$MANIFEST_PATH"
              sed -i "s|image: bakery/${formatted_service}:.*|image: $(params.registry)/${formatted_service}:$(params.git-revision)|g" "$MANIFEST_PATH"

              echo "Updated image in: $MANIFEST_PATH for image: bakery/${IMAGE_NAME}:* -> $(params.registry)/${IMAGE_NAME}:$(params.git-revision)"
            else
              echo "Warning: Manifest file not found: $MANIFEST_PATH"
            fi
          fi
        done

        # Commit and push changes (unless dry-run)
        if [ "$(params.dry-run)" != "true" ]; then
          git add .
          git status
          if ! git diff --cached --quiet; then
            git commit -m "Update images for services: $(params.services) [skip ci]"
            git push origin "$(params.git-branch)"
            echo "GitOps manifests updated successfully"
          else
            echo "No changes to commit"
          fi
        else
          echo "Dry run mode - changes not pushed"
          git status
          git diff
        fi
      resources:
        limits:
          cpu: 500m
          memory: 512Mi
        requests:
          cpu: 100m
          memory: 128Mi
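The image bump the task performs is a plain `sed` substitution over the manifest file. A sketch against a throwaway manifest, with an illustrative registry and revision (not values from the chart):

```shell
# Minimal reproduction of the sed-based image bump; manifest content,
# registry, and revision are made up for illustration.
MANIFEST=$(mktemp)
cat > "$MANIFEST" <<'EOF'
      containers:
        - name: sales-service
          image: bakery/sales-service:latest
EOF

REGISTRY="registry.example.com/bakery-admin"
REVISION="abc1234"
# Same pattern the task uses: match any existing tag after the colon
sed -i "s|image: bakery/sales-service:.*|image: ${REGISTRY}/sales-service:${REVISION}|g" "$MANIFEST"
cat "$MANIFEST"
```

After the substitution the manifest pins `image: registry.example.com/bakery-admin/sales-service:abc1234`, which Flux then reconciles into the cluster.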
@@ -0,0 +1,23 @@
# Tekton TriggerBinding for Bakery-IA CI/CD
# This binding extracts parameters from incoming webhook payloads

apiVersion: triggers.tekton.dev/v1beta1
kind: TriggerBinding
metadata:
  name: bakery-ia-trigger-binding
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: triggers
spec:
  params:
    - name: git-repo-url
      value: "{{"{{ .payload.repository.clone_url }}"}}"
    - name: git-revision
      value: "{{"{{ .payload.after }}"}}"
    - name: git-branch
      value: "{{"{{ .payload.ref }}" | replace "refs/heads/" "" | replace "refs/tags/" "" }}"
    - name: git-repo-name
      value: "{{"{{ .payload.repository.name }}"}}"
    - name: git-repo-full-name
      value: "{{"{{ .payload.repository.full_name }}"}}"
@@ -0,0 +1,79 @@
# Tekton TriggerTemplate for Bakery-IA CI/CD
# This template defines how PipelineRuns are created when triggers fire

apiVersion: triggers.tekton.dev/v1beta1
kind: TriggerTemplate
metadata:
  name: bakery-ia-trigger-template
  namespace: {{ .Release.Namespace }}
  labels:
    app.kubernetes.io/name: {{ .Values.labels.app.name }}
    app.kubernetes.io/component: triggers
spec:
  params:
    - name: git-repo-url
      description: The git repository URL
    - name: git-revision
      description: The git revision/commit hash
    - name: git-branch
      description: The git branch name
      default: "main"
    - name: git-repo-name
      description: The git repository name
      default: "bakery-ia"
    - name: git-repo-full-name
      description: The full repository name (org/repo)
      default: "bakery-admin/bakery-ia"
    # Registry URL - keep in sync with the pipeline-config ConfigMap
    - name: registry-url
      description: Container registry URL
      default: {{ .Values.global.registry.url | quote }}
  resourcetemplates:
    - apiVersion: tekton.dev/v1beta1
      kind: PipelineRun
      metadata:
        generateName: bakery-ia-ci-run-
        labels:
          app.kubernetes.io/name: {{ .Values.labels.app.name }}
          tekton.dev/pipeline: bakery-ia-ci
          triggers.tekton.dev/trigger: bakery-ia-gitea-trigger
        annotations:
          # Track the source commit
          bakery-ia.io/git-revision: $(tt.params.git-revision)
          bakery-ia.io/git-branch: $(tt.params.git-branch)
      spec:
        pipelineRef:
          name: bakery-ia-ci
        serviceAccountName: {{ .Values.serviceAccounts.pipeline.name }}
        workspaces:
          - name: shared-workspace
            volumeClaimTemplate:
              spec:
                accessModes: ["ReadWriteOnce"]
                resources:
                  requests:
                    storage: {{ .Values.pipeline.workspace.size }}
          - name: docker-credentials
            secret:
              secretName: gitea-registry-credentials
          - name: git-credentials
            secret:
              secretName: gitea-git-credentials
        params:
          - name: git-url
            value: $(tt.params.git-repo-url)
          - name: git-revision
            value: $(tt.params.git-revision)
          - name: git-branch
            value: $(tt.params.git-branch)
          # Use the template parameter for the registry URL
          - name: registry
            value: $(tt.params.registry-url)
          - name: skip-tests
            value: "false"
          - name: dry-run
            value: "false"
        # Timeout for the entire pipeline run
        timeouts:
          pipeline: "1h0m0s"
          tasks: "45m0s"
81
infrastructure/cicd/tekton-helm/values-prod.yaml
Normal file
@@ -0,0 +1,81 @@
# Production values for tekton-cicd Helm chart
# This file overrides values.yaml for production deployment
#
# Installation:
#   helm upgrade --install tekton-cicd infrastructure/cicd/tekton-helm \
#     -n tekton-pipelines \
#     -f infrastructure/cicd/tekton-helm/values.yaml \
#     -f infrastructure/cicd/tekton-helm/values-prod.yaml \
#     --set secrets.webhook.token=$TEKTON_WEBHOOK_TOKEN \
#     --set secrets.registry.password=$GITEA_ADMIN_PASSWORD \
#     --set secrets.git.password=$GITEA_ADMIN_PASSWORD
#
# Required environment variables:
#   TEKTON_WEBHOOK_TOKEN - Secure webhook token (generate with: openssl rand -hex 32)
#   GITEA_ADMIN_PASSWORD - Gitea admin password (must match gitea-admin-secret)

# Global settings for production
global:
  # Git configuration
  git:
    userEmail: "ci@bakewise.ai"

# Pipeline configuration for production
pipeline:
  # Build configuration
  build:
    verbosity: "warn" # Less verbose in production

  # Test configuration
  test:
    skipTests: "false"
    skipLint: "false"

  # Workspace configuration - ensure the storage class exists in the production cluster
  workspace:
    size: "10Gi"
    storageClass: "standard" # Adjust to your production storage class

# Tekton controller settings - increased resources for production
controller:
  replicas: 2
  resources:
    limits:
      cpu: 2000m
      memory: 2Gi
    requests:
      cpu: 200m
      memory: 256Mi

# Tekton webhook settings - increased resources for production
webhook:
  replicas: 2
  resources:
    limits:
      cpu: 1000m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi

# Secrets configuration
# IMPORTANT: These MUST be overridden via --set flags during deployment
# DO NOT commit actual secrets to this file
secrets:
  # Webhook secret for validating incoming webhooks
  # Override with: --set secrets.webhook.token=$TEKTON_WEBHOOK_TOKEN
  webhook:
    token: "" # MUST be set via --set flag

  # Registry credentials for pushing images
  # Override with: --set secrets.registry.password=$GITEA_ADMIN_PASSWORD
  registry:
    username: "bakery-admin"
    password: "" # MUST be set via --set flag
    registryUrl: "gitea-http.gitea.svc.cluster.local:3000"

  # Git credentials for GitOps updates
  # Override with: --set secrets.git.password=$GITEA_ADMIN_PASSWORD
  git:
    username: "bakery-admin"
    password: "" # MUST be set via --set flag
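The file's comments call for generating `TEKTON_WEBHOOK_TOKEN` with `openssl rand -hex 32`; a quick check of what that produces (assuming `openssl` is installed on the deploying machine):

```shell
# Generate the webhook token the installation comments require:
# 32 random bytes, hex-encoded, so the result is 64 hex characters.
TEKTON_WEBHOOK_TOKEN=$(openssl rand -hex 32)
echo "${#TEKTON_WEBHOOK_TOKEN}"   # 64
```

The token is then passed to `helm upgrade --install` via `--set secrets.webhook.token=$TEKTON_WEBHOOK_TOKEN` rather than being committed to this file.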
99
infrastructure/cicd/tekton-helm/values.yaml
Normal file
@@ -0,0 +1,99 @@
# Default values for tekton-cicd Helm chart
# This file contains configurable values for the CI/CD pipeline

# Global settings
global:
  # Registry configuration
  registry:
    url: "gitea-http.gitea.svc.cluster.local:3000/bakery-admin"

  # Git configuration
  git:
    branch: "main"
    userName: "bakery-ia-ci"
    userEmail: "ci@bakery-ia.local"

# Pipeline configuration
pipeline:
  # Build configuration
  build:
    cacheTTL: "24h"
    verbosity: "info"
    # Base image registry configuration
    # For dev: localhost:5000 with python_3.11-slim
    # For prod: gitea registry with python_3.11-slim
    baseRegistry: "gitea-http.gitea.svc.cluster.local:3000/bakery-admin"
    pythonImage: "python_3.11-slim"

  # Test configuration
  test:
    skipTests: "false"
    skipLint: "false"

  # Deployment configuration
  deployment:
    namespace: "bakery-ia"
    fluxNamespace: "flux-system"

  # Workspace configuration
  workspace:
    size: "5Gi"
    storageClass: "standard"

# Tekton controller settings
controller:
  replicas: 1
  resources:
    limits:
      cpu: 1000m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi

# Tekton webhook settings
webhook:
  replicas: 1
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
    requests:
      cpu: 50m
      memory: 64Mi

# Namespace for Tekton resources
# Leave empty to skip namespace creation (the namespace is created by the Tekton installation)
namespace: ""

# Secrets configuration
secrets:
  # Webhook secret for validating incoming webhooks
  webhook:
    token: "secure-webhook-token-replace-with-actual-value"

  # Registry credentials for pushing images
  # Uses the same credentials as the Gitea admin for consistency
  registry:
    username: "bakery-admin"
    password: "" # Will be populated from gitea-admin-secret
    registryUrl: "gitea-http.gitea.svc.cluster.local:3000"

  # Git credentials for GitOps updates
  # Uses the same credentials as the Gitea admin for consistency
  git:
    username: "bakery-admin"
    password: "" # Will be populated from gitea-admin-secret

# Service accounts
serviceAccounts:
  triggers:
    name: "tekton-triggers-sa"
  pipeline:
    name: "tekton-pipeline-sa"

# Labels to apply to resources
labels:
  app:
    name: "bakery-ia-cicd"
    component: "tekton"
491
infrastructure/environments/common/configs/configmap.yaml
Normal file
@@ -0,0 +1,491 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: bakery-config
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: bakery-ia
    app.kubernetes.io/component: config
data:
  # ================================================================
  # ENVIRONMENT & BUILD SETTINGS
  # ================================================================
  ENVIRONMENT: "development"
  DEBUG: "false"
  LOG_LEVEL: "INFO"

  # Observability Settings - SigNoz enabled
  # Note: Detailed OTEL configuration is in the OBSERVABILITY section below
  ENABLE_TRACING: "true"
  ENABLE_METRICS: "true"
  ENABLE_LOGS: "true"
  ENABLE_OTEL_METRICS: "true"
  ENABLE_SYSTEM_METRICS: "true"
  OTEL_LOGS_EXPORTER: "otlp"

  # Database initialization settings
  # IMPORTANT: Services NEVER run migrations - they only verify DB is ready
  # Migrations are handled by dedicated migration jobs
  # DB_FORCE_RECREATE only affects migration jobs, not services
  DB_FORCE_RECREATE: "false"
  BUILD_DATE: "2024-01-20T10:00:00Z"
  VCS_REF: "latest"
  IMAGE_TAG: "latest"
  DOMAIN: "bakewise.ai"
  AUTO_RELOAD: "false"
  PROFILING_ENABLED: "false"
  MOCK_EXTERNAL_APIS: "false"
  TESTING: "false"

  # ================================================================
  # SERVICE DISCOVERY (KUBERNETES INTERNAL)
  # ================================================================
  REDIS_HOST: "redis-service"
  REDIS_PORT: "6379"
  RABBITMQ_HOST: "rabbitmq-service"
  RABBITMQ_PORT: "5672"
  RABBITMQ_MANAGEMENT_PORT: "15672"
  RABBITMQ_VHOST: "/"

  # Database Hosts (Kubernetes Services)
  AUTH_DB_HOST: "auth-db-service"
  TENANT_DB_HOST: "tenant-db-service"
  TRAINING_DB_HOST: "training-db-service"
  FORECASTING_DB_HOST: "forecasting-db-service"
  SALES_DB_HOST: "sales-db-service"
  EXTERNAL_DB_HOST: "external-db-service"
  NOTIFICATION_DB_HOST: "notification-db-service"
  INVENTORY_DB_HOST: "inventory-db-service"
  RECIPES_DB_HOST: "recipes-db-service"
  SUPPLIERS_DB_HOST: "suppliers-db-service"
  POS_DB_HOST: "pos-db-service"
  ORDERS_DB_HOST: "orders-db-service"
  PRODUCTION_DB_HOST: "production-db-service"
  PROCUREMENT_DB_HOST: "procurement-db-service"
  ORCHESTRATOR_DB_HOST: "orchestrator-db-service"
  ALERT_PROCESSOR_DB_HOST: "alert-processor-db-service"
  AI_INSIGHTS_DB_HOST: "ai-insights-db-service"
  DISTRIBUTION_DB_HOST: "distribution-db-service"
  DEMO_SESSION_DB_HOST: "demo-session-db-service"

  # MinIO Configuration
  MINIO_ENDPOINT: "minio.bakery-ia.svc.cluster.local:9000"
  MINIO_USE_SSL: "true"
  MINIO_MODEL_BUCKET: "training-models"
  MINIO_CONSOLE_PORT: "9001"
  MINIO_API_PORT: "9000"
  MINIO_REGION: "us-east-1"
  MINIO_MODEL_LIFECYCLE_DAYS: "90"
  MINIO_CACHE_TTL_SECONDS: "3600"

  # Database Configuration
  DB_PORT: "5432"
  AUTH_DB_NAME: "auth_db"
  TENANT_DB_NAME: "tenant_db"
  TRAINING_DB_NAME: "training_db"
  FORECASTING_DB_NAME: "forecasting_db"
  SALES_DB_NAME: "sales_db"
  EXTERNAL_DB_NAME: "external_db"
  NOTIFICATION_DB_NAME: "notification_db"
  INVENTORY_DB_NAME: "inventory_db"
  RECIPES_DB_NAME: "recipes_db"
  SUPPLIERS_DB_NAME: "suppliers_db"
  POS_DB_NAME: "pos_db"
  ORDERS_DB_NAME: "orders_db"
  PRODUCTION_DB_NAME: "production_db"
  PROCUREMENT_DB_NAME: "procurement_db"
  ORCHESTRATOR_DB_NAME: "orchestrator_db"
  ALERT_PROCESSOR_DB_NAME: "alert_processor_db"
  AI_INSIGHTS_DB_NAME: "ai_insights_db"
  DISTRIBUTION_DB_NAME: "distribution_db"
  POSTGRES_INITDB_ARGS: "--encoding=UTF-8 --lc-collate=C --lc-ctype=C"

  # ================================================================
  # SERVICE URLS (KUBERNETES INTERNAL)
  # ================================================================
  GATEWAY_URL: "http://gateway-service:8000"
  AUTH_SERVICE_URL: "http://auth-service:8000"
  TENANT_SERVICE_URL: "http://tenant-service:8000"
  TRAINING_SERVICE_URL: "http://training-service:8000"
  FORECASTING_SERVICE_URL: "http://forecasting-service:8000"
  SALES_SERVICE_URL: "http://sales-service:8000"
  EXTERNAL_SERVICE_URL: "http://external-service:8000"
  NOTIFICATION_SERVICE_URL: "http://notification-service:8000"
  INVENTORY_SERVICE_URL: "http://inventory-service:8000"
  RECIPES_SERVICE_URL: "http://recipes-service:8000"
  SUPPLIERS_SERVICE_URL: "http://suppliers-service:8000"
  POS_SERVICE_URL: "http://pos-service:8000"
  ORDERS_SERVICE_URL: "http://orders-service:8000"
  PRODUCTION_SERVICE_URL: "http://production-service:8000"
  ALERT_PROCESSOR_SERVICE_URL: "http://alert-processor:8000"
  ORCHESTRATOR_SERVICE_URL: "http://orchestrator-service:8000"
  AI_INSIGHTS_SERVICE_URL: "http://ai-insights-service:8000"
  DISTRIBUTION_SERVICE_URL: "http://distribution-service:8000"

  # ================================================================
  # AUTHENTICATION & SECURITY SETTINGS
  # ================================================================
  JWT_ALGORITHM: "HS256"
  JWT_ACCESS_TOKEN_EXPIRE_MINUTES: "240"
  JWT_REFRESH_TOKEN_EXPIRE_DAYS: "7"
  ENABLE_SERVICE_AUTH: "false"
  PASSWORD_MIN_LENGTH: "8"
  PASSWORD_REQUIRE_UPPERCASE: "true"
  PASSWORD_REQUIRE_LOWERCASE: "true"
  PASSWORD_REQUIRE_NUMBERS: "true"
  PASSWORD_REQUIRE_SYMBOLS: "false"
  BCRYPT_ROUNDS: "12"
  MAX_LOGIN_ATTEMPTS: "5"
  LOCKOUT_DURATION_MINUTES: "30"

  # ================================================================
|
||||
# CORS & API CONFIGURATION
|
||||
# ================================================================
|
||||
CORS_ORIGINS: "https://bakery.yourdomain.com,http://frontend-service:3000"
|
||||
CORS_ALLOW_CREDENTIALS: "true"
|
||||
RATE_LIMIT_ENABLED: "true"
|
||||
RATE_LIMIT_REQUESTS: "100"
|
||||
RATE_LIMIT_WINDOW: "60"
|
||||
RATE_LIMIT_BURST: "10"
|
||||
API_DOCS_ENABLED: "true"
|
||||
|
||||
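The rate-limit settings above describe 100 requests per 60-second window. One common way to enforce such a limit is a sliding-window counter; the sketch below is purely illustrative (the class name is invented, and `RATE_LIMIT_BURST` is not modeled), not the gateway's actual implementation.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any trailing `window_seconds` span."""

    def __init__(self, limit=100, window_seconds=60):
        self.limit = limit
        self.window = window_seconds
        self.hits = deque()  # timestamps of accepted requests

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self.hits and now - self.hits[0] >= self.window:
            self.hits.popleft()
        if len(self.hits) < self.limit:
            self.hits.append(now)
            return True
        return False
```

With `limit=100, window_seconds=60` this matches `RATE_LIMIT_REQUESTS`/`RATE_LIMIT_WINDOW`; a production gateway would typically add the burst allowance and shared (e.g. Redis-backed) state.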
  # ================================================================
  # HTTP CLIENT SETTINGS
  # ================================================================
  HTTP_TIMEOUT: "30000"
  HTTP_RETRIES: "3"
  HTTP_RETRY_DELAY: "1.0"

  # ================================================================
  # EXTERNAL API CONFIGURATION
  # ================================================================
  AEMET_BASE_URL: "https://opendata.aemet.es/opendata"
  AEMET_TIMEOUT: "90"
  AEMET_RETRY_ATTEMPTS: "5"
  MADRID_OPENDATA_BASE_URL: "https://datos.madrid.es"
  MADRID_OPENDATA_TIMEOUT: "30"

  # ================================================================
  # PAYMENT CONFIGURATION
  # ================================================================
  STRIPE_PUBLISHABLE_KEY: "pk_live_your_stripe_publishable_key_here"
  SQUARE_APPLICATION_ID: "your-square-application-id"
  SQUARE_ENVIRONMENT: "production"
  TOAST_ENVIRONMENT: "production"
  LIGHTSPEED_ENVIRONMENT: "production"

  # ================================================================
  # EMAIL CONFIGURATION
  # ================================================================
  SMTP_HOST: "mailu-postfix.bakery-ia.svc.cluster.local"
  SMTP_PORT: "587"
  SMTP_TLS: "true"
  SMTP_SSL: "false"
  DEFAULT_FROM_EMAIL: "noreply@bakewise.ai"
  DEFAULT_FROM_NAME: "Bakery-Forecast"
  EMAIL_FROM_ADDRESS: "alerts@bakewise.ai"
  EMAIL_FROM_NAME: "Bakery Alert System"

  # ================================================================
  # WHATSAPP CONFIGURATION
  # ================================================================
  WHATSAPP_BASE_URL: "https://api.twilio.com"
  WHATSAPP_FROM_NUMBER: "whatsapp:+14155238886"

  # ================================================================
  # ALERT SYSTEM CONFIGURATION
  # ================================================================
  ALERT_PROCESSOR_INSTANCES: "2"
  ALERT_PROCESSOR_MAX_MEMORY: "512M"
  ALERT_BATCH_SIZE: "10"
  ALERT_PROCESSING_TIMEOUT: "30"
  EMAIL_ENABLED: "true"
  WHATSAPP_ENABLED: "true"
  SSE_ENABLED: "true"
  PUSH_NOTIFICATIONS_ENABLED: "false"
  ALERT_DEDUPLICATION_WINDOW_MINUTES: "15"
  RECOMMENDATION_DEDUPLICATION_WINDOW_MINUTES: "60"

  # Alert Enrichment Configuration (Unified Alert Service)
  # Priority scoring weights (must sum to 1.0)
  BUSINESS_IMPACT_WEIGHT: "0.4"
  URGENCY_WEIGHT: "0.3"
  USER_AGENCY_WEIGHT: "0.2"
  CONFIDENCE_WEIGHT: "0.1"

  # Priority thresholds (0-100 scale)
  CRITICAL_THRESHOLD: "90"
  IMPORTANT_THRESHOLD: "70"
  STANDARD_THRESHOLD: "50"

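The four weights above combine per-alert factor scores into one 0-100 priority, which the three thresholds then bucket. A minimal sketch of that arithmetic, assuming each factor is already scored on a 0-100 scale (the dict keys and the fall-through "info" tier are illustrative assumptions, not taken from the alert service's code):

```python
# Mirrors BUSINESS_IMPACT_WEIGHT etc. above; must sum to 1.0.
WEIGHTS = {
    "business_impact": 0.4,
    "urgency": 0.3,
    "user_agency": 0.2,
    "confidence": 0.1,
}


def priority_score(factors):
    """Weighted sum of factor scores; stays on the 0-100 scale."""
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)


def priority_class(score):
    """Bucket a score using CRITICAL/IMPORTANT/STANDARD thresholds."""
    if score >= 90:
        return "critical"
    if score >= 70:
        return "important"
    if score >= 50:
        return "standard"
    return "info"  # assumed fall-through tier
```

For example, an alert scoring 95 on business impact, 90 on urgency, 80 on user agency, and 100 on confidence lands at 91, i.e. "critical".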
  # Timing intelligence
  BUSINESS_HOURS_START: "6"
  BUSINESS_HOURS_END: "22"
  PEAK_HOURS_START: "7"
  PEAK_HOURS_END: "11"
  PEAK_HOURS_EVENING_START: "17"
  PEAK_HOURS_EVENING_END: "19"

  # Alert grouping
  GROUPING_TIME_WINDOW_MINUTES: "15"
  MAX_ALERTS_PER_GROUP: "5"

  # Email digest
  DIGEST_SEND_TIME: "18:00"

  # ================================================================
  # CHECK FREQUENCIES (CRON EXPRESSIONS)
  # ================================================================
  STOCK_CHECK_FREQUENCY: "*/5"
  EXPIRY_CHECK_FREQUENCY: "*/2"
  TEMPERATURE_CHECK_FREQUENCY: "*/2"
  PRODUCTION_DELAY_CHECK_FREQUENCY: "*/5"
  CAPACITY_CHECK_FREQUENCY: "*/10"
  INVENTORY_OPTIMIZATION_FREQUENCY: "*/30"
  EFFICIENCY_RECOMMENDATIONS_FREQUENCY: "*/30"
  ENERGY_RECOMMENDATIONS_FREQUENCY: "0"
  WASTE_REDUCTION_FREQUENCY: "0"

  # ================================================================
  # MODEL STORAGE & TRAINING
  # ================================================================
  # Model storage is handled by MinIO (see MinIO Configuration section)
  MODEL_STORAGE_BACKEND: "minio"
  MODEL_BACKUP_ENABLED: "true"
  MODEL_VERSIONING_ENABLED: "true"
  MAX_TRAINING_TIME_MINUTES: "30"
  MAX_CONCURRENT_TRAINING_JOBS: "3"
  MIN_TRAINING_DATA_DAYS: "30"
  TRAINING_BATCH_SIZE: "1000"

  # ================================================================
  # OPTIMIZATION SETTINGS
  # ================================================================
  ENABLE_HYPERPARAMETER_OPTIMIZATION: "true"
  ENABLE_PRODUCT_SPECIFIC_PARAMS: "true"
  ENABLE_DYNAMIC_PARAM_SELECTION: "true"
  OPTUNA_N_TRIALS: "50"
  OPTUNA_CV_FOLDS: "3"
  OPTUNA_TIMEOUT_MINUTES: "10"
  HIGH_VOLUME_THRESHOLD: "1.0"
  INTERMITTENT_THRESHOLD: "0.6"

  # ================================================================
  # PROPHET PARAMETERS
  # ================================================================
  PROPHET_SEASONALITY_MODE: "additive"
  PROPHET_CHANGEPOINT_PRIOR_SCALE: "0.05"
  PROPHET_SEASONALITY_PRIOR_SCALE: "10.0"
  PROPHET_HOLIDAYS_PRIOR_SCALE: "10.0"
  PROPHET_DAILY_SEASONALITY: "true"
  PROPHET_WEEKLY_SEASONALITY: "true"
  PROPHET_YEARLY_SEASONALITY: "true"

  # ================================================================
  # BUSINESS CONFIGURATION
  # ================================================================
  SERVICE_VERSION: "1.0.0"
  TIMEZONE: "Europe/Madrid"
  LOCALE: "es_ES.UTF-8"
  CURRENCY: "EUR"
  BUSINESS_HOUR_START: "7"
  BUSINESS_HOUR_END: "20"
  ENABLE_SPANISH_HOLIDAYS: "true"
  ENABLE_MADRID_HOLIDAYS: "true"
  SCHOOL_CALENDAR_ENABLED: "true"
  WEATHER_IMPACT_ENABLED: "true"

  # ================================================================
  # MONITORING & LOGGING
  # ================================================================
  LOG_FORMAT: "json"
  LOG_FILE_ENABLED: "false"
  LOG_FILE_PATH: "/app/logs"
  LOG_ROTATION_SIZE: "100MB"
  LOG_RETENTION_DAYS: "30"
  HEALTH_CHECK_TIMEOUT: "30"
  HEALTH_CHECK_INTERVAL: "30"

  # Monitoring Configuration - SigNoz
  SIGNOZ_ROOT_URL: "https://monitoring.bakery-ia.local"

  # ================================================================
  # DATA COLLECTION SETTINGS
  # ================================================================
  WEATHER_COLLECTION_INTERVAL_HOURS: "1"
  TRAFFIC_COLLECTION_INTERVAL_HOURS: "1"
  EVENTS_COLLECTION_INTERVAL_HOURS: "6"
  DATA_VALIDATION_ENABLED: "true"
  OUTLIER_DETECTION_ENABLED: "true"
  DATA_COMPLETENESS_THRESHOLD: "0.8"
  DEFAULT_LATITUDE: "40.4168"
  DEFAULT_LONGITUDE: "-3.7038"
  LOCATION_RADIUS_KM: "50.0"

  # ================================================================
  # NOTIFICATION SETTINGS
  # ================================================================
  ENABLE_EMAIL_NOTIFICATIONS: "true"
  ENABLE_WHATSAPP_NOTIFICATIONS: "true"
  ENABLE_PUSH_NOTIFICATIONS: "false"
  MAX_RETRY_ATTEMPTS: "3"
  RETRY_DELAY_SECONDS: "60"
  NOTIFICATION_BATCH_SIZE: "100"
  EMAIL_RATE_LIMIT_PER_HOUR: "1000"
  WHATSAPP_RATE_LIMIT_PER_HOUR: "100"
  DEFAULT_LANGUAGE: "es"
  DATE_FORMAT: "%d/%m/%Y"
  TIME_FORMAT: "%H:%M"
  EMAIL_TEMPLATES_PATH: "/app/templates/email"
  WHATSAPP_TEMPLATES_PATH: "/app/templates/whatsapp"
  IMMEDIATE_DELIVERY: "true"
  SCHEDULED_DELIVERY_ENABLED: "true"
  DELIVERY_TRACKING_ENABLED: "true"
  OPEN_TRACKING_ENABLED: "true"
  CLICK_TRACKING_ENABLED: "true"

  # ================================================================
  # FORECASTING SETTINGS
  # ================================================================
  MAX_FORECAST_DAYS: "30"
  MIN_HISTORICAL_DAYS: "60"
  PREDICTION_CONFIDENCE_THRESHOLD: "0.8"
  PREDICTION_CACHE_TTL_HOURS: "6"
  FORECAST_BATCH_SIZE: "100"

  # ================================================================
  # BUSINESS RULES
  # ================================================================
  WEEKEND_ADJUSTMENT_FACTOR: "0.8"
  HOLIDAY_ADJUSTMENT_FACTOR: "0.5"
  TEMPERATURE_THRESHOLD_COLD: "10.0"
  TEMPERATURE_THRESHOLD_HOT: "30.0"
  RAIN_IMPACT_FACTOR: "0.7"
  HIGH_DEMAND_THRESHOLD: "1.5"
  LOW_DEMAND_THRESHOLD: "0.5"
  STOCKOUT_RISK_THRESHOLD: "0.9"

  # ================================================================
  # CACHE SETTINGS
  # ================================================================
  REDIS_TLS_ENABLED: "true"
  REDIS_MAX_MEMORY: "512mb"
  REDIS_MAX_CONNECTIONS: "50"
  REDIS_DB: "1"
  WEATHER_CACHE_TTL_HOURS: "1"
  TRAFFIC_CACHE_TTL_HOURS: "1"

  # ================================================================
  # FRONTEND CONFIGURATION
  # ================================================================
  VITE_APP_TITLE: "PanIA Dashboard"
  VITE_APP_VERSION: "1.0.0"
  VITE_API_URL: "/api"
  VITE_ENVIRONMENT: "production"

  # Pilot Program Configuration
  VITE_PILOT_MODE_ENABLED: "true"
  VITE_PILOT_COUPON_CODE: "PILOT2025"
  VITE_PILOT_TRIAL_MONTHS: "3"
  VITE_STRIPE_PUBLISHABLE_KEY: "pk_test_51QuxKyIzCdnBmAVTGM8fvXYkItrBUILz6lHYwhAva6ZAH1HRi0e8zDRgZ4X3faN0zEABp5RHjCVBmMJL3aKXbaC200fFrSNnPl"

  # ================================================================
  # LOCATION SETTINGS (Nominatim Geocoding)
  # ================================================================
  NOMINATIM_SERVICE_URL: "http://nominatim-service:8080"
  NOMINATIM_PBF_URL: "http://download.geofabrik.de/europe/spain-latest.osm.pbf"
  NOMINATIM_MEMORY_LIMIT: "8G"
  NOMINATIM_CPU_LIMIT: "4"

  # ================================================================
  # OBSERVABILITY - SigNoz (Unified Monitoring)
  # ================================================================
  # OpenTelemetry Configuration - Direct to SigNoz OTel Collector
  #
  # ENDPOINT CONFIGURATION:
  # - OTEL_EXPORTER_OTLP_ENDPOINT: Base gRPC endpoint (host:port format, NO http:// prefix)
  #   Used by traces and metrics (gRPC) by default
  #   Format: "host:4317" (gRPC port)
  #
  # PROTOCOL USAGE:
  # - Traces: gRPC (port 4317) - High performance, low latency
  # - Metrics: gRPC (port 4317) - Efficient batch export
  # - Logs: HTTP (port 4318) - Required for OTLP log protocol
  #
  # The monitoring library automatically handles:
  # - Converting gRPC endpoint (4317) to HTTP endpoint (4318) for logs
  # - Adding proper paths (/v1/traces, /v1/metrics, /v1/logs)
  # - Protocol prefixes (http:// for HTTP, none for gRPC)
  #
  # Base OTLP endpoint (gRPC format - used by traces and metrics)
  OTEL_EXPORTER_OTLP_ENDPOINT: "signoz-otel-collector.bakery-ia.svc.cluster.local:4317"

  # Protocol configuration (gRPC is recommended for better performance)
  OTEL_EXPORTER_OTLP_PROTOCOL: "grpc"

  # Optional: Signal-specific endpoint overrides (if different from base)
  # OTEL_EXPORTER_OTLP_TRACES_ENDPOINT: "signoz-otel-collector.bakery-ia.svc.cluster.local:4317"
  # OTEL_EXPORTER_OTLP_METRICS_ENDPOINT: "signoz-otel-collector.bakery-ia.svc.cluster.local:4317"
  # OTEL_EXPORTER_OTLP_LOGS_ENDPOINT: "http://signoz-otel-collector.bakery-ia.svc.cluster.local:4318"

  # Gateway telemetry proxy configuration
  SIGNOZ_OTEL_COLLECTOR_URL: "http://signoz-otel-collector.bakery-ia.svc.cluster.local:4318"

  # Optional: Protocol overrides per signal
  # OTEL_EXPORTER_OTLP_TRACES_PROTOCOL: "grpc"
  # OTEL_EXPORTER_OTLP_METRICS_PROTOCOL: "grpc"
  # Note: Logs always use HTTP protocol regardless of this setting

  # Resource attributes (added to all telemetry signals)
  OTEL_SERVICE_NAME: "bakery-ia"
  OTEL_RESOURCE_ATTRIBUTES: "deployment.environment=development"

  # SigNoz service endpoints (for UI and API access)
  SIGNOZ_ENDPOINT: "http://signoz.bakery-ia.svc.cluster.local:8080"
  SIGNOZ_FRONTEND_URL: "https://monitoring.bakery-ia.local"

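The endpoint-conversion rule the comments describe (gRPC port 4317 rewritten to HTTP port 4318, an `http://` prefix added, and `/v1/logs` appended for the log signal) can be sketched in a few lines. This illustrates the documented rule only; it is not the monitoring library's actual code.

```python
def otlp_logs_endpoint(grpc_endpoint: str) -> str:
    """Derive the HTTP logs endpoint from the base gRPC OTLP endpoint.

    The base endpoint has "host:port" form with no scheme. Logs must use
    OTLP/HTTP, so 4317 is swapped for 4318 and the /v1/logs path is added.
    """
    host, _, port = grpc_endpoint.rpartition(":")
    http_port = "4318" if port == "4317" else port
    return f"http://{host}:{http_port}/v1/logs"
```

Applied to the `OTEL_EXPORTER_OTLP_ENDPOINT` value above, this yields the commented-out `OTEL_EXPORTER_OTLP_LOGS_ENDPOINT` plus the `/v1/logs` path.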
  # ================================================================
  # DISTRIBUTION & ROUTING OPTIMIZATION SETTINGS
  # ================================================================
  VRP_TIME_LIMIT_SECONDS: "30"
  VRP_DEFAULT_VEHICLE_CAPACITY_KG: "1000"
  VRP_AVERAGE_SPEED_KMH: "30"

  # ================================================================
  # REPLENISHMENT PLANNING SETTINGS
  # ================================================================
  REPLENISHMENT_PROJECTION_HORIZON_DAYS: "7"
  REPLENISHMENT_SERVICE_LEVEL: "0.95"
  REPLENISHMENT_BUFFER_DAYS: "1"

  # Safety Stock
  SAFETY_STOCK_SERVICE_LEVEL: "0.95"
  SAFETY_STOCK_METHOD: "statistical"

  # MOQ
  MOQ_CONSOLIDATION_WINDOW_DAYS: "7"
  MOQ_ALLOW_EARLY_ORDERING: "true"

  # Supplier Selection
  SUPPLIER_PRICE_WEIGHT: "0.40"
  SUPPLIER_LEAD_TIME_WEIGHT: "0.20"
  SUPPLIER_QUALITY_WEIGHT: "0.20"
  SUPPLIER_RELIABILITY_WEIGHT: "0.20"
  SUPPLIER_DIVERSIFICATION_THRESHOLD: "1000"
  SUPPLIER_MAX_SINGLE_PERCENTAGE: "0.70"

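The supplier-selection weights above suggest a weighted composite score per supplier. A minimal sketch under the assumption that each factor is pre-normalized to [0, 1] with higher-is-better (so price and lead time are inverted upstream); the function name and normalization convention are assumptions, not the procurement service's actual API.

```python
# Mirrors SUPPLIER_*_WEIGHT above.
SUPPLIER_WEIGHTS = {
    "price": 0.40,
    "lead_time": 0.20,
    "quality": 0.20,
    "reliability": 0.20,
}


def supplier_score(metrics):
    """Weighted composite in [0, 1]; metrics maps factor -> normalized value."""
    return sum(SUPPLIER_WEIGHTS[name] * metrics[name] for name in SUPPLIER_WEIGHTS)
```

A supplier with the best price (1.0) but middling other factors (0.5 each) scores 0.70; `SUPPLIER_MAX_SINGLE_PERCENTAGE` and the diversification threshold would then cap how much volume the winner receives.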
  # Circuit Breakers
  CIRCUIT_BREAKER_FAILURE_THRESHOLD: "5"
  CIRCUIT_BREAKER_TIMEOUT_DURATION: "60"
  CIRCUIT_BREAKER_SUCCESS_THRESHOLD: "2"

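These three values describe the classic closed / open / half-open state machine: open after 5 consecutive failures, reject calls for 60 s, then close again after 2 successes in the half-open probe phase. A self-contained sketch of those semantics (illustrative only, not the services' actual breaker implementation):

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=5, timeout_duration=60, success_threshold=2):
        self.failure_threshold = failure_threshold
        self.timeout_duration = timeout_duration  # seconds to stay open
        self.success_threshold = success_threshold
        self.state = "closed"
        self.failures = 0
        self.successes = 0
        self.opened_at = 0.0

    def allow_request(self):
        if self.state == "open":
            if time.monotonic() - self.opened_at >= self.timeout_duration:
                self.state = "half_open"  # let probe traffic through
                self.successes = 0
                return True
            return False
        return True

    def record_failure(self):
        self.failures += 1
        # Any failure while half-open, or too many while closed, re-opens.
        if self.state == "half_open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = time.monotonic()
            self.failures = 0

    def record_success(self):
        if self.state == "half_open":
            self.successes += 1
            if self.successes >= self.success_threshold:
                self.state = "closed"
        self.failures = 0
```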
  # Saga
  SAGA_TIMEOUT_SECONDS: "600"
  SAGA_ENABLE_COMPENSATION: "true"

  # ================================================================
  # EXTERNAL DATA SERVICE V2 SETTINGS
  # ================================================================
  EXTERNAL_ENABLED_CITIES: "madrid"
  EXTERNAL_RETENTION_MONTHS: "6" # Reduced from 24 to avoid memory issues during init
  EXTERNAL_CACHE_TTL_DAYS: "7"
  EXTERNAL_REDIS_URL: "rediss://redis-service:6379/0?ssl_cert_reqs=none"
@@ -0,0 +1,6 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
- configmap.yaml
- secrets.yaml
226
infrastructure/environments/common/configs/secrets.yaml
Normal file
@@ -0,0 +1,226 @@
# NOTE: gitea-registry-secret is dynamically created by:
# infrastructure/cicd/gitea/sync-registry-secret.sh
# This script is automatically run by Tiltfile after Gitea setup.
# The secret uses the same credentials as gitea-admin-secret in the gitea namespace.
# DO NOT define gitea-registry-secret here to avoid credential sync issues.
---
apiVersion: v1
kind: Secret
metadata:
  name: database-secrets
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: bakery-ia
    app.kubernetes.io/component: database
type: Opaque
data:
  # Database Users (base64 encoded from .env)
  AUTH_DB_USER: YXV0aF91c2Vy # auth_user
  TENANT_DB_USER: dGVuYW50X3VzZXI= # tenant_user
  TRAINING_DB_USER: dHJhaW5pbmdfdXNlcg== # training_user
  FORECASTING_DB_USER: Zm9yZWNhc3RpbmdfdXNlcg== # forecasting_user
  SALES_DB_USER: c2FsZXNfdXNlcg== # sales_user
  EXTERNAL_DB_USER: ZXh0ZXJuYWxfdXNlcg== # external_user
  NOTIFICATION_DB_USER: bm90aWZpY2F0aW9uX3VzZXI= # notification_user
  INVENTORY_DB_USER: aW52ZW50b3J5X3VzZXI= # inventory_user
  RECIPES_DB_USER: cmVjaXBlc191c2Vy # recipes_user
  SUPPLIERS_DB_USER: c3VwcGxpZXJzX3VzZXI= # suppliers_user
  POS_DB_USER: cG9zX3VzZXI= # pos_user
  ORDERS_DB_USER: b3JkZXJzX3VzZXI= # orders_user
  PRODUCTION_DB_USER: cHJvZHVjdGlvbl91c2Vy # production_user
  ALERT_PROCESSOR_DB_USER: YWxlcnRfcHJvY2Vzc29yX3VzZXI= # alert_processor_user
  DEMO_SESSION_DB_USER: ZGVtb19zZXNzaW9uX3VzZXI= # demo_session_user
  ORCHESTRATOR_DB_USER: b3JjaGVzdHJhdG9yX3VzZXI= # orchestrator_user
  PROCUREMENT_DB_USER: cHJvY3VyZW1lbnRfdXNlcg== # procurement_user
  AI_INSIGHTS_DB_USER: YWlfaW5zaWdodHNfdXNlcg== # ai_insights_user
  DISTRIBUTION_DB_USER: ZGlzdHJpYnV0aW9uX3VzZXI= # distribution_user

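Every value in a Kubernetes Secret's `data:` section is the standard base64 encoding of the raw string shown in the trailing comment (equivalently, `echo -n 'auth_user' | base64` on the command line). A quick round-trip check of the convention:

```python
import base64

# Encode a plaintext the way kubectl expects it in data: (no trailing newline).
encoded = base64.b64encode(b"auth_user").decode()

# Decode an existing entry back to its plaintext.
decoded = base64.b64decode("dGVuYW50X3VzZXI=").decode()

print(encoded, decoded)  # YXV0aF91c2Vy tenant_user
```

Using `stringData:` instead of `data:` would let these values be written in plaintext and have the API server do the encoding.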
  # Database Passwords (base64 encoded - URL-SAFE PRODUCTION PASSWORDS)
  AUTH_DB_PASSWORD: RThLejQ3WW1WekRsSEdzMU05d0FiSnp4Y0tuR09OQ1Q= # E8Kz47YmVzDlHGs1M9wAbJzxcKnGONCT
  TENANT_DB_PASSWORD: VW5tV0VBNlJkaWZncGdoV2N4Zkh2ME1veVVnbUY0ekg= # UnmWEA6RdifgpghWcxfHv0MoyUgmF4zH
  TRAINING_DB_PASSWORD: WnZhMzNoaVBJc2ZtV3RxUlBWV29taTRYZ2xLTlZPcHY= # Zva33hiPIsfmWtqRPVWomi4XglKNVOpv
  FORECASTING_DB_PASSWORD: QU9CN0Z1SkczVFFSWXptdFJXZHZja3JuQzdsSGtJSHQ= # AOB7FuJG3TQRYzmtRWdvckrnC7lHkIHt
  SALES_DB_PASSWORD: NlN1R1lETFRiZjdjWGJZb1RETGlGU2ZSZDBmU2FpMXA= # 6SuGYDLTbf7cXbYoTDLiFSfRd0fSai1p
  EXTERNAL_DB_PASSWORD: anlOZE1YRWVBdnhLZWxHOElqMVptRjk4c3l2R3JicTc= # jyNdMXEeAvxKelG8Ij1ZmF98syvGrbq7
  NOTIFICATION_DB_PASSWORD: NWJ0YzVZWExjUnZBaGE3dzFaNExNNnNoSmRxU21oVGQ= # 5btc5YXLcRvAha7w1Z4LM6shJdqSmhTd
  INVENTORY_DB_PASSWORD: NU5hc09uR1M1RTlXbkV0cDNDcFBvUEVpUWxGQXdlWEQ= # 5NasOnGS5E9WnEtp3CpPoPEiQlFAweXD
  RECIPES_DB_PASSWORD: QlRvc2IzMDlpc05DeHFmV25WZFhQZ0xMTUI5VmM5RXQ= # BTosb309isNCxqfWnVdXPgLLMB9Vc9Et
  SUPPLIERS_DB_PASSWORD: ZjVUQzd1ekVUblI0ZkowWWdPNFRoMDQ1QkN4Mk9CcWs= # f5TC7uzETnR4fJ0YgO4Th045BCx2OBqk
  POS_DB_PASSWORD: Q1hIdE5nTTFEYmRiR2VGYTdRWE5lTkttbVAxVWRsc08= # CXHtNgM1DbdbGeFa7QXNeNKmmP1UdlsO
  ORDERS_DB_PASSWORD: emU1aVJncVpVTm1DaHNRbjV3MGFDWFBqb3h1MXdNSDk= # ze5iRgqZUNmChsQn5w0aCXPjoxu1wMH9
  PRODUCTION_DB_PASSWORD: SVpaUjZ5dzFqUmFPM29iVUtBQWJaODNLMEdmeTNqbWI= # IZZR6yw1jRaO3obUKAAbZ83K0Gfy3jmb
  ALERT_PROCESSOR_DB_PASSWORD: WklyWjBNQnFsRHZsTXJtcndndnZ2UUwzNm5yWFFqdDU= # ZIrZ0MBqlDvlMrmrwgvvvQL36nrXQjt5
  DEMO_SESSION_DB_PASSWORD: R291ZWlkcWFSNDhJejJFMDdmT0tyd3BSeXBtMjV1cW4= # GoueidqaR48Iz2E07fOKrwpRypm25uqn
  ORCHESTRATOR_DB_PASSWORD: cndCZTdZck5GMVRCMkE3N3U5cUVVTGtWdEJlbU1xdm8= # rwBe7YrNF1TB2A77u9qEULkVtBemMqvo
  PROCUREMENT_DB_PASSWORD: dUNhRHllZm5aMXhpd21TcDRNMnQ3QzQ1bkJieGltT1g= # uCaDyefnZ1xiwmSp4M2t7C45nBbximOX
  AI_INSIGHTS_DB_PASSWORD: ZGp6M2M1T09KYkJOT28yd2VTY0l0dmlra0pyV2l5dUw= # djz3c5OOJbBNOo2weScItvikkJrWiyuL
  DISTRIBUTION_DB_PASSWORD: ZGp6M2M1T09KYkJOT28yd2VTY0l0dmlra0pyV2l5dUw= # djz3c5OOJbBNOo2weScItvikkJrWiyuL

  # Database URLs (base64 encoded - with strong passwords)
  AUTH_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vYXV0aF91c2VyOkU4S3o0N1ltVnpEbEhHczFNOXdBYkp6eGNLbkdPTkNUQGF1dGgtZGItc2VydmljZTo1NDMyL2F1dGhfZGI=
  TENANT_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vdGVuYW50X3VzZXI6VW5tV0VBNlJkaWZncGdoV2N4Zkh2ME1veVVnbUY0ekhAdGVuYW50LWRiLXNlcnZpY2U6NTQzMi90ZW5hbnRfZGI=
  TRAINING_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vdHJhaW5pbmdfdXNlcjpadmEzM2hpUElzZm1XdHFSUFZXb21pNFhnbEtOVk9wdkB0cmFpbmluZy1kYi1zZXJ2aWNlOjU0MzIvdHJhaW5pbmdfZGI=
  FORECASTING_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vZm9yZWNhc3RpbmdfdXNlcjpBT0I3RnVKRzNUUVJZem10UldkdmNrcm5DN2xIa0lIdEBmb3JlY2FzdGluZy1kYi1zZXJ2aWNlOjU0MzIvZm9yZWNhc3RpbmdfZGI=
  SALES_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vc2FsZXNfdXNlcjo2U3VHWURMVGJmN2NYYllvVERMaUZTZlJkMGZTYWkxcEBzYWxlcy1kYi1zZXJ2aWNlOjU0MzIvc2FsZXNfZGI=
  EXTERNAL_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vZXh0ZXJuYWxfdXNlcjpqeU5kTVhFZUF2eEtlbEc4SWoxWm1GOThzeXZHcmJxN0BleHRlcm5hbC1kYi1zZXJ2aWNlOjU0MzIvZXh0ZXJuYWxfZGI=
  NOTIFICATION_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vbm90aWZpY2F0aW9uX3VzZXI6NWJ0YzVZWExjUnZBaGE3dzFaNExNNnNoSmRxU21oVGRAbm90aWZpY2F0aW9uLWRiLXNlcnZpY2U6NTQzMi9ub3RpZmljYXRpb25fZGI=
  INVENTORY_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vaW52ZW50b3J5X3VzZXI6NU5hc09uR1M1RTlXbkV0cDNDcFBvUEVpUWxGQXdlWERAaW52ZW50b3J5LWRiLXNlcnZpY2U6NTQzMi9pbnZlbnRvcnlfZGI=
  RECIPES_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vcmVjaXBlc191c2VyOkJUb3NiMzA5aXNOQ3hxZlduVmRYUGdMTE1COVZjOUV0QHJlY2lwZXMtZGItc2VydmljZTo1NDMyL3JlY2lwZXNfZGI=
  SUPPLIERS_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vc3VwcGxpZXJzX3VzZXI6ZjVUQzd1ekVUblI0ZkowWWdPNFRoMDQ1QkN4Mk9CcWtAc3VwcGxpZXJzLWRiLXNlcnZpY2U6NTQzMi9zdXBwbGllcnNfZGI=
  POS_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vcG9zX3VzZXI6Q1hIdE5nTTFEYmRiR2VGYTdRWE5lTkttbVAxVWRsc09AcG9zLWRiLXNlcnZpY2U6NTQzMi9wb3NfZGI=
  ORDERS_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vb3JkZXJzX3VzZXI6emU1aVJncVpVTm1DaHNRbjV3MGFDWFBqb3h1MXdNSDlAb3JkZXJzLWRiLXNlcnZpY2U6NTQzMi9vcmRlcnNfZGI=
  PRODUCTION_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vcHJvZHVjdGlvbl91c2VyOklaWlI2eXcxalJhTzNvYlVLQUFiWjgzSzBHZnkzam1iQHByb2R1Y3Rpb24tZGItc2VydmljZTo1NDMyL3Byb2R1Y3Rpb25fZGI=
  ALERT_PROCESSOR_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vYWxlcnRfcHJvY2Vzc29yX3VzZXI6WklyWjBNQnFsRHZsTXJtcndndnZ2UUwzNm5yWFFqdDVAYWxlcnQtcHJvY2Vzc29yLWRiLXNlcnZpY2U6NTQzMi9hbGVydF9wcm9jZXNzb3JfZGI=
  DEMO_SESSION_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vZGVtb19zZXNzaW9uX3VzZXI6R291ZWlkcWFSNDhJejJFMDdmT0tyd3BSeXBtMjV1cW5AZGVtby1zZXNzaW9uLWRiLXNlcnZpY2U6NTQzMi9kZW1vX3Nlc3Npb25fZGI=
  ORCHESTRATOR_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vb3JjaGVzdHJhdG9yX3VzZXI6cndCZTdZck5GMVRCMkE3N3U5cUVVTGtWdEJlbU1xdm9Ab3JjaGVzdHJhdG9yLWRiLXNlcnZpY2U6NTQzMi9vcmNoZXN0cmF0b3JfZGI=
  PROCUREMENT_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vcHJvY3VyZW1lbnRfdXNlcjp1Q2FEeWVmbloxeGl3bVNwNE0ydDdDNDVuQmJ4aW1PWEBwcm9jdXJlbWVudC1kYi1zZXJ2aWNlOjU0MzIvcHJvY3VyZW1lbnRfZGI=
  AI_INSIGHTS_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vYWlfaW5zaWdodHNfdXNlcjpkanozYzVPT0piQk5PbzJ3ZVNjSXR2aWtrSnJXaXl1TEBhaS1pbnNpZ2h0cy1kYi1zZXJ2aWNlOjU0MzIvYWlfaW5zaWdodHNfZGI=
  DISTRIBUTION_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vZGlzdHJpYnV0aW9uX3VzZXI6ZGp6M2M1T09KYkJOT28yd2VTY0l0dmlra0pyV2l5dUxAZGlzdHJpYnV0aW9uLWRiLXNlcnZpY2U6NTQzMi9kaXN0cmlidXRpb25fZGI=

  # PostgreSQL Monitoring User (for SigNoz metrics collection)
  POSTGRES_MONITOR_USER: bW9uaXRvcmluZw== # monitoring
  POSTGRES_MONITOR_PASSWORD: bW9uaXRvcmluZ18zNjlmOWMwMDFmMjQyYjA3ZWY5ZTI4MjZlMTcxNjljYQ== # monitoring_369f9c001f242b07ef9e2826e17169ca

  # Redis URL (URL-safe password)
  REDIS_URL: cmVkaXM6Ly86SjNsa2x4cHU5QzlPTElLdkJteFVIT2h0czFnc0lvM0FAcmVkaXMtc2VydmljZTo2Mzc5LzA= # redis://:J3lklxpu9C9OLIKvBmxUHOhts1gsIo3A@redis-service:6379/0

---
apiVersion: v1
kind: Secret
metadata:
  name: redis-secrets
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: bakery-ia
    app.kubernetes.io/component: redis
type: Opaque
data:
  REDIS_PASSWORD: SjNsa2x4cHU5QzlPTElLdkJteFVIT2h0czFnc0lvM0E= # J3lklxpu9C9OLIKvBmxUHOhts1gsIo3A

---
apiVersion: v1
kind: Secret
metadata:
  name: rabbitmq-secrets
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: bakery-ia
    app.kubernetes.io/component: rabbitmq
type: Opaque
data:
  RABBITMQ_USER: YmFrZXJ5 # bakery
  RABBITMQ_PASSWORD: VzJYS2tSdUxpT25ZS2RCWVFTQXJvbjFpeWtFU1M1b2I= # W2XKkRuLiOnYKdBYQSAron1iykESS5ob
  RABBITMQ_ERLANG_COOKIE: YzU4MzQ2NzBhYjU1OTA1MTUzZTM1Yjg3ZmVhOTZkNWMxNGM4ODExZjIwM2E3YWI3NmE5MWRjMGE5MWQ4ZDBiNA== # c5834670ab55905153e35b87fea96d5c14c8811f203a7ab76a91dc0a91d8d0b4

---
apiVersion: v1
kind: Secret
metadata:
  name: jwt-secrets
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: bakery-ia
    app.kubernetes.io/component: auth
type: Opaque
data:
  JWT_SECRET_KEY: dXNNSHc5a1FDUW95cmM3d1BtTWkzYkNscjBsVFk5d3Z6Wm1jVGJBRHZMMD0= # usMHw9kQCQoyrc7wPmMi3bClr0lTY9wvzZmcTbADvL0=
  JWT_REFRESH_SECRET_KEY: b2ZPRUlUWHBEUXM0a0pGcERTVWt4bDUwSmkxWUJKUmd3T0V5bStGRWNIST0= # ofOEITXpDQs4kJFpDSUkxl50Ji1YBJRgwOEym+FEcHI=
  SERVICE_API_KEY: Y2IyNjFiOTM0ZDQ3MDI5YTY0MTE3YzBlNDExMGM5M2Y2NmJiY2Y1ZWFhMTVjODRjNDI3MjdmYWQ3OGY3MTk2Yw== # cb261b934d47029a64117c0e4110c93f66bbcf5eaa15c84c42727fad78f7196c

---
apiVersion: v1
kind: Secret
metadata:
  name: external-api-secrets
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: bakery-ia
    app.kubernetes.io/component: external-apis
type: Opaque
data:
  AEMET_API_KEY: ZXlKaGJHY2lPaUpJVXpJMU5pSjkuZXlKemRXSWlPaUoxWVd4bVlYSnZRR2R0WVdsc0xtTnZiU0lzSW1wMGFTSTZJakV3TjJObE9XVmlMVGxoTm1ZdE5EQmpZeTA1WWpoaUxUTTFOV05pWkRZNU5EazJOeUlzSW1semN5STZJa0ZGVFVWVUlpd2lhV0YwSWpveE56VTVPREkwT0RNekxDSjFjMlZ5U1dRaU9pSXhNRGRqWlRsbFlpMDVZVFptTFRRd1kyTXRPV0k0WWkwek5UVmpZbVEyT1RRNU5qY2lMQ0p5YjJ4bElqb2lJbjAuamtjX3hCc0pDc204ZmRVVnhESW1mb2x5UE5pazF4MTd6c1UxZEZKR09iWQ==
  MADRID_OPENDATA_API_KEY: eW91ci1tYWRyaWQtb3BlbmRhdGEta2V5LWhlcmU= # your-madrid-opendata-key-here

---
apiVersion: v1
kind: Secret
metadata:
  name: payment-secrets
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: bakery-ia
    app.kubernetes.io/component: payments
type: Opaque
data:
  STRIPE_SECRET_KEY: c2tfdGVzdF81MVF1eEt5SXpDZG5CbUFWVG5QYzhVWThZTW1qdUJjaTk0RzRqc2lzMVQzMFU1anV5ZmxhQkJxYThGb2xEdTBFMlNnOUZFcVNUakFxenUwa0R6eTROUUN3ejAwOGtQUFF6WGM= # sk_test_51QuxKyIzCdnBmAVTnPc8UY8YMmjuBci94G4jsis1T30U5juyflaBBqa8FolDu0E2Sg9FEqSTjAqzu0kDzy4NQCwz008kPPQzXc
  STRIPE_WEBHOOK_SECRET: d2hzZWNfOWI1NGM2ZDQ2ZjhlN2E4NWQzZWZmNmI5MWQyMzg3NGQ3N2Q5NjBlZGUyYWQzNTBkOWY3MWY5ZjBmYTlkM2VjNQ== # whsec_9b54c6d46f8e7a85d3eff6b91d23874d77d960ede2ad350d9f71f9f0fa9d3ec5

---
apiVersion: v1
kind: Secret
metadata:
  name: email-secrets
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: bakery-ia
    app.kubernetes.io/component: notifications
type: Opaque
data:
  # SMTP credentials for internal Mailu server (Helm deployment)
  # These are used by notification-service to send emails via mailu-postfix
  SMTP_USER: cG9zdG1hc3RlckBiYWtld2lzZS5haQ== # postmaster@bakewise.ai
  SMTP_PASSWORD: VzJYS2tSdUxpT25ZS2RCWVFTQXJvbjFpeWtFU1M1b2I= # W2XKkRuLiOnYKdBYQSAron1iykESS5ob
  # Dovecot admin password for IMAP management
  DOVEADM_PASSWORD: WnZhMzNoaVBJc2ZtV3RxUlBWV29taTRYZ2xLTlZPcHY= # Zva33hiPIsfmWtqRPVWomi4XglKNVOpv

---
apiVersion: v1
kind: Secret
metadata:
  name: monitoring-secrets
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: bakery-ia
    app.kubernetes.io/component: monitoring
type: Opaque
data:
  GRAFANA_ADMIN_USER: YWRtaW4= # admin
  GRAFANA_ADMIN_PASSWORD: YWRtaW4xMjM= # admin123
  GRAFANA_SECRET_KEY: Z3JhZmFuYS1zZWNyZXQta2V5LWNoYW5nZS1pbi1wcm9kdWN0aW9u # grafana-secret-key-change-in-production
  PGADMIN_EMAIL: YWRtaW5AYmFrZXJ5LmxvY2Fs # admin@bakery.local
  PGADMIN_PASSWORD: YWRtaW4xMjM= # admin123
  REDIS_COMMANDER_USER: YWRtaW4= # admin
  REDIS_COMMANDER_PASSWORD: YWRtaW4xMjM= # admin123

---
apiVersion: v1
kind: Secret
metadata:
  name: pos-integration-secrets
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: bakery-ia
    app.kubernetes.io/component: pos
type: Opaque
data:
  SQUARE_ACCESS_TOKEN: eW91ci1zcXVhcmUtYWNjZXNzLXRva2Vu # your-square-access-token
  SQUARE_WEBHOOK_SECRET: eW91ci1zcXVhcmUtd2ViaG9vay1zZWNyZXQ= # your-square-webhook-secret
  TOAST_API_KEY: eW91ci10b2FzdC1hcGkta2V5 # your-toast-api-key
  TOAST_API_SECRET: eW91ci10b2FzdC1hcGktc2VjcmV0 # your-toast-api-secret
  TOAST_WEBHOOK_SECRET: eW91ci10b2FzdC13ZWJob29rLXNlY3JldA== # your-toast-webhook-secret
  LIGHTSPEED_API_KEY: eW91ci1saWdodHNwZWVkLWFwaS1rZXk= # your-lightspeed-api-key
  LIGHTSPEED_API_SECRET: eW91ci1saWdodHNwZWVkLWFwaS1zZWNyZXQ= # your-lightspeed-api-secret
  LIGHTSPEED_WEBHOOK_SECRET: eW91ci1saWdodHNwZWVkLXdlYmhvb2stc2VjcmV0 # your-lightspeed-webhook-secret

---
apiVersion: v1
kind: Secret
metadata:
  name: whatsapp-secrets
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: bakery-ia
    app.kubernetes.io/component: notifications
type: Opaque
data:
  WHATSAPP_API_KEY: eW91ci13aGF0c2FwcC1hcGkta2V5LWhlcmU= # your-whatsapp-api-key-here
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: bakery-dev-tls-cert
  namespace: bakery-ia
spec:
  # Self-signed certificate for local development
  secretName: bakery-dev-tls-cert

  # Certificate duration
  duration: 2160h    # 90 days
  renewBefore: 360h  # 15 days

  # Subject configuration
  subject:
    organizations:
      - Bakery IA Development

  # Common name
  commonName: localhost

  # DNS names this certificate is valid for
  dnsNames:
    - localhost
    - bakery-ia.local
    - api.bakery-ia.local
    - monitoring.bakery-ia.local
    - gitea.bakery-ia.local
    - registry.bakery-ia.local
    - "*.bakery-ia.local"
    - "mail.bakery-ia.dev"
    - "*.bakery-ia.dev"

  # IP addresses (for localhost)
  ipAddresses:
    - 127.0.0.1
    - ::1

  # Use the self-signed issuer for development
  issuerRef:
    name: selfsigned-issuer
    kind: ClusterIssuer
    group: cert-manager.io

  # Private key configuration
  privateKey:
    algorithm: RSA
    encoding: PKCS1
    size: 2048

  # Usages
  usages:
    - server auth
    - client auth
    - digital signature
    - key encipherment
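The `issuerRef` above points at a `selfsigned-issuer` ClusterIssuer defined elsewhere in the repo; for reference, a minimal cert-manager self-signed ClusterIssuer looks like this (a sketch, not necessarily the repo's exact definition):

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned-issuer
spec:
  selfSigned: {}
```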
infrastructure/environments/dev/k8s-manifests/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

metadata:
  name: bakery-ia-dev

# NOTE: Do NOT set a global namespace here.
# Each resource already has its namespace explicitly defined.
# A global namespace would incorrectly transform cluster-scoped resources
# like cert-manager namespaces.

resources:
  - ../../../environments/common/configs
  # NOTE: nominatim is NOT included here - it's deployed manually via Tilt trigger 'nominatim-helm'
  # - ../../../platform/nominatim
  - ../../../platform/gateway
  - ../../../platform/cert-manager
  - ../../../platform/networking/ingress/overlays/dev
  - ../../../platform/storage
  - ../../../services/databases
  - ../../../services/microservices
  # NOTE: cicd is NOT included here - it's deployed manually via Tilt triggers.
  # Run 'tilt trigger tekton-install' followed by 'tilt trigger tekton-pipelines-deploy'.
  # - ../../../cicd
  - dev-certificate.yaml

# Dev-specific patches
patches:
  - target:
      kind: ConfigMap
      name: bakery-config
    patch: |-
      - op: replace
        path: /data/ENVIRONMENT
        value: "development"
      - op: replace
        path: /data/DEBUG
        value: "true"
  # NOTE: nominatim patches removed - nominatim is now deployed via Helm (tilt trigger nominatim-helm)

labels:
  - includeSelectors: true
    pairs:
      environment: development
      tier: local

# Dev image overrides - use the Kind registry to avoid Docker Hub rate limits.
# IMPORTANT: All image names must be lowercase (Docker requirement).
# The prepull-base-images.sh script pushes images to localhost:5000/ with format: <repo>_<tag>
# Format: localhost:5000/<package-name>_<tag>:latest
images:
  # Database images
  - name: postgres
    newName: localhost:5000/postgres_17_alpine
    newTag: latest
  - name: redis
    newName: localhost:5000/redis_7_4_alpine
    newTag: latest
  - name: rabbitmq
    newName: localhost:5000/rabbitmq_4_1_management_alpine
    newTag: latest
  # Utility images
  - name: busybox
    newName: localhost:5000/busybox_1_36
    newTag: latest
  - name: curlimages/curl
    newName: localhost:5000/curlimages_curl_latest
    newTag: latest
  - name: bitnami/kubectl
    newName: localhost:5000/bitnami_kubectl_latest
    newTag: latest

  # Alpine variants
  - name: alpine
    newName: localhost:5000/alpine_3_19
    newTag: latest
  - name: alpine/git
    newName: localhost:5000/alpine_git_2_43_0
    newTag: latest
  # CI/CD images (cached in the Kind registry for consistency)
  - name: gcr.io/kaniko-project/executor
    newName: localhost:5000/gcr_io_kaniko_project_executor_v1_23_0
    newTag: latest
  - name: gcr.io/go-containerregistry/crane
    newName: localhost:5000/gcr_io_go_containerregistry_crane_latest
    newTag: latest
  - name: registry.k8s.io/kustomize/kustomize
    newName: localhost:5000/registry_k8s_io_kustomize_kustomize_v5_3_0
    newTag: latest
  # Storage images
  - name: minio/minio
    newName: localhost:5000/minio_minio_release_2024_11_07t00_52_20z
    newTag: latest
  - name: minio/mc
    newName: localhost:5000/minio_mc_release_2024_11_17t19_35_25z
    newTag: latest
  # NOTE: nominatim image override removed - nominatim is now deployed via Helm
  # Python base image
  - name: python
    newName: localhost:5000/python_3_11_slim
    newTag: latest
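The naming convention used in the image overrides above (the upstream reference lowercased, with `/`, `:`, `.`, and `-` all collapsed to `_`) can be sketched as a shell one-liner; `prepull-base-images.sh` is assumed to do something equivalent:

```shell
# Hypothetical helper mirroring the localhost:5000/<repo>_<tag> convention
# used by prepull-base-images.sh: lowercase, then map '/', ':', '.', '-' to '_'.
mangle() {
  printf '%s' "$1" | tr 'A-Z' 'a-z' | tr ':/.-' '____'
  echo
}

mangle 'redis:7.4-alpine'
# redis_7_4_alpine
mangle 'minio/minio:RELEASE.2024-11-07T00-52-20Z'
# minio_minio_release_2024_11_07t00_52_20z
```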
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

metadata:
  name: bakery-ia-prod

# NOTE: Do NOT set a global namespace here.
# Each resource already has its namespace explicitly defined.
# A global namespace would incorrectly transform cluster-scoped resources
# like the flux-system and cert-manager namespaces.

resources:
  - ../../../environments/common/configs
  - ../../../platform/cert-manager
  - ../../../platform/networking/ingress/overlays/prod
  - ../../../platform/gateway
  - ../../../platform/storage
  - ../../../services/databases
  - ../../../services/microservices
  # NOTE: CI/CD (gitea, tekton, flux) is deployed via Helm, not kustomize
  - prod-certificate.yaml

# SigNoz is managed via a Helm deployment (see infrastructure/helm/deploy-signoz.sh).
# Monitoring is handled by SigNoz (no separate monitoring components needed).
# SigNoz paths are included in the main ingress (ingress-https.yaml).

labels:
  - includeSelectors: false
    pairs:
      environment: production
      tier: production

# Production configuration patches
patches:
  # Override ConfigMap values for production
  - target:
      kind: ConfigMap
      name: bakery-config
    patch: |-
      - op: replace
        path: /data/ENVIRONMENT
        value: "production"
      - op: replace
        path: /data/DEBUG
        value: "false"
      - op: replace
        path: /data/LOG_LEVEL
        value: "INFO"
      - op: replace
        path: /data/PROFILING_ENABLED
        value: "false"
      - op: replace
        path: /data/MOCK_EXTERNAL_APIS
        value: "false"
      - op: add
        path: /data/REQUEST_TIMEOUT
        value: "30"
      - op: add
        path: /data/MAX_CONNECTIONS
        value: "100"
      - op: replace
        path: /data/ENABLE_TRACING
        value: "true"
      - op: replace
        path: /data/ENABLE_METRICS
        value: "true"
      - op: replace
        path: /data/ENABLE_LOGS
        value: "true"
      - op: add
        path: /data/OTEL_EXPORTER_OTLP_ENDPOINT
        value: "http://signoz-otel-collector.bakery-ia.svc.cluster.local:4317"
      - op: add
        path: /data/OTEL_EXPORTER_OTLP_PROTOCOL
        value: "grpc"
      - op: add
        path: /data/OTEL_SERVICE_NAME
        value: "bakery-ia"
      - op: add
        path: /data/OTEL_RESOURCE_ATTRIBUTES
        value: "deployment.environment=production,cluster.name=bakery-ia-prod"
      - op: add
        path: /data/SIGNOZ_ENDPOINT
        value: "http://signoz.signoz.svc.cluster.local:8080"
      - op: add
        path: /data/SIGNOZ_FRONTEND_URL
        value: "https://monitoring.bakewise.ai"
      - op: add
        path: /data/SIGNOZ_ROOT_URL
        value: "https://monitoring.bakewise.ai"
      - op: add
        path: /data/RATE_LIMIT_ENABLED
        value: "true"
      - op: add
        path: /data/RATE_LIMIT_PER_MINUTE
        value: "60"
      - op: add
        path: /data/CORS_ORIGINS
        value: "https://bakewise.ai"
      - op: add
        path: /data/CORS_ALLOW_CREDENTIALS
        value: "true"
      - op: add
        path: /data/VITE_API_URL
        value: "/api"
      - op: add
        path: /data/VITE_ENVIRONMENT
        value: "production"

  # Add imagePullSecrets to all Deployments for gitea registry authentication
  - target:
      kind: Deployment
    patch: |-
      - op: add
        path: /spec/template/spec/imagePullSecrets
        value:
          - name: gitea-registry-secret

  # Add imagePullSecrets to all StatefulSets for gitea registry authentication
  - target:
      kind: StatefulSet
    patch: |-
      - op: add
        path: /spec/template/spec/imagePullSecrets
        value:
          - name: gitea-registry-secret

  # Add imagePullSecrets to all Jobs for gitea registry authentication
  - target:
      kind: Job
    patch: |-
      - op: add
        path: /spec/template/spec/imagePullSecrets
        value:
          - name: gitea-registry-secret

  # Add imagePullSecrets to all CronJobs for gitea registry authentication
  - target:
      kind: CronJob
    patch: |-
      - op: add
        path: /spec/jobTemplate/spec/template/spec/imagePullSecrets
        value:
          - name: gitea-registry-secret

  # SigNoz resource patches for production
  # SigNoz ClickHouse production configuration
  - target:
      group: apps
      version: v1
      kind: StatefulSet
      name: signoz-clickhouse
      namespace: bakery-ia
    patch: |-
      - op: replace
        path: /spec/replicas
        value: 2
      - op: replace
        path: /spec/template/spec/containers/0/resources
        value:
          requests:
            memory: "2Gi"
            cpu: "500m"
          limits:
            memory: "4Gi"
            cpu: "1000m"

  # SigNoz main service production configuration (v0.106.0+ unified service)
  - target:
      group: apps
      version: v1
      kind: StatefulSet
      name: signoz
      namespace: bakery-ia
    patch: |-
      - op: replace
        path: /spec/replicas
        value: 2
      - op: replace
        path: /spec/template/spec/containers/0/resources
        value:
          requests:
            memory: "2Gi"
            cpu: "1000m"
          limits:
            memory: "4Gi"
            cpu: "2000m"

  # SigNoz AlertManager production configuration
  - target:
      group: apps
      version: v1
      kind: Deployment
      name: signoz-alertmanager
      namespace: bakery-ia
    patch: |-
      - op: replace
        path: /spec/replicas
        value: 2
      - op: replace
        path: /spec/template/spec/containers/0/resources
        value:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"

images:
  # Application services
  - name: bakery/auth-service
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/auth-service
    newTag: latest
  - name: bakery/tenant-service
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/tenant-service
    newTag: latest
  - name: bakery/training-service
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/training-service
    newTag: latest
  - name: bakery/forecasting-service
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/forecasting-service
    newTag: latest
  - name: bakery/sales-service
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/sales-service
    newTag: latest
  - name: bakery/external-service
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/external-service
    newTag: latest
  - name: bakery/notification-service
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/notification-service
    newTag: latest
  - name: bakery/inventory-service
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/inventory-service
    newTag: latest
  - name: bakery/recipes-service
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/recipes-service
    newTag: latest
  - name: bakery/suppliers-service
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/suppliers-service
    newTag: latest
  - name: bakery/pos-service
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/pos-service
    newTag: latest
  - name: bakery/orders-service
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/orders-service
    newTag: latest
  - name: bakery/production-service
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/production-service
    newTag: latest
  - name: bakery/alert-processor
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/alert-processor
    newTag: latest
  - name: bakery/gateway
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/gateway
    newTag: latest
  - name: bakery/dashboard
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/dashboard
    newTag: latest
  # ===========================================================================
  # Database images (cached in the gitea registry for consistency)
  - name: postgres
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/postgres
    newTag: "17-alpine"
  - name: redis
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/redis
    newTag: "7.4-alpine"
  - name: rabbitmq
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/rabbitmq
    newTag: "4.1-management-alpine"
  # Utility images
  - name: busybox
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/busybox
    newTag: "1.36"
  - name: curlimages/curl
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/curlimages-curl
    newTag: latest
  - name: bitnami/kubectl
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/bitnami-kubectl
    newTag: latest

  # Alpine variants
  - name: alpine
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/alpine
    newTag: "3.19"
  - name: alpine/git
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/alpine-git
    newTag: "2.43.0"
  # CI/CD images (cached in the gitea registry for consistency)
  - name: gcr.io/kaniko-project/executor
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/gcr.io-kaniko-project-executor
    newTag: v1.23.0
  - name: gcr.io/go-containerregistry/crane
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/gcr.io-go-containerregistry-crane
    newTag: latest
  - name: registry.k8s.io/kustomize/kustomize
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/registry.k8s.io-kustomize-kustomize
    newTag: v5.3.0
  # Storage images
  - name: minio/minio
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/minio-minio
    newTag: RELEASE.2024-11-07T00-52-20Z
  - name: minio/mc
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/minio-mc
    newTag: RELEASE.2024-11-17T19-35-25Z
  # NOTE: nominatim image override removed - nominatim is now deployed via Helm
  # Python base image
  - name: python
    newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/python
    newTag: "3.11-slim"

replicas:
  - name: auth-service
    count: 3
  - name: tenant-service
    count: 2
  - name: training-service
    count: 3  # Safe with MinIO storage - no PVC conflicts
  - name: forecasting-service
    count: 3
  - name: sales-service
    count: 2
  - name: external-service
    count: 2
  - name: notification-service
    count: 3
  - name: inventory-service
    count: 2
  - name: recipes-service
    count: 2
  - name: suppliers-service
    count: 2
  - name: pos-service
    count: 2
  - name: orders-service
    count: 3
  - name: production-service
    count: 2
  - name: alert-processor
    count: 3
  - name: procurement-service
    count: 2
  - name: orchestrator-service
    count: 2
  - name: ai-insights-service
    count: 2
  - name: gateway
    count: 3
  - name: frontend
    count: 2
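For reference, after kustomize applies the Deployment patch above, every rendered Deployment carries the pull secret in its pod spec; a sketch of the resulting fragment (the container details are illustrative, not from the repo):

```yaml
spec:
  template:
    spec:
      imagePullSecrets:
        - name: gitea-registry-secret
      containers:
        - name: gateway  # illustrative
          image: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/gateway:latest
```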
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: bakery-ia-prod-tls-cert
  namespace: bakery-ia
spec:
  # Let's Encrypt certificate for production
  secretName: bakery-ia-prod-tls-cert

  # Certificate duration and renewal
  duration: 2160h    # 90 days (Let's Encrypt default)
  renewBefore: 360h  # 15 days before expiry

  # Subject configuration
  subject:
    organizations:
      - Bakery IA

  # Common name
  commonName: bakewise.ai

  # DNS names this certificate is valid for
  dnsNames:
    - bakewise.ai
    - www.bakewise.ai
    - mail.bakewise.ai
    - monitoring.bakewise.ai
    - gitea.bakewise.ai
    - registry.bakewise.ai
    - api.bakewise.ai

  # Use the Let's Encrypt production issuer
  issuerRef:
    name: letsencrypt-production
    kind: ClusterIssuer
    group: cert-manager.io

  # Private key configuration
  privateKey:
    algorithm: RSA
    encoding: PKCS1
    size: 2048

  # Usages
  usages:
    - server auth
    - client auth
    - digital signature
    - key encipherment
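The certificate references a `letsencrypt-production` ClusterIssuer that is defined elsewhere in the repo; a minimal sketch of what such an ACME issuer typically looks like (the contact email and solver ingress class are assumptions, not taken from this repo):

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@bakewise.ai  # assumed contact address
    privateKeySecretRef:
      name: letsencrypt-production-account-key
    solvers:
      - http01:
          ingress:
            class: nginx  # assumed ingress class
```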
apiVersion: v1
kind: ConfigMap
metadata:
  name: bakery-config
  namespace: bakery-ia
data:
  # Environment
  ENVIRONMENT: "production"
  DEBUG: "false"
  LOG_LEVEL: "INFO"

  # Profiling and development features (disabled in production)
  PROFILING_ENABLED: "false"
  MOCK_EXTERNAL_APIS: "false"

  # Performance and security
  REQUEST_TIMEOUT: "30"
  MAX_CONNECTIONS: "100"

  # Monitoring - SigNoz (unified observability)
  ENABLE_TRACING: "true"
  ENABLE_METRICS: "true"
  ENABLE_LOGS: "true"

  # OpenTelemetry configuration - direct to SigNoz
  # IMPORTANT: gRPC endpoints should NOT include the http:// prefix
  OTEL_EXPORTER_OTLP_ENDPOINT: "signoz-otel-collector.bakery-ia.svc.cluster.local:4317"
  OTEL_EXPORTER_OTLP_PROTOCOL: "grpc"
  OTEL_SERVICE_NAME: "bakery-ia"
  OTEL_RESOURCE_ATTRIBUTES: "deployment.environment=production,cluster.name=bakery-ia-prod"

  # SigNoz endpoints (v0.106.0+ unified service)
  SIGNOZ_ENDPOINT: "http://signoz.bakery-ia.svc.cluster.local:8080"
  SIGNOZ_FRONTEND_URL: "https://monitoring.bakewise.ai"
  SIGNOZ_ROOT_URL: "https://monitoring.bakewise.ai"

  # Rate limiting (stricter in production)
  RATE_LIMIT_ENABLED: "true"
  RATE_LIMIT_PER_MINUTE: "60"

  # CORS configuration for production
  CORS_ORIGINS: "https://bakewise.ai"
  CORS_ALLOW_CREDENTIALS: "true"

  # Frontend configuration
  VITE_API_URL: "/api"
  VITE_ENVIRONMENT: "production"
infrastructure/monitoring/signoz/README.md
# SigNoz Helm Deployment for Bakery IA

This directory contains Helm configurations and deployment scripts for the SigNoz observability platform.

## Overview

SigNoz is deployed using the official Helm chart with environment-specific configurations optimized for:
- **Development**: Colima + Kind (Kubernetes in Docker) with Tilt
- **Production**: VPS on clouding.io with MicroK8s

## Prerequisites

### Required Tools
- **kubectl** 1.22+
- **Helm** 3.8+
- **Docker** (for development)
- **Kind/MicroK8s** (environment-specific)

### Docker Hub Authentication

SigNoz uses images from Docker Hub. Set up authentication to avoid rate limits:

```bash
# Option 1: Environment variables (recommended)
export DOCKERHUB_USERNAME='your-username'
export DOCKERHUB_PASSWORD='your-personal-access-token'

# Option 2: Docker login
docker login
```

## Quick Start

### Development Deployment

```bash
# Deploy SigNoz to the development environment
./deploy-signoz.sh dev

# Verify the deployment
./verify-signoz.sh dev

# Access the SigNoz UI
# Via ingress: http://monitoring.bakery-ia.local
# Or port-forward:
kubectl port-forward -n signoz svc/signoz 8080:8080
# Then open: http://localhost:8080
```

### Production Deployment

```bash
# Deploy SigNoz to the production environment
./deploy-signoz.sh prod

# Verify the deployment
./verify-signoz.sh prod

# Access the SigNoz UI
# https://monitoring.bakewise.ai
```

## Configuration Files

### signoz-values-dev.yaml

Development environment configuration with:
- Single replica for most components
- Reduced resource requests (optimized for a local Kind cluster)
- 7-day data retention
- Batch size: 10,000 events
- ClickHouse 25.5.6, OTel Collector v0.129.12
- PostgreSQL, Redis, and RabbitMQ receivers configured

### signoz-values-prod.yaml

Production environment configuration with:
- High availability: 2+ replicas for critical components
- 3 Zookeeper replicas (required for production)
- 30-day data retention
- Batch size: 50,000 events (high performance)
- Cold storage enabled with a 30-day TTL
- Horizontal Pod Autoscaler (HPA) enabled
- TLS/SSL with cert-manager
- Enhanced security with pod anti-affinity rules

## Key Configuration Changes (v0.89.0+)

⚠️ **BREAKING CHANGE**: SigNoz Helm chart v0.89.0+ uses a unified component structure.

**Old structure (deprecated):**
```yaml
frontend:
  replicaCount: 2
queryService:
  replicaCount: 2
```

**New structure (current):**
```yaml
signoz:
  replicaCount: 2
  # Combines the frontend and query service
```

## Component Architecture

### Core Components

1. **SigNoz** (unified component)
   - Frontend UI + query service
   - Ports: 8080 (HTTP/API), 8085 (internal gRPC)
   - Dev: 1 replica; Prod: 2+ replicas with HPA

2. **ClickHouse** (time-series database)
   - Version: 25.5.6
   - Stores traces, metrics, and logs
   - Dev: 1 replica; Prod: 2 replicas with cold storage

3. **Zookeeper** (ClickHouse coordination)
   - Version: 3.7.1
   - Dev: 1 replica; Prod: 3 replicas (critical for HA)

4. **OpenTelemetry Collector** (data ingestion)
   - Version: v0.129.12
   - Ports: 4317 (gRPC), 4318 (HTTP), 8888 (metrics)
   - Dev: 1 replica; Prod: 2+ replicas with HPA

5. **Alertmanager** (alert management)
   - Version: 0.23.5
   - Email and Slack integrations configured
   - Port: 9093

## Performance Optimizations

### Batch Processing
- **Development**: 10,000 events per batch
- **Production**: 50,000 events per batch (official recommendation)
- Timeout: 1 second for faster processing

### Memory Management
- A memory limiter processor prevents OOM kills
- Dev: 400 MiB limit; Prod: 1500 MiB limit
- Spike limits configured

### Span Metrics Processor
Automatically generates RED metrics (Rate, Errors, Duration):
- Latency histogram buckets optimized for microservices
- Cache size: 10K (dev), 100K (prod)

### Cold Storage (Production Only)
- Enabled with a 30-day TTL
- Automatically moves old data to cold storage
- Keeps 10 GB free on primary storage

## OpenTelemetry Endpoints

### From Within the Kubernetes Cluster

**Development:**
```
OTLP gRPC: signoz-otel-collector.bakery-ia.svc.cluster.local:4317
OTLP HTTP: signoz-otel-collector.bakery-ia.svc.cluster.local:4318
```

**Production:**
```
OTLP gRPC: signoz-otel-collector.bakery-ia.svc.cluster.local:4317
OTLP HTTP: signoz-otel-collector.bakery-ia.svc.cluster.local:4318
```

### Application Configuration Example

```yaml
# Python with OpenTelemetry
OTEL_EXPORTER_OTLP_ENDPOINT: "http://signoz-otel-collector.bakery-ia.svc.cluster.local:4318"
OTEL_EXPORTER_OTLP_PROTOCOL: "http/protobuf"
```

```javascript
// Node.js with OpenTelemetry
const exporter = new OTLPTraceExporter({
  url: 'http://signoz-otel-collector.bakery-ia.svc.cluster.local:4318/v1/traces',
});
```
## Deployment Scripts

### deploy-signoz.sh

A comprehensive deployment script:

```bash
# Usage
./deploy-signoz.sh [OPTIONS] ENVIRONMENT

# Options
-h, --help          Show help message
-d, --dry-run       Show what would be deployed
-u, --upgrade       Upgrade an existing deployment
-r, --remove        Remove the deployment
-n, --namespace NS  Custom namespace (default: signoz)

# Examples
./deploy-signoz.sh dev              # Deploy to dev
./deploy-signoz.sh --upgrade prod   # Upgrade prod
./deploy-signoz.sh --dry-run prod   # Preview changes
./deploy-signoz.sh --remove dev     # Remove the dev deployment
```

**Features:**
- Automatic Helm repository setup
- Docker Hub secret creation
- Namespace management
- Deployment verification
- 15-minute timeout with the `--wait` flag

### verify-signoz.sh

A verification script that checks deployment health:

```bash
# Usage
./verify-signoz.sh [OPTIONS] ENVIRONMENT

# Examples
./verify-signoz.sh dev    # Verify the dev deployment
./verify-signoz.sh prod   # Verify the prod deployment
```

**Checks performed:**
1. ✅ Helm release status
2. ✅ Pod health and readiness
3. ✅ Service availability
4. ✅ Ingress configuration
5. ✅ PVC status
6. ✅ Resource usage (if metrics-server is available)
7. ✅ Log errors
8. ✅ Environment-specific validations
   - Dev: single replica, resource limits
   - Prod: HA config, TLS, Zookeeper replicas, HPA

## Storage Configuration

### Development (Kind)
```yaml
global:
  storageClass: "standard"  # Kind's default provisioner
```

### Production (MicroK8s)
```yaml
global:
  storageClass: "microk8s-hostpath"  # Or a custom storage class
```

**Storage requirements:**
- **Development**: ~35 GiB total
  - SigNoz: 5 GiB
  - ClickHouse: 20 GiB
  - Zookeeper: 5 GiB
  - Alertmanager: 2 GiB

- **Production**: ~135 GiB total
  - SigNoz: 20 GiB
  - ClickHouse: 100 GiB
  - Zookeeper: 10 GiB
  - Alertmanager: 5 GiB

## Resource Requirements

### Development Environment
**Minimum:**
- CPU: 550m (0.55 cores)
- Memory: 1.6 GiB
- Storage: 35 GiB

**Recommended:**
- CPU: 3 cores
- Memory: 3 GiB
- Storage: 50 GiB

### Production Environment
**Minimum:**
- CPU: 3.5 cores
- Memory: 8 GiB
- Storage: 135 GiB

**Recommended:**
- CPU: 12 cores
- Memory: 20 GiB
- Storage: 200 GiB

## Data Retention

### Development
- Traces: 7 days (168 hours)
- Metrics: 7 days (168 hours)
- Logs: 7 days (168 hours)

### Production
- Traces: 30 days (720 hours)
- Metrics: 30 days (720 hours)
- Logs: 30 days (720 hours)
- Cold storage after 30 days

To modify retention, update the environment variables:
```yaml
signoz:
  env:
    signoz_traces_ttl_duration_hrs: "720"   # 30 days
    signoz_metrics_ttl_duration_hrs: "720"  # 30 days
    signoz_logs_ttl_duration_hrs: "168"     # 7 days
```
## High Availability (Production)
|
||||
|
||||
### Replication Strategy
|
||||
```yaml
|
||||
signoz: 2 replicas + HPA (min: 2, max: 5)
|
||||
clickhouse: 2 replicas
|
||||
zookeeper: 3 replicas (critical!)
|
||||
otelCollector: 2 replicas + HPA (min: 2, max: 10)
|
||||
alertmanager: 2 replicas
|
||||
```
|
||||
|
||||
### Pod Anti-Affinity
|
||||
Ensures pods are distributed across different nodes:
|
||||
```yaml
|
||||
affinity:
|
||||
podAntiAffinity:
|
||||
preferredDuringSchedulingIgnoredDuringExecution:
|
||||
- weight: 100
|
||||
podAffinityTerm:
|
||||
labelSelector:
|
||||
matchLabels:
|
||||
app.kubernetes.io/component: query-service
|
||||
topologyKey: kubernetes.io/hostname
|
||||
```
|
||||
|
||||
### Pod Disruption Budgets
|
||||
Configured for all critical components:
|
||||
```yaml
|
||||
podDisruptionBudget:
|
||||
enabled: true
|
||||
minAvailable: 1
|
||||
```
|
||||
|
||||
## Monitoring and Alerting
|
||||
|
||||
### Email Alerts (Production)
|
||||
Configure SMTP in production values (using Mailu Helm with Mailgun relay):
|
||||
```yaml
|
||||
signoz:
|
||||
env:
|
||||
signoz_smtp_enabled: "true"
|
||||
signoz_smtp_host: "mailu-postfix.bakery-ia.svc.cluster.local"
|
||||
signoz_smtp_port: "587"
|
||||
signoz_smtp_from: "alerts@bakewise.ai"
signoz_smtp_username: "alerts@bakewise.ai"
# Set via secret: signoz_smtp_password
```

**Note**: SigNoz now uses the internal Mailu SMTP service (deployed via Helm), which relays to Mailgun for better deliverability and centralized email management.

### Slack Alerts (Production)
Configure the webhook in Alertmanager:
```yaml
alertmanager:
  config:
    receivers:
      - name: 'critical-alerts'
        slack_configs:
          - api_url: '${SLACK_WEBHOOK_URL}'
            channel: '#alerts-critical'
```

### Mailgun Integration for Alert Emails

SigNoz has been configured to use Mailgun for sending alert emails through the Mailu SMTP service. This provides:

**Benefits:**
- Better email deliverability through Mailgun's infrastructure
- Centralized email management via Mailu
- Improved tracking and analytics for alert emails
- Compliance with email sending best practices

**Architecture:**
```
SigNoz Alertmanager → Mailu SMTP → Mailgun Relay → Recipients
```

**Configuration Requirements:**

1. **Mailu Configuration** (deployed via Helm at `infrastructure/platform/mail/mailu-helm/`):
   ```yaml
   externalRelay:
     host: "[smtp.mailgun.org]:587"
     username: "postmaster@bakewise.ai"
     password: "<mailgun-api-key>"
   ```

2. **DNS Configuration** (required for Mailgun):
   ```
   # MX record
   bakewise.ai. IN MX 10 mail.bakewise.ai.

   # SPF record (authorize Mailgun)
   bakewise.ai. IN TXT "v=spf1 include:mailgun.org ~all"

   # DKIM record (provided by Mailgun)
   m1._domainkey.bakewise.ai. IN TXT "v=DKIM1; k=rsa; p=<mailgun-public-key>"

   # DMARC record
   _dmarc.bakewise.ai. IN TXT "v=DMARC1; p=quarantine; rua=mailto:dmarc@bakewise.ai"
   ```

3. **SigNoz SMTP Configuration** (already configured in `signoz-values-prod.yaml`):
   ```yaml
   signoz_smtp_host: "mailu-postfix.bakery-ia.svc.cluster.local"
   signoz_smtp_port: "587"
   signoz_smtp_from: "alerts@bakewise.ai"
   ```
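Before relying on the relay in production, the DNS record values above can be sanity-checked mechanically. A minimal Python sketch (illustrative only; it operates on the record strings from the zone snippets above, whereas a real check would fetch them with a DNS client such as `dig TXT bakewise.ai`):

```python
# Validate the SPF and DMARC record values shown in the DNS configuration above.

def spf_authorizes(record: str, domain: str) -> bool:
    """True if the SPF record authorizes the given sender domain."""
    return record.startswith("v=spf1") and f"include:{domain}" in record.split()

def dmarc_policy(record: str) -> str:
    """Extract the p= policy tag from a DMARC record."""
    tags = dict(part.strip().split("=", 1) for part in record.split(";") if "=" in part)
    return tags.get("p", "none")

spf = "v=spf1 include:mailgun.org ~all"
dmarc = "v=DMARC1; p=quarantine; rua=mailto:dmarc@bakewise.ai"

print(spf_authorizes(spf, "mailgun.org"))  # True
print(dmarc_policy(dmarc))                 # quarantine
```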

**Testing the Integration:**

1. Trigger a test alert from the SigNoz UI
2. Check Mailu logs: `kubectl logs -f -n bakery-ia deployment/mailu-postfix`
3. Check the Mailgun dashboard for delivery status
4. Verify email receipt in the destination inbox
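Step 1 can also be approximated with a hand-built message sent through the same service. A hedged Python sketch (host, port, and sender come from the `signoz-values-prod.yaml` snippet above; the recipient address is hypothetical, and actually sending requires running where the Mailu service DNS name resolves):

```python
from email.message import EmailMessage

SMTP_HOST = "mailu-postfix.bakery-ia.svc.cluster.local"  # from signoz-values-prod.yaml
SMTP_PORT = 587

def build_test_alert(recipient: str) -> EmailMessage:
    """Build a minimal test email matching the SigNoz sender configuration."""
    msg = EmailMessage()
    msg["From"] = "alerts@bakewise.ai"
    msg["To"] = recipient
    msg["Subject"] = "[TEST] SigNoz alert pipeline check"
    msg.set_content("Test alert to verify the Mailu -> Mailgun relay path.")
    return msg

msg = build_test_alert("ops@bakewise.ai")  # hypothetical recipient
# To actually send from inside the cluster:
#   import smtplib
#   with smtplib.SMTP(SMTP_HOST, SMTP_PORT) as s:
#       s.starttls()
#       s.send_message(msg)
print(msg["From"])  # alerts@bakewise.ai
```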

**Troubleshooting:**

- **SMTP Authentication Failed**: Verify Mailu credentials and the Mailgun API key
- **Email Delivery Delays**: Check the Mailu queue with `kubectl exec -it -n bakery-ia deployment/mailu-postfix -- mailq`
- **SPF/DKIM Issues**: Verify DNS records and Mailgun domain verification

### Self-Monitoring
SigNoz monitors itself:
```yaml
selfMonitoring:
  enabled: true
  serviceMonitor:
    enabled: true # Prod only
    interval: 30s
```

## Troubleshooting

### Common Issues

**1. Pods not starting**
```bash
# Check pod status
kubectl get pods -n signoz

# Check pod logs
kubectl logs -n signoz <pod-name>

# Describe pod for events
kubectl describe pod -n signoz <pod-name>
```

**2. Docker Hub rate limits**
```bash
# Verify secret exists
kubectl get secret dockerhub-creds -n signoz

# Recreate secret
kubectl delete secret dockerhub-creds -n signoz
export DOCKERHUB_USERNAME='your-username'
export DOCKERHUB_PASSWORD='your-token'
./deploy-signoz.sh dev
```

**3. ClickHouse connection issues**
```bash
# Check ClickHouse pod
kubectl logs -n signoz -l app.kubernetes.io/component=clickhouse

# Check Zookeeper (required by ClickHouse)
kubectl logs -n signoz -l app.kubernetes.io/component=zookeeper
```

**4. OTel Collector not receiving data**
```bash
# Check OTel Collector logs
kubectl logs -n signoz -l app.kubernetes.io/component=otel-collector

# Test connectivity
kubectl port-forward -n signoz svc/signoz-otel-collector 4318:4318
curl -v http://localhost:4318/v1/traces
```

**5. Insufficient storage**
```bash
# Check PVC status
kubectl get pvc -n signoz

# Check storage usage (if metrics-server available)
kubectl top pods -n signoz
```

### Debug Mode

Enable the debug exporter in the OTel Collector:
```yaml
otelCollector:
  config:
    exporters:
      debug:
        verbosity: detailed
        sampling_initial: 5
        sampling_thereafter: 200
    service:
      pipelines:
        traces:
          exporters: [clickhousetraces, debug] # Add debug
```

### Upgrade from Old Version

If upgrading from pre-v0.89.0:
```bash
# 1. Backup data (recommended)
kubectl get all -n signoz -o yaml > signoz-backup.yaml

# 2. Remove old deployment
./deploy-signoz.sh --remove prod

# 3. Deploy new version
./deploy-signoz.sh prod

# 4. Verify
./verify-signoz.sh prod
```

## Security Best Practices

1. **Change default password** immediately after first login
2. **Use TLS/SSL** in production (configured with cert-manager)
3. **Network policies** enabled in production
4. **Run as non-root** (configured in securityContext)
5. **RBAC** with a dedicated service account
6. **Secrets management** for sensitive data (SMTP, Slack webhooks)
7. **Image pull secrets** to avoid exposing Docker Hub credentials

## Backup and Recovery

### Backup ClickHouse Data
```bash
# Export ClickHouse data
kubectl exec -n signoz <clickhouse-pod> -- clickhouse-client \
  --query="BACKUP DATABASE signoz_traces TO Disk('backups', 'traces_backup.zip')"

# Copy backup out
kubectl cp signoz/<clickhouse-pod>:/var/lib/clickhouse/backups/ ./backups/
```

### Restore from Backup
```bash
# Copy backup in
kubectl cp ./backups/ signoz/<clickhouse-pod>:/var/lib/clickhouse/backups/

# Restore
kubectl exec -n signoz <clickhouse-pod> -- clickhouse-client \
  --query="RESTORE DATABASE signoz_traces FROM Disk('backups', 'traces_backup.zip')"
```

## Updating Configuration

To update the SigNoz configuration:

1. Edit the values file: `signoz-values-{env}.yaml`
2. Apply changes:
   ```bash
   ./deploy-signoz.sh --upgrade {env}
   ```
3. Verify:
   ```bash
   ./verify-signoz.sh {env}
   ```

## Uninstallation

```bash
# Remove SigNoz deployment
./deploy-signoz.sh --remove {env}

# Optionally delete PVCs (WARNING: deletes all data)
kubectl delete pvc -n signoz -l app.kubernetes.io/instance=signoz

# Optionally delete namespace
kubectl delete namespace signoz
```

## References

- [SigNoz Official Documentation](https://signoz.io/docs/)
- [SigNoz Helm Charts Repository](https://github.com/SigNoz/charts)
- [OpenTelemetry Documentation](https://opentelemetry.io/docs/)
- [ClickHouse Documentation](https://clickhouse.com/docs/)

## Support

For issues or questions:
1. Check [SigNoz GitHub Issues](https://github.com/SigNoz/signoz/issues)
2. Review deployment logs: `kubectl logs -n signoz <pod-name>`
3. Run the verification script: `./verify-signoz.sh {env}`
4. Check the [SigNoz Community Slack](https://signoz.io/slack)

---

**Last Updated**: 2026-01-09
**SigNoz Helm Chart Version**: Latest (v0.129.12 components)
**Maintained by**: Bakery IA Team
190
infrastructure/monitoring/signoz/dashboards/README.md
Normal file
@@ -0,0 +1,190 @@
# SigNoz Dashboards for Bakery IA

This directory contains comprehensive SigNoz dashboard configurations for monitoring the Bakery IA system.

## Available Dashboards

### 1. Infrastructure Monitoring
- **File**: `infrastructure-monitoring.json`
- **Purpose**: Monitor Kubernetes infrastructure, pod health, and resource utilization
- **Key Metrics**: CPU usage, memory usage, network traffic, pod status, container health

### 2. Application Performance
- **File**: `application-performance.json`
- **Purpose**: Monitor microservice performance and API metrics
- **Key Metrics**: Request rate, error rate, latency percentiles, endpoint performance

### 3. Database Performance
- **File**: `database-performance.json`
- **Purpose**: Monitor PostgreSQL and Redis database performance
- **Key Metrics**: Connections, query execution time, cache hit ratio, locks, replication status

### 4. API Performance
- **File**: `api-performance.json`
- **Purpose**: Monitor REST and GraphQL API performance
- **Key Metrics**: Request volume, response times, status codes, endpoint analysis

### 5. Error Tracking
- **File**: `error-tracking.json`
- **Purpose**: Track and analyze system errors
- **Key Metrics**: Error rates, error distribution, recent errors, HTTP errors, database errors

### 6. User Activity
- **File**: `user-activity.json`
- **Purpose**: Monitor user behavior and activity patterns
- **Key Metrics**: Active users, sessions, API calls per user, session duration

### 7. System Health
- **File**: `system-health.json`
- **Purpose**: Overall system health monitoring
- **Key Metrics**: Availability, health scores, resource utilization, service status

### 8. Alert Management
- **File**: `alert-management.json`
- **Purpose**: Monitor and manage system alerts
- **Key Metrics**: Active alerts, alert rates, alert distribution, firing alerts

### 9. Log Analysis
- **File**: `log-analysis.json`
- **Purpose**: Search and analyze system logs
- **Key Metrics**: Log volume, error logs, log distribution, log search

## How to Import Dashboards

### Method 1: Using SigNoz UI

1. **Access SigNoz UI**: Open your SigNoz instance in a web browser
2. **Navigate to Dashboards**: Go to the "Dashboards" section
3. **Import Dashboard**: Click the "Import Dashboard" button
4. **Upload JSON**: Select the JSON file from this directory
5. **Configure**: Adjust any variables or settings as needed
6. **Save**: Save the imported dashboard

**Note**: The dashboards now use the correct SigNoz JSON schema with proper filter arrays.

### Method 2: Using SigNoz API

```bash
# Import a single dashboard
curl -X POST "http://<SIGNOZ_HOST>:3301/api/v1/dashboards/import" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <API_KEY>" \
  -d @infrastructure-monitoring.json

# Import all dashboards
for file in *.json; do
  curl -X POST "http://<SIGNOZ_HOST>:3301/api/v1/dashboards/import" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer <API_KEY>" \
    -d @"$file"
done
```
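The same loop can be scripted without curl; a hedged Python sketch of the request construction (the `/api/v1/dashboards/import` path and bearer-token auth are the same assumptions the curl example above already makes, and the host name here is a placeholder):

```python
import json
import urllib.request

def build_import_request(host: str, api_key: str, dashboard: dict) -> urllib.request.Request:
    """Build (but do not send) a dashboard import request for the SigNoz API."""
    return urllib.request.Request(
        url=f"http://{host}:3301/api/v1/dashboards/import",
        data=json.dumps(dashboard).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

# Placeholder host; sending would be urllib.request.urlopen(req)
req = build_import_request("signoz.example.internal", "my-key", {"title": "demo"})
print(req.get_method())  # POST
```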

### Method 3: Using Kubernetes ConfigMap

```bash
# Create a ConfigMap with all dashboards
kubectl create configmap signoz-dashboards \
  --from-file=infrastructure-monitoring.json \
  --from-file=application-performance.json \
  --from-file=database-performance.json \
  --from-file=api-performance.json \
  --from-file=error-tracking.json \
  --from-file=user-activity.json \
  --from-file=system-health.json \
  --from-file=alert-management.json \
  --from-file=log-analysis.json \
  -n signoz
```

## Dashboard Variables

Most dashboards include variables that allow you to filter and customize the view:

- **Namespace**: Filter by Kubernetes namespace (e.g., `bakery-ia`, `default`)
- **Service**: Filter by specific microservice
- **Severity**: Filter by error/alert severity
- **Environment**: Filter by deployment environment
- **Time Range**: Adjust the time window for analysis

## Metrics Reference

The dashboards use standard OpenTelemetry metrics. If you need to add custom metrics, ensure they are properly instrumented in your services.

## Troubleshooting

### Dashboard Import Errors

If you encounter errors when importing dashboards:

1. **Validate JSON**: Ensure the JSON files are valid
   ```bash
   jq . infrastructure-monitoring.json
   ```

2. **Check Metrics**: Verify that the metrics exist in your SigNoz instance

3. **Adjust Time Range**: Try different time ranges if no data appears

4. **Check Filters**: Ensure filters match your actual service names and tags

### "e.filter is not a function" Error

This error occurs when the dashboard JSON uses an incorrect filter format. The fix has been applied:

**Before (incorrect)**:
```json
"filters": {
  "namespace": "${namespace}"
}
```

**After (correct)**:
```json
"filters": [
  {
    "key": "namespace",
    "operator": "=",
    "value": "${namespace}"
  }
]
```

All dashboards in this directory now use the correct array format for filters.
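If any externally sourced dashboard still uses the old object format, the conversion is mechanical. A minimal Python sketch (illustrative; it assumes the old format was a flat key-to-value object as shown above, and defaults every operator to `=`):

```python
def migrate_filters(old: dict) -> list:
    """Convert legacy {key: value} filters to the [{key, operator, value}] array format."""
    return [{"key": k, "operator": "=", "value": v} for k, v in old.items()]

old = {"namespace": "${namespace}"}
print(migrate_filters(old))
# [{'key': 'namespace', 'operator': '=', 'value': '${namespace}'}]
```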

### Missing Data

If dashboards show no data:

1. **Verify Instrumentation**: Ensure your services are properly instrumented with OpenTelemetry
2. **Check Time Range**: Adjust the time range to include recent data
3. **Validate Metrics**: Confirm the metrics are being collected and stored
4. **Review Filters**: Check that filters match your actual deployment

## Customization

You can customize these dashboards by:

1. **Editing JSON**: Modify the JSON files to add/remove panels or adjust queries
2. **Cloning in UI**: Clone existing dashboards and modify them in the SigNoz UI
3. **Adding Variables**: Add new variables for additional filtering options
4. **Adjusting Layout**: Change the grid layout and panel sizes

## Best Practices

1. **Regular Reviews**: Review dashboards regularly to ensure they meet your monitoring needs
2. **Alert Integration**: Set up alerts based on key metrics shown in these dashboards
3. **Team Access**: Share relevant dashboards with appropriate team members
4. **Documentation**: Document any custom metrics or specific monitoring requirements

## Support

For issues with these dashboards:

1. Check the [SigNoz documentation](https://signoz.io/docs/)
2. Review the [Bakery IA monitoring guide](../SIGNOZ_COMPLETE_CONFIGURATION_GUIDE.md)
3. Consult the OpenTelemetry metrics specification

## License

These dashboard configurations are provided under the same license as the Bakery IA project.
@@ -0,0 +1,170 @@
{
  "description": "Alert monitoring and management dashboard",
  "tags": ["alerts", "monitoring", "management"],
  "name": "bakery-ia-alert-management",
  "title": "Bakery IA - Alert Management",
  "uploadedGrafana": false,
  "uuid": "bakery-ia-alerts-01",
  "version": "v4",
  "collapsableRowsMigrated": true,
  "layout": [
    {
      "x": 0,
      "y": 0,
      "w": 6,
      "h": 3,
      "i": "active-alerts",
      "moved": false,
      "static": false
    },
    {
      "x": 6,
      "y": 0,
      "w": 6,
      "h": 3,
      "i": "alert-rate",
      "moved": false,
      "static": false
    }
  ],
  "variables": {
    "service": {
      "id": "service-var",
      "name": "service",
      "description": "Filter by service name",
      "type": "QUERY",
      "queryValue": "SELECT DISTINCT(resource_attrs['service.name']) as value FROM signoz_metrics.distributed_time_series_v4_1day WHERE metric_name = 'alerts_active' AND value != '' ORDER BY value",
      "customValue": "",
      "textboxValue": "",
      "showALLOption": true,
      "multiSelect": false,
      "order": 1,
      "modificationUUID": "",
      "sort": "ASC",
      "selectedValue": null
    }
  },
  "widgets": [
    {
      "id": "active-alerts",
      "title": "Active Alerts",
      "description": "Number of currently active alerts",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "value",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": {
                "key": "alerts_active",
                "dataType": "int64",
                "type": "Gauge",
                "isColumn": false
              },
              "timeAggregation": "latest",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "key": {
                      "key": "serviceName",
                      "dataType": "string",
                      "type": "tag",
                      "isColumn": true
                    },
                    "op": "=",
                    "value": "{{.service}}"
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [],
              "legend": "Active Alerts",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "none"
    },
    {
      "id": "alert-rate",
      "title": "Alert Rate",
      "description": "Rate of alerts over time",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": {
                "key": "alerts_total",
                "dataType": "int64",
                "type": "Counter",
                "isColumn": false
              },
              "timeAggregation": "rate",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "key": {
                      "key": "serviceName",
                      "dataType": "string",
                      "type": "tag",
                      "isColumn": true
                    },
                    "op": "=",
                    "value": "{{.service}}"
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "serviceName",
                  "dataType": "string",
                  "type": "tag",
                  "isColumn": true
                }
              ],
              "legend": "{{serviceName}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "alerts/s"
    }
  ]
}
351
infrastructure/monitoring/signoz/dashboards/api-performance.json
Normal file
@@ -0,0 +1,351 @@
{
  "description": "Comprehensive API performance monitoring for Bakery IA REST and GraphQL endpoints",
  "tags": ["api", "performance", "rest", "graphql"],
  "name": "bakery-ia-api-performance",
  "title": "Bakery IA - API Performance",
  "uploadedGrafana": false,
  "uuid": "bakery-ia-api-01",
  "version": "v4",
  "collapsableRowsMigrated": true,
  "layout": [
    {
      "x": 0,
      "y": 0,
      "w": 6,
      "h": 3,
      "i": "request-volume",
      "moved": false,
      "static": false
    },
    {
      "x": 6,
      "y": 0,
      "w": 6,
      "h": 3,
      "i": "error-rate",
      "moved": false,
      "static": false
    },
    {
      "x": 0,
      "y": 3,
      "w": 6,
      "h": 3,
      "i": "avg-response-time",
      "moved": false,
      "static": false
    },
    {
      "x": 6,
      "y": 3,
      "w": 6,
      "h": 3,
      "i": "p95-latency",
      "moved": false,
      "static": false
    }
  ],
  "variables": {
    "service": {
      "id": "service-var",
      "name": "service",
      "description": "Filter by API service",
      "type": "QUERY",
      "queryValue": "SELECT DISTINCT(resource_attrs['service.name']) as value FROM signoz_metrics.distributed_time_series_v4_1day WHERE metric_name = 'http_server_requests_seconds_count' AND value != '' ORDER BY value",
      "customValue": "",
      "textboxValue": "",
      "showALLOption": true,
      "multiSelect": false,
      "order": 1,
      "modificationUUID": "",
      "sort": "ASC",
      "selectedValue": null
    }
  },
  "widgets": [
    {
      "id": "request-volume",
      "title": "Request Volume",
      "description": "API request volume by service",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": {
                "key": "http_server_requests_seconds_count",
                "dataType": "int64",
                "type": "Counter",
                "isColumn": false
              },
              "timeAggregation": "rate",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "key": {
                      "key": "service.name",
                      "dataType": "string",
                      "type": "resource",
                      "isColumn": false
                    },
                    "op": "=",
                    "value": "{{.service}}"
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "api.name",
                  "dataType": "string",
                  "type": "resource",
                  "isColumn": false
                }
              ],
              "legend": "{{api.name}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "req/s"
    },
    {
      "id": "error-rate",
      "title": "Error Rate",
      "description": "API error rate by service",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": {
                "key": "http_server_requests_seconds_count",
                "dataType": "int64",
                "type": "Counter",
                "isColumn": false
              },
              "timeAggregation": "rate",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "key": {
                      "key": "api.name",
                      "dataType": "string",
                      "type": "resource",
                      "isColumn": false
                    },
                    "op": "=",
                    "value": "{{.api}}"
                  },
                  {
                    "key": {
                      "key": "status_code",
                      "dataType": "string",
                      "type": "tag",
                      "isColumn": false
                    },
                    "op": "=~",
                    "value": "5.."
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "api.name",
                  "dataType": "string",
                  "type": "resource",
                  "isColumn": false
                },
                {
                  "key": "status_code",
                  "dataType": "string",
                  "type": "tag",
                  "isColumn": false
                }
              ],
              "legend": "{{api.name}} - {{status_code}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "req/s"
    },
    {
      "id": "avg-response-time",
      "title": "Average Response Time",
      "description": "Average API response time by endpoint",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "avg",
              "aggregateAttribute": {
                "key": "http_server_requests_seconds_sum",
                "dataType": "float64",
                "type": "Counter",
                "isColumn": false
              },
              "timeAggregation": "avg",
              "spaceAggregation": "avg",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "key": {
                      "key": "api.name",
                      "dataType": "string",
                      "type": "resource",
                      "isColumn": false
                    },
                    "op": "=",
                    "value": "{{.api}}"
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "api.name",
                  "dataType": "string",
                  "type": "resource",
                  "isColumn": false
                },
                {
                  "key": "endpoint",
                  "dataType": "string",
                  "type": "tag",
                  "isColumn": false
                }
              ],
              "legend": "{{api.name}} - {{endpoint}}",
              "reduceTo": "avg"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "seconds"
    },
    {
      "id": "p95-latency",
      "title": "P95 Latency",
      "description": "95th percentile latency by endpoint",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "histogram_quantile",
              "aggregateAttribute": {
                "key": "http_server_requests_seconds_bucket",
                "dataType": "float64",
                "type": "Histogram",
                "isColumn": false
              },
              "timeAggregation": "avg",
              "spaceAggregation": "avg",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "key": {
                      "key": "api.name",
                      "dataType": "string",
                      "type": "resource",
                      "isColumn": false
                    },
                    "op": "=",
                    "value": "{{.api}}"
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "api.name",
                  "dataType": "string",
                  "type": "resource",
                  "isColumn": false
                },
                {
                  "key": "endpoint",
                  "dataType": "string",
                  "type": "tag",
                  "isColumn": false
                }
              ],
              "legend": "{{api.name}} - {{endpoint}}",
              "reduceTo": "avg"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "seconds"
    }
  ]
}
@@ -0,0 +1,333 @@
{
  "description": "Application performance monitoring dashboard using distributed traces and metrics",
  "tags": ["application", "performance", "traces", "apm"],
  "name": "bakery-ia-application-performance",
  "title": "Bakery IA - Application Performance (APM)",
  "uploadedGrafana": false,
  "uuid": "bakery-ia-apm-01",
  "version": "v4",
  "collapsableRowsMigrated": true,
  "layout": [
    {
      "x": 0,
      "y": 0,
      "w": 6,
      "h": 3,
      "i": "latency-p99",
      "moved": false,
      "static": false
    },
    {
      "x": 6,
      "y": 0,
      "w": 6,
      "h": 3,
      "i": "request-rate",
      "moved": false,
      "static": false
    },
    {
      "x": 0,
      "y": 3,
      "w": 6,
      "h": 3,
      "i": "error-rate",
      "moved": false,
      "static": false
    },
    {
      "x": 6,
      "y": 3,
      "w": 6,
      "h": 3,
      "i": "avg-duration",
      "moved": false,
      "static": false
    }
  ],
  "variables": {
    "service_name": {
      "id": "service-var",
      "name": "service_name",
      "description": "Filter by service name",
      "type": "QUERY",
      "queryValue": "SELECT DISTINCT(serviceName) FROM signoz_traces.distributed_signoz_index_v2 ORDER BY serviceName",
      "customValue": "",
      "textboxValue": "",
      "showALLOption": true,
      "multiSelect": false,
      "order": 1,
      "modificationUUID": "",
      "sort": "ASC",
      "selectedValue": null
    }
  },
  "widgets": [
    {
      "id": "latency-p99",
      "title": "P99 Latency",
      "description": "99th percentile latency for selected service",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "traces",
              "queryName": "A",
              "aggregateOperator": "p99",
              "aggregateAttribute": {
                "key": "duration_ns",
                "dataType": "float64",
                "type": "",
                "isColumn": true
              },
              "timeAggregation": "avg",
              "spaceAggregation": "p99",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "key": {
                      "key": "serviceName",
                      "dataType": "string",
                      "type": "tag",
                      "isColumn": true
                    },
                    "op": "=",
                    "value": "{{.service_name}}"
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "serviceName",
                  "dataType": "string",
                  "type": "tag",
                  "isColumn": true
                }
              ],
              "legend": "{{serviceName}}",
              "reduceTo": "avg"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "ms"
    },
    {
      "id": "request-rate",
      "title": "Request Rate",
      "description": "Requests per second for the service",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "traces",
              "queryName": "A",
              "aggregateOperator": "count",
              "aggregateAttribute": {
                "key": "",
                "dataType": "",
                "type": "",
                "isColumn": false
              },
              "timeAggregation": "rate",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "key": {
                      "key": "serviceName",
                      "dataType": "string",
                      "type": "tag",
                      "isColumn": true
                    },
                    "op": "=",
                    "value": "{{.service_name}}"
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "serviceName",
                  "dataType": "string",
                  "type": "tag",
                  "isColumn": true
                }
              ],
              "legend": "{{serviceName}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "reqps"
    },
    {
      "id": "error-rate",
      "title": "Error Rate",
      "description": "Error rate percentage for the service",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "traces",
              "queryName": "A",
              "aggregateOperator": "count",
              "aggregateAttribute": {
                "key": "",
                "dataType": "",
                "type": "",
                "isColumn": false
              },
              "timeAggregation": "rate",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "key": {
                      "key": "serviceName",
                      "dataType": "string",
                      "type": "tag",
                      "isColumn": true
                    },
                    "op": "=",
                    "value": "{{.service_name}}"
                  },
                  {
                    "key": {
                      "key": "status_code",
                      "dataType": "string",
                      "type": "tag",
                      "isColumn": true
                    },
                    "op": "=",
                    "value": "STATUS_CODE_ERROR"
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "serviceName",
                  "dataType": "string",
                  "type": "tag",
                  "isColumn": true
                }
              ],
              "legend": "{{serviceName}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "reqps"
    },
    {
      "id": "avg-duration",
      "title": "Average Duration",
      "description": "Average request duration",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "traces",
              "queryName": "A",
              "aggregateOperator": "avg",
              "aggregateAttribute": {
                "key": "duration_ns",
                "dataType": "float64",
                "type": "",
                "isColumn": true
              },
              "timeAggregation": "avg",
              "spaceAggregation": "avg",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "key": {
                      "key": "serviceName",
                      "dataType": "string",
                      "type": "tag",
                      "isColumn": true
                    },
                    "op": "=",
                    "value": "{{.service_name}}"
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "serviceName",
                  "dataType": "string",
                  "type": "tag",
                  "isColumn": true
                }
              ],
              "legend": "{{serviceName}}",
              "reduceTo": "avg"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "ms"
    }
  ]
||||
}
|
||||
@@ -0,0 +1,425 @@
{
  "description": "Comprehensive database performance monitoring for PostgreSQL, Redis, and RabbitMQ",
  "tags": ["database", "postgresql", "redis", "rabbitmq", "performance"],
  "name": "bakery-ia-database-performance",
  "title": "Bakery IA - Database Performance",
  "uploadedGrafana": false,
  "uuid": "bakery-ia-db-01",
  "version": "v4",
  "collapsableRowsMigrated": true,
  "layout": [
    {
      "x": 0,
      "y": 0,
      "w": 6,
      "h": 3,
      "i": "pg-connections",
      "moved": false,
      "static": false
    },
    {
      "x": 6,
      "y": 0,
      "w": 6,
      "h": 3,
      "i": "pg-db-size",
      "moved": false,
      "static": false
    },
    {
      "x": 0,
      "y": 3,
      "w": 6,
      "h": 3,
      "i": "redis-connected-clients",
      "moved": false,
      "static": false
    },
    {
      "x": 6,
      "y": 3,
      "w": 6,
      "h": 3,
      "i": "redis-memory",
      "moved": false,
      "static": false
    },
    {
      "x": 0,
      "y": 6,
      "w": 6,
      "h": 3,
      "i": "rabbitmq-messages",
      "moved": false,
      "static": false
    },
    {
      "x": 6,
      "y": 6,
      "w": 6,
      "h": 3,
      "i": "rabbitmq-consumers",
      "moved": false,
      "static": false
    }
  ],
  "variables": {
    "database": {
      "id": "database-var",
      "name": "database",
      "description": "Filter by PostgreSQL database name",
      "type": "QUERY",
      "queryValue": "SELECT DISTINCT(resource_attrs['postgresql.database.name']) as value FROM signoz_metrics.distributed_time_series_v4_1day WHERE metric_name = 'postgresql.db_size' AND value != '' ORDER BY value",
      "customValue": "",
      "textboxValue": "",
      "showALLOption": true,
      "multiSelect": false,
      "order": 1,
      "modificationUUID": "",
      "sort": "ASC",
      "selectedValue": null
    }
  },
  "widgets": [
    {
      "id": "pg-connections",
      "title": "PostgreSQL - Active Connections",
      "description": "Number of active PostgreSQL connections",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": {
                "key": "postgresql.backends",
                "dataType": "float64",
                "type": "Gauge",
                "isColumn": false
              },
              "timeAggregation": "latest",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "key": {
                      "key": "postgresql.database.name",
                      "dataType": "string",
                      "type": "resource",
                      "isColumn": false
                    },
                    "op": "=",
                    "value": "{{.database}}"
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "postgresql.database.name",
                  "dataType": "string",
                  "type": "resource",
                  "isColumn": false
                }
              ],
              "legend": "{{postgresql.database.name}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "none"
    },
    {
      "id": "pg-db-size",
      "title": "PostgreSQL - Database Size",
      "description": "Size of PostgreSQL databases in bytes",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": {
                "key": "postgresql.db_size",
                "dataType": "int64",
                "type": "Gauge",
                "isColumn": false
              },
              "timeAggregation": "latest",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "key": {
                      "key": "postgresql.database.name",
                      "dataType": "string",
                      "type": "resource",
                      "isColumn": false
                    },
                    "op": "=",
                    "value": "{{.database}}"
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "postgresql.database.name",
                  "dataType": "string",
                  "type": "resource",
                  "isColumn": false
                }
              ],
              "legend": "{{postgresql.database.name}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "bytes"
    },
    {
      "id": "redis-connected-clients",
      "title": "Redis - Connected Clients",
      "description": "Number of clients connected to Redis",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "avg",
              "aggregateAttribute": {
                "key": "redis.clients.connected",
                "dataType": "int64",
                "type": "Gauge",
                "isColumn": false
              },
              "timeAggregation": "latest",
              "spaceAggregation": "avg",
              "functions": [],
              "filters": {
                "items": [],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "host.name",
                  "dataType": "string",
                  "type": "resource",
                  "isColumn": false
                }
              ],
              "legend": "{{host.name}}",
              "reduceTo": "avg"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "none"
    },
    {
      "id": "redis-memory",
      "title": "Redis - Memory Usage",
      "description": "Redis memory usage in bytes",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "avg",
              "aggregateAttribute": {
                "key": "redis.memory.used",
                "dataType": "int64",
                "type": "Gauge",
                "isColumn": false
              },
              "timeAggregation": "latest",
              "spaceAggregation": "avg",
              "functions": [],
              "filters": {
                "items": [],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "host.name",
                  "dataType": "string",
                  "type": "resource",
                  "isColumn": false
                }
              ],
              "legend": "{{host.name}}",
              "reduceTo": "avg"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "bytes"
    },
    {
      "id": "rabbitmq-messages",
      "title": "RabbitMQ - Current Messages",
      "description": "Number of messages currently in RabbitMQ queues",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": {
                "key": "rabbitmq.message.current",
                "dataType": "int64",
                "type": "Gauge",
                "isColumn": false
              },
              "timeAggregation": "latest",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": {
                "items": [],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "queue",
                  "dataType": "string",
                  "type": "tag",
                  "isColumn": false
                }
              ],
              "legend": "Queue: {{queue}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "none"
    },
    {
      "id": "rabbitmq-consumers",
      "title": "RabbitMQ - Consumer Count",
      "description": "Number of consumers per queue",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": {
                "key": "rabbitmq.consumer.count",
                "dataType": "int64",
                "type": "Gauge",
                "isColumn": false
              },
              "timeAggregation": "latest",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": {
                "items": [],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "queue",
                  "dataType": "string",
                  "type": "tag",
                  "isColumn": false
                }
              ],
              "legend": "Queue: {{queue}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "none"
    }
  ]
}
348
infrastructure/monitoring/signoz/dashboards/error-tracking.json
Normal file
@@ -0,0 +1,348 @@
{
  "description": "Comprehensive error tracking and analysis dashboard",
  "tags": ["errors", "exceptions", "tracking"],
  "name": "bakery-ia-error-tracking",
  "title": "Bakery IA - Error Tracking",
  "uploadedGrafana": false,
  "uuid": "bakery-ia-errors-01",
  "version": "v4",
  "collapsableRowsMigrated": true,
  "layout": [
    {
      "x": 0,
      "y": 0,
      "w": 6,
      "h": 3,
      "i": "total-errors",
      "moved": false,
      "static": false
    },
    {
      "x": 6,
      "y": 0,
      "w": 6,
      "h": 3,
      "i": "error-rate",
      "moved": false,
      "static": false
    },
    {
      "x": 0,
      "y": 3,
      "w": 6,
      "h": 3,
      "i": "http-5xx",
      "moved": false,
      "static": false
    },
    {
      "x": 6,
      "y": 3,
      "w": 6,
      "h": 3,
      "i": "http-4xx",
      "moved": false,
      "static": false
    }
  ],
  "variables": {
    "service": {
      "id": "service-var",
      "name": "service",
      "description": "Filter by service name",
      "type": "QUERY",
      "queryValue": "SELECT DISTINCT(resource_attrs['service.name']) as value FROM signoz_metrics.distributed_time_series_v4_1day WHERE metric_name = 'error_total' AND value != '' ORDER BY value",
      "customValue": "",
      "textboxValue": "",
      "showALLOption": true,
      "multiSelect": false,
      "order": 1,
      "modificationUUID": "",
      "sort": "ASC",
      "selectedValue": null
    }
  },
  "widgets": [
    {
      "id": "total-errors",
      "title": "Total Errors",
      "description": "Total number of errors across all services",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "value",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": {
                "key": "error_total",
                "dataType": "int64",
                "type": "Counter",
                "isColumn": false
              },
              "timeAggregation": "sum",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "key": {
                      "key": "service.name",
                      "dataType": "string",
                      "type": "resource",
                      "isColumn": false
                    },
                    "op": "=",
                    "value": "{{.service}}"
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [],
              "legend": "Total Errors",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "none"
    },
    {
      "id": "error-rate",
      "title": "Error Rate",
      "description": "Error rate over time",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": {
                "key": "error_total",
                "dataType": "int64",
                "type": "Counter",
                "isColumn": false
              },
              "timeAggregation": "rate",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "key": {
                      "key": "service.name",
                      "dataType": "string",
                      "type": "resource",
                      "isColumn": false
                    },
                    "op": "=",
                    "value": "{{.service}}"
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "serviceName",
                  "dataType": "string",
                  "type": "tag",
                  "isColumn": true
                }
              ],
              "legend": "{{serviceName}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "errors/s"
    },
    {
      "id": "http-5xx",
      "title": "HTTP 5xx Errors",
      "description": "Server errors (5xx status codes)",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": {
                "key": "http_server_requests_seconds_count",
                "dataType": "int64",
                "type": "Counter",
                "isColumn": false
              },
              "timeAggregation": "sum",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "key": {
                      "key": "service.name",
                      "dataType": "string",
                      "type": "resource",
                      "isColumn": false
                    },
                    "op": "=",
                    "value": "{{.service}}"
                  },
                  {
                    "key": {
                      "key": "status_code",
                      "dataType": "string",
                      "type": "tag",
                      "isColumn": false
                    },
                    "op": "=~",
                    "value": "5.."
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "serviceName",
                  "dataType": "string",
                  "type": "tag",
                  "isColumn": true
                },
                {
                  "key": "status_code",
                  "dataType": "string",
                  "type": "tag",
                  "isColumn": false
                }
              ],
              "legend": "{{serviceName}} - {{status_code}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "number"
    },
    {
      "id": "http-4xx",
      "title": "HTTP 4xx Errors",
      "description": "Client errors (4xx status codes)",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": {
                "key": "http_server_requests_seconds_count",
                "dataType": "int64",
                "type": "Counter",
                "isColumn": false
              },
              "timeAggregation": "sum",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "key": {
                      "key": "service.name",
                      "dataType": "string",
                      "type": "resource",
                      "isColumn": false
                    },
                    "op": "=",
                    "value": "{{.service}}"
                  },
                  {
                    "key": {
                      "key": "status_code",
                      "dataType": "string",
                      "type": "tag",
                      "isColumn": false
                    },
                    "op": "=~",
                    "value": "4.."
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "serviceName",
                  "dataType": "string",
                  "type": "tag",
                  "isColumn": true
                },
                {
                  "key": "status_code",
                  "dataType": "string",
                  "type": "tag",
                  "isColumn": false
                }
              ],
              "legend": "{{serviceName}} - {{status_code}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "number"
    }
  ]
}
213
infrastructure/monitoring/signoz/dashboards/index.json
Normal file
@@ -0,0 +1,213 @@
{
  "name": "Bakery IA Dashboard Collection",
  "description": "Complete set of SigNoz dashboards for Bakery IA monitoring",
  "version": "1.0.0",
  "author": "Bakery IA Team",
  "license": "MIT",
  "dashboards": [
    {
      "id": "infrastructure-monitoring",
      "name": "Infrastructure Monitoring",
      "description": "Kubernetes infrastructure and resource monitoring",
      "file": "infrastructure-monitoring.json",
      "tags": ["infrastructure", "kubernetes", "system"],
      "category": "infrastructure"
    },
    {
      "id": "application-performance",
      "name": "Application Performance",
      "description": "Microservice performance and API metrics",
      "file": "application-performance.json",
      "tags": ["application", "performance", "apm"],
      "category": "performance"
    },
    {
      "id": "database-performance",
      "name": "Database Performance",
      "description": "PostgreSQL and Redis database monitoring",
      "file": "database-performance.json",
      "tags": ["database", "postgresql", "redis"],
      "category": "database"
    },
    {
      "id": "api-performance",
      "name": "API Performance",
      "description": "REST and GraphQL API performance monitoring",
      "file": "api-performance.json",
      "tags": ["api", "rest", "graphql"],
      "category": "api"
    },
    {
      "id": "error-tracking",
      "name": "Error Tracking",
      "description": "System error tracking and analysis",
      "file": "error-tracking.json",
      "tags": ["errors", "exceptions", "tracking"],
      "category": "monitoring"
    },
    {
      "id": "user-activity",
      "name": "User Activity",
      "description": "User behavior and activity monitoring",
      "file": "user-activity.json",
      "tags": ["user", "activity", "behavior"],
      "category": "user"
    },
    {
      "id": "system-health",
      "name": "System Health",
      "description": "Overall system health monitoring",
      "file": "system-health.json",
      "tags": ["system", "health", "overview"],
      "category": "overview"
    },
    {
      "id": "alert-management",
      "name": "Alert Management",
      "description": "Alert monitoring and management",
      "file": "alert-management.json",
      "tags": ["alerts", "notifications", "management"],
      "category": "alerts"
    },
    {
      "id": "log-analysis",
      "name": "Log Analysis",
      "description": "Log search and analysis",
      "file": "log-analysis.json",
      "tags": ["logs", "search", "analysis"],
      "category": "logs"
    }
  ],
  "categories": [
    {
      "id": "infrastructure",
      "name": "Infrastructure",
      "description": "Kubernetes and system infrastructure monitoring"
    },
    {
      "id": "performance",
      "name": "Performance",
      "description": "Application and service performance monitoring"
    },
    {
      "id": "database",
      "name": "Database",
      "description": "Database performance and health monitoring"
    },
    {
      "id": "api",
      "name": "API",
      "description": "API performance and usage monitoring"
    },
    {
      "id": "monitoring",
      "name": "Monitoring",
      "description": "Error tracking and system monitoring"
    },
    {
      "id": "user",
      "name": "User",
      "description": "User activity and behavior monitoring"
    },
    {
      "id": "overview",
      "name": "Overview",
      "description": "System-wide overview and health dashboards"
    },
    {
      "id": "alerts",
      "name": "Alerts",
      "description": "Alert management and monitoring"
    },
    {
      "id": "logs",
      "name": "Logs",
      "description": "Log analysis and search"
    }
  ],
  "usage": {
    "import_methods": [
      "ui_import",
      "api_import",
      "kubernetes_configmap"
    ],
    "recommended_import_order": [
      "infrastructure-monitoring",
      "system-health",
      "application-performance",
      "api-performance",
      "database-performance",
      "error-tracking",
      "alert-management",
      "log-analysis",
      "user-activity"
    ]
  },
  "requirements": {
    "signoz_version": ">= 0.10.0",
    "opentelemetry_collector": ">= 0.45.0",
    "metrics": [
      "container_cpu_usage_seconds_total",
      "container_memory_working_set_bytes",
      "http_server_requests_seconds_count",
      "http_server_requests_seconds_sum",
      "pg_stat_activity_count",
      "pg_stat_statements_total_time",
      "error_total",
      "alerts_total",
      "kube_pod_status_phase",
      "container_network_receive_bytes_total",
      "kube_pod_container_status_restarts_total",
      "kube_pod_container_status_ready",
      "container_fs_reads_total",
      "kubernetes_events",
      "http_server_requests_seconds_bucket",
      "http_server_active_requests",
      "http_server_up",
      "db_query_duration_seconds_sum",
      "db_connections_active",
      "http_client_request_duration_seconds_count",
      "http_client_request_duration_seconds_sum",
      "graphql_execution_time_seconds",
      "graphql_errors_total",
      "pg_stat_database_blks_hit",
      "pg_stat_database_xact_commit",
      "pg_locks_count",
      "pg_table_size_bytes",
      "pg_stat_user_tables_seq_scan",
      "redis_memory_used_bytes",
      "redis_commands_processed_total",
      "redis_keyspace_hits",
      "pg_stat_database_deadlocks",
      "pg_stat_database_conn_errors",
      "pg_replication_lag_bytes",
      "pg_replication_is_replica",
      "active_users",
      "user_sessions_total",
      "api_calls_per_user",
      "session_duration_seconds",
      "system_availability",
      "service_health_score",
      "system_cpu_usage",
      "system_memory_usage",
      "service_availability",
      "alerts_active",
      "log_lines_total"
    ]
  },
  "support": {
    "documentation": "https://signoz.io/docs/",
    "bakery_ia_docs": "../SIGNOZ_COMPLETE_CONFIGURATION_GUIDE.md",
    "issues": "https://github.com/your-repo/issues"
  },
  "notes": {
    "format_fix": "All dashboards have been updated to use the correct SigNoz JSON schema with proper filter arrays to resolve the 'e.filter is not a function' error.",
    "compatibility": "Tested with SigNoz v0.10.0+ and OpenTelemetry Collector v0.45.0+",
    "customization": "You can customize these dashboards by editing the JSON files or cloning them in the SigNoz UI"
  }
}
@@ -0,0 +1,437 @@
{
  "description": "Comprehensive infrastructure monitoring dashboard for Bakery IA Kubernetes cluster",
  "tags": ["infrastructure", "kubernetes", "k8s", "system"],
  "name": "bakery-ia-infrastructure-monitoring",
  "title": "Bakery IA - Infrastructure Monitoring",
  "uploadedGrafana": false,
  "uuid": "bakery-ia-infra-01",
  "version": "v4",
  "collapsableRowsMigrated": true,
  "layout": [
    {
      "x": 0,
      "y": 0,
      "w": 6,
      "h": 3,
      "i": "pod-count",
      "moved": false,
      "static": false
    },
    {
      "x": 6,
      "y": 0,
      "w": 6,
      "h": 3,
      "i": "pod-phase",
      "moved": false,
      "static": false
    },
    {
      "x": 0,
      "y": 3,
      "w": 6,
      "h": 3,
      "i": "container-restarts",
      "moved": false,
      "static": false
    },
    {
      "x": 6,
      "y": 3,
      "w": 6,
      "h": 3,
      "i": "node-condition",
      "moved": false,
      "static": false
    },
    {
      "x": 0,
      "y": 6,
      "w": 12,
      "h": 3,
      "i": "deployment-status",
      "moved": false,
      "static": false
    }
  ],
  "variables": {
    "namespace": {
      "id": "namespace-var",
      "name": "namespace",
      "description": "Filter by Kubernetes namespace",
      "type": "QUERY",
      "queryValue": "SELECT DISTINCT(resource_attrs['k8s.namespace.name']) as value FROM signoz_metrics.distributed_time_series_v4_1day WHERE metric_name = 'k8s.pod.phase' AND value != '' ORDER BY value",
      "customValue": "",
      "textboxValue": "",
      "showALLOption": true,
      "multiSelect": false,
      "order": 1,
      "modificationUUID": "",
      "sort": "ASC",
      "selectedValue": "bakery-ia"
    }
  },
  "widgets": [
    {
      "id": "pod-count",
      "title": "Total Pods",
      "description": "Total number of pods in the namespace",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "value",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "count",
              "aggregateAttribute": {
                "key": "k8s.pod.phase",
                "dataType": "int64",
                "type": "Gauge",
                "isColumn": false
              },
              "timeAggregation": "latest",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "id": "filter-k8s-namespace",
                    "key": {
                      "id": "k8s.namespace.name--string--tag--false",
                      "key": "k8s.namespace.name",
                      "dataType": "string",
                      "type": "tag",
                      "isColumn": false
                    },
                    "op": "=",
                    "value": "{{.namespace}}"
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [],
              "legend": "Total Pods",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "none"
    },
    {
      "id": "pod-phase",
      "title": "Pod Phase Distribution",
      "description": "Pods by phase (Running, Pending, Failed, etc.)",
      "isStacked": true,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": {
                "key": "k8s.pod.phase",
                "dataType": "int64",
                "type": "Gauge",
                "isColumn": false
              },
              "timeAggregation": "latest",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "id": "filter-k8s-namespace",
                    "key": {
                      "id": "k8s.namespace.name--string--tag--false",
                      "key": "k8s.namespace.name",
                      "dataType": "string",
                      "type": "tag",
                      "isColumn": false
                    },
                    "op": "=",
                    "value": "{{.namespace}}"
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "key": "phase",
                  "dataType": "string",
                  "type": "tag",
                  "isColumn": false
                }
              ],
              "legend": "{{phase}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "none"
    },
    {
      "id": "container-restarts",
      "title": "Container Restarts",
      "description": "Container restart count over time",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": {
                "key": "k8s.container.restarts",
                "dataType": "int64",
                "type": "Gauge",
                "isColumn": false
              },
              "timeAggregation": "increase",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": {
                "items": [
                  {
                    "id": "filter-k8s-namespace",
                    "key": {
                      "id": "k8s.namespace.name--string--tag--false",
                      "key": "k8s.namespace.name",
                      "dataType": "string",
                      "type": "tag",
                      "isColumn": false
                    },
                    "op": "=",
                    "value": "{{.namespace}}"
                  }
                ],
                "op": "AND"
              },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [
                {
                  "id": "k8s.pod.name--string--tag--false",
                  "key": "k8s.pod.name",
                  "dataType": "string",
                  "type": "tag",
                  "isColumn": false
                }
              ],
              "legend": "{{k8s.pod.name}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "none"
    },
    {
      "id": "node-condition",
      "title": "Node Conditions",
      "description": "Node condition status (Ready, MemoryPressure, DiskPressure, etc.)",
|
||||
"isStacked": true,
|
||||
"nullZeroValues": "zero",
|
||||
"opacity": "1",
|
||||
"panelTypes": "graph",
|
||||
"query": {
|
||||
"builder": {
|
||||
"queryData": [
|
||||
{
|
||||
"dataSource": "metrics",
|
||||
"queryName": "A",
|
||||
"aggregateOperator": "sum",
|
||||
"aggregateAttribute": {
|
||||
"key": "k8s.node.condition_ready",
|
||||
"dataType": "int64",
|
||||
"type": "Gauge",
|
||||
"isColumn": false
|
||||
},
|
||||
"timeAggregation": "latest",
|
||||
"spaceAggregation": "sum",
|
||||
"functions": [],
|
||||
"filters": {
|
||||
"items": [],
|
||||
"op": "AND"
|
||||
},
|
||||
"expression": "A",
|
||||
"disabled": false,
|
||||
"having": [],
|
||||
"stepInterval": 60,
|
||||
"limit": null,
|
||||
"orderBy": [],
|
||||
"groupBy": [
|
||||
{
|
||||
"id": "k8s.node.name--string--tag--false",
|
||||
"key": "k8s.node.name",
|
||||
"dataType": "string",
|
||||
"type": "tag",
|
||||
"isColumn": false
|
||||
}
|
||||
],
|
||||
"legend": "{{k8s.node.name}} Ready",
|
||||
"reduceTo": "sum"
|
||||
}
|
||||
],
|
||||
"queryFormulas": []
|
||||
},
|
||||
"queryType": "builder"
|
||||
},
|
||||
"fillSpans": false,
|
||||
"yAxisUnit": "none"
|
||||
},
|
||||
{
|
||||
"id": "deployment-status",
|
||||
"title": "Deployment Status (Desired vs Available)",
|
||||
"description": "Deployment replicas: desired vs available",
|
||||
"isStacked": false,
|
||||
"nullZeroValues": "zero",
|
||||
"opacity": "1",
|
||||
"panelTypes": "graph",
|
||||
"query": {
|
||||
"builder": {
|
||||
"queryData": [
|
||||
{
|
||||
"dataSource": "metrics",
|
||||
"queryName": "A",
|
||||
"aggregateOperator": "avg",
|
||||
"aggregateAttribute": {
|
||||
"key": "k8s.deployment.desired",
|
||||
"dataType": "int64",
|
||||
"type": "Gauge",
|
||||
"isColumn": false
|
||||
},
|
||||
"timeAggregation": "latest",
|
||||
"spaceAggregation": "avg",
|
||||
"functions": [],
|
||||
"filters": {
|
||||
"items": [
|
||||
{
|
||||
"id": "filter-k8s-namespace",
|
||||
"key": {
|
||||
"id": "k8s.namespace.name--string--tag--false",
|
||||
"key": "k8s.namespace.name",
|
||||
"dataType": "string",
|
||||
"type": "tag",
|
||||
"isColumn": false
|
||||
},
|
||||
"op": "=",
|
||||
"value": "{{.namespace}}"
|
||||
}
|
||||
],
|
||||
"op": "AND"
|
||||
},
|
||||
"expression": "A",
|
||||
"disabled": false,
|
||||
"having": [],
|
||||
"stepInterval": 60,
|
||||
"limit": null,
|
||||
"orderBy": [],
|
||||
"groupBy": [
|
||||
{
|
||||
"id": "k8s.deployment.name--string--tag--false",
|
||||
"key": "k8s.deployment.name",
|
||||
"dataType": "string",
|
||||
"type": "tag",
|
||||
"isColumn": false
|
||||
}
|
||||
],
|
||||
"legend": "{{k8s.deployment.name}} (desired)",
|
||||
"reduceTo": "avg"
|
||||
},
|
||||
{
|
||||
"dataSource": "metrics",
|
||||
"queryName": "B",
|
||||
"aggregateOperator": "avg",
|
||||
"aggregateAttribute": {
|
||||
"key": "k8s.deployment.available",
|
||||
"dataType": "int64",
|
||||
"type": "Gauge",
|
||||
"isColumn": false
|
||||
},
|
||||
"timeAggregation": "latest",
|
||||
"spaceAggregation": "avg",
|
||||
"functions": [],
|
||||
"filters": {
|
||||
"items": [
|
||||
{
|
||||
"id": "filter-k8s-namespace",
|
||||
"key": {
|
||||
"id": "k8s.namespace.name--string--tag--false",
|
||||
"key": "k8s.namespace.name",
|
||||
"dataType": "string",
|
||||
"type": "tag",
|
||||
"isColumn": false
|
||||
},
|
||||
"op": "=",
|
||||
"value": "{{.namespace}}"
|
||||
}
|
||||
],
|
||||
"op": "AND"
|
||||
},
|
||||
"expression": "B",
|
||||
"disabled": false,
|
||||
"having": [],
|
||||
"stepInterval": 60,
|
||||
"limit": null,
|
||||
"orderBy": [],
|
||||
"groupBy": [
|
||||
{
|
||||
"id": "k8s.deployment.name--string--tag--false",
|
||||
"key": "k8s.deployment.name",
|
||||
"dataType": "string",
|
||||
"type": "tag",
|
||||
"isColumn": false
|
||||
}
|
||||
],
|
||||
"legend": "{{k8s.deployment.name}} (available)",
|
||||
"reduceTo": "avg"
|
||||
}
|
||||
],
|
||||
"queryFormulas": []
|
||||
},
|
||||
"queryType": "builder"
|
||||
},
|
||||
"fillSpans": false,
|
||||
"yAxisUnit": "none"
|
||||
}
|
||||
]
|
||||
}
|
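Every widget in this dashboard repeats the same `k8s.namespace.name` filter driven by the `{{.namespace}}` variable. Before importing a dashboard file it can help to confirm it parses as JSON and actually declares widgets; a minimal sketch (the `check_dashboard` helper name and the `/tmp` demo file are illustrative, and `python3` is assumed to be available):

```shell
# Parse a dashboard file and report its widget count; exits non-zero on bad JSON.
check_dashboard() {
  python3 - "$1" <<'PY'
import json, sys
d = json.load(open(sys.argv[1]))
print(f"{sys.argv[1]}: {len(d.get('widgets', []))} widgets")
PY
}

# Demo against a trivial file; the real files live under
# infrastructure/monitoring/signoz/dashboards/.
printf '{"widgets": [{"id": "a"}, {"id": "b"}]}' > /tmp/demo-dashboard.json
check_dashboard /tmp/demo-dashboard.json
```

Running the same helper over each file in the dashboards directory gives a quick pre-import sanity check.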
333
infrastructure/monitoring/signoz/dashboards/log-analysis.json
Normal file
@@ -0,0 +1,333 @@
{
  "description": "Comprehensive log analysis and search dashboard",
  "tags": ["logs", "analysis", "search"],
  "name": "bakery-ia-log-analysis",
  "title": "Bakery IA - Log Analysis",
  "uploadedGrafana": false,
  "uuid": "bakery-ia-logs-01",
  "version": "v4",
  "collapsableRowsMigrated": true,
  "layout": [
    { "x": 0, "y": 0, "w": 6, "h": 3, "i": "log-volume", "moved": false, "static": false },
    { "x": 6, "y": 0, "w": 6, "h": 3, "i": "error-logs", "moved": false, "static": false },
    { "x": 0, "y": 3, "w": 6, "h": 3, "i": "logs-by-level", "moved": false, "static": false },
    { "x": 6, "y": 3, "w": 6, "h": 3, "i": "logs-by-service", "moved": false, "static": false }
  ],
  "variables": {
    "service": {
      "id": "service-var",
      "name": "service",
      "description": "Filter by service name",
      "type": "QUERY",
      "queryValue": "SELECT DISTINCT(resource_attrs['service.name']) as value FROM signoz_metrics.distributed_time_series_v4_1day WHERE metric_name = 'log_lines_total' AND value != '' ORDER BY value",
      "customValue": "",
      "textboxValue": "",
      "showALLOption": true,
      "multiSelect": false,
      "order": 1,
      "modificationUUID": "",
      "sort": "ASC",
      "selectedValue": null
    }
  },
  "widgets": [
    {
      "id": "log-volume",
      "title": "Log Volume",
      "description": "Total log volume by service",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": { "key": "log_lines_total", "dataType": "int64", "type": "Counter", "isColumn": false },
              "timeAggregation": "rate",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": { "items": [ { "key": { "key": "serviceName", "dataType": "string", "type": "tag", "isColumn": true }, "op": "=", "value": "{{.service}}" } ], "op": "AND" },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [ { "key": "serviceName", "dataType": "string", "type": "tag", "isColumn": true } ],
              "legend": "{{serviceName}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "logs/s"
    },
    {
      "id": "error-logs",
      "title": "Error Logs",
      "description": "Error log volume by service",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": { "key": "log_lines_total", "dataType": "int64", "type": "Counter", "isColumn": false },
              "timeAggregation": "rate",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": { "items": [ { "key": { "key": "serviceName", "dataType": "string", "type": "tag", "isColumn": true }, "op": "=", "value": "{{.service}}" }, { "key": { "key": "level", "dataType": "string", "type": "tag", "isColumn": false }, "op": "=", "value": "error" } ], "op": "AND" },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [ { "key": "serviceName", "dataType": "string", "type": "tag", "isColumn": true } ],
              "legend": "{{serviceName}} (errors)",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "logs/s"
    },
    {
      "id": "logs-by-level",
      "title": "Logs by Level",
      "description": "Distribution of logs by severity level",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "pie",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": { "key": "log_lines_total", "dataType": "int64", "type": "Counter", "isColumn": false },
              "timeAggregation": "sum",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": { "items": [ { "key": { "key": "serviceName", "dataType": "string", "type": "tag", "isColumn": true }, "op": "=", "value": "{{.service}}" } ], "op": "AND" },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [ { "key": "level", "dataType": "string", "type": "tag", "isColumn": false } ],
              "legend": "{{level}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "none"
    },
    {
      "id": "logs-by-service",
      "title": "Logs by Service",
      "description": "Distribution of logs by service",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "pie",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "sum",
              "aggregateAttribute": { "key": "log_lines_total", "dataType": "int64", "type": "Counter", "isColumn": false },
              "timeAggregation": "sum",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": { "items": [ { "key": { "key": "serviceName", "dataType": "string", "type": "tag", "isColumn": true }, "op": "=", "value": "{{.service}}" } ], "op": "AND" },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [ { "key": "serviceName", "dataType": "string", "type": "tag", "isColumn": true } ],
              "legend": "{{serviceName}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "none"
    }
  ]
}
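The log-volume widgets above aggregate the `log_lines_total` counter with `"timeAggregation": "rate"` over a 60-second `stepInterval`. Conceptually, a rate is just the counter delta divided by the elapsed seconds; a toy illustration (the sample counter values are made up):

```shell
# Two counter samples taken one stepInterval (60 s) apart.
prev=1200
curr=1500
step=60
rate=$(( (curr - prev) / step ))
echo "$rate logs/s"
```

This is why the log-volume panels use `yAxisUnit: "logs/s"`, while the pie panels, which use plain `sum` aggregation, use `"none"`.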
303
infrastructure/monitoring/signoz/dashboards/system-health.json
Normal file
@@ -0,0 +1,303 @@
{
  "description": "Comprehensive system health monitoring dashboard",
  "tags": ["system", "health", "monitoring"],
  "name": "bakery-ia-system-health",
  "title": "Bakery IA - System Health",
  "uploadedGrafana": false,
  "uuid": "bakery-ia-health-01",
  "version": "v4",
  "collapsableRowsMigrated": true,
  "layout": [
    { "x": 0, "y": 0, "w": 6, "h": 3, "i": "system-availability", "moved": false, "static": false },
    { "x": 6, "y": 0, "w": 6, "h": 3, "i": "health-score", "moved": false, "static": false },
    { "x": 0, "y": 3, "w": 6, "h": 3, "i": "cpu-usage", "moved": false, "static": false },
    { "x": 6, "y": 3, "w": 6, "h": 3, "i": "memory-usage", "moved": false, "static": false }
  ],
  "variables": {
    "namespace": {
      "id": "namespace-var",
      "name": "namespace",
      "description": "Filter by Kubernetes namespace",
      "type": "QUERY",
      "queryValue": "SELECT DISTINCT(resource_attrs['k8s.namespace.name']) as value FROM signoz_metrics.distributed_time_series_v4_1day WHERE metric_name = 'system_availability' AND value != '' ORDER BY value",
      "customValue": "",
      "textboxValue": "",
      "showALLOption": true,
      "multiSelect": false,
      "order": 1,
      "modificationUUID": "",
      "sort": "ASC",
      "selectedValue": "bakery-ia"
    }
  },
  "widgets": [
    {
      "id": "system-availability",
      "title": "System Availability",
      "description": "Overall system availability percentage",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "value",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "avg",
              "aggregateAttribute": { "key": "system_availability", "dataType": "float64", "type": "Gauge", "isColumn": false },
              "timeAggregation": "latest",
              "spaceAggregation": "avg",
              "functions": [],
              "filters": { "items": [ { "id": "filter-k8s-namespace", "key": { "id": "k8s.namespace.name--string--tag--false", "key": "k8s.namespace.name", "dataType": "string", "type": "tag", "isColumn": false }, "op": "=", "value": "{{.namespace}}" } ], "op": "AND" },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [],
              "legend": "System Availability",
              "reduceTo": "avg"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "percent"
    },
    {
      "id": "health-score",
      "title": "Service Health Score",
      "description": "Overall service health score",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "value",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "avg",
              "aggregateAttribute": { "key": "service_health_score", "dataType": "float64", "type": "Gauge", "isColumn": false },
              "timeAggregation": "latest",
              "spaceAggregation": "avg",
              "functions": [],
              "filters": { "items": [ { "id": "filter-k8s-namespace", "key": { "id": "k8s.namespace.name--string--tag--false", "key": "k8s.namespace.name", "dataType": "string", "type": "tag", "isColumn": false }, "op": "=", "value": "{{.namespace}}" } ], "op": "AND" },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [],
              "legend": "Health Score",
              "reduceTo": "avg"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "none"
    },
    {
      "id": "cpu-usage",
      "title": "CPU Usage",
      "description": "System CPU usage over time",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "avg",
              "aggregateAttribute": { "key": "system_cpu_usage", "dataType": "float64", "type": "Gauge", "isColumn": false },
              "timeAggregation": "avg",
              "spaceAggregation": "avg",
              "functions": [],
              "filters": { "items": [ { "id": "filter-k8s-namespace", "key": { "id": "k8s.namespace.name--string--tag--false", "key": "k8s.namespace.name", "dataType": "string", "type": "tag", "isColumn": false }, "op": "=", "value": "{{.namespace}}" } ], "op": "AND" },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [],
              "legend": "CPU Usage",
              "reduceTo": "avg"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "percent"
    },
    {
      "id": "memory-usage",
      "title": "Memory Usage",
      "description": "System memory usage over time",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "metrics",
              "queryName": "A",
              "aggregateOperator": "avg",
              "aggregateAttribute": { "key": "system_memory_usage", "dataType": "float64", "type": "Gauge", "isColumn": false },
              "timeAggregation": "avg",
              "spaceAggregation": "avg",
              "functions": [],
              "filters": { "items": [ { "id": "filter-k8s-namespace", "key": { "id": "k8s.namespace.name--string--tag--false", "key": "k8s.namespace.name", "dataType": "string", "type": "tag", "isColumn": false }, "op": "=", "value": "{{.namespace}}" } ], "op": "AND" },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [],
              "legend": "Memory Usage",
              "reduceTo": "avg"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "percent"
    }
  ]
}
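Filters in these dashboards reference variables with Go-template syntax such as `{{.namespace}}`; SigNoz substitutes the value selected in the variable dropdown (defaulting here to `bakery-ia` via `selectedValue`) into the query at run time. A rough illustration of that substitution (SigNoz does this internally; `sed` is used here only for demonstration):

```shell
# A filter expression as stored in the dashboard JSON.
template='k8s.namespace.name = "{{.namespace}}"'
# The value the user picked in the dashboard's namespace dropdown.
namespace="bakery-ia"
# Replace the template placeholder with the selected value.
resolved=$(printf '%s' "$template" | sed "s/{{\.namespace}}/$namespace/")
echo "$resolved"
```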
429
infrastructure/monitoring/signoz/dashboards/user-activity.json
Normal file
@@ -0,0 +1,429 @@
{
  "description": "User activity and behavior monitoring dashboard",
  "tags": ["user", "activity", "behavior"],
  "name": "bakery-ia-user-activity",
  "title": "Bakery IA - User Activity",
  "uploadedGrafana": false,
  "uuid": "bakery-ia-user-01",
  "version": "v4",
  "collapsableRowsMigrated": true,
  "layout": [
    { "x": 0, "y": 0, "w": 6, "h": 3, "i": "active-users", "moved": false, "static": false },
    { "x": 6, "y": 0, "w": 6, "h": 3, "i": "user-sessions", "moved": false, "static": false },
    { "x": 0, "y": 3, "w": 6, "h": 3, "i": "user-actions", "moved": false, "static": false },
    { "x": 6, "y": 3, "w": 6, "h": 3, "i": "page-views", "moved": false, "static": false },
    { "x": 0, "y": 6, "w": 12, "h": 4, "i": "geo-visitors", "moved": false, "static": false }
  ],
  "variables": {
    "service": {
      "id": "service-var",
      "name": "service",
      "description": "Filter by service name",
      "type": "QUERY",
      "queryValue": "SELECT DISTINCT(serviceName) FROM signoz_traces.distributed_signoz_index_v2 ORDER BY serviceName",
      "customValue": "",
      "textboxValue": "",
      "showALLOption": true,
      "multiSelect": false,
      "order": 1,
      "modificationUUID": "",
      "sort": "ASC",
      "selectedValue": "bakery-frontend"
    }
  },
  "widgets": [
    {
      "id": "active-users",
      "title": "Active Users",
      "description": "Number of active users by service",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "traces",
              "queryName": "A",
              "aggregateOperator": "count_distinct",
              "aggregateAttribute": { "key": "user.id", "dataType": "string", "type": "tag", "isColumn": true },
              "timeAggregation": "count_distinct",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": { "items": [ { "key": { "key": "serviceName", "dataType": "string", "type": "tag", "isColumn": true }, "op": "=", "value": "{{.service}}" } ], "op": "AND" },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [ { "key": "serviceName", "dataType": "string", "type": "tag", "isColumn": true } ],
              "legend": "{{serviceName}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "none"
    },
    {
      "id": "user-sessions",
      "title": "User Sessions",
      "description": "Total user sessions by service",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "traces",
              "queryName": "A",
              "aggregateOperator": "count",
              "aggregateAttribute": { "key": "session.id", "dataType": "string", "type": "tag", "isColumn": true },
              "timeAggregation": "count",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": { "items": [ { "key": { "key": "serviceName", "dataType": "string", "type": "tag", "isColumn": true }, "op": "=", "value": "{{.service}}" }, { "key": { "key": "span.name", "dataType": "string", "type": "tag", "isColumn": true }, "op": "=", "value": "user_session" } ], "op": "AND" },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [ { "key": "serviceName", "dataType": "string", "type": "tag", "isColumn": true } ],
              "legend": "{{serviceName}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "none"
    },
    {
      "id": "user-actions",
      "title": "User Actions",
      "description": "Total user actions by service",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "traces",
              "queryName": "A",
              "aggregateOperator": "count",
              "aggregateAttribute": { "key": "user.action", "dataType": "string", "type": "tag", "isColumn": true },
              "timeAggregation": "count",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": { "items": [ { "key": { "key": "serviceName", "dataType": "string", "type": "tag", "isColumn": true }, "op": "=", "value": "{{.service}}" }, { "key": { "key": "span.name", "dataType": "string", "type": "tag", "isColumn": true }, "op": "=", "value": "user_action" } ], "op": "AND" },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [ { "key": "serviceName", "dataType": "string", "type": "tag", "isColumn": true } ],
              "legend": "{{serviceName}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "none"
    },
    {
      "id": "page-views",
      "title": "Page Views",
      "description": "Total page views by service",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "graph",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "traces",
              "queryName": "A",
              "aggregateOperator": "count",
              "aggregateAttribute": { "key": "page.path", "dataType": "string", "type": "tag", "isColumn": true },
              "timeAggregation": "count",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": { "items": [ { "key": { "key": "serviceName", "dataType": "string", "type": "tag", "isColumn": true }, "op": "=", "value": "{{.service}}" }, { "key": { "key": "span.name", "dataType": "string", "type": "tag", "isColumn": true }, "op": "=", "value": "page_view" } ], "op": "AND" },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [ { "key": "serviceName", "dataType": "string", "type": "tag", "isColumn": true } ],
              "legend": "{{serviceName}}",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "none"
    },
    {
      "id": "geo-visitors",
      "title": "Geolocation Visitors",
      "description": "Number of visitors who shared location data",
      "isStacked": false,
      "nullZeroValues": "zero",
      "opacity": "1",
      "panelTypes": "value",
      "query": {
        "builder": {
          "queryData": [
            {
              "dataSource": "traces",
              "queryName": "A",
              "aggregateOperator": "count",
              "aggregateAttribute": { "key": "user.id", "dataType": "string", "type": "tag", "isColumn": true },
              "timeAggregation": "count",
              "spaceAggregation": "sum",
              "functions": [],
              "filters": { "items": [ { "key": { "key": "serviceName", "dataType": "string", "type": "tag", "isColumn": true }, "op": "=", "value": "{{.service}}" }, { "key": { "key": "span.name", "dataType": "string", "type": "tag", "isColumn": true }, "op": "=", "value": "user_location" } ], "op": "AND" },
              "expression": "A",
              "disabled": false,
              "having": [],
              "stepInterval": 60,
              "limit": null,
              "orderBy": [],
              "groupBy": [],
              "legend": "Visitors with Location Data (See GEOLOCATION_VISUALIZATION_GUIDE.md for map integration)",
              "reduceTo": "sum"
            }
          ],
          "queryFormulas": []
        },
        "queryType": "builder"
      },
      "fillSpans": false,
      "yAxisUnit": "none"
    }
  ]
}
392
infrastructure/monitoring/signoz/deploy-signoz.sh
Executable file
@@ -0,0 +1,392 @@
#!/bin/bash

# ============================================================================
# SigNoz Deployment Script for Bakery IA
# ============================================================================
# This script deploys the SigNoz monitoring stack using Helm.
# It supports both development and production environments.
# ============================================================================

set -e

# Color codes for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Function to display help
show_help() {
    echo "Usage: $0 [OPTIONS] ENVIRONMENT"
    echo ""
    echo "Deploy the SigNoz monitoring stack for Bakery IA"
    echo ""
    echo "Arguments:
  ENVIRONMENT                Environment to deploy to (dev|prod)"
    echo ""
    echo "Options:
  -h, --help                 Show this help message
  -d, --dry-run              Show what would be done without actually deploying
  -u, --upgrade              Upgrade an existing deployment
  -r, --remove               Remove/uninstall the SigNoz deployment
  -n, --namespace NAMESPACE  Specify namespace (default: bakery-ia)"
    echo ""
    echo "Examples:
  $0 dev                     # Deploy to development
  $0 prod                    # Deploy to production
  $0 --upgrade prod          # Upgrade production deployment
  $0 --remove dev            # Remove development deployment"
    echo ""
    echo "Docker Hub Authentication:"
    echo "  This script automatically creates a Docker Hub secret for image pulls."
    echo "  Provide credentials via environment variables (recommended):"
    echo "    export DOCKERHUB_USERNAME='your-username'"
    echo "    export DOCKERHUB_PASSWORD='your-personal-access-token'"
    echo "  Or ensure you're logged in with the Docker CLI:"
    echo "    docker login"
}

# Parse command line arguments
DRY_RUN=false
UPGRADE=false
REMOVE=false
NAMESPACE="bakery-ia"

while [[ $# -gt 0 ]]; do
    case $1 in
        -h|--help)
            show_help
            exit 0
            ;;
        -d|--dry-run)
            DRY_RUN=true
            shift
            ;;
        -u|--upgrade)
            UPGRADE=true
            shift
            ;;
        -r|--remove)
            REMOVE=true
            shift
            ;;
        -n|--namespace)
            NAMESPACE="$2"
            shift 2
            ;;
        dev|prod)
            ENVIRONMENT="$1"
            shift
            ;;
        *)
            echo "Unknown argument: $1"
            show_help
            exit 1
            ;;
    esac
done

# Validate environment
if [[ -z "$ENVIRONMENT" ]]; then
    echo "Error: Environment not specified. Use 'dev' or 'prod'."
    show_help
    exit 1
fi

if [[ "$ENVIRONMENT" != "dev" && "$ENVIRONMENT" != "prod" ]]; then
    echo "Error: Invalid environment. Use 'dev' or 'prod'."
    exit 1
fi

# Check that Helm is installed
check_helm() {
    if ! command -v helm &> /dev/null; then
        echo -e "${RED}Error: Helm is not installed. Please install Helm first.${NC}"
        echo "Installation instructions: https://helm.sh/docs/intro/install/"
        exit 1
    fi
}

# Check that kubectl is configured
check_kubectl() {
    if ! kubectl cluster-info &> /dev/null; then
        echo -e "${RED}Error: kubectl is not configured or cannot connect to the cluster.${NC}"
        echo "Please ensure you have access to a Kubernetes cluster."
        exit 1
    fi
}

# Check whether the namespace exists; create it if not
ensure_namespace() {
    if ! kubectl get namespace "$NAMESPACE" &> /dev/null; then
        echo -e "${BLUE}Creating namespace $NAMESPACE...${NC}"
        if [[ "$DRY_RUN" == true ]]; then
            echo "  (dry-run) Would create namespace $NAMESPACE"
        else
            kubectl create namespace "$NAMESPACE"
            echo -e "${GREEN}Namespace $NAMESPACE created.${NC}"
        fi
    else
        echo -e "${BLUE}Namespace $NAMESPACE already exists.${NC}"
    fi
}

# Create a Docker Hub secret for image pulls
create_dockerhub_secret() {
    echo -e "${BLUE}Setting up Docker Hub image pull secret...${NC}"

    if [[ "$DRY_RUN" == true ]]; then
        echo "  (dry-run) Would create Docker Hub secret in namespace $NAMESPACE"
        return
    fi

    # Skip if the secret already exists
    if kubectl get secret dockerhub-creds -n "$NAMESPACE" &> /dev/null; then
        echo -e "${GREEN}Docker Hub secret already exists in namespace $NAMESPACE.${NC}"
        return
    fi

    # Prefer explicit credentials from the environment
    if [[ -n "$DOCKERHUB_USERNAME" ]] && [[ -n "$DOCKERHUB_PASSWORD" ]]; then
        echo -e "${BLUE}Found DOCKERHUB_USERNAME and DOCKERHUB_PASSWORD environment variables${NC}"

        kubectl create secret docker-registry dockerhub-creds \
            --docker-server=https://index.docker.io/v1/ \
            --docker-username="$DOCKERHUB_USERNAME" \
            --docker-password="$DOCKERHUB_PASSWORD" \
            --docker-email="${DOCKERHUB_EMAIL:-noreply@bakery-ia.local}" \
            -n "$NAMESPACE"

        echo -e "${GREEN}Docker Hub secret created successfully.${NC}"

    elif [[ -f "$HOME/.docker/config.json" ]]; then
        echo -e "${BLUE}Attempting to use Docker CLI credentials...${NC}"

        # A credential store means the credentials are not stored in the config file
        if grep -q "credsStore" "$HOME/.docker/config.json"; then
            echo -e "${YELLOW}Docker is using a credential store. Please set environment variables:${NC}"
            echo "  export DOCKERHUB_USERNAME='your-username'"
            echo "  export DOCKERHUB_PASSWORD='your-password-or-token'"
            echo -e "${YELLOW}Continuing without Docker Hub authentication...${NC}"
            return
        fi

        # Try to extract the base64-encoded auth entry
        AUTH=$(jq -r '.auths["https://index.docker.io/v1/"].auth // empty' "$HOME/.docker/config.json" 2>/dev/null)
        if [[ -n "$AUTH" ]]; then
            echo -e "${GREEN}Found Docker Hub credentials in Docker config${NC}"
            local DOCKER_USERNAME=$(echo "$AUTH" | base64 -d | cut -d: -f1)
            local DOCKER_PASSWORD=$(echo "$AUTH" | base64 -d | cut -d: -f2-)

            kubectl create secret docker-registry dockerhub-creds \
                --docker-server=https://index.docker.io/v1/ \
                --docker-username="$DOCKER_USERNAME" \
                --docker-password="$DOCKER_PASSWORD" \
                --docker-email="${DOCKERHUB_EMAIL:-noreply@bakery-ia.local}" \
                -n "$NAMESPACE"

            echo -e "${GREEN}Docker Hub secret created successfully.${NC}"
        else
            echo -e "${YELLOW}Could not find Docker Hub credentials${NC}"
            echo -e "${YELLOW}To enable automatic Docker Hub authentication:${NC}"
            echo "  1. Run 'docker login', OR"
            echo "  2. Set environment variables:"
            echo "     export DOCKERHUB_USERNAME='your-username'"
            echo "     export DOCKERHUB_PASSWORD='your-password-or-token'"
            echo -e "${YELLOW}Continuing without Docker Hub authentication...${NC}"
        fi
    else
        echo -e "${YELLOW}Docker Hub credentials not found${NC}"
        echo -e "${YELLOW}To enable automatic Docker Hub authentication:${NC}"
        echo "  1. Run 'docker login', OR"
        echo "  2. Set environment variables:"
        echo "     export DOCKERHUB_USERNAME='your-username'"
        echo "     export DOCKERHUB_PASSWORD='your-password-or-token'"
        echo -e "${YELLOW}Continuing without Docker Hub authentication...${NC}"
    fi
    echo ""
}

# Add and update the SigNoz Helm repository
setup_helm_repo() {
    echo -e "${BLUE}Setting up SigNoz Helm repository...${NC}"

    if [[ "$DRY_RUN" == true ]]; then
        echo "  (dry-run) Would add SigNoz Helm repository"
        return
    fi

    if helm repo list | grep -q "^signoz"; then
        echo -e "${BLUE}SigNoz repository already added, updating...${NC}"
        helm repo update signoz
    else
        echo -e "${BLUE}Adding SigNoz Helm repository...${NC}"
        helm repo add signoz https://charts.signoz.io
        helm repo update
    fi

    echo -e "${GREEN}Helm repository ready.${NC}"
    echo ""
}

# Deploy SigNoz
deploy_signoz() {
    local values_file="infrastructure/monitoring/signoz/signoz-values-$ENVIRONMENT.yaml"

    if [[ ! -f "$values_file" ]]; then
        echo -e "${RED}Error: Values file $values_file not found.${NC}"
        exit 1
    fi

    echo -e "${BLUE}Deploying SigNoz to $ENVIRONMENT environment...${NC}"
    echo "  Using values file: $values_file"
    echo "  Target namespace:  $NAMESPACE"
    echo "  Chart version:     Latest from signoz/signoz"

    if [[ "$DRY_RUN" == true ]]; then
        echo "  (dry-run) Would deploy SigNoz with:"
        echo "    helm upgrade --install signoz signoz/signoz -n $NAMESPACE -f $values_file --wait --timeout 15m"
        return
    fi

    # upgrade --install handles both new installations and upgrades
    echo -e "${BLUE}Installing/Upgrading SigNoz...${NC}"
    echo "This may take 10-15 minutes..."

    helm upgrade --install signoz signoz/signoz \
        -n "$NAMESPACE" \
        -f "$values_file" \
        --wait \
        --timeout 15m \
        --create-namespace

    echo -e "${GREEN}SigNoz deployment completed.${NC}"
    echo ""

    # Show deployment status
    show_deployment_status
}

# Remove SigNoz
remove_signoz() {
    echo -e "${BLUE}Removing SigNoz deployment from namespace $NAMESPACE...${NC}"

    if [[ "$DRY_RUN" == true ]]; then
        echo "  (dry-run) Would remove SigNoz deployment"
        return
    fi

    if helm list -n "$NAMESPACE" | grep -q signoz; then
        helm uninstall signoz -n "$NAMESPACE" --wait
        echo -e "${GREEN}SigNoz deployment removed.${NC}"

        # PVCs are left in place by default for safety
        echo ""
        echo -e "${YELLOW}Note: Persistent Volume Claims (PVCs) were NOT deleted.${NC}"
        echo "To delete PVCs and all data, run:"
        echo "  kubectl delete pvc -n $NAMESPACE -l app.kubernetes.io/instance=signoz"
    else
        echo -e "${YELLOW}No SigNoz deployment found in namespace $NAMESPACE.${NC}"
    fi
}

# Show deployment status
show_deployment_status() {
    echo ""
    echo -e "${BLUE}=== SigNoz Deployment Status ===${NC}"
    echo ""

    echo "Pods:"
    kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz
    echo ""

    echo "Services:"
    kubectl get svc -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz
    echo ""

    echo "Ingress:"
    kubectl get ingress -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz
    echo ""

    show_access_info
}

# Show access information
show_access_info() {
    echo -e "${BLUE}=== Access Information ===${NC}"

    if [[ "$ENVIRONMENT" == "dev" ]]; then
        echo "SigNoz UI: http://monitoring.bakery-ia.local"
        echo ""
        echo "OpenTelemetry Collector endpoints (from within the cluster):"
        echo "  gRPC: signoz-otel-collector.$NAMESPACE.svc.cluster.local:4317"
        echo "  HTTP: signoz-otel-collector.$NAMESPACE.svc.cluster.local:4318"
        echo ""
        echo "Port-forward for local access:"
        echo "  kubectl port-forward -n $NAMESPACE svc/signoz 8080:8080"
        echo "  kubectl port-forward -n $NAMESPACE svc/signoz-otel-collector 4317:4317"
        echo "  kubectl port-forward -n $NAMESPACE svc/signoz-otel-collector 4318:4318"
    else
        echo "SigNoz UI: https://monitoring.bakewise.ai"
        echo ""
        echo "OpenTelemetry Collector endpoints (from within the cluster):"
        echo "  gRPC: signoz-otel-collector.$NAMESPACE.svc.cluster.local:4317"
        echo "  HTTP: signoz-otel-collector.$NAMESPACE.svc.cluster.local:4318"
        echo ""
        echo "External endpoints (if exposed):"
        echo "  Check the ingress configuration for external OTLP endpoints"
    fi

    echo ""
    echo "Default credentials:"
    echo "  Username: admin@example.com"
    echo "  Password: admin"
    echo ""
    echo "Note: Change the default password after first login!"
    echo ""
}

# Main execution
main() {
    echo -e "${BLUE}"
    echo "=========================================="
    echo "🚀 SigNoz Deployment for Bakery IA"
    echo "=========================================="
    echo -e "${NC}"

    # Check prerequisites
    check_helm
    check_kubectl

    # Ensure namespace
    ensure_namespace

    if [[ "$REMOVE" == true ]]; then
        remove_signoz
        exit 0
    fi

    # Set up the Helm repository
    setup_helm_repo

    # Create Docker Hub secret for image pulls
    create_dockerhub_secret

    # Deploy SigNoz
    deploy_signoz

    echo -e "${GREEN}"
    echo "=========================================="
    echo "✅ SigNoz deployment completed!"
    echo "=========================================="
    echo -e "${NC}"
}

# Run main function
main
141
infrastructure/monitoring/signoz/generate-test-traffic.sh
Executable file
@@ -0,0 +1,141 @@
#!/bin/bash

# Generate Test Traffic to Services
# This script generates API calls to verify telemetry data collection

set -e

NAMESPACE="bakery-ia"
GREEN='\033[0;32m'
BLUE='\033[0;34m'
YELLOW='\033[1;33m'
NC='\033[0m'

echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${BLUE}   Generating Test Traffic for SigNoz Verification${NC}"
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo ""

# Check if the gateway is accessible
echo -e "${BLUE}Step 1: Verifying Gateway Access${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

# "|| true" keeps set -e from aborting when no matching pod is found
GATEWAY_POD=$(kubectl get pods -n $NAMESPACE -l app=gateway --field-selector=status.phase=Running -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || true)
if [[ -z "$GATEWAY_POD" ]]; then
    echo -e "${YELLOW}⚠ Gateway pod not running. Starting port-forward...${NC}"
    # Port forward in the background
    kubectl port-forward -n $NAMESPACE svc/gateway-service 8000:8000 &
    PORT_FORWARD_PID=$!
    sleep 3
    API_URL="http://localhost:8000"
else
    echo -e "${GREEN}✓ Gateway is running: $GATEWAY_POD${NC}"
    # Use the internal service
    API_URL="http://gateway-service.$NAMESPACE.svc.cluster.local:8000"
fi
echo ""

# Make an API call, from inside the cluster when possible
make_request() {
    local endpoint=$1
    local description=$2

    echo -e "${BLUE}→ Testing: $description${NC}"
    echo "  Endpoint: $endpoint"

    if [[ -n "$GATEWAY_POD" ]]; then
        # Make the request from inside the gateway pod
        RESPONSE=$(kubectl exec -n $NAMESPACE $GATEWAY_POD -- curl -s -w "\nHTTP_CODE:%{http_code}" "$API_URL$endpoint" 2>/dev/null || echo "FAILED")
    else
        # Make the request from localhost
        RESPONSE=$(curl -s -w "\nHTTP_CODE:%{http_code}" "$API_URL$endpoint" 2>/dev/null || echo "FAILED")
    fi

    if [[ "$RESPONSE" == "FAILED" ]]; then
        echo -e "  ${YELLOW}⚠ Request failed${NC}"
    else
        HTTP_CODE=$(echo "$RESPONSE" | grep "HTTP_CODE" | cut -d: -f2)
        if [[ "$HTTP_CODE" == "200" ]] || [[ "$HTTP_CODE" == "401" ]] || [[ "$HTTP_CODE" == "404" ]]; then
            echo -e "  ${GREEN}✓ Response received (HTTP $HTTP_CODE)${NC}"
        else
            echo -e "  ${YELLOW}⚠ Unexpected response (HTTP $HTTP_CODE)${NC}"
        fi
    fi
    echo ""
    sleep 1
}

# Generate traffic to various endpoints
echo -e "${BLUE}Step 2: Generating Traffic to Services${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo ""

# Health checks (should generate traces)
make_request "/health" "Gateway Health Check"
make_request "/api/health" "API Health Check"

# Auth service endpoints
make_request "/api/auth/health" "Auth Service Health"

# Tenant service endpoints
make_request "/api/tenants/health" "Tenant Service Health"

# Inventory service endpoints
make_request "/api/inventory/health" "Inventory Service Health"

# Orders service endpoints
make_request "/api/orders/health" "Orders Service Health"

# Forecasting service endpoints
make_request "/api/forecasting/health" "Forecasting Service Health"

echo -e "${BLUE}Step 3: Checking Service Logs for Telemetry${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo ""

# Check a few service pods for tracing logs
SERVICES=("auth-service" "inventory-service" "gateway")

for service in "${SERVICES[@]}"; do
    POD=$(kubectl get pods -n $NAMESPACE -l app=$service --field-selector=status.phase=Running -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || true)
    if [[ -n "$POD" ]]; then
        echo -e "${BLUE}Checking $service ($POD)...${NC}"
        TRACING_LOG=$(kubectl logs -n $NAMESPACE $POD --tail=100 2>/dev/null | grep -i "tracing\|otel" | head -n 2 || echo "")
        if [[ -n "$TRACING_LOG" ]]; then
            echo -e "${GREEN}✓ Tracing configured:${NC}"
            echo "$TRACING_LOG" | sed 's/^/    /'
        else
            echo -e "${YELLOW}⚠ No tracing logs found${NC}"
        fi
        echo ""
    fi
done

# Wait for data to be processed
echo -e "${BLUE}Step 4: Waiting for Data Processing${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Waiting 30 seconds for telemetry data to be processed..."
for i in {30..1}; do
    echo -ne "\r  ${i} seconds remaining..."
    sleep 1
done
echo -e "\n"

# Clean up the port-forward if we started one
if [[ -n "$PORT_FORWARD_PID" ]]; then
    kill $PORT_FORWARD_PID 2>/dev/null || true
fi

echo -e "${GREEN}✓ Test traffic generation complete!${NC}"
echo ""
echo -e "${BLUE}Next Steps:${NC}"
echo "1. Run the verification script to check for collected data:"
echo "   ./infrastructure/monitoring/signoz/verify-signoz-telemetry.sh"
echo ""
echo "2. Access the SigNoz UI to visualize the data:"
echo "   https://monitoring.bakery-ia.local"
echo "   or"
echo "   kubectl port-forward -n bakery-ia svc/signoz 3301:8080"
echo "   Then go to: http://localhost:3301"
echo ""
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
175
infrastructure/monitoring/signoz/import-dashboards.sh
Executable file
@@ -0,0 +1,175 @@
#!/bin/bash

# SigNoz Dashboard Importer for Bakery IA
# This script imports all SigNoz dashboards into your SigNoz instance

# Configuration
SIGNOZ_HOST="localhost"
SIGNOZ_PORT="3301"
SIGNOZ_API_KEY=""  # Add your API key if authentication is required
DASHBOARDS_DIR="infrastructure/signoz/dashboards"

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Function to display help
show_help() {
    echo "Usage: $0 [options]"
    echo ""
    echo "Options:
  -h, --host     SigNoz host (default: localhost)
  -p, --port     SigNoz port (default: 3301)
  -k, --api-key  SigNoz API key (if required)
  -d, --dir      Dashboards directory (default: infrastructure/signoz/dashboards)
  --help         Show this help message"
    echo ""
    echo "Example:
  $0 --host signoz.example.com --port 3301 --api-key your-api-key"
}

# Parse command line arguments
while [[ $# -gt 0 ]]; do
    case $1 in
        -h|--host)
            SIGNOZ_HOST="$2"
            shift 2
            ;;
        -p|--port)
            SIGNOZ_PORT="$2"
            shift 2
            ;;
        -k|--api-key)
            SIGNOZ_API_KEY="$2"
            shift 2
            ;;
        -d|--dir)
            DASHBOARDS_DIR="$2"
            shift 2
            ;;
        --help)
            show_help
            exit 0
            ;;
        *)
            echo "Unknown option: $1"
            show_help
            exit 1
            ;;
    esac
done

# Check that the dashboards directory exists
if [ ! -d "$DASHBOARDS_DIR" ]; then
    echo -e "${RED}Error: Dashboards directory not found: $DASHBOARDS_DIR${NC}"
    exit 1
fi

# Check if jq is installed for JSON validation
if ! command -v jq &> /dev/null; then
    echo -e "${YELLOW}Warning: jq not found. Skipping JSON validation.${NC}"
    VALIDATE_JSON=false
else
    VALIDATE_JSON=true
fi

# Validate that a file contains well-formed JSON
validate_json() {
    local file="$1"
    if [ "$VALIDATE_JSON" = true ]; then
        if ! jq empty "$file" &> /dev/null; then
            echo -e "${RED}Error: Invalid JSON in file: $file${NC}"
            return 1
        fi
    fi
    return 0
}

# Import a single dashboard
import_dashboard() {
    local file="$1"
    local filename=$(basename "$file")
    local dashboard_name=$(jq -r '.name' "$file" 2>/dev/null || echo "Unknown")

    echo -e "${BLUE}Importing dashboard: $dashboard_name ($filename)${NC}"

    # Build the curl arguments as an array rather than eval'ing a string,
    # so the file path and headers survive quoting intact
    local curl_args=(-s -X POST "http://$SIGNOZ_HOST:$SIGNOZ_PORT/api/v1/dashboards/import")
    if [ -n "$SIGNOZ_API_KEY" ]; then
        curl_args+=(-H "Authorization: Bearer $SIGNOZ_API_KEY")
    fi
    curl_args+=(-H "Content-Type: application/json" -d @"$file")

    # Execute the import
    local response=$(curl "${curl_args[@]}")

    # Check the response
    if echo "$response" | grep -q "success"; then
        echo -e "${GREEN}✓ Successfully imported: $dashboard_name${NC}"
        return 0
    else
        echo -e "${RED}✗ Failed to import: $dashboard_name${NC}"
        echo "Response: $response"
        return 1
    fi
}

# Main import process
echo -e "${YELLOW}=== SigNoz Dashboard Importer for Bakery IA ===${NC}"
echo -e "${BLUE}Configuration:${NC}"
echo "  Host: $SIGNOZ_HOST"
echo "  Port: $SIGNOZ_PORT"
echo "  Dashboards Directory: $DASHBOARDS_DIR"
if [ -n "$SIGNOZ_API_KEY" ]; then
    echo "  API Key: ******** (set)"
else
    echo "  API Key: Not configured"
fi
echo ""

# Count dashboards
DASHBOARD_COUNT=$(find "$DASHBOARDS_DIR" -name "*.json" | wc -l)
echo -e "${BLUE}Found $DASHBOARD_COUNT dashboards to import${NC}"
echo ""

# Import each dashboard
SUCCESS_COUNT=0
FAILURE_COUNT=0

for file in "$DASHBOARDS_DIR"/*.json; do
    if [ -f "$file" ]; then
        # Validate JSON before importing
        if validate_json "$file"; then
            if import_dashboard "$file"; then
                ((SUCCESS_COUNT++))
            else
                ((FAILURE_COUNT++))
            fi
        else
            ((FAILURE_COUNT++))
        fi
        echo ""
    fi
done

# Summary
echo -e "${YELLOW}=== Import Summary ===${NC}"
echo -e "${GREEN}Successfully imported: $SUCCESS_COUNT dashboards${NC}"
if [ $FAILURE_COUNT -gt 0 ]; then
    echo -e "${RED}Failed to import: $FAILURE_COUNT dashboards${NC}"
fi
echo ""

if [ $FAILURE_COUNT -eq 0 ]; then
    echo -e "${GREEN}All dashboards imported successfully!${NC}"
    echo "You can now access them in your SigNoz UI at:"
    echo "http://$SIGNOZ_HOST:$SIGNOZ_PORT/dashboards"
else
    echo -e "${YELLOW}Some dashboards failed to import. Check the errors above.${NC}"
    exit 1
fi
12
infrastructure/monitoring/signoz/signoz-values-dev.yaml
Normal file
@@ -0,0 +1,12 @@
# SigNoz Helm Chart Values - Development Environment
# Optimized for local development with minimal resource usage
# DEPLOYED IN bakery-ia NAMESPACE - Ingress managed by bakery-ingress
#
# Official Chart: https://github.com/SigNoz/charts
# Install Command: helm install signoz signoz/signoz -n bakery-ia -f signoz-values-dev.yaml

global:
  storageClass: "standard"
  clusterName: "bakery-ia-dev"
  domain: "monitoring.bakery-ia.local"
  # Docker Hub credentials - applied to all sub-charts (including Zookeeper, ClickHouse, etc.)
12
infrastructure/monitoring/signoz/signoz-values-prod.yaml
Normal file
@@ -0,0 +1,12 @@
# SigNoz Helm Chart Values - Production Environment
# High-availability configuration with resource optimization
# DEPLOYED IN bakery-ia NAMESPACE - Ingress managed by bakery-ingress-prod
#
# Official Chart: https://github.com/SigNoz/charts
# Install Command: helm install signoz signoz/signoz -n bakery-ia -f signoz-values-prod.yaml

global:
  storageClass: "microk8s-hostpath" # For MicroK8s, use "microk8s-hostpath" or a custom storage class
  clusterName: "bakery-ia-prod"
  domain: "monitoring.bakewise.ai"
  # Docker Hub credentials - applied to all sub-charts (including Zookeeper, ClickHouse, etc.)
177
infrastructure/monitoring/signoz/verify-signoz-telemetry.sh
Executable file
@@ -0,0 +1,177 @@
#!/bin/bash

# SigNoz Telemetry Verification Script
# This script verifies that services are correctly sending metrics, logs, and traces to SigNoz,
# and that SigNoz is collecting them properly.

set -e

NAMESPACE="bakery-ia"
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${BLUE}   SigNoz Telemetry Verification Script${NC}"
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo ""

# Step 1: Verify SigNoz components are running
echo -e "${BLUE}[1/7] Checking SigNoz Components Status...${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

# "|| true" keeps set -e from aborting when no matching pod is found
OTEL_POD=$(kubectl get pods -n $NAMESPACE -l app.kubernetes.io/name=signoz,app.kubernetes.io/component=otel-collector --field-selector=status.phase=Running -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || true)
SIGNOZ_POD=$(kubectl get pods -n $NAMESPACE -l app.kubernetes.io/name=signoz,app.kubernetes.io/component=signoz --field-selector=status.phase=Running -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || true)
CLICKHOUSE_POD=$(kubectl get pods -n $NAMESPACE -l clickhouse.altinity.com/chi=signoz-clickhouse --field-selector=status.phase=Running -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || true)

if [[ -n "$OTEL_POD" && -n "$SIGNOZ_POD" && -n "$CLICKHOUSE_POD" ]]; then
    echo -e "${GREEN}✓ All SigNoz components are running${NC}"
    echo "  - OTel Collector:  $OTEL_POD"
    echo "  - SigNoz Frontend: $SIGNOZ_POD"
    echo "  - ClickHouse:      $CLICKHOUSE_POD"
else
    echo -e "${RED}✗ Some SigNoz components are not running${NC}"
    kubectl get pods -n $NAMESPACE | grep signoz
    exit 1
fi
echo ""

# Step 2: Check OTel Collector endpoints
echo -e "${BLUE}[2/7] Verifying OTel Collector Endpoints...${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

OTEL_SVC=$(kubectl get svc -n $NAMESPACE signoz-otel-collector -o jsonpath='{.spec.clusterIP}')
echo "OTel Collector Service IP: $OTEL_SVC"
echo ""
echo "Available endpoints:"
kubectl get svc -n $NAMESPACE signoz-otel-collector -o jsonpath='{range .spec.ports[*]}{.name}{"\t"}{.port}{"\n"}{end}' | column -t
echo ""
echo -e "${GREEN}✓ OTel Collector endpoints are exposed${NC}"
echo ""

# Step 3: Check OTel Collector logs for data reception
echo -e "${BLUE}[3/7] Checking OTel Collector for Recent Activity...${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

echo "Recent OTel Collector logs (last 20 lines):"
kubectl logs -n $NAMESPACE $OTEL_POD --tail=20 | grep -E "received|exported|traces|metrics|logs" || echo "No recent telemetry data found in logs"
echo ""

# Step 4: Check service configurations
echo -e "${BLUE}[4/7] Verifying Service Telemetry Configuration...${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

# Check the ConfigMap for OTel settings
OTEL_ENDPOINT=$(kubectl get configmap bakery-config -n $NAMESPACE -o jsonpath='{.data.OTEL_EXPORTER_OTLP_ENDPOINT}')
ENABLE_TRACING=$(kubectl get configmap bakery-config -n $NAMESPACE -o jsonpath='{.data.ENABLE_TRACING}')
ENABLE_METRICS=$(kubectl get configmap bakery-config -n $NAMESPACE -o jsonpath='{.data.ENABLE_METRICS}')
ENABLE_LOGS=$(kubectl get configmap bakery-config -n $NAMESPACE -o jsonpath='{.data.ENABLE_LOGS}')

echo "Configuration from bakery-config ConfigMap:"
echo "  OTEL_EXPORTER_OTLP_ENDPOINT: $OTEL_ENDPOINT"
echo "  ENABLE_TRACING: $ENABLE_TRACING"
echo "  ENABLE_METRICS: $ENABLE_METRICS"
echo "  ENABLE_LOGS: $ENABLE_LOGS"
echo ""

if [[ "$ENABLE_TRACING" == "true" && "$ENABLE_METRICS" == "true" && "$ENABLE_LOGS" == "true" ]]; then
    echo -e "${GREEN}✓ Telemetry is enabled in configuration${NC}"
else
    echo -e "${YELLOW}⚠ Some telemetry features may be disabled${NC}"
fi
echo ""

# Step 5: Test OTel Collector health
echo -e "${BLUE}[5/7] Testing OTel Collector Health Endpoint...${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

HEALTH_CHECK=$(kubectl exec -n $NAMESPACE $OTEL_POD -- wget -qO- http://localhost:13133/ 2>/dev/null || echo "FAILED")
if [[ "$HEALTH_CHECK" == *"Server available"* ]] || [[ "$HEALTH_CHECK" == "{}" ]]; then
    echo -e "${GREEN}✓ OTel Collector health check passed${NC}"
else
    echo -e "${RED}✗ OTel Collector health check failed${NC}"
    echo "Response: $HEALTH_CHECK"
fi
echo ""

# Step 6: Query ClickHouse for telemetry data
echo -e "${BLUE}[6/7] Querying ClickHouse for Telemetry Data...${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

# Get ClickHouse credentials
CH_PASSWORD=$(kubectl get secret -n $NAMESPACE signoz-clickhouse -o jsonpath='{.data.admin-password}' 2>/dev/null | base64 -d || echo "27ff0399-0d3a-4bd8-919d-17c2181e6fb9")

echo "Checking for traces in ClickHouse..."
TRACES_COUNT=$(kubectl exec -n $NAMESPACE $CLICKHOUSE_POD -- clickhouse-client --user=admin --password=$CH_PASSWORD --query="SELECT count() FROM signoz_traces.signoz_index_v2 WHERE timestamp >= now() - INTERVAL 1 HOUR" 2>/dev/null || echo "0")
echo "  Traces in last hour: $TRACES_COUNT"

echo "Checking for metrics in ClickHouse..."
METRICS_COUNT=$(kubectl exec -n $NAMESPACE $CLICKHOUSE_POD -- clickhouse-client --user=admin --password=$CH_PASSWORD --query="SELECT count() FROM signoz_metrics.samples_v4 WHERE unix_milli >= toUnixTimestamp(now() - INTERVAL 1 HOUR) * 1000" 2>/dev/null || echo "0")
echo "  Metrics in last hour: $METRICS_COUNT"

echo "Checking for logs in ClickHouse..."
LOGS_COUNT=$(kubectl exec -n $NAMESPACE $CLICKHOUSE_POD -- clickhouse-client --user=admin --password=$CH_PASSWORD --query="SELECT count() FROM signoz_logs.logs WHERE timestamp >= now() - INTERVAL 1 HOUR" 2>/dev/null || echo "0")
echo "  Logs in last hour: $LOGS_COUNT"
echo ""

if [[ "$TRACES_COUNT" -gt "0" || "$METRICS_COUNT" -gt "0" || "$LOGS_COUNT" -gt "0" ]]; then
    echo -e "${GREEN}✓ Telemetry data found in ClickHouse!${NC}"
else
    echo -e "${YELLOW}⚠ No telemetry data found in the last hour${NC}"
    echo "  This might be normal if:"
    echo "  - Services were just deployed"
    echo "  - No traffic has been generated yet"
    echo "  - Services haven't finished initializing"
fi
echo ""

# Step 7: Access information
echo -e "${BLUE}[7/7] SigNoz UI Access Information${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo ""
echo "SigNoz is accessible via ingress at:"
echo -e "  ${GREEN}https://monitoring.bakery-ia.local${NC}"
echo ""
echo "Or via port-forward:"
echo -e "  ${YELLOW}kubectl port-forward -n $NAMESPACE svc/signoz 3301:8080${NC}"
echo "  Then access: http://localhost:3301"
echo ""
echo "To view OTel Collector metrics:"
echo -e "  ${YELLOW}kubectl port-forward -n $NAMESPACE svc/signoz-otel-collector 8888:8888${NC}"
echo "  Then access: http://localhost:8888/metrics"
echo ""

# Summary
echo ""
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${BLUE}   Verification Summary${NC}"
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo ""
echo "Component Status:"
echo "  ✓ SigNoz components running"
echo "  ✓ OTel Collector healthy"
echo "  ✓ Configuration correct"
|
||||
echo ""
|
||||
echo "Data Collection (last hour):"
|
||||
echo " Traces: $TRACES_COUNT"
|
||||
echo " Metrics: $METRICS_COUNT"
|
||||
echo " Logs: $LOGS_COUNT"
|
||||
echo ""
|
||||
|
||||
if [[ "$TRACES_COUNT" -gt "0" || "$METRICS_COUNT" -gt "0" || "$LOGS_COUNT" -gt "0" ]]; then
|
||||
echo -e "${GREEN}✓ SigNoz is collecting telemetry data successfully!${NC}"
|
||||
else
|
||||
echo -e "${YELLOW}⚠ To generate telemetry data, try:${NC}"
|
||||
echo ""
|
||||
echo "1. Generate traffic to your services:"
|
||||
echo " curl http://localhost/api/health"
|
||||
echo ""
|
||||
echo "2. Check service logs for tracing initialization:"
|
||||
echo " kubectl logs -n $NAMESPACE <service-pod> | grep -i 'tracing\\|otel\\|signoz'"
|
||||
echo ""
|
||||
echo "3. Wait a few minutes and run this script again"
|
||||
fi
|
||||
echo ""
|
||||
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
|
||||
446
infrastructure/monitoring/signoz/verify-signoz.sh
Executable file
@@ -0,0 +1,446 @@
#!/bin/bash

# ============================================================================
# SigNoz Verification Script for Bakery IA
# ============================================================================
# This script verifies that SigNoz is properly deployed and functioning
# ============================================================================

set -e

# Color codes for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Function to display help
show_help() {
    echo "Usage: $0 [OPTIONS] ENVIRONMENT"
    echo ""
    echo "Verify SigNoz deployment for Bakery IA"
    echo ""
    echo "Arguments:
  ENVIRONMENT                Environment to verify (dev|prod)"
    echo ""
    echo "Options:
  -h, --help                 Show this help message
  -n, --namespace NAMESPACE  Specify namespace (default: bakery-ia)"
    echo ""
    echo "Examples:
  $0 dev                         # Verify development deployment
  $0 prod                        # Verify production deployment
  $0 --namespace monitoring dev  # Verify with custom namespace"
}

# Parse command line arguments
NAMESPACE="bakery-ia"

while [[ $# -gt 0 ]]; do
    case $1 in
        -h|--help)
            show_help
            exit 0
            ;;
        -n|--namespace)
            NAMESPACE="$2"
            shift 2
            ;;
        dev|prod)
            ENVIRONMENT="$1"
            shift
            ;;
        *)
            echo "Unknown argument: $1"
            show_help
            exit 1
            ;;
    esac
done
# Validate environment
if [[ -z "$ENVIRONMENT" ]]; then
    echo "Error: Environment not specified. Use 'dev' or 'prod'."
    show_help
    exit 1
fi

if [[ "$ENVIRONMENT" != "dev" && "$ENVIRONMENT" != "prod" ]]; then
    echo "Error: Invalid environment. Use 'dev' or 'prod'."
    exit 1
fi

# Function to check if kubectl is configured
check_kubectl() {
    if ! kubectl cluster-info &> /dev/null; then
        echo -e "${RED}Error: kubectl is not configured or cannot connect to cluster.${NC}"
        echo "Please ensure you have access to a Kubernetes cluster."
        exit 1
    fi
}

# Function to check namespace exists
check_namespace() {
    if ! kubectl get namespace "$NAMESPACE" &> /dev/null; then
        echo -e "${RED}Error: Namespace $NAMESPACE does not exist.${NC}"
        echo "Please deploy SigNoz first using: ./deploy-signoz.sh $ENVIRONMENT"
        exit 1
    fi
}
# Function to verify SigNoz deployment
verify_deployment() {
    echo -e "${BLUE}"
    echo "=========================================="
    echo "🔍 Verifying SigNoz Deployment"
    echo "=========================================="
    echo "Environment: $ENVIRONMENT"
    echo "Namespace: $NAMESPACE"
    echo -e "${NC}"
    echo ""

    # Check if SigNoz helm release exists
    echo -e "${BLUE}1. Checking Helm release...${NC}"
    if helm list -n "$NAMESPACE" | grep -q signoz; then
        echo -e "${GREEN}✅ SigNoz Helm release found${NC}"
    else
        echo -e "${RED}❌ SigNoz Helm release not found${NC}"
        echo "Please deploy SigNoz first using: ./deploy-signoz.sh $ENVIRONMENT"
        exit 1
    fi
    echo ""

    # Check pod status
    echo -e "${BLUE}2. Checking pod status...${NC}"
    local total_pods=$(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz 2>/dev/null | grep -v "NAME" | wc -l | tr -d ' ' || echo "0")
    local running_pods=$(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz --field-selector=status.phase=Running 2>/dev/null | grep -c "Running" || echo "0")
    local ready_pods=$(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz 2>/dev/null | grep "Running" | grep "1/1" | wc -l | tr -d ' ' || echo "0")

    echo "Total pods: $total_pods"
    echo "Running pods: $running_pods"
    echo "Ready pods: $ready_pods"

    if [[ $total_pods -eq 0 ]]; then
        echo -e "${RED}❌ No SigNoz pods found${NC}"
        exit 1
    fi

    if [[ $running_pods -eq $total_pods ]]; then
        echo -e "${GREEN}✅ All pods are running${NC}"
    else
        echo -e "${YELLOW}⚠️ Some pods are not running${NC}"
    fi

    if [[ $ready_pods -eq $total_pods ]]; then
        echo -e "${GREEN}✅ All pods are ready${NC}"
    else
        echo -e "${YELLOW}⚠️ Some pods are not ready${NC}"
    fi
    echo ""

    # Show pod details
    echo -e "${BLUE}Pod Details:${NC}"
    kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz
    echo ""

    # Check services
    echo -e "${BLUE}3. Checking services...${NC}"
    local service_count=$(kubectl get svc -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz 2>/dev/null | grep -v "NAME" | wc -l | tr -d ' ' || echo "0")

    if [[ $service_count -gt 0 ]]; then
        echo -e "${GREEN}✅ Services found ($service_count services)${NC}"
        kubectl get svc -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz
    else
        echo -e "${RED}❌ No services found${NC}"
    fi
    echo ""

    # Check ingress
    echo -e "${BLUE}4. Checking ingress...${NC}"
    local ingress_count=$(kubectl get ingress -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz 2>/dev/null | grep -v "NAME" | wc -l | tr -d ' ' || echo "0")

    if [[ $ingress_count -gt 0 ]]; then
        echo -e "${GREEN}✅ Ingress found ($ingress_count ingress resources)${NC}"
        kubectl get ingress -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz
    else
        echo -e "${YELLOW}⚠️ No ingress found (may be configured in main namespace)${NC}"
    fi
    echo ""

    # Check PVCs
    echo -e "${BLUE}5. Checking persistent volume claims...${NC}"
    local pvc_count=$(kubectl get pvc -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz 2>/dev/null | grep -v "NAME" | wc -l | tr -d ' ' || echo "0")

    if [[ $pvc_count -gt 0 ]]; then
        echo -e "${GREEN}✅ PVCs found ($pvc_count PVCs)${NC}"
        kubectl get pvc -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz
    else
        echo -e "${YELLOW}⚠️ No PVCs found (may not be required for all components)${NC}"
    fi
    echo ""

    # Check resource usage
    echo -e "${BLUE}6. Checking resource usage...${NC}"
    if command -v kubectl &> /dev/null && kubectl top pods -n "$NAMESPACE" &> /dev/null; then
        echo -e "${GREEN}✅ Resource usage:${NC}"
        kubectl top pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz
    else
        echo -e "${YELLOW}⚠️ Metrics server not available or no resource usage data${NC}"
    fi
    echo ""

    # Check logs for errors
    echo -e "${BLUE}7. Checking for errors in logs...${NC}"
    local error_found=false

    # Check each pod for errors
    while IFS= read -r pod; do
        if [[ -n "$pod" ]]; then
            local pod_errors=$(kubectl logs -n "$NAMESPACE" "$pod" 2>/dev/null | grep -i "error\|exception\|fail\|crash" | wc -l || echo "0")
            if [[ $pod_errors -gt 0 ]]; then
                echo -e "${RED}❌ Errors found in pod $pod ($pod_errors errors)${NC}"
                error_found=true
            fi
        fi
    done < <(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz -o name | sed 's|pod/||')

    if [[ "$error_found" == false ]]; then
        echo -e "${GREEN}✅ No errors found in logs${NC}"
    fi
    echo ""

    # Environment-specific checks
    if [[ "$ENVIRONMENT" == "dev" ]]; then
        verify_dev_specific
    else
        verify_prod_specific
    fi

    # Show access information
    show_access_info
}
# Function for development-specific verification
verify_dev_specific() {
    echo -e "${BLUE}8. Development-specific checks...${NC}"

    # Check if ingress is configured
    if kubectl get ingress -n "$NAMESPACE" 2>/dev/null | grep -q "monitoring.bakery-ia.local"; then
        echo -e "${GREEN}✅ Development ingress configured${NC}"
    else
        echo -e "${YELLOW}⚠️ Development ingress not found${NC}"
    fi

    # Check unified signoz component resource limits (should be lower for dev)
    local signoz_mem=$(kubectl get deployment -n "$NAMESPACE" -l app.kubernetes.io/component=query-service -o jsonpath='{.items[0].spec.template.spec.containers[0].resources.limits.memory}' 2>/dev/null || echo "")
    if [[ -n "$signoz_mem" ]]; then
        echo -e "${GREEN}✅ SigNoz component found (memory limit: $signoz_mem)${NC}"
    else
        echo -e "${YELLOW}⚠️ Could not verify SigNoz component resources${NC}"
    fi

    # Check single replica setup for dev
    local replicas=$(kubectl get deployment -n "$NAMESPACE" -l app.kubernetes.io/component=query-service -o jsonpath='{.items[0].spec.replicas}' 2>/dev/null || echo "0")
    if [[ $replicas -eq 1 ]]; then
        echo -e "${GREEN}✅ Single replica configuration (appropriate for dev)${NC}"
    else
        echo -e "${YELLOW}⚠️ Multiple replicas detected (replicas: $replicas)${NC}"
    fi
    echo ""
}
# Function for production-specific verification
verify_prod_specific() {
    echo -e "${BLUE}8. Production-specific checks...${NC}"

    # Check if TLS is configured
    if kubectl get ingress -n "$NAMESPACE" 2>/dev/null | grep -q "signoz-tls"; then
        echo -e "${GREEN}✅ TLS certificate configured${NC}"
    else
        echo -e "${YELLOW}⚠️ TLS certificate not found${NC}"
    fi

    # Check if multiple replicas are running for HA
    local signoz_replicas=$(kubectl get deployment -n "$NAMESPACE" -l app.kubernetes.io/component=query-service -o jsonpath='{.items[0].spec.replicas}' 2>/dev/null || echo "1")
    if [[ $signoz_replicas -gt 1 ]]; then
        echo -e "${GREEN}✅ High availability configured ($signoz_replicas SigNoz replicas)${NC}"
    else
        echo -e "${YELLOW}⚠️ Single SigNoz replica detected (not highly available)${NC}"
    fi

    # Check Zookeeper replicas (critical for production)
    local zk_replicas=$(kubectl get statefulset -n "$NAMESPACE" -l app.kubernetes.io/component=zookeeper -o jsonpath='{.items[0].spec.replicas}' 2>/dev/null || echo "0")
    if [[ $zk_replicas -eq 3 ]]; then
        echo -e "${GREEN}✅ Zookeeper properly configured with 3 replicas${NC}"
    elif [[ $zk_replicas -gt 0 ]]; then
        echo -e "${YELLOW}⚠️ Zookeeper has $zk_replicas replicas (recommend 3 for production)${NC}"
    else
        echo -e "${RED}❌ Zookeeper not found${NC}"
    fi

    # Check OTel Collector replicas
    local otel_replicas=$(kubectl get deployment -n "$NAMESPACE" -l app.kubernetes.io/component=otel-collector -o jsonpath='{.items[0].spec.replicas}' 2>/dev/null || echo "1")
    if [[ $otel_replicas -gt 1 ]]; then
        echo -e "${GREEN}✅ OTel Collector HA configured ($otel_replicas replicas)${NC}"
    else
        echo -e "${YELLOW}⚠️ Single OTel Collector replica${NC}"
    fi

    # Check resource limits (should be higher for prod)
    local signoz_mem=$(kubectl get deployment -n "$NAMESPACE" -l app.kubernetes.io/component=query-service -o jsonpath='{.items[0].spec.template.spec.containers[0].resources.limits.memory}' 2>/dev/null || echo "")
    if [[ -n "$signoz_mem" ]]; then
        echo -e "${GREEN}✅ Production resource limits applied (memory: $signoz_mem)${NC}"
    else
        echo -e "${YELLOW}⚠️ Could not verify resource limits${NC}"
    fi

    # Check HPA (Horizontal Pod Autoscaler)
    local hpa_count=$(kubectl get hpa -n "$NAMESPACE" 2>/dev/null | grep -c signoz || echo "0")
    if [[ $hpa_count -gt 0 ]]; then
        echo -e "${GREEN}✅ Horizontal Pod Autoscaler configured${NC}"
    else
        echo -e "${YELLOW}⚠️ No HPA found (consider enabling for production)${NC}"
    fi
    echo ""
}
# Function to show access information
show_access_info() {
    echo -e "${BLUE}"
    echo "=========================================="
    echo "📋 Access Information"
    echo "=========================================="
    echo -e "${NC}"

    if [[ "$ENVIRONMENT" == "dev" ]]; then
        echo "SigNoz UI: http://monitoring.bakery-ia.local"
        echo ""
        echo "OpenTelemetry Collector (within cluster):"
        echo "  gRPC: signoz-otel-collector.$NAMESPACE.svc.cluster.local:4317"
        echo "  HTTP: signoz-otel-collector.$NAMESPACE.svc.cluster.local:4318"
        echo ""
        echo "Port-forward for local access:"
        echo "  kubectl port-forward -n $NAMESPACE svc/signoz 8080:8080"
        echo "  kubectl port-forward -n $NAMESPACE svc/signoz-otel-collector 4317:4317"
        echo "  kubectl port-forward -n $NAMESPACE svc/signoz-otel-collector 4318:4318"
    else
        echo "SigNoz UI: https://monitoring.bakewise.ai"
        echo ""
        echo "OpenTelemetry Collector (within cluster):"
        echo "  gRPC: signoz-otel-collector.$NAMESPACE.svc.cluster.local:4317"
        echo "  HTTP: signoz-otel-collector.$NAMESPACE.svc.cluster.local:4318"
    fi

    echo ""
    echo "Default Credentials:"
    echo "  Username: admin@example.com"
    echo "  Password: admin"
    echo ""
    echo "⚠️ IMPORTANT: Change default password after first login!"
    echo ""

    # Show connection test commands
    echo "Connection Test Commands:"
    if [[ "$ENVIRONMENT" == "dev" ]]; then
        echo "  # Test SigNoz UI"
        echo "  curl http://monitoring.bakery-ia.local"
        echo ""
        echo "  # Test via port-forward"
        echo "  kubectl port-forward -n $NAMESPACE svc/signoz 8080:8080"
        echo "  curl http://localhost:8080"
    else
        echo "  # Test SigNoz UI"
        echo "  curl https://monitoring.bakewise.ai"
        echo ""
        echo "  # Test API health"
        echo "  kubectl port-forward -n $NAMESPACE svc/signoz 8080:8080"
        echo "  curl http://localhost:8080/api/v1/health"
    fi
    echo ""
}
# Function to run connectivity tests
run_connectivity_tests() {
    echo -e "${BLUE}"
    echo "=========================================="
    echo "🔗 Running Connectivity Tests"
    echo "=========================================="
    echo -e "${NC}"

    # Test pod readiness first
    echo "Checking pod readiness..."
    local ready_pods=$(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz --field-selector=status.phase=Running 2>/dev/null | grep "Running" | grep -c "1/1\|2/2" || echo "0")
    local total_pods=$(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz 2>/dev/null | grep -v "NAME" | wc -l | tr -d ' ' || echo "0")

    if [[ $ready_pods -eq $total_pods && $total_pods -gt 0 ]]; then
        echo -e "${GREEN}✅ All pods are ready ($ready_pods/$total_pods)${NC}"
    else
        echo -e "${YELLOW}⚠️ Some pods not ready ($ready_pods/$total_pods)${NC}"
    fi
    echo ""

    # Test internal service connectivity
    echo "Testing internal service connectivity..."
    local signoz_svc=$(kubectl get svc -n "$NAMESPACE" signoz -o jsonpath='{.spec.clusterIP}' 2>/dev/null || echo "")
    if [[ -n "$signoz_svc" ]]; then
        echo -e "${GREEN}✅ SigNoz service accessible at $signoz_svc:8080${NC}"
    else
        echo -e "${RED}❌ SigNoz service not found${NC}"
    fi

    local otel_svc=$(kubectl get svc -n "$NAMESPACE" signoz-otel-collector -o jsonpath='{.spec.clusterIP}' 2>/dev/null || echo "")
    if [[ -n "$otel_svc" ]]; then
        echo -e "${GREEN}✅ OTel Collector service accessible at $otel_svc:4317 (gRPC), $otel_svc:4318 (HTTP)${NC}"
    else
        echo -e "${RED}❌ OTel Collector service not found${NC}"
    fi
    echo ""

    if [[ "$ENVIRONMENT" == "prod" ]]; then
        echo -e "${YELLOW}⚠️ Production connectivity tests require valid DNS and TLS${NC}"
        echo "  Please ensure monitoring.bakewise.ai resolves to your cluster"
        echo ""
        echo "Manual test:"
        echo "  curl -I https://monitoring.bakewise.ai"
    fi
}
# Main execution
main() {
    echo -e "${BLUE}"
    echo "=========================================="
    echo "🔍 SigNoz Verification for Bakery IA"
    echo "=========================================="
    echo -e "${NC}"

    # Check prerequisites
    check_kubectl
    check_namespace

    # Verify deployment
    verify_deployment

    # Run connectivity tests
    run_connectivity_tests

    echo -e "${GREEN}"
    echo "=========================================="
    echo "✅ Verification Complete"
    echo "=========================================="
    echo -e "${NC}"

    echo "Summary:"
    echo "  Environment: $ENVIRONMENT"
    echo "  Namespace: $NAMESPACE"
    echo ""
    echo "Next Steps:"
    echo "  1. Access SigNoz UI and verify dashboards"
    echo "  2. Configure alert rules for your services"
    echo "  3. Instrument your applications with OpenTelemetry"
    echo "  4. Set up custom dashboards for key metrics"
    echo ""
}

# Run main function
main
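The readiness counting above pipes `kubectl get pods` output through `grep`/`wc`; the same pipeline can be exercised without a cluster by substituting canned output (the pod names below are hypothetical, for illustration only):

```shell
# Simulated `kubectl get pods` output: a header row plus three pods.
sample="NAME            READY   STATUS    RESTARTS
signoz-0        1/1     Running   0
otel-collector  0/1     Pending   0
clickhouse-0    2/2     Running   0"

# Same counting logic as the script: Running pods with all containers ready,
# and the total excluding the header row.
ready=$(printf '%s\n' "$sample" | grep "Running" | grep -c "1/1\|2/2")
total=$(printf '%s\n' "$sample" | grep -v "NAME" | wc -l | tr -d ' ')
echo "$ready/$total"   # → 2/3
```

This is why the Pending pod is excluded: it matches neither `1/1` nor `2/2` in the READY column.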
9
infrastructure/namespaces/bakery-ia.yaml
Normal file
@@ -0,0 +1,9 @@
apiVersion: v1
kind: Namespace
metadata:
  name: bakery-ia
  labels:
    name: bakery-ia
    environment: local
    app.kubernetes.io/name: bakery-ia
    app.kubernetes.io/part-of: bakery-forecasting-platform
11
infrastructure/namespaces/flux-system.yaml
Normal file
@@ -0,0 +1,11 @@
# Flux System Namespace
# This namespace is required for Flux CD components
# It should be created before any Flux resources are applied

apiVersion: v1
kind: Namespace
metadata:
  name: flux-system
  labels:
    app.kubernetes.io/name: flux
    kubernetes.io/metadata.name: flux-system
7
infrastructure/namespaces/kustomization.yaml
Normal file
@@ -0,0 +1,7 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - bakery-ia.yaml
  - tekton-pipelines.yaml
  - flux-system.yaml
11
infrastructure/namespaces/tekton-pipelines.yaml
Normal file
@@ -0,0 +1,11 @@
apiVersion: v1
kind: Namespace
metadata:
  name: tekton-pipelines
  labels:
    app.kubernetes.io/name: tekton
    app.kubernetes.io/component: pipelines
    kubernetes.io/metadata.name: tekton-pipelines
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
@@ -0,0 +1,27 @@
# Create a root CA certificate for local development
# NOTE: This certificate must be ready before the local-ca-issuer can be used
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: local-ca-cert
  namespace: cert-manager  # This ensures the secret is created in the cert-manager namespace
spec:
  isCA: true
  commonName: bakery-ia-local-ca
  subject:
    organizationalUnits:
      - "Bakery IA Local CA"
    organizations:
      - "Bakery IA"
    countries:
      - "US"
  secretName: local-ca-key-pair
  privateKey:
    algorithm: ECDSA
    size: 256
  issuerRef:
    name: selfsigned-issuer
    kind: ClusterIssuer
    group: cert-manager.io
  duration: 8760h # 1 year
  renewBefore: 720h # 30 days
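cert-manager takes `duration` and `renewBefore` in hours; a quick sanity check of the values above with plain shell arithmetic (no cluster required):

```shell
duration_hours=8760      # duration: 8760h from the Certificate above
renew_before_hours=720   # renewBefore: 720h
echo "CA valid for $(( duration_hours / 24 )) days"        # → CA valid for 365 days
echo "renewal starts $(( renew_before_hours / 24 )) days before expiry"   # → 30 days
```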
23
infrastructure/platform/cert-manager/cert-manager.yaml
Normal file
@@ -0,0 +1,23 @@
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager
---
# NOTE: Do NOT define cert-manager ServiceAccounts here!
# The ServiceAccounts (cert-manager, cert-manager-cainjector, cert-manager-webhook)
# are created by the upstream cert-manager installation (kubernetes_restart.sh).
# Redefining them here would strip their RBAC bindings and break authentication.
---
# Self-signed ClusterIssuer for bootstrapping the CA certificate chain
# This issuer is used to create the root CA certificate which then
# becomes the issuer for all other certificates in the cluster
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned-issuer
spec:
  selfSigned: {}
---
# Cert-manager installation
# This will be installed via kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.2/cert-manager.yaml
# The actual installation is done via command line; this file documents the resources
@@ -0,0 +1,23 @@
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
spec:
  acme:
    # The ACME server URL (Let's Encrypt production)
    server: https://acme-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: admin@bakewise.ai
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-production
    # Enable the HTTP-01 challenge provider
    solvers:
      - http01:
          ingress:
            class: public
            podTemplate:
              spec:
                nodeSelector:
                  "kubernetes.io/os": linux
@@ -0,0 +1,24 @@
# Let's Encrypt Staging ClusterIssuer
# Use this for testing before switching to production
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    # The ACME server URL (Let's Encrypt staging)
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: admin@bakery-ia.local # Change this to your email
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-staging
    # Enable the HTTP-01 challenge provider
    solvers:
      - http01:
          ingress:
            class: public
            podTemplate:
              spec:
                nodeSelector:
                  "kubernetes.io/os": linux
9
infrastructure/platform/cert-manager/kustomization.yaml
Normal file
@@ -0,0 +1,9 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - cert-manager.yaml
  - ca-root-certificate.yaml
  - local-ca-issuer.yaml
  - cluster-issuer-staging.yaml
  - cluster-issuer-production.yaml
@@ -0,0 +1,7 @@
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: local-ca-issuer
spec:
  ca:
    secretName: local-ca-key-pair
@@ -0,0 +1,8 @@
# Self-signed ClusterIssuer for local development certificates
# This issuer can generate self-signed certificates without needing an external CA
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned-issuer
spec:
  selfSigned: {}
104
infrastructure/platform/gateway/gateway-service.yaml
Normal file
@@ -0,0 +1,104 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gateway
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: gateway
    app.kubernetes.io/component: gateway
    app.kubernetes.io/part-of: bakery-ia
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: gateway
      app.kubernetes.io/component: gateway
  template:
    metadata:
      labels:
        app.kubernetes.io/name: gateway
        app.kubernetes.io/component: gateway
    spec:
      containers:
        - name: gateway
          image: bakery/gateway:latest
          ports:
            - containerPort: 8000
              name: http
          envFrom:
            - configMapRef:
                name: bakery-config
            - secretRef:
                name: database-secrets
            - secretRef:
                name: redis-secrets
            - secretRef:
                name: rabbitmq-secrets
            - secretRef:
                name: jwt-secrets
            - secretRef:
                name: external-api-secrets
            - secretRef:
                name: payment-secrets
            - secretRef:
                name: email-secrets
            - secretRef:
                name: monitoring-secrets
            - secretRef:
                name: pos-integration-secrets
            - secretRef:
                name: whatsapp-secrets
          env:
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              valueFrom:
                configMapKeyRef:
                  name: bakery-config
                  key: OTEL_EXPORTER_OTLP_ENDPOINT
            - name: SIGNOZ_OTEL_COLLECTOR_URL
              valueFrom:
                configMapKeyRef:
                  name: bakery-config
                  key: SIGNOZ_OTEL_COLLECTOR_URL
          resources:
            requests:
              memory: "256Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 30
            timeoutSeconds: 10
            periodSeconds: 30
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 5
            timeoutSeconds: 5
            periodSeconds: 10
            failureThreshold: 3

---
apiVersion: v1
kind: Service
metadata:
  name: gateway-service
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: gateway
    app.kubernetes.io/component: gateway
spec:
  type: ClusterIP
  ports:
    - port: 8000
      targetPort: 8000
      protocol: TCP
      name: http
  selector:
    app.kubernetes.io/name: gateway
    app.kubernetes.io/component: gateway
5
infrastructure/platform/gateway/kustomization.yaml
Normal file
@@ -0,0 +1,5 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - gateway-service.yaml
45
infrastructure/platform/hpa/forecasting-hpa.yaml
Normal file
@@ -0,0 +1,45 @@
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: forecasting-service-hpa
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: forecasting-service
    app.kubernetes.io/component: autoscaling
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: forecasting-service
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100
          periodSeconds: 30
        - type: Pods
          value: 1
          periodSeconds: 60
      selectPolicy: Max
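The HPA above uses the standard Kubernetes utilization-based scaling rule; the desired replica count the controller computes can be sketched with plain shell arithmetic (the observed-utilization figure below is a made-up example, not a value from this cluster):

```shell
# desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)
current_replicas=1
current_cpu=140   # hypothetical observed average CPU utilization (%)
target_cpu=70     # averageUtilization target from the HPA spec above
desired=$(( (current_replicas * current_cpu + target_cpu - 1) / target_cpu ))
echo "$desired"   # → 2 (and always capped by maxReplicas: 3)
```

The `+ target_cpu - 1` term implements integer ceiling division, matching the controller's round-up behavior.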
45
infrastructure/platform/hpa/notification-hpa.yaml
Normal file
@@ -0,0 +1,45 @@
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: notification-service-hpa
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: notification-service
    app.kubernetes.io/component: autoscaling
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: notification-service
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100
          periodSeconds: 30
        - type: Pods
          value: 1
          periodSeconds: 60
      selectPolicy: Max
45
infrastructure/platform/hpa/orders-hpa.yaml
Normal file
@@ -0,0 +1,45 @@
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: orders-service-hpa
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: orders-service
    app.kubernetes.io/component: autoscaling
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: orders-service
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 50
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Percent
          value: 100
          periodSeconds: 30
        - type: Pods
          value: 1
          periodSeconds: 60
      selectPolicy: Max
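The three HPAs above all use the same scaling rule. As a quick sanity check, the core formula from the Kubernetes HPA documentation, `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`, can be evaluated with shell integer arithmetic (the 2-replica/95% figures below are illustrative, not from this repo):

```shell
# HPA formula: desired = ceil(current * currentUtilization / target).
# Integer ceiling division: (a + t - 1) / t.
current=2; util=95; target=70   # e.g. 2 replicas averaging 95% CPU vs the 70% target
echo $(( (current * util + target - 1) / target ))   # prints 3 (then capped by maxReplicas: 3)
```

At or below the 70% target the formula returns the current replica count, so no scaling occurs.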
198
infrastructure/platform/mail/mailu-helm/MIGRATION_GUIDE.md
Normal file
@@ -0,0 +1,198 @@
# Mailu Migration Guide: From Kustomize to Helm

This document outlines the migration process from the Kustomize-based Mailu deployment to the Helm-based deployment.

## Overview

The Mailu email server has been migrated from a Kustomize-based deployment to a Helm chart-based deployment. This change provides better maintainability, easier upgrades, and standardized configuration management.

## Key Changes

### 1. Service Names
- **Old**: `mailu-smtp`, `email-smtp`, `mailu-front`, `mailu-admin`, `mailu-imap`, `mailu-antispam`
- **New**: `mailu-postfix`, `mailu-front`, `mailu-admin`, `mailu-dovecot`, `mailu-rspamd`

### 2. Configuration Method
- **Old**: Individual YAML manifests with Kustomize overlays
- **New**: Helm chart with values files for environment-specific configuration

### 3. Directory Structure
- **Old**: `infrastructure/platform/mail/mailu/{base,overlays/{dev,prod}}`
- **New**: `infrastructure/platform/mail/mailu-helm/{dev,prod}`

### 4. Ingress Configuration
- **Old**: Ingress resources created as part of the Kustomize setup
- **New**: Built-in ingress disabled in the Helm chart to work with the existing ingress controller

## Ingress Configuration

The Mailu Helm chart has been configured to work with your existing ingress setup:

- **ingress.enabled: false**: Disables the chart's built-in Ingress creation
- **tlsFlavorOverride: notls**: Tells Mailu's internal NGINX not to enforce TLS, as your Ingress handles TLS termination
- **realIpHeader: X-Forwarded-For**: Ensures Mailu's NGINX logs and processes the correct client IPs from behind your Ingress
- **realIpFrom: 0.0.0.0/0**: Trusts all proxies (restrict to your Ingress pod CIDR for security)

### Required Ingress Resource

You need to create an Ingress resource to route traffic to Mailu. Here's an example:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mailu-ingress
  namespace: bakery-ia  # Same as Mailu's namespace
  annotations:
    kubernetes.io/ingress.class: nginx  # Or your Ingress class
    nginx.ingress.kubernetes.io/proxy-body-size: "100m"  # Allow larger email attachments
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"  # For long connections
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"  # Redirect HTTP to HTTPS
    # If using Cert-Manager: cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  tls:
    - hosts:
        - mail.bakery-ia.dev  # or mail.bakewise.ai for prod
      secretName: mail-tls-secret  # Your TLS Secret
  rules:
    - host: mail.bakery-ia.dev  # or mail.bakewise.ai for prod
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: mailu-front-http  # Mailu's front service (check with kubectl get svc -n bakery-ia)
                port:
                  number: 80
```

Apply it with `kubectl apply -f ingress.yaml`.

This routes all traffic from `https://mail.[domain]/` to Mailu's internal NGINX, which proxies to webmail (`/webmail`), admin (`/admin`), etc.

## Updated Service References

The following configurations have been updated to use the new Helm service names:

### Common ConfigMap
- `SMTP_HOST` changed from `email-smtp.bakery-ia.svc.cluster.local` to `mailu-postfix.bakery-ia.svc.cluster.local`

### SigNoz Configuration
- `signoz_smtp_host` changed from `email-smtp.bakery-ia.svc.cluster.local` to `mailu-postfix.bakery-ia.svc.cluster.local`
- `smtp_smarthost` changed from `email-smtp.bakery-ia.svc.cluster.local:587` to `mailu-postfix.bakery-ia.svc.cluster.local:587`

## Deployment Process

### Prerequisites
1. Helm 3.x installed
2. Access to the Kubernetes cluster
3. Namespace `bakery-ia` exists

### Deployment Commands

#### For Development:
```bash
# Add the Mailu Helm repository
helm repo add mailu https://mailu.github.io/helm-charts/
helm repo update

# Install Mailu for development
helm upgrade --install mailu-dev mailu/mailu \
  --namespace bakery-ia \
  --create-namespace \
  --values infrastructure/platform/mail/mailu-helm/values.yaml \
  --values infrastructure/platform/mail/mailu-helm/dev/values.yaml
```

#### For Production:
```bash
# Add the Mailu Helm repository
helm repo add mailu https://mailu.github.io/helm-charts/
helm repo update

# Install Mailu for production
helm upgrade --install mailu-prod mailu/mailu \
  --namespace bakery-ia \
  --create-namespace \
  --values infrastructure/platform/mail/mailu-helm/values.yaml \
  --values infrastructure/platform/mail/mailu-helm/prod/values.yaml
```

## Critical Configuration Preservation

All critical configurations from the original Kustomize setup have been preserved:

- Domain and hostname settings
- External SMTP relay configuration (Mailgun)
- Redis integration with the shared cluster
- Database connection settings
- TLS certificate management
- Resource limits and requests
- Network policies
- Storage configuration (10Gi PVC)

## Rollback Procedure

If rollback to the Kustomize setup is needed:

1. Uninstall the Helm release:
   ```bash
   helm uninstall mailu-dev -n bakery-ia  # or mailu-prod
   ```

2. Revert the configuration changes in `infrastructure/environments/common/configs/configmap.yaml` and `infrastructure/monitoring/signoz/signoz-values-prod.yaml`

3. Deploy the old Kustomize manifests:
   ```bash
   kubectl apply -k infrastructure/platform/mail/mailu/overlays/dev
   # or
   kubectl apply -k infrastructure/platform/mail/mailu/overlays/prod
   ```

## Verification Steps

After deployment, verify the following:

1. Check that all Mailu pods are running:
   ```bash
   kubectl get pods -n bakery-ia | grep mailu
   ```

2. Verify SMTP connectivity from other services:
   ```bash
   # Test from a pod in the same namespace
   kubectl run test-smtp --image=curlimages/curl -n bakery-ia --rm -it -- \
     nc -zv mailu-postfix.bakery-ia.svc.cluster.local 587
   ```

3. Check that the notification service can send emails:
   ```bash
   kubectl logs -n bakery-ia deployment/notification-service | grep -i smtp
   ```

4. Verify web interface accessibility:
   ```bash
   kubectl port-forward -n bakery-ia svc/mailu-front 8080:80
   # Then visit http://localhost:8080/admin
   ```

## Known Issues

1. Existing email data should be backed up before uninstalling the old deployment
2. DNS records may need to be updated to point to the new service endpoints
3. Some custom configurations may need to be reapplied after the Helm installation

## Support

For issues with the new Helm-based deployment:

1. Check the [official Mailu Helm chart documentation](https://github.com/Mailu/helm-charts)
2. Review the Helm release status: `helm status mailu-[dev|prod] -n bakery-ia`
3. Check pod logs: `kubectl logs -n bakery-ia deployment/[mailu-postfix|mailu-front|etc.]`
4. Verify network connectivity between services
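The guide's "update any hardcoded service references" step can be checked mechanically before cutting over. A sketch, assuming the repository layout described above and the retired service names listed in the guide:

```shell
# Search all manifests for the retired Kustomize-era Mailu service names.
# Exits quietly with a confirmation if nothing stale remains.
grep -rnE 'email-smtp|mailu-smtp|mailu-imap|mailu-antispam' \
  infrastructure/ --include='*.yaml' \
  || echo "no stale service references found"
```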
171
infrastructure/platform/mail/mailu-helm/README.md
Normal file
@@ -0,0 +1,171 @@
# Mailu Helm Chart for Bakery-IA

This directory contains the Helm chart configuration for Mailu, replacing the previous Kustomize-based setup.

## Overview

The Mailu email server is now deployed using the official Mailu Helm chart instead of Kustomize manifests, which provides better maintainability, easier upgrades, and standardized configuration. The setup runs behind your existing NGINX Ingress controller: the Ingress handles traffic routing and TLS termination, forwards to Mailu's internal NGINX over HTTP (port 80), and the internal NGINX in turn proxies to services such as webmail.

## Directory Structure

```
mailu-helm/
├── values.yaml          # Base configuration values
├── dev/
│   └── values.yaml      # Development-specific overrides
├── prod/
│   └── values.yaml      # Production-specific overrides
└── mailu-ingress.yaml   # Sample ingress configuration for use with the existing ingress
```

## Critical Configuration Preservation

The following critical configurations from the original Kustomize setup have been preserved:

- **Domain settings**: Domain and hostnames for both dev and prod
- **External relay**: Mailgun SMTP relay configuration
- **Redis integration**: Connection to the shared Redis cluster (database 15)
- **Database settings**: PostgreSQL connection details
- **Resource limits**: CPU and memory requests/limits matching the original setup
- **Network policies**: Security policies restricting access to authorized services
- **Storage**: 10Gi persistent volume for mail data
- **Ingress configuration**: Built-in ingress disabled to work with the existing ingress

## Deployment

### Prerequisites

1. Helm 3.x installed
2. Kubernetes cluster with a storage provisioner
3. Ingress controller (NGINX), already deployed in your cluster
4. Cert-manager for TLS certificates (optional, depending on your ingress setup)
5. External SMTP relay account (Mailgun)

### Deployment Commands

#### For Development:
```bash
helm repo add mailu https://mailu.github.io/helm-charts/
helm repo update
helm install mailu-dev mailu/mailu \
  --namespace bakery-ia \
  --create-namespace \
  --values mailu-helm/values.yaml \
  --values mailu-helm/dev/values.yaml
```

#### For Production:
```bash
helm repo add mailu https://mailu.github.io/helm-charts/
helm repo update
helm install mailu-prod mailu/mailu \
  --namespace bakery-ia \
  --create-namespace \
  --values mailu-helm/values.yaml \
  --values mailu-helm/prod/values.yaml
```

### Upgrading

To upgrade to a newer version of the Mailu Helm chart:
```bash
helm repo update
helm upgrade mailu-dev mailu/mailu \
  --namespace bakery-ia \
  --values mailu-helm/values.yaml \
  --values mailu-helm/dev/values.yaml
```

## Ingress Configuration

The Mailu Helm chart is configured to work with your existing Ingress setup:

- **ingress.enabled: false**: Disables the chart's built-in Ingress creation
- **tlsFlavorOverride: notls**: Tells Mailu's internal NGINX not to enforce TLS, as your Ingress handles TLS termination
- **realIpHeader: X-Forwarded-For**: Ensures Mailu's NGINX logs and processes the correct client IPs from behind your Ingress
- **realIpFrom: 0.0.0.0/0**: Trusts all proxies (restrict to your Ingress pod CIDR for security)

### Required Ingress Resource

You need to create an Ingress resource to route traffic to Mailu. Here's an example:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mailu-ingress
  namespace: bakery-ia  # Same as Mailu's namespace
  annotations:
    kubernetes.io/ingress.class: nginx  # Or your Ingress class
    nginx.ingress.kubernetes.io/proxy-body-size: "100m"  # Allow larger email attachments
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"  # For long connections
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"  # Redirect HTTP to HTTPS
    # If using Cert-Manager: cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  tls:
    - hosts:
        - mail.bakery-ia.dev  # or mail.bakewise.ai for prod
      secretName: mail-tls-secret  # Your TLS Secret
  rules:
    - host: mail.bakery-ia.dev  # or mail.bakewise.ai for prod
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: mailu-front-http  # Mailu's front service (check with kubectl get svc -n bakery-ia)
                port:
                  number: 80
```

Apply it with `kubectl apply -f mailu-ingress.yaml`.

This routes all traffic from `https://mail.[domain]/` to Mailu's internal NGINX, which proxies to webmail (`/webmail`), admin (`/admin`), etc.

## Configuration Details

### Environment-Specific Values

- **Development** (`dev/values.yaml`):
  - Domain: `bakery-ia.dev`
  - No TLS enforcement internally (handled by the ingress)
  - Antivirus disabled to save resources
  - Debug logging level

- **Production** (`prod/values.yaml`):
  - Domain: `bakewise.ai`
  - No TLS enforcement internally (handled by the ingress)
  - Antivirus enabled
  - Warning logging level

### Secrets Management

Sensitive values such as passwords and API keys should be managed through Kubernetes secrets rather than stored in the values files. The Helm chart supports referencing existing secrets for:

- Database passwords
- Redis passwords
- External relay credentials
- The Mailu secret key
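When filling `data:` fields of these secrets by hand, a quick round-trip check of the base64 encoding (the sample value comes from the Mailgun secret's own comments; `stringData` fields skip this step entirely):

```shell
# Encode a secret value, then decode it back to confirm the round trip.
USERNAME_B64=$(echo -n 'postmaster@bakewise.ai' | base64)
echo "$USERNAME_B64"                  # cG9zdG1hc3RlckBiYWtld2lzZS5haQ==
echo -n "$USERNAME_B64" | base64 -d   # postmaster@bakewise.ai
```

Note the `-n` on `echo`: a trailing newline would silently change the encoded value.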
## Integration with Notification Service

The notification service continues to connect to Mailu via the internal service name `mailu-postfix.bakery-ia.svc.cluster.local` on port 587 with STARTTLS.

## Access Information

- **Admin Panel**: `https://mail.[domain]/admin`
- **Webmail**: `https://mail.[domain]/webmail`
- **SMTP**: `mail.[domain]:587` (STARTTLS), exposed via separate TCP services if needed
- **IMAP**: `mail.[domain]:993` (SSL/TLS), exposed via separate TCP services if needed

## Migration Notes

When migrating from the Kustomize setup to Helm:

1. Ensure all existing PVCs are preserved during migration
2. Export any existing mail data before migration if needed
3. Update any hardcoded service references in other deployments
4. Verify that network policies still allow the necessary communications
5. Configure your existing ingress to route traffic to the Mailu services
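The "separate TCP services" note in the README's access list refers to ingress-nginx's TCP passthrough feature. A minimal sketch, assuming the controller runs in an `ingress-nginx` namespace and was started with `--tcp-services-configmap` pointing at this ConfigMap (the namespace and ConfigMap name here are illustrative; the service names come from this repo):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-services
  namespace: ingress-nginx
data:
  # external port -> namespace/service:port
  "587": "bakery-ia/mailu-postfix:587"   # SMTP submission (STARTTLS)
  "993": "bakery-ia/mailu-dovecot:993"   # IMAPS
```

The corresponding ports must also be opened on the controller's Service (e.g. as additional `ports` entries on its LoadBalancer).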
@@ -0,0 +1,38 @@
# CoreDNS ConfigMap patch to forward external DNS queries to Unbound for DNSSEC validation
# This is required for Mailu Admin, which needs a DNSSEC-validating DNS resolver
#
# Apply with: kubectl apply -f coredns-unbound-patch.yaml
# Then restart CoreDNS: kubectl rollout restart deployment coredns -n kube-system
#
# Note: The Unbound service IP (10.104.127.213) may change when the cluster is recreated.
# The setup script will automatically update this based on the actual Unbound service IP.
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus :9153
        forward . UNBOUND_SERVICE_IP {
            max_concurrent 1000
        }
        cache 30 {
            disable success cluster.local
            disable denial cluster.local
        }
        loop
        reload
        loadbalance
    }
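The `UNBOUND_SERVICE_IP` placeholder above has to be substituted before applying. A sketch of what the setup script mentioned in the comments presumably does; the `unbound-dns` service name and the `10.96.53.53` fallback come from the dev values in this repo, and the kubectl steps need a live cluster:

```shell
# Resolve the Unbound ClusterIP (fall back to the static dev IP), then
# render the patch with the placeholder substituted.
UNBOUND_IP=$(kubectl get svc unbound-dns -n bakery-ia \
  -o jsonpath='{.spec.clusterIP}' 2>/dev/null || echo "10.96.53.53")
sed "s/UNBOUND_SERVICE_IP/${UNBOUND_IP}/" coredns-unbound-patch.yaml > coredns-patched.yaml
# Then: kubectl apply -f coredns-patched.yaml
# And:  kubectl rollout restart deployment coredns -n kube-system
```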
@@ -0,0 +1,94 @@
# Mailgun SMTP Credentials Secret for Mailu
#
# This secret stores Mailgun credentials for outbound email relay.
# Mailu uses Mailgun as an external SMTP relay to send all outbound emails.
#
# ============================================================================
# HOW TO CONFIGURE:
# ============================================================================
#
# 1. Go to https://www.mailgun.com and create an account
#
# 2. Add and verify your domain:
#    - For dev: bakery-ia.dev
#    - For prod: bakewise.ai
#
# 3. Go to Domain Settings > SMTP credentials in the Mailgun dashboard
#
# 4. Note your SMTP credentials:
#    - SMTP hostname: smtp.mailgun.org
#    - Port: 587 (TLS/STARTTLS)
#    - Username: typically postmaster@yourdomain.com
#    - Password: your Mailgun SMTP password (NOT the API key)
#
# 5. Base64 encode your credentials:
#    echo -n 'postmaster@bakewise.ai' | base64
#    echo -n 'your-mailgun-smtp-password' | base64
#
# 6. Replace the placeholder values below with your encoded credentials
#
# 7. Apply this secret:
#    kubectl apply -f mailgun-credentials-secret.yaml -n bakery-ia
#
# ============================================================================
# IMPORTANT NOTES:
# ============================================================================
#
# - Use the SMTP password from Mailgun, NOT the API key
# - The username format is: postmaster@yourdomain.com
# - For sandbox domains, Mailgun requires adding authorized recipients
# - Production domains need DNS verification (SPF, DKIM records)
#
# ============================================================================
# DNS RECORDS REQUIRED FOR MAILGUN:
# ============================================================================
#
# Add these DNS records to your domain for proper email delivery:
#
# 1. SPF Record (TXT):
#    Name: @
#    Value: v=spf1 include:mailgun.org ~all
#
# 2. DKIM Records (TXT):
#    Mailgun will provide two DKIM keys to add as TXT records
#    (check your Mailgun domain settings for exact values)
#
# 3. MX Records (optional, only if receiving via Mailgun):
#    Priority 10: mxa.mailgun.org
#    Priority 10: mxb.mailgun.org
#
# ============================================================================
---
apiVersion: v1
kind: Secret
metadata:
  name: mailu-mailgun-credentials
  namespace: bakery-ia
  labels:
    app: mailu
    component: external-relay
  annotations:
    description: "Mailgun SMTP credentials for Mailu external relay"
type: Opaque
stringData:
  # ============================================================================
  # REPLACE THESE VALUES WITH YOUR MAILGUN CREDENTIALS
  # ============================================================================
  #
  # Option 1: Use stringData (plain text - Kubernetes will encode automatically)
  # This is easier for initial setup but shows credentials in the file
  #
  RELAY_USERNAME: "postmaster@sandboxc1bff891532b4f0c83056a68ae080b4c.mailgun.org"
  RELAY_PASSWORD: "2e47104abadad8eb820d00042ea6d5eb-77c6c375-89c7ea55"
  #
  # ============================================================================
  # ALTERNATIVE: Use pre-encoded values (more secure for version control)
  # ============================================================================
  # Comment out stringData above and uncomment data below:
  #
  # data:
  #   # Base64 encoded values
  #   # echo -n 'postmaster@bakewise.ai' | base64
  #   RELAY_USERNAME: cG9zdG1hc3RlckBiYWtld2lzZS5haQ==
  #   # echo -n 'your-password' | base64
  #   RELAY_PASSWORD: WU9VUl9NQUlMR1VOX1NNVFBfUEFTU1dPUkQ=
@@ -0,0 +1,34 @@
# Mailu Admin Credentials Secret
# This secret stores the initial admin account password for Mailu
#
# The password is used by the Helm chart's initialAccount feature to create
# the admin user automatically during deployment.
#
# IMPORTANT: Replace the base64-encoded password before applying!
#
# To generate a secure password and encode it:
#   PASSWORD=$(openssl rand -base64 16 | tr -d '/+=' | head -c 16)
#   echo -n "$PASSWORD" | base64
#
# To apply this secret:
#   kubectl apply -f mailu-admin-credentials-secret.yaml -n bakery-ia
#
# After deployment, you can log in to the Mailu admin panel at:
#   https://mail.<domain>/admin
#   Username: admin@<domain>
#   Password: <the password you set>
#
apiVersion: v1
kind: Secret
metadata:
  name: mailu-admin-credentials
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: mailu
    app.kubernetes.io/component: admin
type: Opaque
data:
  # Base64-encoded password
  # Example: "changeme123" = Y2hhbmdlbWUxMjM=
  # IMPORTANT: Replace with your own secure password!
  password: "Y2hhbmdlbWUxMjM="
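Combining the manifest's own comment commands into one runnable step, a sketch that produces both the plain password and the base64 value for the `password` field above:

```shell
# Generate a random password and its base64 form for the secret's data field.
PASSWORD=$(openssl rand -base64 16 | tr -d '/+=' | head -c 16)
ENCODED=$(echo -n "$PASSWORD" | base64)
echo "plain:   $PASSWORD"
echo "encoded: $ENCODED"
```

Paste the encoded value into `data.password` (or skip encoding entirely by using a `stringData:` field, as the Mailgun secret demonstrates).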
@@ -0,0 +1,26 @@
# Self-signed TLS certificate secret for Mailu Front
# This is required by the Mailu Helm chart even when TLS is disabled (tls.flavor: notls)
# The Front pod mounts this secret for internal certificate handling
#
# For production, replace with proper certificates from cert-manager or Let's Encrypt.
# The commands below generate a self-signed certificate valid for 365 days.
#
# To regenerate manually:
#   openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
#     -keyout tls.key -out tls.crt \
#     -subj "/CN=mail.bakery-ia.dev/O=bakery-ia"
#   kubectl create secret tls mailu-certificates \
#     --cert=tls.crt --key=tls.key -n bakery-ia
apiVersion: v1
kind: Secret
metadata:
  name: mailu-certificates
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: mailu
    app.kubernetes.io/component: certificates
type: kubernetes.io/tls
data:
  # Generated certificate for mail.bakery-ia.dev
  tls.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURRekNDQWl1Z0F3SUJBZ0lVVWg1Rlg5cWlPRDdkc2FmVi9KemlKWWh1WUZJd0RRWUpLb1pJaHZjTkFRRUwKQlFBd01URWJNQmtHQTFVRUF3d1NiV0ZwYkM1aVlXdGxjbmt0YVdFdVpHVjJNUkl3RUFZRFZRUUtEQWxDWVd0bApjbmtnU1VFd0hoY05Nall3TVRFNU1qQTBOakkwV2hjTk1qY3dNVEU1TWpBME5qSTBXakF4TVJzd0dRWURWUVFECkRCSnRZV2xzTG1KaGEyVnllUzFwWVM1a1pYWXhFakFRQmdOVkJBb01DVUpoYTJWeWVTQkpRVENDQVNJd0RRWUoKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTDJlbXM2YW5DSjV5N0JQNm9KdTQ2TldQSXJ3Zlg3Mgp3WmgxZERJaVlIMmNsalBESldsb3ROU0JFTngxUkZZSEc3Z0VSRVk1MHpFQ3UwSC9Vc0YzRFlPTFhobkYwdVRXCkNSTmJFRjFoYjZNT2lqanVmOWJHKzdsVkJ5NmZkMXZRTzJpOTA1VktxRTdEZllraWIwVkpxN0duVUo5RWFtOFgKSWxTaUphY1F6Mm11WXd6QjBPN3hZeVV3VFFWTDcvSnRNTWs5ZjZDY1ZENXFRMGJuWEJNM2hqcVVGWTlnbEF5dApZZHBUUUhPdms1WXgrZk1nL2JZVlBjQ0VhZFhVVkhBdHoxYlJybGIwenlMc3FXeHd2OXlWN0pCM210TkNmbFdsCkRCWWRIb3J0ZlROTHVSNFhhRTNXT2pnbzkwT1ltbi9PYll6Mld0SXUwMnp5MkhrTnBNYUFvVmtDQXdFQUFhTlQKTUZFd0hRWURWUjBPQkJZRUZMS2hPc254WnpXQ1RyMFFuSTdjaE1hbWtTb2pNQjhHQTFVZEl3UVlNQmFBRkxLaApPc254WnpXQ1RyMFFuSTdjaE1hbWtTb2pNQThHQTFVZEV3RUIvd1FGTUFNQkFmOHdEUVlKS29aSWh2Y05BUUVCCkJRQURnZ0VCQUFMQ3hGV1VnY3Z3ZVpoRjFHdlNnR3R3VW9WakJtcG1GYnFPMC93S2lqMlhDRmZ6L0FqanZaOHMKOGVIUEc5Z3crbjlpaGNSN016Q2V5ZldRd1FsaTBXZkcySzBvUDFGeUxoYU9aMlhtdU9nNnhNRG5EVzBVZWtqMwpCYWdHc3RFVXpqQlR1UlJ3WS9uck5vb1ZCOVFoYnhoeW9mbXkrVzVmczhZMDNTZG9paTFpWG1iSEhaemMyL21ICmF2UDE0Z3BzWUNDZVl6aklyWm05WWE4Rzhpc2tYelNnZU0vSEhpRzhJOWhKRkJYaHRYYWRjeGkvbU5hNHRKcWgKM1crTEIzaEQ4NFVkZ3MrR3pCZ0hHdnIwdWxMMTQvaUxVRXFySXZaWjN2VTlvNlZ4MlBvRjQ3cjBQNXpOZXVTNwpkRk5xT3JJT2phSm5yMXFVb0tMeWd3RUhqdVRNbUk0PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
  tls.key: LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0tLS0tCk1JSUV2UUlCQURBTkJna3Foa2lHOXcwQkFRRUZBQVNDQktjd2dnU2pBZ0VBQW9JQkFRQzlucHJPbXB3aWVjdXcKVCtxQ2J1T2pWanlLOEgxKzlzR1lkWFF5SW1COW5KWXp3eVZwYUxUVWdSRGNkVVJXQnh1NEJFUkdPZE14QXJ0QgovMUxCZHcyRGkxNFp4ZExrMWdrVFd4QmRZVytqRG9vNDduL1d4dnU1VlFjdW4zZGIwRHRvdmRPVlNxaE93MzJKCkltOUZTYXV4cDFDZlJHcHZGeUpVb2lXbkVNOXBybU1Nd2REdThXTWxNRTBGUysveWJUREpQWCtnbkZRK2FrTkcKNTF3VE40WTZsQldQWUpRTXJXSGFVMEJ6cjVPV01mbnpJUDIyRlQzQWhHblYxRlJ3TGM5VzBhNVc5TThpN0tscwpjTC9jbGV5UWQ1clRRbjVWcFF3V0hSNks3WDB6UzdrZUYyaE4xam80S1BkRG1KcC96bTJNOWxyU0x0TnM4dGg1CkRhVEdnS0ZaQWdNQkFBRUNnZ0VBSW51TFQzTVNYYnFrYmdXNmNjblVuOGw0N1JOYTN4SGtsdU1WSkdEWUJ6L0kKbU5VdUlvTW1EMWNCUi9ZVFhVbWhvczh6MDBtRXZHN3d1c25CdE9qL2ppSjBGRi9EUUZZa0JGOFZGTVk1VlArNQo1eXlJRnZqTW9pRnlVdW93L0lOYnFtcUs1YVZVQWk3T3ozZHhvTG9LL1IyZUxiaDFXb3BzZGRPZTRValBUenBVCnU1TVl4NXlMVnVZc1A3U09TSHRrd2UvMDN5RFJLckl2V3k1QlBtYzJRVEhUcEJPVUJHNC9DcFJWR1ozZjhLa0QKN2QrNlZlNzd1TWV1eERPOG1HZ1paNTRpd0NuMStYR2NFcVFVR1Z1WngrcVpodVhTZks0ajR3eWVtbndlRUFCdgptTlNZSXQ2OG91SSs0cEFyV1ZONEFjaXhWRUxIV1d6MDRYTm56WFUyNFFLQmdRRDBlc0JZenVkRzJaU2t5SWJRCnU4SXhwT2RzRjRnU1lpekNkMTNLQktGbm9obTFrVzlYemRmS3ZObnJxcFRPRnJIYkRXUTdpaUhKM2NqVjlBVTUKTlEwMVUzWXY0SzhkdWtnb2MvRUFhbnQvRjhvMG5qc0pJZ2Z2WTFuUHNPVFVFcGtRQk1QSGpraGpyM3FBNkh4dgp4b0I2OEdVdU1OVHRkQitBV0Y0dXR1T2JoUUtCZ1FER2pnNmJnbGpXRVR4TnhBalQxc21reEpzQ2xmbFFteXRmCmNiaDVWempzdGNad2lLSjh1b0xjT0d4b05HWDJIaGJRQU5wRWhUR3FuMEZIbGxFc1BYbXBoeUJVY01JUFZTWEkKRUlLeU9kL3ZMYjhjWG9ydDZMaDNNS0FoakVLbExENVZOcDhXbVlQM3dCVE1ia3BrM0NDdWxDSEJLcEJXV2Y2NgpQWFp0RUZKa3hRS0JnQjNSTHM1bUJhME5jbVNhbEY2MjE1dG9hbFV6bFlQd2QxY01hZUx1cDZUVkQxK21xamJDClF6UlZ6aHBCQnI4UDQ0YzgzZUdwR2kvZG5kWUNXZlM5Tkt3eFRyUE9LbTFzdjhvM1FjaDBORFd1K0Jsc3h2UjUKTXhDT1JIRGhPVGRvUVVURDRBRGhxSkNINFdBQmV0UERHUDVsZldHaDBRWlk2RktsOUc2c0haeGxBb0dBWnlLLwpIN1B6WlM2S3ZuSkhpNUlVSjh3Z0lKVzZiVTVNbDBWQTUzYVJFUlBTd2YyWE9XYkFOcGZ3WjZoZ0ZobkhDOENGCm4vWDN1SU1FcTZTL0FWWGxibFBNVFZCTTNSNERoQXBmZVNocTA1aFZudXpWQ1lOSzNrNlp2eE5XUXVuYWJ2VHkKYWhEUDVjOFdmcUlEYnFTUkxWMndzdC9qSFplZG95dnQ2ZlVDZDJrQ2dZRUFsbzRZelRabC8vays0WGlpeHVMQQpnZ2ZieTBoS3M1QWlLcFY0Q3pVZVE1Y0tZT2k5SXpvQzJMckxTWCtVckgvd0w3MGdCRzZneUNSZ1dLaW1RbmFWCnRZTy8xM1NyUFVnbm51R2o2Q0I1YUVreXYyTGFPVmV2WEZFcmlFbWQ1cWJKSXJYMENmZ1FuRnI2dm5RZDRwUFMKOGRVMkdhaDRiNVdNSjVJdzgwU3BjR0k9Ci0tLS0tRU5EIFBSSVZBVEUgS0VZLS0tLS0K
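A runnable version of the regeneration commands from the manifest's comments, ending with a subject check; the `kubectl create secret tls` step from those same comments then recreates the secret:

```shell
# Generate a new self-signed key/cert pair for the Mailu front (365 days).
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout tls.key -out tls.crt \
  -subj "/CN=mail.bakery-ia.dev/O=bakery-ia" 2>/dev/null

# Confirm the subject before recreating the mailu-certificates secret.
openssl x509 -in tls.crt -noout -subject
```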
171
infrastructure/platform/mail/mailu-helm/dev/values.yaml
Normal file
@@ -0,0 +1,171 @@
# Development-tuned Mailu configuration
global:
  # Using Unbound DNS for DNSSEC validation (required by Mailu admin)
  # Unbound service is available at unbound-dns.bakery-ia.svc.cluster.local
  # Static ClusterIP configured in unbound-helm/values.yaml
  custom_dns_servers: "10.96.53.53"  # Unbound DNS static ClusterIP

# Redis configuration - use built-in Mailu Redis (no authentication needed)
externalRedis:
  enabled: false

# Component-specific DNS configuration
# Admin requires DNSSEC validation - use Unbound DNS (forwards cluster.local to kube-dns)
admin:
  dnsPolicy: "None"
  dnsConfig:
    nameservers:
      - "10.96.53.53"  # Unbound DNS static ClusterIP (forwards cluster.local to kube-dns)
    searches:
      - "bakery-ia.svc.cluster.local"
      - "svc.cluster.local"
      - "cluster.local"
    options:
      - name: ndots
        value: "5"

# RSPAMD needs Unbound for DNSSEC validation (DKIM/SPF/DMARC checks)
# Using ClusterFirst with search domains + Kubernetes DNS, which can forward to Unbound
rspamd:
  dnsPolicy: "ClusterFirst"

# Domain configuration for dev
# NOTE: Using the .dev TLD instead of .local because the email-validator library
# rejects .local domains as "special-use or reserved names" (RFC 6761)
domain: "bakery-ia.dev"
hostnames:
  - "mail.bakery-ia.dev"

# Initial admin account for the dev environment
# Password is stored in the mailu-admin-credentials secret
initialAccount:
  enabled: true
  username: "admin"
  domain: "bakery-ia.dev"
  existingSecret: "mailu-admin-credentials"
  existingSecretPasswordKey: "password"
  mode: "ifmissing"

# External relay configuration for dev (Mailgun)
# All outbound emails will be relayed through Mailgun SMTP
# To configure:
# 1. Register at mailgun.com and verify your domain (bakery-ia.dev)
# 2. Get your SMTP credentials from the Mailgun dashboard
# 3. Update the secret in configs/mailgun-credentials-secret.yaml
# 4. Apply the secret: kubectl apply -f configs/mailgun-credentials-secret.yaml -n bakery-ia
externalRelay:
  host: "[smtp.mailgun.org]:587"
  # Credentials loaded from Kubernetes secret
  secretName: "mailu-mailgun-credentials"
  usernameKey: "RELAY_USERNAME"
  passwordKey: "RELAY_PASSWORD"

# Environment-specific configurations
persistence:
  enabled: true
  # Development: use default storage class
  storageClass: "standard"
  size: "5Gi"

# Resource optimizations for development
resources:
  admin:
    requests:
      cpu: "100m"
      memory: "128Mi"
    limits:
      cpu: "500m"
      memory: "256Mi"
  front:
    requests:
      cpu: "50m"
      memory: "64Mi"
    limits:
      cpu: "200m"
      memory: "128Mi"
  postfix:
    requests:
      cpu: "100m"
      memory: "128Mi"
    limits:
      cpu: "300m"
      memory: "256Mi"
  dovecot:
    requests:
      cpu: "100m"
      memory: "128Mi"
    limits:
      cpu: "300m"
      memory: "256Mi"
  rspamd:
    requests:
      cpu: "50m"
      memory: "64Mi"
    limits:
      cpu: "200m"
      memory: "128Mi"
  webmail:
    requests:
      cpu: "50m"
      memory: "64Mi"
    limits:
      cpu: "200m"
      memory: "128Mi"
  clamav:
    requests:
      cpu: "100m"
      memory: "256Mi"
    limits:
      cpu: "300m"
      memory: "512Mi"

replicaCount: 1  # Single replica for development

# Security settings
secretKey: "generate-strong-key-here-for-development"

# Ingress configuration for development - disabled to use the existing ingress
ingress:
  enabled: false  # Disable chart's Ingress; use the existing one
  tls: false  # Disable TLS in the chart since the ingress handles it
  tlsFlavorOverride: notls  # No TLS on internal NGINX; the external proxy handles TLS
  realIpHeader: X-Forwarded-For  # Header for client IP from your Ingress
  realIpFrom: 0.0.0.0/0  # Trust all proxies (restrict to your Ingress pod CIDR for security)
  path: /
  pathType: ImplementationSpecific

# TLS flavor for dev (may use self-signed)
tls:
  flavor: "notls"  # Disable TLS for development

# Welcome message (disabled in dev)
welcomeMessage:
  enabled: false

# Log level for dev
logLevel: "DEBUG"

# Development-specific overrides
env:
  DEBUG: "true"
  LOG_LEVEL: "INFO"

# Disable or simplify monitoring in development
monitoring:
  enabled: false
# Network Policy for dev
|
||||
networkPolicy:
|
||||
enabled: true
|
||||
ingressController:
|
||||
namespace: ingress-nginx
|
||||
podSelector: |
|
||||
matchLabels:
|
||||
app.kubernetes.io/name: ingress-nginx
|
||||
app.kubernetes.io/instance: ingress-nginx
|
||||
app.kubernetes.io/component: controller
|
||||
monitoring:
|
||||
namespace: monitoring
|
||||
podSelector: |
|
||||
matchLabels:
|
||||
app: signoz-prometheus
|
||||
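The `podSelector` values above are multi-line YAML strings that the chart splices into a NetworkPolicy. A sketch of the ingress rule they presumably render to (the exact field layout is assumed here, not taken from the chart source, and the policy name is hypothetical):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mailu-allow-ingress   # hypothetical name
  namespace: bakery-ia
spec:
  podSelector: {}
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
          podSelector:
            matchLabels:
              app.kubernetes.io/name: ingress-nginx
              app.kubernetes.io/instance: ingress-nginx
              app.kubernetes.io/component: controller
```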
31
infrastructure/platform/mail/mailu-helm/mailu-ingress.yaml
Normal file
@@ -0,0 +1,31 @@
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mailu-ingress
  namespace: bakery-ia
  labels:
    app.kubernetes.io/name: mailu
    app.kubernetes.io/component: ingress
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "100m"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - mail.bakery-ia.dev
      secretName: bakery-dev-tls-cert
  rules:
    - host: mail.bakery-ia.dev
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: mailu-front  # Helm release name 'mailu' + component 'front'
                port:
                  number: 80
164
infrastructure/platform/mail/mailu-helm/prod/values.yaml
Normal file
@@ -0,0 +1,164 @@
# Production-tuned Mailu configuration
global:
  # Use Kubernetes cluster DNS for name resolution
  custom_dns_servers: "10.96.0.10"  # Kubernetes cluster DNS IP

# Redis configuration - use the built-in Mailu Redis (no authentication needed internally)
externalRedis:
  enabled: false

# DNS configuration for production
# Use Kubernetes DNS (ClusterFirst), which forwards to Unbound via CoreDNS
# This is configured automatically by the mailu-helm Tilt resource
admin:
  dnsPolicy: "ClusterFirst"

rspamd:
  dnsPolicy: "ClusterFirst"

# Domain configuration for production
domain: "bakewise.ai"
hostnames:
  - "mail.bakewise.ai"

# Initial admin account for the production environment
# The password is stored in the mailu-admin-credentials secret
initialAccount:
  enabled: true
  username: "admin"
  domain: "bakewise.ai"
  existingSecret: "mailu-admin-credentials"
  existingSecretPasswordKey: "password"
  mode: "ifmissing"

# External relay configuration for production (Mailgun)
# All outbound email is relayed through Mailgun SMTP.
# To configure:
#   1. Register at mailgun.com and verify your domain (bakewise.ai)
#   2. Get your SMTP credentials from the Mailgun dashboard
#   3. Update the secret in configs/mailgun-credentials-secret.yaml
#   4. Apply the secret: kubectl apply -f configs/mailgun-credentials-secret.yaml -n bakery-ia
externalRelay:
  host: "[smtp.mailgun.org]:587"
  # Credentials loaded from a Kubernetes secret
  secretName: "mailu-mailgun-credentials"
  usernameKey: "RELAY_USERNAME"
  passwordKey: "RELAY_PASSWORD"

# Environment-specific configuration
persistence:
  enabled: true
  # Production: use microk8s-hostpath or longhorn
  storageClass: "longhorn"  # Assuming Longhorn is available in production
  size: "20Gi"              # Larger storage for production email volume

# Resource allocations for production
resources:
  admin:
    requests:
      cpu: "200m"
      memory: "256Mi"
    limits:
      cpu: "1"
      memory: "512Mi"
  front:
    requests:
      cpu: "100m"
      memory: "128Mi"
    limits:
      cpu: "500m"
      memory: "256Mi"
  postfix:
    requests:
      cpu: "200m"
      memory: "256Mi"
    limits:
      cpu: "1"
      memory: "512Mi"
  dovecot:
    requests:
      cpu: "200m"
      memory: "256Mi"
    limits:
      cpu: "1"
      memory: "512Mi"
  rspamd:
    requests:
      cpu: "100m"
      memory: "128Mi"
    limits:
      cpu: "500m"
      memory: "256Mi"
  clamav:
    requests:
      cpu: "200m"
      memory: "512Mi"
    limits:
      cpu: "1"
      memory: "1Gi"

replicaCount: 1  # Can be increased in production as needed

# Security settings
secretKey: "generate-strong-key-here-for-production"

# Ingress configuration for production - disabled in favor of the existing ingress
ingress:
  enabled: false             # Disable the chart's Ingress; use the existing one
  tls: false                 # Disable TLS in the chart since the ingress handles it
  tlsFlavorOverride: notls   # No TLS on the internal NGINX; the external proxy terminates TLS
  realIpHeader: X-Forwarded-For  # Header carrying the client IP from your Ingress
  realIpFrom: 0.0.0.0/0      # Trust all proxies (restrict to your Ingress pod CIDR for security)
  path: /
  pathType: ImplementationSpecific

# TLS flavor for production (uses Let's Encrypt)
tls:
  flavor: "cert"

# Welcome message (enabled in production)
welcomeMessage:
  enabled: true
  subject: "Welcome to Bakewise.ai Email Service"
  body: "Welcome to our email service. Please change your password and update your profile."

# Log level for production
logLevel: "WARNING"

# Enable antivirus in production
antivirus:
  enabled: true
  flavor: "clamav"

# Production-specific settings
env:
  DEBUG: "false"
  LOG_LEVEL: "WARNING"
  TLS_FLAVOR: "cert"
  REDIS_PASSWORD: "secure-redis-password"

# Enable monitoring in production
monitoring:
  enabled: true

# Production-specific security settings
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 1000

# Network policies for production
networkPolicy:
  enabled: true
  ingressController:
    namespace: ingress-nginx
    podSelector: |
      matchLabels:
        app.kubernetes.io/name: ingress-nginx
        app.kubernetes.io/instance: ingress-nginx
        app.kubernetes.io/component: controller
  monitoring:
    namespace: monitoring
    podSelector: |
      matchLabels:
        app: signoz-prometheus
269
infrastructure/platform/mail/mailu-helm/scripts/deploy-mailu-prod.sh
Executable file
@@ -0,0 +1,269 @@
#!/bin/bash
# =============================================================================
# Mailu Production Deployment Script
# =============================================================================
# This script automates the deployment of the Mailu mail server for production.
# It handles:
#   1. Unbound DNS deployment (for DNSSEC validation)
#   2. CoreDNS configuration (forward to Unbound)
#   3. TLS certificate secret creation
#   4. Admin credentials secret creation
#   5. Mailu Helm deployment (admin user created automatically via initialAccount)
#
# Usage:
#   ./deploy-mailu-prod.sh [--domain DOMAIN] [--admin-password PASSWORD]
#
# Example:
#   ./deploy-mailu-prod.sh --domain bakewise.ai --admin-password 'SecurePass123!'
# =============================================================================

set -e

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Default values
DOMAIN="${DOMAIN:-bakewise.ai}"
ADMIN_PASSWORD="${ADMIN_PASSWORD:-}"
NAMESPACE="bakery-ia"
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
MAILU_HELM_DIR="$(dirname "$SCRIPT_DIR")"

# Parse arguments
while [[ $# -gt 0 ]]; do
  case $1 in
    --domain)
      DOMAIN="$2"
      shift 2
      ;;
    --admin-password)
      ADMIN_PASSWORD="$2"
      shift 2
      ;;
    --help)
      echo "Usage: $0 [--domain DOMAIN] [--admin-password PASSWORD]"
      echo ""
      echo "Options:"
      echo "  --domain          Domain for Mailu (default: bakewise.ai)"
      echo "  --admin-password  Password for the admin@DOMAIN user"
      echo ""
      exit 0
      ;;
    *)
      echo -e "${RED}Unknown option: $1${NC}"
      exit 1
      ;;
  esac
done

print_step() {
  echo -e "\n${BLUE}==>${NC} ${GREEN}$1${NC}"
}

print_warning() {
  echo -e "${YELLOW}WARNING:${NC} $1"
}

print_error() {
  echo -e "${RED}ERROR:${NC} $1"
}

print_success() {
  echo -e "${GREEN}✓${NC} $1"
}

# =============================================================================
# Step 0: Prerequisites Check
# =============================================================================
print_step "Step 0: Checking prerequisites..."

if ! command -v kubectl &> /dev/null; then
  print_error "kubectl not found. Please install kubectl."
  exit 1
fi

if ! command -v helm &> /dev/null; then
  print_error "helm not found. Please install helm."
  exit 1
fi

if ! kubectl get namespace "$NAMESPACE" &>/dev/null; then
  print_warning "Namespace $NAMESPACE does not exist. Creating..."
  kubectl create namespace "$NAMESPACE"
fi

print_success "Prerequisites check passed"

# =============================================================================
# Step 1: Deploy Unbound DNS Resolver
# =============================================================================
print_step "Step 1: Deploying Unbound DNS resolver..."

if kubectl get deployment unbound -n "$NAMESPACE" &>/dev/null; then
  print_success "Unbound already deployed"
else
  helm upgrade --install unbound "$MAILU_HELM_DIR/../../networking/dns/unbound-helm" \
    -n "$NAMESPACE" \
    -f "$MAILU_HELM_DIR/../../networking/dns/unbound-helm/values.yaml" \
    -f "$MAILU_HELM_DIR/../../networking/dns/unbound-helm/prod/values.yaml" \
    --timeout 5m \
    --wait

  print_success "Unbound deployed"
fi

# Wait for Unbound to be ready
kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=unbound -n "$NAMESPACE" --timeout=120s

# Get the Unbound service IP
UNBOUND_IP=$(kubectl get svc unbound-dns -n "$NAMESPACE" -o jsonpath='{.spec.clusterIP}')
echo "Unbound DNS service IP: $UNBOUND_IP"

# =============================================================================
# Step 2: Configure CoreDNS to Forward to Unbound
# =============================================================================
print_step "Step 2: Configuring CoreDNS for DNSSEC validation..."

# Check the current CoreDNS forward configuration
CURRENT_FORWARD=$(kubectl get configmap coredns -n kube-system -o jsonpath='{.data.Corefile}' | grep -o 'forward \. [0-9.]*' | awk '{print $3}' || echo "")

if [ "$CURRENT_FORWARD" != "$UNBOUND_IP" ]; then
  echo "Updating CoreDNS to forward to Unbound ($UNBOUND_IP)..."

  kubectl patch configmap coredns -n kube-system --type merge -p "{
    \"data\": {
      \"Corefile\": \".:53 {\\n errors\\n health {\\n lameduck 5s\\n }\\n ready\\n kubernetes cluster.local in-addr.arpa ip6.arpa {\\n pods insecure\\n fallthrough in-addr.arpa ip6.arpa\\n ttl 30\\n }\\n prometheus :9153\\n forward . $UNBOUND_IP {\\n max_concurrent 1000\\n }\\n cache 30 {\\n disable success cluster.local\\n disable denial cluster.local\\n }\\n loop\\n reload\\n loadbalance\\n}\\n\"
    }
  }"

  # Restart CoreDNS
  kubectl rollout restart deployment coredns -n kube-system
  kubectl rollout status deployment coredns -n kube-system --timeout=60s

  print_success "CoreDNS configured to forward to Unbound"
else
  print_success "CoreDNS already configured for Unbound"
fi
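The escaped Corefile inside the `kubectl patch` payload above is hard to read. Unescaped (with `$UNBOUND_IP` substituted as `10.96.53.53` for illustration, and indentation added for readability), it writes:

```
.:53 {
    errors
    health {
        lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
        ttl 30
    }
    prometheus :9153
    forward . 10.96.53.53 {
        max_concurrent 1000
    }
    cache 30 {
        disable success cluster.local
        disable denial cluster.local
    }
    loop
    reload
    loadbalance
}
```

The `disable success/denial cluster.local` lines keep CoreDNS from caching internal-service answers, so in-cluster records stay fresh while external lookups still benefit from the 30-second cache.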
# =============================================================================
# Step 3: Create TLS Certificate Secret
# =============================================================================
print_step "Step 3: Creating TLS certificate secret..."

if kubectl get secret mailu-certificates -n "$NAMESPACE" &>/dev/null; then
  print_success "TLS certificate secret already exists"
else
  TEMP_DIR=$(mktemp -d)
  cd "$TEMP_DIR"

  openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
    -keyout tls.key -out tls.crt \
    -subj "/CN=mail.$DOMAIN/O=$DOMAIN" 2>/dev/null

  kubectl create secret tls mailu-certificates \
    --cert=tls.crt \
    --key=tls.key \
    -n "$NAMESPACE"

  rm -rf "$TEMP_DIR"
  print_success "TLS certificate secret created"
fi

# =============================================================================
# Step 4: Create Admin Credentials Secret
# =============================================================================
print_step "Step 4: Creating admin credentials secret..."

if kubectl get secret mailu-admin-credentials -n "$NAMESPACE" &>/dev/null; then
  print_success "Admin credentials secret already exists"
  # Retrieve the existing password for the summary output
  if [ -z "$ADMIN_PASSWORD" ]; then
    ADMIN_PASSWORD=$(kubectl get secret mailu-admin-credentials -n "$NAMESPACE" -o jsonpath='{.data.password}' | base64 -d)
  fi
else
  if [ -z "$ADMIN_PASSWORD" ]; then
    # Generate a random password
    ADMIN_PASSWORD=$(openssl rand -base64 16 | tr -d '/+=' | head -c 16)
    echo -e "${YELLOW}Generated admin password: $ADMIN_PASSWORD${NC}"
    echo -e "${YELLOW}Please save this password securely!${NC}"
  fi

  kubectl create secret generic mailu-admin-credentials \
    --from-literal=password="$ADMIN_PASSWORD" \
    -n "$NAMESPACE"

  print_success "Admin credentials secret created"
fi
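A note on the password pipeline above: `openssl rand -base64 16` emits 24 characters, and stripping `/+=` before `head -c 16` can, in rare cases, yield fewer than 16 characters. A sketch in Python that guarantees the length (illustrative only, not part of the script):

```python
import secrets
import string

def gen_password(length: int = 16) -> str:
    """Generate a random alphanumeric password of exactly `length` characters."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))

pw = gen_password()
print(len(pw))  # 16
```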
# =============================================================================
# Step 5: Deploy Mailu via Helm
# =============================================================================
print_step "Step 5: Deploying Mailu via Helm..."

# Add the Mailu Helm repository
helm repo add mailu https://mailu.github.io/helm-charts 2>/dev/null || true
helm repo update mailu

# Deploy Mailu
helm upgrade --install mailu mailu/mailu \
  -n "$NAMESPACE" \
  -f "$MAILU_HELM_DIR/values.yaml" \
  -f "$MAILU_HELM_DIR/prod/values.yaml" \
  --timeout 10m

print_success "Mailu Helm release deployed (admin user will be created automatically)"

# =============================================================================
# Step 6: Wait for Pods to be Ready
# =============================================================================
print_step "Step 6: Waiting for Mailu pods to be ready..."

echo "This may take 5-10 minutes (ClamAV takes time to initialize)..."

# Wait for the admin pod first (it is the key dependency)
kubectl wait --for=condition=ready pod -l app.kubernetes.io/component=admin -n "$NAMESPACE" --timeout=300s || {
  print_error "Admin pod failed to start. Checking logs..."
  kubectl logs -n "$NAMESPACE" -l app.kubernetes.io/component=admin --tail=50
  exit 1
}

print_success "Admin pod is ready"

# Show pod status
echo ""
echo "Mailu Pod Status:"
kubectl get pods -n "$NAMESPACE" | grep mailu

print_success "Admin user created automatically via Helm initialAccount"

# =============================================================================
# Summary
# =============================================================================
echo ""
echo "=============================================="
echo -e "${GREEN}Mailu Deployment Complete!${NC}"
echo "=============================================="
echo ""
echo "Admin Credentials:"
echo "  Email:    admin@$DOMAIN"
echo "  Password: $ADMIN_PASSWORD"
echo ""
echo "Access URLs (configure Ingress/DNS first):"
echo "  Admin Panel: https://mail.$DOMAIN/admin"
echo "  Webmail:     https://mail.$DOMAIN/webmail"
echo "  SMTP:        mail.$DOMAIN:587 (STARTTLS)"
echo "  IMAP:        mail.$DOMAIN:993 (SSL)"
echo ""
echo "Next Steps:"
echo "  1. Configure DNS records (A, MX, SPF, DMARC)"
echo "  2. Get the DKIM key: kubectl exec -n $NAMESPACE deployment/mailu-admin -- cat /dkim/$DOMAIN.dkim.pub"
echo "  3. Add the DKIM TXT record to DNS"
echo "  4. Configure an Ingress for mail.$DOMAIN"
echo ""
echo "To check pod status:"
echo "  kubectl get pods -n $NAMESPACE | grep mailu"
echo ""
235
infrastructure/platform/mail/mailu-helm/values.yaml
Normal file
@@ -0,0 +1,235 @@
# Base Mailu Helm values for Bakery-IA
# Preserves critical configuration from the original Kustomize setup

# Global DNS configuration for DNSSEC validation
global:
  # Use the Unbound DNS resolver directly for DNSSEC validation.
  # The Unbound service is available at unbound-dns.bakery-ia.svc.cluster.local;
  # its static ClusterIP is configured in unbound-helm/values.yaml.
  custom_dns_servers: "10.96.53.53"  # Unbound DNS static ClusterIP

# Domain configuration
domain: "DOMAIN_PLACEHOLDER"
hostnames:
  - "mail.DOMAIN_PLACEHOLDER"

# Mailu version, matching the original setup
mailuVersion: "2024.06"

# Secret key for authentication cookies
secretKey: "cb61b934d47029a64117c0e4110c93f66bbcf5eaa15c84c42727fad78f7"

# Timezone
timezone: "Etc/UTC"

# Postmaster configuration
postmaster: "admin"

# Initial admin account configuration
# This creates an admin user as part of the Helm deployment.
# Credentials can be provided directly or via a Kubernetes secret.
initialAccount:
  enabled: true
  username: "admin"
  domain: ""    # Set in environment-specific values (dev/prod)
  password: ""  # Leave empty to use existingSecret
  existingSecret: "mailu-admin-credentials"
  existingSecretPasswordKey: "password"
  mode: "ifmissing"  # Only create the account if it doesn't exist

# TLS configuration
tls:
  flavor: "notls"  # Disable TLS for development

# Limits configuration
limits:
  messageSizeLimitInMegabytes: 50
  authRatelimit:
    ip: "60/hour"
    user: "100/day"
  messageRatelimit:
    value: "200/day"

# External relay configuration (Mailgun)
# Mailu relays all outbound email through Mailgun SMTP.
# Credentials are loaded from a Kubernetes secret for security.
externalRelay:
  host: "[smtp.mailgun.org]:587"
  # Use an existing secret for credentials (recommended for security)
  secretName: "mailu-mailgun-credentials"
  usernameKey: "RELAY_USERNAME"
  passwordKey: "RELAY_PASSWORD"
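The `mailu-mailgun-credentials` secret referenced above must carry the two keys named by `usernameKey` and `passwordKey`. A sketch of its manifest (the values shown are placeholders, not real credentials):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mailu-mailgun-credentials
  namespace: bakery-ia
type: Opaque
stringData:
  RELAY_USERNAME: "postmaster@example.com"        # placeholder
  RELAY_PASSWORD: "replace-with-mailgun-smtp-key" # placeholder
```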
# Webmail configuration
webmail:
  enabled: true
  type: "roundcube"

# Antivirus and antispam configuration
antivirus:
  enabled: false  # Disabled in dev to save resources
antispam:
  enabled: true
  flavor: "rspamd"

# Welcome message
welcomeMessage:
  enabled: false  # Disabled during development

# Logging
logLevel: "INFO"

# Network configuration
subnet: "10.42.0.0/16"

# Redis configuration - using the built-in internal Redis
externalRedis:
  enabled: false
  # host: "redis-service.bakery-ia.svc.cluster.local"
  # port: 6380
  adminQuotaDbId: 15
  adminRateLimitDbId: 15
  rspamdDbId: 15

# Database configuration - using the default built-in SQLite
externalDatabase:
  enabled: false
  # type: "postgresql"
  # host: "postgres-service.bakery-ia.svc.cluster.local"
  # port: 5432
  # database: "mailu"
  # username: "mailu"
  # password: "E8Kz47YmVzDlHGs1M9wAbJzxcKnGONCT"

# Persistence configuration
persistence:
  single_pvc: true
  size: 10Gi
  storageClass: ""
  accessModes: [ReadWriteOnce]

# Ingress configuration - disabled in favor of the existing ingress
ingress:
  enabled: false             # Disable the chart's Ingress; use the existing one
  tls: false                 # Disable TLS in the chart since the ingress handles it
  tlsFlavorOverride: notls   # No TLS on the internal NGINX; the external proxy terminates TLS
  realIpHeader: X-Forwarded-For  # Header carrying the client IP from your Ingress
  realIpFrom: 0.0.0.0/0      # Trust all proxies (restrict to your Ingress pod CIDR for security)
  path: /
  pathType: ImplementationSpecific

# Optional: enable the PROXY protocol for mail protocols if your Ingress supports TCP proxying
proxyProtocol:
  smtp: false
  smtps: false
  submission: false
  imap: false
  imaps: false
  pop3: false
  pop3s: false
  manageSieve: false

# Front configuration
front:
  image:
    tag: "2024.06"
  replicaCount: 1
  service:
    type: ClusterIP
    ports:
      http: 80
      https: 443
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 200m
      memory: 256Mi

# Admin configuration
admin:
  image:
    tag: "2024.06"
  replicaCount: 1
  service:
    type: ClusterIP
    port: 80
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 300m
      memory: 512Mi

# Postfix configuration
postfix:
  image:
    tag: "2024.06"
  replicaCount: 1
  service:
    type: ClusterIP
    ports:
      smtp: 25
      submission: 587
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 512Mi

# Dovecot configuration
dovecot:
  image:
    tag: "2024.06"
  replicaCount: 1
  service:
    type: ClusterIP
    ports:
      imap: 143
      imaps: 993
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 512Mi

# Rspamd configuration
rspamd:
  image:
    tag: "2024.06"
  replicaCount: 1
  service:
    type: ClusterIP
    ports:
      rspamd: 11333
      rspamd-admin: 11334
  resources:
    requests:
      cpu: 200m
      memory: 512Mi
    limits:
      cpu: 1000m
      memory: 1Gi

# Network Policy
networkPolicy:
  enabled: true
  ingressController:
    namespace: ingress-nginx
    podSelector: |
      matchLabels:
        app.kubernetes.io/name: ingress-nginx
        app.kubernetes.io/instance: ingress-nginx
        app.kubernetes.io/component: controller

# DNS Policy Configuration
# Use Kubernetes DNS (ClusterFirst) for internal service resolution.
# DNSSEC validation for email is handled by the rspamd component.
# Note: for production DNSSEC needs, configure CoreDNS to forward to Unbound.
dnsPolicy: "ClusterFirst"
@@ -0,0 +1,18 @@
apiVersion: v2
name: unbound
description: A Helm chart for deploying the Unbound DNS resolver for Bakery-IA
type: application
version: 0.1.0
appVersion: "1.19.1"
maintainers:
  - name: Bakery-IA Team
    email: devops@bakery-ia.com
keywords:
  - dns
  - resolver
  - caching
  - unbound
home: https://www.nlnetlabs.nl/projects/unbound/
sources:
  - https://github.com/NLnetLabs/unbound
  - https://hub.docker.com/r/mvance/unbound
@@ -0,0 +1,64 @@
# Development values for the unbound DNS resolver
# Uses the same configuration as production for consistency

# Use the official image for development (same as production)
image:
  repository: "mvance/unbound"
  tag: "latest"
  pullPolicy: "IfNotPresent"

# Resource settings (slightly lower than production for dev)
resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "300m"
    memory: "384Mi"

# Single replica for development (can be scaled if needed)
replicaCount: 1

# Development annotations
podAnnotations:
  environment: "development"
  managed-by: "helm"

# Probe settings (same as production but slightly faster)
probes:
  readiness:
    initialDelaySeconds: 10
    periodSeconds: 30
    command: "drill @127.0.0.1 -p 53 example.org || echo 'DNS query test'"
  liveness:
    initialDelaySeconds: 30
    periodSeconds: 60
    command: "drill @127.0.0.1 -p 53 example.org || echo 'DNS query test'"

# Custom Unbound forward records for Kubernetes DNS
config:
  enabled: true
  # The mvance/unbound image includes forward-records.conf;
  # we add Kubernetes-specific forwarding zones here.
  forwardRecords: |
    # Forward all queries to Cloudflare with DNSSEC (catch-all)
    forward-zone:
        name: "."
        forward-tls-upstream: yes
        forward-addr: 1.1.1.1@853#cloudflare-dns.com
        forward-addr: 1.0.0.1@853#cloudflare-dns.com

  # Additional server config to mark cluster.local as insecure (no DNSSEC)
  # and use stub zones for Kubernetes internal DNS (more reliable than forward)
  serverConfig: |
    domain-insecure: "cluster.local."
    private-domain: "cluster.local."
    local-zone: "10.in-addr.arpa." nodefault

    stub-zone:
        name: "cluster.local."
        stub-addr: 10.96.0.10

    stub-zone:
        name: "10.in-addr.arpa."
        stub-addr: 10.96.0.10
@@ -0,0 +1,50 @@
# Production-specific values for the unbound DNS resolver
# Overrides for the production environment

# Use the official image for production
image:
  repository: "mvance/unbound"
  tag: "latest"
  pullPolicy: "IfNotPresent"

# Production resource settings (higher limits for reliability)
resources:
  requests:
    cpu: "200m"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"

# Production-specific settings
replicaCount: 2

# Production annotations
podAnnotations:
  environment: "production"
  critical: "true"

# Anti-affinity for high availability in production
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/name
                operator: In
                values:
                  - unbound
          topologyKey: "kubernetes.io/hostname"

# Production probe settings (more conservative)
probes:
  readiness:
    initialDelaySeconds: 15
    periodSeconds: 30
    command: "drill @127.0.0.1 -p 53 example.org || echo 'DNS query test'"
  liveness:
    initialDelaySeconds: 45
    periodSeconds: 60
    command: "drill @127.0.0.1 -p 53 example.org || echo 'DNS query test'"
@@ -0,0 +1,63 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "unbound.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" -}}
{{- end -}}

{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
*/}}
{{- define "unbound.fullname" -}}
{{- if .Values.fullnameOverride -}}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- $name := default .Chart.Name .Values.nameOverride -}}
{{- if contains $name .Release.Name -}}
{{- .Release.Name | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{- end -}}
{{- end -}}

{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "unbound.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" -}}
{{- end -}}

{{/*
Common labels
*/}}
{{- define "unbound.labels" -}}
helm.sh/chart: {{ include "unbound.chart" . }}
{{ include "unbound.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end -}}

{{/*
Selector labels
*/}}
{{- define "unbound.selectorLabels" -}}
app.kubernetes.io/name: {{ include "unbound.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/component: dns
app.kubernetes.io/part-of: bakery-ia
{{- end -}}

{{/*
Create the name of the service account to use
*/}}
{{- define "unbound.serviceAccountName" -}}
{{- if .Values.serviceAccount.create -}}
{{ default (include "unbound.fullname" .) .Values.serviceAccount.name }}
{{- else -}}
{{ default "default" .Values.serviceAccount.name }}
{{- end -}}
{{- end -}}
@@ -0,0 +1,22 @@
{{- if .Values.config.enabled }}
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "unbound.fullname" . }}-config
  namespace: {{ .Values.global.namespace }}
  labels:
    {{- include "unbound.labels" . | nindent 4 }}
data:
  {{- if .Values.config.forwardRecords }}
  forward-records.conf: |
{{ .Values.config.forwardRecords | indent 4 }}
  {{- end }}
  {{- if .Values.config.serverConfig }}
  a-records.conf: |
{{ .Values.config.serverConfig | indent 4 }}
  {{- end }}
  {{- if .Values.config.content }}
  unbound.conf: |
{{ .Values.config.content | indent 4 }}
  {{- end }}
{{- end }}
@@ -0,0 +1,117 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "unbound.fullname" . }}
  namespace: {{ .Values.global.namespace }}
  labels:
    {{- include "unbound.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      {{- include "unbound.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      {{- with .Values.podAnnotations }}
      annotations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      labels:
        {{- include "unbound.selectorLabels" . | nindent 8 }}
    spec:
      {{- with .Values.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      serviceAccountName: {{ include "unbound.serviceAccountName" . }}
      securityContext:
        {{- toYaml .Values.podSecurityContext | nindent 8 }}
      {{- with .Values.extraInitContainers }}
      initContainers:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      containers:
        - name: {{ .Chart.Name }}
          securityContext:
            {{- toYaml .Values.securityContext | nindent 12 }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - name: dns-udp
              containerPort: {{ .Values.service.ports.dnsUdp }}
              protocol: UDP
            - name: dns-tcp
              containerPort: {{ .Values.service.ports.dnsTcp }}
              protocol: TCP
          {{- if .Values.probes.readiness.enabled }}
          readinessProbe:
            exec:
              command:
                - sh
                - -c
                - {{ .Values.probes.readiness.command | quote }}
            initialDelaySeconds: {{ .Values.probes.readiness.initialDelaySeconds }}
            periodSeconds: {{ .Values.probes.readiness.periodSeconds }}
          {{- end }}
          {{- if .Values.probes.liveness.enabled }}
          livenessProbe:
            exec:
              command:
                - sh
                - -c
                - {{ .Values.probes.liveness.command | quote }}
            initialDelaySeconds: {{ .Values.probes.liveness.initialDelaySeconds }}
            periodSeconds: {{ .Values.probes.liveness.periodSeconds }}
          {{- end }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
          volumeMounts:
            {{- if .Values.config.enabled }}
            {{- if .Values.config.forwardRecords }}
            - name: unbound-config
              mountPath: /opt/unbound/etc/unbound/forward-records.conf
              subPath: forward-records.conf
            {{- end }}
            {{- if .Values.config.serverConfig }}
            - name: unbound-config
              mountPath: /opt/unbound/etc/unbound/a-records.conf
              subPath: a-records.conf
            {{- end }}
            {{- if .Values.config.content }}
            - name: unbound-config
              mountPath: /opt/unbound/etc/unbound/unbound.conf
              subPath: unbound.conf
            {{- end }}
            {{- end }}
            {{- with .Values.volumeMounts }}
            {{- toYaml . | nindent 12 }}
            {{- end }}
          {{- with .Values.env }}
          env:
            {{- toYaml . | nindent 12 }}
          {{- end }}
        {{- /* Extra containers are appended to the containers list above;
               emitting a second `containers:` key would produce invalid YAML. */}}
        {{- with .Values.extraContainers }}
        {{- toYaml . | nindent 8 }}
        {{- end }}
      volumes:
        {{- if .Values.config.enabled }}
        - name: unbound-config
          configMap:
            name: {{ include "unbound.fullname" . }}-config
        {{- end }}
        {{- with .Values.volumes }}
        {{- toYaml . | nindent 8 }}
        {{- end }}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
@@ -0,0 +1,27 @@
apiVersion: v1
kind: Service
metadata:
  name: {{ .Values.global.dnsServiceName }}
  namespace: {{ .Values.global.namespace }}
  labels:
    {{- include "unbound.labels" . | nindent 4 }}
  {{- with .Values.serviceAnnotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
spec:
  type: {{ .Values.service.type }}
  {{- if .Values.service.clusterIP }}
  clusterIP: {{ .Values.service.clusterIP }}
  {{- end }}
  ports:
    - name: dns-udp
      port: {{ .Values.service.ports.dnsUdp }}
      targetPort: {{ .Values.service.ports.dnsUdp }}
      protocol: UDP
    - name: dns-tcp
      port: {{ .Values.service.ports.dnsTcp }}
      targetPort: {{ .Values.service.ports.dnsTcp }}
      protocol: TCP
  selector:
    {{- include "unbound.selectorLabels" . | nindent 4 }}
@@ -0,0 +1,13 @@
{{- if .Values.serviceAccount.create -}}
apiVersion: v1
kind: ServiceAccount
metadata:
  name: {{ include "unbound.serviceAccountName" . }}
  namespace: {{ .Values.global.namespace }}
  labels:
    {{- include "unbound.labels" . | nindent 4 }}
  {{- with .Values.serviceAccount.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
{{- end -}}