Initial commit - production deployment

2026-01-21 17:17:16 +01:00
commit c23d00dd92
2289 changed files with 638440 additions and 0 deletions


@@ -0,0 +1,119 @@
# Bakery-IA Namespace Management
## Overview
This document explains the namespace strategy for the Bakery-IA platform and how to properly manage namespaces during deployment.
## Namespace Architecture
The Bakery-IA platform uses the following namespaces:
### Core Namespaces
1. **`bakery-ia`** - Main application namespace
- Contains all microservices, databases, and application components
- Defined in: `infrastructure/namespaces/bakery-ia.yaml`
2. **`tekton-pipelines`** - CI/CD pipeline namespace
- Contains Tekton pipeline resources, tasks, and triggers
- Defined in: `infrastructure/namespaces/tekton-pipelines.yaml`
3. **`flux-system`** - GitOps namespace
- Contains Flux CD components for GitOps deployments
- Now defined in Helm chart: `infrastructure/cicd/flux/templates/namespace.yaml`
### Infrastructure Namespaces
Additional namespaces may be created for:
- Monitoring components
- Logging components
- Security components
## Deployment Order
**CRITICAL**: Namespaces must be created BEFORE any resources that depend on them.
### Correct Deployment Sequence
```bash
# 1. Create namespaces first
kubectl apply -f infrastructure/namespaces/
# 2. Apply common configurations (depends on bakery-ia namespace)
kubectl apply -f infrastructure/environments/common/configs/
# 3. Apply platform components
kubectl apply -f infrastructure/platform/
# 4. Apply CI/CD components (depends on tekton-pipelines)
kubectl apply -f infrastructure/cicd/
# 5. Apply monitoring components
kubectl apply -f infrastructure/monitoring/
```
## Common Issues and Solutions
### Issue: "namespace not found" errors
**Symptoms**: Errors like:
```
Error from server (NotFound): error when creating "path/to/resource.yaml": namespaces "[namespace-name]" not found
```
**Solutions**:
1. **Ensure namespaces are created first** - Use the deployment script that applies namespaces before other resources
2. **Check for templating issues** - If you see names like `[redacted secret rabbitmq-secrets:RABBITMQ_USER]-ia`, there may be environment variable substitution happening incorrectly
3. **Verify namespace YAML files** - Ensure the namespace files exist and are properly formatted
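For reference, a well-formed namespace file is just a plain `v1` Namespace manifest. The sketch below illustrates the expected shape; the actual labels in `infrastructure/namespaces/bakery-ia.yaml` may differ:
```yaml
# Illustrative sketch of infrastructure/namespaces/bakery-ia.yaml - labels are examples only
apiVersion: v1
kind: Namespace
metadata:
  name: bakery-ia
  labels:
    app.kubernetes.io/name: bakery-ia
```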
### Issue: Resource conflicts across namespaces
**Solution**: Use proper namespace isolation and RBAC policies to prevent cross-namespace conflicts.
## Best Practices
1. **Namespace Isolation**: Keep resources properly isolated by namespace
2. **RBAC**: Use namespace-specific RBAC roles and bindings
3. **Resource Quotas**: Apply resource quotas per namespace
4. **Network Policies**: Use network policies to control cross-namespace communication
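To make points 3 and 4 concrete, here is a minimal sketch of a per-namespace quota and a default deny-from-other-namespaces policy. The names and limits are illustrative assumptions, not values shipped in this repository:
```yaml
# Illustrative only - quota sizes and policy scope are assumptions
apiVersion: v1
kind: ResourceQuota
metadata:
  name: bakery-ia-quota
  namespace: bakery-ia
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-from-other-namespaces
  namespace: bakery-ia
spec:
  podSelector: {}          # applies to all pods in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}  # only pods from the same namespace may connect
```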
## Troubleshooting
### Verify namespaces exist
```bash
kubectl get namespaces
```
### Check namespace labels
```bash
kubectl get namespace bakery-ia --show-labels
```
### View namespace events
```bash
kubectl describe namespace bakery-ia
```
## Migration from Old Structure
If you're migrating from the old structure where namespaces were scattered across different directories:
1. **Remove old namespace files** from:
- `infrastructure/environments/common/configs/namespace.yaml`
- `infrastructure/cicd/flux/namespace.yaml`
2. **Update kustomization files** to reference the centralized namespace files
3. **Use the new deployment script** that follows the correct order
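For step 2, a kustomization that pulls in the centralized namespace files could look like the sketch below; the relative paths are assumptions and depend on where the `kustomization.yaml` lives:
```yaml
# Hypothetical kustomization.yaml referencing the centralized namespace files
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../namespaces/bakery-ia.yaml
  - ../../namespaces/tekton-pipelines.yaml
  - configs/   # resources that depend on the namespaces above
```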
## Future Enhancements
- Add namespace lifecycle management
- Implement namespace cleanup scripts
- Add namespace validation checks to CI/CD pipelines
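As an illustration of the last item, a pipeline step could fail fast when a required namespace is missing. This is only a sketch of one possible check, not a script that exists in the repository yet:
```bash
#!/bin/bash
# Sketch: verify required namespaces before deploying anything else
set -e
for ns in bakery-ia tekton-pipelines flux-system; do
  if ! kubectl get namespace "$ns" > /dev/null 2>&1; then
    echo "ERROR: required namespace '$ns' is missing - apply infrastructure/namespaces/ first"
    exit 1
  fi
done
echo "All required namespaces exist."
```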

infrastructure/README.md

@@ -0,0 +1,57 @@
# Bakery-IA Infrastructure
This directory contains all infrastructure-as-code for the Bakery-IA project, organized according to best practices for maintainability and scalability.
## Directory Structure
```
infrastructure/
├── environments/ # Environment-specific configurations
│ ├── dev/ # Development environment
│ │ ├── k8s-manifests/ # Kubernetes manifests for dev
│ │ └── values/ # Environment-specific values
│ ├── staging/ # Staging environment
│ │ ├── k8s-manifests/
│ │ └── values/
│ └── prod/ # Production environment
│ ├── k8s-manifests/
│ ├── terraform/ # Production-specific IaC
│ └── values/
├── platform/ # Platform-level infrastructure
│ ├── cluster/ # Cluster configuration (EKS, Kind)
│ ├── networking/ # Network configuration
│ ├── security/ # Security policies and TLS
│ └── storage/ # Storage configuration
├── services/ # Application services
│ ├── databases/ # Database configurations
│ ├── api-gateway/ # API gateway configuration
│ └── microservices/ # Individual microservice configs
├── monitoring/ # Observability stack
│ └── signoz/ # SigNoz configuration
├── cicd/ # CI/CD pipeline components
├── security/ # Security configurations
├── scripts/ # Automation scripts
└── docs/ # Infrastructure documentation
```
## Environments
Each environment (dev, staging, prod) has its own configuration with appropriate isolation and security settings.
## Services
Services are organized by business domain with clear separation between databases, microservices, and infrastructure components.
## Getting Started
1. **Local Development**: Use `tilt up` to start the development environment
2. **Deployment**: Use `skaffold run` to deploy to your target environment
3. **CI/CD**: Tekton pipelines manage automated deployments
## Security
Security configurations are centralized in the `security/` directory with:
- TLS certificates and rotation scripts
- Network policies
- RBAC configurations
- Compliance checks


@@ -0,0 +1,298 @@
# Bakery-IA CI/CD Implementation
This directory contains the configuration for the production-grade CI/CD system for Bakery-IA using Gitea, Tekton, and Flux CD.
## Architecture Overview
```mermaid
graph TD
A[Developer] -->|Push Code| B[Gitea]
B -->|Webhook| C[Tekton Pipelines]
C -->|Build/Test| D[Gitea Registry]
D -->|New Image| E[Flux CD]
E -->|kubectl apply| F[MicroK8s Cluster]
F -->|Metrics| G[SigNoz]
```
## Directory Structure
```
infrastructure/ci-cd/
├── gitea/ # Gitea configuration (Git server + registry)
│ └── values.yaml # Helm values for Gitea (ingress now in main config)
├── tekton/ # Tekton CI/CD pipeline configuration
│ ├── tasks/ # Individual pipeline tasks
│ │ ├── git-clone.yaml
│ │ ├── detect-changes.yaml
│ │ ├── kaniko-build.yaml
│ │ └── update-gitops.yaml
│ ├── pipelines/ # Pipeline definitions
│ │ └── ci-pipeline.yaml
│ └── triggers/ # Webhook trigger configuration
│ ├── trigger-template.yaml
│ ├── trigger-binding.yaml
│ ├── event-listener.yaml
│ └── gitlab-interceptor.yaml
├── flux/ # Flux CD GitOps Helm chart configuration
│ ├── Chart.yaml # Helm chart definition
│ ├── values.yaml # Default configuration values
│ ├── templates/ # Kubernetes manifest templates
│ │ ├── gitrepository.yaml
│ │ ├── kustomization.yaml
│ │ └── namespace.yaml
│ └── values/ # Additional value files
├── monitoring/ # Monitoring configuration
│ └── otel-collector.yaml # OpenTelemetry collector
└── README.md # This file
```
## Deployment Instructions
### Phase 1: Infrastructure Setup
1. **Deploy Gitea**:
```bash
# Add Helm repo
microk8s helm repo add gitea https://dl.gitea.io/charts
# Create namespace
microk8s kubectl create namespace gitea
# Install Gitea
microk8s helm install gitea gitea/gitea \
-n gitea \
-f infrastructure/ci-cd/gitea/values.yaml
# Note: Gitea ingress is now included in the main ingress configuration
# No separate ingress needs to be applied
```
2. **Deploy Tekton**:
```bash
# Create namespace
microk8s kubectl create namespace tekton-pipelines
# Install Tekton Pipelines
microk8s kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml
# Install Tekton Triggers
microk8s kubectl apply -f https://storage.googleapis.com/tekton-releases/triggers/latest/release.yaml
# Apply Tekton configurations
microk8s kubectl apply -f infrastructure/ci-cd/tekton/tasks/
microk8s kubectl apply -f infrastructure/ci-cd/tekton/pipelines/
microk8s kubectl apply -f infrastructure/ci-cd/tekton/triggers/
```
3. **Deploy Flux CD** (already enabled in MicroK8s):
```bash
# Verify Flux installation
microk8s kubectl get pods -n flux-system
# Apply Flux configurations using kustomize
microk8s kubectl apply -k infrastructure/ci-cd/flux/
```
### Phase 2: Configuration
1. **Set up Gitea webhook**:
- Go to your Gitea repository settings
- Add webhook with URL: `http://tekton-triggers.tekton-pipelines.svc.cluster.local:8080`
- Use the secret from `gitea-webhook-secret`
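If you prefer to script this instead of using the UI, the same webhook can be created through the Gitea API. The call below is a sketch; replace the placeholder secret with the value stored in `gitea-webhook-secret`:
```bash
curl -u "bakery-admin:$GITEA_ADMIN_PASSWORD" \
  -X POST "https://gitea.bakery-ia.local/api/v1/repos/bakery-admin/bakery-ia/hooks" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "gitea",
    "config": {
      "url": "http://tekton-triggers.tekton-pipelines.svc.cluster.local:8080",
      "content_type": "json",
      "secret": "<value-from-gitea-webhook-secret>"
    },
    "events": ["push"],
    "active": true
  }'
```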
2. **Configure registry credentials**:
```bash
# Create registry credentials secret
microk8s kubectl create secret docker-registry gitea-registry-credentials \
-n tekton-pipelines \
--docker-server=gitea.bakery-ia.local:5000 \
--docker-username=your-username \
--docker-password=your-password
```
3. **Configure Git credentials for Flux**:
```bash
# Create Git credentials secret
microk8s kubectl create secret generic gitea-credentials \
-n flux-system \
--from-literal=username=your-username \
--from-literal=password=your-password
```
### Phase 3: Monitoring
```bash
# Apply OpenTelemetry configuration
microk8s kubectl apply -f infrastructure/ci-cd/monitoring/otel-collector.yaml
```
## Usage
### Triggering a Pipeline
1. **Manual trigger**:
```bash
# Create a PipelineRun manually
microk8s kubectl create -f - <<EOF
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: manual-ci-run
  namespace: tekton-pipelines
spec:
  pipelineRef:
    name: bakery-ia-ci
  workspaces:
    - name: shared-workspace
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 5Gi
    - name: docker-credentials
      secret:
        secretName: gitea-registry-credentials
  params:
    - name: git-url
      value: "http://gitea.bakery-ia.local/bakery-admin/bakery-ia.git"
    - name: git-revision
      value: "main"
EOF
```
2. **Automatic trigger**: Push code to the repository and the webhook will trigger the pipeline automatically.
### Monitoring Pipeline Runs
```bash
# List all PipelineRuns
microk8s kubectl get pipelineruns -n tekton-pipelines
# View logs for a specific PipelineRun
microk8s kubectl logs -n tekton-pipelines <pipelinerun-pod> -c <step-name>
# View Tekton dashboard
microk8s kubectl port-forward -n tekton-pipelines svc/tekton-dashboard 9097:9097
```
## Troubleshooting
### Common Issues
1. **Pipeline not triggering**:
- Check Gitea webhook logs
- Verify EventListener pods are running
- Check TriggerBinding configuration
2. **Build failures**:
- Check Kaniko logs for build errors
- Verify Dockerfile paths are correct
- Ensure registry credentials are valid
3. **Flux not applying changes**:
- Check GitRepository status
- Verify Kustomization reconciliation
- Check Flux logs for errors
### Debugging Commands
```bash
# Check Tekton controller logs
microk8s kubectl logs -n tekton-pipelines -l app=tekton-pipelines-controller
# Check Flux reconciliation
microk8s kubectl get kustomizations -n flux-system -o yaml
# Check Gitea webhook delivery
microk8s kubectl logs -n tekton-pipelines -l app=tekton-triggers-controller
```
## Security Considerations
1. **Secrets Management**:
- Use Kubernetes secrets for sensitive data
- Rotate credentials regularly
- Use RBAC for namespace isolation
2. **Network Security**:
- Configure network policies
- Use internal DNS names
- Restrict ingress access
3. **Registry Security**:
- Enable image scanning
- Use image signing
- Implement cleanup policies
## Maintenance
### Upgrading Components
```bash
# Upgrade Tekton
microk8s kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml
# Upgrade Flux
microk8s helm upgrade fluxcd fluxcd/flux2 -n flux-system
# Upgrade Gitea
microk8s helm upgrade gitea gitea/gitea -n gitea -f infrastructure/ci-cd/gitea/values.yaml
```
### Backup Procedures
```bash
# Backup Gitea
microk8s kubectl exec -n gitea gitea-0 -- gitea dump -c /data/gitea/conf/app.ini
# Backup Flux configurations
microk8s kubectl get all -n flux-system -o yaml > flux-backup.yaml
# Backup Tekton configurations
microk8s kubectl get all -n tekton-pipelines -o yaml > tekton-backup.yaml
```
## Performance Optimization
1. **Resource Management**:
- Set appropriate resource limits
- Limit concurrent builds
- Use node selectors for build pods
2. **Caching**:
- Configure Kaniko cache (see the sketch after this list)
- Use persistent volumes for dependencies
- Cache Docker layers
3. **Parallelization**:
- Build independent services in parallel
- Use matrix builds for different architectures
- Optimize task dependencies
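As a concrete example of the caching point above, Kaniko layer caching is enabled through executor flags. The task below is only a sketch (registry credential wiring omitted); the real `kaniko-build.yaml` in this repo may use different names and parameters:
```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: kaniko-build-cached        # illustrative name, not the shipped task
  namespace: tekton-pipelines
spec:
  params:
    - name: image
    - name: dockerfile
      default: ./Dockerfile
  workspaces:
    - name: source
  steps:
    - name: build-and-push
      image: gcr.io/kaniko-project/executor:v1.19.2
      args:
        - --dockerfile=$(params.dockerfile)
        - --context=$(workspaces.source.path)
        - --destination=$(params.image)
        - --cache=true
        - --cache-ttl=24h                      # matches pipeline.build.cacheTTL default
        - --cache-repo=$(params.image)-cache   # dedicated cache repository (assumption)
```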
## Integration with Existing System
The CI/CD system integrates with:
- **SigNoz**: For monitoring and observability
- **MicroK8s**: For cluster management
- **Existing Kubernetes manifests**: In `infrastructure/kubernetes/`
- **Current services**: All 19 microservices in `services/`
## Migration Plan
1. **Phase 1**: Set up infrastructure (Gitea, Tekton, Flux)
2. **Phase 2**: Configure pipelines and triggers
3. **Phase 3**: Test with non-critical services
4. **Phase 4**: Gradual rollout to all services
5. **Phase 5**: Decommission old deployment methods
## Support
For issues with the CI/CD system:
- Check logs and monitoring first
- Review the troubleshooting section
- Consult the original implementation plan
- Refer to component documentation:
- [Tekton Documentation](https://tekton.dev/docs/)
- [Flux CD Documentation](https://fluxcd.io/docs/)
- [Gitea Documentation](https://docs.gitea.io/)


@@ -0,0 +1,6 @@
apiVersion: v2
name: flux-cd
description: A Helm chart for deploying Flux CD GitOps toolkit for Bakery-IA
type: application
version: 0.1.0
appVersion: "2.2.3"


@@ -0,0 +1,15 @@
{{- if .Values.gitRepository }}
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: {{ .Values.gitRepository.name }}
  namespace: {{ .Values.gitRepository.namespace }}
spec:
  interval: {{ .Values.gitRepository.interval }}
  url: {{ .Values.gitRepository.url }}
  ref:
    branch: {{ .Values.gitRepository.ref.branch }}
  secretRef:
    name: {{ .Values.gitRepository.secretRef.name }}
  timeout: {{ .Values.gitRepository.timeout }}
{{- end }}


@@ -0,0 +1,43 @@
{{- if .Values.kustomization }}
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: {{ .Values.kustomization.name }}
  namespace: {{ .Values.kustomization.namespace }}
  labels:
    app.kubernetes.io/name: bakery-ia
    app.kubernetes.io/component: flux
spec:
  # Wait for GitRepository to be ready before reconciling
  dependsOn: []
  interval: {{ .Values.kustomization.interval }}
  path: {{ .Values.kustomization.path }}
  prune: {{ .Values.kustomization.prune }}
  sourceRef:
    kind: {{ .Values.kustomization.sourceRef.kind }}
    name: {{ .Values.kustomization.sourceRef.name }}
  targetNamespace: {{ .Values.kustomization.targetNamespace }}
  timeout: {{ .Values.kustomization.timeout }}
  retryInterval: {{ .Values.kustomization.retryInterval }}
  wait: {{ .Values.kustomization.wait }}
  {{- if .Values.kustomization.healthChecks }}
  healthChecks:
    {{- range .Values.kustomization.healthChecks }}
    - apiVersion: {{ .apiVersion }}
      kind: {{ .kind }}
      name: {{ .name }}
      namespace: {{ .namespace }}
    {{- end }}
  {{- end }}
  {{- if .Values.kustomization.postBuild }}
  postBuild:
    substituteFrom:
      {{- range .Values.kustomization.postBuild.substituteFrom }}
      - kind: {{ .kind }}
        name: {{ .name }}
        {{- if .optional }}
        optional: {{ .optional }}
        {{- end }}
      {{- end }}
  {{- end }}
{{- end }}


@@ -0,0 +1,9 @@
{{- if .Values.createNamespace | default false }}
apiVersion: v1
kind: Namespace
metadata:
  name: {{ .Values.gitRepository.namespace }}
  labels:
    app.kubernetes.io/name: flux
    kubernetes.io/metadata.name: {{ .Values.gitRepository.namespace }}
{{- end }}


@@ -0,0 +1,73 @@
# Default values for flux-cd
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
gitRepository:
  name: bakery-ia
  namespace: flux-system
  interval: 1m
  url: http://gitea-http.gitea.svc.cluster.local:3000/bakery-admin/bakery-ia.git
  ref:
    branch: main
  secretRef:
    name: gitea-credentials
  timeout: 60s
kustomization:
  name: bakery-ia-prod
  namespace: flux-system
  interval: 5m
  path: ./infrastructure/environments/prod
  prune: true
  sourceRef:
    kind: GitRepository
    name: bakery-ia
  targetNamespace: bakery-ia
  timeout: 10m
  retryInterval: 1m
  wait: true
  healthChecks:
    # Core Infrastructure
    - apiVersion: apps/v1
      kind: Deployment
      name: gateway
      namespace: bakery-ia
    # Authentication & Authorization
    - apiVersion: apps/v1
      kind: Deployment
      name: auth-service
      namespace: bakery-ia
    - apiVersion: apps/v1
      kind: Deployment
      name: tenant-service
      namespace: bakery-ia
    # Core Business Services
    - apiVersion: apps/v1
      kind: Deployment
      name: inventory-service
      namespace: bakery-ia
    - apiVersion: apps/v1
      kind: Deployment
      name: orders-service
      namespace: bakery-ia
    - apiVersion: apps/v1
      kind: Deployment
      name: pos-service
      namespace: bakery-ia
    # Data Services
    - apiVersion: apps/v1
      kind: Deployment
      name: forecasting-service
      namespace: bakery-ia
    - apiVersion: apps/v1
      kind: Deployment
      name: notification-service
      namespace: bakery-ia
  postBuild:
    substituteFrom:
      - kind: ConfigMap
        name: bakery-ia-config
        optional: true
      - kind: Secret
        name: bakery-ia-secrets
        optional: true


@@ -0,0 +1,151 @@
# Gitea Automatic Repository Creation - Implementation Summary
## Overview
This implementation adds automatic repository creation to the Gitea Helm chart configuration for the Bakery-IA project. When Gitea is installed or upgraded via Helm, it will automatically create a `bakery-ia` repository with the specified configuration.
## Changes Made
### 1. Updated Helm Values (`values.yaml`)
Added the `initialRepositories` configuration under the `gitea:` section:
```yaml
# Initial repositories to create automatically after Gitea installation
# These will be created with the admin user as owner
gitea:
initialRepositories:
- name: bakery-ia
description: "Main repository for Bakery IA project - Automatically created by Helm"
private: false
auto_init: true
default_branch: main
owner: "{{ .Values.gitea.admin.username }}"
# Enable issues, wiki, and other features
enable_issues: true
enable_wiki: true
enable_pull_requests: true
enable_projects: true
```
### 2. Created Setup Script (`setup-gitea-repository.sh`)
A comprehensive bash script that:
- Checks if Gitea is accessible
- Verifies if the repository exists (creates it if not)
- Configures the local Git repository
- Pushes the existing code to the new Gitea repository
### 3. Created Test Script (`test-repository-creation.sh`)
A test script that verifies:
- Gitea accessibility
- Repository existence
- Repository configuration (issues, wiki, pull requests)
- Provides detailed repository information
### 4. Created Documentation
- **README.md**: Complete guide on installation, usage, and troubleshooting
- **IMPLEMENTATION_SUMMARY.md**: This file, summarizing the implementation
## How It Works
### Automatic Repository Creation Flow
1. **Helm Installation**: When `helm install` or `helm upgrade` is executed with the updated values
2. **Gitea Initialization**: Gitea starts and creates the admin user
3. **Repository Creation**: Gitea processes the `initialRepositories` configuration and creates the specified repositories
4. **Completion**: The repository is ready for use immediately after Gitea is fully initialized
### Key Features
- **Automatic**: No manual intervention required after Helm installation
- **Idempotent**: Safe to run multiple times (won't duplicate repositories)
- **Configurable**: All repository settings are defined in Helm values
- **Integrated**: Uses native Gitea Helm chart features
## Usage
### Installation
```bash
# Install Gitea with automatic repository creation
helm install gitea gitea/gitea -n gitea \
-f infrastructure/cicd/gitea/values.yaml \
--set gitea.admin.password=your-secure-password
```
### Push Existing Code
```bash
export GITEA_ADMIN_PASSWORD="your-secure-password"
./infrastructure/cicd/gitea/setup-gitea-repository.sh
```
### Verify Repository
```bash
export GITEA_ADMIN_PASSWORD="your-secure-password"
./infrastructure/cicd/gitea/test-repository-creation.sh
```
## Repository Configuration
The automatically created repository includes:
| Feature | Enabled | Description |
|---------|---------|-------------|
| Name | bakery-ia | Main project repository |
| Description | Main repository for Bakery IA project | Clear identification |
| Visibility | Public | Accessible without authentication |
| Auto Init | Yes | Creates initial README.md |
| Default Branch | main | Standard branch naming |
| Issues | Yes | Bug and feature tracking |
| Wiki | Yes | Project documentation |
| Pull Requests | Yes | Code review workflow |
| Projects | Yes | Project management |
## CI/CD Integration
The repository is ready for immediate CI/CD integration:
- **Repository URL**: `https://gitea.bakery-ia.local/bakery-admin/bakery-ia.git`
- **Clone URL**: `https://gitea.bakery-ia.local/bakery-admin/bakery-ia.git`
- **SSH URL**: `git@gitea.bakery-ia.local:bakery-admin/bakery-ia.git`
## Benefits
1. **Automation**: Eliminates manual repository creation step
2. **Consistency**: Ensures all environments have the same repository structure
3. **Reliability**: Uses Helm's declarative configuration management
4. **Documentation**: Clear repository purpose and features
5. **CI/CD Ready**: Repository is immediately available for pipeline configuration
## Troubleshooting
### Repository Not Created
1. **Check Helm Values**: Ensure the `initialRepositories` section is correctly formatted
2. **Verify Gitea Logs**: `kubectl logs -n gitea -l app.kubernetes.io/name=gitea`
3. **Manual Creation**: Use the setup script to create the repository manually
### Authentication Issues
1. **Verify Password**: Ensure `GITEA_ADMIN_PASSWORD` is correct
2. **Check Accessibility**: Confirm Gitea service is running and accessible
3. **Network Configuration**: Verify ingress and DNS settings
## Future Enhancements
Potential improvements for future iterations:
1. **Multiple Repositories**: Add more repositories for different components
2. **Webhooks**: Automatically configure webhooks for CI/CD triggers
3. **Teams and Permissions**: Set up teams and access controls
4. **Template Repositories**: Create repository templates with standard files
5. **Backup Configuration**: Add automatic backup configuration
## Conclusion
This implementation provides a robust, automated solution for Gitea repository creation in the Bakery-IA project. It leverages Helm's native capabilities to ensure consistent, reliable repository setup across all environments.


@@ -0,0 +1,188 @@
# Gitea Configuration for Bakery-IA CI/CD
This directory contains the Helm values and scripts for setting up Gitea as the Git server for the Bakery-IA project.
## Features
- **Automatic Admin User**: Admin user is created automatically from Kubernetes secret
- **Automatic Repository Creation**: The `bakery-ia` repository is created via a Kubernetes Job after Gitea starts
- **Registry Support**: Container registry enabled for storing Docker images
- **Tekton Integration**: Webhook automatically configured if Tekton is installed
## Quick Start
### Development
```bash
# 1. Setup secrets and init job (uses default dev password)
./infrastructure/cicd/gitea/setup-admin-secret.sh
# 2. Install Gitea
helm repo add gitea https://dl.gitea.io/charts
helm install gitea gitea/gitea -n gitea -f infrastructure/cicd/gitea/values.yaml
# 3. Wait for everything to be ready
kubectl wait --for=condition=ready pod -n gitea -l app.kubernetes.io/name=gitea --timeout=300s
# 4. Check init job completed
kubectl logs -n gitea -l app.kubernetes.io/component=init --tail=50
```
### Production
```bash
# 1. Generate and export secure password
export GITEA_ADMIN_PASSWORD=$(openssl rand -base64 32)
# 2. Setup secrets with production flag (requires GITEA_ADMIN_PASSWORD)
./infrastructure/cicd/gitea/setup-admin-secret.sh --production
# 3. Install Gitea with production values
helm repo add gitea https://dl.gitea.io/charts
helm upgrade --install gitea gitea/gitea -n gitea \
-f infrastructure/cicd/gitea/values.yaml \
-f infrastructure/cicd/gitea/values-prod.yaml
# 4. Wait for everything to be ready
kubectl wait --for=condition=ready pod -n gitea -l app.kubernetes.io/name=gitea --timeout=300s
# 5. Install Tekton CI/CD (see tekton-helm/README.md for details)
export TEKTON_WEBHOOK_TOKEN=$(openssl rand -hex 32)
helm upgrade --install tekton-cicd infrastructure/cicd/tekton-helm \
-n tekton-pipelines \
-f infrastructure/cicd/tekton-helm/values.yaml \
-f infrastructure/cicd/tekton-helm/values-prod.yaml \
--set secrets.webhook.token=$TEKTON_WEBHOOK_TOKEN \
--set secrets.registry.password=$GITEA_ADMIN_PASSWORD \
--set secrets.git.password=$GITEA_ADMIN_PASSWORD
```
## Files
| File | Description |
|------|-------------|
| `values.yaml` | Helm values for Gitea chart |
| `values-prod.yaml` | Production Helm values |
| `setup-admin-secret.sh` | Creates secrets and applies init job |
| `gitea-init-job.yaml` | Kubernetes Job to create initial repository |
| `setup-gitea-repository.sh` | Helper to push local code to Gitea |
## How It Works
### 1. Admin User Initialization
The Gitea Helm chart automatically creates the admin user on first install. Credentials are read from a Kubernetes secret:
```yaml
gitea:
admin:
username: bakery-admin
email: admin@bakery-ia.local
existingSecret: gitea-admin-secret # Secret with username/password keys
passwordMode: keepUpdated # Sync password changes from secret
```
The `setup-admin-secret.sh` script creates this secret before Helm install.
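If you ever need to create the secret by hand (for example, in a cluster where the script is not available), the equivalent is roughly:
```bash
# Manual equivalent of what setup-admin-secret.sh creates (sketch)
kubectl create secret generic gitea-admin-secret \
  -n gitea \
  --from-literal=username=bakery-admin \
  --from-literal=password="$GITEA_ADMIN_PASSWORD"
```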
### 2. Repository Initialization
Since the Gitea Helm chart doesn't support automatic repository creation, we use a Kubernetes Job (`gitea-init-job.yaml`) that:
1. Waits for Gitea to be ready
2. Creates the `bakery-ia` repository via Gitea API
3. Optionally configures a webhook for Tekton CI/CD
The Job is idempotent - it skips creation if the repository already exists.
## Detailed Installation
### Step 1: Create Secrets
```bash
# Using default password (for dev environments)
./infrastructure/cicd/gitea/setup-admin-secret.sh
# Or specify a custom password
./infrastructure/cicd/gitea/setup-admin-secret.sh "your-secure-password"
# Or use environment variable
export GITEA_ADMIN_PASSWORD="your-secure-password"
./infrastructure/cicd/gitea/setup-admin-secret.sh
```
This creates:
- `gitea-admin-secret` in `gitea` namespace - used by Gitea for admin credentials
- `gitea-registry-secret` in `bakery-ia` namespace - used for `imagePullSecrets`
- Applies `gitea-init-job.yaml` (ConfigMap + Job)
### Step 2: Install Gitea
```bash
helm repo add gitea https://dl.gitea.io/charts
helm repo update
helm install gitea gitea/gitea -n gitea \
-f infrastructure/cicd/gitea/values.yaml
```
### Step 3: Verify Installation
```bash
# Wait for Gitea pod
kubectl wait --for=condition=ready pod -n gitea -l app.kubernetes.io/name=gitea --timeout=300s
# Check init job logs
kubectl logs -n gitea job/gitea-init-repo
# Verify repository was created
curl -u bakery-admin:pvYUkGWJijqc0QfIZEXw \
https://gitea.bakery-ia.local/api/v1/repos/bakery-admin/bakery-ia
```
## CI/CD Integration
Repository URL:
```
https://gitea.bakery-ia.local/bakery-admin/bakery-ia.git
```
Internal cluster URL (for pipelines):
```
http://gitea-http.gitea.svc.cluster.local:3000/bakery-admin/bakery-ia.git
```
## Troubleshooting
### Init Job Failed
```bash
# Check job status
kubectl get jobs -n gitea
# View logs
kubectl logs -n gitea job/gitea-init-repo
# Re-run the job
kubectl delete job gitea-init-repo -n gitea
kubectl apply -f infrastructure/cicd/gitea/gitea-init-job.yaml
```
### Repository Not Created
1. Check if Gitea is ready: `kubectl get pods -n gitea`
2. Check init job logs: `kubectl logs -n gitea job/gitea-init-repo`
3. Manually create via API or use `setup-gitea-repository.sh`
### Authentication Issues
1. Verify secret exists: `kubectl get secret gitea-admin-secret -n gitea`
2. Check credentials: `kubectl get secret gitea-admin-secret -n gitea -o jsonpath='{.data.password}' | base64 -d`
## Upgrading
```bash
helm upgrade gitea gitea/gitea -n gitea \
-f infrastructure/cicd/gitea/values.yaml
```
Repositories and data are preserved during upgrades (stored in PVC).
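A quick post-upgrade sanity check (generic commands, not a script shipped with this repo):
```bash
# Confirm the data PVC is still bound
kubectl get pvc -n gitea
# Spot-check that the repository is still reachable
curl -sf -u "bakery-admin:$GITEA_ADMIN_PASSWORD" \
  https://gitea.bakery-ia.local/api/v1/repos/bakery-admin/bakery-ia > /dev/null && echo "repo OK"
```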


@@ -0,0 +1,176 @@
# Gitea Initialization Job
# This Job runs after Gitea is installed to create the initial repository
# It uses the same admin credentials from gitea-admin-secret
#
# Apply after Gitea is ready:
# kubectl apply -f gitea-init-job.yaml -n gitea
#
# To re-run (if needed):
# kubectl delete job gitea-init-repo -n gitea
# kubectl apply -f gitea-init-job.yaml -n gitea
---
apiVersion: v1
kind: ConfigMap
metadata:
name: gitea-init-script
namespace: gitea
labels:
app.kubernetes.io/name: gitea
app.kubernetes.io/component: init
data:
init-repo.sh: |
#!/bin/sh
set -e
GITEA_URL="http://gitea-http.gitea.svc.cluster.local:3000"
REPO_NAME="bakery-ia"
MAX_RETRIES=30
RETRY_INTERVAL=10
echo "=== Gitea Repository Initialization ==="
echo "Gitea URL: $GITEA_URL"
echo "Repository: $REPO_NAME"
echo "Admin User: $GITEA_ADMIN_USER"
# Wait for Gitea to be ready
echo ""
echo "Waiting for Gitea to be ready..."
RETRIES=0
until curl -sf "$GITEA_URL/api/v1/version" > /dev/null 2>&1; do
RETRIES=$((RETRIES + 1))
if [ $RETRIES -ge $MAX_RETRIES ]; then
echo "ERROR: Gitea did not become ready after $MAX_RETRIES attempts"
exit 1
fi
echo " Attempt $RETRIES/$MAX_RETRIES - Gitea not ready, waiting ${RETRY_INTERVAL}s..."
sleep $RETRY_INTERVAL
done
echo "Gitea is ready!"
# Check if repository already exists
echo ""
echo "Checking if repository '$REPO_NAME' exists..."
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" \
-u "$GITEA_ADMIN_USER:$GITEA_ADMIN_PASSWORD" \
"$GITEA_URL/api/v1/repos/$GITEA_ADMIN_USER/$REPO_NAME")
if [ "$HTTP_CODE" = "200" ]; then
echo "Repository '$REPO_NAME' already exists. Nothing to do."
exit 0
fi
# Create the repository
echo "Creating repository '$REPO_NAME'..."
RESPONSE=$(curl -s -w "\n%{http_code}" \
-u "$GITEA_ADMIN_USER:$GITEA_ADMIN_PASSWORD" \
-X POST "$GITEA_URL/api/v1/user/repos" \
-H "Content-Type: application/json" \
-d '{
"name": "'"$REPO_NAME"'",
"description": "Main repository for Bakery IA project - Automatically created",
"private": false,
"auto_init": true,
"default_branch": "main",
"readme": "Default"
}')
HTTP_CODE=$(echo "$RESPONSE" | tail -1)
BODY=$(echo "$RESPONSE" | sed '$d')
if [ "$HTTP_CODE" = "201" ]; then
echo "Repository '$REPO_NAME' created successfully!"
echo ""
echo "Repository URL: $GITEA_URL/$GITEA_ADMIN_USER/$REPO_NAME"
echo "Clone URL: $GITEA_URL/$GITEA_ADMIN_USER/$REPO_NAME.git"
else
echo "ERROR: Failed to create repository (HTTP $HTTP_CODE)"
echo "Response: $BODY"
exit 1
fi
# Configure webhook for Tekton (optional - if Tekton is installed)
echo ""
echo "Checking if Tekton EventListener is available..."
TEKTON_URL="http://el-bakery-ia-listener.tekton-pipelines.svc.cluster.local:8080"
if curl -sf "$TEKTON_URL" > /dev/null 2>&1; then
echo "Tekton EventListener found. Creating webhook..."
WEBHOOK_RESPONSE=$(curl -s -w "\n%{http_code}" \
-u "$GITEA_ADMIN_USER:$GITEA_ADMIN_PASSWORD" \
-X POST "$GITEA_URL/api/v1/repos/$GITEA_ADMIN_USER/$REPO_NAME/hooks" \
-H "Content-Type: application/json" \
-d '{
"type": "gitea",
"config": {
"url": "'"$TEKTON_URL"'",
"content_type": "json"
},
"events": ["push"],
"active": true
}')
WEBHOOK_CODE=$(echo "$WEBHOOK_RESPONSE" | tail -1)
if [ "$WEBHOOK_CODE" = "201" ]; then
echo "Webhook created successfully!"
else
echo "Warning: Could not create webhook (HTTP $WEBHOOK_CODE). You may need to configure it manually."
fi
else
echo "Tekton EventListener not available. Skipping webhook creation."
fi
echo ""
echo "=== Initialization Complete ==="
---
apiVersion: batch/v1
kind: Job
metadata:
name: gitea-init-repo
namespace: gitea
labels:
app.kubernetes.io/name: gitea
app.kubernetes.io/component: init
annotations:
# Helm hook annotations (if used with Helm)
helm.sh/hook: post-install,post-upgrade
helm.sh/hook-weight: "10"
helm.sh/hook-delete-policy: before-hook-creation
spec:
ttlSecondsAfterFinished: 300
backoffLimit: 3
template:
metadata:
labels:
app.kubernetes.io/name: gitea
app.kubernetes.io/component: init
spec:
restartPolicy: OnFailure
containers:
- name: init-repo
image: curlimages/curl:8.5.0
command: ["/bin/sh", "/scripts/init-repo.sh"]
env:
- name: GITEA_ADMIN_USER
valueFrom:
secretKeyRef:
name: gitea-admin-secret
key: username
- name: GITEA_ADMIN_PASSWORD
valueFrom:
secretKeyRef:
name: gitea-admin-secret
key: password
volumeMounts:
- name: init-script
mountPath: /scripts
resources:
limits:
cpu: 100m
memory: 64Mi
requests:
cpu: 50m
memory: 32Mi
volumes:
- name: init-script
configMap:
name: gitea-init-script
defaultMode: 0755


@@ -0,0 +1,209 @@
#!/bin/bash
# Setup Gitea Admin Secret and Initialize Gitea
#
# This script:
# 1. Creates gitea-admin-secret (gitea namespace) - Used by Gitea Helm chart for admin credentials
# 2. Creates gitea-registry-secret (bakery-ia namespace) - Used by pods for imagePullSecrets
# 3. Applies the gitea-init-job.yaml to create the initial repository
#
# Usage:
# Development:
# ./setup-admin-secret.sh # Uses default dev password
# ./setup-admin-secret.sh [password] # Uses provided password
# ./setup-admin-secret.sh --secrets-only # Only create secrets, skip init job
#
# Production:
# export GITEA_ADMIN_PASSWORD=$(openssl rand -base64 32)
# ./setup-admin-secret.sh --production
# ./setup-admin-secret.sh --production --secrets-only
#
# Environment variables:
# GITEA_ADMIN_PASSWORD - Password to use (required for --production)
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
KUBECTL="kubectl"
GITEA_NAMESPACE="gitea"
BAKERY_NAMESPACE="bakery-ia"
REGISTRY_HOST="registry.bakery-ia.local"
ADMIN_USERNAME="bakery-admin"
# Default password for dev environment only
# For PRODUCTION: Always set GITEA_ADMIN_PASSWORD environment variable
# Generate secure password with: openssl rand -base64 32
DEV_DEFAULT_PASSWORD="pvYUkGWJijqc0QfIZEXw"
SECRETS_ONLY=false
IS_PRODUCTION=false
# Check if running in microk8s
if command -v microk8s &> /dev/null; then
KUBECTL="microk8s kubectl"
fi
# Parse arguments
for arg in "$@"; do
case $arg in
--secrets-only)
SECRETS_ONLY=true
;;
--production)
IS_PRODUCTION=true
REGISTRY_HOST="registry.bakewise.ai"
;;
*)
if [ -z "$ADMIN_PASSWORD" ] && [ "$arg" != "--secrets-only" ] && [ "$arg" != "--production" ]; then
ADMIN_PASSWORD="$arg"
fi
;;
esac
done
# Get password from argument, environment variable, or use default (dev only)
if [ -z "$ADMIN_PASSWORD" ]; then
if [ -n "$GITEA_ADMIN_PASSWORD" ]; then
ADMIN_PASSWORD="$GITEA_ADMIN_PASSWORD"
echo "Using password from GITEA_ADMIN_PASSWORD environment variable"
elif [ "$IS_PRODUCTION" = true ]; then
echo "ERROR: Production deployment requires GITEA_ADMIN_PASSWORD environment variable"
echo "Generate a secure password with: openssl rand -base64 32"
echo ""
echo "Usage for production:"
echo " export GITEA_ADMIN_PASSWORD=\$(openssl rand -base64 32)"
echo " ./setup-admin-secret.sh --production"
exit 1
else
ADMIN_PASSWORD="$DEV_DEFAULT_PASSWORD"
echo "WARNING: Using default dev password. For production, set GITEA_ADMIN_PASSWORD"
fi
fi
# Validate password strength for production
if [ "$IS_PRODUCTION" = true ] && [ ${#ADMIN_PASSWORD} -lt 16 ]; then
echo "ERROR: Production password must be at least 16 characters"
exit 1
fi
# Create namespaces if they don't exist
$KUBECTL create namespace "$GITEA_NAMESPACE" --dry-run=client -o yaml | $KUBECTL apply -f -
$KUBECTL create namespace "$BAKERY_NAMESPACE" --dry-run=client -o yaml | $KUBECTL apply -f -
# 1. Create gitea-admin-secret for Gitea Helm chart
echo "Creating gitea-admin-secret in $GITEA_NAMESPACE namespace..."
$KUBECTL create secret generic gitea-admin-secret \
--namespace "$GITEA_NAMESPACE" \
--from-literal=username="$ADMIN_USERNAME" \
--from-literal=password="$ADMIN_PASSWORD" \
--dry-run=client -o yaml | $KUBECTL apply -f -
# 2. Create gitea-registry-secret for imagePullSecrets
echo "Creating gitea-registry-secret in $BAKERY_NAMESPACE namespace..."
# Create Docker config JSON for registry authentication
# Include both external (ingress) and internal (cluster) registry URLs
AUTH_BASE64=$(echo -n "${ADMIN_USERNAME}:${ADMIN_PASSWORD}" | base64)
INTERNAL_REGISTRY_HOST="gitea-http.gitea.svc.cluster.local:3000"
DOCKER_CONFIG_JSON=$(cat <<EOF
{
"auths": {
"${REGISTRY_HOST}": {
"username": "${ADMIN_USERNAME}",
"password": "${ADMIN_PASSWORD}",
"auth": "${AUTH_BASE64}"
},
"${INTERNAL_REGISTRY_HOST}": {
"username": "${ADMIN_USERNAME}",
"password": "${ADMIN_PASSWORD}",
"auth": "${AUTH_BASE64}"
}
}
}
EOF
)
# Base64 encode the entire config (use -w0 on Linux, no flag needed on macOS)
if [[ "$OSTYPE" == "darwin"* ]]; then
DOCKER_CONFIG_BASE64=$(echo -n "$DOCKER_CONFIG_JSON" | base64)
else
DOCKER_CONFIG_BASE64=$(echo -n "$DOCKER_CONFIG_JSON" | base64 -w0)
fi
# Create the registry secret
cat <<EOF | $KUBECTL apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: gitea-registry-secret
  namespace: ${BAKERY_NAMESPACE}
  labels:
    app.kubernetes.io/name: bakery-ia
    app.kubernetes.io/component: registry
    app.kubernetes.io/managed-by: setup-admin-secret
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: ${DOCKER_CONFIG_BASE64}
EOF
echo ""
echo "=========================================="
echo "Gitea secrets created successfully!"
echo "=========================================="
echo ""
echo "Environment: $([ "$IS_PRODUCTION" = true ] && echo "PRODUCTION" || echo "Development")"
echo ""
echo "Credentials:"
echo " Username: $ADMIN_USERNAME"
if [ "$IS_PRODUCTION" = true ]; then
echo " Password: (stored in secret, not displayed for security)"
else
echo " Password: $ADMIN_PASSWORD"
fi
echo ""
echo "Secrets created:"
echo " 1. gitea-admin-secret (namespace: $GITEA_NAMESPACE) - For Gitea Helm chart"
echo " 2. gitea-registry-secret (namespace: $BAKERY_NAMESPACE) - For imagePullSecrets"
echo ""
echo "Registry URLs:"
echo " External: https://$REGISTRY_HOST"
echo " Internal: $INTERNAL_REGISTRY_HOST"
echo ""
# Apply the init job ConfigMap and Job (but Job won't run until Gitea is installed)
if [ "$SECRETS_ONLY" = false ]; then
INIT_JOB_FILE="$SCRIPT_DIR/gitea-init-job.yaml"
if [ -f "$INIT_JOB_FILE" ]; then
echo "Applying Gitea initialization resources..."
$KUBECTL apply -f "$INIT_JOB_FILE"
echo ""
echo "Init job will create the 'bakery-ia' repository once Gitea is ready."
else
echo "Warning: gitea-init-job.yaml not found at $INIT_JOB_FILE"
fi
echo ""
fi
echo "Next steps:"
if [ "$IS_PRODUCTION" = true ]; then
echo " 1. Install Gitea for production:"
echo " helm upgrade --install gitea gitea/gitea -n gitea \\"
echo " -f infrastructure/cicd/gitea/values.yaml \\"
echo " -f infrastructure/cicd/gitea/values-prod.yaml"
echo ""
echo " 2. Install Tekton CI/CD for production:"
echo " export TEKTON_WEBHOOK_TOKEN=\$(openssl rand -hex 32)"
echo " helm upgrade --install tekton-cicd infrastructure/cicd/tekton-helm \\"
echo " -n tekton-pipelines \\"
echo " -f infrastructure/cicd/tekton-helm/values.yaml \\"
echo " -f infrastructure/cicd/tekton-helm/values-prod.yaml \\"
echo " --set secrets.webhook.token=\$TEKTON_WEBHOOK_TOKEN \\"
echo " --set secrets.registry.password=\$GITEA_ADMIN_PASSWORD \\"
echo " --set secrets.git.password=\$GITEA_ADMIN_PASSWORD"
else
echo " 1. Install Gitea (if not already installed):"
echo " helm install gitea gitea/gitea -n gitea -f infrastructure/cicd/gitea/values.yaml"
fi
echo ""
echo " $([ "$IS_PRODUCTION" = true ] && echo "3" || echo "2"). Wait for Gitea to be ready:"
echo " kubectl wait --for=condition=ready pod -n gitea -l app.kubernetes.io/name=gitea --timeout=300s"
echo ""
echo " $([ "$IS_PRODUCTION" = true ] && echo "4" || echo "3"). Check init job status:"
echo " kubectl logs -n gitea -l app.kubernetes.io/component=init --tail=50"


@@ -0,0 +1,119 @@
#!/bin/bash
# Script to setup and push code to the automatically created Gitea repository
# This script should be run after Gitea is installed and the repository is created
set -e
echo "=== Gitea Repository Setup Script ==="
echo "This script will configure the bakery-ia repository in Gitea"
echo
# Configuration - update these values as needed
GITEA_URL="https://gitea.bakery-ia.local"
GITEA_ADMIN_USER="bakery-admin"
REPO_NAME="bakery-ia"
LOCAL_DIR="/Users/urtzialfaro/Documents/bakery-ia"
# Check if Gitea admin password is set
if [ -z "$GITEA_ADMIN_PASSWORD" ]; then
echo "Error: GITEA_ADMIN_PASSWORD environment variable is not set"
echo "Please set it to the admin password you used during Gitea installation"
exit 1
fi
echo "Checking if Gitea is accessible..."
if ! curl -s -o /dev/null -w "%{http_code}" "$GITEA_URL" | grep -q "200"; then
echo "Error: Cannot access Gitea at $GITEA_URL"
echo "Please ensure Gitea is running and accessible"
exit 1
fi
echo "✓ Gitea is accessible"
echo "Checking if repository $REPO_NAME exists..."
REPO_CHECK=$(curl -s -o /dev/null -w "%{http_code}" -u "$GITEA_ADMIN_USER:$GITEA_ADMIN_PASSWORD" \
  "$GITEA_URL/api/v1/repos/$GITEA_ADMIN_USER/$REPO_NAME")
if [ "$REPO_CHECK" != "200" ]; then
echo "Repository $REPO_NAME does not exist or is not accessible"
echo "Attempting to create it..."
CREATE_RESPONSE=$(curl -s -w "\n%{http_code}" -u "$GITEA_ADMIN_USER:$GITEA_ADMIN_PASSWORD" \
  -X POST "$GITEA_URL/api/v1/user/repos" \
-H "Content-Type: application/json" \
-d '{
"name": "'"$REPO_NAME"'",
"description": "Main repository for Bakery IA project",
"private": false,
"auto_init": true,
"default_branch": "main"
}')
HTTP_CODE=$(echo "$CREATE_RESPONSE" | tail -1)
RESPONSE_BODY=$(echo "$CREATE_RESPONSE" | sed '$d')
if [ "$HTTP_CODE" != "201" ]; then
echo "Error creating repository: HTTP $HTTP_CODE"
echo "Response: $RESPONSE_BODY"
exit 1
fi
echo "✓ Repository $REPO_NAME created successfully"
else
echo "✓ Repository $REPO_NAME already exists"
fi
echo "Configuring Git repository..."
cd "$LOCAL_DIR"
# Check if this is already a git repository
if [ ! -d ".git" ]; then
echo "Initializing Git repository..."
git init
git branch -M main
else
echo "Git repository already initialized"
fi
# Configure Git user if not already set
if [ -z "$(git config user.name)" ]; then
git config user.name "$GITEA_ADMIN_USER"
git config user.email "admin@bakery-ia.local"
echo "✓ Configured Git user: $GITEA_ADMIN_USER"
fi
# Set the remote URL
GIT_REMOTE_URL="$GITEA_URL/$GITEA_ADMIN_USER/$REPO_NAME.git"
if git remote | grep -q "origin"; then
CURRENT_REMOTE=$(git remote get-url origin)
if [ "$CURRENT_REMOTE" != "$GIT_REMOTE_URL" ]; then
echo "Updating remote origin to: $GIT_REMOTE_URL"
git remote set-url origin "$GIT_REMOTE_URL"
else
echo "Remote origin is already set correctly"
fi
else
echo "Setting remote origin to: $GIT_REMOTE_URL"
git remote add origin "$GIT_REMOTE_URL"
fi
echo "Checking if there are changes to commit..."
if [ -n "$(git status --porcelain)" ]; then
echo "Committing changes..."
git add .
git commit -m "Initial commit - Bakery IA project setup"
echo "✓ Changes committed"
else
echo "No changes to commit"
fi
echo "Pushing to Gitea repository..."
git push --set-upstream origin main
echo "✓ Code pushed successfully to Gitea!"
echo "Repository URL: $GIT_REMOTE_URL"
echo "You can now configure your CI/CD pipelines to use this repository."
echo "=== Setup Complete ==="


@@ -0,0 +1,84 @@
#!/bin/bash
# Test script to verify that the Gitea repository was created successfully
set -e
echo "=== Gitea Repository Creation Test ==="
echo
# Configuration - update these values as needed
GITEA_URL="https://gitea.bakery-ia.local"
GITEA_ADMIN_USER="bakery-admin"
REPO_NAME="bakery-ia"
# Check if Gitea admin password is set
if [ -z "$GITEA_ADMIN_PASSWORD" ]; then
echo "Error: GITEA_ADMIN_PASSWORD environment variable is not set"
echo "Please set it to the admin password you used during Gitea installation"
exit 1
fi
echo "Testing Gitea accessibility..."
if ! curl -s -o /dev/null -w "%{http_code}" "$GITEA_URL" | grep -q "200"; then
echo "❌ Error: Cannot access Gitea at $GITEA_URL"
echo "Please ensure Gitea is running and accessible"
exit 1
fi
echo "✅ Gitea is accessible"
echo "Testing repository existence..."
REPO_CHECK=$(curl -s -o /dev/null -w "%{http_code}" -u "$GITEA_ADMIN_USER:$GITEA_ADMIN_PASSWORD" \
  "$GITEA_URL/api/v1/repos/$GITEA_ADMIN_USER/$REPO_NAME")
if [ "$REPO_CHECK" == "200" ]; then
echo "✅ Repository '$REPO_NAME' exists"
# Get repository details
REPO_DETAILS=$(curl -s -u "$GITEA_ADMIN_USER:$GITEA_ADMIN_PASSWORD" \
"$GITEA_URL/api/v1/repos/$GITEA_ADMIN_USER/$REPO_NAME")
REPO_DESCRIPTION=$(echo "$REPO_DETAILS" | jq -r '.description')
REPO_PRIVATE=$(echo "$REPO_DETAILS" | jq -r '.private')
REPO_DEFAULT_BRANCH=$(echo "$REPO_DETAILS" | jq -r '.default_branch')
echo "Repository Details:"
echo " - Name: $REPO_NAME"
echo " - Description: $REPO_DESCRIPTION"
echo " - Private: $REPO_PRIVATE"
echo " - Default Branch: $REPO_DEFAULT_BRANCH"
echo " - URL: $GITEA_URL/$GITEA_ADMIN_USER/$REPO_NAME"
echo " - Clone URL: $GITEA_URL/$GITEA_ADMIN_USER/$REPO_NAME.git"
# Test if repository has issues enabled
if echo "$REPO_DETAILS" | jq -e '.has_issues == true' > /dev/null; then
echo "✅ Issues are enabled"
else
echo "❌ Issues are not enabled"
fi
# Test if repository has wiki enabled
if echo "$REPO_DETAILS" | jq -e '.has_wiki == true' > /dev/null; then
echo "✅ Wiki is enabled"
else
echo "❌ Wiki is not enabled"
fi
# Test if repository has pull requests enabled
if echo "$REPO_DETAILS" | jq -e '.has_pull_requests == true' > /dev/null; then
echo "✅ Pull requests are enabled"
else
echo "❌ Pull requests are not enabled"
fi
echo
echo "✅ All tests passed! Repository is ready for use."
else
echo "❌ Repository '$REPO_NAME' does not exist"
echo "Expected HTTP 200, got: $REPO_CHECK"
exit 1
fi
echo
echo "=== Test Complete ==="


@@ -0,0 +1,65 @@
# Gitea Helm values for Production environment
# This file overrides values.yaml for production deployment
#
# Installation:
# helm upgrade --install gitea gitea/gitea -n gitea \
# -f infrastructure/cicd/gitea/values.yaml \
# -f infrastructure/cicd/gitea/values-prod.yaml
ingress:
  enabled: true
  className: nginx
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "500m"
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
    cert-manager.io/cluster-issuer: "letsencrypt-production"
  hosts:
    - host: gitea.bakewise.ai
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: gitea-tls-cert
      hosts:
        - gitea.bakewise.ai
apiIngress:
  enabled: true
  className: nginx
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "500m"
    cert-manager.io/cluster-issuer: "letsencrypt-production"
  hosts:
    - host: registry.bakewise.ai
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: registry-tls-cert
      hosts:
        - registry.bakewise.ai
gitea:
  admin:
    email: admin@bakewise.ai
  config:
    server:
      DOMAIN: gitea.bakewise.ai
      SSH_DOMAIN: gitea.bakewise.ai
      ROOT_URL: https://gitea.bakewise.ai
# Production resources - adjust based on expected load
resources:
  limits:
    cpu: 1000m
    memory: 1Gi
  requests:
    cpu: 200m
    memory: 512Mi
# Larger storage for production
persistence:
  size: 50Gi


@@ -0,0 +1,132 @@
# Gitea Helm values configuration for Bakery-IA CI/CD
# This configuration sets up Gitea with registry support and appropriate storage
#
# Prerequisites:
# 1. Run setup-admin-secret.sh to create the gitea-admin-secret
# 2. Apply the post-install job: kubectl apply -f gitea-init-job.yaml
#
# Installation:
# helm repo add gitea https://dl.gitea.io/charts
# helm install gitea gitea/gitea -n gitea -f infrastructure/cicd/gitea/values.yaml
#
# NOTE: The namespace is determined by the -n flag during helm install, not in this file.
# Use regular Gitea image instead of rootless to ensure registry functionality
# Rootless images don't support container registry due to security restrictions
image:
  rootless: false
service:
  http:
    type: ClusterIP
    port: 3000
  ssh:
    type: ClusterIP
    port: 2222
# NOTE: Gitea's container registry is served on port 3000 (same as HTTP) under /v2/
# The registry.PORT in gitea config is NOT used for external access
# Registry authentication and the API are handled by the main HTTP service
ingress:
  enabled: true
  className: nginx
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "500m"
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
  hosts:
    - host: gitea.bakery-ia.local
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: bakery-dev-tls-cert
      hosts:
        - gitea.bakery-ia.local
        - registry.bakery-ia.local
persistence:
  enabled: true
  size: 10Gi
  # Use standard storage class (works with Kind's default provisioner)
  # For microk8s: storageClass: "microk8s-hostpath"
  # For Kind: leave empty or use "standard"
  storageClass: ""
# =============================================================================
# ADMIN USER CONFIGURATION
# =============================================================================
# The admin user is automatically created on first install.
# Credentials are read from the 'gitea-admin-secret' Kubernetes secret.
#
# Create the secret BEFORE installing Gitea:
#   ./setup-admin-secret.sh
#
# The secret must contain:
#   - username: admin username (default: bakery-admin)
#   - password: admin password
# =============================================================================
gitea:
  admin:
    username: bakery-admin
    email: admin@bakery-ia.local
    # Use existing secret for admin credentials (created by setup-admin-secret.sh)
    existingSecret: gitea-admin-secret
    # keepUpdated ensures password changes in secret are applied
    passwordMode: keepUpdated
  config:
    server:
      DOMAIN: gitea.bakery-ia.local
      SSH_DOMAIN: gitea.bakery-ia.local
      SSH_PORT: 2222
      # Use HTTPS for external access; TLS termination happens at ingress
      ROOT_URL: https://gitea.bakery-ia.local
      HTTP_PORT: 3000
      # Disable built-in HTTPS since ingress handles TLS
      PROTOCOL: http
    repository:
      ENABLE_PUSH_CREATE_USER: true
      ENABLE_PUSH_CREATE_ORG: true
      DEFAULT_BRANCH: main
    packages:
      ENABLED: true
    webhook:
      ALLOWED_HOST_LIST: "*"
      # Allow internal cluster URLs for Tekton EventListener
      SKIP_TLS_VERIFY: true
    service:
      DISABLE_REGISTRATION: false
      REQUIRE_SIGNIN_VIEW: false
# Use embedded SQLite for simpler local development
# For production, enable postgresql
postgresql:
  enabled: false
# Use embedded in-memory cache for local dev
redis-cluster:
  enabled: false
# Resource configuration for local development
resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 256Mi
# Init containers timeout
initContainers:
  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 50m
      memory: 64Mi


@@ -0,0 +1,15 @@
apiVersion: v2
name: tekton-cicd
description: Tekton CI/CD infrastructure for Bakery-IA
type: application
version: 0.1.0
appVersion: "0.57.0"
maintainers:
  - name: Bakery-IA Team
    email: team@bakery-ia.local
annotations:
  category: Infrastructure
  app.kubernetes.io/name: tekton-cicd
  app.kubernetes.io/instance: tekton-cicd
  app.kubernetes.io/version: "0.57.0"
  app.kubernetes.io/part-of: bakery-ia


@@ -0,0 +1,145 @@
# Gitea Admin Secret Integration for Tekton
This document explains how Tekton CI/CD integrates with the existing Gitea admin secret to ensure credential consistency across the system.
## Architecture Overview
```mermaid
graph TD
A[Gitea Admin Secret] --> B[Tekton Registry Credentials]
A --> C[Tekton Git Credentials]
A --> D[Flux Git Credentials]
B --> E[Kaniko Build Task]
C --> F[GitOps Update Task]
D --> G[Flux GitRepository]
```
## How It Works
The system uses Helm's `lookup` function to reference the existing `gitea-admin-secret` from the Gitea namespace, ensuring that:
1. **Single Source of Truth**: All CI/CD components use the same credentials as Gitea
2. **Automatic Synchronization**: When Gitea admin password changes, all CI/CD components automatically use the new credentials
3. **Reduced Maintenance**: No need to manually update credentials in multiple places
## Secret Reference Flow
```
Gitea Namespace: gitea-admin-secret
  ├── username: bakery-admin
  └── password: [secure-password]

Tekton Namespace:
  ├── gitea-registry-credentials (dockerconfigjson)
  │     └── references gitea-admin-secret.password
  ├── gitea-git-credentials (opaque)
  │     └── references gitea-admin-secret.password
  └── gitea-credentials (opaque) [flux-system namespace]
        └── references gitea-admin-secret.password
```
## Deployment Requirements
### Prerequisites
1. **Gitea must be installed first**: The `gitea-admin-secret` must exist before deploying Tekton
2. **Same username**: All components use `bakery-admin` as the username
3. **Namespace access**: Tekton service account needs read access to Gitea namespace secrets
### Installation Steps
1. **Install Gitea with admin secret**:
```bash
# Run the setup script to create gitea-admin-secret
./infrastructure/cicd/gitea/setup-admin-secret.sh your-secure-password
# Install Gitea Helm chart
helm install gitea gitea/gitea -n gitea -f infrastructure/cicd/gitea/values.yaml
```
2. **Install Tekton with secret references**:
```bash
# Install Tekton - it will automatically reference the Gitea admin secret
helm install tekton-cicd infrastructure/cicd/tekton-helm \
--namespace tekton-pipelines \
--set secrets.webhook.token="your-webhook-token"
```
## Troubleshooting
### Common Issues
1. **Secret not found error**:
- Ensure Gitea is installed before Tekton
- Verify the `gitea-admin-secret` exists in the `gitea` namespace
- Check that Tekton service account has RBAC permissions to read Gitea secrets
2. **Authentication failures**:
- Verify the Gitea admin password is correct
- Ensure the username is `bakery-admin` (matching the Gitea admin)
- Check that the password hasn't been manually changed in Gitea UI
### Debugging Commands
```bash
# Check if gitea-admin-secret exists
kubectl get secret gitea-admin-secret -n gitea
# Verify Tekton secrets were created correctly
kubectl get secret gitea-registry-credentials -n tekton-pipelines -o yaml
kubectl get secret gitea-git-credentials -n tekton-pipelines -o yaml
kubectl get secret gitea-credentials -n flux-system -o yaml
# Check RBAC permissions
kubectl get role,rolebinding,clusterrole,clusterrolebinding -n tekton-pipelines
```
## Security Considerations
### Benefits
1. **Reduced attack surface**: Fewer secrets to manage and rotate
2. **Automatic rotation**: Changing Gitea admin password automatically updates all CI/CD components
3. **Consistent access control**: Single point for credential management
### Best Practices
1. **Use strong passwords**: Generate secure random passwords for Gitea admin
2. **Rotate regularly**: Change the Gitea admin password periodically
3. **Limit access**: Restrict who can read the `gitea-admin-secret`
4. **Audit logs**: Monitor access to the admin secret
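A possible rotation procedure, sketched under the assumption that Gitea was installed with `passwordMode: keepUpdated` (as in this repo's values) and that the Tekton chart resolves credentials via `lookup` at render time:
```bash
# Sketch only - verify against your installation before using
export GITEA_ADMIN_PASSWORD=$(openssl rand -base64 32)
kubectl create secret generic gitea-admin-secret -n gitea \
  --from-literal=username=bakery-admin \
  --from-literal=password="$GITEA_ADMIN_PASSWORD" \
  --dry-run=client -o yaml | kubectl apply -f -
# Re-render the Tekton chart so the lookup-based secrets pick up the new password
helm upgrade tekton-cicd infrastructure/cicd/tekton-helm -n tekton-pipelines --reuse-values
```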
## Manual Override
If you need to use different credentials for specific components, you can override the values:
```bash
helm install tekton-cicd infrastructure/cicd/tekton-helm \
--namespace tekton-pipelines \
--set secrets.webhook.token="your-webhook-token" \
--set secrets.registry.password="custom-registry-password" \
--set secrets.git.password="custom-git-password"
```
However, this is **not recommended** as it breaks the single source of truth principle.
## Helm Template Details
The integration uses Helm's `lookup` function with `b64dec` to decode the base64-encoded password:
```yaml
password: {{ .Values.secrets.git.password | default (lookup "v1" "Secret" "gitea" "gitea-admin-secret").data.password | b64dec | quote }}
```
This means:
1. Look up the `gitea-admin-secret` in the `gitea` namespace
2. Get the `password` field from the secret's `data` section
3. Base64 decode it (Kubernetes stores secret data as base64)
4. Use it as the password value
5. If `.Values.secrets.git.password` is provided, use that instead (for manual override)
## Conclusion
This integration provides a robust, secure way to manage credentials across the CI/CD pipeline while maintaining consistency with Gitea's admin credentials.


@@ -0,0 +1,83 @@
# Tekton CI/CD Helm Chart
This Helm chart deploys the Tekton CI/CD infrastructure for the Bakery-IA project.
## Prerequisites
- Kubernetes 1.20+
- Tekton Pipelines installed (v0.57.0 or later)
- Helm 3.0+
## Installation
Before installing this chart, Tekton Pipelines must be installed separately:
```bash
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml
```
Then install the chart:
### Development Installation
```bash
helm install tekton-cicd infrastructure/cicd/tekton-helm \
--namespace tekton-pipelines \
--create-namespace
```
### Production Installation
**Important**: Never use default secrets in production. Always provide secure credentials.
```bash
# Generate secure webhook token
export TEKTON_WEBHOOK_TOKEN=$(openssl rand -hex 32)
# Use the same password as Gitea admin (from GITEA_ADMIN_PASSWORD)
helm upgrade --install tekton-cicd infrastructure/cicd/tekton-helm \
-n tekton-pipelines \
-f infrastructure/cicd/tekton-helm/values.yaml \
-f infrastructure/cicd/tekton-helm/values-prod.yaml \
--set secrets.webhook.token=$TEKTON_WEBHOOK_TOKEN \
--set secrets.registry.password=$GITEA_ADMIN_PASSWORD \
--set secrets.git.password=$GITEA_ADMIN_PASSWORD
```
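Once the upgrade completes, a quick smoke check confirms the pipeline, listener, and generated secrets are in place. This is a sketch: the `el-` Service name follows the Tekton Triggers naming convention, and the secret names are the chart defaults:
```bash
# Pipeline and trigger resources created by the chart
kubectl get pipeline,eventlistener,triggertemplate,triggerbinding -n tekton-pipelines
# Service exposing the EventListener (created by the Triggers controller)
kubectl get svc el-bakery-ia-event-listener -n tekton-pipelines
# Secrets generated from gitea-admin-secret
kubectl get secret gitea-registry-credentials gitea-git-credentials -n tekton-pipelines
kubectl get secret gitea-credentials -n flux-system
```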
## Configuration
The following table lists the configurable parameters of the tekton-cicd chart and their default values.
| Parameter | Description | Default |
|-----------|-------------|---------|
| `global.registry.url` | Container registry URL | `"gitea-http.gitea.svc.cluster.local:3000/bakery-admin"` |
| `global.git.branch` | Git branch name | `"main"` |
| `global.git.userName` | Git user name | `"bakery-ia-ci"` |
| `global.git.userEmail` | Git user email | `"ci@bakery-ia.local"` |
| `pipeline.build.cacheTTL` | Build cache TTL | `"24h"` |
| `pipeline.build.verbosity` | Build verbosity level | `"info"` |
| `pipeline.test.skipTests` | Skip tests flag | `"false"` |
| `pipeline.test.skipLint` | Skip lint flag | `"false"` |
| `pipeline.deployment.namespace` | Deployment namespace | `"bakery-ia"` |
| `pipeline.deployment.fluxNamespace` | Flux namespace | `"flux-system"` |
| `pipeline.workspace.size` | Workspace size | `"5Gi"` |
| `pipeline.workspace.storageClass` | Workspace storage class | `"standard"` |
| `secrets.webhook.token` | Webhook validation token | `"secure-webhook-token-replace-with-actual-value"` |
| `secrets.registry.username` | Registry username | `"bakery-admin"` |
| `secrets.registry.password` | Registry password | `""` (falls back to `gitea-admin-secret`) |
| `secrets.registry.registryUrl` | Registry URL | `"gitea-http.gitea.svc.cluster.local:3000"` |
| `secrets.git.username` | Git username | `"bakery-admin"` |
| `secrets.git.password` | Git password | `""` (falls back to `gitea-admin-secret`) |
| `namespace` | Namespace for Tekton resources | `""` (namespace creation skipped by default) |
## Uninstallation
To uninstall/delete the `tekton-cicd` release:
```bash
helm delete tekton-cicd --namespace tekton-pipelines
```
## Values
For a detailed list of configurable values, see the `values.yaml` file.

View File

@@ -0,0 +1,22 @@
Thank you for installing {{ .Chart.Name }}.
This chart deploys the Tekton CI/CD infrastructure for Bakery-IA.
IMPORTANT: Tekton Pipelines must be installed separately before deploying this chart.
To install Tekton Pipelines, run:
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml
To verify Tekton is running:
kubectl get pods -n tekton-pipelines
After Tekton is installed, this chart will deploy:
- ConfigMaps with pipeline configuration
- RBAC resources for triggers and pipelines
- Secrets for registry and Git credentials
- Tasks, Pipelines, and Triggers for CI/CD
To check the status of deployed resources:
kubectl get all -n {{ .Release.Namespace }}
For more information about Tekton, visit: https://tekton.dev/

View File

@@ -0,0 +1,80 @@
# ClusterRole for Tekton Triggers to create PipelineRuns
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: tekton-triggers-role
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: triggers
rules:
# Ability to create PipelineRuns from triggers
- apiGroups: ["tekton.dev"]
resources: ["pipelineruns", "taskruns"]
verbs: ["create", "get", "list", "watch"]
# Ability to read pipelines and tasks
- apiGroups: ["tekton.dev"]
resources: ["pipelines", "tasks", "clustertasks"]
verbs: ["get", "list", "watch"]
# Ability to manage PVCs for workspaces
- apiGroups: [""]
resources: ["persistentvolumeclaims"]
verbs: ["create", "get", "list", "watch", "delete"]
# Ability to read secrets for credentials
- apiGroups: [""]
resources: ["secrets"]
verbs: ["get", "list", "watch"]
# Ability to read configmaps
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["get", "list", "watch"]
# Ability to manage events for logging
- apiGroups: [""]
resources: ["events"]
verbs: ["create", "patch"]
# Ability to list cluster-scoped trigger resources (needed for Tekton Triggers controller)
- apiGroups: ["triggers.tekton.dev"]
resources: ["clustertriggerbindings", "clusterinterceptors"]
verbs: ["get", "list", "watch"]
---
# ClusterRole for Pipeline execution (needed for git operations and deployments)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: tekton-pipeline-role
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: pipeline
rules:
# Ability to read/update deployments for GitOps
- apiGroups: ["apps"]
resources: ["deployments"]
verbs: ["get", "list", "watch", "patch", "update"]
# Ability to read secrets for credentials
- apiGroups: [""]
resources: ["secrets"]
verbs: ["get", "list", "watch"]
# Ability to read configmaps
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["get", "list", "watch"]
# Ability to manage pods for build operations
- apiGroups: [""]
resources: ["pods", "pods/log"]
verbs: ["get", "list", "watch"]
---
# Role for EventListener to access triggers resources
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: tekton-triggers-eventlistener-role
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: triggers
rules:
- apiGroups: ["triggers.tekton.dev"]
resources: ["eventlisteners", "triggerbindings", "triggertemplates", "triggers", "interceptors"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["configmaps", "secrets"]
verbs: ["get", "list", "watch"]

View File

@@ -0,0 +1,32 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: pipeline-config
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: config
data:
# Container Registry Configuration
REGISTRY_URL: "{{ .Values.global.registry.url }}"
# Git Configuration
GIT_BRANCH: "{{ .Values.global.git.branch }}"
GIT_USER_NAME: "{{ .Values.global.git.userName }}"
GIT_USER_EMAIL: "{{ .Values.global.git.userEmail }}"
# Build Configuration
BUILD_CACHE_TTL: "{{ .Values.pipeline.build.cacheTTL }}"
BUILD_VERBOSITY: "{{ .Values.pipeline.build.verbosity }}"
# Test Configuration
SKIP_TESTS: "{{ .Values.pipeline.test.skipTests }}"
SKIP_LINT: "{{ .Values.pipeline.test.skipLint }}"
# Deployment Configuration
DEPLOY_NAMESPACE: "{{ .Values.pipeline.deployment.namespace }}"
FLUX_NAMESPACE: "{{ .Values.pipeline.deployment.fluxNamespace }}"
# Workspace Configuration
WORKSPACE_SIZE: "{{ .Values.pipeline.workspace.size }}"
WORKSPACE_STORAGE_CLASS: "{{ .Values.pipeline.workspace.storageClass }}"

View File

@@ -0,0 +1,32 @@
# Tekton EventListener for Bakery-IA CI/CD
# This listener receives webhook events and triggers pipelines
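# The Triggers controller exposes this listener through a Service (by convention named el-<listener-name>, port 8080 by default); point the Gitea webhook at that Service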
apiVersion: triggers.tekton.dev/v1beta1
kind: EventListener
metadata:
name: bakery-ia-event-listener
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: triggers
spec:
serviceAccountName: {{ .Values.serviceAccounts.triggers.name }}
triggers:
- name: bakery-ia-gitea-trigger
interceptors:
- ref:
name: "cel"
params:
- name: "filter"
value: "has(body.repository) && body.ref.contains('main')"
- ref:
name: "bitbucket"
params:
- name: "secretRef"
value:
secretName: gitea-webhook-secret
secretKey: secretToken
bindings:
- ref: bakery-ia-trigger-binding
template:
ref: bakery-ia-trigger-template

View File

@@ -0,0 +1,9 @@
{{- if .Values.namespace }}
apiVersion: v1
kind: Namespace
metadata:
name: {{ .Values.namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: {{ .Values.labels.app.component }}
{{- end }}

View File

@@ -0,0 +1,164 @@
# Main CI Pipeline for Bakery-IA
# This pipeline orchestrates the build, test, and deploy process
# Includes: fetch -> detect changes -> test -> build -> update gitops
# Supports environment-configurable base images for dev/prod flexibility
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
name: bakery-ia-ci
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: pipeline
spec:
workspaces:
- name: shared-workspace
description: Shared workspace for source code
- name: docker-credentials
description: Docker registry credentials
- name: git-credentials
description: Git credentials for pushing GitOps updates
optional: true
params:
- name: git-url
type: string
description: Repository URL
- name: git-revision
type: string
description: Git revision/commit hash
- name: registry
type: string
description: Container registry URL for pushing built images
- name: git-branch
type: string
description: Target branch for GitOps updates
default: "main"
- name: skip-tests
type: string
description: Skip tests if "true"
default: "false"
- name: dry-run
type: string
description: Dry run mode - don't push changes
default: "false"
# Base image configuration for environment-specific builds
- name: base-registry
type: string
description: "Base image registry URL (e.g., docker.io for prod, localhost:5000 for dev)"
default: "{{ .Values.pipeline.build.baseRegistry }}"
- name: python-image
type: string
description: "Python base image name and tag (e.g., python:3.11-slim for prod)"
default: "{{ .Values.pipeline.build.pythonImage }}"
tasks:
# Stage 1: Fetch source code
- name: fetch-source
taskRef:
name: git-clone
workspaces:
- name: output
workspace: shared-workspace
params:
- name: url
value: $(params.git-url)
- name: revision
value: $(params.git-revision)
# Stage 2: Detect which services changed
- name: detect-changes
runAfter: [fetch-source]
taskRef:
name: detect-changed-services
workspaces:
- name: source
workspace: shared-workspace
# Stage 3: Run tests on changed services
- name: run-tests
runAfter: [detect-changes]
taskRef:
name: run-tests
when:
- input: "$(tasks.detect-changes.results.changed-services)"
operator: notin
values: ["none", "infrastructure"]
- input: "$(params.skip-tests)"
operator: notin
values: ["true"]
workspaces:
- name: source
workspace: shared-workspace
params:
- name: services
value: $(tasks.detect-changes.results.changed-services)
- name: skip-tests
value: $(params.skip-tests)
# Stage 4: Build and push container images
- name: build-and-push
runAfter: [run-tests]
taskRef:
name: kaniko-build
when:
- input: "$(tasks.detect-changes.results.changed-services)"
operator: notin
values: ["none", "infrastructure"]
workspaces:
- name: source
workspace: shared-workspace
- name: docker-credentials
workspace: docker-credentials
params:
- name: services
value: $(tasks.detect-changes.results.changed-services)
- name: registry
value: $(params.registry)
- name: git-revision
value: $(params.git-revision)
# Environment-configurable base images
- name: base-registry
value: $(params.base-registry)
- name: python-image
value: $(params.python-image)
# Stage 5: Update GitOps manifests
- name: update-gitops-manifests
runAfter: [build-and-push]
taskRef:
name: update-gitops
when:
- input: "$(tasks.detect-changes.results.changed-services)"
operator: notin
values: ["none", "infrastructure"]
- input: "$(tasks.build-and-push.results.build-status)"
operator: in
values: ["success", "partial"]
workspaces:
- name: source
workspace: shared-workspace
- name: git-credentials
workspace: git-credentials
params:
- name: services
value: $(tasks.detect-changes.results.changed-services)
- name: registry
value: $(params.registry)
- name: git-revision
value: $(params.git-revision)
- name: git-branch
value: $(params.git-branch)
- name: dry-run
value: $(params.dry-run)
# Final tasks that run regardless of pipeline success/failure
finally:
- name: pipeline-summary
taskRef:
name: pipeline-summary
params:
- name: changed-services
value: $(tasks.detect-changes.results.changed-services)
- name: git-revision
value: $(params.git-revision)

View File

@@ -0,0 +1,51 @@
# ClusterRoleBinding for Tekton Triggers
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: tekton-triggers-binding
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: triggers
subjects:
- kind: ServiceAccount
name: {{ .Values.serviceAccounts.triggers.name }}
namespace: {{ .Release.Namespace }}
roleRef:
kind: ClusterRole
name: tekton-triggers-role
apiGroup: rbac.authorization.k8s.io
---
# ClusterRoleBinding for Pipeline execution
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: tekton-pipeline-binding
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: pipeline
subjects:
- kind: ServiceAccount
name: {{ .Values.serviceAccounts.pipeline.name }}
namespace: {{ .Release.Namespace }}
roleRef:
kind: ClusterRole
name: tekton-pipeline-role
apiGroup: rbac.authorization.k8s.io
---
# RoleBinding for EventListener
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: tekton-triggers-eventlistener-binding
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: triggers
subjects:
- kind: ServiceAccount
name: {{ .Values.serviceAccounts.triggers.name }}
namespace: {{ .Release.Namespace }}
roleRef:
kind: Role
name: tekton-triggers-eventlistener-role
apiGroup: rbac.authorization.k8s.io

View File

@@ -0,0 +1,87 @@
# Secret for Gitea webhook validation
# Used by EventListener to validate incoming webhooks
apiVersion: v1
kind: Secret
metadata:
name: gitea-webhook-secret
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: triggers
annotations:
note: "Webhook secret for validating incoming webhooks"
type: Opaque
stringData:
secretToken: {{ .Values.secrets.webhook.token | quote }}
---
# Secret for Gitea container registry credentials
# Used by Kaniko to push images to Gitea registry
# References the existing gitea-admin-secret for consistency
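# NOTE: lookup only returns data against a live cluster; during `helm template` or a client-side dry-run it is empty and the PLACEHOLDER_PASSWORD fallback below applies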
{{- $giteaSecret := (lookup "v1" "Secret" "gitea" "gitea-admin-secret") }}
{{- $giteaPassword := "" }}
{{- if and $giteaSecret $giteaSecret.data (index $giteaSecret.data "password") }}
{{- $giteaPassword = index $giteaSecret.data "password" | b64dec }}
{{- end }}
apiVersion: v1
kind: Secret
metadata:
name: gitea-registry-credentials
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: build
annotations:
note: "Registry credentials for pushing images - references gitea-admin-secret"
type: kubernetes.io/dockerconfigjson
stringData:
{{- $registryPassword := .Values.secrets.registry.password | default $giteaPassword | default "PLACEHOLDER_PASSWORD" }}
{{- if and .Values.secrets.registry.registryUrl .Values.secrets.registry.username }}
.dockerconfigjson: |
{
"auths": {
{{ .Values.secrets.registry.registryUrl | quote }}: {
"username": {{ .Values.secrets.registry.username | quote }},
"password": {{ $registryPassword | quote }}
}
}
}
{{- else }}
.dockerconfigjson: '{"auths":{}}'
{{- end }}
---
# Secret for Git credentials (used by pipeline to push GitOps updates)
# References the existing gitea-admin-secret for consistency
apiVersion: v1
kind: Secret
metadata:
name: gitea-git-credentials
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: gitops
annotations:
note: "Git credentials for GitOps updates - references gitea-admin-secret"
type: Opaque
stringData:
{{- $gitPassword := .Values.secrets.git.password | default $giteaPassword | default "PLACEHOLDER_PASSWORD" }}
username: {{ .Values.secrets.git.username | quote }}
password: {{ $gitPassword | quote }}
---
# Secret for Flux GitRepository access
# Used by Flux to pull from Gitea repository
# References the existing gitea-admin-secret for consistency
apiVersion: v1
kind: Secret
metadata:
name: gitea-credentials
namespace: {{ .Values.pipeline.deployment.fluxNamespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: flux
annotations:
note: "Credentials for Flux GitRepository access - references gitea-admin-secret"
type: Opaque
stringData:
{{- $fluxPassword := .Values.secrets.git.password | default $giteaPassword | default "PLACEHOLDER_PASSWORD" }}
username: {{ .Values.secrets.git.username | quote }}
password: {{ $fluxPassword | quote }}

View File

@@ -0,0 +1,19 @@
# ServiceAccount for Tekton Triggers EventListener
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ .Values.serviceAccounts.triggers.name }}
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: triggers
---
# ServiceAccount for Pipeline execution
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ .Values.serviceAccounts.pipeline.name }}
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: pipeline

View File

@@ -0,0 +1,87 @@
# Tekton Task to Detect Changed Services
# This task analyzes git changes to determine which services need to be built
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
name: detect-changed-services
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: detection
spec:
workspaces:
- name: source
description: Workspace containing the source code
results:
- name: changed-services
description: Comma-separated list of changed services
steps:
- name: detect-changes
image: alpine/git
script: |
#!/bin/bash
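# NOTE: this script relies on bash-only features (arrays, [[ ]], process substitution), so the step image must provide bash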
set -e
cd $(workspaces.source.path)
# Get the list of changed files
CHANGED_FILES=$(git diff --name-only HEAD~1 HEAD 2>/dev/null || git diff --name-only $(git rev-parse --abbrev-ref HEAD)@{upstream} HEAD 2>/dev/null || echo "")
if [ -z "$CHANGED_FILES" ]; then
# No changes detected, assume all services need building
echo "No git changes detected, building all services"
echo "all" > $(results.changed-services.path)
exit 0
fi
# Initialize an array to collect changed services
declare -a changed_services=()
# Check for changes in services/ directory
while IFS= read -r service_dir; do
if [ -n "$service_dir" ]; then
service_name=$(basename "$service_dir")
if [[ ! " ${changed_services[@]} " =~ " ${service_name} " ]]; then
changed_services+=("$service_name")
fi
fi
done < <(echo "$CHANGED_FILES" | grep '^services/' | cut -d'/' -f2 | sort -u)
# Check for changes in gateway/ directory
if echo "$CHANGED_FILES" | grep -q '^gateway/'; then
if [[ ! " ${changed_services[@]} " =~ " gateway " ]]; then
changed_services+=("gateway")
fi
fi
# Check for changes in frontend/ directory
if echo "$CHANGED_FILES" | grep -q '^frontend/'; then
if [[ ! " ${changed_services[@]} " =~ " frontend " ]]; then
changed_services+=("frontend")
fi
fi
# Check for changes in shared/ directory (might affect multiple services)
if echo "$CHANGED_FILES" | grep -q '^shared/'; then
if [[ ! " ${changed_services[@]} " =~ " shared " ]]; then
changed_services+=("shared")
fi
fi
# Convert array to comma-separated string
CHANGED_SERVICES=""
for service in "${changed_services[@]}"; do
if [ -z "$CHANGED_SERVICES" ]; then
CHANGED_SERVICES="$service"
else
CHANGED_SERVICES="$CHANGED_SERVICES,$service"
fi
done
if [ -z "$CHANGED_SERVICES" ]; then
# Changes are in infrastructure or other non-service files
echo "infrastructure" > $(results.changed-services.path)
else
echo "$CHANGED_SERVICES" > $(results.changed-services.path)
fi

View File

@@ -0,0 +1,95 @@
# Tekton Git Clone Task for Bakery-IA CI/CD
# This task clones the source code repository
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
name: git-clone
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: source
spec:
workspaces:
- name: output
description: Workspace to clone the repository into
params:
- name: url
type: string
description: Repository URL to clone
- name: revision
type: string
description: Git revision to checkout
default: "main"
- name: depth
type: string
description: Git clone depth (0 for full history)
default: "1"
results:
- name: commit-sha
description: The commit SHA that was checked out
- name: commit-message
description: The commit message
steps:
- name: clone
image: alpine/git:2.43.0
script: |
#!/bin/sh
set -e
URL="$(params.url)"
REVISION="$(params.revision)"
DEPTH="$(params.depth)"
OUTPUT_PATH="$(workspaces.output.path)"
echo "============================================"
echo "Git Clone Task"
echo "============================================"
echo "URL: $URL"
echo "Revision: $REVISION"
echo "Depth: $DEPTH"
echo "============================================"
# Clone with depth for faster checkout
if [ "$DEPTH" = "0" ]; then
echo "Cloning full repository..."
git clone "$URL" "$OUTPUT_PATH"
else
echo "Cloning with depth $DEPTH..."
git clone --depth "$DEPTH" "$URL" "$OUTPUT_PATH"
fi
cd "$OUTPUT_PATH"
# Fetch the specific revision if needed
if [ "$REVISION" != "main" ] && [ "$REVISION" != "master" ]; then
echo "Fetching revision: $REVISION"
git fetch --depth 1 origin "$REVISION" 2>/dev/null || true
fi
# Checkout the revision
echo "Checking out: $REVISION"
git checkout "$REVISION" 2>/dev/null || git checkout "origin/$REVISION"
# Get commit info
COMMIT_SHA=$(git rev-parse HEAD)
COMMIT_MSG=$(git log -1 --pretty=format:"%s")
echo ""
echo "============================================"
echo "Clone Complete"
echo "============================================"
echo "Commit: $COMMIT_SHA"
echo "Message: $COMMIT_MSG"
echo "============================================"
# Write results
echo -n "$COMMIT_SHA" > $(results.commit-sha.path)
echo -n "$COMMIT_MSG" > $(results.commit-message.path)
resources:
limits:
cpu: 500m
memory: 512Mi
requests:
cpu: 100m
memory: 128Mi

View File

@@ -0,0 +1,103 @@
# Tekton Kaniko Build Task for Bakery-IA CI/CD
# This task builds and pushes container images using Kaniko
# Supports environment-configurable base images via build-args
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
name: kaniko-build
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: build
spec:
workspaces:
- name: source
description: Workspace containing the source code
- name: docker-credentials
description: Docker registry credentials
params:
- name: services
type: string
description: Comma-separated list of services to build
- name: registry
type: string
description: Container registry URL for pushing built images
- name: git-revision
type: string
description: Git revision to tag images with
- name: base-registry
type: string
description: Base image registry URL (e.g., docker.io, ghcr.io/org)
default: "gitea-http.gitea.svc.cluster.local:3000/bakery-admin"
- name: python-image
type: string
description: Python base image name and tag
default: "python_3.11-slim"
results:
- name: build-status
description: Status of the build operation
steps:
- name: build-and-push
image: gcr.io/kaniko-project/executor:v1.15.0
env:
- name: DOCKER_CONFIG
value: /tekton/home/.docker
script: |
#!/bin/bash
set -e
echo "==================================================================="
echo "Kaniko Build Configuration"
echo "==================================================================="
echo "Target Registry: $(params.registry)"
echo "Base Registry: $(params.base-registry)"
echo "Python Image: $(params.python-image)"
echo "Git Revision: $(params.git-revision)"
echo "==================================================================="
# Split services parameter by comma
IFS=',' read -ra SERVICES <<< "$(params.services)"
# Build each service
for service in "${SERVICES[@]}"; do
service=$(echo "$service" | xargs) # Trim whitespace
if [ -n "$service" ] && [ "$service" != "none" ]; then
echo ""
echo "Building service: $service"
echo "-------------------------------------------------------------------"
# Determine Dockerfile path (services vs gateway vs frontend)
if [ "$service" = "gateway" ]; then
DOCKERFILE_PATH="$(workspaces.source.path)/gateway/Dockerfile"
elif [ "$service" = "frontend" ]; then
DOCKERFILE_PATH="$(workspaces.source.path)/frontend/Dockerfile.kubernetes"
else
DOCKERFILE_PATH="$(workspaces.source.path)/services/$service/Dockerfile"
fi
/kaniko/executor \
--dockerfile="$DOCKERFILE_PATH" \
--destination="$(params.registry)/$service:$(params.git-revision)" \
--context="$(workspaces.source.path)" \
--build-arg="BASE_REGISTRY=$(params.base-registry)" \
--build-arg="PYTHON_IMAGE=$(params.python-image)" \
--cache=true \
--cache-repo="$(params.registry)/cache"
echo "Successfully built: $(params.registry)/$service:$(params.git-revision)"
fi
done
echo ""
echo "==================================================================="
echo "Build completed successfully!"
echo "==================================================================="
echo "success" > $(results.build-status.path)
resources:
limits:
cpu: 2000m
memory: 4Gi
requests:
cpu: 500m
memory: 1Gi

View File

@@ -0,0 +1,33 @@
# Tekton Task for Pipeline Summary
# This task generates a summary of the pipeline execution
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
name: pipeline-summary
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: summary
spec:
params:
- name: changed-services
type: string
description: Services that were changed
- name: git-revision
type: string
description: Git revision being processed
steps:
- name: generate-summary
image: alpine
script: |
#!/bin/bash
set -e
echo "=== Bakery-IA CI Pipeline Summary ==="
echo "Git Revision: $(params.git-revision)"
echo "Changed Services: $(params.changed-services)"
echo "Pipeline completed successfully"
# Log summary to stdout for visibility
echo "Summary generated"

View File

@@ -0,0 +1,86 @@
# Tekton Run Tests Task for Bakery-IA CI/CD
# This task runs tests on the source code
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
name: run-tests
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: test
spec:
workspaces:
- name: source
description: Workspace containing the source code
params:
- name: services
type: string
description: Comma-separated list of services to test
- name: skip-tests
type: string
description: Skip tests if "true"
default: "false"
steps:
- name: run-unit-tests
image: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/python_3.11-slim:latest
workingDir: $(workspaces.source.path)
script: |
#!/bin/bash
set -e
echo "============================================"
echo "Running Unit Tests"
echo "Services: $(params.services)"
echo "Skip tests: $(params.skip-tests)"
echo "============================================"
if [ "$(params.skip-tests)" = "true" ]; then
echo "Skipping tests as requested"
exit 0
fi
# Install dependencies if requirements file exists
if [ -f "requirements.txt" ]; then
pip install --no-cache-dir -r requirements.txt
fi
# Run unit tests
python -m pytest tests/unit/ -v
echo "Unit tests completed successfully"
resources:
limits:
cpu: 1000m
memory: 2Gi
requests:
cpu: 200m
memory: 512Mi
- name: run-integration-tests
image: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/python_3.11-slim:latest
workingDir: $(workspaces.source.path)
script: |
#!/bin/bash
set -e
echo "============================================"
echo "Running Integration Tests"
echo "Services: $(params.services)"
echo "============================================"
if [ "$(params.skip-tests)" = "true" ]; then
echo "Skipping integration tests as requested"
exit 0
fi
# Run integration tests
python -m pytest tests/integration/ -v
echo "Integration tests completed successfully"
resources:
limits:
cpu: 1000m
memory: 2Gi
requests:
cpu: 200m
memory: 512Mi

View File

@@ -0,0 +1,153 @@
# Tekton Update GitOps Task for Bakery-IA CI/CD
# This task updates GitOps manifests with new image tags
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
name: update-gitops
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: gitops
spec:
workspaces:
- name: source
description: Workspace containing the source code
- name: git-credentials
description: Git credentials for pushing changes
params:
- name: services
type: string
description: Comma-separated list of services to update
- name: registry
type: string
description: Container registry URL
- name: git-revision
type: string
description: Git revision to tag images with
- name: git-branch
type: string
description: Git branch to push changes to
- name: dry-run
type: string
description: Dry run mode - don't push changes
default: "false"
steps:
- name: update-manifests
image: alpine/git:2.43.0
workingDir: $(workspaces.source.path)
env:
- name: GIT_USERNAME
valueFrom:
secretKeyRef:
name: gitea-git-credentials
key: username
- name: GIT_PASSWORD
valueFrom:
secretKeyRef:
name: gitea-git-credentials
key: password
script: |
#!/bin/bash
set -e
echo "============================================"
echo "Updating GitOps Manifests"
echo "Services: $(params.services)"
echo "Registry: $(params.registry)"
echo "Revision: $(params.git-revision)"
echo "Branch: $(params.git-branch)"
echo "Dry run: $(params.dry-run)"
echo "============================================"
# Configure git
git config --global user.email "ci@bakery-ia.local"
git config --global user.name "bakery-ia-ci"
# Clone the main repository (not a separate gitops repo)
# Use internal cluster DNS which works in all environments
REPO_URL="https://${GIT_USERNAME}:${GIT_PASSWORD}@gitea-http.gitea.svc.cluster.local:3000/bakery-admin/bakery-ia.git"
git clone "$REPO_URL" /tmp/gitops
cd /tmp/gitops
# Switch to target branch
git checkout "$(params.git-branch)" || git checkout -b "$(params.git-branch)"
# Update image tags in Kubernetes manifests
for service in $(echo "$(params.services)" | tr ',' '\n'); do
service=$(echo "$service" | xargs) # Trim whitespace
if [ -n "$service" ] && [ "$service" != "none" ] && [ "$service" != "infrastructure" ] && [ "$service" != "shared" ]; then
echo "Updating manifest for service: $service"
# Format service name for directory (convert from kebab-case to snake_case if needed)
# Handle special cases like demo-session -> demo_session, alert-processor -> alert_processor, etc.
formatted_service=$(echo "$service" | sed 's/-/_/g')
# For gateway and frontend, they have different directory structures
if [ "$service" = "gateway" ]; then
MANIFEST_PATH="infrastructure/platform/gateway/gateway-service.yaml"
IMAGE_NAME="gateway" # gateway image name is just "gateway"
elif [ "$service" = "frontend" ]; then
MANIFEST_PATH="infrastructure/services/microservices/frontend/frontend-service.yaml"
IMAGE_NAME="dashboard" # frontend service uses "dashboard" as image name
else
# For microservices, look in the microservices directory
# Convert service name to directory format (kebab-case)
service_dir=$(echo "$service" | sed 's/_/-/g')
# Check for different possible manifest file names
if [ -f "infrastructure/services/microservices/$service_dir/deployment.yaml" ]; then
MANIFEST_PATH="infrastructure/services/microservices/$service_dir/deployment.yaml"
elif [ -f "infrastructure/services/microservices/$service_dir/${formatted_service}-service.yaml" ]; then
MANIFEST_PATH="infrastructure/services/microservices/$service_dir/${formatted_service}-service.yaml"
elif [ -f "infrastructure/services/microservices/$service_dir/${service_dir}-service.yaml" ]; then
MANIFEST_PATH="infrastructure/services/microservices/$service_dir/${service_dir}-service.yaml"
else
# Default to the standard naming pattern
MANIFEST_PATH="infrastructure/services/microservices/$service_dir/${formatted_service}-service.yaml"
fi
# For most services, the image name follows the pattern service-name-service
IMAGE_NAME="${service_dir}-service"
fi
# Update the image tag in the deployment YAML
if [ -f "$MANIFEST_PATH" ]; then
# Update image reference from bakery/image_name:tag to registry/image_name:git_revision
# Handle various image name formats that might exist in the manifests
sed -i "s|image: bakery/${IMAGE_NAME}:.*|image: $(params.registry)/${IMAGE_NAME}:$(params.git-revision)|g" "$MANIFEST_PATH"
# Also handle the case where the image name might be formatted differently
sed -i "s|image: bakery/${service}:.*|image: $(params.registry)/${service}:$(params.git-revision)|g" "$MANIFEST_PATH"
sed -i "s|image: bakery/${formatted_service}:.*|image: $(params.registry)/${formatted_service}:$(params.git-revision)|g" "$MANIFEST_PATH"
echo "Updated image in: $MANIFEST_PATH for image: bakery/${IMAGE_NAME}:* -> $(params.registry)/${IMAGE_NAME}:$(params.git-revision)"
else
echo "Warning: Manifest file not found: $MANIFEST_PATH"
fi
fi
done
# Commit and push changes (unless dry-run)
if [ "$(params.dry-run)" != "true" ]; then
git add .
git status
if ! git diff --cached --quiet; then
git commit -m "Update images for services: $(params.services) [skip ci]"
git push origin "$(params.git-branch)"
echo "GitOps manifests updated successfully"
else
echo "No changes to commit"
fi
else
echo "Dry run mode - changes not pushed"
git status
git diff
fi
resources:
limits:
cpu: 500m
memory: 512Mi
requests:
cpu: 100m
memory: 128Mi

View File

@@ -0,0 +1,23 @@
# Tekton TriggerBinding for Bakery-IA CI/CD
# This binding extracts parameters from incoming webhook payloads
apiVersion: triggers.tekton.dev/v1beta1
kind: TriggerBinding
metadata:
name: bakery-ia-trigger-binding
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: triggers
spec:
params:
- name: git-repo-url
value: "{{"{{ .payload.repository.clone_url }}"}}"
- name: git-revision
value: "{{"{{ .payload.after }}"}}"
- name: git-branch
value: "{{"{{ .payload.ref }}" | replace "refs/heads/" "" | replace "refs/tags/" "" }}"
- name: git-repo-name
value: "{{"{{ .payload.repository.name }}"}}"
- name: git-repo-full-name
value: "{{"{{ .payload.repository.full_name }}"}}"

View File

@@ -0,0 +1,79 @@
# Tekton TriggerTemplate for Bakery-IA CI/CD
# This template defines how PipelineRuns are created when triggers fire
apiVersion: triggers.tekton.dev/v1beta1
kind: TriggerTemplate
metadata:
name: bakery-ia-trigger-template
namespace: {{ .Release.Namespace }}
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
app.kubernetes.io/component: triggers
spec:
params:
- name: git-repo-url
description: The git repository URL
- name: git-revision
description: The git revision/commit hash
- name: git-branch
description: The git branch name
default: "main"
- name: git-repo-name
description: The git repository name
default: "bakery-ia"
- name: git-repo-full-name
description: The full repository name (org/repo)
default: "bakery-admin/bakery-ia"
# Registry URL - keep in sync with pipeline-config ConfigMap
- name: registry-url
description: Container registry URL
default: {{ .Values.global.registry.url | quote }}
resourcetemplates:
- apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
generateName: bakery-ia-ci-run-
labels:
app.kubernetes.io/name: {{ .Values.labels.app.name }}
tekton.dev/pipeline: bakery-ia-ci
triggers.tekton.dev/trigger: bakery-ia-gitea-trigger
annotations:
# Track the source commit
bakery-ia.io/git-revision: $(tt.params.git-revision)
bakery-ia.io/git-branch: $(tt.params.git-branch)
spec:
pipelineRef:
name: bakery-ia-ci
serviceAccountName: {{ .Values.serviceAccounts.pipeline.name }}
workspaces:
- name: shared-workspace
volumeClaimTemplate:
spec:
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: {{ .Values.pipeline.workspace.size }}
- name: docker-credentials
secret:
secretName: gitea-registry-credentials
- name: git-credentials
secret:
secretName: gitea-git-credentials
params:
- name: git-url
value: $(tt.params.git-repo-url)
- name: git-revision
value: $(tt.params.git-revision)
- name: git-branch
value: $(tt.params.git-branch)
# Use template parameter for registry URL
- name: registry
value: $(tt.params.registry-url)
- name: skip-tests
value: "false"
- name: dry-run
value: "false"
# Timeout for the entire pipeline run
timeouts:
pipeline: "1h0m0s"
tasks: "45m0s"

View File

@@ -0,0 +1,81 @@
# Production values for tekton-cicd Helm chart
# This file overrides values.yaml for production deployment
#
# Installation:
# helm upgrade --install tekton-cicd infrastructure/cicd/tekton-helm \
# -n tekton-pipelines \
# -f infrastructure/cicd/tekton-helm/values.yaml \
# -f infrastructure/cicd/tekton-helm/values-prod.yaml \
# --set secrets.webhook.token=$TEKTON_WEBHOOK_TOKEN \
# --set secrets.registry.password=$GITEA_ADMIN_PASSWORD \
# --set secrets.git.password=$GITEA_ADMIN_PASSWORD
#
# Required environment variables:
# TEKTON_WEBHOOK_TOKEN - Secure webhook token (generate with: openssl rand -hex 32)
# GITEA_ADMIN_PASSWORD - Gitea admin password (must match gitea-admin-secret)
# Global settings for production
global:
# Git configuration
git:
userEmail: "ci@bakewise.ai"
# Pipeline configuration for production
pipeline:
# Build configuration
build:
verbosity: "warn" # Less verbose in production
# Test configuration
test:
skipTests: "false"
skipLint: "false"
# Workspace configuration - ensure storage class exists in production cluster
workspace:
size: "10Gi"
storageClass: "standard" # Adjust to your production storage class
# Tekton controller settings - increased resources for production
controller:
replicas: 2
resources:
limits:
cpu: 2000m
memory: 2Gi
requests:
cpu: 200m
memory: 256Mi
# Tekton webhook settings - increased resources for production
webhook:
replicas: 2
resources:
limits:
cpu: 1000m
memory: 1Gi
requests:
cpu: 100m
memory: 128Mi
# Secrets configuration
# IMPORTANT: These MUST be overridden via --set flags during deployment
# DO NOT commit actual secrets to this file
secrets:
# Webhook secret for validating incoming webhooks
# Override with: --set secrets.webhook.token=$TEKTON_WEBHOOK_TOKEN
webhook:
token: "" # MUST be set via --set flag
# Registry credentials for pushing images
# Override with: --set secrets.registry.password=$GITEA_ADMIN_PASSWORD
registry:
username: "bakery-admin"
password: "" # MUST be set via --set flag
registryUrl: "gitea-http.gitea.svc.cluster.local:3000"
# Git credentials for GitOps updates
# Override with: --set secrets.git.password=$GITEA_ADMIN_PASSWORD
git:
username: "bakery-admin"
password: "" # MUST be set via --set flag

View File

@@ -0,0 +1,99 @@
# Default values for tekton-cicd Helm chart
# This file contains configurable values for the CI/CD pipeline
# Global settings
global:
# Registry configuration
registry:
url: "gitea-http.gitea.svc.cluster.local:3000/bakery-admin"
# Git configuration
git:
branch: "main"
userName: "bakery-ia-ci"
userEmail: "ci@bakery-ia.local"
# Pipeline configuration
pipeline:
# Build configuration
build:
cacheTTL: "24h"
verbosity: "info"
# Base image registry configuration
# For dev: localhost:5000 with python_3.11-slim
# For prod: gitea registry with python_3.11-slim
baseRegistry: "gitea-http.gitea.svc.cluster.local:3000/bakery-admin"
pythonImage: "python_3.11-slim"
# Test configuration
test:
skipTests: "false"
skipLint: "false"
# Deployment configuration
deployment:
namespace: "bakery-ia"
fluxNamespace: "flux-system"
# Workspace configuration
workspace:
size: "5Gi"
storageClass: "standard"
# Tekton controller settings
controller:
replicas: 1
resources:
limits:
cpu: 1000m
memory: 1Gi
requests:
cpu: 100m
memory: 128Mi
# Tekton webhook settings
webhook:
replicas: 1
resources:
limits:
cpu: 500m
memory: 512Mi
requests:
cpu: 50m
memory: 64Mi
# Namespace for Tekton resources
# Set to empty/false to skip namespace creation (namespace is created by Tekton installation)
namespace: ""
# Secrets configuration
secrets:
# Webhook secret for validating incoming webhooks
webhook:
token: "secure-webhook-token-replace-with-actual-value"
# Registry credentials for pushing images
# Uses the same credentials as Gitea admin for consistency
registry:
username: "bakery-admin"
password: "" # Will be populated from gitea-admin-secret
registryUrl: "gitea-http.gitea.svc.cluster.local:3000"
# Git credentials for GitOps updates
# Uses the same credentials as Gitea admin for consistency
git:
username: "bakery-admin"
password: "" # Will be populated from gitea-admin-secret
# Service accounts
serviceAccounts:
triggers:
name: "tekton-triggers-sa"
pipeline:
name: "tekton-pipeline-sa"
# Labels to apply to resources
labels:
app:
name: "bakery-ia-cicd"
component: "tekton"

View File

@@ -0,0 +1,491 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: bakery-config
namespace: bakery-ia
labels:
app.kubernetes.io/name: bakery-ia
app.kubernetes.io/component: config
data:
# ENVIRONMENT & BUILD SETTINGS
# ================================================================
ENVIRONMENT: "development"
DEBUG: "false"
LOG_LEVEL: "INFO"
# Observability Settings - SigNoz enabled
# Note: Detailed OTEL configuration is in the OBSERVABILITY section below
ENABLE_TRACING: "true"
ENABLE_METRICS: "true"
ENABLE_LOGS: "true"
ENABLE_OTEL_METRICS: "true"
ENABLE_SYSTEM_METRICS: "true"
OTEL_LOGS_EXPORTER: "otlp"
# Database initialization settings
# IMPORTANT: Services NEVER run migrations - they only verify DB is ready
# Migrations are handled by dedicated migration jobs
# DB_FORCE_RECREATE only affects migration jobs, not services
DB_FORCE_RECREATE: "false"
BUILD_DATE: "2024-01-20T10:00:00Z"
VCS_REF: "latest"
IMAGE_TAG: "latest"
DOMAIN: "bakewise.ai"
AUTO_RELOAD: "false"
PROFILING_ENABLED: "false"
MOCK_EXTERNAL_APIS: "false"
TESTING: "false"
# ================================================================
# SERVICE DISCOVERY (KUBERNETES INTERNAL)
# ================================================================
REDIS_HOST: "redis-service"
REDIS_PORT: "6379"
RABBITMQ_HOST: "rabbitmq-service"
RABBITMQ_PORT: "5672"
RABBITMQ_MANAGEMENT_PORT: "15672"
RABBITMQ_VHOST: "/"
# Database Hosts (Kubernetes Services)
AUTH_DB_HOST: "auth-db-service"
TENANT_DB_HOST: "tenant-db-service"
TRAINING_DB_HOST: "training-db-service"
FORECASTING_DB_HOST: "forecasting-db-service"
SALES_DB_HOST: "sales-db-service"
EXTERNAL_DB_HOST: "external-db-service"
NOTIFICATION_DB_HOST: "notification-db-service"
INVENTORY_DB_HOST: "inventory-db-service"
RECIPES_DB_HOST: "recipes-db-service"
SUPPLIERS_DB_HOST: "suppliers-db-service"
POS_DB_HOST: "pos-db-service"
ORDERS_DB_HOST: "orders-db-service"
PRODUCTION_DB_HOST: "production-db-service"
PROCUREMENT_DB_HOST: "procurement-db-service"
ORCHESTRATOR_DB_HOST: "orchestrator-db-service"
ALERT_PROCESSOR_DB_HOST: "alert-processor-db-service"
AI_INSIGHTS_DB_HOST: "ai-insights-db-service"
DISTRIBUTION_DB_HOST: "distribution-db-service"
DEMO_SESSION_DB_HOST: "demo-session-db-service"
# MinIO Configuration
MINIO_ENDPOINT: "minio.bakery-ia.svc.cluster.local:9000"
MINIO_USE_SSL: "true"
MINIO_MODEL_BUCKET: "training-models"
MINIO_CONSOLE_PORT: "9001"
MINIO_API_PORT: "9000"
MINIO_REGION: "us-east-1"
MINIO_MODEL_LIFECYCLE_DAYS: "90"
MINIO_CACHE_TTL_SECONDS: "3600"
# Database Configuration
DB_PORT: "5432"
AUTH_DB_NAME: "auth_db"
TENANT_DB_NAME: "tenant_db"
TRAINING_DB_NAME: "training_db"
FORECASTING_DB_NAME: "forecasting_db"
SALES_DB_NAME: "sales_db"
EXTERNAL_DB_NAME: "external_db"
NOTIFICATION_DB_NAME: "notification_db"
INVENTORY_DB_NAME: "inventory_db"
RECIPES_DB_NAME: "recipes_db"
SUPPLIERS_DB_NAME: "suppliers_db"
POS_DB_NAME: "pos_db"
ORDERS_DB_NAME: "orders_db"
PRODUCTION_DB_NAME: "production_db"
PROCUREMENT_DB_NAME: "procurement_db"
ORCHESTRATOR_DB_NAME: "orchestrator_db"
ALERT_PROCESSOR_DB_NAME: "alert_processor_db"
AI_INSIGHTS_DB_NAME: "ai_insights_db"
DISTRIBUTION_DB_NAME: "distribution_db"
POSTGRES_INITDB_ARGS: "--encoding=UTF-8 --lc-collate=C --lc-ctype=C"
# ================================================================
# SERVICE URLS (KUBERNETES INTERNAL)
# ================================================================
GATEWAY_URL: "http://gateway-service:8000"
AUTH_SERVICE_URL: "http://auth-service:8000"
TENANT_SERVICE_URL: "http://tenant-service:8000"
TRAINING_SERVICE_URL: "http://training-service:8000"
FORECASTING_SERVICE_URL: "http://forecasting-service:8000"
SALES_SERVICE_URL: "http://sales-service:8000"
EXTERNAL_SERVICE_URL: "http://external-service:8000"
NOTIFICATION_SERVICE_URL: "http://notification-service:8000"
INVENTORY_SERVICE_URL: "http://inventory-service:8000"
RECIPES_SERVICE_URL: "http://recipes-service:8000"
SUPPLIERS_SERVICE_URL: "http://suppliers-service:8000"
POS_SERVICE_URL: "http://pos-service:8000"
ORDERS_SERVICE_URL: "http://orders-service:8000"
PRODUCTION_SERVICE_URL: "http://production-service:8000"
ALERT_PROCESSOR_SERVICE_URL: "http://alert-processor:8000"
ORCHESTRATOR_SERVICE_URL: "http://orchestrator-service:8000"
AI_INSIGHTS_SERVICE_URL: "http://ai-insights-service:8000"
DISTRIBUTION_SERVICE_URL: "http://distribution-service:8000"
# ================================================================
# AUTHENTICATION & SECURITY SETTINGS
# ================================================================
JWT_ALGORITHM: "HS256"
JWT_ACCESS_TOKEN_EXPIRE_MINUTES: "240"
JWT_REFRESH_TOKEN_EXPIRE_DAYS: "7"
ENABLE_SERVICE_AUTH: "false"
PASSWORD_MIN_LENGTH: "8"
PASSWORD_REQUIRE_UPPERCASE: "true"
PASSWORD_REQUIRE_LOWERCASE: "true"
PASSWORD_REQUIRE_NUMBERS: "true"
PASSWORD_REQUIRE_SYMBOLS: "false"
BCRYPT_ROUNDS: "12"
MAX_LOGIN_ATTEMPTS: "5"
LOCKOUT_DURATION_MINUTES: "30"
# ================================================================
# CORS & API CONFIGURATION
# ================================================================
CORS_ORIGINS: "https://bakery.yourdomain.com,http://frontend-service:3000"
CORS_ALLOW_CREDENTIALS: "true"
RATE_LIMIT_ENABLED: "true"
RATE_LIMIT_REQUESTS: "100"
RATE_LIMIT_WINDOW: "60"
RATE_LIMIT_BURST: "10"
API_DOCS_ENABLED: "true"
# ================================================================
# HTTP CLIENT SETTINGS
# ================================================================
HTTP_TIMEOUT: "30000"
HTTP_RETRIES: "3"
HTTP_RETRY_DELAY: "1.0"
# ================================================================
# EXTERNAL API CONFIGURATION
# ================================================================
AEMET_BASE_URL: "https://opendata.aemet.es/opendata"
AEMET_TIMEOUT: "90"
AEMET_RETRY_ATTEMPTS: "5"
MADRID_OPENDATA_BASE_URL: "https://datos.madrid.es"
MADRID_OPENDATA_TIMEOUT: "30"
# ================================================================
# PAYMENT CONFIGURATION
# ================================================================
STRIPE_PUBLISHABLE_KEY: "pk_live_your_stripe_publishable_key_here"
SQUARE_APPLICATION_ID: "your-square-application-id"
SQUARE_ENVIRONMENT: "production"
TOAST_ENVIRONMENT: "production"
LIGHTSPEED_ENVIRONMENT: "production"
# ================================================================
# EMAIL CONFIGURATION
# ================================================================
SMTP_HOST: "mailu-postfix.bakery-ia.svc.cluster.local"
SMTP_PORT: "587"
SMTP_TLS: "true"
SMTP_SSL: "false"
DEFAULT_FROM_EMAIL: "noreply@bakewise.ai"
DEFAULT_FROM_NAME: "Bakery-Forecast"
EMAIL_FROM_ADDRESS: "alerts@bakewise.ai"
EMAIL_FROM_NAME: "Bakery Alert System"
# ================================================================
# WHATSAPP CONFIGURATION
# ================================================================
WHATSAPP_BASE_URL: "https://api.twilio.com"
WHATSAPP_FROM_NUMBER: "whatsapp:+14155238886"
# ================================================================
# ALERT SYSTEM CONFIGURATION
# ================================================================
ALERT_PROCESSOR_INSTANCES: "2"
ALERT_PROCESSOR_MAX_MEMORY: "512M"
ALERT_BATCH_SIZE: "10"
ALERT_PROCESSING_TIMEOUT: "30"
EMAIL_ENABLED: "true"
WHATSAPP_ENABLED: "true"
SSE_ENABLED: "true"
PUSH_NOTIFICATIONS_ENABLED: "false"
ALERT_DEDUPLICATION_WINDOW_MINUTES: "15"
RECOMMENDATION_DEDUPLICATION_WINDOW_MINUTES: "60"
# Alert Enrichment Configuration (Unified Alert Service)
# Priority scoring weights (must sum to 1.0)
BUSINESS_IMPACT_WEIGHT: "0.4"
URGENCY_WEIGHT: "0.3"
USER_AGENCY_WEIGHT: "0.2"
CONFIDENCE_WEIGHT: "0.1"
# Priority thresholds (0-100 scale)
CRITICAL_THRESHOLD: "90"
IMPORTANT_THRESHOLD: "70"
STANDARD_THRESHOLD: "50"
# Timing intelligence
BUSINESS_HOURS_START: "6"
BUSINESS_HOURS_END: "22"
PEAK_HOURS_START: "7"
PEAK_HOURS_END: "11"
PEAK_HOURS_EVENING_START: "17"
PEAK_HOURS_EVENING_END: "19"
# Alert grouping
GROUPING_TIME_WINDOW_MINUTES: "15"
MAX_ALERTS_PER_GROUP: "5"
# Email digest
DIGEST_SEND_TIME: "18:00"
# ================================================================
# CHECK FREQUENCIES (CRON EXPRESSIONS)
# ================================================================
STOCK_CHECK_FREQUENCY: "*/5"
EXPIRY_CHECK_FREQUENCY: "*/2"
TEMPERATURE_CHECK_FREQUENCY: "*/2"
PRODUCTION_DELAY_CHECK_FREQUENCY: "*/5"
CAPACITY_CHECK_FREQUENCY: "*/10"
INVENTORY_OPTIMIZATION_FREQUENCY: "*/30"
EFFICIENCY_RECOMMENDATIONS_FREQUENCY: "*/30"
ENERGY_RECOMMENDATIONS_FREQUENCY: "0"
WASTE_REDUCTION_FREQUENCY: "0"
# ================================================================
# MODEL STORAGE & TRAINING
# ================================================================
# Model storage is handled by MinIO (see MinIO Configuration section)
MODEL_STORAGE_BACKEND: "minio"
MODEL_BACKUP_ENABLED: "true"
MODEL_VERSIONING_ENABLED: "true"
MAX_TRAINING_TIME_MINUTES: "30"
MAX_CONCURRENT_TRAINING_JOBS: "3"
MIN_TRAINING_DATA_DAYS: "30"
TRAINING_BATCH_SIZE: "1000"
# ================================================================
# OPTIMIZATION SETTINGS
# ================================================================
ENABLE_HYPERPARAMETER_OPTIMIZATION: "true"
ENABLE_PRODUCT_SPECIFIC_PARAMS: "true"
ENABLE_DYNAMIC_PARAM_SELECTION: "true"
OPTUNA_N_TRIALS: "50"
OPTUNA_CV_FOLDS: "3"
OPTUNA_TIMEOUT_MINUTES: "10"
HIGH_VOLUME_THRESHOLD: "1.0"
INTERMITTENT_THRESHOLD: "0.6"
# ================================================================
# PROPHET PARAMETERS
# ================================================================
PROPHET_SEASONALITY_MODE: "additive"
PROPHET_CHANGEPOINT_PRIOR_SCALE: "0.05"
PROPHET_SEASONALITY_PRIOR_SCALE: "10.0"
PROPHET_HOLIDAYS_PRIOR_SCALE: "10.0"
PROPHET_DAILY_SEASONALITY: "true"
PROPHET_WEEKLY_SEASONALITY: "true"
PROPHET_YEARLY_SEASONALITY: "true"
# ================================================================
# BUSINESS CONFIGURATION
# ================================================================
SERVICE_VERSION: "1.0.0"
TIMEZONE: "Europe/Madrid"
LOCALE: "es_ES.UTF-8"
CURRENCY: "EUR"
BUSINESS_HOUR_START: "7"
BUSINESS_HOUR_END: "20"
ENABLE_SPANISH_HOLIDAYS: "true"
ENABLE_MADRID_HOLIDAYS: "true"
SCHOOL_CALENDAR_ENABLED: "true"
WEATHER_IMPACT_ENABLED: "true"
# ================================================================
# MONITORING & LOGGING
# ================================================================
LOG_FORMAT: "json"
LOG_FILE_ENABLED: "false"
LOG_FILE_PATH: "/app/logs"
LOG_ROTATION_SIZE: "100MB"
LOG_RETENTION_DAYS: "30"
HEALTH_CHECK_TIMEOUT: "30"
HEALTH_CHECK_INTERVAL: "30"
# Monitoring Configuration - SigNoz
SIGNOZ_ROOT_URL: "https://monitoring.bakery-ia.local"
# ================================================================
# DATA COLLECTION SETTINGS
# ================================================================
WEATHER_COLLECTION_INTERVAL_HOURS: "1"
TRAFFIC_COLLECTION_INTERVAL_HOURS: "1"
EVENTS_COLLECTION_INTERVAL_HOURS: "6"
DATA_VALIDATION_ENABLED: "true"
OUTLIER_DETECTION_ENABLED: "true"
DATA_COMPLETENESS_THRESHOLD: "0.8"
DEFAULT_LATITUDE: "40.4168"
DEFAULT_LONGITUDE: "-3.7038"
LOCATION_RADIUS_KM: "50.0"
# ================================================================
# NOTIFICATION SETTINGS
# ================================================================
ENABLE_EMAIL_NOTIFICATIONS: "true"
ENABLE_WHATSAPP_NOTIFICATIONS: "true"
ENABLE_PUSH_NOTIFICATIONS: "false"
MAX_RETRY_ATTEMPTS: "3"
RETRY_DELAY_SECONDS: "60"
NOTIFICATION_BATCH_SIZE: "100"
EMAIL_RATE_LIMIT_PER_HOUR: "1000"
WHATSAPP_RATE_LIMIT_PER_HOUR: "100"
DEFAULT_LANGUAGE: "es"
DATE_FORMAT: "%d/%m/%Y"
TIME_FORMAT: "%H:%M"
EMAIL_TEMPLATES_PATH: "/app/templates/email"
WHATSAPP_TEMPLATES_PATH: "/app/templates/whatsapp"
IMMEDIATE_DELIVERY: "true"
SCHEDULED_DELIVERY_ENABLED: "true"
DELIVERY_TRACKING_ENABLED: "true"
OPEN_TRACKING_ENABLED: "true"
CLICK_TRACKING_ENABLED: "true"
# ================================================================
# FORECASTING SETTINGS
# ================================================================
MAX_FORECAST_DAYS: "30"
MIN_HISTORICAL_DAYS: "60"
PREDICTION_CONFIDENCE_THRESHOLD: "0.8"
PREDICTION_CACHE_TTL_HOURS: "6"
FORECAST_BATCH_SIZE: "100"
# ================================================================
# BUSINESS RULES
# ================================================================
WEEKEND_ADJUSTMENT_FACTOR: "0.8"
HOLIDAY_ADJUSTMENT_FACTOR: "0.5"
TEMPERATURE_THRESHOLD_COLD: "10.0"
TEMPERATURE_THRESHOLD_HOT: "30.0"
RAIN_IMPACT_FACTOR: "0.7"
HIGH_DEMAND_THRESHOLD: "1.5"
LOW_DEMAND_THRESHOLD: "0.5"
STOCKOUT_RISK_THRESHOLD: "0.9"
# ================================================================
# CACHE SETTINGS
# ================================================================
REDIS_TLS_ENABLED: "true"
REDIS_MAX_MEMORY: "512mb"
REDIS_MAX_CONNECTIONS: "50"
REDIS_DB: "1"
WEATHER_CACHE_TTL_HOURS: "1"
TRAFFIC_CACHE_TTL_HOURS: "1"
# ================================================================
# FRONTEND CONFIGURATION
# ================================================================
VITE_APP_TITLE: "PanIA Dashboard"
VITE_APP_VERSION: "1.0.0"
VITE_API_URL: "/api"
VITE_ENVIRONMENT: "production"
# Pilot Program Configuration
VITE_PILOT_MODE_ENABLED: "true"
VITE_PILOT_COUPON_CODE: "PILOT2025"
VITE_PILOT_TRIAL_MONTHS: "3"
VITE_STRIPE_PUBLISHABLE_KEY: "pk_test_51QuxKyIzCdnBmAVTGM8fvXYkItrBUILz6lHYwhAva6ZAH1HRi0e8zDRgZ4X3faN0zEABp5RHjCVBmMJL3aKXbaC200fFrSNnPl"
# ================================================================
# LOCATION SETTINGS (Nominatim Geocoding)
# ================================================================
NOMINATIM_SERVICE_URL: "http://nominatim-service:8080"
NOMINATIM_PBF_URL: "http://download.geofabrik.de/europe/spain-latest.osm.pbf"
NOMINATIM_MEMORY_LIMIT: "8G"
NOMINATIM_CPU_LIMIT: "4"
# ================================================================
# OBSERVABILITY - SigNoz (Unified Monitoring)
# ================================================================
# OpenTelemetry Configuration - Direct to SigNoz OTel Collector
#
# ENDPOINT CONFIGURATION:
# - OTEL_EXPORTER_OTLP_ENDPOINT: Base gRPC endpoint (host:port format, NO http:// prefix)
# Used by traces and metrics (gRPC) by default
# Format: "host:4317" (gRPC port)
#
# PROTOCOL USAGE:
# - Traces: gRPC (port 4317) - High performance, low latency
# - Metrics: gRPC (port 4317) - Efficient batch export
# - Logs: HTTP (port 4318) - Required for OTLP log protocol
#
# The monitoring library automatically handles:
# - Converting gRPC endpoint (4317) to HTTP endpoint (4318) for logs
# - Adding proper paths (/v1/traces, /v1/metrics, /v1/logs)
# - Protocol prefixes (http:// for HTTP, none for gRPC)
#
# Base OTLP endpoint (gRPC format - used by traces and metrics)
OTEL_EXPORTER_OTLP_ENDPOINT: "signoz-otel-collector.bakery-ia.svc.cluster.local:4317"
# Protocol configuration (gRPC is recommended for better performance)
OTEL_EXPORTER_OTLP_PROTOCOL: "grpc"
# Optional: Signal-specific endpoint overrides (if different from base)
# OTEL_EXPORTER_OTLP_TRACES_ENDPOINT: "signoz-otel-collector.bakery-ia.svc.cluster.local:4317"
# OTEL_EXPORTER_OTLP_METRICS_ENDPOINT: "signoz-otel-collector.bakery-ia.svc.cluster.local:4317"
# OTEL_EXPORTER_OTLP_LOGS_ENDPOINT: "http://signoz-otel-collector.bakery-ia.svc.cluster.local:4318"
# Gateway telemetry proxy configuration
SIGNOZ_OTEL_COLLECTOR_URL: "http://signoz-otel-collector.bakery-ia.svc.cluster.local:4318"
# Optional: Protocol overrides per signal
# OTEL_EXPORTER_OTLP_TRACES_PROTOCOL: "grpc"
# OTEL_EXPORTER_OTLP_METRICS_PROTOCOL: "grpc"
# Note: Logs always use HTTP protocol regardless of this setting
# Resource attributes (added to all telemetry signals)
OTEL_SERVICE_NAME: "bakery-ia"
OTEL_RESOURCE_ATTRIBUTES: "deployment.environment=development"
# SigNoz service endpoints (for UI and API access)
SIGNOZ_ENDPOINT: "http://signoz.bakery-ia.svc.cluster.local:8080"
SIGNOZ_FRONTEND_URL: "https://monitoring.bakery-ia.local"
# ================================================================
# DISTRIBUTION & ROUTING OPTIMIZATION SETTINGS
# ================================================================
VRP_TIME_LIMIT_SECONDS: "30"
VRP_DEFAULT_VEHICLE_CAPACITY_KG: "1000"
VRP_AVERAGE_SPEED_KMH: "30"
# ================================================================
# REPLENISHMENT PLANNING SETTINGS
# ================================================================
REPLENISHMENT_PROJECTION_HORIZON_DAYS: "7"
REPLENISHMENT_SERVICE_LEVEL: "0.95"
REPLENISHMENT_BUFFER_DAYS: "1"
# Safety Stock
SAFETY_STOCK_SERVICE_LEVEL: "0.95"
SAFETY_STOCK_METHOD: "statistical"
# MOQ
MOQ_CONSOLIDATION_WINDOW_DAYS: "7"
MOQ_ALLOW_EARLY_ORDERING: "true"
# Supplier Selection
SUPPLIER_PRICE_WEIGHT: "0.40"
SUPPLIER_LEAD_TIME_WEIGHT: "0.20"
SUPPLIER_QUALITY_WEIGHT: "0.20"
SUPPLIER_RELIABILITY_WEIGHT: "0.20"
SUPPLIER_DIVERSIFICATION_THRESHOLD: "1000"
SUPPLIER_MAX_SINGLE_PERCENTAGE: "0.70"
# Circuit Breakers
CIRCUIT_BREAKER_FAILURE_THRESHOLD: "5"
CIRCUIT_BREAKER_TIMEOUT_DURATION: "60"
CIRCUIT_BREAKER_SUCCESS_THRESHOLD: "2"
# Saga
SAGA_TIMEOUT_SECONDS: "600"
SAGA_ENABLE_COMPENSATION: "true"
# ================================================================
# EXTERNAL DATA SERVICE V2 SETTINGS
# ================================================================
EXTERNAL_ENABLED_CITIES: "madrid"
EXTERNAL_RETENTION_MONTHS: "6" # Reduced from 24 to avoid memory issues during init
EXTERNAL_CACHE_TTL_DAYS: "7"
EXTERNAL_REDIS_URL: "rediss://redis-service:6379/0?ssl_cert_reqs=none"

View File

@@ -0,0 +1,6 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- configmap.yaml
- secrets.yaml

View File

@@ -0,0 +1,226 @@
# NOTE: gitea-registry-secret is dynamically created by:
# infrastructure/cicd/gitea/sync-registry-secret.sh
# This script is automatically run by Tiltfile after Gitea setup.
# The secret uses the same credentials as gitea-admin-secret in the gitea namespace.
# DO NOT define gitea-registry-secret here to avoid credential sync issues.
---
apiVersion: v1
kind: Secret
metadata:
name: database-secrets
namespace: bakery-ia
labels:
app.kubernetes.io/name: bakery-ia
app.kubernetes.io/component: database
type: Opaque
data:
# Database Users (base64 encoded from .env)
AUTH_DB_USER: YXV0aF91c2Vy # auth_user
TENANT_DB_USER: dGVuYW50X3VzZXI= # tenant_user
TRAINING_DB_USER: dHJhaW5pbmdfdXNlcg== # training_user
FORECASTING_DB_USER: Zm9yZWNhc3RpbmdfdXNlcg== # forecasting_user
SALES_DB_USER: c2FsZXNfdXNlcg== # sales_user
EXTERNAL_DB_USER: ZXh0ZXJuYWxfdXNlcg== # external_user
NOTIFICATION_DB_USER: bm90aWZpY2F0aW9uX3VzZXI= # notification_user
INVENTORY_DB_USER: aW52ZW50b3J5X3VzZXI= # inventory_user
RECIPES_DB_USER: cmVjaXBlc191c2Vy # recipes_user
SUPPLIERS_DB_USER: c3VwcGxpZXJzX3VzZXI= # suppliers_user
POS_DB_USER: cG9zX3VzZXI= # pos_user
ORDERS_DB_USER: b3JkZXJzX3VzZXI= # orders_user
PRODUCTION_DB_USER: cHJvZHVjdGlvbl91c2Vy # production_user
ALERT_PROCESSOR_DB_USER: YWxlcnRfcHJvY2Vzc29yX3VzZXI= # alert_processor_user
DEMO_SESSION_DB_USER: ZGVtb19zZXNzaW9uX3VzZXI= # demo_session_user
ORCHESTRATOR_DB_USER: b3JjaGVzdHJhdG9yX3VzZXI= # orchestrator_user
PROCUREMENT_DB_USER: cHJvY3VyZW1lbnRfdXNlcg== # procurement_user
AI_INSIGHTS_DB_USER: YWlfaW5zaWdodHNfdXNlcg== # ai_insights_user
DISTRIBUTION_DB_USER: ZGlzdHJpYnV0aW9uX3VzZXI= # distribution_user
# Database Passwords (base64 encoded - URL-SAFE PRODUCTION PASSWORDS)
AUTH_DB_PASSWORD: RThLejQ3WW1WekRsSEdzMU05d0FiSnp4Y0tuR09OQ1Q= # E8Kz47YmVzDlHGs1M9wAbJzxcKnGONCT
TENANT_DB_PASSWORD: VW5tV0VBNlJkaWZncGdoV2N4Zkh2ME1veVVnbUY0ekg= # UnmWEA6RdifgpghWcxfHv0MoyUgmF4zH
TRAINING_DB_PASSWORD: WnZhMzNoaVBJc2ZtV3RxUlBWV29taTRYZ2xLTlZPcHY= # Zva33hiPIsfmWtqRPVWomi4XglKNVOpv
FORECASTING_DB_PASSWORD: QU9CN0Z1SkczVFFSWXptdFJXZHZja3JuQzdsSGtJSHQ= # AOB7FuJG3TQRYzmtRWdvckrnC7lHkIHt
SALES_DB_PASSWORD: NlN1R1lETFRiZjdjWGJZb1RETGlGU2ZSZDBmU2FpMXA= # 6SuGYDLTbf7cXbYoTDLiFSfRd0fSai1p
EXTERNAL_DB_PASSWORD: anlOZE1YRWVBdnhLZWxHOElqMVptRjk4c3l2R3JicTc= # jyNdMXEeAvxKelG8Ij1ZmF98syvGrbq7
NOTIFICATION_DB_PASSWORD: NWJ0YzVZWExjUnZBaGE3dzFaNExNNnNoSmRxU21oVGQ= # 5btc5YXLcRvAha7w1Z4LM6shJdqSmhTd
INVENTORY_DB_PASSWORD: NU5hc09uR1M1RTlXbkV0cDNDcFBvUEVpUWxGQXdlWEQ= # 5NasOnGS5E9WnEtp3CpPoPEiQlFAweXD
RECIPES_DB_PASSWORD: QlRvc2IzMDlpc05DeHFmV25WZFhQZ0xMTUI5VmM5RXQ= # BTosb309isNCxqfWnVdXPgLLMB9Vc9Et
SUPPLIERS_DB_PASSWORD: ZjVUQzd1ekVUblI0ZkowWWdPNFRoMDQ1QkN4Mk9CcWs= # f5TC7uzETnR4fJ0YgO4Th045BCx2OBqk
POS_DB_PASSWORD: Q1hIdE5nTTFEYmRiR2VGYTdRWE5lTkttbVAxVWRsc08= # CXHtNgM1DbdbGeFa7QXNeNKmmP1UdlsO
ORDERS_DB_PASSWORD: emU1aVJncVpVTm1DaHNRbjV3MGFDWFBqb3h1MXdNSDk= # ze5iRgqZUNmChsQn5w0aCXPjoxu1wMH9
PRODUCTION_DB_PASSWORD: SVpaUjZ5dzFqUmFPM29iVUtBQWJaODNLMEdmeTNqbWI= # IZZR6yw1jRaO3obUKAAbZ83K0Gfy3jmb
ALERT_PROCESSOR_DB_PASSWORD: WklyWjBNQnFsRHZsTXJtcndndnZ2UUwzNm5yWFFqdDU= # ZIrZ0MBqlDvlMrmrwgvvvQL36nrXQjt5
DEMO_SESSION_DB_PASSWORD: R291ZWlkcWFSNDhJejJFMDdmT0tyd3BSeXBtMjV1cW4= # GoueidqaR48Iz2E07fOKrwpRypm25uqn
ORCHESTRATOR_DB_PASSWORD: cndCZTdZck5GMVRCMkE3N3U5cUVVTGtWdEJlbU1xdm8= # rwBe7YrNF1TB2A77u9qEULkVtBemMqvo
PROCUREMENT_DB_PASSWORD: dUNhRHllZm5aMXhpd21TcDRNMnQ3QzQ1bkJieGltT1g= # uCaDyefnZ1xiwmSp4M2t7C45nBbximOX
AI_INSIGHTS_DB_PASSWORD: ZGp6M2M1T09KYkJOT28yd2VTY0l0dmlra0pyV2l5dUw= # djz3c5OOJbBNOo2weScItvikkJrWiyuL
DISTRIBUTION_DB_PASSWORD: ZGp6M2M1T09KYkJOT28yd2VTY0l0dmlra0pyV2l5dUw= # djz3c5OOJbBNOo2weScItvikkJrWiyuL
# Database URLs (base64 encoded - with strong passwords)
AUTH_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vYXV0aF91c2VyOkU4S3o0N1ltVnpEbEhHczFNOXdBYkp6eGNLbkdPTkNUQGF1dGgtZGItc2VydmljZTo1NDMyL2F1dGhfZGI=
TENANT_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vdGVuYW50X3VzZXI6VW5tV0VBNlJkaWZncGdoV2N4Zkh2ME1veVVnbUY0ekhAdGVuYW50LWRiLXNlcnZpY2U6NTQzMi90ZW5hbnRfZGI=
TRAINING_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vdHJhaW5pbmdfdXNlcjpadmEzM2hpUElzZm1XdHFSUFZXb21pNFhnbEtOVk9wdkB0cmFpbmluZy1kYi1zZXJ2aWNlOjU0MzIvdHJhaW5pbmdfZGI=
FORECASTING_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vZm9yZWNhc3RpbmdfdXNlcjpBT0I3RnVKRzNUUVJZem10UldkdmNrcm5DN2xIa0lIdEBmb3JlY2FzdGluZy1kYi1zZXJ2aWNlOjU0MzIvZm9yZWNhc3RpbmdfZGI=
SALES_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vc2FsZXNfdXNlcjo2U3VHWURMVGJmN2NYYllvVERMaUZTZlJkMGZTYWkxcEBzYWxlcy1kYi1zZXJ2aWNlOjU0MzIvc2FsZXNfZGI=
EXTERNAL_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vZXh0ZXJuYWxfdXNlcjpqeU5kTVhFZUF2eEtlbEc4SWoxWm1GOThzeXZHcmJxN0BleHRlcm5hbC1kYi1zZXJ2aWNlOjU0MzIvZXh0ZXJuYWxfZGI=
NOTIFICATION_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vbm90aWZpY2F0aW9uX3VzZXI6NWJ0YzVZWExjUnZBaGE3dzFaNExNNnNoSmRxU21oVGRAbm90aWZpY2F0aW9uLWRiLXNlcnZpY2U6NTQzMi9ub3RpZmljYXRpb25fZGI=
INVENTORY_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vaW52ZW50b3J5X3VzZXI6NU5hc09uR1M1RTlXbkV0cDNDcFBvUEVpUWxGQXdlWERAaW52ZW50b3J5LWRiLXNlcnZpY2U6NTQzMi9pbnZlbnRvcnlfZGI=
RECIPES_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vcmVjaXBlc191c2VyOkJUb3NiMzA5aXNOQ3hxZlduVmRYUGdMTE1COVZjOUV0QHJlY2lwZXMtZGItc2VydmljZTo1NDMyL3JlY2lwZXNfZGI=
SUPPLIERS_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vc3VwcGxpZXJzX3VzZXI6ZjVUQzd1ekVUblI0ZkowWWdPNFRoMDQ1QkN4Mk9CcWtAc3VwcGxpZXJzLWRiLXNlcnZpY2U6NTQzMi9zdXBwbGllcnNfZGI=
POS_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vcG9zX3VzZXI6Q1hIdE5nTTFEYmRiR2VGYTdRWE5lTkttbVAxVWRsc09AcG9zLWRiLXNlcnZpY2U6NTQzMi9wb3NfZGI=
ORDERS_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vb3JkZXJzX3VzZXI6emU1aVJncVpVTm1DaHNRbjV3MGFDWFBqb3h1MXdNSDlAb3JkZXJzLWRiLXNlcnZpY2U6NTQzMi9vcmRlcnNfZGI=
PRODUCTION_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vcHJvZHVjdGlvbl91c2VyOklaWlI2eXcxalJhTzNvYlVLQUFiWjgzSzBHZnkzam1iQHByb2R1Y3Rpb24tZGItc2VydmljZTo1NDMyL3Byb2R1Y3Rpb25fZGI=
ALERT_PROCESSOR_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vYWxlcnRfcHJvY2Vzc29yX3VzZXI6WklyWjBNQnFsRHZsTXJtcndndnZ2UUwzNm5yWFFqdDVAYWxlcnQtcHJvY2Vzc29yLWRiLXNlcnZpY2U6NTQzMi9hbGVydF9wcm9jZXNzb3JfZGI=
DEMO_SESSION_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vZGVtb19zZXNzaW9uX3VzZXI6R291ZWlkcWFSNDhJejJFMDdmT0tyd3BSeXBtMjV1cW5AZGVtby1zZXNzaW9uLWRiLXNlcnZpY2U6NTQzMi9kZW1vX3Nlc3Npb25fZGI=
ORCHESTRATOR_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vb3JjaGVzdHJhdG9yX3VzZXI6cndCZTdZck5GMVRCMkE3N3U5cUVVTGtWdEJlbU1xdm9Ab3JjaGVzdHJhdG9yLWRiLXNlcnZpY2U6NTQzMi9vcmNoZXN0cmF0b3JfZGI=
PROCUREMENT_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vcHJvY3VyZW1lbnRfdXNlcjp1Q2FEeWVmbloxeGl3bVNwNE0ydDdDNDVuQmJ4aW1PWEBwcm9jdXJlbWVudC1kYi1zZXJ2aWNlOjU0MzIvcHJvY3VyZW1lbnRfZGI=
AI_INSIGHTS_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vYWlfaW5zaWdodHNfdXNlcjpkanozYzVPT0piQk5PbzJ3ZVNjSXR2aWtrSnJXaXl1TEBhaS1pbnNpZ2h0cy1kYi1zZXJ2aWNlOjU0MzIvYWlfaW5zaWdodHNfZGI=
DISTRIBUTION_DATABASE_URL: cG9zdGdyZXNxbCthc3luY3BnOi8vZGlzdHJpYnV0aW9uX3VzZXI6ZGp6M2M1T09KYkJOT28yd2VTY0l0dmlra0pyV2l5dUxAZGlzdHJpYnV0aW9uLWRiLXNlcnZpY2U6NTQzMi9kaXN0cmlidXRpb25fZGI=
# PostgreSQL Monitoring User (for SigNoz metrics collection)
POSTGRES_MONITOR_USER: bW9uaXRvcmluZw== # monitoring
POSTGRES_MONITOR_PASSWORD: bW9uaXRvcmluZ18zNjlmOWMwMDFmMjQyYjA3ZWY5ZTI4MjZlMTcxNjljYQ== # monitoring_369f9c001f242b07ef9e2826e17169ca
# Redis URL (URL-safe password)
REDIS_URL: cmVkaXM6Ly86SjNsa2x4cHU5QzlPTElLdkJteFVIT2h0czFnc0lvM0FAcmVkaXMtc2VydmljZTo2Mzc5LzA= # redis://:J3lklxpu9C9OLIKvBmxUHOhts1gsIo3A@redis-service:6379/0
---
apiVersion: v1
kind: Secret
metadata:
name: redis-secrets
namespace: bakery-ia
labels:
app.kubernetes.io/name: bakery-ia
app.kubernetes.io/component: redis
type: Opaque
data:
REDIS_PASSWORD: SjNsa2x4cHU5QzlPTElLdkJteFVIT2h0czFnc0lvM0E= # J3lklxpu9C9OLIKvBmxUHOhts1gsIo3A
---
apiVersion: v1
kind: Secret
metadata:
name: rabbitmq-secrets
namespace: bakery-ia
labels:
app.kubernetes.io/name: bakery-ia
app.kubernetes.io/component: rabbitmq
type: Opaque
data:
RABBITMQ_USER: YmFrZXJ5 # bakery
RABBITMQ_PASSWORD: VzJYS2tSdUxpT25ZS2RCWVFTQXJvbjFpeWtFU1M1b2I= # W2XKkRuLiOnYKdBYQSAron1iykESS5ob
RABBITMQ_ERLANG_COOKIE: YzU4MzQ2NzBhYjU1OTA1MTUzZTM1Yjg3ZmVhOTZkNWMxNGM4ODExZjIwM2E3YWI3NmE5MWRjMGE5MWQ4ZDBiNA== # c5834670ab55905153e35b87fea96d5c14c8811f203a7ab76a91dc0a91d8d0b4
---
apiVersion: v1
kind: Secret
metadata:
name: jwt-secrets
namespace: bakery-ia
labels:
app.kubernetes.io/name: bakery-ia
app.kubernetes.io/component: auth
type: Opaque
data:
JWT_SECRET_KEY: dXNNSHc5a1FDUW95cmM3d1BtTWkzYkNscjBsVFk5d3Z6Wm1jVGJBRHZMMD0= # usMHw9kQCQoyrc7wPmMi3bClr0lTY9wvzZmcTbADvL0=
JWT_REFRESH_SECRET_KEY: b2ZPRUlUWHBEUXM0a0pGcERTVWt4bDUwSmkxWUJKUmd3T0V5bStGRWNIST0= # ofOEITXpDQs4kJFpDSUkxl50Ji1YBJRgwOEym+FEcHI=
SERVICE_API_KEY: Y2IyNjFiOTM0ZDQ3MDI5YTY0MTE3YzBlNDExMGM5M2Y2NmJiY2Y1ZWFhMTVjODRjNDI3MjdmYWQ3OGY3MTk2Yw== # cb261b934d47029a64117c0e4110c93f66bbcf5eaa15c84c42727fad78f7196c
---
apiVersion: v1
kind: Secret
metadata:
name: external-api-secrets
namespace: bakery-ia
labels:
app.kubernetes.io/name: bakery-ia
app.kubernetes.io/component: external-apis
type: Opaque
data:
AEMET_API_KEY: ZXlKaGJHY2lPaUpJVXpJMU5pSjkuZXlKemRXSWlPaUoxWVd4bVlYSnZRR2R0WVdsc0xtTnZiU0lzSW1wMGFTSTZJakV3TjJObE9XVmlMVGxoTm1ZdE5EQmpZeTA1WWpoaUxUTTFOV05pWkRZNU5EazJOeUlzSW1semN5STZJa0ZGVFVWVUlpd2lhV0YwSWpveE56VTVPREkwT0RNekxDSjFjMlZ5U1dRaU9pSXhNRGRqWlRsbFlpMDVZVFptTFRRd1kyTXRPV0k0WWkwek5UVmpZbVEyT1RRNU5qY2lMQ0p5YjJ4bElqb2lJbjAuamtjX3hCc0pDc204ZmRVVnhESW1mb2x5UE5pazF4MTd6c1UxZEZKR09iWQ==
MADRID_OPENDATA_API_KEY: eW91ci1tYWRyaWQtb3BlbmRhdGEta2V5LWhlcmU= # your-madrid-opendata-key-here
---
apiVersion: v1
kind: Secret
metadata:
name: payment-secrets
namespace: bakery-ia
labels:
app.kubernetes.io/name: bakery-ia
app.kubernetes.io/component: payments
type: Opaque
data:
STRIPE_SECRET_KEY: c2tfdGVzdF81MVF1eEt5SXpDZG5CbUFWVG5QYzhVWThZTW1qdUJjaTk0RzRqc2lzMVQzMFU1anV5ZmxhQkJxYThGb2xEdTBFMlNnOUZFcVNUakFxenUwa0R6eTROUUN3ejAwOGtQUFF6WGM= # sk_test_51QuxKyIzCdnBmAVTnPc8UY8YMmjuBci94G4jsis1T30U5juyflaBBqa8FolDu0E2Sg9FEqSTjAqzu0kDzy4NQCwz008kPPQzXc
STRIPE_WEBHOOK_SECRET: d2hzZWNfOWI1NGM2ZDQ2ZjhlN2E4NWQzZWZmNmI5MWQyMzg3NGQ3N2Q5NjBlZGUyYWQzNTBkOWY3MWY5ZjBmYTlkM2VjNQ== # whsec_9b54c6d46f8e7a85d3eff6b91d23874d77d960ede2ad350d9f71f9f0fa9d3ec5
---
apiVersion: v1
kind: Secret
metadata:
name: email-secrets
namespace: bakery-ia
labels:
app.kubernetes.io/name: bakery-ia
app.kubernetes.io/component: notifications
type: Opaque
data:
# SMTP credentials for internal Mailu server (Helm deployment)
# These are used by notification-service to send emails via mailu-postfix
SMTP_USER: cG9zdG1hc3RlckBiYWtld2lzZS5haQ== # postmaster@bakewise.ai
SMTP_PASSWORD: VzJYS2tSdUxpT25ZS2RCWVFTQXJvbjFpeWtFU1M1b2I= # W2XKkRuLiOnYKdBYQSAron1iykESS5ob
# Dovecot admin password for IMAP management
DOVEADM_PASSWORD: WnZhMzNoaVBJc2ZtV3RxUlBWV29taTRYZ2xLTlZPcHY= # Zva33hiPIsfmWtqRPVWomi4XglKNVOpv
---
apiVersion: v1
kind: Secret
metadata:
name: monitoring-secrets
namespace: bakery-ia
labels:
app.kubernetes.io/name: bakery-ia
app.kubernetes.io/component: monitoring
type: Opaque
data:
GRAFANA_ADMIN_USER: YWRtaW4= # admin
GRAFANA_ADMIN_PASSWORD: YWRtaW4xMjM= # admin123
GRAFANA_SECRET_KEY: Z3JhZmFuYS1zZWNyZXQta2V5LWNoYW5nZS1pbi1wcm9kdWN0aW9u # grafana-secret-key-change-in-production
PGADMIN_EMAIL: YWRtaW5AYmFrZXJ5LmxvY2Fs # admin@bakery.local
PGADMIN_PASSWORD: YWRtaW4xMjM= # admin123
REDIS_COMMANDER_USER: YWRtaW4= # admin
REDIS_COMMANDER_PASSWORD: YWRtaW4xMjM= # admin123
---
apiVersion: v1
kind: Secret
metadata:
name: pos-integration-secrets
namespace: bakery-ia
labels:
app.kubernetes.io/name: bakery-ia
app.kubernetes.io/component: pos
type: Opaque
data:
SQUARE_ACCESS_TOKEN: eW91ci1zcXVhcmUtYWNjZXNzLXRva2Vu # your-square-access-token
SQUARE_WEBHOOK_SECRET: eW91ci1zcXVhcmUtd2ViaG9vay1zZWNyZXQ= # your-square-webhook-secret
TOAST_API_KEY: eW91ci10b2FzdC1hcGkta2V5 # your-toast-api-key
TOAST_API_SECRET: eW91ci10b2FzdC1hcGktc2VjcmV0 # your-toast-api-secret
TOAST_WEBHOOK_SECRET: eW91ci10b2FzdC13ZWJob29rLXNlY3JldA== # your-toast-webhook-secret
LIGHTSPEED_API_KEY: eW91ci1saWdodHNwZWVkLWFwaS1rZXk= # your-lightspeed-api-key
LIGHTSPEED_API_SECRET: eW91ci1saWdodHNwZWVkLWFwaS1zZWNyZXQ= # your-lightspeed-api-secret
LIGHTSPEED_WEBHOOK_SECRET: eW91ci1saWdodHNwZWVkLXdlYmhvb2stc2VjcmV0 # your-lightspeed-webhook-secret
---
apiVersion: v1
kind: Secret
metadata:
name: whatsapp-secrets
namespace: bakery-ia
labels:
app.kubernetes.io/name: bakery-ia
app.kubernetes.io/component: notifications
type: Opaque
data:
WHATSAPP_API_KEY: eW91ci13aGF0c2FwcC1hcGkta2V5LWhlcmU= # your-whatsapp-api-key-here

View File

@@ -0,0 +1,56 @@
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: bakery-dev-tls-cert
namespace: bakery-ia
spec:
# Self-signed certificate for local development
secretName: bakery-dev-tls-cert
# Certificate duration
duration: 2160h # 90 days
renewBefore: 360h # 15 days
# Subject configuration
subject:
organizations:
- Bakery IA Development
# Common name
commonName: localhost
# DNS names this certificate is valid for
dnsNames:
- localhost
- bakery-ia.local
- api.bakery-ia.local
- monitoring.bakery-ia.local
- gitea.bakery-ia.local
- registry.bakery-ia.local
- "*.bakery-ia.local"
- "mail.bakery-ia.dev"
- "*.bakery-ia.dev"
# IP addresses (for localhost)
ipAddresses:
- 127.0.0.1
- ::1
# Use self-signed issuer for development
issuerRef:
name: selfsigned-issuer
kind: ClusterIssuer
group: cert-manager.io
# Private key configuration
privateKey:
algorithm: RSA
encoding: PKCS1
size: 2048
# Usages
usages:
- server auth
- client auth
- digital signature
- key encipherment

View File

@@ -0,0 +1,104 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
metadata:
name: bakery-ia-dev
# NOTE: Do NOT set a global namespace here.
# Each resource already has its namespace explicitly defined.
# A global namespace would incorrectly transform cluster-scoped resources
# like cert-manager namespaces.
resources:
- ../../../environments/common/configs
# NOTE: nominatim is NOT included here - it's deployed manually via Tilt trigger 'nominatim-helm'
# - ../../../platform/nominatim
- ../../../platform/gateway
- ../../../platform/cert-manager
- ../../../platform/networking/ingress/overlays/dev
- ../../../platform/storage
- ../../../services/databases
- ../../../services/microservices
# NOTE: cicd is NOT included here - it's deployed manually via Tilt triggers
# Run 'tilt trigger tekton-install' followed by 'tilt trigger tekton-pipelines-deploy'
# - ../../../cicd
- dev-certificate.yaml
# Dev-specific patches
patches:
- target:
kind: ConfigMap
name: bakery-config
patch: |-
- op: replace
path: /data/ENVIRONMENT
value: "development"
- op: replace
path: /data/DEBUG
value: "true"
# NOTE: nominatim patches removed - nominatim is now deployed via Helm (tilt trigger nominatim-helm)
labels:
- includeSelectors: true
pairs:
environment: development
tier: local
# Dev image overrides - use Kind registry to avoid Docker Hub rate limits
# IMPORTANT: All image names must be lowercase (Docker requirement)
# The prepull-base-images.sh script pushes images to localhost:5000/ with format: <repo>_<tag>
# Format: localhost:5000/<package-name>_<tag>:latest
images:
# Database images
- name: postgres
newName: localhost:5000/postgres_17_alpine
newTag: latest
- name: redis
newName: localhost:5000/redis_7_4_alpine
newTag: latest
- name: rabbitmq
newName: localhost:5000/rabbitmq_4_1_management_alpine
newTag: latest
# Utility images
- name: busybox
newName: localhost:5000/busybox_1_36
newTag: latest
- name: curlimages/curl
newName: localhost:5000/curlimages_curl_latest
newTag: latest
- name: bitnami/kubectl
newName: localhost:5000/bitnami_kubectl_latest
newTag: latest
# Alpine variants
- name: alpine
newName: localhost:5000/alpine_3_19
newTag: latest
- name: alpine/git
newName: localhost:5000/alpine_git_2_43_0
newTag: latest
# CI/CD images (cached in Kind registry for consistency)
- name: gcr.io/kaniko-project/executor
newName: localhost:5000/gcr_io_kaniko_project_executor_v1_23_0
newTag: latest
- name: gcr.io/go-containerregistry/crane
newName: localhost:5000/gcr_io_go_containerregistry_crane_latest
newTag: latest
- name: registry.k8s.io/kustomize/kustomize
newName: localhost:5000/registry_k8s_io_kustomize_kustomize_v5_3_0
newTag: latest
# Storage images
- name: minio/minio
newName: localhost:5000/minio_minio_release_2024_11_07t00_52_20z
newTag: latest
- name: minio/mc
newName: localhost:5000/minio_mc_release_2024_11_17t19_35_25z
newTag: latest
# NOTE: nominatim image override removed - nominatim is now deployed via Helm
# Python base image
- name: python
newName: localhost:5000/python_3_11_slim
newTag: latest

View File

@@ -0,0 +1,347 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
metadata:
name: bakery-ia-prod
# NOTE: Do NOT set a global namespace here.
# Each resource already has its namespace explicitly defined.
# A global namespace would incorrectly transform cluster-scoped resources
# like flux-system and cert-manager namespaces.
resources:
- ../../../environments/common/configs
- ../../../platform/cert-manager
- ../../../platform/networking/ingress/overlays/prod
- ../../../platform/gateway
- ../../../platform/storage
- ../../../services/databases
- ../../../services/microservices
# NOTE: CI/CD (gitea, tekton, flux) deployed via Helm, not kustomize
- prod-certificate.yaml
# SigNoz is managed via Helm deployment (see infrastructure/helm/deploy-signoz.sh)
# Monitoring is handled by SigNoz (no separate monitoring components needed)
# SigNoz paths are now included in the main ingress (ingress-https.yaml)
labels:
- includeSelectors: false
pairs:
environment: production
tier: production
# Production configuration patches
patches:
# Override ConfigMap values for production
- target:
kind: ConfigMap
name: bakery-config
patch: |-
- op: replace
path: /data/ENVIRONMENT
value: "production"
- op: replace
path: /data/DEBUG
value: "false"
- op: replace
path: /data/LOG_LEVEL
value: "INFO"
- op: replace
path: /data/PROFILING_ENABLED
value: "false"
- op: replace
path: /data/MOCK_EXTERNAL_APIS
value: "false"
- op: add
path: /data/REQUEST_TIMEOUT
value: "30"
- op: add
path: /data/MAX_CONNECTIONS
value: "100"
- op: replace
path: /data/ENABLE_TRACING
value: "true"
- op: replace
path: /data/ENABLE_METRICS
value: "true"
- op: replace
path: /data/ENABLE_LOGS
value: "true"
- op: add
path: /data/OTEL_EXPORTER_OTLP_ENDPOINT
value: "http://signoz-otel-collector.bakery-ia.svc.cluster.local:4317"
- op: add
path: /data/OTEL_EXPORTER_OTLP_PROTOCOL
value: "grpc"
- op: add
path: /data/OTEL_SERVICE_NAME
value: "bakery-ia"
- op: add
path: /data/OTEL_RESOURCE_ATTRIBUTES
value: "deployment.environment=production,cluster.name=bakery-ia-prod"
- op: add
path: /data/SIGNOZ_ENDPOINT
value: "http://signoz.signoz.svc.cluster.local:8080"
- op: add
path: /data/SIGNOZ_FRONTEND_URL
value: "https://monitoring.bakewise.ai"
- op: add
path: /data/SIGNOZ_ROOT_URL
value: "https://monitoring.bakewise.ai"
- op: add
path: /data/RATE_LIMIT_ENABLED
value: "true"
- op: add
path: /data/RATE_LIMIT_PER_MINUTE
value: "60"
- op: add
path: /data/CORS_ORIGINS
value: "https://bakewise.ai"
- op: add
path: /data/CORS_ALLOW_CREDENTIALS
value: "true"
- op: add
path: /data/VITE_API_URL
value: "/api"
- op: add
path: /data/VITE_ENVIRONMENT
value: "production"
# Add imagePullSecrets to all Deployments for gitea registry authentication
- target:
kind: Deployment
patch: |-
- op: add
path: /spec/template/spec/imagePullSecrets
value:
- name: gitea-registry-secret
# Add imagePullSecrets to all StatefulSets for gitea registry authentication
- target:
kind: StatefulSet
patch: |-
- op: add
path: /spec/template/spec/imagePullSecrets
value:
- name: gitea-registry-secret
# Add imagePullSecrets to all Jobs for gitea registry authentication
- target:
kind: Job
patch: |-
- op: add
path: /spec/template/spec/imagePullSecrets
value:
- name: gitea-registry-secret
# Add imagePullSecrets to all CronJobs for gitea registry authentication
- target:
kind: CronJob
patch: |-
- op: add
path: /spec/jobTemplate/spec/template/spec/imagePullSecrets
value:
- name: gitea-registry-secret
# SigNoz resource patches for production
# SigNoz ClickHouse production configuration
- target:
group: apps
version: v1
kind: StatefulSet
name: signoz-clickhouse
namespace: bakery-ia
patch: |-
- op: replace
path: /spec/replicas
value: 2
- op: replace
path: /spec/template/spec/containers/0/resources
value:
requests:
memory: "2Gi"
cpu: "500m"
limits:
memory: "4Gi"
cpu: "1000m"
# SigNoz Main Service production configuration (v0.106.0+ unified service)
- target:
group: apps
version: v1
kind: StatefulSet
name: signoz
namespace: bakery-ia
patch: |-
- op: replace
path: /spec/replicas
value: 2
- op: replace
path: /spec/template/spec/containers/0/resources
value:
requests:
memory: "2Gi"
cpu: "1000m"
limits:
memory: "4Gi"
cpu: "2000m"
# SigNoz AlertManager production configuration
- target:
group: apps
version: v1
kind: Deployment
name: signoz-alertmanager
namespace: bakery-ia
patch: |-
- op: replace
path: /spec/replicas
value: 2
- op: replace
path: /spec/template/spec/containers/0/resources
value:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "1Gi"
cpu: "500m"
images:
# Application services
- name: bakery/auth-service
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/auth-service
newTag: latest
- name: bakery/tenant-service
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/tenant-service
newTag: latest
- name: bakery/training-service
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/training-service
newTag: latest
- name: bakery/forecasting-service
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/forecasting-service
newTag: latest
- name: bakery/sales-service
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/sales-service
newTag: latest
- name: bakery/external-service
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/external-service
newTag: latest
- name: bakery/notification-service
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/notification-service
newTag: latest
- name: bakery/inventory-service
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/inventory-service
newTag: latest
- name: bakery/recipes-service
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/recipes-service
newTag: latest
- name: bakery/suppliers-service
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/suppliers-service
newTag: latest
- name: bakery/pos-service
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/pos-service
newTag: latest
- name: bakery/orders-service
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/orders-service
newTag: latest
- name: bakery/production-service
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/production-service
newTag: latest
- name: bakery/alert-processor
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/alert-processor
newTag: latest
- name: bakery/gateway
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/gateway
newTag: latest
- name: bakery/dashboard
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/dashboard
newTag: latest
# =============================================================================
# Database images (cached in gitea registry for consistency)
- name: postgres
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/postgres
newTag: "17-alpine"
- name: redis
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/redis
newTag: "7.4-alpine"
- name: rabbitmq
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/rabbitmq
newTag: "4.1-management-alpine"
# Utility images
- name: busybox
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/busybox
newTag: "1.36"
- name: curlimages/curl
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/curlimages-curl
newTag: latest
- name: bitnami/kubectl
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/bitnami-kubectl
newTag: latest
# Alpine variants
- name: alpine
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/alpine
newTag: "3.19"
- name: alpine/git
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/alpine-git
newTag: 2.43.0
# CI/CD images (cached in gitea registry for consistency)
- name: gcr.io/kaniko-project/executor
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/gcr.io-kaniko-project-executor
newTag: v1.23.0
- name: gcr.io/go-containerregistry/crane
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/gcr.io-go-containerregistry-crane
newTag: latest
- name: registry.k8s.io/kustomize/kustomize
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/registry.k8s.io-kustomize-kustomize
newTag: v5.3.0
# Storage images
- name: minio/minio
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/minio-minio
newTag: RELEASE.2024-11-07T00-52-20Z
- name: minio/mc
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/minio-mc
newTag: RELEASE.2024-11-17T19-35-25Z
# NOTE: nominatim image override removed - nominatim is now deployed via Helm
# Python base image
- name: python
newName: gitea-http.gitea.svc.cluster.local:3000/bakery-admin/python
newTag: 3.11-slim
replicas:
- name: auth-service
count: 3
- name: tenant-service
count: 2
- name: training-service
count: 3 # Safe with MinIO storage - no PVC conflicts
- name: forecasting-service
count: 3
- name: sales-service
count: 2
- name: external-service
count: 2
- name: notification-service
count: 3
- name: inventory-service
count: 2
- name: recipes-service
count: 2
- name: suppliers-service
count: 2
- name: pos-service
count: 2
- name: orders-service
count: 3
- name: production-service
count: 2
- name: alert-processor
count: 3
- name: procurement-service
count: 2
- name: orchestrator-service
count: 2
- name: ai-insights-service
count: 2
- name: gateway
count: 3
- name: frontend
count: 2

View File

@@ -0,0 +1,49 @@
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: bakery-ia-prod-tls-cert
namespace: bakery-ia
spec:
# Let's Encrypt certificate for production
secretName: bakery-ia-prod-tls-cert
# Certificate duration and renewal
duration: 2160h # 90 days (Let's Encrypt default)
renewBefore: 360h # 15 days before expiry
# Subject configuration
subject:
organizations:
- Bakery IA
# Common name
commonName: bakewise.ai
# DNS names this certificate is valid for
dnsNames:
- bakewise.ai
- www.bakewise.ai
- mail.bakewise.ai
- monitoring.bakewise.ai
- gitea.bakewise.ai
- registry.bakewise.ai
- api.bakewise.ai
# Use Let's Encrypt production issuer
issuerRef:
name: letsencrypt-production
kind: ClusterIssuer
group: cert-manager.io
# Private key configuration
privateKey:
algorithm: RSA
encoding: PKCS1
size: 2048
# Usages
usages:
- server auth
- client auth
- digital signature
- key encipherment

View File

@@ -0,0 +1,47 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: bakery-config
namespace: bakery-ia
data:
# Environment
ENVIRONMENT: "production"
DEBUG: "false"
LOG_LEVEL: "INFO"
# Profiling and Development Features (disabled in production)
PROFILING_ENABLED: "false"
MOCK_EXTERNAL_APIS: "false"
# Performance and Security
REQUEST_TIMEOUT: "30"
MAX_CONNECTIONS: "100"
# Monitoring - SigNoz (Unified Observability)
ENABLE_TRACING: "true"
ENABLE_METRICS: "true"
ENABLE_LOGS: "true"
# OpenTelemetry Configuration - Direct to SigNoz
# IMPORTANT: gRPC endpoints should NOT include http:// prefix
OTEL_EXPORTER_OTLP_ENDPOINT: "signoz-otel-collector.bakery-ia.svc.cluster.local:4317"
OTEL_EXPORTER_OTLP_PROTOCOL: "grpc"
OTEL_SERVICE_NAME: "bakery-ia"
OTEL_RESOURCE_ATTRIBUTES: "deployment.environment=production,cluster.name=bakery-ia-prod"
# SigNoz Endpoints (v0.106.0+ unified service)
SIGNOZ_ENDPOINT: "http://signoz.bakery-ia.svc.cluster.local:8080"
SIGNOZ_FRONTEND_URL: "https://monitoring.bakewise.ai"
SIGNOZ_ROOT_URL: "https://monitoring.bakewise.ai"
# Rate Limiting (stricter in production)
RATE_LIMIT_ENABLED: "true"
RATE_LIMIT_PER_MINUTE: "60"
# CORS Configuration for Production
CORS_ORIGINS: "https://bakewise.ai"
CORS_ALLOW_CREDENTIALS: "true"
# Frontend Configuration
VITE_API_URL: "/api"
VITE_ENVIRONMENT: "production"

View File

@@ -0,0 +1,616 @@
# SigNoz Helm Deployment for Bakery IA
This directory contains the Helm configurations and deployment scripts for the SigNoz observability platform.
## Overview
SigNoz is deployed using the official Helm chart with environment-specific configurations optimized for:
- **Development**: Colima + Kind (Kubernetes in Docker) with Tilt
- **Production**: VPS on clouding.io with MicroK8s
## Prerequisites
### Required Tools
- **kubectl** 1.22+
- **Helm** 3.8+
- **Docker** (for development)
- **Kind/MicroK8s** (environment-specific)
### Docker Hub Authentication
SigNoz uses images from Docker Hub. Set up authentication to avoid rate limits:
```bash
# Option 1: Environment variables (recommended)
export DOCKERHUB_USERNAME='your-username'
export DOCKERHUB_PASSWORD='your-personal-access-token'
# Option 2: Docker login
docker login
```
## Quick Start
### Development Deployment
```bash
# Deploy SigNoz to development environment
./deploy-signoz.sh dev
# Verify deployment
./verify-signoz.sh dev
# Access SigNoz UI
# Via ingress: http://monitoring.bakery-ia.local
# Or port-forward:
kubectl port-forward -n signoz svc/signoz 8080:8080
# Then open: http://localhost:8080
```
### Production Deployment
```bash
# Deploy SigNoz to production environment
./deploy-signoz.sh prod
# Verify deployment
./verify-signoz.sh prod
# Access SigNoz UI
# https://monitoring.bakewise.ai
```
## Configuration Files
### signoz-values-dev.yaml
Development environment configuration with:
- Single replica for most components
- Reduced resource requests (optimized for local Kind cluster)
- 7-day data retention
- Batch size: 10,000 events
- ClickHouse 25.5.6, OTel Collector v0.129.12
- PostgreSQL, Redis, and RabbitMQ receivers configured
### signoz-values-prod.yaml
Production environment configuration with:
- High availability: 2+ replicas for critical components
- 3 Zookeeper replicas (required for production)
- 30-day data retention
- Batch size: 50,000 events (high-performance)
- Cold storage enabled with 30-day TTL
- Horizontal Pod Autoscaler (HPA) enabled
- TLS/SSL with cert-manager
- Enhanced security with pod anti-affinity rules
## Key Configuration Changes (v0.89.0+)
⚠️ **BREAKING CHANGE**: SigNoz Helm chart v0.89.0+ uses a unified component structure.
**Old Structure (deprecated):**
```yaml
frontend:
replicaCount: 2
queryService:
replicaCount: 2
```
**New Structure (current):**
```yaml
signoz:
replicaCount: 2
# Combines frontend + query service
```
## Component Architecture
### Core Components
1. **SigNoz** (unified component)
- Frontend UI + Query Service
- Port 8080 (HTTP/API), 8085 (internal gRPC)
- Dev: 1 replica, Prod: 2+ replicas with HPA
2. **ClickHouse** (Time-series database)
- Version: 25.5.6
- Stores traces, metrics, and logs
- Dev: 1 replica, Prod: 2 replicas with cold storage
3. **Zookeeper** (ClickHouse coordination)
- Version: 3.7.1
- Dev: 1 replica, Prod: 3 replicas (critical for HA)
4. **OpenTelemetry Collector** (Data ingestion)
- Version: v0.129.12
- Ports: 4317 (gRPC), 4318 (HTTP), 8888 (metrics)
- Dev: 1 replica, Prod: 2+ replicas with HPA
5. **Alertmanager** (Alert management)
- Version: 0.23.5
- Email and Slack integrations configured
- Port: 9093
## Performance Optimizations
### Batch Processing
- **Development**: 10,000 events per batch
- **Production**: 50,000 events per batch (official recommendation)
- Timeout: 1 second for faster processing
### Memory Management
- Memory limiter processor prevents OOM
- Dev: 400 MiB limit, Prod: 1500 MiB limit
- Spike limits configured (see the collector sketch below)
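As a rough sketch, these batch and memory-limiter settings usually sit under `otelCollector.config` in the values file; the exact placement and defaults depend on the chart version, so treat the keys below as illustrative rather than authoritative:
```yaml
# Illustrative only - verify key placement against the chart's values.yaml
otelCollector:
  config:
    processors:
      batch:
        send_batch_size: 50000   # production; ~10000 for dev
        timeout: 1s
      memory_limiter:
        check_interval: 1s
        limit_mib: 1500          # production; ~400 for dev
        spike_limit_mib: 512     # assumed value, tune per environment
```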
### Span Metrics Processor
Automatically generates RED metrics (Rate, Errors, Duration):
- Latency histogram buckets optimized for microservices
- Cache size: 10K (dev), 100K (prod)
### Cold Storage (Production Only)
- Enabled with 30-day TTL
- Automatically moves old data to cold storage
- Keeps 10GB free on primary storage
## OpenTelemetry Endpoints
### From Within Kubernetes Cluster
**Development:**
```
OTLP gRPC: signoz-otel-collector.bakery-ia.svc.cluster.local:4317
OTLP HTTP: signoz-otel-collector.bakery-ia.svc.cluster.local:4318
```
**Production:**
```
OTLP gRPC: signoz-otel-collector.bakery-ia.svc.cluster.local:4317
OTLP HTTP: signoz-otel-collector.bakery-ia.svc.cluster.local:4318
```
### Application Configuration Example
```yaml
# Python with OpenTelemetry
OTEL_EXPORTER_OTLP_ENDPOINT: "http://signoz-otel-collector.bakery-ia.svc.cluster.local:4318"
OTEL_EXPORTER_OTLP_PROTOCOL: "http/protobuf"
```
```javascript
// Node.js with OpenTelemetry
const exporter = new OTLPTraceExporter({
url: 'http://signoz-otel-collector.bakery-ia.svc.cluster.local:4318/v1/traces',
});
```
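Inside the cluster, services typically pick these values up from the `bakery-config` ConfigMap rather than hard-coding them. A minimal, hypothetical container spec excerpt (the real Deployments may use `envFrom` instead):
```yaml
# Hypothetical excerpt - actual Deployments in this repo may wire this differently
env:
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    valueFrom:
      configMapKeyRef:
        name: bakery-config
        key: OTEL_EXPORTER_OTLP_ENDPOINT
  - name: OTEL_EXPORTER_OTLP_PROTOCOL
    valueFrom:
      configMapKeyRef:
        name: bakery-config
        key: OTEL_EXPORTER_OTLP_PROTOCOL
```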
## Deployment Scripts
### deploy-signoz.sh
Comprehensive deployment script:
```bash
# Usage
./deploy-signoz.sh [OPTIONS] ENVIRONMENT
# Options
-h, --help Show help message
-d, --dry-run Show what would be deployed
-u, --upgrade Upgrade existing deployment
-r, --remove Remove deployment
-n, --namespace NS Custom namespace (default: signoz)
# Examples
./deploy-signoz.sh dev # Deploy to dev
./deploy-signoz.sh --upgrade prod # Upgrade prod
./deploy-signoz.sh --dry-run prod # Preview changes
./deploy-signoz.sh --remove dev # Remove dev deployment
```
**Features:**
- Automatic Helm repository setup
- Docker Hub secret creation
- Namespace management
- Deployment verification
- 15-minute timeout with `--wait` flag
### verify-signoz.sh
Verification script to check deployment health:
```bash
# Usage
./verify-signoz.sh [OPTIONS] ENVIRONMENT
# Examples
./verify-signoz.sh dev # Verify dev deployment
./verify-signoz.sh prod # Verify prod deployment
```
**Checks performed:**
1. ✅ Helm release status
2. ✅ Pod health and readiness
3. ✅ Service availability
4. ✅ Ingress configuration
5. ✅ PVC status
6. ✅ Resource usage (if metrics-server available)
7. ✅ Log errors
8. ✅ Environment-specific validations
- Dev: Single replica, resource limits
- Prod: HA config, TLS, Zookeeper replicas, HPA
## Storage Configuration
### Development (Kind)
```yaml
global:
storageClass: "standard" # Kind's default provisioner
```
### Production (MicroK8s)
```yaml
global:
storageClass: "microk8s-hostpath" # Or custom storage class
```
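Per-component volume sizes can also be overridden in the values file. A sketch, with key names assumed (confirm against the chart's `values.yaml`):
```yaml
# Assumed key layout - confirm against the SigNoz chart's values.yaml
clickhouse:
  persistence:
    size: 100Gi
zookeeper:
  persistence:
    size: 10Gi
```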
**Storage Requirements:**
- **Development**: ~35 GiB total
- SigNoz: 5 GiB
- ClickHouse: 20 GiB
- Zookeeper: 5 GiB
- Alertmanager: 2 GiB
- **Production**: ~135 GiB total
- SigNoz: 20 GiB
- ClickHouse: 100 GiB
- Zookeeper: 10 GiB
- Alertmanager: 5 GiB
## Resource Requirements
### Development Environment
**Minimum:**
- CPU: 550m (0.55 cores)
- Memory: 1.6 GiB
- Storage: 35 GiB
**Recommended:**
- CPU: 3 cores
- Memory: 3 GiB
- Storage: 50 GiB
### Production Environment
**Minimum:**
- CPU: 3.5 cores
- Memory: 8 GiB
- Storage: 135 GiB
**Recommended:**
- CPU: 12 cores
- Memory: 20 GiB
- Storage: 200 GiB
## Data Retention
### Development
- Traces: 7 days (168 hours)
- Metrics: 7 days (168 hours)
- Logs: 7 days (168 hours)
### Production
- Traces: 30 days (720 hours)
- Metrics: 30 days (720 hours)
- Logs: 30 days (720 hours)
- Cold storage after 30 days
To modify retention, update the environment variables:
```yaml
signoz:
env:
signoz_traces_ttl_duration_hrs: "720" # 30 days
signoz_metrics_ttl_duration_hrs: "720" # 30 days
signoz_logs_ttl_duration_hrs: "168" # 7 days
```
## High Availability (Production)
### Replication Strategy
```yaml
signoz: 2 replicas + HPA (min: 2, max: 5)
clickhouse: 2 replicas
zookeeper: 3 replicas (critical!)
otelCollector: 2 replicas + HPA (min: 2, max: 10)
alertmanager: 2 replicas
```
### Pod Anti-Affinity
Ensures pods are distributed across different nodes:
```yaml
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchLabels:
app.kubernetes.io/component: query-service
topologyKey: kubernetes.io/hostname
```
### Pod Disruption Budgets
Configured for all critical components:
```yaml
podDisruptionBudget:
enabled: true
minAvailable: 1
```
## Monitoring and Alerting
### Email Alerts (Production)
Configure SMTP in production values (using Mailu Helm with Mailgun relay):
```yaml
signoz:
env:
signoz_smtp_enabled: "true"
signoz_smtp_host: "mailu-postfix.bakery-ia.svc.cluster.local"
signoz_smtp_port: "587"
signoz_smtp_from: "alerts@bakewise.ai"
signoz_smtp_username: "alerts@bakewise.ai"
# Set via secret: signoz_smtp_password
```
**Note**: SigNoz now uses the internal Mailu SMTP service (deployed via Helm), which relays to Mailgun for better deliverability and centralized email management.
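The SMTP password itself should live in a Secret rather than in the values file. A minimal sketch; the Secret name and key are assumptions and must match however the chart is configured to read `signoz_smtp_password`:
```yaml
# Assumed name/key - align with how signoz_smtp_password is injected into the release
apiVersion: v1
kind: Secret
metadata:
  name: signoz-smtp-secret
  namespace: signoz
type: Opaque
stringData:
  signoz_smtp_password: "<mailu-smtp-password>"
```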
### Slack Alerts (Production)
Configure webhook in Alertmanager:
```yaml
alertmanager:
config:
receivers:
- name: 'critical-alerts'
slack_configs:
- api_url: '${SLACK_WEBHOOK_URL}'
channel: '#alerts-critical'
```
### Mailgun Integration for Alert Emails
SigNoz has been configured to use Mailgun for sending alert emails through the Mailu SMTP service. This provides:
**Benefits:**
- Better email deliverability through Mailgun's infrastructure
- Centralized email management via Mailu
- Improved tracking and analytics for alert emails
- Compliance with email sending best practices
**Architecture:**
```
SigNoz Alertmanager → Mailu SMTP → Mailgun Relay → Recipients
```
**Configuration Requirements:**
1. **Mailu Configuration** (deployed via Helm at `infrastructure/platform/mail/mailu-helm/`):
```yaml
externalRelay:
host: "[smtp.mailgun.org]:587"
username: "postmaster@bakewise.ai"
password: "<mailgun-api-key>"
```
2. **DNS Configuration** (required for Mailgun):
```
# MX record
bakewise.ai. IN MX 10 mail.bakewise.ai.
# SPF record (authorize Mailgun)
bakewise.ai. IN TXT "v=spf1 include:mailgun.org ~all"
# DKIM record (provided by Mailgun)
m1._domainkey.bakewise.ai. IN TXT "v=DKIM1; k=rsa; p=<mailgun-public-key>"
# DMARC record
_dmarc.bakewise.ai. IN TXT "v=DMARC1; p=quarantine; rua=mailto:dmarc@bakewise.ai"
```
3. **SigNoz SMTP Configuration** (already configured in `signoz-values-prod.yaml`):
```yaml
signoz_smtp_host: "mailu-postfix.bakery-ia.svc.cluster.local"
signoz_smtp_port: "587"
signoz_smtp_from: "alerts@bakewise.ai"
```
**Testing the Integration:**
1. Trigger a test alert from the SigNoz UI
2. Check Mailu logs: `kubectl logs -f -n bakery-ia deployment/mailu-postfix`
3. Check Mailgun dashboard for delivery status
4. Verify email receipt in destination inbox
**Troubleshooting:**
- **SMTP Authentication Failed**: Verify Mailu credentials and Mailgun API key
- **Email Delivery Delays**: Check Mailu queue with `kubectl exec -it -n bakery-ia deployment/mailu-postfix -- mailq`
- **SPF/DKIM Issues**: Verify DNS records and Mailgun domain verification
### Self-Monitoring
SigNoz monitors itself:
```yaml
selfMonitoring:
enabled: true
serviceMonitor:
enabled: true # Prod only
interval: 30s
```
## Troubleshooting
### Common Issues
**1. Pods not starting**
```bash
# Check pod status
kubectl get pods -n signoz
# Check pod logs
kubectl logs -n signoz <pod-name>
# Describe pod for events
kubectl describe pod -n signoz <pod-name>
```
**2. Docker Hub rate limits**
```bash
# Verify secret exists
kubectl get secret dockerhub-creds -n signoz
# Recreate secret
kubectl delete secret dockerhub-creds -n signoz
export DOCKERHUB_USERNAME='your-username'
export DOCKERHUB_PASSWORD='your-token'
./deploy-signoz.sh dev
```
**3. ClickHouse connection issues**
```bash
# Check ClickHouse pod
kubectl logs -n signoz -l app.kubernetes.io/component=clickhouse
# Check Zookeeper (required by ClickHouse)
kubectl logs -n signoz -l app.kubernetes.io/component=zookeeper
```
**4. OTel Collector not receiving data**
```bash
# Check OTel Collector logs
kubectl logs -n signoz -l app.kubernetes.io/component=otel-collector
# Test connectivity
kubectl port-forward -n signoz svc/signoz-otel-collector 4318:4318
curl -v http://localhost:4318/v1/traces
```
**5. Insufficient storage**
```bash
# Check PVC status
kubectl get pvc -n signoz
# Check storage usage (if metrics-server available)
kubectl top pods -n signoz
```
### Debug Mode
Enable debug exporter in OTel Collector:
```yaml
otelCollector:
config:
exporters:
debug:
verbosity: detailed
sampling_initial: 5
sampling_thereafter: 200
service:
pipelines:
traces:
exporters: [clickhousetraces, debug] # Add debug
```
### Upgrade from Old Version
If upgrading from pre-v0.89.0:
```bash
# 1. Backup data (recommended)
kubectl get all -n signoz -o yaml > signoz-backup.yaml
# 2. Remove old deployment
./deploy-signoz.sh --remove prod
# 3. Deploy new version
./deploy-signoz.sh prod
# 4. Verify
./verify-signoz.sh prod
```
## Security Best Practices
1. **Change default password** immediately after first login
2. **Use TLS/SSL** in production (configured with cert-manager)
3. **Network policies** enabled in production
4. **Run as non-root** (configured in securityContext)
5. **RBAC** with dedicated service account
6. **Secrets management** for sensitive data (SMTP, Slack webhooks)
7. **Image pull secrets** to avoid exposing Docker Hub credentials
## Backup and Recovery
### Backup ClickHouse Data
```bash
# Export ClickHouse data
kubectl exec -n signoz <clickhouse-pod> -- clickhouse-client \
--query="BACKUP DATABASE signoz_traces TO Disk('backups', 'traces_backup.zip')"
# Copy backup out
kubectl cp signoz/<clickhouse-pod>:/var/lib/clickhouse/backups/ ./backups/
```
### Restore from Backup
```bash
# Copy backup in
kubectl cp ./backups/ signoz/<clickhouse-pod>:/var/lib/clickhouse/backups/
# Restore
kubectl exec -n signoz <clickhouse-pod> -- clickhouse-client \
--query="RESTORE DATABASE signoz_traces FROM Disk('backups', 'traces_backup.zip')"
```
## Updating Configuration
To update SigNoz configuration:
1. Edit values file: `signoz-values-{env}.yaml`
2. Apply changes:
```bash
./deploy-signoz.sh --upgrade {env}
```
3. Verify:
```bash
./verify-signoz.sh {env}
```
## Uninstallation
```bash
# Remove SigNoz deployment
./deploy-signoz.sh --remove {env}
# Optionally delete PVCs (WARNING: deletes all data)
kubectl delete pvc -n signoz -l app.kubernetes.io/instance=signoz
# Optionally delete namespace
kubectl delete namespace signoz
```
## References
- [SigNoz Official Documentation](https://signoz.io/docs/)
- [SigNoz Helm Charts Repository](https://github.com/SigNoz/charts)
- [OpenTelemetry Documentation](https://opentelemetry.io/docs/)
- [ClickHouse Documentation](https://clickhouse.com/docs/)
## Support
For issues or questions:
1. Check [SigNoz GitHub Issues](https://github.com/SigNoz/signoz/issues)
2. Review deployment logs: `kubectl logs -n signoz <pod-name>`
3. Run verification script: `./verify-signoz.sh {env}`
4. Check [SigNoz Community Slack](https://signoz.io/slack)
---
**Last Updated**: 2026-01-09
**SigNoz Helm Chart Version**: Latest (v0.129.12 components)
**Maintained by**: Bakery IA Team

View File

@@ -0,0 +1,190 @@
# SigNoz Dashboards for Bakery IA
This directory contains comprehensive SigNoz dashboard configurations for monitoring the Bakery IA system.
## Available Dashboards
### 1. Infrastructure Monitoring
- **File**: `infrastructure-monitoring.json`
- **Purpose**: Monitor Kubernetes infrastructure, pod health, and resource utilization
- **Key Metrics**: CPU usage, memory usage, network traffic, pod status, container health
### 2. Application Performance
- **File**: `application-performance.json`
- **Purpose**: Monitor microservice performance and API metrics
- **Key Metrics**: Request rate, error rate, latency percentiles, endpoint performance
### 3. Database Performance
- **File**: `database-performance.json`
- **Purpose**: Monitor PostgreSQL and Redis database performance
- **Key Metrics**: Connections, query execution time, cache hit ratio, locks, replication status
### 4. API Performance
- **File**: `api-performance.json`
- **Purpose**: Monitor REST and GraphQL API performance
- **Key Metrics**: Request volume, response times, status codes, endpoint analysis
### 5. Error Tracking
- **File**: `error-tracking.json`
- **Purpose**: Track and analyze system errors
- **Key Metrics**: Error rates, error distribution, recent errors, HTTP errors, database errors
### 6. User Activity
- **File**: `user-activity.json`
- **Purpose**: Monitor user behavior and activity patterns
- **Key Metrics**: Active users, sessions, API calls per user, session duration
### 7. System Health
- **File**: `system-health.json`
- **Purpose**: Overall system health monitoring
- **Key Metrics**: Availability, health scores, resource utilization, service status
### 8. Alert Management
- **File**: `alert-management.json`
- **Purpose**: Monitor and manage system alerts
- **Key Metrics**: Active alerts, alert rates, alert distribution, firing alerts
### 9. Log Analysis
- **File**: `log-analysis.json`
- **Purpose**: Search and analyze system logs
- **Key Metrics**: Log volume, error logs, log distribution, log search
## How to Import Dashboards
### Method 1: Using SigNoz UI
1. **Access SigNoz UI**: Open your SigNoz instance in a web browser
2. **Navigate to Dashboards**: Go to the "Dashboards" section
3. **Import Dashboard**: Click on "Import Dashboard" button
4. **Upload JSON**: Select the JSON file from this directory
5. **Configure**: Adjust any variables or settings as needed
6. **Save**: Save the imported dashboard
**Note**: The dashboards now use the correct SigNoz JSON schema with proper filter arrays.
### Method 2: Using SigNoz API
```bash
# Import a single dashboard
curl -X POST "http://<SIGNOZ_HOST>:3301/api/v1/dashboards/import" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <API_KEY>" \
-d @infrastructure-monitoring.json
# Import all dashboards
for file in *.json; do
curl -X POST "http://<SIGNOZ_HOST>:3301/api/v1/dashboards/import" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <API_KEY>" \
-d @"$file"
done
```
### Method 3: Using Kubernetes ConfigMap
```bash
# Create a ConfigMap with all dashboards
kubectl create configmap signoz-dashboards \
--from-file=infrastructure-monitoring.json \
--from-file=application-performance.json \
--from-file=database-performance.json \
--from-file=api-performance.json \
--from-file=error-tracking.json \
--from-file=user-activity.json \
--from-file=system-health.json \
--from-file=alert-management.json \
--from-file=log-analysis.json \
-n signoz
```
## Dashboard Variables
Most dashboards include variables that allow you to filter and customize the view:
- **Namespace**: Filter by Kubernetes namespace (e.g., `bakery-ia`, `default`)
- **Service**: Filter by specific microservice
- **Severity**: Filter by error/alert severity
- **Environment**: Filter by deployment environment
- **Time Range**: Adjust the time window for analysis
## Metrics Reference
The dashboards use standard OpenTelemetry metrics. If you need to add custom metrics, ensure they are properly instrumented in your services.
## Troubleshooting
### Dashboard Import Errors
If you encounter errors when importing dashboards:
1. **Validate JSON**: Ensure the JSON files are valid
```bash
jq . infrastructure-monitoring.json
```
2. **Check Metrics**: Verify that the metrics exist in your SigNoz instance
3. **Adjust Time Range**: Try different time ranges if no data appears
4. **Check Filters**: Ensure filters match your actual service names and tags
### "e.filter is not a function" Error
This error occurs when the dashboard JSON uses an incorrect filter format. The fix has been applied:
**Before (incorrect)**:
```json
"filters": {
"namespace": "${namespace}"
}
```
**After (correct)**:
```json
"filters": [
{
"key": "namespace",
"operator": "=",
"value": "${namespace}"
}
]
```
All dashboards in this directory now use the correct array format for filters.
### Missing Data
If dashboards show no data:
1. **Verify Instrumentation**: Ensure your services are properly instrumented with OpenTelemetry
2. **Check Time Range**: Adjust the time range to include recent data
3. **Validate Metrics**: Confirm the metrics are being collected and stored
4. **Review Filters**: Check that filters match your actual deployment
## Customization
You can customize these dashboards by:
1. **Editing JSON**: Modify the JSON files to add/remove panels or adjust queries
2. **Cloning in UI**: Clone existing dashboards and modify them in the SigNoz UI
3. **Adding Variables**: Add new variables for additional filtering options
4. **Adjusting Layout**: Change the grid layout and panel sizes
## Best Practices
1. **Regular Reviews**: Review dashboards regularly to ensure they meet your monitoring needs
2. **Alert Integration**: Set up alerts based on key metrics shown in these dashboards
3. **Team Access**: Share relevant dashboards with appropriate team members
4. **Documentation**: Document any custom metrics or specific monitoring requirements
## Support
For issues with these dashboards:
1. Check the [SigNoz documentation](https://signoz.io/docs/)
2. Review the [Bakery IA monitoring guide](../SIGNOZ_COMPLETE_CONFIGURATION_GUIDE.md)
3. Consult the OpenTelemetry metrics specification
## License
These dashboard configurations are provided under the same license as the Bakery IA project.

View File

@@ -0,0 +1,170 @@
{
"description": "Alert monitoring and management dashboard",
"tags": ["alerts", "monitoring", "management"],
"name": "bakery-ia-alert-management",
"title": "Bakery IA - Alert Management",
"uploadedGrafana": false,
"uuid": "bakery-ia-alerts-01",
"version": "v4",
"collapsableRowsMigrated": true,
"layout": [
{
"x": 0,
"y": 0,
"w": 6,
"h": 3,
"i": "active-alerts",
"moved": false,
"static": false
},
{
"x": 6,
"y": 0,
"w": 6,
"h": 3,
"i": "alert-rate",
"moved": false,
"static": false
}
],
"variables": {
"service": {
"id": "service-var",
"name": "service",
"description": "Filter by service name",
"type": "QUERY",
"queryValue": "SELECT DISTINCT(resource_attrs['service.name']) as value FROM signoz_metrics.distributed_time_series_v4_1day WHERE metric_name = 'alerts_active' AND value != '' ORDER BY value",
"customValue": "",
"textboxValue": "",
"showALLOption": true,
"multiSelect": false,
"order": 1,
"modificationUUID": "",
"sort": "ASC",
"selectedValue": null
}
},
"widgets": [
{
"id": "active-alerts",
"title": "Active Alerts",
"description": "Number of currently active alerts",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "value",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "alerts_active",
"dataType": "int64",
"type": "Gauge",
"isColumn": false
},
"timeAggregation": "latest",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "{{.service}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [],
"legend": "Active Alerts",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
},
{
"id": "alert-rate",
"title": "Alert Rate",
"description": "Rate of alerts over time",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "alerts_total",
"dataType": "int64",
"type": "Counter",
"isColumn": false
},
"timeAggregation": "rate",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "{{.service}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
}
],
"legend": "{{serviceName}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "alerts/s"
}
]
}

View File

@@ -0,0 +1,351 @@
{
"description": "Comprehensive API performance monitoring for Bakery IA REST and GraphQL endpoints",
"tags": ["api", "performance", "rest", "graphql"],
"name": "bakery-ia-api-performance",
"title": "Bakery IA - API Performance",
"uploadedGrafana": false,
"uuid": "bakery-ia-api-01",
"version": "v4",
"collapsableRowsMigrated": true,
"layout": [
{
"x": 0,
"y": 0,
"w": 6,
"h": 3,
"i": "request-volume",
"moved": false,
"static": false
},
{
"x": 6,
"y": 0,
"w": 6,
"h": 3,
"i": "error-rate",
"moved": false,
"static": false
},
{
"x": 0,
"y": 3,
"w": 6,
"h": 3,
"i": "avg-response-time",
"moved": false,
"static": false
},
{
"x": 6,
"y": 3,
"w": 6,
"h": 3,
"i": "p95-latency",
"moved": false,
"static": false
}
],
"variables": {
"service": {
"id": "service-var",
"name": "service",
"description": "Filter by API service",
"type": "QUERY",
"queryValue": "SELECT DISTINCT(resource_attrs['service.name']) as value FROM signoz_metrics.distributed_time_series_v4_1day WHERE metric_name = 'http_server_requests_seconds_count' AND value != '' ORDER BY value",
"customValue": "",
"textboxValue": "",
"showALLOption": true,
"multiSelect": false,
"order": 1,
"modificationUUID": "",
"sort": "ASC",
"selectedValue": null
}
},
"widgets": [
{
"id": "request-volume",
"title": "Request Volume",
"description": "API request volume by service",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "http_server_requests_seconds_count",
"dataType": "int64",
"type": "Counter",
"isColumn": false
},
"timeAggregation": "rate",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "service.name",
"dataType": "string",
"type": "resource",
"isColumn": false
},
"op": "=",
"value": "{{.service}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "api.name",
"dataType": "string",
"type": "resource",
"isColumn": false
}
],
"legend": "{{api.name}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "req/s"
},
{
"id": "error-rate",
"title": "Error Rate",
"description": "API error rate by service",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "http_server_requests_seconds_count",
"dataType": "int64",
"type": "Counter",
"isColumn": false
},
"timeAggregation": "rate",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "api.name",
"dataType": "string",
"type": "resource",
"isColumn": false
},
"op": "=",
"value": "{{.api}}"
},
{
"key": {
"key": "status_code",
"dataType": "string",
"type": "tag",
"isColumn": false
},
"op": "=~",
"value": "5.."
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "api.name",
"dataType": "string",
"type": "resource",
"isColumn": false
},
{
"key": "status_code",
"dataType": "string",
"type": "tag",
"isColumn": false
}
],
"legend": "{{api.name}} - {{status_code}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "req/s"
},
{
"id": "avg-response-time",
"title": "Average Response Time",
"description": "Average API response time by endpoint",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "avg",
"aggregateAttribute": {
"key": "http_server_requests_seconds_sum",
"dataType": "float64",
"type": "Counter",
"isColumn": false
},
"timeAggregation": "avg",
"spaceAggregation": "avg",
"functions": [],
"filters": {
"items": [
                {
                  "key": {
                    "key": "service.name",
                    "dataType": "string",
                    "type": "resource",
                    "isColumn": false
                  },
                  "op": "=",
                  "value": "{{.service}}"
                }
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "api.name",
"dataType": "string",
"type": "resource",
"isColumn": false
},
{
"key": "endpoint",
"dataType": "string",
"type": "tag",
"isColumn": false
}
],
"legend": "{{api.name}} - {{endpoint}}",
"reduceTo": "avg"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "seconds"
},
{
"id": "p95-latency",
"title": "P95 Latency",
"description": "95th percentile latency by endpoint",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "histogram_quantile",
"aggregateAttribute": {
"key": "http_server_requests_seconds_bucket",
"dataType": "float64",
"type": "Histogram",
"isColumn": false
},
"timeAggregation": "avg",
"spaceAggregation": "avg",
"functions": [],
"filters": {
"items": [
                {
                  "key": {
                    "key": "service.name",
                    "dataType": "string",
                    "type": "resource",
                    "isColumn": false
                  },
                  "op": "=",
                  "value": "{{.service}}"
                }
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "api.name",
"dataType": "string",
"type": "resource",
"isColumn": false
},
{
"key": "endpoint",
"dataType": "string",
"type": "tag",
"isColumn": false
}
],
"legend": "{{api.name}} - {{endpoint}}",
"reduceTo": "avg"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "seconds"
}
]
}
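Note: the dashboard manifest later in this commit lists "api_import" as one way to load a definition like the API Performance file above. Below is a minimal sketch, assuming the SigNoz query-service is reachable locally (for example through `kubectl port-forward -n bakery-ia svc/signoz 8080:8080`) and exposes a dashboard-creation endpoint at /api/v1/dashboards; the endpoint path, token handling, and file name are assumptions, not verified against this SigNoz version.

```bash
# Hypothetical API import of one dashboard definition (sketch, not the confirmed SigNoz API).
SIGNOZ_URL="http://localhost:8080"          # port-forwarded query-service (assumption)
SIGNOZ_TOKEN="<paste an API token here>"    # placeholder
curl -sS -X POST "${SIGNOZ_URL}/api/v1/dashboards" \
  -H "Authorization: Bearer ${SIGNOZ_TOKEN}" \
  -H "Content-Type: application/json" \
  --data @api-performance.json
```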


@@ -0,0 +1,333 @@
{
"description": "Application performance monitoring dashboard using distributed traces and metrics",
"tags": ["application", "performance", "traces", "apm"],
"name": "bakery-ia-application-performance",
"title": "Bakery IA - Application Performance (APM)",
"uploadedGrafana": false,
"uuid": "bakery-ia-apm-01",
"version": "v4",
"collapsableRowsMigrated": true,
"layout": [
{
"x": 0,
"y": 0,
"w": 6,
"h": 3,
"i": "latency-p99",
"moved": false,
"static": false
},
{
"x": 6,
"y": 0,
"w": 6,
"h": 3,
"i": "request-rate",
"moved": false,
"static": false
},
{
"x": 0,
"y": 3,
"w": 6,
"h": 3,
"i": "error-rate",
"moved": false,
"static": false
},
{
"x": 6,
"y": 3,
"w": 6,
"h": 3,
"i": "avg-duration",
"moved": false,
"static": false
}
],
"variables": {
"service_name": {
"id": "service-var",
"name": "service_name",
"description": "Filter by service name",
"type": "QUERY",
"queryValue": "SELECT DISTINCT(serviceName) FROM signoz_traces.distributed_signoz_index_v2 ORDER BY serviceName",
"customValue": "",
"textboxValue": "",
"showALLOption": true,
"multiSelect": false,
"order": 1,
"modificationUUID": "",
"sort": "ASC",
"selectedValue": null
}
},
"widgets": [
{
"id": "latency-p99",
"title": "P99 Latency",
"description": "99th percentile latency for selected service",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "traces",
"queryName": "A",
"aggregateOperator": "p99",
"aggregateAttribute": {
"key": "duration_ns",
"dataType": "float64",
"type": "",
"isColumn": true
},
"timeAggregation": "avg",
"spaceAggregation": "p99",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "{{.service_name}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
}
],
"legend": "{{serviceName}}",
"reduceTo": "avg"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "ms"
},
{
"id": "request-rate",
"title": "Request Rate",
"description": "Requests per second for the service",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "traces",
"queryName": "A",
"aggregateOperator": "count",
"aggregateAttribute": {
"key": "",
"dataType": "",
"type": "",
"isColumn": false
},
"timeAggregation": "rate",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "{{.service_name}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
}
],
"legend": "{{serviceName}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "reqps"
},
{
"id": "error-rate",
"title": "Error Rate",
"description": "Error rate percentage for the service",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "traces",
"queryName": "A",
"aggregateOperator": "count",
"aggregateAttribute": {
"key": "",
"dataType": "",
"type": "",
"isColumn": false
},
"timeAggregation": "rate",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "{{.service_name}}"
},
{
"key": {
"key": "status_code",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "STATUS_CODE_ERROR"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
}
],
"legend": "{{serviceName}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "reqps"
},
{
"id": "avg-duration",
"title": "Average Duration",
"description": "Average request duration",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "traces",
"queryName": "A",
"aggregateOperator": "avg",
"aggregateAttribute": {
"key": "duration_ns",
"dataType": "float64",
"type": "",
"isColumn": true
},
"timeAggregation": "avg",
"spaceAggregation": "avg",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "{{.service_name}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
}
],
"legend": "{{serviceName}}",
"reduceTo": "avg"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "ms"
}
]
}


@@ -0,0 +1,425 @@
{
"description": "Comprehensive database performance monitoring for PostgreSQL, Redis, and RabbitMQ",
"tags": ["database", "postgresql", "redis", "rabbitmq", "performance"],
"name": "bakery-ia-database-performance",
"title": "Bakery IA - Database Performance",
"uploadedGrafana": false,
"uuid": "bakery-ia-db-01",
"version": "v4",
"collapsableRowsMigrated": true,
"layout": [
{
"x": 0,
"y": 0,
"w": 6,
"h": 3,
"i": "pg-connections",
"moved": false,
"static": false
},
{
"x": 6,
"y": 0,
"w": 6,
"h": 3,
"i": "pg-db-size",
"moved": false,
"static": false
},
{
"x": 0,
"y": 3,
"w": 6,
"h": 3,
"i": "redis-connected-clients",
"moved": false,
"static": false
},
{
"x": 6,
"y": 3,
"w": 6,
"h": 3,
"i": "redis-memory",
"moved": false,
"static": false
},
{
"x": 0,
"y": 6,
"w": 6,
"h": 3,
"i": "rabbitmq-messages",
"moved": false,
"static": false
},
{
"x": 6,
"y": 6,
"w": 6,
"h": 3,
"i": "rabbitmq-consumers",
"moved": false,
"static": false
}
],
"variables": {
"database": {
"id": "database-var",
"name": "database",
"description": "Filter by PostgreSQL database name",
"type": "QUERY",
"queryValue": "SELECT DISTINCT(resource_attrs['postgresql.database.name']) as value FROM signoz_metrics.distributed_time_series_v4_1day WHERE metric_name = 'postgresql.db_size' AND value != '' ORDER BY value",
"customValue": "",
"textboxValue": "",
"showALLOption": true,
"multiSelect": false,
"order": 1,
"modificationUUID": "",
"sort": "ASC",
"selectedValue": null
}
},
"widgets": [
{
"id": "pg-connections",
"title": "PostgreSQL - Active Connections",
"description": "Number of active PostgreSQL connections",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "postgresql.backends",
"dataType": "float64",
"type": "Gauge",
"isColumn": false
},
"timeAggregation": "latest",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "postgresql.database.name",
"dataType": "string",
"type": "resource",
"isColumn": false
},
"op": "=",
"value": "{{.database}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "postgresql.database.name",
"dataType": "string",
"type": "resource",
"isColumn": false
}
],
"legend": "{{postgresql.database.name}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
},
{
"id": "pg-db-size",
"title": "PostgreSQL - Database Size",
"description": "Size of PostgreSQL databases in bytes",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "postgresql.db_size",
"dataType": "int64",
"type": "Gauge",
"isColumn": false
},
"timeAggregation": "latest",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "postgresql.database.name",
"dataType": "string",
"type": "resource",
"isColumn": false
},
"op": "=",
"value": "{{.database}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "postgresql.database.name",
"dataType": "string",
"type": "resource",
"isColumn": false
}
],
"legend": "{{postgresql.database.name}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "bytes"
},
{
"id": "redis-connected-clients",
"title": "Redis - Connected Clients",
"description": "Number of clients connected to Redis",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "avg",
"aggregateAttribute": {
"key": "redis.clients.connected",
"dataType": "int64",
"type": "Gauge",
"isColumn": false
},
"timeAggregation": "latest",
"spaceAggregation": "avg",
"functions": [],
"filters": {
"items": [],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "host.name",
"dataType": "string",
"type": "resource",
"isColumn": false
}
],
"legend": "{{host.name}}",
"reduceTo": "avg"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
},
{
"id": "redis-memory",
"title": "Redis - Memory Usage",
"description": "Redis memory usage in bytes",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "avg",
"aggregateAttribute": {
"key": "redis.memory.used",
"dataType": "int64",
"type": "Gauge",
"isColumn": false
},
"timeAggregation": "latest",
"spaceAggregation": "avg",
"functions": [],
"filters": {
"items": [],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "host.name",
"dataType": "string",
"type": "resource",
"isColumn": false
}
],
"legend": "{{host.name}}",
"reduceTo": "avg"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "bytes"
},
{
"id": "rabbitmq-messages",
"title": "RabbitMQ - Current Messages",
"description": "Number of messages currently in RabbitMQ queues",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "rabbitmq.message.current",
"dataType": "int64",
"type": "Gauge",
"isColumn": false
},
"timeAggregation": "latest",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "queue",
"dataType": "string",
"type": "tag",
"isColumn": false
}
],
"legend": "Queue: {{queue}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
},
{
"id": "rabbitmq-consumers",
"title": "RabbitMQ - Consumer Count",
"description": "Number of consumers per queue",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "rabbitmq.consumer.count",
"dataType": "int64",
"type": "Gauge",
"isColumn": false
},
"timeAggregation": "latest",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "queue",
"dataType": "string",
"type": "tag",
"isColumn": false
}
],
"legend": "Queue: {{queue}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
}
]
}
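Note: the postgresql.*, redis.*, and rabbitmq.* metrics charted above are produced by OpenTelemetry Collector receivers, not by SigNoz itself, so these panels stay empty until those receivers are configured. A quick, hedged sanity check (the Deployment name signoz-otel-collector is an assumption taken from the access information printed by the deployment script later in this commit):

```bash
# Verify the collector is running and that database metrics show up in its logs (sketch).
kubectl get pods -n bakery-ia -l app.kubernetes.io/instance=signoz
kubectl logs -n bakery-ia deploy/signoz-otel-collector --tail=200 | grep -Ei 'postgresql|redis|rabbitmq'
```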


@@ -0,0 +1,348 @@
{
"description": "Comprehensive error tracking and analysis dashboard",
"tags": ["errors", "exceptions", "tracking"],
"name": "bakery-ia-error-tracking",
"title": "Bakery IA - Error Tracking",
"uploadedGrafana": false,
"uuid": "bakery-ia-errors-01",
"version": "v4",
"collapsableRowsMigrated": true,
"layout": [
{
"x": 0,
"y": 0,
"w": 6,
"h": 3,
"i": "total-errors",
"moved": false,
"static": false
},
{
"x": 6,
"y": 0,
"w": 6,
"h": 3,
"i": "error-rate",
"moved": false,
"static": false
},
{
"x": 0,
"y": 3,
"w": 6,
"h": 3,
"i": "http-5xx",
"moved": false,
"static": false
},
{
"x": 6,
"y": 3,
"w": 6,
"h": 3,
"i": "http-4xx",
"moved": false,
"static": false
}
],
"variables": {
"service": {
"id": "service-var",
"name": "service",
"description": "Filter by service name",
"type": "QUERY",
"queryValue": "SELECT DISTINCT(resource_attrs['service.name']) as value FROM signoz_metrics.distributed_time_series_v4_1day WHERE metric_name = 'error_total' AND value != '' ORDER BY value",
"customValue": "",
"textboxValue": "",
"showALLOption": true,
"multiSelect": false,
"order": 1,
"modificationUUID": "",
"sort": "ASC",
"selectedValue": null
}
},
"widgets": [
{
"id": "total-errors",
"title": "Total Errors",
"description": "Total number of errors across all services",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "value",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "error_total",
"dataType": "int64",
"type": "Counter",
"isColumn": false
},
"timeAggregation": "sum",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "service.name",
"dataType": "string",
"type": "resource",
"isColumn": false
},
"op": "=",
"value": "{{.service}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [],
"legend": "Total Errors",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
},
{
"id": "error-rate",
"title": "Error Rate",
"description": "Error rate over time",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "error_total",
"dataType": "int64",
"type": "Counter",
"isColumn": false
},
"timeAggregation": "rate",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "service.name",
"dataType": "string",
"type": "resource",
"isColumn": false
},
"op": "=",
"value": "{{.service}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
}
],
"legend": "{{serviceName}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "errors/s"
},
{
"id": "http-5xx",
"title": "HTTP 5xx Errors",
"description": "Server errors (5xx status codes)",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "http_server_requests_seconds_count",
"dataType": "int64",
"type": "Counter",
"isColumn": false
},
"timeAggregation": "sum",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "service.name",
"dataType": "string",
"type": "resource",
"isColumn": false
},
"op": "=",
"value": "{{.service}}"
},
{
"key": {
"key": "status_code",
"dataType": "string",
"type": "tag",
"isColumn": false
},
"op": "=~",
"value": "5.."
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
},
{
"key": "status_code",
"dataType": "string",
"type": "tag",
"isColumn": false
}
],
"legend": "{{serviceName}} - {{status_code}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "number"
},
{
"id": "http-4xx",
"title": "HTTP 4xx Errors",
"description": "Client errors (4xx status codes)",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "http_server_requests_seconds_count",
"dataType": "int64",
"type": "Counter",
"isColumn": false
},
"timeAggregation": "sum",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "service.name",
"dataType": "string",
"type": "resource",
"isColumn": false
},
"op": "=",
"value": "{{.service}}"
},
{
"key": {
"key": "status_code",
"dataType": "string",
"type": "tag",
"isColumn": false
},
"op": "=~",
"value": "4.."
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
},
{
"key": "status_code",
"dataType": "string",
"type": "tag",
"isColumn": false
}
],
"legend": "{{serviceName}} - {{status_code}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "number"
}
]
}


@@ -0,0 +1,213 @@
{
"name": "Bakery IA Dashboard Collection",
"description": "Complete set of SigNoz dashboards for Bakery IA monitoring",
"version": "1.0.0",
"author": "Bakery IA Team",
"license": "MIT",
"dashboards": [
{
"id": "infrastructure-monitoring",
"name": "Infrastructure Monitoring",
"description": "Kubernetes infrastructure and resource monitoring",
"file": "infrastructure-monitoring.json",
"tags": ["infrastructure", "kubernetes", "system"],
"category": "infrastructure"
},
{
"id": "application-performance",
"name": "Application Performance",
"description": "Microservice performance and API metrics",
"file": "application-performance.json",
"tags": ["application", "performance", "apm"],
"category": "performance"
},
    {
      "id": "database-performance",
      "name": "Database Performance",
      "description": "PostgreSQL, Redis, and RabbitMQ database monitoring",
      "file": "database-performance.json",
      "tags": ["database", "postgresql", "redis", "rabbitmq"],
      "category": "database"
    },
{
"id": "api-performance",
"name": "API Performance",
"description": "REST and GraphQL API performance monitoring",
"file": "api-performance.json",
"tags": ["api", "rest", "graphql"],
"category": "api"
},
{
"id": "error-tracking",
"name": "Error Tracking",
"description": "System error tracking and analysis",
"file": "error-tracking.json",
"tags": ["errors", "exceptions", "tracking"],
"category": "monitoring"
},
{
"id": "user-activity",
"name": "User Activity",
"description": "User behavior and activity monitoring",
"file": "user-activity.json",
"tags": ["user", "activity", "behavior"],
"category": "user"
},
{
"id": "system-health",
"name": "System Health",
"description": "Overall system health monitoring",
"file": "system-health.json",
"tags": ["system", "health", "overview"],
"category": "overview"
},
{
"id": "alert-management",
"name": "Alert Management",
"description": "Alert monitoring and management",
"file": "alert-management.json",
"tags": ["alerts", "notifications", "management"],
"category": "alerts"
},
{
"id": "log-analysis",
"name": "Log Analysis",
"description": "Log search and analysis",
"file": "log-analysis.json",
"tags": ["logs", "search", "analysis"],
"category": "logs"
}
],
"categories": [
{
"id": "infrastructure",
"name": "Infrastructure",
"description": "Kubernetes and system infrastructure monitoring"
},
{
"id": "performance",
"name": "Performance",
"description": "Application and service performance monitoring"
},
{
"id": "database",
"name": "Database",
"description": "Database performance and health monitoring"
},
{
"id": "api",
"name": "API",
"description": "API performance and usage monitoring"
},
{
"id": "monitoring",
"name": "Monitoring",
"description": "Error tracking and system monitoring"
},
{
"id": "user",
"name": "User",
"description": "User activity and behavior monitoring"
},
{
"id": "overview",
"name": "Overview",
"description": "System-wide overview and health dashboards"
},
{
"id": "alerts",
"name": "Alerts",
"description": "Alert management and monitoring"
},
{
"id": "logs",
"name": "Logs",
"description": "Log analysis and search"
}
],
"usage": {
"import_methods": [
"ui_import",
"api_import",
"kubernetes_configmap"
],
"recommended_import_order": [
"infrastructure-monitoring",
"system-health",
"application-performance",
"api-performance",
"database-performance",
"error-tracking",
"alert-management",
"log-analysis",
"user-activity"
]
},
"requirements": {
"signoz_version": ">= 0.10.0",
"opentelemetry_collector": ">= 0.45.0",
"metrics": [
"container_cpu_usage_seconds_total",
"container_memory_working_set_bytes",
"http_server_requests_seconds_count",
"http_server_requests_seconds_sum",
"pg_stat_activity_count",
"pg_stat_statements_total_time",
"error_total",
"alerts_total",
"kube_pod_status_phase",
"container_network_receive_bytes_total",
"kube_pod_container_status_restarts_total",
"kube_pod_container_status_ready",
"container_fs_reads_total",
"kube_pod_status_phase",
"kube_pod_container_status_restarts_total",
"kube_pod_container_status_ready",
"container_fs_reads_total",
"kubernetes_events",
"http_server_requests_seconds_bucket",
"http_server_active_requests",
"http_server_up",
"db_query_duration_seconds_sum",
"db_connections_active",
"http_client_request_duration_seconds_count",
"http_client_request_duration_seconds_sum",
"graphql_execution_time_seconds",
"graphql_errors_total",
"pg_stat_database_blks_hit",
"pg_stat_database_xact_commit",
"pg_locks_count",
"pg_table_size_bytes",
"pg_stat_user_tables_seq_scan",
"redis_memory_used_bytes",
"redis_commands_processed_total",
"redis_keyspace_hits",
"pg_stat_database_deadlocks",
"pg_stat_database_conn_errors",
"pg_replication_lag_bytes",
"pg_replication_is_replica",
"active_users",
"user_sessions_total",
"api_calls_per_user",
"session_duration_seconds",
"system_availability",
"service_health_score",
"system_cpu_usage",
"system_memory_usage",
"service_availability",
"alerts_active",
"alerts_total",
"log_lines_total"
]
},
"support": {
"documentation": "https://signoz.io/docs/",
"bakery_ia_docs": "../SIGNOZ_COMPLETE_CONFIGURATION_GUIDE.md",
"issues": "https://github.com/your-repo/issues"
},
"notes": {
"format_fix": "All dashboards have been updated to use the correct SigNoz JSON schema with proper filter arrays to resolve the 'e.filter is not a function' error.",
"compatibility": "Tested with SigNoz v0.10.0+ and OpenTelemetry Collector v0.45.0+",
"customization": "You can customize these dashboards by editing the JSON files or cloning them in the SigNoz UI"
}
}
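Note: a minimal sketch of the "kubernetes_configmap" import method listed in the manifest above, bundling the dashboard files in the recommended import order; the ConfigMap name and the assumption that your SigNoz installation is set up to read dashboards from a ConfigMap are illustrative only.

```bash
# Pack all dashboard JSON files into one ConfigMap, in the recommended import order (sketch).
NAMESPACE="bakery-ia"
args=()
for f in infrastructure-monitoring.json system-health.json application-performance.json \
         api-performance.json database-performance.json error-tracking.json \
         alert-management.json log-analysis.json user-activity.json; do
  args+=(--from-file="$f")
done
kubectl create configmap signoz-dashboards "${args[@]}" -n "$NAMESPACE" \
  --dry-run=client -o yaml | kubectl apply -f -
```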


@@ -0,0 +1,437 @@
{
"description": "Comprehensive infrastructure monitoring dashboard for Bakery IA Kubernetes cluster",
"tags": ["infrastructure", "kubernetes", "k8s", "system"],
"name": "bakery-ia-infrastructure-monitoring",
"title": "Bakery IA - Infrastructure Monitoring",
"uploadedGrafana": false,
"uuid": "bakery-ia-infra-01",
"version": "v4",
"collapsableRowsMigrated": true,
"layout": [
{
"x": 0,
"y": 0,
"w": 6,
"h": 3,
"i": "pod-count",
"moved": false,
"static": false
},
{
"x": 6,
"y": 0,
"w": 6,
"h": 3,
"i": "pod-phase",
"moved": false,
"static": false
},
{
"x": 0,
"y": 3,
"w": 6,
"h": 3,
"i": "container-restarts",
"moved": false,
"static": false
},
{
"x": 6,
"y": 3,
"w": 6,
"h": 3,
"i": "node-condition",
"moved": false,
"static": false
},
{
"x": 0,
"y": 6,
"w": 12,
"h": 3,
"i": "deployment-status",
"moved": false,
"static": false
}
],
"variables": {
"namespace": {
"id": "namespace-var",
"name": "namespace",
"description": "Filter by Kubernetes namespace",
"type": "QUERY",
"queryValue": "SELECT DISTINCT(resource_attrs['k8s.namespace.name']) as value FROM signoz_metrics.distributed_time_series_v4_1day WHERE metric_name = 'k8s.pod.phase' AND value != '' ORDER BY value",
"customValue": "",
"textboxValue": "",
"showALLOption": true,
"multiSelect": false,
"order": 1,
"modificationUUID": "",
"sort": "ASC",
"selectedValue": "bakery-ia"
}
},
"widgets": [
{
"id": "pod-count",
"title": "Total Pods",
"description": "Total number of pods in the namespace",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "value",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "count",
"aggregateAttribute": {
"key": "k8s.pod.phase",
"dataType": "int64",
"type": "Gauge",
"isColumn": false
},
"timeAggregation": "latest",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"id": "filter-k8s-namespace",
"key": {
"id": "k8s.namespace.name--string--tag--false",
"key": "k8s.namespace.name",
"dataType": "string",
"type": "tag",
"isColumn": false
},
"op": "=",
"value": "{{.namespace}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [],
"legend": "Total Pods",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
},
{
"id": "pod-phase",
"title": "Pod Phase Distribution",
"description": "Pods by phase (Running, Pending, Failed, etc.)",
"isStacked": true,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "k8s.pod.phase",
"dataType": "int64",
"type": "Gauge",
"isColumn": false
},
"timeAggregation": "latest",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"id": "filter-k8s-namespace",
"key": {
"id": "k8s.namespace.name--string--tag--false",
"key": "k8s.namespace.name",
"dataType": "string",
"type": "tag",
"isColumn": false
},
"op": "=",
"value": "{{.namespace}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "phase",
"dataType": "string",
"type": "tag",
"isColumn": false
}
],
"legend": "{{phase}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
},
{
"id": "container-restarts",
"title": "Container Restarts",
"description": "Container restart count over time",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "k8s.container.restarts",
"dataType": "int64",
"type": "Gauge",
"isColumn": false
},
"timeAggregation": "increase",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"id": "filter-k8s-namespace",
"key": {
"id": "k8s.namespace.name--string--tag--false",
"key": "k8s.namespace.name",
"dataType": "string",
"type": "tag",
"isColumn": false
},
"op": "=",
"value": "{{.namespace}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"id": "k8s.pod.name--string--tag--false",
"key": "k8s.pod.name",
"dataType": "string",
"type": "tag",
"isColumn": false
}
],
"legend": "{{k8s.pod.name}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
},
    {
      "id": "node-condition",
      "title": "Node Conditions",
      "description": "Node Ready condition status per node (1 = Ready)",
"isStacked": true,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "k8s.node.condition_ready",
"dataType": "int64",
"type": "Gauge",
"isColumn": false
},
"timeAggregation": "latest",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"id": "k8s.node.name--string--tag--false",
"key": "k8s.node.name",
"dataType": "string",
"type": "tag",
"isColumn": false
}
],
"legend": "{{k8s.node.name}} Ready",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
},
{
"id": "deployment-status",
"title": "Deployment Status (Desired vs Available)",
"description": "Deployment replicas: desired vs available",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "avg",
"aggregateAttribute": {
"key": "k8s.deployment.desired",
"dataType": "int64",
"type": "Gauge",
"isColumn": false
},
"timeAggregation": "latest",
"spaceAggregation": "avg",
"functions": [],
"filters": {
"items": [
{
"id": "filter-k8s-namespace",
"key": {
"id": "k8s.namespace.name--string--tag--false",
"key": "k8s.namespace.name",
"dataType": "string",
"type": "tag",
"isColumn": false
},
"op": "=",
"value": "{{.namespace}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"id": "k8s.deployment.name--string--tag--false",
"key": "k8s.deployment.name",
"dataType": "string",
"type": "tag",
"isColumn": false
}
],
"legend": "{{k8s.deployment.name}} (desired)",
"reduceTo": "avg"
},
{
"dataSource": "metrics",
"queryName": "B",
"aggregateOperator": "avg",
"aggregateAttribute": {
"key": "k8s.deployment.available",
"dataType": "int64",
"type": "Gauge",
"isColumn": false
},
"timeAggregation": "latest",
"spaceAggregation": "avg",
"functions": [],
"filters": {
"items": [
{
"id": "filter-k8s-namespace",
"key": {
"id": "k8s.namespace.name--string--tag--false",
"key": "k8s.namespace.name",
"dataType": "string",
"type": "tag",
"isColumn": false
},
"op": "=",
"value": "{{.namespace}}"
}
],
"op": "AND"
},
"expression": "B",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"id": "k8s.deployment.name--string--tag--false",
"key": "k8s.deployment.name",
"dataType": "string",
"type": "tag",
"isColumn": false
}
],
"legend": "{{k8s.deployment.name}} (available)",
"reduceTo": "avg"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
}
]
}
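Note: the k8s.pod.phase, k8s.container.restarts, k8s.node.* and k8s.deployment.* series used above come from cluster-level collectors rather than application instrumentation. With the SigNoz Helm repository this is usually provided by the signoz/k8s-infra chart; the value key below is an assumption and should be checked against the chart's documentation.

```bash
# Install the Kubernetes infrastructure collectors and point them at the SigNoz collector (sketch).
helm repo add signoz https://charts.signoz.io && helm repo update
helm upgrade --install k8s-infra signoz/k8s-infra -n bakery-ia \
  --set otelCollectorEndpoint=signoz-otel-collector.bakery-ia.svc.cluster.local:4317
```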


@@ -0,0 +1,333 @@
{
"description": "Comprehensive log analysis and search dashboard",
"tags": ["logs", "analysis", "search"],
"name": "bakery-ia-log-analysis",
"title": "Bakery IA - Log Analysis",
"uploadedGrafana": false,
"uuid": "bakery-ia-logs-01",
"version": "v4",
"collapsableRowsMigrated": true,
"layout": [
{
"x": 0,
"y": 0,
"w": 6,
"h": 3,
"i": "log-volume",
"moved": false,
"static": false
},
{
"x": 6,
"y": 0,
"w": 6,
"h": 3,
"i": "error-logs",
"moved": false,
"static": false
},
{
"x": 0,
"y": 3,
"w": 6,
"h": 3,
"i": "logs-by-level",
"moved": false,
"static": false
},
{
"x": 6,
"y": 3,
"w": 6,
"h": 3,
"i": "logs-by-service",
"moved": false,
"static": false
}
],
"variables": {
"service": {
"id": "service-var",
"name": "service",
"description": "Filter by service name",
"type": "QUERY",
"queryValue": "SELECT DISTINCT(resource_attrs['service.name']) as value FROM signoz_metrics.distributed_time_series_v4_1day WHERE metric_name = 'log_lines_total' AND value != '' ORDER BY value",
"customValue": "",
"textboxValue": "",
"showALLOption": true,
"multiSelect": false,
"order": 1,
"modificationUUID": "",
"sort": "ASC",
"selectedValue": null
}
},
"widgets": [
{
"id": "log-volume",
"title": "Log Volume",
"description": "Total log volume by service",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "log_lines_total",
"dataType": "int64",
"type": "Counter",
"isColumn": false
},
"timeAggregation": "rate",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "{{.service}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
}
],
"legend": "{{serviceName}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "logs/s"
},
{
"id": "error-logs",
"title": "Error Logs",
"description": "Error log volume by service",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "log_lines_total",
"dataType": "int64",
"type": "Counter",
"isColumn": false
},
"timeAggregation": "rate",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "{{.service}}"
},
{
"key": {
"key": "level",
"dataType": "string",
"type": "tag",
"isColumn": false
},
"op": "=",
"value": "error"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
}
],
"legend": "{{serviceName}} (errors)",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "logs/s"
},
{
"id": "logs-by-level",
"title": "Logs by Level",
"description": "Distribution of logs by severity level",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "pie",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "log_lines_total",
"dataType": "int64",
"type": "Counter",
"isColumn": false
},
"timeAggregation": "sum",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "{{.service}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "level",
"dataType": "string",
"type": "tag",
"isColumn": false
}
],
"legend": "{{level}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
},
{
"id": "logs-by-service",
"title": "Logs by Service",
"description": "Distribution of logs by service",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "pie",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "sum",
"aggregateAttribute": {
"key": "log_lines_total",
"dataType": "int64",
"type": "Counter",
"isColumn": false
},
"timeAggregation": "sum",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "{{.service}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
}
],
"legend": "{{serviceName}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
}
]
}


@@ -0,0 +1,303 @@
{
"description": "Comprehensive system health monitoring dashboard",
"tags": ["system", "health", "monitoring"],
"name": "bakery-ia-system-health",
"title": "Bakery IA - System Health",
"uploadedGrafana": false,
"uuid": "bakery-ia-health-01",
"version": "v4",
"collapsableRowsMigrated": true,
"layout": [
{
"x": 0,
"y": 0,
"w": 6,
"h": 3,
"i": "system-availability",
"moved": false,
"static": false
},
{
"x": 6,
"y": 0,
"w": 6,
"h": 3,
"i": "health-score",
"moved": false,
"static": false
},
{
"x": 0,
"y": 3,
"w": 6,
"h": 3,
"i": "cpu-usage",
"moved": false,
"static": false
},
{
"x": 6,
"y": 3,
"w": 6,
"h": 3,
"i": "memory-usage",
"moved": false,
"static": false
}
],
"variables": {
"namespace": {
"id": "namespace-var",
"name": "namespace",
"description": "Filter by Kubernetes namespace",
"type": "QUERY",
"queryValue": "SELECT DISTINCT(resource_attrs['k8s.namespace.name']) as value FROM signoz_metrics.distributed_time_series_v4_1day WHERE metric_name = 'system_availability' AND value != '' ORDER BY value",
"customValue": "",
"textboxValue": "",
"showALLOption": true,
"multiSelect": false,
"order": 1,
"modificationUUID": "",
"sort": "ASC",
"selectedValue": "bakery-ia"
}
},
"widgets": [
{
"id": "system-availability",
"title": "System Availability",
"description": "Overall system availability percentage",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "value",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "avg",
"aggregateAttribute": {
"key": "system_availability",
"dataType": "float64",
"type": "Gauge",
"isColumn": false
},
"timeAggregation": "latest",
"spaceAggregation": "avg",
"functions": [],
"filters": {
"items": [
{
"id": "filter-k8s-namespace",
"key": {
"id": "k8s.namespace.name--string--tag--false",
"key": "k8s.namespace.name",
"dataType": "string",
"type": "tag",
"isColumn": false
},
"op": "=",
"value": "{{.namespace}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [],
"legend": "System Availability",
"reduceTo": "avg"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "percent"
},
{
"id": "health-score",
"title": "Service Health Score",
"description": "Overall service health score",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "value",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "avg",
"aggregateAttribute": {
"key": "service_health_score",
"dataType": "float64",
"type": "Gauge",
"isColumn": false
},
"timeAggregation": "latest",
"spaceAggregation": "avg",
"functions": [],
"filters": {
"items": [
{
"id": "filter-k8s-namespace",
"key": {
"id": "k8s.namespace.name--string--tag--false",
"key": "k8s.namespace.name",
"dataType": "string",
"type": "tag",
"isColumn": false
},
"op": "=",
"value": "{{.namespace}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [],
"legend": "Health Score",
"reduceTo": "avg"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
},
{
"id": "cpu-usage",
"title": "CPU Usage",
"description": "System CPU usage over time",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "avg",
"aggregateAttribute": {
"key": "system_cpu_usage",
"dataType": "float64",
"type": "Gauge",
"isColumn": false
},
"timeAggregation": "avg",
"spaceAggregation": "avg",
"functions": [],
"filters": {
"items": [
{
"id": "filter-k8s-namespace",
"key": {
"id": "k8s.namespace.name--string--tag--false",
"key": "k8s.namespace.name",
"dataType": "string",
"type": "tag",
"isColumn": false
},
"op": "=",
"value": "{{.namespace}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [],
"legend": "CPU Usage",
"reduceTo": "avg"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "percent"
},
{
"id": "memory-usage",
"title": "Memory Usage",
"description": "System memory usage over time",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "metrics",
"queryName": "A",
"aggregateOperator": "avg",
"aggregateAttribute": {
"key": "system_memory_usage",
"dataType": "float64",
"type": "Gauge",
"isColumn": false
},
"timeAggregation": "avg",
"spaceAggregation": "avg",
"functions": [],
"filters": {
"items": [
{
"id": "filter-k8s-namespace",
"key": {
"id": "k8s.namespace.name--string--tag--false",
"key": "k8s.namespace.name",
"dataType": "string",
"type": "tag",
"isColumn": false
},
"op": "=",
"value": "{{.namespace}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [],
"legend": "Memory Usage",
"reduceTo": "avg"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "percent"
}
]
}
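Note: system_availability, service_health_score, system_cpu_usage, and system_memory_usage are custom gauges that the Bakery IA services must export over OTLP; neither SigNoz nor the collector emits them by default, so this dashboard stays blank until the services publish them. A hedged way to confirm ingestion directly in ClickHouse (the pod label selector is an assumption about the SigNoz chart, and the table name is taken from the variable query above):

```bash
# Check whether the custom health gauges have ever been ingested (sketch).
CH_POD=$(kubectl get pods -n bakery-ia -l app.kubernetes.io/name=clickhouse -o name | head -n1)
kubectl exec -n bakery-ia "$CH_POD" -- clickhouse-client -q "SELECT DISTINCT metric_name FROM signoz_metrics.distributed_time_series_v4_1day WHERE metric_name IN ('system_availability', 'service_health_score', 'system_cpu_usage', 'system_memory_usage')"
```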


@@ -0,0 +1,429 @@
{
"description": "User activity and behavior monitoring dashboard",
"tags": ["user", "activity", "behavior"],
"name": "bakery-ia-user-activity",
"title": "Bakery IA - User Activity",
"uploadedGrafana": false,
"uuid": "bakery-ia-user-01",
"version": "v4",
"collapsableRowsMigrated": true,
"layout": [
{
"x": 0,
"y": 0,
"w": 6,
"h": 3,
"i": "active-users",
"moved": false,
"static": false
},
{
"x": 6,
"y": 0,
"w": 6,
"h": 3,
"i": "user-sessions",
"moved": false,
"static": false
},
{
"x": 0,
"y": 3,
"w": 6,
"h": 3,
"i": "user-actions",
"moved": false,
"static": false
},
{
"x": 6,
"y": 3,
"w": 6,
"h": 3,
"i": "page-views",
"moved": false,
"static": false
},
{
"x": 0,
"y": 6,
"w": 12,
"h": 4,
"i": "geo-visitors",
"moved": false,
"static": false
}
],
"variables": {
"service": {
"id": "service-var",
"name": "service",
"description": "Filter by service name",
"type": "QUERY",
"queryValue": "SELECT DISTINCT(serviceName) FROM signoz_traces.distributed_signoz_index_v2 ORDER BY serviceName",
"customValue": "",
"textboxValue": "",
"showALLOption": true,
"multiSelect": false,
"order": 1,
"modificationUUID": "",
"sort": "ASC",
"selectedValue": "bakery-frontend"
}
},
"widgets": [
{
"id": "active-users",
"title": "Active Users",
"description": "Number of active users by service",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "traces",
"queryName": "A",
"aggregateOperator": "count_distinct",
"aggregateAttribute": {
"key": "user.id",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"timeAggregation": "count_distinct",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "{{.service}}"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
}
],
"legend": "{{serviceName}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
},
{
"id": "user-sessions",
"title": "User Sessions",
"description": "Total user sessions by service",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "traces",
"queryName": "A",
"aggregateOperator": "count",
"aggregateAttribute": {
"key": "session.id",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"timeAggregation": "count",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "{{.service}}"
},
{
"key": {
"key": "span.name",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "user_session"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
}
],
"legend": "{{serviceName}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
},
{
"id": "user-actions",
"title": "User Actions",
"description": "Total user actions by service",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "traces",
"queryName": "A",
"aggregateOperator": "count",
"aggregateAttribute": {
"key": "user.action",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"timeAggregation": "count",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "{{.service}}"
},
{
"key": {
"key": "span.name",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "user_action"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
}
],
"legend": "{{serviceName}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
},
{
"id": "page-views",
"title": "Page Views",
"description": "Total page views by service",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "graph",
"query": {
"builder": {
"queryData": [
{
"dataSource": "traces",
"queryName": "A",
"aggregateOperator": "count",
"aggregateAttribute": {
"key": "page.path",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"timeAggregation": "count",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "{{.service}}"
},
{
"key": {
"key": "span.name",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "page_view"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [
{
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
}
],
"legend": "{{serviceName}}",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
},
{
"id": "geo-visitors",
"title": "Geolocation Visitors",
"description": "Number of visitors who shared location data",
"isStacked": false,
"nullZeroValues": "zero",
"opacity": "1",
"panelTypes": "value",
"query": {
"builder": {
"queryData": [
{
"dataSource": "traces",
"queryName": "A",
"aggregateOperator": "count",
"aggregateAttribute": {
"key": "user.id",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"timeAggregation": "count",
"spaceAggregation": "sum",
"functions": [],
"filters": {
"items": [
{
"key": {
"key": "serviceName",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "{{.service}}"
},
{
"key": {
"key": "span.name",
"dataType": "string",
"type": "tag",
"isColumn": true
},
"op": "=",
"value": "user_location"
}
],
"op": "AND"
},
"expression": "A",
"disabled": false,
"having": [],
"stepInterval": 60,
"limit": null,
"orderBy": [],
"groupBy": [],
"legend": "Visitors with Location Data (See GEOLOCATION_VISUALIZATION_GUIDE.md for map integration)",
"reduceTo": "sum"
}
],
"queryFormulas": []
},
"queryType": "builder"
},
"fillSpans": false,
"yAxisUnit": "none"
}
]
}


@@ -0,0 +1,392 @@
#!/bin/bash
# ============================================================================
# SigNoz Deployment Script for Bakery IA
# ============================================================================
# This script deploys SigNoz monitoring stack using Helm
# Supports both development and production environments
# ============================================================================
set -e
# Color codes for output
# ANSI-C quoting ($'...') so plain echo prints real escape sequences instead of the literal text \033
RED=$'\033[0;31m'
GREEN=$'\033[0;32m'
YELLOW=$'\033[1;33m'
BLUE=$'\033[0;34m'
NC=$'\033[0m' # No Color
# Function to display help
show_help() {
echo "Usage: $0 [OPTIONS] ENVIRONMENT"
echo ""
echo "Deploy SigNoz monitoring stack for Bakery IA"
echo ""
    echo "Arguments:"
    echo "  ENVIRONMENT                Environment to deploy to (dev|prod)"
echo ""
    echo "Options:"
    echo "  -h, --help                 Show this help message"
    echo "  -d, --dry-run              Dry run - show what would be done without actually deploying"
    echo "  -u, --upgrade              Upgrade existing deployment"
    echo "  -r, --remove               Remove/Uninstall SigNoz deployment"
    echo "  -n, --namespace NAMESPACE  Specify namespace (default: bakery-ia)"
echo ""
    echo "Examples:"
    echo "  $0 dev              # Deploy to development"
    echo "  $0 prod             # Deploy to production"
    echo "  $0 --upgrade prod   # Upgrade production deployment"
    echo "  $0 --remove dev     # Remove development deployment"
echo ""
echo "Docker Hub Authentication:"
echo " This script automatically creates a Docker Hub secret for image pulls."
echo " Provide credentials via environment variables (recommended):"
echo " export DOCKERHUB_USERNAME='your-username'"
echo " export DOCKERHUB_PASSWORD='your-personal-access-token'"
echo " Or ensure you're logged in with Docker CLI:"
echo " docker login"
}
# Parse command line arguments
DRY_RUN=false
UPGRADE=false
REMOVE=false
NAMESPACE="bakery-ia"
while [[ $# -gt 0 ]]; do
case $1 in
-h|--help)
show_help
exit 0
;;
-d|--dry-run)
DRY_RUN=true
shift
;;
-u|--upgrade)
UPGRADE=true
shift
;;
-r|--remove)
REMOVE=true
shift
;;
-n|--namespace)
NAMESPACE="$2"
shift 2
;;
dev|prod)
ENVIRONMENT="$1"
shift
;;
*)
echo "Unknown argument: $1"
show_help
exit 1
;;
esac
done
# Validate environment
if [[ -z "$ENVIRONMENT" ]]; then
echo "Error: Environment not specified. Use 'dev' or 'prod'."
show_help
exit 1
fi
if [[ "$ENVIRONMENT" != "dev" && "$ENVIRONMENT" != "prod" ]]; then
echo "Error: Invalid environment. Use 'dev' or 'prod'."
exit 1
fi
# Function to check if Helm is installed
check_helm() {
if ! command -v helm &> /dev/null; then
echo "${RED}Error: Helm is not installed. Please install Helm first.${NC}"
echo "Installation instructions: https://helm.sh/docs/intro/install/"
exit 1
fi
}
# Function to check if kubectl is configured
check_kubectl() {
if ! kubectl cluster-info &> /dev/null; then
echo "${RED}Error: kubectl is not configured or cannot connect to cluster.${NC}"
echo "Please ensure you have access to a Kubernetes cluster."
exit 1
fi
}
# Function to check if namespace exists, create if not
ensure_namespace() {
if ! kubectl get namespace "$NAMESPACE" &> /dev/null; then
echo "${BLUE}Creating namespace $NAMESPACE...${NC}"
if [[ "$DRY_RUN" == true ]]; then
echo " (dry-run) Would create namespace $NAMESPACE"
else
kubectl create namespace "$NAMESPACE"
echo "${GREEN}Namespace $NAMESPACE created.${NC}"
fi
else
echo "${BLUE}Namespace $NAMESPACE already exists.${NC}"
fi
}
# Function to create Docker Hub secret for image pulls
create_dockerhub_secret() {
echo "${BLUE}Setting up Docker Hub image pull secret...${NC}"
if [[ "$DRY_RUN" == true ]]; then
echo " (dry-run) Would create Docker Hub secret in namespace $NAMESPACE"
return
fi
# Check if secret already exists
if kubectl get secret dockerhub-creds -n "$NAMESPACE" &> /dev/null; then
echo "${GREEN}Docker Hub secret already exists in namespace $NAMESPACE.${NC}"
return
fi
# Check if Docker Hub credentials are available
if [[ -n "$DOCKERHUB_USERNAME" ]] && [[ -n "$DOCKERHUB_PASSWORD" ]]; then
echo "${BLUE}Found DOCKERHUB_USERNAME and DOCKERHUB_PASSWORD environment variables${NC}"
kubectl create secret docker-registry dockerhub-creds \
--docker-server=https://index.docker.io/v1/ \
--docker-username="$DOCKERHUB_USERNAME" \
--docker-password="$DOCKERHUB_PASSWORD" \
--docker-email="${DOCKERHUB_EMAIL:-noreply@bakery-ia.local}" \
-n "$NAMESPACE"
echo "${GREEN}Docker Hub secret created successfully.${NC}"
elif [[ -f "$HOME/.docker/config.json" ]]; then
echo "${BLUE}Attempting to use Docker CLI credentials...${NC}"
# Try to extract credentials from Docker config
if grep -q "credsStore" "$HOME/.docker/config.json"; then
echo "${YELLOW}Docker is using a credential store. Please set environment variables:${NC}"
echo " export DOCKERHUB_USERNAME='your-username'"
echo " export DOCKERHUB_PASSWORD='your-password-or-token'"
echo "${YELLOW}Continuing without Docker Hub authentication...${NC}"
return
fi
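        # The auth extraction below requires jq; skip gracefully if it is missing
        # (same command -v pattern as the helm check above) instead of aborting under set -e.
        if ! command -v jq &> /dev/null; then
            echo "${YELLOW}jq not found; cannot parse ~/.docker/config.json. Continuing without Docker Hub authentication...${NC}"
            return
        fi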
# Try to extract from base64 encoded auth
        AUTH=$(jq -r '.auths["https://index.docker.io/v1/"].auth // empty' "$HOME/.docker/config.json" 2>/dev/null)
if [[ -n "$AUTH" ]]; then
echo "${GREEN}Found Docker Hub credentials in Docker config${NC}"
local DOCKER_USERNAME=$(echo "$AUTH" | base64 -d | cut -d: -f1)
local DOCKER_PASSWORD=$(echo "$AUTH" | base64 -d | cut -d: -f2-)
kubectl create secret docker-registry dockerhub-creds \
--docker-server=https://index.docker.io/v1/ \
--docker-username="$DOCKER_USERNAME" \
--docker-password="$DOCKER_PASSWORD" \
--docker-email="${DOCKERHUB_EMAIL:-noreply@bakery-ia.local}" \
-n "$NAMESPACE"
echo "${GREEN}Docker Hub secret created successfully.${NC}"
else
echo "${YELLOW}Could not find Docker Hub credentials${NC}"
echo "${YELLOW}To enable automatic Docker Hub authentication:${NC}"
echo " 1. Run 'docker login', OR"
echo " 2. Set environment variables:"
echo " export DOCKERHUB_USERNAME='your-username'"
echo " export DOCKERHUB_PASSWORD='your-password-or-token'"
echo "${YELLOW}Continuing without Docker Hub authentication...${NC}"
fi
else
echo "${YELLOW}Docker Hub credentials not found${NC}"
echo "${YELLOW}To enable automatic Docker Hub authentication:${NC}"
echo " 1. Run 'docker login', OR"
echo " 2. Set environment variables:"
echo " export DOCKERHUB_USERNAME='your-username'"
echo " export DOCKERHUB_PASSWORD='your-password-or-token'"
echo "${YELLOW}Continuing without Docker Hub authentication...${NC}"
fi
echo ""
}
# Function to add and update Helm repository
setup_helm_repo() {
echo "${BLUE}Setting up SigNoz Helm repository...${NC}"
if [[ "$DRY_RUN" == true ]]; then
echo " (dry-run) Would add SigNoz Helm repository"
return
fi
# Add SigNoz Helm repository
if helm repo list | grep -q "^signoz"; then
echo "${BLUE}SigNoz repository already added, updating...${NC}"
helm repo update signoz
else
echo "${BLUE}Adding SigNoz Helm repository...${NC}"
helm repo add signoz https://charts.signoz.io
helm repo update
fi
echo "${GREEN}Helm repository ready.${NC}"
echo ""
}
# Function to deploy SigNoz
deploy_signoz() {
local values_file="infrastructure/helm/signoz-values-$ENVIRONMENT.yaml"
if [[ ! -f "$values_file" ]]; then
echo "${RED}Error: Values file $values_file not found.${NC}"
exit 1
fi
echo "${BLUE}Deploying SigNoz to $ENVIRONMENT environment...${NC}"
echo " Using values file: $values_file"
echo " Target namespace: $NAMESPACE"
echo " Chart version: Latest from signoz/signoz"
if [[ "$DRY_RUN" == true ]]; then
echo " (dry-run) Would deploy SigNoz with:"
echo " helm upgrade --install signoz signoz/signoz -n $NAMESPACE -f $values_file --wait --timeout 15m"
return
fi
# Use upgrade --install to handle both new installations and upgrades
echo "${BLUE}Installing/Upgrading SigNoz...${NC}"
echo "This may take 10-15 minutes..."
helm upgrade --install signoz signoz/signoz \
-n "$NAMESPACE" \
-f "$values_file" \
--wait \
--timeout 15m \
--create-namespace
echo "${GREEN}SigNoz deployment completed.${NC}"
echo ""
# Show deployment status
show_deployment_status
}
# Function to remove SigNoz
remove_signoz() {
echo "${BLUE}Removing SigNoz deployment from namespace $NAMESPACE...${NC}"
if [[ "$DRY_RUN" == true ]]; then
echo " (dry-run) Would remove SigNoz deployment"
return
fi
if helm list -n "$NAMESPACE" | grep -q signoz; then
helm uninstall signoz -n "$NAMESPACE" --wait
echo "${GREEN}SigNoz deployment removed.${NC}"
# PVCs are intentionally left in place; print manual cleanup instructions instead
echo ""
echo "${YELLOW}Note: Persistent Volume Claims (PVCs) were NOT deleted.${NC}"
echo "To delete PVCs and all data, run:"
echo " kubectl delete pvc -n $NAMESPACE -l app.kubernetes.io/instance=signoz"
else
echo "${YELLOW}No SigNoz deployment found in namespace $NAMESPACE.${NC}"
fi
}
# Function to show deployment status
show_deployment_status() {
echo ""
echo "${BLUE}=== SigNoz Deployment Status ===${NC}"
echo ""
# Get pods
echo "Pods:"
kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz
echo ""
# Get services
echo "Services:"
kubectl get svc -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz
echo ""
# Get ingress
echo "Ingress:"
kubectl get ingress -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz
echo ""
# Show access information
show_access_info
}
# Function to show access information
show_access_info() {
echo "${BLUE}=== Access Information ===${NC}"
if [[ "$ENVIRONMENT" == "dev" ]]; then
echo "SigNoz UI: http://monitoring.bakery-ia.local"
echo ""
echo "OpenTelemetry Collector Endpoints (from within cluster):"
echo " gRPC: signoz-otel-collector.$NAMESPACE.svc.cluster.local:4317"
echo " HTTP: signoz-otel-collector.$NAMESPACE.svc.cluster.local:4318"
echo ""
echo "Port-forward for local access:"
echo " kubectl port-forward -n $NAMESPACE svc/signoz 8080:8080"
echo " kubectl port-forward -n $NAMESPACE svc/signoz-otel-collector 4317:4317"
echo " kubectl port-forward -n $NAMESPACE svc/signoz-otel-collector 4318:4318"
else
echo "SigNoz UI: https://monitoring.bakewise.ai"
echo ""
echo "OpenTelemetry Collector Endpoints (from within cluster):"
echo " gRPC: signoz-otel-collector.$NAMESPACE.svc.cluster.local:4317"
echo " HTTP: signoz-otel-collector.$NAMESPACE.svc.cluster.local:4318"
echo ""
echo "External endpoints (if exposed):"
echo " Check ingress configuration for external OTLP endpoints"
fi
echo ""
echo "Default credentials:"
echo " Username: admin@example.com"
echo " Password: admin"
echo ""
echo "Note: Change default password after first login!"
echo ""
}
# Main execution
main() {
echo "${BLUE}"
echo "=========================================="
echo "🚀 SigNoz Deployment for Bakery IA"
echo "=========================================="
echo "${NC}"
# Check prerequisites
check_helm
check_kubectl
# Ensure namespace
ensure_namespace
if [[ "$REMOVE" == true ]]; then
remove_signoz
exit 0
fi
# Setup Helm repository
setup_helm_repo
# Create Docker Hub secret for image pulls
create_dockerhub_secret
# Deploy SigNoz
deploy_signoz
echo "${GREEN}"
echo "=========================================="
echo "✅ SigNoz deployment completed!"
echo "=========================================="
echo "${NC}"
}
# Run main function
main

View File

@@ -0,0 +1,141 @@
#!/bin/bash
# Generate Test Traffic to Services
# This script generates API calls to verify telemetry data collection
set -e
NAMESPACE="bakery-ia"
GREEN='\033[0;32m'
BLUE='\033[0;34m'
YELLOW='\033[1;33m'
NC='\033[0m'
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${BLUE} Generating Test Traffic for SigNoz Verification${NC}"
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo ""
# Check if ingress is accessible
echo -e "${BLUE}Step 1: Verifying Gateway Access${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
GATEWAY_POD=$(kubectl get pods -n $NAMESPACE -l app=gateway --field-selector=status.phase=Running -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || echo "")
if [[ -z "$GATEWAY_POD" ]]; then
echo -e "${YELLOW}⚠ Gateway pod not running. Starting port-forward...${NC}"
# Port forward in background
kubectl port-forward -n $NAMESPACE svc/gateway-service 8000:8000 &
PORT_FORWARD_PID=$!
sleep 3
API_URL="http://localhost:8000"
else
echo -e "${GREEN}✓ Gateway is running: $GATEWAY_POD${NC}"
# Use internal service
API_URL="http://gateway-service.$NAMESPACE.svc.cluster.local:8000"
fi
echo ""
# Function to make API call from inside cluster
make_request() {
local endpoint=$1
local description=$2
echo -e "${BLUE}→ Testing: $description${NC}"
echo " Endpoint: $endpoint"
if [[ -n "$GATEWAY_POD" ]]; then
# Make request from inside the gateway pod
RESPONSE=$(kubectl exec -n $NAMESPACE $GATEWAY_POD -- curl -s -w "\nHTTP_CODE:%{http_code}" "$API_URL$endpoint" 2>/dev/null || echo "FAILED")
else
# Make request from localhost
RESPONSE=$(curl -s -w "\nHTTP_CODE:%{http_code}" "$API_URL$endpoint" 2>/dev/null || echo "FAILED")
fi
if [[ "$RESPONSE" == "FAILED" ]]; then
echo -e " ${YELLOW}⚠ Request failed${NC}"
else
HTTP_CODE=$(echo "$RESPONSE" | grep "HTTP_CODE" | cut -d: -f2)
if [[ "$HTTP_CODE" == "200" ]] || [[ "$HTTP_CODE" == "401" ]] || [[ "$HTTP_CODE" == "404" ]]; then
echo -e " ${GREEN}✓ Response received (HTTP $HTTP_CODE)${NC}"
else
echo -e " ${YELLOW}⚠ Unexpected response (HTTP $HTTP_CODE)${NC}"
fi
fi
echo ""
sleep 1
}
# Generate traffic to various endpoints
echo -e "${BLUE}Step 2: Generating Traffic to Services${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo ""
# Health checks (should generate traces)
make_request "/health" "Gateway Health Check"
make_request "/api/health" "API Health Check"
# Auth service endpoints
make_request "/api/auth/health" "Auth Service Health"
# Tenant service endpoints
make_request "/api/tenants/health" "Tenant Service Health"
# Inventory service endpoints
make_request "/api/inventory/health" "Inventory Service Health"
# Orders service endpoints
make_request "/api/orders/health" "Orders Service Health"
# Forecasting service endpoints
make_request "/api/forecasting/health" "Forecasting Service Health"
echo -e "${BLUE}Step 3: Checking Service Logs for Telemetry${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo ""
# Check a few service pods for tracing logs
SERVICES=("auth-service" "inventory-service" "gateway")
for service in "${SERVICES[@]}"; do
POD=$(kubectl get pods -n $NAMESPACE -l app=$service --field-selector=status.phase=Running -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || echo "")
if [[ -n "$POD" ]]; then
echo -e "${BLUE}Checking $service ($POD)...${NC}"
TRACING_LOG=$(kubectl logs -n $NAMESPACE $POD --tail=100 2>/dev/null | grep -i "tracing\|otel" | head -n 2 || echo "")
if [[ -n "$TRACING_LOG" ]]; then
echo -e "${GREEN}✓ Tracing configured:${NC}"
echo "$TRACING_LOG" | sed 's/^/ /'
else
echo -e "${YELLOW}⚠ No tracing logs found${NC}"
fi
echo ""
fi
done
# Wait for data to be processed
echo -e "${BLUE}Step 4: Waiting for Data Processing${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Waiting 30 seconds for telemetry data to be processed..."
for i in {30..1}; do
echo -ne "\r ${i} seconds remaining..."
sleep 1
done
echo -e "\n"
# Cleanup port-forward if started
if [[ -n "$PORT_FORWARD_PID" ]]; then
kill $PORT_FORWARD_PID 2>/dev/null || true
fi
echo -e "${GREEN}✓ Test traffic generation complete!${NC}"
echo ""
echo -e "${BLUE}Next Steps:${NC}"
echo "1. Run the verification script to check for collected data:"
echo " ./infrastructure/helm/verify-signoz-telemetry.sh"
echo ""
echo "2. Access SigNoz UI to visualize the data:"
echo " https://monitoring.bakery-ia.local"
echo " or"
echo " kubectl port-forward -n bakery-ia svc/signoz 3301:8080"
echo " Then go to: http://localhost:3301"
echo ""
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"

View File

@@ -0,0 +1,175 @@
#!/bin/bash
# SigNoz Dashboard Importer for Bakery IA
# This script imports all SigNoz dashboards into your SigNoz instance
# Configuration
SIGNOZ_HOST="localhost"
SIGNOZ_PORT="3301"
SIGNOZ_API_KEY="" # Add your API key if authentication is required
DASHBOARDS_DIR="infrastructure/signoz/dashboards"
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Function to display help
show_help() {
echo "Usage: $0 [options]"
echo ""
echo "Options:
-h, --host SigNoz host (default: localhost)
-p, --port SigNoz port (default: 3301)
-k, --api-key SigNoz API key (if required)
-d, --dir Dashboards directory (default: infrastructure/signoz/dashboards)
    --help      Show this help message"
echo ""
echo "Example:
$0 --host signoz.example.com --port 3301 --api-key your-api-key"
}
# Parse command line arguments
while [[ $# -gt 0 ]]; do
case $1 in
-h|--host)
SIGNOZ_HOST="$2"
shift 2
;;
-p|--port)
SIGNOZ_PORT="$2"
shift 2
;;
-k|--api-key)
SIGNOZ_API_KEY="$2"
shift 2
;;
-d|--dir)
DASHBOARDS_DIR="$2"
shift 2
;;
--help)
show_help
exit 0
;;
*)
echo "Unknown option: $1"
show_help
exit 1
;;
esac
done
# Check if dashboards directory exists
if [ ! -d "$DASHBOARDS_DIR" ]; then
echo -e "${RED}Error: Dashboards directory not found: $DASHBOARDS_DIR${NC}"
exit 1
fi
# Check if jq is installed for JSON validation
if ! command -v jq &> /dev/null; then
echo -e "${YELLOW}Warning: jq not found. Skipping JSON validation.${NC}"
VALIDATE_JSON=false
else
VALIDATE_JSON=true
fi
# Function to validate JSON
validate_json() {
local file="$1"
if [ "$VALIDATE_JSON" = true ]; then
if ! jq empty "$file" &> /dev/null; then
echo -e "${RED}Error: Invalid JSON in file: $file${NC}"
return 1
fi
fi
return 0
}
# Function to import a single dashboard
import_dashboard() {
local file="$1"
local filename=$(basename "$file")
local dashboard_name=$(jq -r '.name' "$file" 2>/dev/null || echo "Unknown")
echo -e "${BLUE}Importing dashboard: $dashboard_name ($filename)${NC}"
# Prepare curl command
local curl_cmd="curl -s -X POST http://$SIGNOZ_HOST:$SIGNOZ_PORT/api/v1/dashboards/import"
if [ -n "$SIGNOZ_API_KEY" ]; then
curl_cmd="$curl_cmd -H \"Authorization: Bearer $SIGNOZ_API_KEY\""
fi
curl_cmd="$curl_cmd -H \"Content-Type: application/json\" -d @\"$file\""
# Execute import
local response=$(eval "$curl_cmd")
# Check response
if echo "$response" | grep -q "success"; then
echo -e "${GREEN}✓ Successfully imported: $dashboard_name${NC}"
return 0
else
echo -e "${RED}✗ Failed to import: $dashboard_name${NC}"
echo "Response: $response"
return 1
fi
}
# Main import process
echo -e "${YELLOW}=== SigNoz Dashboard Importer for Bakery IA ===${NC}"
echo -e "${BLUE}Configuration:${NC}"
echo " Host: $SIGNOZ_HOST"
echo " Port: $SIGNOZ_PORT"
echo " Dashboards Directory: $DASHBOARDS_DIR"
if [ -n "$SIGNOZ_API_KEY" ]; then
echo " API Key: ******** (set)"
else
echo " API Key: Not configured"
fi
echo ""
# Count dashboards
DASHBOARD_COUNT=$(find "$DASHBOARDS_DIR" -name "*.json" | wc -l)
echo -e "${BLUE}Found $DASHBOARD_COUNT dashboards to import${NC}"
echo ""
# Import each dashboard
SUCCESS_COUNT=0
FAILURE_COUNT=0
for file in "$DASHBOARDS_DIR"/*.json; do
if [ -f "$file" ]; then
# Validate JSON
if validate_json "$file"; then
if import_dashboard "$file"; then
((SUCCESS_COUNT++))
else
((FAILURE_COUNT++))
fi
else
((FAILURE_COUNT++))
fi
echo ""
fi
done
# Summary
echo -e "${YELLOW}=== Import Summary ===${NC}"
echo -e "${GREEN}Successfully imported: $SUCCESS_COUNT dashboards${NC}"
if [ $FAILURE_COUNT -gt 0 ]; then
echo -e "${RED}Failed to import: $FAILURE_COUNT dashboards${NC}"
fi
echo ""
if [ $FAILURE_COUNT -eq 0 ]; then
echo -e "${GREEN}All dashboards imported successfully!${NC}"
echo "You can now access them in your SigNoz UI at:"
echo "http://$SIGNOZ_HOST:$SIGNOZ_PORT/dashboards"
else
echo -e "${YELLOW}Some dashboards failed to import. Check the errors above.${NC}"
exit 1
fi

View File

@@ -0,0 +1,12 @@
# SigNoz Helm Chart Values - Development Environment
# Optimized for local development with minimal resource usage
# DEPLOYED IN bakery-ia NAMESPACE - Ingress managed by bakery-ingress
#
# Official Chart: https://github.com/SigNoz/charts
# Install Command: helm install signoz signoz/signoz -n bakery-ia -f signoz-values-dev.yaml
global:
storageClass: "standard"
clusterName: "bakery-ia-dev"
domain: "monitoring.bakery-ia.local"
# Docker Hub credentials - applied to all sub-charts (including Zookeeper, ClickHouse, etc)

View File

@@ -0,0 +1,12 @@
# SigNoz Helm Chart Values - Production Environment
# High-availability configuration with resource optimization
# DEPLOYED IN bakery-ia NAMESPACE - Ingress managed by bakery-ingress-prod
#
# Official Chart: https://github.com/SigNoz/charts
# Install Command: helm install signoz signoz/signoz -n bakery-ia -f signoz-values-prod.yaml
global:
storageClass: "microk8s-hostpath" # For MicroK8s, use "microk8s-hostpath" or custom storage class
clusterName: "bakery-ia-prod"
domain: "monitoring.bakewise.ai"
# Docker Hub credentials - applied to all sub-charts (including Zookeeper, ClickHouse, etc)

View File

@@ -0,0 +1,177 @@
#!/bin/bash
# SigNoz Telemetry Verification Script
# This script verifies that services are correctly sending metrics, logs, and traces to SigNoz
# and that SigNoz is collecting them properly.
set -e
NAMESPACE="bakery-ia"
GREEN='\033[0;32m'
RED='\033[0;31m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${BLUE} SigNoz Telemetry Verification Script${NC}"
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo ""
# Step 1: Verify SigNoz Components are Running
echo -e "${BLUE}[1/7] Checking SigNoz Components Status...${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
OTEL_POD=$(kubectl get pods -n $NAMESPACE -l app.kubernetes.io/name=signoz,app.kubernetes.io/component=otel-collector --field-selector=status.phase=Running -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || echo "")
SIGNOZ_POD=$(kubectl get pods -n $NAMESPACE -l app.kubernetes.io/name=signoz,app.kubernetes.io/component=signoz --field-selector=status.phase=Running -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || echo "")
CLICKHOUSE_POD=$(kubectl get pods -n $NAMESPACE -l clickhouse.altinity.com/chi=signoz-clickhouse --field-selector=status.phase=Running -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || echo "")
if [[ -n "$OTEL_POD" && -n "$SIGNOZ_POD" && -n "$CLICKHOUSE_POD" ]]; then
echo -e "${GREEN}✓ All SigNoz components are running${NC}"
echo " - OTel Collector: $OTEL_POD"
echo " - SigNoz Frontend: $SIGNOZ_POD"
echo " - ClickHouse: $CLICKHOUSE_POD"
else
echo -e "${RED}✗ Some SigNoz components are not running${NC}"
kubectl get pods -n $NAMESPACE | grep signoz
exit 1
fi
echo ""
# Step 2: Check OTel Collector Endpoints
echo -e "${BLUE}[2/7] Verifying OTel Collector Endpoints...${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
OTEL_SVC=$(kubectl get svc -n $NAMESPACE signoz-otel-collector -o jsonpath='{.spec.clusterIP}')
echo "OTel Collector Service IP: $OTEL_SVC"
echo ""
echo "Available endpoints:"
kubectl get svc -n $NAMESPACE signoz-otel-collector -o jsonpath='{range .spec.ports[*]}{.name}{"\t"}{.port}{"\n"}{end}' | column -t
echo ""
echo -e "${GREEN}✓ OTel Collector endpoints are exposed${NC}"
echo ""
# Step 3: Check OTel Collector Logs for Data Reception
echo -e "${BLUE}[3/7] Checking OTel Collector for Recent Activity...${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Recent OTel Collector logs (last 20 lines):"
kubectl logs -n $NAMESPACE $OTEL_POD --tail=20 | grep -E "received|exported|traces|metrics|logs" || echo "No recent telemetry data found in logs"
echo ""
# Step 4: Check Service Configurations
echo -e "${BLUE}[4/7] Verifying Service Telemetry Configuration...${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
# Check ConfigMap for OTEL settings
OTEL_ENDPOINT=$(kubectl get configmap bakery-config -n $NAMESPACE -o jsonpath='{.data.OTEL_EXPORTER_OTLP_ENDPOINT}')
ENABLE_TRACING=$(kubectl get configmap bakery-config -n $NAMESPACE -o jsonpath='{.data.ENABLE_TRACING}')
ENABLE_METRICS=$(kubectl get configmap bakery-config -n $NAMESPACE -o jsonpath='{.data.ENABLE_METRICS}')
ENABLE_LOGS=$(kubectl get configmap bakery-config -n $NAMESPACE -o jsonpath='{.data.ENABLE_LOGS}')
echo "Configuration from bakery-config ConfigMap:"
echo " OTEL_EXPORTER_OTLP_ENDPOINT: $OTEL_ENDPOINT"
echo " ENABLE_TRACING: $ENABLE_TRACING"
echo " ENABLE_METRICS: $ENABLE_METRICS"
echo " ENABLE_LOGS: $ENABLE_LOGS"
echo ""
if [[ "$ENABLE_TRACING" == "true" && "$ENABLE_METRICS" == "true" && "$ENABLE_LOGS" == "true" ]]; then
echo -e "${GREEN}✓ Telemetry is enabled in configuration${NC}"
else
echo -e "${YELLOW}⚠ Some telemetry features may be disabled${NC}"
fi
echo ""
# Step 5: Test OTel Collector Health
echo -e "${BLUE}[5/7] Testing OTel Collector Health Endpoint...${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
HEALTH_CHECK=$(kubectl exec -n $NAMESPACE $OTEL_POD -- wget -qO- http://localhost:13133/ 2>/dev/null || echo "FAILED")
if [[ "$HEALTH_CHECK" == *"Server available"* ]] || [[ "$HEALTH_CHECK" == "{}" ]]; then
echo -e "${GREEN}✓ OTel Collector health check passed${NC}"
else
echo -e "${RED}✗ OTel Collector health check failed${NC}"
echo "Response: $HEALTH_CHECK"
fi
echo ""
# Step 6: Query ClickHouse for Telemetry Data
echo -e "${BLUE}[6/7] Querying ClickHouse for Telemetry Data...${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
# Get ClickHouse credentials
CH_PASSWORD=$(kubectl get secret -n $NAMESPACE signoz-clickhouse -o jsonpath='{.data.admin-password}' 2>/dev/null | base64 -d || echo "27ff0399-0d3a-4bd8-919d-17c2181e6fb9")
echo "Checking for traces in ClickHouse..."
TRACES_COUNT=$(kubectl exec -n $NAMESPACE $CLICKHOUSE_POD -- clickhouse-client --user=admin --password=$CH_PASSWORD --query="SELECT count() FROM signoz_traces.signoz_index_v2 WHERE timestamp >= now() - INTERVAL 1 HOUR" 2>/dev/null || echo "0")
echo " Traces in last hour: $TRACES_COUNT"
echo "Checking for metrics in ClickHouse..."
METRICS_COUNT=$(kubectl exec -n $NAMESPACE $CLICKHOUSE_POD -- clickhouse-client --user=admin --password=$CH_PASSWORD --query="SELECT count() FROM signoz_metrics.samples_v4 WHERE unix_milli >= toUnixTimestamp(now() - INTERVAL 1 HOUR) * 1000" 2>/dev/null || echo "0")
echo " Metrics in last hour: $METRICS_COUNT"
echo "Checking for logs in ClickHouse..."
LOGS_COUNT=$(kubectl exec -n $NAMESPACE $CLICKHOUSE_POD -- clickhouse-client --user=admin --password=$CH_PASSWORD --query="SELECT count() FROM signoz_logs.logs WHERE timestamp >= now() - INTERVAL 1 HOUR" 2>/dev/null || echo "0")
echo " Logs in last hour: $LOGS_COUNT"
echo ""
if [[ "$TRACES_COUNT" -gt "0" || "$METRICS_COUNT" -gt "0" || "$LOGS_COUNT" -gt "0" ]]; then
echo -e "${GREEN}✓ Telemetry data found in ClickHouse!${NC}"
else
echo -e "${YELLOW}⚠ No telemetry data found in the last hour${NC}"
echo " This might be normal if:"
echo " - Services were just deployed"
echo " - No traffic has been generated yet"
echo " - Services haven't finished initializing"
fi
echo ""
# Step 7: Access Information
echo -e "${BLUE}[7/7] SigNoz UI Access Information${NC}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo ""
echo "SigNoz is accessible via ingress at:"
echo -e " ${GREEN}https://monitoring.bakery-ia.local${NC}"
echo ""
echo "Or via port-forward:"
echo -e " ${YELLOW}kubectl port-forward -n $NAMESPACE svc/signoz 3301:8080${NC}"
echo " Then access: http://localhost:3301"
echo ""
echo "To view OTel Collector metrics:"
echo -e " ${YELLOW}kubectl port-forward -n $NAMESPACE svc/signoz-otel-collector 8888:8888${NC}"
echo " Then access: http://localhost:8888/metrics"
echo ""
# Summary
echo ""
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${BLUE} Verification Summary${NC}"
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo ""
echo "Component Status:"
echo " ✓ SigNoz components running"
echo " ✓ OTel Collector healthy"
echo " ✓ Configuration correct"
echo ""
echo "Data Collection (last hour):"
echo " Traces: $TRACES_COUNT"
echo " Metrics: $METRICS_COUNT"
echo " Logs: $LOGS_COUNT"
echo ""
if [[ "$TRACES_COUNT" -gt "0" || "$METRICS_COUNT" -gt "0" || "$LOGS_COUNT" -gt "0" ]]; then
echo -e "${GREEN}✓ SigNoz is collecting telemetry data successfully!${NC}"
else
echo -e "${YELLOW}⚠ To generate telemetry data, try:${NC}"
echo ""
echo "1. Generate traffic to your services:"
echo " curl http://localhost/api/health"
echo ""
echo "2. Check service logs for tracing initialization:"
echo " kubectl logs -n $NAMESPACE <service-pod> | grep -i 'tracing\\|otel\\|signoz'"
echo ""
echo "3. Wait a few minutes and run this script again"
fi
echo ""
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"

View File

@@ -0,0 +1,446 @@
#!/bin/bash
# ============================================================================
# SigNoz Verification Script for Bakery IA
# ============================================================================
# This script verifies that SigNoz is properly deployed and functioning
# ============================================================================
set -e
# Color codes for output
# ANSI-C quoted so plain echo (without -e) renders the escape codes
RED=$'\033[0;31m'
GREEN=$'\033[0;32m'
YELLOW=$'\033[1;33m'
BLUE=$'\033[0;34m'
NC=$'\033[0m' # No Color
# Function to display help
show_help() {
echo "Usage: $0 [OPTIONS] ENVIRONMENT"
echo ""
echo "Verify SigNoz deployment for Bakery IA"
echo ""
echo "Arguments:
ENVIRONMENT Environment to verify (dev|prod)"
echo ""
echo "Options:
-h, --help Show this help message
-n, --namespace NAMESPACE Specify namespace (default: bakery-ia)"
echo ""
echo "Examples:
$0 dev # Verify development deployment
$0 prod # Verify production deployment
$0 --namespace monitoring dev # Verify with custom namespace"
}
# Parse command line arguments
NAMESPACE="bakery-ia"
while [[ $# -gt 0 ]]; do
case $1 in
-h|--help)
show_help
exit 0
;;
-n|--namespace)
NAMESPACE="$2"
shift 2
;;
dev|prod)
ENVIRONMENT="$1"
shift
;;
*)
echo "Unknown argument: $1"
show_help
exit 1
;;
esac
done
# Validate environment
if [[ -z "$ENVIRONMENT" ]]; then
echo "Error: Environment not specified. Use 'dev' or 'prod'."
show_help
exit 1
fi
if [[ "$ENVIRONMENT" != "dev" && "$ENVIRONMENT" != "prod" ]]; then
echo "Error: Invalid environment. Use 'dev' or 'prod'."
exit 1
fi
# Function to check if kubectl is configured
check_kubectl() {
if ! kubectl cluster-info &> /dev/null; then
echo "${RED}Error: kubectl is not configured or cannot connect to cluster.${NC}"
echo "Please ensure you have access to a Kubernetes cluster."
exit 1
fi
}
# Function to check namespace exists
check_namespace() {
if ! kubectl get namespace "$NAMESPACE" &> /dev/null; then
echo "${RED}Error: Namespace $NAMESPACE does not exist.${NC}"
echo "Please deploy SigNoz first using: ./deploy-signoz.sh $ENVIRONMENT"
exit 1
fi
}
# Function to verify SigNoz deployment
verify_deployment() {
echo "${BLUE}"
echo "=========================================="
echo "🔍 Verifying SigNoz Deployment"
echo "=========================================="
echo "Environment: $ENVIRONMENT"
echo "Namespace: $NAMESPACE"
echo "${NC}"
echo ""
# Check if SigNoz helm release exists
echo "${BLUE}1. Checking Helm release...${NC}"
if helm list -n "$NAMESPACE" | grep -q signoz; then
echo "${GREEN}✅ SigNoz Helm release found${NC}"
else
echo "${RED}❌ SigNoz Helm release not found${NC}"
echo "Please deploy SigNoz first using: ./deploy-signoz.sh $ENVIRONMENT"
exit 1
fi
echo ""
# Check pod status
echo "${BLUE}2. Checking pod status...${NC}"
local total_pods=$(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz 2>/dev/null | grep -v "NAME" | wc -l | tr -d ' ' || echo "0")
local running_pods=$(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz --field-selector=status.phase=Running 2>/dev/null | grep -c "Running" || true)
local ready_pods=$(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz 2>/dev/null | grep "Running" | grep "1/1" | wc -l | tr -d ' ' || echo "0")
echo "Total pods: $total_pods"
echo "Running pods: $running_pods"
echo "Ready pods: $ready_pods"
if [[ $total_pods -eq 0 ]]; then
echo "${RED}❌ No SigNoz pods found${NC}"
exit 1
fi
if [[ $running_pods -eq $total_pods ]]; then
echo "${GREEN}✅ All pods are running${NC}"
else
echo "${YELLOW}⚠️ Some pods are not running${NC}"
fi
if [[ $ready_pods -eq $total_pods ]]; then
echo "${GREEN}✅ All pods are ready${NC}"
else
echo "${YELLOW}⚠️ Some pods are not ready${NC}"
fi
echo ""
# Show pod details
echo "${BLUE}Pod Details:${NC}"
kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz
echo ""
# Check services
echo "${BLUE}3. Checking services...${NC}"
local service_count=$(kubectl get svc -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz 2>/dev/null | grep -v "NAME" | wc -l | tr -d ' ' || echo "0")
if [[ $service_count -gt 0 ]]; then
echo "${GREEN}✅ Services found ($service_count services)${NC}"
kubectl get svc -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz
else
echo "${RED}❌ No services found${NC}"
fi
echo ""
# Check ingress
echo "${BLUE}4. Checking ingress...${NC}"
local ingress_count=$(kubectl get ingress -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz 2>/dev/null | grep -v "NAME" | wc -l | tr -d ' ' || echo "0")
if [[ $ingress_count -gt 0 ]]; then
echo "${GREEN}✅ Ingress found ($ingress_count ingress resources)${NC}"
kubectl get ingress -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz
else
echo "${YELLOW}⚠️ No ingress found (may be configured in main namespace)${NC}"
fi
echo ""
# Check PVCs
echo "${BLUE}5. Checking persistent volume claims...${NC}"
local pvc_count=$(kubectl get pvc -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz 2>/dev/null | grep -v "NAME" | wc -l | tr -d ' ' || echo "0")
if [[ $pvc_count -gt 0 ]]; then
echo "${GREEN}✅ PVCs found ($pvc_count PVCs)${NC}"
kubectl get pvc -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz
else
echo "${YELLOW}⚠️ No PVCs found (may not be required for all components)${NC}"
fi
echo ""
# Check resource usage
echo "${BLUE}6. Checking resource usage...${NC}"
if command -v kubectl &> /dev/null && kubectl top pods -n "$NAMESPACE" &> /dev/null; then
echo "${GREEN}✅ Resource usage:${NC}"
kubectl top pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz
else
echo "${YELLOW}⚠️ Metrics server not available or no resource usage data${NC}"
fi
echo ""
# Check logs for errors
echo "${BLUE}7. Checking for errors in logs...${NC}"
local error_found=false
# Check each pod for errors
while IFS= read -r pod; do
if [[ -n "$pod" ]]; then
local pod_errors=$(kubectl logs -n "$NAMESPACE" "$pod" 2>/dev/null | grep -i "error\|exception\|fail\|crash" | wc -l || echo "0")
if [[ $pod_errors -gt 0 ]]; then
echo "${RED}❌ Errors found in pod $pod ($pod_errors errors)${NC}"
error_found=true
fi
fi
done < <(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz -o name | sed 's|pod/||')
if [[ "$error_found" == false ]]; then
echo "${GREEN}✅ No errors found in logs${NC}"
fi
echo ""
# Environment-specific checks
if [[ "$ENVIRONMENT" == "dev" ]]; then
verify_dev_specific
else
verify_prod_specific
fi
# Show access information
show_access_info
}
# Function for development-specific verification
verify_dev_specific() {
echo "${BLUE}8. Development-specific checks...${NC}"
# Check if ingress is configured
if kubectl get ingress -n "$NAMESPACE" 2>/dev/null | grep -q "monitoring.bakery-ia.local"; then
echo "${GREEN}✅ Development ingress configured${NC}"
else
echo "${YELLOW}⚠️ Development ingress not found${NC}"
fi
# Check unified signoz component resource limits (should be lower for dev)
local signoz_mem=$(kubectl get deployment -n "$NAMESPACE" -l app.kubernetes.io/component=query-service -o jsonpath='{.items[0].spec.template.spec.containers[0].resources.limits.memory}' 2>/dev/null || echo "")
if [[ -n "$signoz_mem" ]]; then
echo "${GREEN}✅ SigNoz component found (memory limit: $signoz_mem)${NC}"
else
echo "${YELLOW}⚠️ Could not verify SigNoz component resources${NC}"
fi
# Check single replica setup for dev
local replicas=$(kubectl get deployment -n "$NAMESPACE" -l app.kubernetes.io/component=query-service -o jsonpath='{.items[0].spec.replicas}' 2>/dev/null || echo "0")
if [[ $replicas -eq 1 ]]; then
echo "${GREEN}✅ Single replica configuration (appropriate for dev)${NC}"
else
echo "${YELLOW}⚠️ Multiple replicas detected (replicas: $replicas)${NC}"
fi
echo ""
}
# Function for production-specific verification
verify_prod_specific() {
echo "${BLUE}8. Production-specific checks...${NC}"
# Check if TLS is configured
if kubectl get ingress -n "$NAMESPACE" 2>/dev/null | grep -q "signoz-tls"; then
echo "${GREEN}✅ TLS certificate configured${NC}"
else
echo "${YELLOW}⚠️ TLS certificate not found${NC}"
fi
# Check if multiple replicas are running for HA
local signoz_replicas=$(kubectl get deployment -n "$NAMESPACE" -l app.kubernetes.io/component=query-service -o jsonpath='{.items[0].spec.replicas}' 2>/dev/null || echo "1")
if [[ $signoz_replicas -gt 1 ]]; then
echo "${GREEN}✅ High availability configured ($signoz_replicas SigNoz replicas)${NC}"
else
echo "${YELLOW}⚠️ Single SigNoz replica detected (not highly available)${NC}"
fi
# Check Zookeeper replicas (critical for production)
local zk_replicas=$(kubectl get statefulset -n "$NAMESPACE" -l app.kubernetes.io/component=zookeeper -o jsonpath='{.items[0].spec.replicas}' 2>/dev/null || echo "0")
if [[ $zk_replicas -eq 3 ]]; then
echo "${GREEN}✅ Zookeeper properly configured with 3 replicas${NC}"
elif [[ $zk_replicas -gt 0 ]]; then
echo "${YELLOW}⚠️ Zookeeper has $zk_replicas replicas (recommend 3 for production)${NC}"
else
echo "${RED}❌ Zookeeper not found${NC}"
fi
# Check OTel Collector replicas
local otel_replicas=$(kubectl get deployment -n "$NAMESPACE" -l app.kubernetes.io/component=otel-collector -o jsonpath='{.items[0].spec.replicas}' 2>/dev/null || echo "1")
if [[ $otel_replicas -gt 1 ]]; then
echo "${GREEN}✅ OTel Collector HA configured ($otel_replicas replicas)${NC}"
else
echo "${YELLOW}⚠️ Single OTel Collector replica${NC}"
fi
# Check resource limits (should be higher for prod)
local signoz_mem=$(kubectl get deployment -n "$NAMESPACE" -l app.kubernetes.io/component=query-service -o jsonpath='{.items[0].spec.template.spec.containers[0].resources.limits.memory}' 2>/dev/null || echo "")
if [[ -n "$signoz_mem" ]]; then
echo "${GREEN}✅ Production resource limits applied (memory: $signoz_mem)${NC}"
else
echo "${YELLOW}⚠️ Could not verify resource limits${NC}"
fi
# Check HPA (Horizontal Pod Autoscaler)
local hpa_count=$(kubectl get hpa -n "$NAMESPACE" 2>/dev/null | grep -c signoz || true)
if [[ $hpa_count -gt 0 ]]; then
echo "${GREEN}✅ Horizontal Pod Autoscaler configured${NC}"
else
echo "${YELLOW}⚠️ No HPA found (consider enabling for production)${NC}"
fi
echo ""
}
# Function to show access information
show_access_info() {
echo "${BLUE}"
echo "=========================================="
echo "📋 Access Information"
echo "=========================================="
echo "${NC}"
if [[ "$ENVIRONMENT" == "dev" ]]; then
echo "SigNoz UI: http://monitoring.bakery-ia.local"
echo ""
echo "OpenTelemetry Collector (within cluster):"
echo " gRPC: signoz-otel-collector.$NAMESPACE.svc.cluster.local:4317"
echo " HTTP: signoz-otel-collector.$NAMESPACE.svc.cluster.local:4318"
echo ""
echo "Port-forward for local access:"
echo " kubectl port-forward -n $NAMESPACE svc/signoz 8080:8080"
echo " kubectl port-forward -n $NAMESPACE svc/signoz-otel-collector 4317:4317"
echo " kubectl port-forward -n $NAMESPACE svc/signoz-otel-collector 4318:4318"
else
echo "SigNoz UI: https://monitoring.bakewise.ai"
echo ""
echo "OpenTelemetry Collector (within cluster):"
echo " gRPC: signoz-otel-collector.$NAMESPACE.svc.cluster.local:4317"
echo " HTTP: signoz-otel-collector.$NAMESPACE.svc.cluster.local:4318"
fi
echo ""
echo "Default Credentials:"
echo " Username: admin@example.com"
echo " Password: admin"
echo ""
echo "⚠️ IMPORTANT: Change default password after first login!"
echo ""
# Show connection test commands
echo "Connection Test Commands:"
if [[ "$ENVIRONMENT" == "dev" ]]; then
echo " # Test SigNoz UI"
echo " curl http://monitoring.bakery-ia.local"
echo ""
echo " # Test via port-forward"
echo " kubectl port-forward -n $NAMESPACE svc/signoz 8080:8080"
echo " curl http://localhost:8080"
else
echo " # Test SigNoz UI"
echo " curl https://monitoring.bakewise.ai"
echo ""
echo " # Test API health"
echo " kubectl port-forward -n $NAMESPACE svc/signoz 8080:8080"
echo " curl http://localhost:8080/api/v1/health"
fi
echo ""
}
# Function to run connectivity tests
run_connectivity_tests() {
echo "${BLUE}"
echo "=========================================="
echo "🔗 Running Connectivity Tests"
echo "=========================================="
echo "${NC}"
# Test pod readiness first
echo "Checking pod readiness..."
local ready_pods=$(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz --field-selector=status.phase=Running 2>/dev/null | grep "Running" | grep -c "1/1\|2/2" || true)
local total_pods=$(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=signoz 2>/dev/null | grep -v "NAME" | wc -l | tr -d ' ' || echo "0")
if [[ $ready_pods -eq $total_pods && $total_pods -gt 0 ]]; then
echo "${GREEN}✅ All pods are ready ($ready_pods/$total_pods)${NC}"
else
echo "${YELLOW}⚠️ Some pods not ready ($ready_pods/$total_pods)${NC}"
fi
echo ""
# Test internal service connectivity
echo "Testing internal service connectivity..."
local signoz_svc=$(kubectl get svc -n "$NAMESPACE" signoz -o jsonpath='{.spec.clusterIP}' 2>/dev/null || echo "")
if [[ -n "$signoz_svc" ]]; then
echo "${GREEN}✅ SigNoz service accessible at $signoz_svc:8080${NC}"
else
echo "${RED}❌ SigNoz service not found${NC}"
fi
local otel_svc=$(kubectl get svc -n "$NAMESPACE" signoz-otel-collector -o jsonpath='{.spec.clusterIP}' 2>/dev/null || echo "")
if [[ -n "$otel_svc" ]]; then
echo "${GREEN}✅ OTel Collector service accessible at $otel_svc:4317 (gRPC), $otel_svc:4318 (HTTP)${NC}"
else
echo "${RED}❌ OTel Collector service not found${NC}"
fi
echo ""
if [[ "$ENVIRONMENT" == "prod" ]]; then
echo "${YELLOW}⚠️ Production connectivity tests require valid DNS and TLS${NC}"
echo " Please ensure monitoring.bakewise.ai resolves to your cluster"
echo ""
echo "Manual test:"
echo " curl -I https://monitoring.bakewise.ai"
fi
}
# Main execution
main() {
echo "${BLUE}"
echo "=========================================="
echo "🔍 SigNoz Verification for Bakery IA"
echo "=========================================="
echo "${NC}"
# Check prerequisites
check_kubectl
check_namespace
# Verify deployment
verify_deployment
# Run connectivity tests
run_connectivity_tests
echo "${GREEN}"
echo "=========================================="
echo "✅ Verification Complete"
echo "=========================================="
echo "${NC}"
echo "Summary:"
echo " Environment: $ENVIRONMENT"
echo " Namespace: $NAMESPACE"
echo ""
echo "Next Steps:"
echo " 1. Access SigNoz UI and verify dashboards"
echo " 2. Configure alert rules for your services"
echo " 3. Instrument your applications with OpenTelemetry"
echo " 4. Set up custom dashboards for key metrics"
echo ""
}
# Run main function
main

View File

@@ -0,0 +1,9 @@
apiVersion: v1
kind: Namespace
metadata:
name: bakery-ia
labels:
name: bakery-ia
environment: local
app.kubernetes.io/name: bakery-ia
app.kubernetes.io/part-of: bakery-forecasting-platform

View File

@@ -0,0 +1,11 @@
# Flux System Namespace
# This namespace is required for Flux CD components
# It should be created before any Flux resources are applied
apiVersion: v1
kind: Namespace
metadata:
name: flux-system
labels:
app.kubernetes.io/name: flux
kubernetes.io/metadata.name: flux-system

View File

@@ -0,0 +1,7 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- bakery-ia.yaml
- tekton-pipelines.yaml
- flux-system.yaml

View File

@@ -0,0 +1,11 @@
apiVersion: v1
kind: Namespace
metadata:
name: tekton-pipelines
labels:
app.kubernetes.io/name: tekton
app.kubernetes.io/component: pipelines
kubernetes.io/metadata.name: tekton-pipelines
pod-security.kubernetes.io/enforce: restricted
pod-security.kubernetes.io/audit: restricted
pod-security.kubernetes.io/warn: restricted

View File

@@ -0,0 +1,27 @@
# Create a root CA certificate for local development
# NOTE: This certificate must be ready before the local-ca-issuer can be used
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: local-ca-cert
namespace: cert-manager # This ensures the secret is created in the cert-manager namespace
spec:
isCA: true
commonName: bakery-ia-local-ca
subject:
organizationalUnits:
- "Bakery IA Local CA"
organizations:
- "Bakery IA"
countries:
- "US"
secretName: local-ca-key-pair
privateKey:
algorithm: ECDSA
size: 256
issuerRef:
name: selfsigned-issuer
kind: ClusterIssuer
group: cert-manager.io
duration: 8760h # 1 year
renewBefore: 720h # 30 days

View File

@@ -0,0 +1,23 @@
apiVersion: v1
kind: Namespace
metadata:
name: cert-manager
---
# NOTE: Do NOT define cert-manager ServiceAccounts here!
# The ServiceAccounts (cert-manager, cert-manager-cainjector, cert-manager-webhook)
# are created by the upstream cert-manager installation (kubernetes_restart.sh).
# Redefining them here would strip their RBAC bindings and break authentication.
---
# Self-signed ClusterIssuer for bootstrapping the CA certificate chain
# This issuer is used to create the root CA certificate which then
# becomes the issuer for all other certificates in the cluster
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: selfsigned-issuer
spec:
selfSigned: {}
---
# Cert-manager installation using Helm repository
# This will be installed via kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.2/cert-manager.yaml
# The actual installation will be done via command line, this file documents the resources

View File

@@ -0,0 +1,23 @@
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-production
spec:
acme:
# The ACME server URL (Let's Encrypt production)
server: https://acme-v02.api.letsencrypt.org/directory
# Email address used for ACME registration
email: admin@bakewise.ai
# Name of a secret used to store the ACME account private key
privateKeySecretRef:
name: letsencrypt-production
# Enable the HTTP-01 challenge provider
solvers:
- http01:
ingress:
class: public
podTemplate:
spec:
nodeSelector:
"kubernetes.io/os": linux

View File

@@ -0,0 +1,24 @@
# Let's Encrypt Staging ClusterIssuer
# Use this for testing before switching to production
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-staging
spec:
acme:
# The ACME server URL (Let's Encrypt staging)
server: https://acme-staging-v02.api.letsencrypt.org/directory
# Email address used for ACME registration
email: admin@bakery-ia.local # Change this to your email
# Name of a secret used to store the ACME account private key
privateKeySecretRef:
name: letsencrypt-staging
# Enable the HTTP-01 challenge provider
solvers:
- http01:
ingress:
class: public
podTemplate:
spec:
nodeSelector:
"kubernetes.io/os": linux

View File

@@ -0,0 +1,9 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- cert-manager.yaml
- ca-root-certificate.yaml
- local-ca-issuer.yaml
- cluster-issuer-staging.yaml
- cluster-issuer-production.yaml

View File

@@ -0,0 +1,7 @@
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: local-ca-issuer
spec:
ca:
secretName: local-ca-key-pair

View File

@@ -0,0 +1,8 @@
# Self-signed ClusterIssuer for local development certificates
# This issuer can generate self-signed certificates without needing external CA
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: selfsigned-issuer
spec:
selfSigned: {}

View File

@@ -0,0 +1,104 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: gateway
namespace: bakery-ia
labels:
app.kubernetes.io/name: gateway
app.kubernetes.io/component: gateway
app.kubernetes.io/part-of: bakery-ia
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: gateway
app.kubernetes.io/component: gateway
template:
metadata:
labels:
app.kubernetes.io/name: gateway
app.kubernetes.io/component: gateway
spec:
containers:
- name: gateway
image: bakery/gateway:latest
ports:
- containerPort: 8000
name: http
envFrom:
- configMapRef:
name: bakery-config
- secretRef:
name: database-secrets
- secretRef:
name: redis-secrets
- secretRef:
name: rabbitmq-secrets
- secretRef:
name: jwt-secrets
- secretRef:
name: external-api-secrets
- secretRef:
name: payment-secrets
- secretRef:
name: email-secrets
- secretRef:
name: monitoring-secrets
- secretRef:
name: pos-integration-secrets
- secretRef:
name: whatsapp-secrets
env:
- name: OTEL_EXPORTER_OTLP_ENDPOINT
valueFrom:
configMapKeyRef:
name: bakery-config
key: OTEL_EXPORTER_OTLP_ENDPOINT
- name: SIGNOZ_OTEL_COLLECTOR_URL
valueFrom:
configMapKeyRef:
name: bakery-config
key: SIGNOZ_OTEL_COLLECTOR_URL
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
livenessProbe:
httpGet:
path: /health
port: 8000
initialDelaySeconds: 30
timeoutSeconds: 10
periodSeconds: 30
failureThreshold: 3
readinessProbe:
httpGet:
path: /health
port: 8000
initialDelaySeconds: 5
timeoutSeconds: 5
periodSeconds: 10
failureThreshold: 3
---
apiVersion: v1
kind: Service
metadata:
name: gateway-service
namespace: bakery-ia
labels:
app.kubernetes.io/name: gateway
app.kubernetes.io/component: gateway
spec:
type: ClusterIP
ports:
- port: 8000
targetPort: 8000
protocol: TCP
name: http
selector:
app.kubernetes.io/name: gateway
app.kubernetes.io/component: gateway

View File

@@ -0,0 +1,5 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- gateway-service.yaml

View File

@@ -0,0 +1,45 @@
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: forecasting-service-hpa
namespace: bakery-ia
labels:
app.kubernetes.io/name: forecasting-service
app.kubernetes.io/component: autoscaling
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: forecasting-service
minReplicas: 1
maxReplicas: 3
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 75
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 50
periodSeconds: 60
scaleUp:
stabilizationWindowSeconds: 60
policies:
- type: Percent
value: 100
periodSeconds: 30
- type: Pods
value: 1
periodSeconds: 60
selectPolicy: Max

View File

@@ -0,0 +1,45 @@
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: notification-service-hpa
namespace: bakery-ia
labels:
app.kubernetes.io/name: notification-service
app.kubernetes.io/component: autoscaling
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: notification-service
minReplicas: 1
maxReplicas: 3
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 50
periodSeconds: 60
scaleUp:
stabilizationWindowSeconds: 60
policies:
- type: Percent
value: 100
periodSeconds: 30
- type: Pods
value: 1
periodSeconds: 60
selectPolicy: Max

View File

@@ -0,0 +1,45 @@
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: orders-service-hpa
namespace: bakery-ia
labels:
app.kubernetes.io/name: orders-service
app.kubernetes.io/component: autoscaling
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: orders-service
minReplicas: 1
maxReplicas: 3
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 50
periodSeconds: 60
scaleUp:
stabilizationWindowSeconds: 60
policies:
- type: Percent
value: 100
periodSeconds: 30
- type: Pods
value: 1
periodSeconds: 60
selectPolicy: Max

View File

@@ -0,0 +1,198 @@
# Mailu Migration Guide: From Kustomize to Helm
This document outlines the migration process from the Kustomize-based Mailu deployment to the Helm-based deployment.
## Overview
The Mailu email server has been migrated from a Kustomize-based deployment to a Helm chart-based deployment. This change provides better maintainability, easier upgrades, and standardized configuration management.
## Key Changes
### 1. Service Names
- **Old**: `mailu-smtp`, `email-smtp`, `mailu-front`, `mailu-admin`, `mailu-imap`, `mailu-antispam`
- **New**: `mailu-postfix`, `mailu-front`, `mailu-admin`, `mailu-dovecot`, `mailu-rspamd`
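After the Helm release is installed, the rename can be confirmed directly; exact names may carry a release prefix depending on the chart version, so treat the pattern below as an assumption to adjust:
```bash
# List the Mailu services created by the Helm chart (adjust the pattern if your release prefixes names)
kubectl get svc -n bakery-ia | grep -E 'mailu-(postfix|front|admin|dovecot|rspamd)'
```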
### 2. Configuration Method
- **Old**: Individual YAML manifests with Kustomize overlays
- **New**: Helm chart with values files for environment-specific configuration
### 3. Directory Structure
- **Old**: `infrastructure/platform/mail/mailu/{base,overlays/{dev,prod}}`
- **New**: `infrastructure/platform/mail/mailu-helm/{dev,prod}`
### 4. Ingress Configuration
- **Old**: Ingress resources created as part of the Kustomize setup
- **New**: Built-in ingress disabled in Helm chart to work with existing ingress controller
## Ingress Configuration
The Mailu Helm chart has been configured to work with your existing ingress setup:
- **ingress.enabled: false**: Disables the chart's built-in Ingress creation
- **tlsFlavorOverride: notls**: Tells Mailu's internal NGINX not to enforce TLS, as your Ingress handles TLS termination
- **realIpHeader: X-Forwarded-For**: Ensures Mailu's NGINX logs and processes the correct client IPs from behind your Ingress
- **realIpFrom: 0.0.0.0/0**: Trusts all proxies (restrict to your Ingress pod CIDR for security)
### Required Ingress Resource
You need to create an Ingress resource to route traffic to Mailu. Here's an example:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: mailu-ingress
namespace: bakery-ia # Same as Mailu's namespace
annotations:
kubernetes.io/ingress.class: nginx # Or your Ingress class
nginx.ingress.kubernetes.io/proxy-body-size: "100m" # Allow larger email attachments
nginx.ingress.kubernetes.io/proxy-read-timeout: "3600" # For long connections
nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
nginx.ingress.kubernetes.io/force-ssl-redirect: "true" # Redirect HTTP to HTTPS
# If using Cert-Manager: cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
tls:
- hosts:
- mail.bakery-ia.dev # or mail.bakewise.ai for prod
secretName: mail-tls-secret # Your TLS Secret
rules:
- host: mail.bakery-ia.dev # or mail.bakewise.ai for prod
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: mailu-front-http # Mailu's front service (check with kubectl get svc -n bakery-ia)
port:
number: 80
```
Apply it: `kubectl apply -f ingress.yaml`.
This routes all traffic from https://mail.[domain]/ to Mailu's internal NGINX, which proxies to webmail (/webmail), admin (/admin), etc.
## Updated Service References
The following configurations have been updated to use the new Helm service names:
### Common ConfigMap
- `SMTP_HOST` changed from `email-smtp.bakery-ia.svc.cluster.local` to `mailu-postfix.bakery-ia.svc.cluster.local`
### SigNoz Configuration
- `signoz_smtp_host` changed from `email-smtp.bakery-ia.svc.cluster.local` to `mailu-postfix.bakery-ia.svc.cluster.local`
- `smtp_smarthost` changed from `email-smtp.bakery-ia.svc.cluster.local:587` to `mailu-postfix.bakery-ia.svc.cluster.local:587`
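A quick way to confirm the updated value is live, assuming the SMTP host is read from the `bakery-config` ConfigMap referenced elsewhere in this repository:
```bash
# Should print the new Helm service name: mailu-postfix.bakery-ia.svc.cluster.local
kubectl get configmap bakery-config -n bakery-ia -o jsonpath='{.data.SMTP_HOST}'
```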
## Deployment Process
### Prerequisites
1. Helm 3.x installed
2. Access to Kubernetes cluster
3. Namespace `bakery-ia` exists
### Deployment Commands
#### For Development:
```bash
# Add Mailu Helm repository
helm repo add mailu https://mailu.github.io/helm-charts/
helm repo update
# Install Mailu for development
helm upgrade --install mailu-dev mailu/mailu \
--namespace bakery-ia \
--create-namespace \
--values infrastructure/platform/mail/mailu-helm/values.yaml \
--values infrastructure/platform/mail/mailu-helm/dev/values.yaml
```
#### For Production:
```bash
# Add Mailu Helm repository
helm repo add mailu https://mailu.github.io/helm-charts/
helm repo update
# Install Mailu for production
helm upgrade --install mailu-prod mailu/mailu \
--namespace bakery-ia \
--create-namespace \
--values infrastructure/platform/mail/mailu-helm/values.yaml \
--values infrastructure/platform/mail/mailu-helm/prod/values.yaml
```
## Critical Configuration Preservation
All critical configurations from the original Kustomize setup have been preserved:
- Domain and hostname settings
- External SMTP relay configuration (Mailgun)
- Redis integration with shared cluster
- Database connection settings
- TLS certificate management
- Resource limits and requests
- Network policies
- Storage configuration (10Gi PVC)
## Rollback Procedure
If rollback to the Kustomize setup is needed:
1. Uninstall the Helm release:
```bash
helm uninstall mailu-dev -n bakery-ia # or mailu-prod
```
2. Revert the configuration changes in `infrastructure/environments/common/configs/configmap.yaml` and `infrastructure/monitoring/signoz/signoz-values-prod.yaml`
3. Deploy the old Kustomize manifests:
```bash
kubectl apply -k infrastructure/platform/mail/mailu/overlays/dev
# or
kubectl apply -k infrastructure/platform/mail/mailu/overlays/prod
```
## Verification Steps
After deployment, verify the following:
1. Check that all Mailu pods are running:
```bash
kubectl get pods -n bakery-ia | grep mailu
```
2. Verify SMTP connectivity from other services:
```bash
# Test from a pod in the same namespace
kubectl run test-smtp --image=curlimages/curl -n bakery-ia --rm -it -- \
nc -zv mailu-postfix.bakery-ia.svc.cluster.local 587
```
3. Check that notification service can send emails:
```bash
kubectl logs -n bakery-ia deployment/notification-service | grep -i smtp
```
4. Verify web interface accessibility:
```bash
kubectl port-forward -n bakery-ia svc/mailu-front 8080:80
# Then visit http://localhost:8080/admin
```
## Known Issues
1. During migration, existing email data should be backed up before uninstalling the old deployment
2. DNS records may need to be updated to point to the new service endpoints
3. Some custom configurations may need to be reapplied after Helm installation
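For item 1, a rough sketch of copying the mail store out of the old deployment before uninstalling it; the label selector and mount path below are assumptions based on the old Kustomize naming and must be adapted to your cluster:
```bash
# Assumption: the old IMAP pod is labelled app=mailu-imap and stores mail under /mail
OLD_POD=$(kubectl get pods -n bakery-ia -l app=mailu-imap -o jsonpath='{.items[0].metadata.name}')
kubectl cp "bakery-ia/${OLD_POD}:/mail" ./mailu-mail-backup/
```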
## Support
For issues with the new Helm-based deployment:
1. Check the [official Mailu Helm chart documentation](https://github.com/Mailu/helm-charts)
2. Review Helm release status: `helm status mailu-[dev|prod] -n bakery-ia`
3. Check pod logs: `kubectl logs -n bakery-ia deployment/[mailu-postfix|mailu-front|etc.]`
4. Verify network connectivity between services

View File

@@ -0,0 +1,171 @@
# Mailu Helm Chart for Bakery-IA
This directory contains the Helm chart configuration for Mailu, replacing the previous Kustomize-based setup.
## Overview
The Mailu email server is now deployed using the official Mailu Helm chart instead of Kustomize manifests. This provides better maintainability, easier upgrades, and standardized configuration. The setup runs behind your existing NGINX Ingress controller: the Ingress handles routing and TLS termination and forwards plain HTTP (port 80) to Mailu's internal NGINX, which in turn proxies to webmail, admin, and the other Mailu services.
## Directory Structure
```
mailu-helm/
├── values.yaml # Base configuration values
├── dev/
│ └── values.yaml # Development-specific overrides
├── prod/
│ └── values.yaml # Production-specific overrides
└── mailu-ingress.yaml # Sample ingress configuration for use with existing ingress
```
## Critical Configuration Preservation
The following critical configurations from the original Kustomize setup have been preserved:
- **Domain settings**: Domain and hostnames for both dev and prod
- **External relay**: Mailgun SMTP relay configuration
- **Redis integration**: Connection to shared Redis cluster (database 15)
- **Database settings**: PostgreSQL connection details
- **Resource limits**: CPU and memory requests/limits matching original setup
- **Network policies**: Security policies restricting access to authorized services
- **Storage**: 10Gi persistent volume for mail data
- **Ingress configuration**: Built-in ingress disabled to work with existing ingress
## Deployment
### Prerequisites
1. Helm 3.x installed
2. Kubernetes cluster with storage provisioner
3. Ingress controller (NGINX) - already deployed in your cluster
4. Cert-manager for TLS certificates (optional, depends on your ingress setup)
5. External SMTP relay account (Mailgun)
### Deployment Commands
#### For Development:
```bash
helm repo add mailu https://mailu.github.io/helm-charts/
helm repo update
helm install mailu-dev mailu/mailu \
--namespace bakery-ia \
--create-namespace \
--values mailu-helm/values.yaml \
--values mailu-helm/dev/values.yaml
```
#### For Production:
```bash
helm repo add mailu https://mailu.github.io/helm-charts/
helm repo update
helm install mailu-prod mailu/mailu \
--namespace bakery-ia \
--create-namespace \
--values mailu-helm/values.yaml \
--values mailu-helm/prod/values.yaml
```
### Upgrading
To upgrade to a newer version of the Mailu Helm chart:
```bash
helm repo update
helm upgrade mailu-dev mailu/mailu \
--namespace bakery-ia \
--values mailu-helm/values.yaml \
--values mailu-helm/dev/values.yaml
```
## Ingress Configuration
The Mailu Helm chart is configured to work with your existing Ingress setup:
- **ingress.enabled: false**: Disables the chart's built-in Ingress creation
- **tlsFlavorOverride: notls**: Tells Mailu's internal NGINX not to enforce TLS, as your Ingress handles TLS termination
- **realIpHeader: X-Forwarded-For**: Ensures Mailu's NGINX logs and processes the correct client IPs from behind your Ingress
- **realIpFrom: 0.0.0.0/0**: Trusts all proxies (restrict to your Ingress pod CIDR for security)
### Required Ingress Resource
You need to create an Ingress resource to route traffic to Mailu. Here's an example:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: mailu-ingress
namespace: bakery-ia # Same as Mailu's namespace
annotations:
kubernetes.io/ingress.class: nginx # Or your Ingress class
nginx.ingress.kubernetes.io/proxy-body-size: "100m" # Allow larger email attachments
nginx.ingress.kubernetes.io/proxy-read-timeout: "3600" # For long connections
nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
nginx.ingress.kubernetes.io/force-ssl-redirect: "true" # Redirect HTTP to HTTPS
# If using Cert-Manager: cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
tls:
- hosts:
- mail.bakery-ia.dev # or mail.bakewise.ai for prod
secretName: mail-tls-secret # Your TLS Secret
rules:
- host: mail.bakery-ia.dev # or mail.bakewise.ai for prod
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: mailu-front-http # Mailu's front service (check with kubectl get svc -n bakery-ia)
port:
number: 80
```
Apply it: `kubectl apply -f ingress.yaml`.
This routes all traffic from https://mail.[domain]/ to Mailu's internal NGINX, which proxies to webmail (/webmail), admin (/admin), etc.
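Once DNS (or a local hosts entry) points the mail hostname at your ingress, a quick smoke test of the routing (use `-k` only if the certificate comes from the local CA):
```bash
# Expect a 200 or a redirect to the admin login page
curl -kI https://mail.bakery-ia.dev/admin
```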
## Configuration Details
### Environment-Specific Values
- **Development** (`dev/values.yaml`):
- Domain: `bakery-ia.local`
- No TLS enforcement internally (handled by ingress)
- Disabled antivirus to save resources
- Debug logging level
- **Production** (`prod/values.yaml`):
- Domain: `bakewise.ai`
- No TLS enforcement internally (handled by ingress)
- Enabled antivirus
- Warning logging level
### Secrets Management
Sensitive values like passwords and API keys should be managed through Kubernetes secrets rather than being stored in the values files. The Helm chart supports referencing existing secrets for:
- Database passwords
- Redis passwords
- External relay credentials
- Mailu secret key
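A minimal sketch of creating one such secret with kubectl; the secret and key names must match what the values files reference (shown here for the Mailgun relay secret used elsewhere in this setup), and the literal values are placeholders:

```bash
kubectl create secret generic mailu-mailgun-credentials \
  --from-literal=RELAY_USERNAME='postmaster@bakewise.ai' \
  --from-literal=RELAY_PASSWORD='your-mailgun-smtp-password' \
  -n bakery-ia
```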
## Integration with Notification Service
The notification service continues to connect to Mailu via the internal service name `mailu-postfix.bakery-ia.svc.cluster.local` on port 587 with STARTTLS.
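To verify from inside the cluster that the submission port answers and offers STARTTLS, one option is a throwaway pod running openssl's SMTP client (the pod name and image are illustrative):

```bash
kubectl run smtp-check --rm -it --restart=Never --image=alpine:3.20 -n bakery-ia -- \
  sh -c "apk add -q openssl && \
         openssl s_client -connect mailu-postfix.bakery-ia.svc.cluster.local:587 \
                          -starttls smtp -brief </dev/null"
```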
## Access Information
- **Admin Panel**: `https://mail.[domain]/admin`
- **Webmail**: `https://mail.[domain]/webmail`
- **SMTP**: `mail.[domain]:587` (STARTTLS) - exposed via separate TCP services if needed
- **IMAP**: `mail.[domain]:993` (SSL/TLS) - exposed via separate TCP services if needed (see the sketch below)
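The SMTP and IMAP ports above are not carried by the HTTP Ingress. One common approach with ingress-nginx is its TCP services ConfigMap; this is only a sketch, assuming the controller runs with `--tcp-services-configmap=ingress-nginx/tcp-services` and that its Service exposes the extra ports (the `mailu-front` service name is the one used by the Ingress manifest in this repository):

```bash
kubectl create configmap tcp-services -n ingress-nginx \
  --from-literal=587="bakery-ia/mailu-front:587" \
  --from-literal=993="bakery-ia/mailu-front:993" \
  --dry-run=client -o yaml | kubectl apply -f -
```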
## Migration Notes
When migrating from the Kustomize setup to Helm:
1. Ensure all existing PVCs are preserved during migration (see the commands after this list)
2. Export any existing mail data before migration if needed
3. Update any hardcoded service references in other deployments
4. Verify that network policies still allow necessary communications
5. Configure your existing ingress to route traffic to the Mailu services
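As a quick pre-migration check, assuming the existing mail data lives on PVCs in the `bakery-ia` namespace, confirm what exists before touching anything:

```bash
kubectl get pvc -n bakery-ia
kubectl get pv | grep bakery-ia   # confirm the backing volumes and their reclaim policy
```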

View File

@@ -0,0 +1,38 @@
# CoreDNS ConfigMap patch to forward external DNS queries to Unbound for DNSSEC validation
# This is required for Mailu Admin which requires DNSSEC-validating DNS resolver
#
# Apply with: kubectl apply -f coredns-unbound-patch.yaml
# Then restart CoreDNS: kubectl rollout restart deployment coredns -n kube-system
#
# Note: The UNBOUND_SERVICE_IP placeholder below stands for the Unbound service ClusterIP,
# which may change when the cluster is recreated. The setup script substitutes the actual
# Unbound service IP automatically before applying this ConfigMap.
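#
# One way to perform the substitution manually (the service name matches the
# Unbound chart's DNS Service in this repository):
#   UNBOUND_IP=$(kubectl get svc unbound-dns -n bakery-ia -o jsonpath='{.spec.clusterIP}')
#   sed "s/UNBOUND_SERVICE_IP/$UNBOUND_IP/" coredns-unbound-patch.yaml | kubectl apply -f -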
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns
namespace: kube-system
data:
Corefile: |
.:53 {
errors
health {
lameduck 5s
}
ready
kubernetes cluster.local in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
ttl 30
}
prometheus :9153
forward . UNBOUND_SERVICE_IP {
max_concurrent 1000
}
cache 30 {
disable success cluster.local
disable denial cluster.local
}
loop
reload
loadbalance
}

View File

@@ -0,0 +1,94 @@
# Mailgun SMTP Credentials Secret for Mailu
#
# This secret stores Mailgun credentials for outbound email relay.
# Mailu uses Mailgun as an external SMTP relay to send all outbound emails.
#
# ============================================================================
# HOW TO CONFIGURE:
# ============================================================================
#
# 1. Go to https://www.mailgun.com and create an account
#
# 2. Add and verify your domain:
# - For dev: bakery-ia.dev
# - For prod: bakewise.ai
#
# 3. Go to Domain Settings > SMTP credentials in Mailgun dashboard
#
# 4. Note your SMTP credentials:
# - SMTP hostname: smtp.mailgun.org
# - Port: 587 (TLS/STARTTLS)
# - Username: typically postmaster@yourdomain.com
# - Password: your Mailgun SMTP password (NOT the API key)
#
# 5. Base64 encode your credentials:
# echo -n 'postmaster@bakewise.ai' | base64
# echo -n 'your-mailgun-smtp-password' | base64
#
# 6. Replace the placeholder values below with your encoded credentials
#
# 7. Apply this secret:
# kubectl apply -f mailgun-credentials-secret.yaml -n bakery-ia
#
# ============================================================================
# IMPORTANT NOTES:
# ============================================================================
#
# - Use the SMTP password from Mailgun, NOT the API key
# - The username format is: postmaster@yourdomain.com
# - For sandbox domains, Mailgun requires adding authorized recipients
# - Production domains need DNS verification (SPF, DKIM records)
#
# ============================================================================
# DNS RECORDS REQUIRED FOR MAILGUN:
# ============================================================================
#
# Add these DNS records to your domain for proper email delivery:
#
# 1. SPF Record (TXT):
# Name: @
# Value: v=spf1 include:mailgun.org ~all
#
# 2. DKIM Records (TXT):
# Mailgun will provide two DKIM keys to add as TXT records
# (check your Mailgun domain settings for exact values)
#
# 3. MX Records (optional, only if receiving via Mailgun):
# Priority 10: mxa.mailgun.org
# Priority 10: mxb.mailgun.org
#
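# 4. To verify after the records propagate (the DKIM selector is a placeholder;
#    use the one shown in Mailgun's DNS settings):
#      dig TXT bakewise.ai +short
#      dig TXT <selector>._domainkey.bakewise.ai +short
#      dig MX bakewise.ai +short
#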
# ============================================================================
---
apiVersion: v1
kind: Secret
metadata:
name: mailu-mailgun-credentials
namespace: bakery-ia
labels:
app: mailu
component: external-relay
annotations:
description: "Mailgun SMTP credentials for Mailu external relay"
type: Opaque
stringData:
# ============================================================================
# REPLACE THESE VALUES WITH YOUR MAILGUN CREDENTIALS
# ============================================================================
#
# Option 1: Use stringData (plain text - Kubernetes will encode automatically)
# This is easier for initial setup but shows credentials in the file
#
RELAY_USERNAME: "postmaster@sandboxc1bff891532b4f0c83056a68ae080b4c.mailgun.org"
RELAY_PASSWORD: "2e47104abadad8eb820d00042ea6d5eb-77c6c375-89c7ea55"
#
# ============================================================================
# ALTERNATIVE: Use pre-encoded values (more secure for version control)
# ============================================================================
# Comment out stringData above and uncomment data below:
#
# data:
# # Base64 encoded values
# # echo -n 'postmaster@bakewise.ai' | base64
# RELAY_USERNAME: cG9zdG1hc3RlckBiYWtld2lzZS5haQ==
# # echo -n 'your-password' | base64
# RELAY_PASSWORD: WU9VUl9NQUlMR1VOX1NNVFBfUEFTU1dPUkQ=

View File

@@ -0,0 +1,34 @@
# Mailu Admin Credentials Secret
# This secret stores the initial admin account password for Mailu
#
# The password is used by the Helm chart's initialAccount feature to create
# the admin user automatically during deployment.
#
# IMPORTANT: Replace the base64-encoded password before applying!
#
# To generate a secure password and encode it:
# PASSWORD=$(openssl rand -base64 16 | tr -d '/+=' | head -c 16)
# echo -n "$PASSWORD" | base64
#
# To apply this secret:
# kubectl apply -f mailu-admin-credentials-secret.yaml -n bakery-ia
#
# After deployment, you can log in to the Mailu admin panel at:
# https://mail.<domain>/admin
# Username: admin@<domain>
# Password: <the password you set>
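#
# To retrieve the stored password later:
#   kubectl get secret mailu-admin-credentials -n bakery-ia \
#     -o jsonpath='{.data.password}' | base64 -d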
#
apiVersion: v1
kind: Secret
metadata:
name: mailu-admin-credentials
namespace: bakery-ia
labels:
app.kubernetes.io/name: mailu
app.kubernetes.io/component: admin
type: Opaque
data:
# Base64-encoded password
# Example: "changeme123" = Y2hhbmdlbWUxMjM=
# IMPORTANT: Replace with your own secure password!
password: "Y2hhbmdlbWUxMjM="

View File

@@ -0,0 +1,26 @@
# Self-signed TLS certificate secret for Mailu Front
# This is required by the Mailu Helm chart even when TLS is disabled (tls.flavor: notls)
# The Front pod mounts this secret for internal certificate handling
#
# For production, replace with proper certificates from cert-manager or Let's Encrypt
# This script generates a self-signed certificate valid for 365 days
#
# To regenerate manually:
# openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
# -keyout tls.key -out tls.crt \
# -subj "/CN=mail.bakery-ia.dev/O=bakery-ia"
# kubectl create secret tls mailu-certificates \
# --cert=tls.crt --key=tls.key -n bakery-ia
apiVersion: v1
kind: Secret
metadata:
name: mailu-certificates
namespace: bakery-ia
labels:
app.kubernetes.io/name: mailu
app.kubernetes.io/component: certificates
type: kubernetes.io/tls
data:
# Generated certificate for mail.bakery-ia.dev
tls.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURRekNDQWl1Z0F3SUJBZ0lVVWg1Rlg5cWlPRDdkc2FmVi9KemlKWWh1WUZJd0RRWUpLb1pJaHZjTkFRRUwKQlFBd01URWJNQmtHQTFVRUF3d1NiV0ZwYkM1aVlXdGxjbmt0YVdFdVpHVjJNUkl3RUFZRFZRUUtEQWxDWVd0bApjbmtnU1VFd0hoY05Nall3TVRFNU1qQTBOakkwV2hjTk1qY3dNVEU1TWpBME5qSTBXakF4TVJzd0dRWURWUVFECkRCSnRZV2xzTG1KaGEyVnllUzFwWVM1a1pYWXhFakFRQmdOVkJBb01DVUpoYTJWeWVTQkpRVENDQVNJd0RRWUoKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTDJlbXM2YW5DSjV5N0JQNm9KdTQ2TldQSXJ3Zlg3Mgp3WmgxZERJaVlIMmNsalBESldsb3ROU0JFTngxUkZZSEc3Z0VSRVk1MHpFQ3UwSC9Vc0YzRFlPTFhobkYwdVRXCkNSTmJFRjFoYjZNT2lqanVmOWJHKzdsVkJ5NmZkMXZRTzJpOTA1VktxRTdEZllraWIwVkpxN0duVUo5RWFtOFgKSWxTaUphY1F6Mm11WXd6QjBPN3hZeVV3VFFWTDcvSnRNTWs5ZjZDY1ZENXFRMGJuWEJNM2hqcVVGWTlnbEF5dApZZHBUUUhPdms1WXgrZk1nL2JZVlBjQ0VhZFhVVkhBdHoxYlJybGIwenlMc3FXeHd2OXlWN0pCM210TkNmbFdsCkRCWWRIb3J0ZlROTHVSNFhhRTNXT2pnbzkwT1ltbi9PYll6Mld0SXUwMnp5MkhrTnBNYUFvVmtDQXdFQUFhTlQKTUZFd0hRWURWUjBPQkJZRUZMS2hPc254WnpXQ1RyMFFuSTdjaE1hbWtTb2pNQjhHQTFVZEl3UVlNQmFBRkxLaApPc254WnpXQ1RyMFFuSTdjaE1hbWtTb2pNQThHQTFVZEV3RUIvd1FGTUFNQkFmOHdEUVlKS29aSWh2Y05BUUVCCkJRQURnZ0VCQUFMQ3hGV1VnY3Z3ZVpoRjFHdlNnR3R3VW9WakJtcG1GYnFPMC93S2lqMlhDRmZ6L0FqanZaOHMKOGVIUEc5Z3crbjlpaGNSN016Q2V5ZldRd1FsaTBXZkcySzBvUDFGeUxoYU9aMlhtdU9nNnhNRG5EVzBVZWtqMwpCYWdHc3RFVXpqQlR1UlJ3WS9uck5vb1ZCOVFoYnhoeW9mbXkrVzVmczhZMDNTZG9paTFpWG1iSEhaemMyL21ICmF2UDE0Z3BzWUNDZVl6aklyWm05WWE4Rzhpc2tYelNnZU0vSEhpRzhJOWhKRkJYaHRYYWRjeGkvbU5hNHRKcWgKM1crTEIzaEQ4NFVkZ3MrR3pCZ0hHdnIwdWxMMTQvaUxVRXFySXZaWjN2VTlvNlZ4MlBvRjQ3cjBQNXpOZXVTNwpkRk5xT3JJT2phSm5yMXFVb0tMeWd3RUhqdVRNbUk0PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
tls.key: LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0tLS0tCk1JSUV2UUlCQURBTkJna3Foa2lHOXcwQkFRRUZBQVNDQktjd2dnU2pBZ0VBQW9JQkFRQzlucHJPbXB3aWVjdXcKVCtxQ2J1T2pWanlLOEgxKzlzR1lkWFF5SW1COW5KWXp3eVZwYUxUVWdSRGNkVVJXQnh1NEJFUkdPZE14QXJ0QgovMUxCZHcyRGkxNFp4ZExrMWdrVFd4QmRZVytqRG9vNDduL1d4dnU1VlFjdW4zZGIwRHRvdmRPVlNxaE93MzJKCkltOUZTYXV4cDFDZlJHcHZGeUpVb2lXbkVNOXBybU1Nd2REdThXTWxNRTBGUysveWJUREpQWCtnbkZRK2FrTkcKNTF3VE40WTZsQldQWUpRTXJXSGFVMEJ6cjVPV01mbnpJUDIyRlQzQWhHblYxRlJ3TGM5VzBhNVc5TThpN0tscwpjTC9jbGV5UWQ1clRRbjVWcFF3V0hSNks3WDB6UzdrZUYyaE4xam80S1BkRG1KcC96bTJNOWxyU0x0TnM4dGg1CkRhVEdnS0ZaQWdNQkFBRUNnZ0VBSW51TFQzTVNYYnFrYmdXNmNjblVuOGw0N1JOYTN4SGtsdU1WSkdEWUJ6L0kKbU5VdUlvTW1EMWNCUi9ZVFhVbWhvczh6MDBtRXZHN3d1c25CdE9qL2ppSjBGRi9EUUZZa0JGOFZGTVk1VlArNQo1eXlJRnZqTW9pRnlVdW93L0lOYnFtcUs1YVZVQWk3T3ozZHhvTG9LL1IyZUxiaDFXb3BzZGRPZTRValBUenBVCnU1TVl4NXlMVnVZc1A3U09TSHRrd2UvMDN5RFJLckl2V3k1QlBtYzJRVEhUcEJPVUJHNC9DcFJWR1ozZjhLa0QKN2QrNlZlNzd1TWV1eERPOG1HZ1paNTRpd0NuMStYR2NFcVFVR1Z1WngrcVpodVhTZks0ajR3eWVtbndlRUFCdgptTlNZSXQ2OG91SSs0cEFyV1ZONEFjaXhWRUxIV1d6MDRYTm56WFUyNFFLQmdRRDBlc0JZenVkRzJaU2t5SWJRCnU4SXhwT2RzRjRnU1lpekNkMTNLQktGbm9obTFrVzlYemRmS3ZObnJxcFRPRnJIYkRXUTdpaUhKM2NqVjlBVTUKTlEwMVUzWXY0SzhkdWtnb2MvRUFhbnQvRjhvMG5qc0pJZ2Z2WTFuUHNPVFVFcGtRQk1QSGpraGpyM3FBNkh4dgp4b0I2OEdVdU1OVHRkQitBV0Y0dXR1T2JoUUtCZ1FER2pnNmJnbGpXRVR4TnhBalQxc21reEpzQ2xmbFFteXRmCmNiaDVWempzdGNad2lLSjh1b0xjT0d4b05HWDJIaGJRQU5wRWhUR3FuMEZIbGxFc1BYbXBoeUJVY01JUFZTWEkKRUlLeU9kL3ZMYjhjWG9ydDZMaDNNS0FoakVLbExENVZOcDhXbVlQM3dCVE1ia3BrM0NDdWxDSEJLcEJXV2Y2NgpQWFp0RUZKa3hRS0JnQjNSTHM1bUJhME5jbVNhbEY2MjE1dG9hbFV6bFlQd2QxY01hZUx1cDZUVkQxK21xamJDClF6UlZ6aHBCQnI4UDQ0YzgzZUdwR2kvZG5kWUNXZlM5Tkt3eFRyUE9LbTFzdjhvM1FjaDBORFd1K0Jsc3h2UjUKTXhDT1JIRGhPVGRvUVVURDRBRGhxSkNINFdBQmV0UERHUDVsZldHaDBRWlk2RktsOUc2c0haeGxBb0dBWnlLLwpIN1B6WlM2S3ZuSkhpNUlVSjh3Z0lKVzZiVTVNbDBWQTUzYVJFUlBTd2YyWE9XYkFOcGZ3WjZoZ0ZobkhDOENGCm4vWDN1SU1FcTZTL0FWWGxibFBNVFZCTTNSNERoQXBmZVNocTA1aFZudXpWQ1lOSzNrNlp2eE5XUXVuYWJ2VHkKYWhEUDVjOFdmcUlEYnFTUkxWMndzdC9qSFplZG95dnQ2ZlVDZDJrQ2dZRUFsbzRZelRabC8vays0WGlpeHVMQQpnZ2ZieTBoS3M1QWlLcFY0Q3pVZVE1Y0tZT2k5SXpvQzJMckxTWCtVckgvd0w3MGdCRzZneUNSZ1dLaW1RbmFWCnRZTy8xM1NyUFVnbm51R2o2Q0I1YUVreXYyTGFPVmV2WEZFcmlFbWQ1cWJKSXJYMENmZ1FuRnI2dm5RZDRwUFMKOGRVMkdhaDRiNVdNSjVJdzgwU3BjR0k9Ci0tLS0tRU5EIFBSSVZBVEUgS0VZLS0tLS0K

View File

@@ -0,0 +1,171 @@
# Development-tuned Mailu configuration
global:
# Using Unbound DNS for DNSSEC validation (required by Mailu admin)
# Unbound service is available at unbound-dns.bakery-ia.svc.cluster.local
# Static ClusterIP configured in unbound-helm/values.yaml
custom_dns_servers: "10.96.53.53" # Unbound DNS static ClusterIP
# Redis configuration - use built-in Mailu Redis (no authentication needed)
externalRedis:
enabled: false
# Component-specific DNS configuration
# Admin requires DNSSEC validation - use Unbound DNS (forwards cluster.local to kube-dns)
admin:
dnsPolicy: "None"
dnsConfig:
nameservers:
- "10.96.53.53" # Unbound DNS static ClusterIP (forwards cluster.local to kube-dns)
searches:
- "bakery-ia.svc.cluster.local"
- "svc.cluster.local"
- "cluster.local"
options:
- name: ndots
value: "5"
# RSPAMD needs Unbound for DNSSEC validation (DKIM/SPF/DMARC checks)
# ClusterFirst sends queries through Kubernetes DNS (CoreDNS), which can be configured
# (see the CoreDNS patch) to forward external lookups to Unbound
rspamd:
dnsPolicy: "ClusterFirst"
# Domain configuration for dev
# NOTE: Using .dev TLD instead of .local because email-validator library
# rejects .local domains as "special-use or reserved names" (RFC 6761)
domain: "bakery-ia.dev"
hostnames:
- "mail.bakery-ia.dev"
# Initial admin account for dev environment
# Password is stored in mailu-admin-credentials secret
initialAccount:
enabled: true
username: "admin"
domain: "bakery-ia.dev"
existingSecret: "mailu-admin-credentials"
existingSecretPasswordKey: "password"
mode: "ifmissing"
# External relay configuration for dev (Mailgun)
# All outbound emails will be relayed through Mailgun SMTP
# To configure:
# 1. Register at mailgun.com and verify your domain (bakery-ia.dev)
# 2. Get your SMTP credentials from Mailgun dashboard
# 3. Update the secret in configs/mailgun-credentials-secret.yaml
# 4. Apply the secret: kubectl apply -f configs/mailgun-credentials-secret.yaml -n bakery-ia
externalRelay:
host: "[smtp.mailgun.org]:587"
# Credentials loaded from Kubernetes secret
secretName: "mailu-mailgun-credentials"
usernameKey: "RELAY_USERNAME"
passwordKey: "RELAY_PASSWORD"
# Environment-specific configurations
persistence:
enabled: true
# Development: use default storage class
storageClass: "standard"
size: "5Gi"
# Resource optimizations for development
resources:
admin:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "500m"
memory: "256Mi"
front:
requests:
cpu: "50m"
memory: "64Mi"
limits:
cpu: "200m"
memory: "128Mi"
postfix:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "300m"
memory: "256Mi"
dovecot:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "300m"
memory: "256Mi"
rspamd:
requests:
cpu: "50m"
memory: "64Mi"
limits:
cpu: "200m"
memory: "128Mi"
webmail:
requests:
cpu: "50m"
memory: "64Mi"
limits:
cpu: "200m"
memory: "128Mi"
clamav:
requests:
cpu: "100m"
memory: "256Mi"
limits:
cpu: "300m"
memory: "512Mi"
replicaCount: 1 # Single replica for development
# Security settings
secretKey: "generate-strong-key-here-for-development"
# Ingress configuration for development - disabled to use with existing ingress
ingress:
enabled: false # Disable chart's Ingress; use existing one
tls: false # Disable TLS in chart since ingress handles it
tlsFlavorOverride: notls # No TLS on internal NGINX; expect external proxy to handle TLS
realIpHeader: X-Forwarded-For # Header for client IP from your Ingress
realIpFrom: 0.0.0.0/0 # Trust all proxies (restrict to your Ingress pod CIDR for security)
path: /
pathType: ImplementationSpecific
# TLS flavor for dev (may use self-signed)
tls:
flavor: "notls" # Disable TLS for development
# Welcome message (disabled in dev)
welcomeMessage:
enabled: false
# Log level for dev
logLevel: "DEBUG"
# Development-specific overrides
env:
DEBUG: "true"
LOG_LEVEL: "INFO"
# Disable or simplify monitoring in development
monitoring:
enabled: false
# Network Policy for dev
networkPolicy:
enabled: true
ingressController:
namespace: ingress-nginx
podSelector: |
matchLabels:
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/component: controller
monitoring:
namespace: monitoring
podSelector: |
matchLabels:
app: signoz-prometheus

View File

@@ -0,0 +1,31 @@
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: mailu-ingress
namespace: bakery-ia
labels:
app.kubernetes.io/name: mailu
app.kubernetes.io/component: ingress
annotations:
nginx.ingress.kubernetes.io/proxy-body-size: "100m"
nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
ingressClassName: nginx
tls:
- hosts:
- mail.bakery-ia.dev
secretName: bakery-dev-tls-cert
rules:
- host: mail.bakery-ia.dev
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: mailu-front # Helm release name 'mailu' + component 'front'
port:
number: 80

View File

@@ -0,0 +1,164 @@
# Production-tuned Mailu configuration
global:
# Using Kubernetes cluster DNS for name resolution
custom_dns_servers: "10.96.0.10" # Kubernetes cluster DNS IP
# Redis configuration - use built-in Mailu Redis (no authentication needed for internal)
externalRedis:
enabled: false
# DNS configuration for production
# Use Kubernetes DNS (ClusterFirst) which forwards to Unbound via CoreDNS
# This is configured automatically by the mailu-helm Tilt resource
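# To confirm CoreDNS is forwarding to Unbound (the same check the deploy script performs):
#   kubectl get configmap coredns -n kube-system -o jsonpath='{.data.Corefile}' | grep "forward \."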
admin:
dnsPolicy: "ClusterFirst"
rspamd:
dnsPolicy: "ClusterFirst"
# Domain configuration for production
domain: "bakewise.ai"
hostnames:
- "mail.bakewise.ai"
# Initial admin account for production environment
# Password is stored in mailu-admin-credentials secret
initialAccount:
enabled: true
username: "admin"
domain: "bakewise.ai"
existingSecret: "mailu-admin-credentials"
existingSecretPasswordKey: "password"
mode: "ifmissing"
# External relay configuration for production (Mailgun)
# All outbound emails will be relayed through Mailgun SMTP
# To configure:
# 1. Register at mailgun.com and verify your domain (bakewise.ai)
# 2. Get your SMTP credentials from Mailgun dashboard
# 3. Update the secret in configs/mailgun-credentials-secret.yaml
# 4. Apply the secret: kubectl apply -f configs/mailgun-credentials-secret.yaml -n bakery-ia
externalRelay:
host: "[smtp.mailgun.org]:587"
# Credentials loaded from Kubernetes secret
secretName: "mailu-mailgun-credentials"
usernameKey: "RELAY_USERNAME"
passwordKey: "RELAY_PASSWORD"
# Environment-specific configurations
persistence:
enabled: true
# Production: use microk8s-hostpath or longhorn
storageClass: "longhorn" # Assuming Longhorn is available in production
size: "20Gi" # Larger storage for production email volume
# Resource allocations for production
resources:
admin:
requests:
cpu: "200m"
memory: "256Mi"
limits:
cpu: "1"
memory: "512Mi"
front:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "500m"
memory: "256Mi"
postfix:
requests:
cpu: "200m"
memory: "256Mi"
limits:
cpu: "1"
memory: "512Mi"
dovecot:
requests:
cpu: "200m"
memory: "256Mi"
limits:
cpu: "1"
memory: "512Mi"
rspamd:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "500m"
memory: "256Mi"
clamav:
requests:
cpu: "200m"
memory: "512Mi"
limits:
cpu: "1"
memory: "1Gi"
replicaCount: 1 # Can be increased in production as needed
# Security settings
secretKey: "generate-strong-key-here-for-production"
# Ingress configuration for production - disabled to use with existing ingress
ingress:
enabled: false # Disable chart's Ingress; use existing one
tls: false # Disable TLS in chart since ingress handles it
tlsFlavorOverride: notls # No TLS on internal NGINX; expect external proxy to handle TLS
realIpHeader: X-Forwarded-For # Header for client IP from your Ingress
realIpFrom: 0.0.0.0/0 # Trust all proxies (restrict to your Ingress pod CIDR for security)
path: /
pathType: ImplementationSpecific
# TLS flavor for production (uses Let's Encrypt)
tls:
flavor: "cert"
# Welcome message (enabled in production)
welcomeMessage:
enabled: true
subject: "Welcome to Bakewise.ai Email Service"
body: "Welcome to our email service. Please change your password and update your profile."
# Log level for production
logLevel: "WARNING"
# Enable antivirus in production
antivirus:
enabled: true
flavor: "clamav"
# Production-specific settings
env:
DEBUG: "false"
LOG_LEVEL: "WARNING"
TLS_FLAVOR: "cert"
REDIS_PASSWORD: "secure-redis-password"
# Enable monitoring in production
monitoring:
enabled: true
# Production-specific security settings
securityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
# Network policies for production
networkPolicy:
enabled: true
ingressController:
namespace: ingress-nginx
podSelector: |
matchLabels:
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/component: controller
monitoring:
namespace: monitoring
podSelector: |
matchLabels:
app: signoz-prometheus

View File

@@ -0,0 +1,269 @@
#!/bin/bash
# =============================================================================
# Mailu Production Deployment Script
# =============================================================================
# This script automates the deployment of Mailu mail server for production.
# It handles:
# 1. Unbound DNS deployment (for DNSSEC validation)
# 2. CoreDNS configuration (forward to Unbound)
# 3. TLS certificate secret creation
# 4. Admin credentials secret creation
# 5. Mailu Helm deployment (admin user created automatically via initialAccount)
#
# Usage:
# ./deploy-mailu-prod.sh [--domain DOMAIN] [--admin-password PASSWORD]
#
# Example:
# ./deploy-mailu-prod.sh --domain bakewise.ai --admin-password 'SecurePass123!'
# =============================================================================
set -e
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Default values
DOMAIN="${DOMAIN:-bakewise.ai}"
ADMIN_PASSWORD="${ADMIN_PASSWORD:-}"
NAMESPACE="bakery-ia"
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
MAILU_HELM_DIR="$(dirname "$SCRIPT_DIR")"
# Parse arguments
while [[ $# -gt 0 ]]; do
case $1 in
--domain)
DOMAIN="$2"
shift 2
;;
--admin-password)
ADMIN_PASSWORD="$2"
shift 2
;;
--help)
echo "Usage: $0 [--domain DOMAIN] [--admin-password PASSWORD]"
echo ""
echo "Options:"
echo " --domain Domain for Mailu (default: bakewise.ai)"
echo " --admin-password Password for admin@DOMAIN user"
echo ""
exit 0
;;
*)
echo -e "${RED}Unknown option: $1${NC}"
exit 1
;;
esac
done
print_step() {
echo -e "\n${BLUE}==>${NC} ${GREEN}$1${NC}"
}
print_warning() {
echo -e "${YELLOW}WARNING:${NC} $1"
}
print_error() {
echo -e "${RED}ERROR:${NC} $1"
}
print_success() {
echo -e "${GREEN}${NC} $1"
}
# =============================================================================
# Step 0: Prerequisites Check
# =============================================================================
print_step "Step 0: Checking prerequisites..."
if ! command -v kubectl &> /dev/null; then
print_error "kubectl not found. Please install kubectl."
exit 1
fi
if ! command -v helm &> /dev/null; then
print_error "helm not found. Please install helm."
exit 1
fi
if ! kubectl get namespace "$NAMESPACE" &>/dev/null; then
print_warning "Namespace $NAMESPACE does not exist. Creating..."
kubectl create namespace "$NAMESPACE"
fi
print_success "Prerequisites check passed"
# =============================================================================
# Step 1: Deploy Unbound DNS Resolver
# =============================================================================
print_step "Step 1: Deploying Unbound DNS resolver..."
if kubectl get deployment unbound -n "$NAMESPACE" &>/dev/null; then
print_success "Unbound already deployed"
else
helm upgrade --install unbound "$MAILU_HELM_DIR/../../networking/dns/unbound-helm" \
-n "$NAMESPACE" \
-f "$MAILU_HELM_DIR/../../networking/dns/unbound-helm/values.yaml" \
-f "$MAILU_HELM_DIR/../../networking/dns/unbound-helm/prod/values.yaml" \
--timeout 5m \
--wait
print_success "Unbound deployed"
fi
# Wait for Unbound to be ready
kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=unbound -n "$NAMESPACE" --timeout=120s
# Get Unbound service IP
UNBOUND_IP=$(kubectl get svc unbound-dns -n "$NAMESPACE" -o jsonpath='{.spec.clusterIP}')
echo "Unbound DNS service IP: $UNBOUND_IP"
# =============================================================================
# Step 2: Configure CoreDNS to Forward to Unbound
# =============================================================================
print_step "Step 2: Configuring CoreDNS for DNSSEC validation..."
# Check current CoreDNS forward configuration
CURRENT_FORWARD=$(kubectl get configmap coredns -n kube-system -o jsonpath='{.data.Corefile}' | grep -o 'forward \. [0-9.]*' | awk '{print $3}' || echo "")
if [ "$CURRENT_FORWARD" != "$UNBOUND_IP" ]; then
echo "Updating CoreDNS to forward to Unbound ($UNBOUND_IP)..."
kubectl patch configmap coredns -n kube-system --type merge -p "{
\"data\": {
\"Corefile\": \".:53 {\\n errors\\n health {\\n lameduck 5s\\n }\\n ready\\n kubernetes cluster.local in-addr.arpa ip6.arpa {\\n pods insecure\\n fallthrough in-addr.arpa ip6.arpa\\n ttl 30\\n }\\n prometheus :9153\\n forward . $UNBOUND_IP {\\n max_concurrent 1000\\n }\\n cache 30 {\\n disable success cluster.local\\n disable denial cluster.local\\n }\\n loop\\n reload\\n loadbalance\\n}\\n\"
}
}"
# Restart CoreDNS
kubectl rollout restart deployment coredns -n kube-system
kubectl rollout status deployment coredns -n kube-system --timeout=60s
print_success "CoreDNS configured to forward to Unbound"
else
print_success "CoreDNS already configured for Unbound"
fi
# =============================================================================
# Step 3: Create TLS Certificate Secret
# =============================================================================
print_step "Step 3: Creating TLS certificate secret..."
if kubectl get secret mailu-certificates -n "$NAMESPACE" &>/dev/null; then
print_success "TLS certificate secret already exists"
else
TEMP_DIR=$(mktemp -d)
cd "$TEMP_DIR"
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
-keyout tls.key -out tls.crt \
-subj "/CN=mail.$DOMAIN/O=$DOMAIN" 2>/dev/null
kubectl create secret tls mailu-certificates \
--cert=tls.crt \
--key=tls.key \
-n "$NAMESPACE"
rm -rf "$TEMP_DIR"
print_success "TLS certificate secret created"
fi
# =============================================================================
# Step 4: Create Admin Credentials Secret
# =============================================================================
print_step "Step 4: Creating admin credentials secret..."
if kubectl get secret mailu-admin-credentials -n "$NAMESPACE" &>/dev/null; then
print_success "Admin credentials secret already exists"
# Retrieve existing password for summary output
if [ -z "$ADMIN_PASSWORD" ]; then
ADMIN_PASSWORD=$(kubectl get secret mailu-admin-credentials -n "$NAMESPACE" -o jsonpath='{.data.password}' | base64 -d)
fi
else
if [ -z "$ADMIN_PASSWORD" ]; then
# Generate a random password
ADMIN_PASSWORD=$(openssl rand -base64 16 | tr -d '/+=' | head -c 16)
echo -e "${YELLOW}Generated admin password: $ADMIN_PASSWORD${NC}"
echo -e "${YELLOW}Please save this password securely!${NC}"
fi
kubectl create secret generic mailu-admin-credentials \
--from-literal=password="$ADMIN_PASSWORD" \
-n "$NAMESPACE"
print_success "Admin credentials secret created"
fi
# =============================================================================
# Step 5: Deploy Mailu via Helm
# =============================================================================
print_step "Step 5: Deploying Mailu via Helm..."
# Add Mailu Helm repository
helm repo add mailu https://mailu.github.io/helm-charts 2>/dev/null || true
helm repo update mailu
# Deploy Mailu
helm upgrade --install mailu mailu/mailu \
-n "$NAMESPACE" \
-f "$MAILU_HELM_DIR/values.yaml" \
-f "$MAILU_HELM_DIR/prod/values.yaml" \
--timeout 10m
print_success "Mailu Helm release deployed (admin user will be created automatically)"
# =============================================================================
# Step 6: Wait for Pods to be Ready
# =============================================================================
print_step "Step 6: Waiting for Mailu pods to be ready..."
echo "This may take 5-10 minutes (ClamAV takes time to initialize)..."
# Wait for admin pod first (it's the key dependency)
kubectl wait --for=condition=ready pod -l app.kubernetes.io/component=admin -n "$NAMESPACE" --timeout=300s || {
print_error "Admin pod failed to start. Checking logs..."
kubectl logs -n "$NAMESPACE" -l app.kubernetes.io/component=admin --tail=50
exit 1
}
print_success "Admin pod is ready"
# Show pod status
echo ""
echo "Mailu Pod Status:"
kubectl get pods -n "$NAMESPACE" | grep mailu
print_success "Admin user created automatically via Helm initialAccount"
# =============================================================================
# Summary
# =============================================================================
echo ""
echo "=============================================="
echo -e "${GREEN}Mailu Deployment Complete!${NC}"
echo "=============================================="
echo ""
echo "Admin Credentials:"
echo " Email: admin@$DOMAIN"
echo " Password: $ADMIN_PASSWORD"
echo ""
echo "Access URLs (configure Ingress/DNS first):"
echo " Admin Panel: https://mail.$DOMAIN/admin"
echo " Webmail: https://mail.$DOMAIN/webmail"
echo " SMTP: mail.$DOMAIN:587 (STARTTLS)"
echo " IMAP: mail.$DOMAIN:993 (SSL)"
echo ""
echo "Next Steps:"
echo " 1. Configure DNS records (A, MX, SPF, DMARC)"
echo " 2. Get DKIM key: kubectl exec -n $NAMESPACE deployment/mailu-admin -- cat /dkim/$DOMAIN.dkim.pub"
echo " 3. Add DKIM TXT record to DNS"
echo " 4. Configure Ingress for mail.$DOMAIN"
echo ""
echo "To check pod status:"
echo " kubectl get pods -n $NAMESPACE | grep mailu"
echo ""

View File

@@ -0,0 +1,235 @@
# Base Mailu Helm values for Bakery-IA
# Preserves critical configurations from the original Kustomize setup
# Global DNS configuration for DNSSEC validation
global:
# Using Unbound DNS resolver directly for DNSSEC validation
# Unbound service is available at unbound-dns.bakery-ia.svc.cluster.local
# Static ClusterIP configured in unbound-helm/values.yaml
custom_dns_servers: "10.96.53.53" # Unbound DNS static ClusterIP
# Domain configuration
domain: "DOMAIN_PLACEHOLDER"
hostnames:
- "mail.DOMAIN_PLACEHOLDER"
# Mailu version to match the original setup
mailuVersion: "2024.06"
# Secret key for authentication cookies
secretKey: "cb61b934d47029a64117c0e4110c93f66bbcf5eaa15c84c42727fad78f7"
# Timezone
timezone: "Etc/UTC"
# Postmaster configuration
postmaster: "admin"
# Initial admin account configuration
# This creates an admin user as part of the Helm deployment
# Credentials can be provided directly or via Kubernetes secret
initialAccount:
enabled: true
username: "admin"
domain: "" # Set in environment-specific values (dev/prod)
password: "" # Leave empty to use existingSecret
existingSecret: "mailu-admin-credentials"
existingSecretPasswordKey: "password"
mode: "ifmissing" # Only create if account doesn't exist
# TLS configuration
tls:
flavor: "notls" # Disable TLS for development
# Limits configuration
limits:
messageSizeLimitInMegabytes: 50
authRatelimit:
ip: "60/hour"
user: "100/day"
messageRatelimit:
value: "200/day"
# External relay configuration (Mailgun)
# Mailu will relay all outbound emails through Mailgun SMTP
# Credentials are loaded from Kubernetes secret for security
externalRelay:
host: "[smtp.mailgun.org]:587"
# Use existing secret for credentials (recommended for security)
secretName: "mailu-mailgun-credentials"
usernameKey: "RELAY_USERNAME"
passwordKey: "RELAY_PASSWORD"
# Webmail configuration
webmail:
enabled: true
type: "roundcube"
# Antivirus and antispam configuration
antivirus:
enabled: false # Disabled in dev to save resources
antispam:
enabled: true
flavor: "rspamd"
# Welcome message
welcomeMessage:
enabled: false # Disabled during development
# Logging
logLevel: "INFO"
# Network configuration
subnet: "10.42.0.0/16"
# Redis configuration - using internal Redis (built-in)
externalRedis:
enabled: false
# host: "redis-service.bakery-ia.svc.cluster.local"
# port: 6380
adminQuotaDbId: 15
adminRateLimitDbId: 15
rspamdDbId: 15
# Database configuration - using default SQLite (built-in)
externalDatabase:
enabled: false
# type: "postgresql"
# host: "postgres-service.bakery-ia.svc.cluster.local"
# port: 5432
# database: "mailu"
# username: "mailu"
# password: "E8Kz47YmVzDlHGs1M9wAbJzxcKnGONCT"
# Persistence configuration
persistence:
single_pvc: true
size: 10Gi
storageClass: ""
accessModes: [ReadWriteOnce]
# Ingress configuration - disabled to use with existing ingress
ingress:
enabled: false # Disable chart's Ingress; use existing one
tls: false # Disable TLS in chart since ingress handles it
tlsFlavorOverride: notls # No TLS on internal NGINX; expect external proxy to handle TLS
realIpHeader: X-Forwarded-For # Header for client IP from your Ingress
realIpFrom: 0.0.0.0/0 # Trust all proxies (restrict to your Ingress pod CIDR for security)
path: /
pathType: ImplementationSpecific
# Optional: Enable PROXY protocol for mail protocols if your Ingress supports TCP proxying
proxyProtocol:
smtp: false
smtps: false
submission: false
imap: false
imaps: false
pop3: false
pop3s: false
manageSieve: false
# Front configuration
front:
image:
tag: "2024.06"
replicaCount: 1
service:
type: ClusterIP
ports:
http: 80
https: 443
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 200m
memory: 256Mi
# Admin configuration
admin:
image:
tag: "2024.06"
replicaCount: 1
service:
type: ClusterIP
port: 80
resources:
requests:
cpu: 100m
memory: 256Mi
limits:
cpu: 300m
memory: 512Mi
# Postfix configuration
postfix:
image:
tag: "2024.06"
replicaCount: 1
service:
type: ClusterIP
ports:
smtp: 25
submission: 587
resources:
requests:
cpu: 100m
memory: 256Mi
limits:
cpu: 500m
memory: 512Mi
# Dovecot configuration
dovecot:
image:
tag: "2024.06"
replicaCount: 1
service:
type: ClusterIP
ports:
imap: 143
imaps: 993
resources:
requests:
cpu: 100m
memory: 256Mi
limits:
cpu: 500m
memory: 512Mi
# Rspamd configuration
rspamd:
image:
tag: "2024.06"
replicaCount: 1
service:
type: ClusterIP
ports:
rspamd: 11333
rspamd-admin: 11334
resources:
requests:
cpu: 200m
memory: 512Mi
limits:
cpu: 1000m
memory: 1Gi
# Network Policy
networkPolicy:
enabled: true
ingressController:
namespace: ingress-nginx
podSelector: |
matchLabels:
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/component: controller
# DNS Policy Configuration
# Use Kubernetes DNS (ClusterFirst) for internal service resolution
# DNSSEC validation for email is handled by rspamd component
# Note: For production with DNSSEC needs, configure CoreDNS to forward to Unbound
dnsPolicy: "ClusterFirst"

View File

@@ -0,0 +1,18 @@
apiVersion: v2
name: unbound
description: A Helm chart for deploying Unbound DNS resolver for Bakery-IA
type: application
version: 0.1.0
appVersion: "1.19.1"
maintainers:
- name: Bakery-IA Team
email: devops@bakery-ia.com
keywords:
- dns
- resolver
- caching
- unbound
home: https://www.nlnetlabs.nl/projects/unbound/
sources:
- https://github.com/NLnetLabs/unbound
- https://hub.docker.com/r/mvance/unbound

View File

@@ -0,0 +1,64 @@
# Development values for unbound DNS resolver
# Using same configuration as production for consistency
# Use official image for development (same as production)
image:
repository: "mvance/unbound"
tag: "latest"
pullPolicy: "IfNotPresent"
# Resource settings (slightly lower than production for dev)
resources:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "300m"
memory: "384Mi"
# Single replica for development (can be scaled if needed)
replicaCount: 1
# Development annotations
podAnnotations:
environment: "development"
managed-by: "helm"
# Probe settings (same as production but slightly faster)
probes:
readiness:
initialDelaySeconds: 10
periodSeconds: 30
command: "drill @127.0.0.1 -p 53 example.org || echo 'DNS query test'"
liveness:
initialDelaySeconds: 30
periodSeconds: 60
command: "drill @127.0.0.1 -p 53 example.org || echo 'DNS query test'"
# Custom Unbound forward records for Kubernetes DNS
config:
enabled: true
# The mvance/unbound image includes forward-records.conf
# We need to add Kubernetes-specific forwarding zones
forwardRecords: |
# Forward all queries to Cloudflare with DNSSEC (catch-all)
forward-zone:
name: "."
forward-tls-upstream: yes
forward-addr: 1.1.1.1@853#cloudflare-dns.com
forward-addr: 1.0.0.1@853#cloudflare-dns.com
# Additional server config to mark cluster.local as insecure (no DNSSEC)
# and use stub zones for Kubernetes internal DNS (more reliable than forward)
serverConfig: |
domain-insecure: "cluster.local."
private-domain: "cluster.local."
local-zone: "10.in-addr.arpa." nodefault
stub-zone:
name: "cluster.local."
stub-addr: 10.96.0.10
stub-zone:
name: "10.in-addr.arpa."
stub-addr: 10.96.0.10
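# To spot-check DNSSEC validation from inside the cluster (pod name and image are
# illustrative; 10.96.53.53 is the static ClusterIP the Mailu values point at):
#   kubectl run dnssec-check --rm -it --restart=Never --image=alpine:3.20 -n bakery-ia -- \
#     sh -c "apk add -q bind-tools && dig +dnssec org @10.96.53.53 | grep flags"
# A validating resolver sets the "ad" flag in the answer for correctly signed zones.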

View File

@@ -0,0 +1,50 @@
# Production-specific values for unbound DNS resolver
# Overrides for the production environment
# Use official image for production
image:
repository: "mvance/unbound"
tag: "latest"
pullPolicy: "IfNotPresent"
# Production resource settings (higher limits for reliability)
resources:
requests:
cpu: "200m"
memory: "256Mi"
limits:
cpu: "500m"
memory: "512Mi"
# Production-specific settings
replicaCount: 2
# Production annotations
podAnnotations:
environment: "production"
critical: "true"
# Anti-affinity for high availability in production
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app.kubernetes.io/name
operator: In
values:
- unbound
topologyKey: "kubernetes.io/hostname"
# Production probe settings (more conservative)
probes:
readiness:
initialDelaySeconds: 15
periodSeconds: 30
command: "drill @127.0.0.1 -p 53 example.org || echo 'DNS query test'"
liveness:
initialDelaySeconds: 45
periodSeconds: 60
command: "drill @127.0.0.1 -p 53 example.org || echo 'DNS query test'"

View File

@@ -0,0 +1,63 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "unbound.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
*/}}
{{- define "unbound.fullname" -}}
{{- if .Values.fullnameOverride -}}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- $name := default .Chart.Name .Values.nameOverride -}}
{{- if contains $name .Release.Name -}}
{{- .Release.Name | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{- end -}}
{{- end -}}
{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "unbound.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{/*
Common labels
*/}}
{{- define "unbound.labels" -}}
helm.sh/chart: {{ include "unbound.chart" . }}
{{ include "unbound.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end -}}
{{/*
Selector labels
*/}}
{{- define "unbound.selectorLabels" -}}
app.kubernetes.io/name: {{ include "unbound.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/component: dns
app.kubernetes.io/part-of: bakery-ia
{{- end -}}
{{/*
Create the name of the service account to use
*/}}
{{- define "unbound.serviceAccountName" -}}
{{- if .Values.serviceAccount.create -}}
{{ default (include "unbound.fullname" .) .Values.serviceAccount.name }}
{{- else -}}
{{ default "default" .Values.serviceAccount.name }}
{{- end -}}
{{- end -}}

View File

@@ -0,0 +1,22 @@
{{- if .Values.config.enabled }}
apiVersion: v1
kind: ConfigMap
metadata:
name: {{ include "unbound.fullname" . }}-config
namespace: {{ .Values.global.namespace }}
labels:
{{- include "unbound.labels" . | nindent 4 }}
data:
{{- if .Values.config.forwardRecords }}
forward-records.conf: |
{{ .Values.config.forwardRecords | indent 4 }}
{{- end }}
{{- if .Values.config.serverConfig }}
a-records.conf: |
{{ .Values.config.serverConfig | indent 4 }}
{{- end }}
{{- if .Values.config.content }}
unbound.conf: |
{{ .Values.config.content | indent 4 }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,117 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "unbound.fullname" . }}
namespace: {{ .Values.global.namespace }}
labels:
{{- include "unbound.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.replicaCount }}
selector:
matchLabels:
{{- include "unbound.selectorLabels" . | nindent 6 }}
template:
metadata:
{{- with .Values.podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "unbound.selectorLabels" . | nindent 8 }}
spec:
{{- with .Values.imagePullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
serviceAccountName: {{ include "unbound.serviceAccountName" . }}
securityContext:
{{- toYaml .Values.podSecurityContext | nindent 8 }}
containers:
- name: {{ .Chart.Name }}
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- name: dns-udp
containerPort: {{ .Values.service.ports.dnsUdp }}
protocol: UDP
- name: dns-tcp
containerPort: {{ .Values.service.ports.dnsTcp }}
protocol: TCP
{{- if .Values.probes.readiness.enabled }}
readinessProbe:
exec:
command:
- sh
- -c
- {{ .Values.probes.readiness.command | quote }}
initialDelaySeconds: {{ .Values.probes.readiness.initialDelaySeconds }}
periodSeconds: {{ .Values.probes.readiness.periodSeconds }}
{{- end }}
{{- if .Values.probes.liveness.enabled }}
livenessProbe:
exec:
command:
- sh
- -c
- {{ .Values.probes.liveness.command | quote }}
initialDelaySeconds: {{ .Values.probes.liveness.initialDelaySeconds }}
periodSeconds: {{ .Values.probes.liveness.periodSeconds }}
{{- end }}
resources:
{{- toYaml .Values.resources | nindent 12 }}
volumeMounts:
{{- if .Values.config.enabled }}
{{- if .Values.config.forwardRecords }}
- name: unbound-config
mountPath: /opt/unbound/etc/unbound/forward-records.conf
subPath: forward-records.conf
{{- end }}
{{- if .Values.config.serverConfig }}
- name: unbound-config
mountPath: /opt/unbound/etc/unbound/a-records.conf
subPath: a-records.conf
{{- end }}
{{- if .Values.config.content }}
- name: unbound-config
mountPath: /opt/unbound/etc/unbound/unbound.conf
subPath: unbound.conf
{{- end }}
{{- end }}
{{- with .Values.volumeMounts }}
{{- toYaml . | nindent 12 }}
{{- end }}
          {{- with .Values.env }}
          env:
            {{- toYaml . | nindent 12 }}
          {{- end }}
        {{- /* Extra sidecar containers are appended inside the containers list; a second "containers:" key would be invalid */}}
        {{- with .Values.extraContainers }}
        {{- toYaml . | nindent 8 }}
        {{- end }}
volumes:
{{- if .Values.config.enabled }}
- name: unbound-config
configMap:
name: {{ include "unbound.fullname" . }}-config
{{- end }}
{{- with .Values.volumes }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.extraInitContainers }}
initContainers:
{{- toYaml . | nindent 8 }}
{{- end }}

View File

@@ -0,0 +1,27 @@
apiVersion: v1
kind: Service
metadata:
name: {{ .Values.global.dnsServiceName }}
namespace: {{ .Values.global.namespace }}
labels:
{{- include "unbound.labels" . | nindent 4 }}
{{- with .Values.serviceAnnotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
type: {{ .Values.service.type }}
{{- if .Values.service.clusterIP }}
clusterIP: {{ .Values.service.clusterIP }}
{{- end }}
ports:
- name: dns-udp
port: {{ .Values.service.ports.dnsUdp }}
targetPort: {{ .Values.service.ports.dnsUdp }}
protocol: UDP
- name: dns-tcp
port: {{ .Values.service.ports.dnsTcp }}
targetPort: {{ .Values.service.ports.dnsTcp }}
protocol: TCP
selector:
{{- include "unbound.selectorLabels" . | nindent 4 }}

View File

@@ -0,0 +1,13 @@
{{- if .Values.serviceAccount.create -}}
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "unbound.serviceAccountName" . }}
namespace: {{ .Values.global.namespace }}
labels:
{{- include "unbound.labels" . | nindent 4 }}
{{- with .Values.serviceAccount.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
{{- end -}}

Some files were not shown because too many files have changed in this diff.