# Bakery-IA Production CI/CD Implementation Plan
## Document Overview
**Status**: Draft
**Version**: 1.0
**Date**: 2024-07-15
**Author**: Mistral Vibe
This document outlines the production-grade CI/CD architecture for Bakery-IA and provides a step-by-step implementation plan without requiring immediate code changes.
## Table of Contents
1. [Current State Analysis](#current-state-analysis)
2. [Target Architecture](#target-architecture)
3. [Implementation Strategy](#implementation-strategy)
4. [Phase 1: Infrastructure Setup](#phase-1-infrastructure-setup)
5. [Phase 2: CI/CD Pipeline Configuration](#phase-2-cicd-pipeline-configuration)
6. [Phase 3: Monitoring and Observability](#phase-3-monitoring-and-observability)
7. [Phase 4: Testing and Validation](#phase-4-testing-and-validation)
8. [Phase 5: Rollout and Migration](#phase-5-rollout-and-migration)
9. [Risk Assessment](#risk-assessment)
10. [Success Metrics](#success-metrics)
11. [Appendices](#appendices)
---
## Current State Analysis
### Existing Infrastructure
- **Microservices**: 19 services in `services/` directory
- **Frontend**: React application in `frontend/`
- **Gateway**: API gateway in `gateway/`
- **Databases**: 22 PostgreSQL instances + Redis + RabbitMQ
- **Storage**: MinIO for object storage
- **Monitoring**: SigNoz already deployed
- **Target Platform**: MicroK8s on Clouding.io VPS
### Current Deployment Process
- Manual builds using Tiltfile/Skaffold (local only)
- Manual image pushes to local registry or Docker Hub
- Manual kubectl apply commands
- No automated testing gates
- No rollback mechanism
### Pain Points
- "Works on my machine" issues
- No audit trail of deployments
- Time-consuming manual processes
- Risk of human error
- No automated testing in pipeline
---
## Target Architecture
### High-Level Architecture Diagram
```mermaid
graph TD
A[Developer Workstation] -->|Push Code| B[Gitea Git Server]
B -->|Webhook| C[Tekton Pipelines]
C -->|Build/Test| D[Gitea Container Registry]
D -->|New Image| E[Flux CD]
E -->|Git Commit| B
E -->|kubectl apply| F[MicroK8s Cluster]
F -->|Metrics/Logs| G[SigNoz Monitoring]
```
### How CI/CD Tools Run in Kubernetes
Gitea, Tekton, and Flux each ship as container images and run as ordinary pods in your MicroK8s cluster, just like your application services.
```mermaid
graph TB
subgraph "MicroK8s Cluster (Your VPS)"
subgraph "Namespace: gitea"
A1[Pod: gitea<br/>Image: gitea/gitea:latest]
A2[Pod: gitea-postgresql<br/>Image: postgres:15]
A3[PVC: gitea-data]
end
subgraph "Namespace: tekton-pipelines"
B1[Pod: tekton-pipelines-controller<br/>Image: gcr.io/tekton-releases/...]
B2[Pod: tekton-pipelines-webhook<br/>Image: gcr.io/tekton-releases/...]
B3[Pod: tekton-triggers-controller<br/>Image: gcr.io/tekton-releases/...]
end
subgraph "Namespace: flux-system"
C1[Pod: source-controller<br/>Image: ghcr.io/fluxcd/...]
C2[Pod: kustomize-controller<br/>Image: ghcr.io/fluxcd/...]
C3[Pod: helm-controller<br/>Image: ghcr.io/fluxcd/...]
end
subgraph "Namespace: bakery-ia (YOUR APP)"
D1[19 services + 22 databases + Redis + RabbitMQ + MinIO]
end
end
```
### Component Breakdown
#### 1. Gitea (Git Server + Registry)
- **Purpose**: Replace GitHub dependency
- **Namespace**: `gitea`
- **Resources**: ~768MB RAM (512MB Gitea + 256MB PostgreSQL)
- **Storage**: PVC for repositories and registry
- **Access**: Internal DNS `gitea.bakery-ia.local`
- **High availability**: Gitea has no built-in leader election; this plan runs it as a single replica (`gitea-0`)
#### 2. Tekton (CI Pipelines)
- **Purpose**: Build, test, and push container images
- **Namespace**: `tekton-pipelines`
- **Resources**: ~650MB baseline + 512MB per build
- **Key Features**:
  - Path-based change detection
  - Parallel builds for independent services
  - Kaniko for in-cluster image building
  - Integration with Gitea registry
- **Leader election**: Tekton controllers use leader election to support high-availability (multi-replica) deployments
#### 3. Flux CD (GitOps Deployment)
- **Purpose**: Automated deployments from Git
- **Namespace**: `flux-system`
- **Resources**: ~230MB baseline
- **Key Features**:
  - Pull-based deployments (no webhooks needed)
  - Kustomize support for your existing overlays
  - Image automation for rolling updates
  - Drift detection and correction
- **Leader election**: Flux controllers use leader election so that only one instance of each controller is active at a time
#### 4. SigNoz (Monitoring)
- **Purpose**: Observability for CI/CD and applications
- **Integration Points**:
  - Tekton pipeline metrics
  - Flux reconciliation events
  - Kubernetes resource metrics
  - Application performance monitoring
### Deployment Methods for Each Tool
#### 1. Flux (Easiest - Built into MicroK8s)
```bash
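# MicroK8s ships Flux as an addon. On recent releases it lives in the community
# addon repository, so enable that first (assumption: confirm with `microk8s status`)
microk8s enable community
microk8s enable fluxcd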
# This creates:
# - Namespace: flux-system
# - Deployments: source-controller, kustomize-controller, helm-controller, notification-controller
# - CRDs: GitRepository, Kustomization, HelmRelease, etc.
```
**Images pulled:**
- `ghcr.io/fluxcd/source-controller:v1.x.x`
- `ghcr.io/fluxcd/kustomize-controller:v1.x.x`
- `ghcr.io/fluxcd/helm-controller:v0.x.x`
- `ghcr.io/fluxcd/notification-controller:v1.x.x`
#### 2. Tekton (kubectl apply or Helm)
```bash
# Option A: Direct apply (official releases)
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml
kubectl apply -f https://storage.googleapis.com/tekton-releases/triggers/latest/release.yaml
kubectl apply -f https://storage.googleapis.com/tekton-releases/dashboard/latest/release.yaml
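# Option B: Helm chart
# NOTE: unverified. The Tekton project documents the release manifests above (or the
# Tekton Operator); confirm that this chart repository exists before relying on it.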
helm repo add tekton https://tekton.dev/charts
helm install tekton-pipelines tekton/tekton-pipelines -n tekton-pipelines --create-namespace
```
**Images pulled:**
- `gcr.io/tekton-releases/github.com/tektoncd/pipeline/cmd/controller:v0.x.x`
- `gcr.io/tekton-releases/github.com/tektoncd/pipeline/cmd/webhook:v0.x.x`
- `gcr.io/tekton-releases/github.com/tektoncd/triggers/cmd/controller:v0.x.x`
- `gcr.io/tekton-releases/github.com/tektoncd/dashboard/cmd/dashboard:v0.x.x`
#### 3. Gitea (Helm chart)
```bash
# Add Helm repo
helm repo add gitea https://dl.gitea.com/charts/
# Install with custom values
helm install gitea gitea/gitea \
-n gitea --create-namespace \
-f gitea-values.yaml
```
**Images pulled:**
- `gitea/gitea:1.x.x`
- `postgres:15-alpine` (or bundled)
---
## Complete Deployment Architecture
```mermaid
graph TB
subgraph "Your Git Repository<br/>(Initially in GitHub, then Gitea)"
A[bakery-ia/<br/>├── services/<br/>├── frontend/<br/>├── gateway/<br/>├── infrastructure/<br/>│ ├── kubernetes/<br/>│ │ ├── base/<br/>│ │ └── overlays/<br/>│ │ ├── dev/<br/>│ │ └── prod/<br/>│ └── ci-cd/<br/>│ ├── gitea/<br/>│ ├── tekton/<br/>│ └── flux/<br/>└── tekton/<br/> └── pipeline.yaml]
end
A --> B[Gitea<br/>Self-hosted Git<br/>Stores code<br/>Triggers webhook]
B --> C[Tekton<br/>EventListener<br/>TriggerTemplate<br/>PipelineRun]
C --> D[Pipeline Steps<br/>├── clone<br/>├── detect changes<br/>├── test<br/>├── build<br/>└── push]
D --> E[Gitea Registry<br/>gitea:5000/bakery/<br/>auth-service:abc123]
E --> F[Flux<br/>source-controller<br/>kustomize-controller<br/>kubectl apply]
F --> G[Your Application<br/>bakery-ia namespace<br/>Updated services]
```
## Implementation Strategy
### Guiding Principles
1. **No Code Changes Required**: Use existing codebase as-is
2. **Incremental Rollout**: Phase-based implementation
3. **Zero Downtime**: Parallel run with existing manual process
4. **Observability First**: Monitor before automating
5. **Security by Design**: Secrets management from day one
### Implementation Phases
```mermaid
gantt
title CI/CD Implementation Timeline
dateFormat YYYY-MM-DD
section Phase 1: Infrastructure
Infrastructure Setup :a1, 2024-07-15, 7d
section Phase 2: CI/CD Config
Pipeline Configuration :a2, 2024-07-22, 10d
section Phase 3: Monitoring
SigNoz Integration :a3, 2024-08-01, 5d
section Phase 4: Testing
Validation Testing :a4, 2024-08-06, 7d
section Phase 5: Rollout
Production Migration :a5, 2024-08-13, 5d
```
## Step-by-Step: How to Deploy CI/CD to Production
### Phase 1: Bootstrap (One-time setup on VPS)
```bash
# SSH to your VPS
ssh user@your-clouding-vps
# 1. Enable Flux (MicroK8s addon; on recent releases enable the community repository first)
microk8s enable community
microk8s enable fluxcd
# 2. Install Tekton
microk8s kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml
microk8s kubectl apply -f https://storage.googleapis.com/tekton-releases/triggers/latest/release.yaml
# 3. Install Gitea via Helm
microk8s helm repo add gitea https://dl.gitea.com/charts/
microk8s helm install gitea gitea/gitea -n gitea --create-namespace -f gitea-values.yaml
# 4. Verify all running
microk8s kubectl get pods -A | grep -E "gitea|tekton|flux"
```
After this, you have:
```
NAMESPACE NAME READY STATUS
gitea gitea-0 1/1 Running
gitea gitea-postgresql-0 1/1 Running
tekton-pipelines tekton-pipelines-controller-xxx 1/1 Running
tekton-pipelines tekton-pipelines-webhook-xxx 1/1 Running
tekton-pipelines tekton-triggers-controller-xxx 1/1 Running
flux-system source-controller-xxx 1/1 Running
flux-system kustomize-controller-xxx 1/1 Running
flux-system helm-controller-xxx 1/1 Running
```
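The same check can be scripted so that a bootstrap run fails loudly instead of relying on reading the table by eye. The following is a minimal sketch, not part of the original plan: `not_running` is a hypothetical helper, and it assumes the default column order of `kubectl get pods -A --no-headers` (namespace, name, ready, status).

```bash
#!/usr/bin/env bash

# not_running: reads `kubectl get pods -A --no-headers` rows on stdin and
# prints "namespace/name (status)" for every pod that is neither Running
# nor Completed. Column 4 is assumed to be STATUS.
not_running() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 " (" $4 ")" }'
}

# Example usage (requires cluster access):
#   microk8s kubectl get pods -A --no-headers \
#     | grep -E "gitea|tekton|flux" \
#     | not_running
```

An empty result means every CI/CD pod reported a healthy status; any printed line names a pod worth inspecting with `kubectl describe`.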
### Phase 2: Configure Flux to Watch Your Repo
```yaml
# infrastructure/ci-cd/flux/gitrepository.yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: bakery-ia
  namespace: flux-system
spec:
  interval: 1m
  url: https://gitea.bakery-ia.local/bakery/bakery-ia.git
  ref:
    branch: main
  secretRef:
    name: gitea-credentials # Git credentials
---
# infrastructure/ci-cd/flux/kustomization.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: bakery-ia-prod
  namespace: flux-system
spec:
  interval: 5m
  path: ./infrastructure/kubernetes/overlays/prod
  prune: true
  sourceRef:
    kind: GitRepository
    name: bakery-ia
  targetNamespace: bakery-ia
```
### Phase 3: Configure Tekton Pipeline
```yaml
# tekton/pipeline.yaml
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: bakery-ia-ci
  namespace: tekton-pipelines
spec:
  params:
    - name: git-url
    - name: git-revision
    - name: changed-services
      type: array
  workspaces:
    - name: source
    - name: docker-credentials
  tasks:
    - name: clone
      taskRef:
        name: git-clone
      workspaces:
        - name: output
          workspace: source
      params:
        - name: url
          value: $(params.git-url)
        - name: revision
          value: $(params.git-revision)
    - name: detect-changes
      runAfter: [clone]
      taskRef:
        name: detect-changed-services
      workspaces:
        - name: source
          workspace: source
    - name: build-and-push
      runAfter: [detect-changes]
      taskRef:
        name: kaniko-build
      params:
        - name: services
          value: $(tasks.detect-changes.results.changed-services)
        # The build task tags images with the revision, so pass it through
        - name: git-revision
          value: $(params.git-revision)
      workspaces:
        - name: source
          workspace: source
        - name: docker-credentials
          workspace: docker-credentials
```
## Visual: Complete Production Flow
```mermaid
graph LR
A[Developer pushes code] --> B[Gitea<br/>Self-hosted Git<br/>• Receives push<br/>• Stores code<br/>• Triggers webhook]
B -->|webhook POST to tekton-triggers| C[Tekton<br/>EventListener<br/>TriggerTemplate<br/>PipelineRun]
C --> D[Pipeline Steps<br/>Each step = container in pod:<br/>├── clone<br/>├── detect changes<br/>├── test (pytest)<br/>├── build (kaniko)<br/>└── push (registry)]
D --> E[Only changed services]
D --> F[Gitea Registry<br/>gitea:5000/bakery/<br/>auth-service:abc123]
F -->|Final step: Update image tag in Git<br/>commits new tag to infrastructure/kubernetes/overlays/prod| G[Git commit triggers Flux]
G --> H[Flux<br/>source-controller<br/>kustomize-controller<br/>kubectl apply<br/>• Detects new image tag in Git<br/>• Renders Kustomize overlay<br/>• Applies to bakery-ia namespace<br/>• Rolling update of changed services]
H --> I[Your Application<br/>Namespace: bakery-ia<br/>├── auth-service:abc123 ←NEW<br/>├── tenant-svc:def456<br/>└── training-svc:ghi789<br/>Only auth-service was updated (others unchanged)]
```
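The "update image tag in Git" step in the flow above is not specified further in this plan. One possible sketch, assuming the prod overlay pins images through a Kustomize `images:` list with `name`/`newTag` entries (an assumption; the overlay files are not shown here), is a small helper that rewrites a single tag before the pipeline commits. `set_image_tag` is a hypothetical name.

```bash
#!/usr/bin/env bash

# set_image_tag FILE IMAGE TAG
# Rewrites the newTag that follows "- name: IMAGE" in a Kustomize images: list.
# Assumes each entry is written as "- name: ..." followed by "newTag: ...".
set_image_tag() {
  local file=$1 image=$2 tag=$3 tmp
  tmp=$(mktemp)
  awk -v image="$image" -v tag="$tag" '
    # Track whether the most recent "- name:" entry is the image we want.
    $1 == "-" && $2 == "name:" { in_target = ($3 == image) }
    # Replace only the newTag belonging to that entry, keeping its indentation.
    in_target && $1 == "newTag:" {
      match($0, /^[ \t]*/)
      print substr($0, 1, RLENGTH) "newTag: " tag
      in_target = 0
      next
    }
    { print }
  ' "$file" > "$tmp"
  mv "$tmp" "$file"
}

# Example usage (paths and image name are illustrative):
#   set_image_tag infrastructure/kubernetes/overlays/prod/kustomization.yaml \
#     gitea.bakery-ia.local:5000/bakery/auth-service abc123
```

Because only the matching entry changes, the resulting Git diff touches one line per rebuilt service, which keeps the Flux-triggered rollout limited to services that were actually rebuilt.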
## Where Images Come From
| Component | Image Source | Notes |
|-----------|--------------|-------|
| Flux | ghcr.io/fluxcd/* | Pulled once, cached locally |
| Tekton | gcr.io/tekton-releases/* | Pulled once, cached locally |
| Gitea | gitea/gitea (Docker Hub) | Pulled once, cached locally |
| Your Services | gitea.bakery-ia.local:5000/bakery/* | Built by Tekton, stored in Gitea registry |
| Build Tools | gcr.io/kaniko-project/executor | Used during builds only |
## Summary: What Lives Where
```mermaid
graph TB
subgraph "MicroK8s Cluster"
subgraph "Namespace: gitea (CI/CD Infrastructure)<br/>~768MB total"
A1[gitea pod ~512MB RAM]
A2[postgresql pod ~256MB RAM]
end
subgraph "Namespace: tekton-pipelines (CI/CD Infrastructure)<br/>~650MB baseline"
B1[pipelines-controller ~200MB RAM]
B2[pipelines-webhook ~100MB RAM]
B3[triggers-controller ~150MB RAM]
B4[triggers-webhook ~100MB RAM]
end
subgraph "Namespace: flux-system (CI/CD Infrastructure)<br/>~230MB baseline"
C1[source-controller ~50MB RAM]
C2[kustomize-controller ~50MB RAM]
C3[helm-controller ~50MB RAM]
C4[notification-controller ~30MB RAM]
end
subgraph "Namespace: bakery-ia (YOUR APPLICATION)"
D1[19 microservices]
D2[22 PostgreSQL databases]
D3[Redis]
D4[RabbitMQ]
D5[MinIO]
end
end
note1["CI/CD Total: ~1.6GB baseline"]
note2["During builds: +512MB per concurrent build (Tekton spawns pods)"]
```
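To sanity-check VPS sizing, the per-namespace figures quoted in this document can be added up for a given number of concurrent builds. This is a back-of-the-envelope sketch using only those estimates (768MB Gitea, 650MB Tekton, 230MB Flux, 512MB per build); `cicd_memory_mb` is a hypothetical helper, and real usage will vary.

```bash
#!/usr/bin/env bash

# cicd_memory_mb [CONCURRENT_BUILDS]
# Prints the estimated CI/CD memory footprint in MB, using the baseline
# estimates from this plan plus 512MB for each concurrent build pod.
cicd_memory_mb() {
  local builds=${1:-0}
  local gitea=768 tekton=650 flux=230 per_build=512
  echo $(( gitea + tekton + flux + per_build * builds ))
}

# Example usage:
#   cicd_memory_mb      # baseline only
#   cicd_memory_mb 2    # baseline plus two concurrent builds
```

With the limit of two concurrent builds proposed in the mitigation plan, the estimate rises by 1024MB over the baseline, which is the headroom to reserve on the VPS.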
### Key Points
- Everything runs as pods - Gitea, Tekton, Flux are all containerized
- Pulled from public registries once - then cached on your VPS
- Your app images stay local - built by Tekton, stored in Gitea registry
- No external dependencies after setup - fully self-contained
- Flux pulls from Git - no incoming webhooks needed for deployments
---
## Phase 1: Infrastructure Setup
### Objective
Deploy CI/CD infrastructure components without affecting existing applications.
### Step-by-Step Implementation
#### Step 1: Prepare MicroK8s Cluster
```bash
# SSH to VPS
ssh admin@bakery-ia-vps
# Verify MicroK8s status
microk8s status
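# Enable required addons (hostpath-storage replaces the deprecated "storage" alias;
# fluxcd comes from the community repository on recent releases)
microk8s enable dns
microk8s enable hostpath-storage
microk8s enable ingress
microk8s enable community
microk8s enable fluxcd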
# Verify storage class
microk8s kubectl get storageclass
```
#### Step 2: Deploy Gitea
**Create Gitea values file** (`infrastructure/ci-cd/gitea/values.yaml`):
```yaml
service:
  type: ClusterIP
  httpPort: 3000
  sshPort: 2222
persistence:
  enabled: true
  size: 50Gi
  storageClass: "microk8s-hostpath"
gitea:
  config:
    server:
      DOMAIN: gitea.bakery-ia.local
      SSH_DOMAIN: gitea.bakery-ia.local
      ROOT_URL: http://gitea.bakery-ia.local
    repository:
      ENABLE_PUSH_CREATE_USER: true
      ENABLE_PUSH_CREATE_ORG: true
    # The container registry is part of Gitea's package registry ([packages] section)
    packages:
      ENABLED: true
postgresql:
  enabled: true
  persistence:
    size: 20Gi
```
**Deploy Gitea**:
```bash
# Add Helm repo
microk8s helm repo add gitea https://dl.gitea.com/charts/
# Create namespace
microk8s kubectl create namespace gitea
# Install Gitea
microk8s helm install gitea gitea/gitea \
-n gitea \
-f infrastructure/ci-cd/gitea/values.yaml
```
**Verify Deployment**:
```bash
# Check pods
microk8s kubectl get pods -n gitea
# Get admin password
microk8s kubectl get secret -n gitea gitea-admin-secret -o jsonpath='{.data.password}' | base64 -d
```
#### Step 3: Configure Ingress for Gitea
**Create Ingress Resource** (`infrastructure/ci-cd/gitea/ingress.yaml`):
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: gitea-ingress
  namespace: gitea
  # No rewrite-target annotation: with path "/" it would rewrite every request to "/"
spec:
  rules:
    - host: gitea.bakery-ia.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: gitea-http
                port:
                  number: 3000
```
**Apply Ingress**:
```bash
microk8s kubectl apply -f infrastructure/ci-cd/gitea/ingress.yaml
```
#### Step 4: Migrate Repository from GitHub
**Manual Migration Steps**:
1. Create new repository in Gitea UI
2. Use git mirror to push existing repo:
```bash
# Clone bare repo from GitHub
git clone --bare git@github.com:your-org/bakery-ia.git
# Push to Gitea
cd bakery-ia.git
git push --mirror http://admin:PASSWORD@gitea.bakery-ia.local/your-org/bakery-ia.git
```
#### Step 5: Deploy Tekton
**Install Tekton Pipelines**:
```bash
# Create namespace
microk8s kubectl create namespace tekton-pipelines
# Install Tekton Pipelines
microk8s kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/latest/release.yaml
# Install Tekton Triggers
microk8s kubectl apply -f https://storage.googleapis.com/tekton-releases/triggers/latest/release.yaml
# Install Tekton Dashboard (optional)
microk8s kubectl apply -f https://storage.googleapis.com/tekton-releases/dashboard/latest/release.yaml
```
**Verify Installation**:
```bash
microk8s kubectl get pods -n tekton-pipelines
```
#### Step 6: Configure Tekton for Gitea Integration
**Create Gitea Webhook Secret**:
```bash
# Generate webhook secret
WEBHOOK_SECRET=$(openssl rand -hex 20)
# Create secret
microk8s kubectl create secret generic gitea-webhook-secret \
-n tekton-pipelines \
--from-literal=secretToken=$WEBHOOK_SECRET
```
**Configure Gitea Webhook**:
1. Go to Gitea repository settings
2. Add webhook:
   - URL: `http://el-bakery-ia-listener.tekton-pipelines.svc.cluster.local:8080` (Tekton Triggers exposes each EventListener as a Service named `el-<listener-name>`)
   - Secret: Use the generated `WEBHOOK_SECRET`
   - Trigger: Push events
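The shared secret only helps if the receiving side checks it. Gitea signs each webhook body with HMAC-SHA256 and sends the hex digest in the `X-Gitea-Signature` header. The sketch below shows how that digest can be recomputed for debugging a delivery by hand; the helper names are hypothetical, and it assumes `openssl` is available (the plan already uses it to generate the secret).

```bash
#!/usr/bin/env bash

# gitea_signature SECRET
# Prints the HMAC-SHA256 hex digest of the request body read from stdin.
# openssl prefixes the digest with a label, so only the last field is kept.
gitea_signature() {
  local secret=$1
  openssl dgst -sha256 -hmac "$secret" | awk '{ print $NF }'
}

# verify_gitea_signature SECRET EXPECTED_HEX
# Succeeds when the body on stdin hashes to EXPECTED_HEX.
# Note: a plain string comparison is fine for debugging, not for production
# request handling, where a constant-time comparison is preferable.
verify_gitea_signature() {
  local secret=$1 expected=$2 actual
  actual=$(gitea_signature "$secret")
  [ "$actual" = "$expected" ]
}

# Example usage (payload.json saved from Gitea's "Recent Deliveries" view):
#   verify_gitea_signature "$WEBHOOK_SECRET" "<X-Gitea-Signature value>" < payload.json
```

The body must be hashed byte for byte as delivered; re-serialized JSON will not match.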
#### Step 7: Verify Flux Installation
**Check Flux Components**:
```bash
microk8s kubectl get pods -n flux-system
# Verify CRDs
microk8s kubectl get crd | grep flux
```
---
## Phase 2: CI/CD Pipeline Configuration
### Objective
Configure pipelines to build, test, and deploy services automatically.
### Step-by-Step Implementation
#### Step 1: Create Tekton Tasks
**Git Clone Task** (`infrastructure/ci-cd/tekton/tasks/git-clone.yaml`):
```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: git-clone
  namespace: tekton-pipelines
spec:
  workspaces:
    - name: output
  params:
    - name: url
      type: string
    - name: revision
      type: string
      default: "main"
  steps:
    - name: clone
      image: alpine/git
      script: |
        git clone $(params.url) $(workspaces.output.path)
        cd $(workspaces.output.path)
        git checkout $(params.revision)
```
**Detect Changed Services Task** (`infrastructure/ci-cd/tekton/tasks/detect-changes.yaml`):
```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: detect-changed-services
  namespace: tekton-pipelines
spec:
  workspaces:
    - name: source
  results:
    - name: changed-services
      description: Comma-separated list of changed services
  steps:
    - name: detect
      image: alpine/git
      script: |
        #!/bin/sh
        # POSIX sh only: alpine/git ships BusyBox ash, which has no bash arrays or [[ ]]
        cd $(workspaces.source.path)
        # Map changed files under services/ to unique, comma-separated service names
        CHANGED_SERVICES=$(git diff --name-only HEAD~1 HEAD \
          | grep '^services/' \
          | cut -d'/' -f2 \
          | sort -u \
          | tr '\n' ',' \
          | sed 's/,$//')
        # Write the result without a trailing newline
        printf '%s' "$CHANGED_SERVICES" | tee $(results.changed-services.path)
```
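The path-to-service mapping is easy to get subtly wrong, so it helps to exercise it outside the cluster first. The function below is a standalone sketch of the same idea that reads file paths from stdin instead of calling `git`; `changed_services` is a hypothetical name, and the `services/<name>/...` layout is taken from this plan.

```bash
#!/usr/bin/env bash

# changed_services: reads changed file paths on stdin (one per line) and
# prints the unique service names under services/ as a comma-separated list.
# Files directly in services/ (no service subdirectory) are ignored.
changed_services() {
  awk -F/ '$1 == "services" && NF >= 3 { print $2 }' \
    | sort -u \
    | paste -sd, -
}

# Example usage inside a checkout:
#   git diff --name-only HEAD~1 HEAD | changed_services
```

Feeding it a saved `git diff --name-only` listing makes it possible to confirm that a frontend-only commit yields an empty result before wiring the logic into the Tekton task.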
**Kaniko Build Task** (`infrastructure/ci-cd/tekton/tasks/kaniko-build.yaml`):
```yaml
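apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: kaniko-build
  namespace: tekton-pipelines
spec:
  workspaces:
    - name: source
    - name: docker-credentials
      # Kaniko reads registry credentials from /kaniko/.docker/config.json,
      # so the bound Secret needs a config.json key
      mountPath: /kaniko/.docker
  params:
    # NOTE: expects a single service name; a comma-separated list from
    # detect-changes has to be fanned out to one TaskRun per service first
    - name: services
      type: string
    - name: registry
      type: string
      default: "gitea.bakery-ia.local:5000"
    # Image tag, normally the commit SHA passed in by the Pipeline
    - name: git-revision
      type: string
      default: "latest"
  steps:
    - name: build-and-push
      image: gcr.io/kaniko-project/executor:v1.9.0
      args:
        - --dockerfile=$(workspaces.source.path)/services/$(params.services)/Dockerfile
        - --context=$(workspaces.source.path)
        - --destination=$(params.registry)/bakery/$(params.services):$(params.git-revision)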
```
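The detect step emits a comma-separated list, while each Kaniko invocation builds exactly one image, so the list has to be expanded somewhere. The sketch below shows that expansion in isolation; `image_destinations` is a hypothetical helper, and the `<registry>/bakery/<service>:<revision>` naming follows the `--destination` pattern used in this plan.

```bash
#!/usr/bin/env bash

# image_destinations REGISTRY REVISION SERVICES_CSV
# Prints one "<registry>/bakery/<service>:<revision>" line per service in
# the comma-separated list. Empty entries are skipped.
image_destinations() {
  local registry=$1 revision=$2 services=$3 service
  local IFS=,
  for service in $services; do
    if [ -n "$service" ]; then
      printf '%s/bakery/%s:%s\n' "$registry" "$service" "$revision"
    fi
  done
}

# Example usage:
#   image_destinations gitea.bakery-ia.local:5000 abc123 auth-service,tenant-service
```

Each output line corresponds to one build, which is also a convenient unit for enforcing a limit on concurrent builds.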
#### Step 2: Create Tekton Pipeline
**Main CI Pipeline** (`infrastructure/ci-cd/tekton/pipelines/ci-pipeline.yaml`):
```yaml
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: bakery-ia-ci
  namespace: tekton-pipelines
spec:
  workspaces:
    - name: shared-workspace
    - name: docker-credentials
  params:
    - name: git-url
      type: string
    - name: git-revision
      type: string
  tasks:
    - name: fetch-source
      taskRef:
        name: git-clone
      workspaces:
        - name: output
          workspace: shared-workspace
      params:
        - name: url
          value: $(params.git-url)
        - name: revision
          value: $(params.git-revision)
    - name: detect-changes
      runAfter: [fetch-source]
      taskRef:
        name: detect-changed-services
      workspaces:
        - name: source
          workspace: shared-workspace
    - name: build-and-push
      runAfter: [detect-changes]
      taskRef:
        name: kaniko-build
      workspaces:
        - name: source
          workspace: shared-workspace
        - name: docker-credentials
          workspace: docker-credentials
      params:
        - name: services
          value: $(tasks.detect-changes.results.changed-services)
        - name: registry
          value: "gitea.bakery-ia.local:5000"
        # The build task tags images with the revision, so pass it through
        - name: git-revision
          value: $(params.git-revision)
```
#### Step 3: Create Tekton Trigger
**Trigger Template** (`infrastructure/ci-cd/tekton/triggers/trigger-template.yaml`):
```yaml
apiVersion: triggers.tekton.dev/v1alpha1
kind: TriggerTemplate
metadata:
  name: bakery-ia-trigger-template
  namespace: tekton-pipelines
spec:
  params:
    - name: git-repo-url
    - name: git-revision
  resourcetemplates:
    - apiVersion: tekton.dev/v1beta1
      kind: PipelineRun
      metadata:
        generateName: bakery-ia-ci-run-
      spec:
        pipelineRef:
          name: bakery-ia-ci
        workspaces:
          - name: shared-workspace
            volumeClaimTemplate:
              spec:
                accessModes: ["ReadWriteOnce"]
                resources:
                  requests:
                    storage: 1Gi
          - name: docker-credentials
            secret:
              secretName: gitea-registry-credentials
        params:
          # TriggerTemplate parameters are referenced with the tt.params prefix
          - name: git-url
            value: $(tt.params.git-repo-url)
          - name: git-revision
            value: $(tt.params.git-revision)
```
**Trigger Binding** (`infrastructure/ci-cd/tekton/triggers/trigger-binding.yaml`):
```yaml
apiVersion: triggers.tekton.dev/v1alpha1
kind: TriggerBinding
metadata:
  name: bakery-ia-trigger-binding
  namespace: tekton-pipelines
spec:
  params:
    - name: git-repo-url
      value: $(body.repository.clone_url)
    - name: git-revision
      value: $(body.head_commit.id)
```
**Event Listener** (`infrastructure/ci-cd/tekton/triggers/event-listener.yaml`):
```yaml
apiVersion: triggers.tekton.dev/v1alpha1
kind: EventListener
metadata:
  name: bakery-ia-listener
  namespace: tekton-pipelines
spec:
  serviceAccountName: tekton-triggers-sa
  triggers:
    - name: bakery-ia-trigger
      bindings:
        - ref: bakery-ia-trigger-binding
      template:
        ref: bakery-ia-trigger-template
```
#### Step 4: Configure Flux for GitOps
**Git Repository Source** (`infrastructure/ci-cd/flux/git-repository.yaml`):
```yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: bakery-ia
  namespace: flux-system
spec:
  interval: 1m
  url: http://gitea.bakery-ia.local/your-org/bakery-ia.git
  ref:
    branch: main
  secretRef:
    name: gitea-credentials
```
**Kustomization for Production** (`infrastructure/ci-cd/flux/kustomization.yaml`):
```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: bakery-ia-prod
  namespace: flux-system
spec:
  interval: 5m
  path: ./infrastructure/kubernetes/overlays/prod
  prune: true
  sourceRef:
    kind: GitRepository
    name: bakery-ia
  targetNamespace: bakery-ia
```
#### Step 5: Apply All Configurations
```bash
# Apply Tekton tasks
microk8s kubectl apply -f infrastructure/ci-cd/tekton/tasks/
# Apply Tekton pipeline
microk8s kubectl apply -f infrastructure/ci-cd/tekton/pipelines/
# Apply Tekton triggers
microk8s kubectl apply -f infrastructure/ci-cd/tekton/triggers/
# Apply Flux configurations
microk8s kubectl apply -f infrastructure/ci-cd/flux/
```
---
## Phase 3: Monitoring and Observability
### Objective
Integrate SigNoz with CI/CD pipelines for comprehensive monitoring.
### Step-by-Step Implementation
#### Step 1: Configure OpenTelemetry for Tekton
**Install OpenTelemetry Collector** (`infrastructure/ci-cd/monitoring/otel-collector.yaml`):
```yaml
# Requires the OpenTelemetry Operator, which provides the OpenTelemetryCollector CRD
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: tekton-otel
  namespace: tekton-pipelines
spec:
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
    exporters:
      otlp:
        endpoint: "signoz-otel-collector.monitoring.svc.cluster.local:4317"
        tls:
          insecure: true
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlp]
        metrics:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlp]
```
**Apply Configuration**:
```bash
microk8s kubectl apply -f infrastructure/ci-cd/monitoring/otel-collector.yaml
```
#### Step 2: Instrument Tekton Pipelines
**Add a Tracing Sidecar** (sidecars are a Task-level field, so add this to the Task definition, e.g. `tasks/git-clone.yaml`, rather than to `ci-pipeline.yaml`):
```yaml
spec:
  # Add OpenTelemetry sidecar
  sidecars:
    - name: otel-collector
      image: otel/opentelemetry-collector-contrib:0.70.0
      args: ["--config=/etc/otel-collector-config.yaml"]
      volumeMounts:
        - name: otel-config
          mountPath: /etc/otel-collector-config.yaml
          subPath: otel-collector-config.yaml
  volumes:
    - name: otel-config
      configMap:
        name: otel-collector-config
```
#### Step 3: Configure SigNoz Dashboards
**Create CI/CD Dashboard**:
1. Log in to SigNoz UI
2. Create new dashboard: "CI/CD Pipeline Metrics"
3. Add panels:
   - Pipeline execution time
   - Success/failure rates
   - Build duration by service
   - Resource usage during builds
**Create Deployment Dashboard**:
1. Create dashboard: "GitOps Deployment Metrics"
2. Add panels:
   - Flux reconciliation events
   - Deployment frequency
   - Rollback events
   - Resource changes
---
## Phase 4: Testing and Validation
### Objective
Validate CI/CD pipeline functionality without affecting production.
### Test Plan
#### Test 1: Gitea Functionality
- **Test**: Push code to Gitea repository
- **Expected**: Code appears in Gitea UI, webhook triggers
- **Validation**:
```bash
# Push test commit
cd bakery-ia
echo "test" > test-file.txt
git add test-file.txt
git commit -m "Test CI/CD"
git push origin main
```
#### Test 2: Tekton Pipeline Trigger
- **Test**: Verify pipeline triggers on push
- **Expected**: PipelineRun created in tekton-pipelines namespace
- **Validation**:
```bash
# Check PipelineRuns
microk8s kubectl get pipelineruns -n tekton-pipelines
```
#### Test 3: Change Detection
- **Test**: Modify single service and verify only that service builds
- **Expected**: Only changed service is built and pushed
- **Validation**:
```bash
# Check build logs
microk8s kubectl logs -n tekton-pipelines <pipelinerun-pod> -c step-build-and-push
```
#### Test 4: Image Registry
- **Test**: Verify images pushed to Gitea registry
- **Expected**: New image appears in registry
- **Validation**:
```bash
# List image tags via the registry (OCI distribution) API;
# add -u <user>:<token> if the package is private
curl http://gitea.bakery-ia.local/v2/bakery/auth-service/tags/list
```
#### Test 5: Flux Deployment
- **Test**: Verify Flux detects and applies changes
- **Expected**: New deployment in bakery-ia namespace
- **Validation**:
```bash
# Check Flux reconciliation
microk8s kubectl get kustomizations -n flux-system
# Check deployments
microk8s kubectl get deployments -n bakery-ia
```
#### Test 6: Rollback
- **Test**: Verify rollback capability
- **Expected**: Previous version redeployed successfully
- **Validation**:
```bash
# Rollback via Git
git revert <commit-hash>
git push origin main
# Verify rollback
microk8s kubectl get pods -n bakery-ia -w
```
---
## Phase 5: Rollout and Migration
### Objective
Gradually migrate from manual to automated CI/CD.
### Migration Strategy
#### Step 1: Parallel Run
- Run automated CI/CD alongside manual process
- Compare results for 1 week
- Monitor with SigNoz
#### Step 2: Canary Deployment
- Start with non-critical services:
  - auth-service
  - tenant-service
  - training-service
- Monitor stability and performance
#### Step 3: Full Migration
- Migrate all services to automated pipeline
- Disable manual deployment scripts
- Update documentation
#### Step 4: Cleanup
- Remove old Tiltfile/Skaffold configurations
- Archive manual deployment scripts
- Update team documentation
---
## Risk Assessment
### Identified Risks
| Risk | Likelihood | Impact | Mitigation Strategy |
|------|------------|--------|---------------------|
| Pipeline fails to detect changes | Medium | High | Manual override procedure, detailed logging |
| Resource exhaustion during builds | High | Medium | Resource quotas, build queue limits |
| Registry storage fills up | Medium | Medium | Automated cleanup policy, monitoring alerts |
| Flux applies incorrect configuration | Low | High | Manual approval for first run, rollback testing |
| Network issues between components | Medium | High | Health checks, retry logic |
### Mitigation Plan
1. **Resource Management**:
   - Set resource quotas for CI/CD namespaces
   - Limit concurrent builds to 2
   - Monitor with SigNoz alerts
2. **Backup Strategy**:
   - Regular backups of Gitea (repos + registry)
   - Backup Flux configurations
   - Database backups for all services
3. **Rollback Plan**:
   - Document manual rollback procedures
   - Test rollback for each service
   - Maintain backup of manual deployment scripts
4. **Monitoring Alerts**:
   - Pipeline failure alerts
   - Resource threshold alerts
   - Deployment failure alerts
---
## Success Metrics
### Quantitative Metrics
1. **Deployment Frequency**: Increase from manual to automated deployments
2. **Lead Time for Changes**: Reduce from hours to minutes
3. **Change Failure Rate**: Maintain or reduce current rate
4. **Mean Time to Recovery**: Improve with automated rollbacks
5. **Resource Utilization**: Monitor CI/CD overhead (< 2GB baseline)
### Qualitative Metrics
1. **Developer Satisfaction**: Survey team on CI/CD experience
2. **Deployment Confidence**: Reduced "works on my machine" issues
3. **Auditability**: Full traceability of all deployments
4. **Reliability**: Consistent deployment outcomes
---
## Appendices
### Appendix A: Required Tools and Versions
- MicroK8s: v1.27+
- Gitea: v1.19+
- Tekton Pipelines: v0.47+
- Flux CD: v2.0+
- SigNoz: v0.20+
- Kaniko: v1.9+
### Appendix B: Network Requirements
- Internal DNS: `gitea.bakery-ia.local`
- Ingress: Configured for Gitea and SigNoz
- Network Policies: Allow communication between namespaces
### Appendix C: Backup Procedures
```bash
# Backup Gitea
microk8s kubectl exec -n gitea gitea-0 -- gitea dump -c /data/gitea/conf/app.ini
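# Backup Flux configurations (custom resources are not included in "get all")
microk8s kubectl get gitrepositories,kustomizations -n flux-system -o yaml > flux-backup.yaml
# Backup Tekton configurations
microk8s kubectl get tasks,pipelines,triggertemplates,triggerbindings,eventlisteners -n tekton-pipelines -o yaml > tekton-backup.yaml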
```
### Appendix D: Troubleshooting Guide
**Issue: Pipeline not triggering**
- Check Gitea webhook logs
- Verify EventListener pods
- Check TriggerBinding configuration
**Issue: Build fails**
- Check Kaniko logs
- Verify Dockerfile paths
- Ensure registry credentials are correct
**Issue: Flux not applying changes**
- Check GitRepository status
- Verify Kustomization reconciliation
- Check Flux logs
---
## Conclusion
This implementation plan provides a clear path to transition from manual deployments to a fully automated, self-hosted CI/CD system. By following the phased approach, we minimize risk while maximizing the benefits of automation, observability, and reliability.
### Next Steps
1. Review and approve this plan
2. Schedule Phase 1 implementation
3. Assign team members to specific tasks
4. Begin infrastructure setup
**Approval**:
- [ ] Team Lead
- [ ] DevOps Engineer
- [ ] Security Review
**Implementation Start Date**: _______________
**Target Completion Date**: _______________