VPS Sizing for Production Deployment

Executive Summary

This document provides detailed resource requirements for deploying the Bakery IA platform to a production VPS environment at clouding.io for a 10-tenant pilot program during the first 6 months.

RAM: 20 GB
Processor: 8 vCPU cores
SSD NVMe (Triple Replica): 200 GB

Estimated Monthly Cost: Contact clouding.io for current pricing


Resource Analysis

1. Application Services (18 Microservices)

Standard Services (14 services)

Each service configured with:

  • Request: 256Mi RAM, 100m CPU
  • Limit: 512Mi RAM, 500m CPU
  • Production replicas: 2-3 per service (from prod overlay)
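
As a reference, a minimal sketch of how this profile translates into a Deployment's container spec; the service name and image below are placeholders, not taken from the project's actual manifests:

```yaml
# Illustrative only: resource profile for a standard service in the prod overlay.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-service        # placeholder name
spec:
  replicas: 2                  # 2-3 in production, per service
  selector:
    matchLabels:
      app: example-service
  template:
    metadata:
      labels:
        app: example-service
    spec:
      containers:
        - name: example-service
          image: registry.example.com/example-service:latest   # placeholder image
          resources:
            requests:
              memory: "256Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"
```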

Services:

  • auth-service (3 replicas)
  • tenant-service (2 replicas)
  • inventory-service (2 replicas)
  • recipes-service (2 replicas)
  • suppliers-service (2 replicas)
  • orders-service (3 replicas) with HPA 1-3
  • sales-service (2 replicas)
  • pos-service (2 replicas)
  • production-service (2 replicas)
  • procurement-service (2 replicas)
  • orchestrator-service (2 replicas)
  • external-service (2 replicas)
  • ai-insights-service (2 replicas)
  • alert-processor (3 replicas)

Total for standard services: ~39 pods

  • RAM requests: ~10 GB
  • RAM limits: ~20 GB
  • CPU requests: ~3.9 cores
  • CPU limits: ~19.5 cores

ML/Heavy and HPA Services

Training Service (2 replicas):

  • Request: 512Mi RAM, 200m CPU
  • Limit: 4Gi RAM, 2000m CPU
  • Special storage: 10Gi PVC for models, 4Gi temp storage

Forecasting Service (3 replicas) with HPA 1-3:

  • Request: 512Mi RAM, 200m CPU
  • Limit: 1Gi RAM, 1000m CPU

Notification Service (3 replicas) with HPA 1-3:

  • Request: 256Mi RAM, 100m CPU
  • Limit: 512Mi RAM, 500m CPU

ML services total (training and forecasting):

  • RAM requests: ~2.3 GB
  • RAM limits: ~11 GB
  • CPU requests: ~1 core
  • CPU limits: ~7 cores

2. Databases (18 PostgreSQL instances)

Each database:

  • Request: 256Mi RAM, 100m CPU
  • Limit: 512Mi RAM, 500m CPU
  • Storage: 2Gi PVC each
  • Production replicas: 1 per database
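
Each instance would claim its 2Gi volume with a small PersistentVolumeClaim along these lines (the claim name and the omitted storage class are illustrative, not the project's actual manifests):

```yaml
# Illustrative only: one 2Gi claim per database instance.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: auth-db-pvc            # placeholder name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
```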

Total for databases: 18 instances

  • RAM requests: ~4.6 GB
  • RAM limits: ~9.2 GB
  • CPU requests: ~1.8 cores
  • CPU limits: ~9 cores
  • Storage: 36 GB

3. Infrastructure Services

Redis (1 instance):

  • Request: 256Mi RAM, 100m CPU
  • Limit: 512Mi RAM, 500m CPU
  • Storage: 1Gi PVC
  • TLS enabled

RabbitMQ (1 instance):

  • Request: 512Mi RAM, 200m CPU
  • Limit: 1Gi RAM, 1000m CPU
  • Storage: 2Gi PVC

Infrastructure total:

  • RAM requests: ~0.8 GB
  • RAM limits: ~1.5 GB
  • CPU requests: ~0.3 cores
  • CPU limits: ~1.5 cores
  • Storage: 3 GB

4. Gateway & Frontend

Gateway (3 replicas):

  • Request: 256Mi RAM, 100m CPU
  • Limit: 512Mi RAM, 500m CPU

Frontend (2 replicas):

  • Request: 512Mi RAM, 250m CPU
  • Limit: 1Gi RAM, 500m CPU

Gateway and frontend total:

  • RAM requests: ~1.8 GB
  • RAM limits: ~3.5 GB
  • CPU requests: ~0.8 cores
  • CPU limits: ~2.5 cores

5. Monitoring Stack

Prometheus:

  • Request: 1Gi RAM, 500m CPU
  • Limit: 2Gi RAM, 1000m CPU
  • Storage: 20Gi PVC
  • Retention: 200h
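
The 200h retention window is normally passed to Prometheus as a startup flag; a sketch of the relevant container spec, assuming a plain Deployment rather than the Prometheus Operator:

```yaml
# Excerpt of a Prometheus container spec; image tag and paths are assumptions.
containers:
  - name: prometheus
    image: prom/prometheus:latest
    args:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.path=/prometheus"
      - "--storage.tsdb.retention.time=200h"
    resources:
      requests:
        memory: "1Gi"
        cpu: "500m"
      limits:
        memory: "2Gi"
        cpu: "1000m"
```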

Grafana:

  • Request: 256Mi RAM, 100m CPU
  • Limit: 512Mi RAM, 200m CPU
  • Storage: 5Gi PVC

Jaeger:

  • Request: 256Mi RAM, 100m CPU
  • Limit: 512Mi RAM, 200m CPU

Monitoring total:

  • RAM requests: ~1.5 GB
  • RAM limits: ~3 GB
  • CPU requests: ~0.7 cores
  • CPU limits: ~1.4 cores
  • Storage: 25 GB

6. External Services (Optional in Production)

Nominatim (Disabled by default - can use external geocoding API):

  • If enabled: 2Gi/1 CPU request, 4Gi/2 CPU limit
  • Storage: 70Gi (50Gi data + 20Gi flatnode)
  • Recommendation: Use external geocoding service (Google Maps API, Mapbox) for pilot to save resources

Total Resource Summary

| Resource | Requests | Limits | Recommended VPS |
|----------|----------|--------|-----------------|
| RAM | ~23 GB | ~52 GB | 20 GB |
| CPU | ~9.3 cores | ~43 cores | 8 vCPU |
| Storage | ~79 GB | - | 200 GB NVMe |

Memory Calculation Details

  • Application services: 14.1 GB requests / 34.5 GB limits
  • Databases: 4.6 GB requests / 9.2 GB limits
  • Infrastructure: 0.8 GB requests / 1.5 GB limits
  • Gateway/Frontend: 1.8 GB requests / 3.5 GB limits
  • Monitoring: 1.5 GB requests / 3 GB limits
  • Total requests: ~22.8 GB
  • Total limits: ~51.7 GB

Why 20 GB RAM is Sufficient

  1. Requests vs Limits: Kubernetes schedules pods based on requests, not limits. Although the nominal request total (~22.8 GB) slightly exceeds 20 GB, it fits in practice because:

    • Not all services will run at their request levels simultaneously during the pilot
    • HPA-enabled services (orders, forecasting, notification) start at 1 replica
    • The calculations above already include some overhead
  2. Actual Usage: The configured limits are safety margins; real usage for 10 tenants will be much lower:

    • Most services use 40-60% of their limits under normal load
    • Pilot traffic is significantly lower than peak design capacity
  3. Cost-Effective Pilot: Starting with 20 GB allows:

    • Room for monitoring and logging
    • Comfortable headroom (15-25%)
    • Easy vertical scaling if needed

CPU Calculation Details

  • Application services: 5.7 cores requests / 28.5 cores limits
  • Databases: 1.8 cores requests / 9 cores limits
  • Infrastructure: 0.3 cores requests / 1.5 cores limits
  • Gateway/Frontend: 0.8 cores requests / 2.5 cores limits
  • Monitoring: 0.7 cores requests / 1.4 cores limits
  • Total requests: ~9.3 cores
  • Total limits: ~42.9 cores

Storage Calculation

  • Databases: 36 GB (18 × 2Gi)
  • Model storage: 10 GB
  • Infrastructure (Redis, RabbitMQ): 3 GB
  • Monitoring: 25 GB
  • OS and container images: ~30 GB
  • Growth buffer: ~95 GB
  • Total: ~199 GB → 200 GB NVMe recommended

Scaling Considerations

Horizontal Pod Autoscaling (HPA)

Already configured for:

  1. orders-service: 1-3 replicas based on CPU (70%) and memory (80%)
  2. forecasting-service: 1-3 replicas based on CPU (70%) and memory (75%)
  3. notification-service: 1-3 replicas based on CPU (70%) and memory (80%)

These services will automatically scale up under load without manual intervention.
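
As a sketch, the orders-service policy described above maps onto an autoscaling/v2 manifest roughly like this (the target Deployment name is assumed to match the service name):

```yaml
# Illustrative HPA: 1-3 replicas at 70% CPU / 80% memory utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: orders-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: orders-service
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```

The forecasting and notification HPAs would follow the same pattern, with memory targets of 75% and 80% respectively.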

Growth Path for 6-12 Months

If tenant count grows beyond 10:

| Tenants | RAM | CPU | Storage |
|---------|-----|-----|---------|
| 10 | 20 GB | 8 cores | 200 GB |
| 25 | 32 GB | 12 cores | 300 GB |
| 50 | 48 GB | 16 cores | 500 GB |
| 100+ | Consider a Kubernetes cluster with multiple nodes | | |

Vertical Scaling

If you hit resource limits before adding more tenants:

  1. Upgrade RAM first (most common bottleneck)
  2. Then CPU if services show high utilization
  3. Storage can be expanded independently

Cost Optimization Strategies

For Pilot Phase (Months 1-6)

  1. Disable Nominatim: Use external geocoding API

    • Saves: 70 GB storage, 2 GB RAM, 1 CPU core
    • Cost: ~$5-10/month for external API (Google Maps, Mapbox)
    • Recommendation: Enable Nominatim only if >50 tenants
  2. Start Without Monitoring: Add later if needed

    • Saves: 25 GB storage, 1.5 GB RAM, 0.7 CPU cores
    • Not recommended - monitoring is crucial for production
  3. Reduce Database Replicas: Keep at 1 per service

    • Already configured in base
    • Acceptable risk for pilot phase

After Pilot Success (Months 6+)

  1. Enable full HA: Increase database replicas to 2
  2. Add Nominatim: If external API costs exceed $20/month
  3. Upgrade VPS: To 32 GB RAM / 12 cores for 25+ tenants

Network and Additional Requirements

Bandwidth

  • Estimated: 2-5 TB/month for 10 tenants
  • Includes: API traffic, frontend assets, image uploads, reports

Backup Strategy

  • Database backups: ~10 GB/day (compressed)
  • Retention: 30 days
  • Additional storage: 300 GB for backups (separate volume recommended)
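
One way to implement this in-cluster is a nightly CronJob running pg_dump; the sketch below is an assumption about how it could look, with the database host, credentials Secret, and backup PVC as placeholders to adapt per service:

```yaml
# Illustrative nightly backup job for one database; names are placeholders.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: auth-db-backup
spec:
  schedule: "0 3 * * *"              # daily at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: pg-dump
              image: postgres:16
              command: ["/bin/sh", "-c"]
              args:
                - 'pg_dump -h auth-db -U "$PGUSER" "$PGDATABASE" | gzip > /backups/auth-db-$(date +%F).sql.gz'
              env:
                - name: PGUSER
                  valueFrom:
                    secretKeyRef:
                      name: auth-db-credentials   # placeholder Secret
                      key: username
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: auth-db-credentials
                      key: password
                - name: PGDATABASE
                  value: auth                     # placeholder database name
              volumeMounts:
                - name: backups
                  mountPath: /backups
          volumes:
            - name: backups
              persistentVolumeClaim:
                claimName: db-backups-pvc         # placeholder backup volume
```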

Domain & SSL

  • 1 domain: yourdomain.com
  • SSL: Let's Encrypt (free) or wildcard certificate
  • Ingress controller: nginx (included in stack)
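
Assuming cert-manager is used to obtain the Let's Encrypt certificate, the ingress could look roughly like this; the hostname, ClusterIssuer, and backend service name are illustrative:

```yaml
# Illustrative ingress with a Let's Encrypt certificate issued by cert-manager.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: bakery-ia-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # assumed ClusterIssuer name
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - yourdomain.com
      secretName: bakery-ia-tls
  rules:
    - host: yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: gateway                           # assumed gateway Service name
                port:
                  number: 80
```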

Deployment Checklist

Pre-Deployment

  • VPS provisioned with 20 GB RAM, 8 cores, 200 GB NVMe
  • Docker and Kubernetes (k3s or similar) installed
  • Domain DNS configured
  • SSL certificates ready

Initial Deployment

  • Deploy with skaffold run -p prod
  • Verify all pods running: kubectl get pods -n bakery-ia
  • Check PVC status: kubectl get pvc -n bakery-ia
  • Access frontend and test login

Post-Deployment Monitoring

  • Set up external monitoring (UptimeRobot, Pingdom)
  • Configure backup schedule
  • Test database backups and restore
  • Load test with simulated tenant traffic

Support and Scaling

When to Scale Up

Monitor these metrics:

  1. RAM usage consistently >80% → Upgrade RAM
  2. CPU usage consistently >70% → Upgrade CPU
  3. Storage >150 GB used → Upgrade storage
  4. Response times >2 seconds → Add replicas or upgrade VPS
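
These thresholds can be encoded as Prometheus alerting rules; the sketch below uses standard node_exporter metrics and is a starting point, not the project's actual rule set:

```yaml
# Illustrative alerting rules for the RAM and CPU thresholds above.
groups:
  - name: vps-capacity
    rules:
      - alert: HighMemoryUsage
        expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) > 0.80
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Node memory usage above 80% for 15 minutes"
      - alert: HighCPUUsage
        expr: 1 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) > 0.70
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Node CPU usage above 70% for 15 minutes"
```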

Emergency Scaling

If you hit limits suddenly:

  1. Scale down non-critical services temporarily
  2. Disable monitoring temporarily (not recommended for >1 hour)
  3. Increase VPS resources (clouding.io allows live upgrades)
  4. Review and optimize resource-heavy queries

Conclusion

The recommended 20 GB RAM / 8 vCPU / 200 GB NVMe configuration provides:

  • Comfortable headroom for the 10-tenant pilot
  • Full monitoring and observability
  • High availability for critical services
  • Room for traffic spikes (2-3x baseline)
  • Cost-effective starting point
  • Easy scaling path as you grow

Total estimated compute cost: €40-80/month (check clouding.io for current pricing)

Additional costs: domain (€15/year), external APIs (€10/month), backups (~€10/month)

Next steps:

  1. Provision VPS at clouding.io
  2. Follow deployment guide in /docs/DEPLOYMENT.md
  3. Monitor resource usage for first 2 weeks
  4. Adjust based on actual metrics