VPS Sizing for Production Deployment
Executive Summary
This document provides detailed resource requirements for deploying the Bakery IA platform to a production VPS environment at clouding.io for a 10-tenant pilot program during the first 6 months.
Recommended VPS Configuration
RAM: 20 GB
Processor: 8 vCPU cores
SSD NVMe (Triple Replica): 200 GB
Estimated Monthly Cost: Contact clouding.io for current pricing
Resource Analysis
1. Application Services (18 Microservices)
Standard Services (14 services)
Each service configured with:
- Request: 256Mi RAM, 100m CPU
- Limit: 512Mi RAM, 500m CPU
- Production replicas: 2-3 per service (from prod overlay)
Services:
- auth-service (3 replicas)
- tenant-service (2 replicas)
- inventory-service (2 replicas)
- recipes-service (2 replicas)
- suppliers-service (2 replicas)
- orders-service (3 replicas) with HPA 1-3
- sales-service (2 replicas)
- pos-service (2 replicas)
- production-service (2 replicas)
- procurement-service (2 replicas)
- orchestrator-service (2 replicas)
- external-service (2 replicas)
- ai-insights-service (2 replicas)
- alert-processor (3 replicas)
Total for standard services: ~39 pods
- RAM requests: ~10 GB
- RAM limits: ~20 GB
- CPU requests: ~3.9 cores
- CPU limits: ~19.5 cores
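The totals above are the per-pod request and limit multiplied by the pod count. A minimal sketch of that arithmetic follows. `tier_totals` is a hypothetical helper, and the Mi-to-GB conversion divides by 1000 as an approximation. The pod count is passed in rather than derived, because the replica counts itemised above add up to 31 while the totals use ~39.

```python
def tier_totals(pods, req_mi, lim_mi, req_mcpu, lim_mcpu):
    """Return (RAM request GB, RAM limit GB, CPU request cores, CPU limit cores)."""
    return (
        pods * req_mi / 1000,    # Mi -> GB (approximate conversion, an assumption)
        pods * lim_mi / 1000,
        pods * req_mcpu / 1000,  # millicores -> cores
        pods * lim_mcpu / 1000,
    )

# Standard service profile: 256Mi/100m request, 512Mi/500m limit
standard = tier_totals(pods=39, req_mi=256, lim_mi=512, req_mcpu=100, lim_mcpu=500)
ram_req_gb, ram_lim_gb, cpu_req, cpu_lim = standard
```

Re-running the helper with a different pod count shows how sensitive the node size is to replica settings in the prod overlay.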
ML/Heavy Services (2 services)
Training Service (2 replicas):
- Request: 512Mi RAM, 200m CPU
- Limit: 4Gi RAM, 2000m CPU
- Special storage: 10Gi PVC for models, 4Gi temp storage
Forecasting Service (3 replicas) with HPA 1-3:
- Request: 512Mi RAM, 200m CPU
- Limit: 1Gi RAM, 1000m CPU
Notification Service (3 replicas) with HPA 1-3:
- Request: 256Mi RAM, 100m CPU
- Limit: 512Mi RAM, 500m CPU
ML services total (Training and Forecasting):
- RAM requests: ~2.3 GB
- RAM limits: ~11 GB
- CPU requests: ~1 core
- CPU limits: ~7 cores
2. Databases (18 PostgreSQL instances)
Each database:
- Request: 256Mi RAM, 100m CPU
- Limit: 512Mi RAM, 500m CPU
- Storage: 2Gi PVC each
- Production replicas: 1 per database
Total for databases: 18 instances
- RAM requests: ~4.6 GB
- RAM limits: ~9.2 GB
- CPU requests: ~1.8 cores
- CPU limits: ~9 cores
- Storage: 36 GB
3. Infrastructure Services
Redis (1 instance):
- Request: 256Mi RAM, 100m CPU
- Limit: 512Mi RAM, 500m CPU
- Storage: 1Gi PVC
- TLS enabled
RabbitMQ (1 instance):
- Request: 512Mi RAM, 200m CPU
- Limit: 1Gi RAM, 1000m CPU
- Storage: 2Gi PVC
Infrastructure total:
- RAM requests: ~0.8 GB
- RAM limits: ~1.5 GB
- CPU requests: ~0.3 cores
- CPU limits: ~1.5 cores
- Storage: 3 GB
4. Gateway & Frontend
Gateway (3 replicas):
- Request: 256Mi RAM, 100m CPU
- Limit: 512Mi RAM, 500m CPU
Frontend (2 replicas):
- Request: 512Mi RAM, 250m CPU
- Limit: 1Gi RAM, 500m CPU
Total:
- RAM requests: ~1.8 GB
- RAM limits: ~3.5 GB
- CPU requests: ~0.8 cores
- CPU limits: ~2.5 cores
5. Monitoring Stack (Optional but Recommended)
Prometheus:
- Request: 1Gi RAM, 500m CPU
- Limit: 2Gi RAM, 1000m CPU
- Storage: 20Gi PVC
- Retention: 200h
Grafana:
- Request: 256Mi RAM, 100m CPU
- Limit: 512Mi RAM, 200m CPU
- Storage: 5Gi PVC
Jaeger:
- Request: 256Mi RAM, 100m CPU
- Limit: 512Mi RAM, 200m CPU
Monitoring total:
- RAM requests: ~1.5 GB
- RAM limits: ~3 GB
- CPU requests: ~0.7 cores
- CPU limits: ~1.4 cores
- Storage: 25 GB
6. External Services (Optional in Production)
Nominatim (Disabled by default - can use external geocoding API):
- If enabled: 2Gi/1 CPU request, 4Gi/2 CPU limit
- Storage: 70Gi (50Gi data + 20Gi flatnode)
- Recommendation: Use external geocoding service (Google Maps API, Mapbox) for pilot to save resources
Total Resource Summary
With Monitoring, Without Nominatim (Recommended)
| Resource | Requests | Limits | Recommended VPS |
|---|---|---|---|
| RAM | ~21 GB | ~48 GB | 20 GB |
| CPU | ~8.5 cores | ~41 cores | 8 vCPU |
| Storage | ~79 GB | - | 200 GB NVMe |
Memory Calculation Details
- Application services: 14.1 GB requests / 34.5 GB limits
- Databases: 4.6 GB requests / 9.2 GB limits
- Infrastructure: 0.8 GB requests / 1.5 GB limits
- Gateway/Frontend: 1.8 GB requests / 3.5 GB limits
- Monitoring: 1.5 GB requests / 3 GB limits
- Total requests: ~22.8 GB
- Total limits: ~51.7 GB
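The breakdown can be checked by summing its rows. This is a small sketch using only the figures listed above, which run slightly higher than the rounded summary table. The dictionary keys are illustrative names, not identifiers from the deployment.

```python
# (requests GB, limits GB) per component, copied from the breakdown above
memory_gb = {
    "application": (14.1, 34.5),
    "databases": (4.6, 9.2),
    "infrastructure": (0.8, 1.5),
    "gateway_frontend": (1.8, 3.5),
    "monitoring": (1.5, 3.0),
}

total_requests = round(sum(req for req, _ in memory_gb.values()), 1)
total_limits = round(sum(lim for _, lim in memory_gb.values()), 1)

# How far limits exceed requests, i.e. how much burst room pods may claim
limit_to_request_ratio = round(total_limits / total_requests, 2)
```

The ratio of limits to requests (a little over 2x) indicates how far memory would be overcommitted if many pods approached their limits at once.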
Why 20 GB RAM is Sufficient
- Requests vs Limits: Kubernetes schedules pods by their requests, not their limits. Our total requests (~22.8 GB) are expected to fit in 20 GB because:
  - The total counts HPA-enabled services (orders, forecasting, notification) at 3 replicas, while they start at 1 replica
  - Some overhead is included in our calculations
  - Not all services will reach their request levels simultaneously during the pilot. Note that the scheduler still reserves each pod's full request, so the requests of the running pods must fit within the node's allocatable memory
- Actual Usage: Production limits are safety margins. Expected usage for 10 tenants:
  - Most services use 40-60% of their limits under normal load
  - Pilot traffic is significantly lower than peak design capacity
- Cost-Effective Pilot: Starting with 20 GB allows:
  - Room for monitoring and logging
  - Comfortable headroom (15-25%)
  - Easy vertical scaling if needed
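The effect of HPA-enabled services starting at 1 replica can be estimated from their per-replica requests. This is a rough sketch using the request sizes and 1-3 replica bounds given earlier in this document. The dictionary layout and the `requests_mi` helper are illustrative assumptions.

```python
# Per-replica memory requests (Mi) and HPA bounds for the autoscaled services
hpa_services = {
    "orders-service": {"request_mi": 256, "min": 1, "max": 3},
    "forecasting-service": {"request_mi": 512, "min": 1, "max": 3},
    "notification-service": {"request_mi": 256, "min": 1, "max": 3},
}

def requests_mi(replica_key):
    """Sum memory requests with every service at its 'min' or 'max' replica count."""
    return sum(s["request_mi"] * s[replica_key] for s in hpa_services.values())

at_max = requests_mi("max")
at_min = requests_mi("min")
saved_mi = at_max - at_min  # memory no longer reserved when running at minimum replicas
```

At minimum replicas these three services reserve about 2 GiB less than at maximum. Whether the remaining requests fit should be checked against the node's allocatable memory, which is lower than the VPS's nominal RAM once the OS and Kubernetes components are accounted for.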
CPU Calculation Details
- Application services: 5.7 cores requests / 28.5 cores limits
- Databases: 1.8 cores requests / 9 cores limits
- Infrastructure: 0.3 cores requests / 1.5 cores limits
- Gateway/Frontend: 0.8 cores requests / 2.5 cores limits
- Monitoring: 0.7 cores requests / 1.4 cores limits
- Total requests: ~9.3 cores
- Total limits: ~42.9 cores
Storage Calculation
- Databases: 36 GB (18 × 2Gi)
- Model storage: 10 GB
- Infrastructure (Redis, RabbitMQ): 3 GB
- Monitoring: 25 GB
- OS and container images: ~30 GB
- Growth buffer: ~95 GB
- Total: ~199 GB → 200 GB NVMe recommended
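The storage total is a straight sum of the line items. The sketch below reproduces it and separates committed storage from the growth buffer. The key names are illustrative.

```python
# Storage line items in GB, as listed above
storage_gb = {
    "databases": 36,        # 18 x 2Gi
    "model_storage": 10,
    "infrastructure": 3,    # Redis + RabbitMQ
    "monitoring": 25,
    "os_and_images": 30,
    "growth_buffer": 95,
}

total_gb = sum(storage_gb.values())
committed_gb = total_gb - storage_gb["growth_buffer"]  # storage in use from day one
```

Roughly half of the 200 GB volume is committed at deployment time, with the rest held as growth buffer.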
Scaling Considerations
Horizontal Pod Autoscaling (HPA)
Already configured for:
- orders-service: 1-3 replicas based on CPU (70%) and memory (80%)
- forecasting-service: 1-3 replicas based on CPU (70%) and memory (75%)
- notification-service: 1-3 replicas based on CPU (70%) and memory (80%)
These services will automatically scale up under load without manual intervention.
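For intuition, the replica count the autoscaler aims for follows the documented HPA rule: desired = ceil(current replicas × current utilization / target utilization), clamped to the configured bounds, taking the largest result across metrics. The sketch below is a simplified model of that rule with hypothetical function names. It omits the controller's tolerance band and stabilization windows, and utilization is measured relative to each pod's request.

```python
import math

def desired_replicas(current, usage_pct, target_pct, min_r=1, max_r=3):
    """Core HPA scaling rule for a single metric, clamped to the replica bounds."""
    raw = math.ceil(current * usage_pct / target_pct)
    return max(min_r, min(max_r, raw))

def hpa_decision(current, cpu_pct, mem_pct, cpu_target=70, mem_target=80):
    """With several metrics, the HPA uses the highest replica count proposed."""
    return max(
        desired_replicas(current, cpu_pct, cpu_target),
        desired_replicas(current, mem_pct, mem_target),
    )

# One replica running at 140% of its CPU request against a 70% target
replicas = hpa_decision(current=1, cpu_pct=140, mem_pct=50)
```

The default targets in the sketch match orders-service and notification-service. For forecasting-service, pass `mem_target=75`.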
Growth Path for 6-12 Months
If tenant count grows beyond 10:
| Tenants | RAM | CPU | Storage |
|---|---|---|---|
| 10 | 20 GB | 8 cores | 200 GB |
| 25 | 32 GB | 12 cores | 300 GB |
| 50 | 48 GB | 16 cores | 500 GB |
| 100+ | Consider a Kubernetes cluster with multiple nodes | | |
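The table can be turned into a simple lookup for capacity planning scripts. This is a sketch only: treating each row as an upper bound on tenant count is an assumption, the table does not say what to provision between 50 and 100 tenants, and `recommended_vps` is a hypothetical helper.

```python
# (max tenants, VPS spec) rows from the growth path table
GROWTH_PATH = [
    (10, {"ram_gb": 20, "cpu_cores": 8, "storage_gb": 200}),
    (25, {"ram_gb": 32, "cpu_cores": 12, "storage_gb": 300}),
    (50, {"ram_gb": 48, "cpu_cores": 16, "storage_gb": 500}),
]

def recommended_vps(tenants):
    """Return the smallest listed spec covering `tenants`, or None beyond 50."""
    for max_tenants, spec in GROWTH_PATH:
        if tenants <= max_tenants:
            return spec
    return None  # no single-VPS row applies; consider a multi-node cluster
```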
Vertical Scaling
If you hit resource limits before adding more tenants:
- Upgrade RAM first (most common bottleneck)
- Then CPU if services show high utilization
- Storage can be expanded independently
Cost Optimization Strategies
For Pilot Phase (Months 1-6)
- Disable Nominatim: Use external geocoding API
  - Saves: 70 GB storage, 2 GB RAM, 1 CPU core
  - Cost: ~$5-10/month for external API (Google Maps, Mapbox)
  - Recommendation: Enable Nominatim only if >50 tenants
- Start Without Monitoring: Add later if needed
  - Saves: 25 GB storage, 1.5 GB RAM, 0.7 CPU cores
  - Not recommended - monitoring is crucial for production
- Reduce Database Replicas: Keep at 1 per service
  - Already configured in base
  - Acceptable risk for pilot phase
After Pilot Success (Months 6+)
- Enable full HA: Increase database replicas to 2
- Add Nominatim: If external API costs exceed $20/month
- Upgrade VPS: To 32 GB RAM / 12 cores for 25+ tenants
Network and Additional Requirements
Bandwidth
- Estimated: 2-5 TB/month for 10 tenants
- Includes: API traffic, frontend assets, image uploads, reports
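To relate the monthly estimate to a link speed, the transfer volume can be converted to an average sustained rate. The sketch assumes decimal terabytes and a 30-day month, and `avg_mbps` is a hypothetical helper. Peak rates will be well above the average, so treat the result as a floor rather than a sizing target.

```python
def avg_mbps(tb_per_month, days=30):
    """Average sustained throughput in Mbit/s for a monthly transfer volume."""
    bits = tb_per_month * 1e12 * 8      # decimal TB -> bits
    seconds = days * 86_400
    return bits / seconds / 1e6

low = round(avg_mbps(2), 1)   # lower bound of the estimate
high = round(avg_mbps(5), 1)  # upper bound of the estimate
```

Under these assumptions the 2-5 TB/month range averages out to roughly 6-15 Mbit/s.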
Backup Strategy
- Database backups: ~10 GB/day (compressed)
- Retention: 30 days
- Additional storage: 300 GB for backups (separate volume recommended)
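The backup volume size follows from daily backup size multiplied by retention. A minimal sketch, assuming one full compressed backup per day with no deduplication or incremental scheme; `backup_storage_gb` is a hypothetical helper.

```python
def backup_storage_gb(daily_gb, retention_days):
    """Storage needed to retain one backup per day for the retention window."""
    return daily_gb * retention_days

required_gb = backup_storage_gb(daily_gb=10, retention_days=30)
```

Shortening retention or switching to incremental backups would reduce this figure proportionally.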
Domain & SSL
- 1 domain: `yourdomain.com`
- SSL: Let's Encrypt (free) or wildcard certificate
- Ingress controller: nginx (included in stack)
Deployment Checklist
Pre-Deployment
- VPS provisioned with 20 GB RAM, 8 cores, 200 GB NVMe
- Docker and Kubernetes (k3s or similar) installed
- Domain DNS configured
- SSL certificates ready
Initial Deployment
- Deploy with `skaffold run -p prod`
- Verify all pods running: `kubectl get pods -n bakery-ia`
- Check PVC status: `kubectl get pvc -n bakery-ia`
- Access frontend and test login
Post-Deployment Monitoring
- Set up external monitoring (UptimeRobot, Pingdom)
- Configure backup schedule
- Test database backups and restore
- Load test with simulated tenant traffic
Support and Scaling
When to Scale Up
Monitor these metrics:
- RAM usage consistently >80% → Upgrade RAM
- CPU usage consistently >70% → Upgrade CPU
- Storage >150 GB used → Upgrade storage
- Response times >2 seconds → Add replicas or upgrade VPS
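These thresholds can be encoded as a simple check for an alerting script. The sketch below evaluates a single sample; since the guidance says "consistently", callers would want to pass values averaged over a window. The function and parameter names are assumptions, not part of the platform.

```python
def scaling_actions(ram_pct, cpu_pct, storage_used_gb, response_s):
    """Map current metrics to the scale-up actions listed above."""
    actions = []
    if ram_pct > 80:
        actions.append("Upgrade RAM")
    if cpu_pct > 70:
        actions.append("Upgrade CPU")
    if storage_used_gb > 150:
        actions.append("Upgrade storage")
    if response_s > 2:
        actions.append("Add replicas or upgrade VPS")
    return actions

# Example: memory pressure only
actions = scaling_actions(ram_pct=85, cpu_pct=50, storage_used_gb=100, response_s=1.0)
```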
Emergency Scaling
If you hit limits suddenly:
- Scale down non-critical services temporarily
- Disable monitoring temporarily (not recommended for >1 hour)
- Increase VPS resources (clouding.io allows live upgrades)
- Review and optimize resource-heavy queries
Conclusion
The recommended 20 GB RAM / 8 vCPU / 200 GB NVMe configuration provides:
- ✅ Comfortable headroom for 10-tenant pilot
- ✅ Full monitoring and observability
- ✅ Multiple replicas for critical services (databases stay at 1 replica during the pilot)
- ✅ Room for traffic spikes (2-3x baseline)
- ✅ Cost-effective starting point
- ✅ Easy scaling path as you grow
Total estimated compute cost: €40-80/month (check clouding.io current pricing)
Additional costs: Domain (€15/year), external APIs (€10/month), backups (~€10/month)
Next steps:
- Provision VPS at clouding.io
- Follow deployment guide in `/docs/DEPLOYMENT.md`
- Monitor resource usage for first 2 weeks
- Adjust based on actual metrics