# VPS Sizing for Production Deployment

## Executive Summary

This document provides detailed resource requirements for deploying the Bakery IA platform to a production VPS environment at **clouding.io** for a **10-tenant pilot program** during the first 6 months.

### Recommended VPS Configuration

```
RAM: 20 GB
Processor: 8 vCPU cores
SSD NVMe (Triple Replica): 200 GB
```

**Estimated Monthly Cost**: Contact clouding.io for current pricing

---

## Resource Analysis

### 1. Application Services (18 Microservices)

#### Standard Services (14 services)
Each service configured with:
- **Request**: 256Mi RAM, 100m CPU
- **Limit**: 512Mi RAM, 500m CPU
- **Production replicas**: 2-3 per service (from prod overlay)

Services:
- auth-service (3 replicas)
- tenant-service (2 replicas)
- inventory-service (2 replicas)
- recipes-service (2 replicas)
- suppliers-service (2 replicas)
- orders-service (3 replicas) *with HPA 1-3*
- sales-service (2 replicas)
- pos-service (2 replicas)
- production-service (2 replicas)
- procurement-service (2 replicas)
- orchestrator-service (2 replicas)
- external-service (2 replicas)
- ai-insights-service (2 replicas)
- alert-processor (3 replicas)

**Total for standard services**: ~39 pods
- RAM requests: ~10 GB
- RAM limits: ~20 GB
- CPU requests: ~3.9 cores
- CPU limits: ~19.5 cores
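
The totals above are the per-pod figures multiplied by the pod count. A minimal sketch of that arithmetic, assuming the ~39-pod figure quoted above and treating 1024 Mi as 1 GB for rounding purposes; the function name `aggregate` is illustrative only:

```python
def aggregate(pods: int, mem_mi: int, cpu_millicores: int) -> tuple[float, float]:
    """Return (memory in GiB, CPU in cores) for `pods` identical pods."""
    return pods * mem_mi / 1024, pods * cpu_millicores / 1000


# Per-pod figures for a standard service, taken from the list above.
REQUEST = {"mem_mi": 256, "cpu_millicores": 100}
LIMIT = {"mem_mi": 512, "cpu_millicores": 500}

# 39 is the pod count quoted in this section (an input, not derived here).
req_mem, req_cpu = aggregate(39, **REQUEST)
lim_mem, lim_cpu = aggregate(39, **LIMIT)

print(f"requests: {req_mem:.2f} GiB, {req_cpu:.1f} cores")
print(f"limits:   {lim_mem:.2f} GiB, {lim_cpu:.1f} cores")
```

The same helper can be reused for any group of identically sized pods, for example the 18 database instances.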

#### ML/Heavy Services (2 services)

**Training Service** (2 replicas):
- Request: 512Mi RAM, 200m CPU
- Limit: 4Gi RAM, 2000m CPU
- Special storage: 10Gi PVC for models, 4Gi temp storage

**Forecasting Service** (3 replicas) *with HPA 1-3*:
- Request: 512Mi RAM, 200m CPU
- Limit: 1Gi RAM, 1000m CPU

**Notification Service** (3 replicas) *with HPA 1-3*:
- Request: 256Mi RAM, 100m CPU
- Limit: 512Mi RAM, 500m CPU

**ML services total**:
- RAM requests: ~2.3 GB
- RAM limits: ~11 GB
- CPU requests: ~1 core
- CPU limits: ~7 cores

### 2. Databases (18 PostgreSQL instances)

Each database:
- **Request**: 256Mi RAM, 100m CPU
- **Limit**: 512Mi RAM, 500m CPU
- **Storage**: 2Gi PVC each
- **Production replicas**: 1 per database

**Total for databases**: 18 instances
- RAM requests: ~4.6 GB
- RAM limits: ~9.2 GB
- CPU requests: ~1.8 cores
- CPU limits: ~9 cores
- Storage: 36 GB

### 3. Infrastructure Services

**Redis** (1 instance):
- Request: 256Mi RAM, 100m CPU
- Limit: 512Mi RAM, 500m CPU
- Storage: 1Gi PVC
- TLS enabled

**RabbitMQ** (1 instance):
- Request: 512Mi RAM, 200m CPU
- Limit: 1Gi RAM, 1000m CPU
- Storage: 2Gi PVC

**Infrastructure total**:
- RAM requests: ~0.8 GB
- RAM limits: ~1.5 GB
- CPU requests: ~0.3 cores
- CPU limits: ~1.5 cores
- Storage: 3 GB

### 4. Gateway & Frontend

**Gateway** (3 replicas):
- Request: 256Mi RAM, 100m CPU
- Limit: 512Mi RAM, 500m CPU

**Frontend** (2 replicas):
- Request: 512Mi RAM, 250m CPU
- Limit: 1Gi RAM, 500m CPU

**Total**:
- RAM requests: ~1.8 GB
- RAM limits: ~3.5 GB
- CPU requests: ~0.8 cores
- CPU limits: ~2.5 cores
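
When components have different replica counts and sizes, the group total is the replica-weighted sum. The sketch below reproduces the Gateway and Frontend totals from the figures above; the `Workload` class and `totals` function are hypothetical helpers, not part of the platform:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    replicas: int
    req_mem_mi: int   # memory request per replica, in Mi
    req_cpu_m: int    # CPU request per replica, in millicores
    lim_mem_mi: int   # memory limit per replica, in Mi
    lim_cpu_m: int    # CPU limit per replica, in millicores


# Values copied from the Gateway and Frontend entries above.
WORKLOADS = [
    Workload("gateway", 3, 256, 100, 512, 500),
    Workload("frontend", 2, 512, 250, 1024, 500),
]


def totals(workloads: list[Workload]) -> dict[str, int]:
    """Sum requests and limits across all replicas of all workloads."""
    return {
        "req_mem_mi": sum(w.replicas * w.req_mem_mi for w in workloads),
        "req_cpu_m": sum(w.replicas * w.req_cpu_m for w in workloads),
        "lim_mem_mi": sum(w.replicas * w.lim_mem_mi for w in workloads),
        "lim_cpu_m": sum(w.replicas * w.lim_cpu_m for w in workloads),
    }


print(totals(WORKLOADS))
```

The result is 1792 Mi / 800m requested and 3584 Mi / 2500m limited, which rounds to the ~1.8 GB, ~0.8 cores, ~3.5 GB, and ~2.5 cores listed above.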

### 5. Monitoring Stack (Optional but Recommended)

**Prometheus**:
- Request: 1Gi RAM, 500m CPU
- Limit: 2Gi RAM, 1000m CPU
- Storage: 20Gi PVC
- Retention: 200h

**Grafana**:
- Request: 256Mi RAM, 100m CPU
- Limit: 512Mi RAM, 200m CPU
- Storage: 5Gi PVC

**Jaeger**:
- Request: 256Mi RAM, 100m CPU
- Limit: 512Mi RAM, 200m CPU

**Monitoring total**:
- RAM requests: ~1.5 GB
- RAM limits: ~3 GB
- CPU requests: ~0.7 cores
- CPU limits: ~1.4 cores
- Storage: 25 GB

### 6. External Services (Optional in Production)

**Nominatim** (Disabled by default - can use external geocoding API):
- If enabled: 2Gi/1 CPU request, 4Gi/2 CPU limit
- Storage: 70Gi (50Gi data + 20Gi flatnode)
- **Recommendation**: Use external geocoding service (Google Maps API, Mapbox) for pilot to save resources

---

## Total Resource Summary

### With Monitoring, Without Nominatim (Recommended)

| Resource | Requests | Limits | Recommended VPS |
|----------|----------|--------|-----------------|
| **RAM** | ~21 GB | ~48 GB | **20 GB** |
| **CPU** | ~8.5 cores | ~41 cores | **8 vCPU** |
| **Storage** | ~79 GB | - | **200 GB NVMe** |
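
The RAM and CPU rows of this table can be reproduced by adding up the per-section totals listed under Resource Analysis (standard services, ML services, databases, infrastructure, gateway and frontend, monitoring). A short sketch of that sum, with the dictionary name chosen for illustration:

```python
# Per-section totals copied from the Resource Analysis sections:
# (RAM requests GB, RAM limits GB, CPU requests cores, CPU limits cores)
SECTION_TOTALS = {
    "standard services": (10.0, 20.0, 3.9, 19.5),
    "ML services": (2.3, 11.0, 1.0, 7.0),
    "databases": (4.6, 9.2, 1.8, 9.0),
    "infrastructure": (0.8, 1.5, 0.3, 1.5),
    "gateway and frontend": (1.8, 3.5, 0.8, 2.5),
    "monitoring": (1.5, 3.0, 0.7, 1.4),
}

# zip(*values) regroups the tuples column by column before summing.
ram_req, ram_lim, cpu_req, cpu_lim = (
    round(sum(column), 1) for column in zip(*SECTION_TOTALS.values())
)

print(f"RAM: {ram_req} GB requested, {ram_lim} GB limit")
print(f"CPU: {cpu_req} cores requested, {cpu_lim} cores limit")
```

The sums come to roughly 21 GB / 48.2 GB of RAM and 8.5 / 40.9 cores, in line with the rounded values in the table.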

### Memory Calculation Details
- Application services: 14.1 GB requests / 34.5 GB limits
- Databases: 4.6 GB requests / 9.2 GB limits
- Infrastructure: 0.8 GB requests / 1.5 GB limits
- Gateway/Frontend: 1.8 GB requests / 3.5 GB limits
- Monitoring: 1.5 GB requests / 3 GB limits
- **Total requests**: ~22.8 GB
- **Total limits**: ~51.7 GB

### Why 20 GB RAM is Sufficient

1. **Requests vs Limits**: Kubernetes schedules pods by their requests, not their limits. The ~22.8 GB request total assumes every service runs at its full replica count; the pilot footprint is lower because:
   - Not all services will run at their request levels simultaneously during the pilot
   - HPA-enabled services (orders, forecasting, notification) start at 1 replica
   - Some overhead is already included in our calculations

2. **Actual Usage**: Production limits are safety margins. For 10 tenants, real usage is expected to stay well below them:
   - Most services use 40-60% of their limits under normal load
   - Pilot traffic is significantly lower than peak design capacity

3. **Cost-Effective Pilot**: Starting with 20 GB allows:
   - Room for monitoring and logging
   - Comfortable headroom (15-25%)
   - Easy vertical scaling if needed

### CPU Calculation Details
- Application services: 5.7 cores requests / 28.5 cores limits
- Databases: 1.8 cores requests / 9 cores limits
- Infrastructure: 0.3 cores requests / 1.5 cores limits
- Gateway/Frontend: 0.8 cores requests / 2.5 cores limits
- Monitoring: 0.7 cores requests / 1.4 cores limits
- **Total requests**: ~9.3 cores
- **Total limits**: ~42.9 cores

### Storage Calculation
- Databases: 36 GB (18 × 2Gi)
- Model storage: 10 GB
- Infrastructure (Redis, RabbitMQ): 3 GB
- Monitoring: 25 GB
- OS and container images: ~30 GB
- Growth buffer: ~95 GB
- **Total**: ~199 GB → **200 GB NVMe recommended**
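
The storage total is a straight sum of the items above. As a quick check, with the per-item breakdowns (1 + 2 Gi for Redis and RabbitMQ, 20 + 5 Gi for Prometheus and Grafana) taken from the Resource Analysis sections:

```python
# Storage components in GB, as listed in the storage calculation above.
STORAGE_GB = {
    "databases (18 x 2 Gi)": 18 * 2,
    "model storage": 10,
    "infrastructure (Redis 1 Gi + RabbitMQ 2 Gi)": 1 + 2,
    "monitoring (Prometheus 20 Gi + Grafana 5 Gi)": 20 + 5,
    "OS and container images (estimate)": 30,
    "growth buffer (estimate)": 95,
}

total_gb = sum(STORAGE_GB.values())
print(f"total: {total_gb} GB")
```

The last two entries are estimates rather than provisioned volumes, so they are the ones to revisit once real usage data is available.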

---

## Scaling Considerations

### Horizontal Pod Autoscaling (HPA)

Already configured for:
1. **orders-service**: 1-3 replicas based on CPU (70%) and memory (80%)
2. **forecasting-service**: 1-3 replicas based on CPU (70%) and memory (75%)
3. **notification-service**: 1-3 replicas based on CPU (70%) and memory (80%)

These services will automatically scale up under load without manual intervention.
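
To make the scaling behaviour concrete, the sketch below is a simplified Python rendering of the replica calculation described in the Kubernetes HPA documentation: each metric proposes `ceil(current * observed / target)`, the largest proposal wins, and the result is clamped to the configured range. It assumes the default 10% tolerance and ignores stabilization windows and pod readiness; the function name and the sample utilization values are hypothetical.

```python
import math


def desired_replicas(
    current: int,
    utilization: dict[str, float],
    targets: dict[str, float],
    min_replicas: int = 1,
    max_replicas: int = 3,
    tolerance: float = 0.1,
) -> int:
    """Simplified HPA calculation; utilization and targets are percentages of the request."""
    proposals = []
    for metric, target in targets.items():
        ratio = utilization[metric] / target
        if abs(ratio - 1.0) <= tolerance:
            # Close enough to the target: this metric proposes no change.
            proposals.append(current)
        else:
            proposals.append(math.ceil(current * ratio))
    # The largest proposal wins, then the result is clamped to [min, max].
    return max(min_replicas, min(max_replicas, max(proposals)))


# Targets for orders-service, from the list above.
ORDERS_TARGETS = {"cpu": 70, "memory": 80}

# Hypothetical observations: CPU at 140% of request, memory at 60%.
print(desired_replicas(1, {"cpu": 140, "memory": 60}, ORDERS_TARGETS))
```

With these sample values CPU proposes 2 replicas and memory proposes 1, so the service scales from 1 to 2. A much larger spike is capped at the configured maximum of 3.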

### Growth Path for 6-12 Months

If tenant count grows beyond 10:

| Tenants | RAM | CPU | Storage |
|---------|-----|-----|---------|
| 10 | 20 GB | 8 cores | 200 GB |
| 25 | 32 GB | 12 cores | 300 GB |
| 50 | 48 GB | 16 cores | 500 GB |
| 100+ | Consider Kubernetes cluster with multiple nodes |
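
For capacity planning scripts, the table can be expressed as a simple lookup. This is only a sketch: it assumes each row is the upper bound of its tier, and it treats anything above 50 tenants as beyond a single VPS, even though the table itself only names 100+ for the multi-node recommendation. The function name is hypothetical.

```python
from typing import Optional

# (max tenants, RAM GB, CPU cores, storage GB), copied from the table above.
TIERS = [
    (10, 20, 8, 200),
    (25, 32, 12, 300),
    (50, 48, 16, 500),
]


def recommended_tier(tenants: int) -> Optional[tuple[int, int, int]]:
    """Return (RAM GB, cores, storage GB), or None when a multi-node cluster is advised."""
    for max_tenants, ram_gb, cores, storage_gb in TIERS:
        if tenants <= max_tenants:
            return ram_gb, cores, storage_gb
    # Assumption: past the last single-VPS row, plan for multiple nodes.
    return None


print(recommended_tier(18))
```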

### Vertical Scaling

If you hit resource limits before adding more tenants:
1. Upgrade RAM first (most common bottleneck)
2. Then CPU if services show high utilization
3. Storage can be expanded independently

---

## Cost Optimization Strategies

### For Pilot Phase (Months 1-6)

1. **Disable Nominatim**: Use external geocoding API
   - Saves: 70 GB storage, 2 GB RAM, 1 CPU core
   - Cost: ~$5-10/month for external API (Google Maps, Mapbox)
   - **Recommendation**: Enable Nominatim only if >50 tenants

2. **Start Without Monitoring**: Add later if needed
   - Saves: 25 GB storage, 1.5 GB RAM, 0.7 CPU cores
   - **Not recommended** - monitoring is crucial for production

3. **Reduce Database Replicas**: Keep at 1 per service
   - Already configured in base
   - **Acceptable risk** for pilot phase

### After Pilot Success (Months 6+)

1. **Enable full HA**: Increase database replicas to 2
2. **Add Nominatim**: If external API costs exceed $20/month
3. **Upgrade VPS**: To 32 GB RAM / 12 cores for 25+ tenants

---

## Network and Additional Requirements

### Bandwidth
- Estimated: 2-5 TB/month for 10 tenants
- Includes: API traffic, frontend assets, image uploads, reports

### Backup Strategy
- Database backups: ~10 GB/day (compressed)
- Retention: 30 days
- Additional storage: 300 GB for backups (separate volume recommended)
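
The 300 GB figure is the daily backup size multiplied by the retention window. A minimal sketch, assuming every daily backup is a full compressed dump kept for the whole retention period (incremental or deduplicated backups would need less); the function name is illustrative:

```python
def backup_storage_gb(daily_backup_gb: float, retention_days: int) -> float:
    """Storage needed when every daily backup is retained for the full window."""
    return daily_backup_gb * retention_days


# Figures from the backup strategy above: ~10 GB/day, 30-day retention.
print(backup_storage_gb(10, 30))
```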

### Domain & SSL
- 1 domain: `yourdomain.com`
- SSL: Let's Encrypt (free) or wildcard certificate
- Ingress controller: nginx (included in stack)

---

## Deployment Checklist

### Pre-Deployment
- [ ] VPS provisioned with 20 GB RAM, 8 cores, 200 GB NVMe
- [ ] Docker and Kubernetes (k3s or similar) installed
- [ ] Domain DNS configured
- [ ] SSL certificates ready

### Initial Deployment
- [ ] Deploy with `skaffold run -p prod`
- [ ] Verify all pods running: `kubectl get pods -n bakery-ia`
- [ ] Check PVC status: `kubectl get pvc -n bakery-ia`
- [ ] Access frontend and test login
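
With roughly 39 application pods plus databases, scanning the pod list by eye is error-prone. One option is to request JSON output (`kubectl get pods -n bakery-ia -o json`) and filter it. The sketch below only inspects each pod's `status.phase`, so a pod that is Running but failing readiness probes will not be flagged; the function name and the sample pod names are hypothetical.

```python
import json


def pods_not_running(pod_list_json: str) -> list[str]:
    """Return names of pods whose phase is neither Running nor Succeeded."""
    items = json.loads(pod_list_json).get("items", [])
    return [
        pod["metadata"]["name"]
        for pod in items
        if pod.get("status", {}).get("phase") not in ("Running", "Succeeded")
    ]


# Hypothetical sample in the shape of a kubectl pod list; names are made up.
SAMPLE = json.dumps({
    "items": [
        {"metadata": {"name": "auth-service-abc"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "orders-service-def"}, "status": {"phase": "Pending"}},
    ]
})

print(pods_not_running(SAMPLE))
```

An empty result means every pod has at least reached the Running phase. A Pending pod is worth checking against node capacity, since the scheduler places pods according to their requests.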

### Post-Deployment Monitoring
- [ ] Set up external monitoring (UptimeRobot, Pingdom)
- [ ] Configure backup schedule
- [ ] Test database backups and restore
- [ ] Load test with simulated tenant traffic

---

## Support and Scaling

### When to Scale Up

Monitor these metrics:
1. **RAM usage consistently >80%** → Upgrade RAM
2. **CPU usage consistently >70%** → Upgrade CPU
3. **Storage >150 GB used** → Upgrade storage
4. **Response times >2 seconds** → Add replicas or upgrade VPS
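
These thresholds are easy to encode in an alerting or reporting script. The sketch below is one possible shape, assuming the caller passes values already averaged over a suitable window (which is how "consistently" is interpreted here); the function and parameter names are hypothetical.

```python
def scaling_actions(
    ram_pct: float,
    cpu_pct: float,
    storage_used_gb: float,
    response_time_s: float,
) -> list[str]:
    """Map averaged metrics to the scale-up actions listed above."""
    actions = []
    if ram_pct > 80:
        actions.append("Upgrade RAM")
    if cpu_pct > 70:
        actions.append("Upgrade CPU")
    if storage_used_gb > 150:
        actions.append("Upgrade storage")
    if response_time_s > 2:
        actions.append("Add replicas or upgrade VPS")
    return actions


# Hypothetical weekly averages: memory is the only metric over its threshold.
print(scaling_actions(ram_pct=85, cpu_pct=50, storage_used_gb=120, response_time_s=1.2))
```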

### Emergency Scaling

If you hit limits suddenly:
1. Scale down non-critical services temporarily
2. Disable monitoring temporarily (not recommended for >1 hour)
3. Increase VPS resources (clouding.io allows live upgrades)
4. Review and optimize resource-heavy queries

---

## Conclusion

The recommended **20 GB RAM / 8 vCPU / 200 GB NVMe** configuration provides:

- ✅ Comfortable headroom for 10-tenant pilot
- ✅ Full monitoring and observability
- ✅ High availability for critical services
- ✅ Room for traffic spikes (2-3x baseline)
- ✅ Cost-effective starting point
- ✅ Easy scaling path as you grow

**Total estimated compute cost**: €40-80/month (check clouding.io current pricing)

**Additional costs**: Domain (~€15/year), external APIs (~€10/month), backups (~€10/month)

**Next steps**:
1. Provision VPS at clouding.io
2. Follow deployment guide in `/docs/DEPLOYMENT.md`
3. Monitor resource usage for first 2 weeks
4. Adjust based on actual metrics