Deployment Troubleshooting Guide
This guide addresses common deployment issues encountered with the Bakery IA system.
Too Many Open Files Error
Symptoms
failed to create fsnotify watcher: too many open files
Error streaming distribution-service-7ff4db8c48-k4xw7 logs: failed to create fsnotify watcher: too many open files
Root Cause
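This error occurs when the system hits its inotify limits. inotify is the Linux kernel facility that Kubernetes and Docker tooling use to watch for file system changes, and each watcher consumes an inotify instance plus a number of watches. Running out is common in development environments with many containers.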
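To see which processes are holding inotify instances, one option is to scan the file descriptors under /proc. The following is a minimal sketch, assuming a Linux host where inotify descriptors appear as symlinks to `anon_inode:inotify`; the function name `count_inotify_instances` is illustrative and not part of the Bakery IA codebase.

```python
import os
from collections import Counter


def count_inotify_instances(proc_root="/proc"):
    """Return a Counter mapping PID -> number of open inotify instances."""
    counts = Counter()
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return counts  # no /proc here (for example, not a Linux host)
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directories are processes
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or we lack permission
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while we were looking
            # Each inotify instance shows up as this anonymous inode.
            if target == "anon_inode:inotify":
                counts[int(entry)] += 1
    return counts
```

Calling `count_inotify_instances().most_common(10)` lists the ten heaviest consumers. Run it with sudo to include processes owned by other users.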
Solutions
For macOS (Docker Desktop)
- Increase Docker Resources:
  - Open Docker Desktop
  - Go to Settings > Resources > Advanced
  - Increase memory allocation to 8GB or more
  - Restart Docker Desktop
- Clean Docker System:
  docker system prune -a --volumes
- Adjust macOS System Limits:
  # Add to /etc/sysctl.conf
  echo "kern.maxfiles=1048576" | sudo tee -a /etc/sysctl.conf
  echo "kern.maxfilesperproc=65536" | sudo tee -a /etc/sysctl.conf
  # Apply changes
  sudo sysctl -w kern.maxfiles=1048576
  sudo sysctl -w kern.maxfilesperproc=65536
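The kernel-wide values above are upper bounds; an individual process is also subject to its own open-file limit. As a rough check, the sketch below reads that per-process limit with Python's standard `resource` module (available on macOS and Linux). The helper names are illustrative only.

```python
import resource


def open_file_limits():
    """Return (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)
```

For example, `soft, hard = open_file_limits()` followed by `print(describe(soft), describe(hard))` shows the limits a process started from the same shell would inherit.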
For Linux (Kubernetes Nodes)
- Temporary Fix:
  sudo sysctl -w fs.inotify.max_user_watches=524288
  sudo sysctl -w fs.inotify.max_user_instances=1024
  sudo sysctl -w fs.inotify.max_queued_events=16384
- Permanent Fix:
  # Add to /etc/sysctl.conf
  echo "fs.inotify.max_user_watches=524288" | sudo tee -a /etc/sysctl.conf
  echo "fs.inotify.max_user_instances=1024" | sudo tee -a /etc/sysctl.conf
  echo "fs.inotify.max_queued_events=16384" | sudo tee -a /etc/sysctl.conf
  # Apply changes
  sudo sysctl -p
- Restart Kubernetes Components:
  sudo systemctl restart kubelet
  sudo systemctl restart docker
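To confirm the new values took effect on a node, the current limits can be read back from /proc/sys/fs/inotify and compared with the numbers used above. This is a minimal sketch under that assumption; `check_inotify_limits` is a hypothetical helper, not an existing script in the repository.

```python
import os

# Target values copied from the sysctl commands in this section.
RECOMMENDED = {
    "max_user_watches": 524288,
    "max_user_instances": 1024,
    "max_queued_events": 16384,
}


def check_inotify_limits(base_dir="/proc/sys/fs/inotify", recommended=RECOMMENDED):
    """Return {name: (current, wanted)} for every limit below its target value."""
    too_low = {}
    for name, wanted in recommended.items():
        with open(os.path.join(base_dir, name)) as fh:
            current = int(fh.read().strip())
        if current < wanted:
            too_low[name] = (current, wanted)
    return too_low
```

An empty result means all three limits are at or above the values from this guide; any entry returned names a limit that still needs raising.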
For Kind Clusters
# Delete and recreate cluster
kind delete cluster
kind create cluster
For Minikube
minikube stop
minikube start
Prevention
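Add a security context to your deployments so that containers run as a non-root user and cannot escalate privileges. This hardens the workload; it does not by itself cap inotify or file-descriptor usage, so the node limits above still need to be set:
securityContext:
  runAsUser: 1000
  runAsGroup: 1000
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: false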
RouteBuilder TypeError Fix
Symptoms
TypeError: RouteBuilder.build_resource_detail_route() takes from 2 to 4 positional arguments but 5 were given
Root Cause
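Incorrect usage of RouteBuilder methods. Python's error message counts the implicit self argument, so "takes from 2 to 4 positional arguments but 5 were given" means that build_resource_detail_route accepts one to three explicit arguments but was called with four.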
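The argument counting can be reproduced in isolation. The class below is a hypothetical stand-in whose signature was chosen only to match the numbers in the error message; the real RouteBuilder and its parameter names may differ.

```python
class RouteBuilder:
    """Hypothetical stand-in; the real class lives in the project's codebase."""

    def build_resource_detail_route(self, resource, id_param=None, sub_path=None):
        # Assumed behaviour for illustration: join the parts into a path.
        parts = [resource]
        if id_param:
            parts.append("{" + id_param + "}")
        if sub_path:
            parts.append(sub_path)
        return "/" + "/".join(parts)


def call_with_too_many_args():
    """Call the method with four explicit arguments and return the error text."""
    route_builder = RouteBuilder()
    try:
        # Four explicit arguments plus the implicit self make five.
        route_builder.build_resource_detail_route(
            "forecasts", "forecast_id", "feedback", "retrain"
        )
    except TypeError as exc:
        return str(exc)
    return None
```

The returned message reports five arguments even though only four appear at the call site, because the bound instance is passed as the first positional argument.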
Solution
Use the correct RouteBuilder methods:
- For nested resources: Use build_nested_resource_route()
  # Wrong
  route_builder.build_resource_detail_route("forecasts", "forecast_id", "feedback")
  # Correct
  route_builder.build_nested_resource_route("forecasts", "forecast_id", "feedback")
- For resource actions: Use build_resource_action_route()
  # Wrong
  route_builder.build_resource_detail_route("forecasts", "forecast_id", "feedback", "retrain")
  # Correct
  route_builder.build_resource_action_route("forecasts", "forecast_id", "retrain")
Files Fixed
services/forecasting/app/api/forecast_feedback.py
General Kubernetes Troubleshooting
Check Pod Status
kubectl get pods -n bakery-ia
kubectl describe pod distribution-service -n bakery-ia
Check Logs
kubectl logs deployment/distribution-service -n bakery-ia
kubectl logs -f deployment/distribution-service -n bakery-ia # Follow logs
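When the logs of one specific pod are needed, keep in mind that pod names carry a generated suffix, as in distribution-service-7ff4db8c48-k4xw7 from the Symptoms above. The sketch below picks matching pod names out of the JSON produced by `kubectl get pods -n bakery-ia -o json`; the helper name `pods_with_prefix` is illustrative.

```python
import json


def pods_with_prefix(pod_list_json, prefix):
    """Return names of pods in a `kubectl get pods -o json` document
    whose name starts with the given prefix."""
    doc = json.loads(pod_list_json)
    return [
        item["metadata"]["name"]
        for item in doc.get("items", [])
        if item["metadata"]["name"].startswith(prefix)
    ]
```

The JSON text can come from `subprocess.run([...], capture_output=True, text=True).stdout`, and each returned name can then be passed to `kubectl logs`.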
Check Resource Usage
kubectl top pods -n bakery-ia
kubectl describe nodes | grep -A 10 "Allocated resources"
Restart Deployment
kubectl rollout restart deployment distribution-service -n bakery-ia
Scale Down/Up
kubectl scale deployment distribution-service -n bakery-ia --replicas=1
kubectl scale deployment distribution-service -n bakery-ia --replicas=2
Running Fix Scripts
Fix Inotify Limits
cd scripts
./fix_kubernetes_inotify.sh
Fix RouteBuilder Issues
The RouteBuilder issues have been fixed in the codebase. If you encounter similar issues:
- Check the RouteBuilder method signatures in shared/routing/route_builder.py
- Use the appropriate method for your routing pattern
- Follow the examples in the fixed forecast feedback API
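Signature mismatches of this kind can also be caught before the call is made. The following sketch uses the standard `inspect` module to test whether a set of arguments fits a callable; `check_call` and the sample function are hypothetical names used for illustration.

```python
import inspect


def check_call(func, *args, **kwargs):
    """Return None if the arguments fit func's signature, else the error text."""
    try:
        # bind() applies the same matching rules as a real call,
        # without executing the function.
        inspect.signature(func).bind(*args, **kwargs)
    except TypeError as exc:
        return str(exc)
    return None


def sample_route(resource, id_param=None, sub_path=None):
    """Hypothetical function standing in for a route-building method."""
    return (resource, id_param, sub_path)
```

`check_call(sample_route, "forecasts", "forecast_id")` returns None, while passing a fourth positional argument returns an error string. For a bound method, `inspect.signature` already excludes self, so only the explicit arguments are counted.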
Support
If issues persist after trying these solutions:
- Check the specific error message and logs
- Verify system resources (CPU, memory, disk)
- Review recent changes to the codebase
- Consult the architecture documentation for service boundaries