# Deployment Troubleshooting Guide

This guide addresses common deployment issues encountered with the Bakery IA system.

## Table of Contents

- [Too Many Open Files Error](#too-many-open-files-error)
- [RouteBuilder TypeError Fix](#routebuilder-typeerror-fix)
- [General Kubernetes Troubleshooting](#general-kubernetes-troubleshooting)
- [Running Fix Scripts](#running-fix-scripts)
- [Additional Resources](#additional-resources)
- [Support](#support)

## Too Many Open Files Error

### Symptoms

```
failed to create fsnotify watcher: too many open files
Error streaming distribution-service-7ff4db8c48-k4xw7 logs: failed to create fsnotify watcher: too many open files
```

### Root Cause

This error occurs when the system reaches its inotify limits. Kubernetes and Docker use inotify to monitor file system changes, so development environments that run many containers often exhaust the default limits.

### Solutions

#### For macOS (Docker Desktop)

1. **Increase Docker Resources**:
   - Open Docker Desktop
   - Go to Settings > Resources > Advanced
   - Increase memory allocation to 8GB or more
   - Restart Docker Desktop

2. **Clean Docker System**:
   ```bash
   # Warning: removes all unused images, containers, networks, and volumes
   docker system prune -a --volumes
   ```

3. **Adjust macOS System Limits**:
   ```bash
   # Persist the limits in /etc/sysctl.conf
   echo "kern.maxfiles=1048576" | sudo tee -a /etc/sysctl.conf
   echo "kern.maxfilesperproc=65536" | sudo tee -a /etc/sysctl.conf

   # Apply the limits to the running system
   sudo sysctl -w kern.maxfiles=1048576
   sudo sysctl -w kern.maxfilesperproc=65536
   ```

#### For Linux (Kubernetes Nodes)

1. **Temporary Fix** (lost on reboot):
   ```bash
   sudo sysctl -w fs.inotify.max_user_watches=524288
   sudo sysctl -w fs.inotify.max_user_instances=1024
   sudo sysctl -w fs.inotify.max_queued_events=16384
   ```

2. **Permanent Fix**:
   ```bash
   # Persist the limits in /etc/sysctl.conf
   echo "fs.inotify.max_user_watches=524288" | sudo tee -a /etc/sysctl.conf
   echo "fs.inotify.max_user_instances=1024" | sudo tee -a /etc/sysctl.conf
   echo "fs.inotify.max_queued_events=16384" | sudo tee -a /etc/sysctl.conf

   # Reload /etc/sysctl.conf
   sudo sysctl -p
   ```

3. **Restart Kubernetes Components**:
   ```bash
   sudo systemctl restart kubelet
   sudo systemctl restart docker
   ```

#### For Kind Clusters

```bash
# Delete and recreate the cluster
kind delete cluster
kind create cluster
```

#### For Minikube

```bash
minikube stop
minikube start
```

### Prevention

Add a security context to your deployments so that containers run as a non-root user and cannot escalate privileges. This hardens the pods but does not change the node's inotify limits, so the sysctl settings above still apply:

```yaml
securityContext:
  runAsUser: 1000
  runAsGroup: 1000
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: false
```

## RouteBuilder TypeError Fix

### Symptoms

```
TypeError: RouteBuilder.build_resource_detail_route() takes from 2 to 4 positional arguments but 5 were given
```

### Root Cause

Incorrect usage of RouteBuilder methods. Counting `self`, the `build_resource_detail_route` method accepts 2 to 4 positional arguments (1 to 3 explicit arguments), but it was called with 5 (4 explicit arguments).

### Solution

Use the correct RouteBuilder methods:

- **For nested resources**: Use `build_nested_resource_route()`
  ```python
  # Wrong
  route_builder.build_resource_detail_route("forecasts", "forecast_id", "feedback")

  # Correct
  route_builder.build_nested_resource_route("forecasts", "forecast_id", "feedback")
  ```

- **For resource actions**: Use `build_resource_action_route()`
  ```python
  # Wrong
  route_builder.build_resource_detail_route("forecasts", "forecast_id", "feedback", "retrain")

  # Correct
  route_builder.build_resource_action_route("forecasts", "forecast_id", "retrain")
  ```

### Files Fixed

- `services/forecasting/app/api/forecast_feedback.py`

## General Kubernetes Troubleshooting

### Check Pod Status

```bash
kubectl get pods -n bakery-ia
kubectl describe pod distribution-service -n bakery-ia
```

### Check Logs

`kubectl logs` needs either an exact pod name or a `deployment/<name>` reference, so the commands below target the deployment:

```bash
kubectl logs deployment/distribution-service -n bakery-ia
kubectl logs -f deployment/distribution-service -n bakery-ia  # Follow logs
```

### Check Resource Usage

```bash
kubectl top pods -n bakery-ia
kubectl describe nodes | grep -A 10 "Allocated resources"
```

### Restart Deployment

```bash
kubectl rollout restart deployment distribution-service -n bakery-ia
```

### Scale Down/Up

```bash
# Scale down to one replica
kubectl scale deployment distribution-service -n bakery-ia --replicas=1

# Scale back up to two replicas
kubectl scale deployment distribution-service -n bakery-ia --replicas=2
```

## Running Fix Scripts

### Fix Inotify Limits

```bash
cd scripts
./fix_kubernetes_inotify.sh
```

### Fix RouteBuilder Issues

The RouteBuilder issues have been fixed in the codebase. If you encounter similar issues:

1. Check the RouteBuilder method signatures in `shared/routing/route_builder.py`
2. Use the appropriate method for your routing pattern
3. Follow the examples in the fixed forecast feedback API

## Additional Resources

- [Kubernetes Inotify Limits](https://kind.sigs.k8s.io/docs/user/known-issues/#pod-errors-due-to-too-many-open-files)
- [Docker Desktop Resource Limits](https://docs.docker.com/desktop/settings/mac/#resources)
- [RouteBuilder Documentation](shared/routing/route_builder.py)

## Support

If issues persist after trying these solutions:

1. Check the specific error message and logs
2. Verify system resources (CPU, memory, disk)
3. Review recent changes to the codebase
4. Consult the architecture documentation for service boundaries
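
When verifying system resources on a Linux node, it can also help to confirm that the inotify limits recommended in the Linux section of this guide are actually in effect. The following is a minimal sketch, not part of the repository: the function name `check_inotify_limits` and the `RECOMMENDED` table are hypothetical, and it assumes the standard Linux `/proc/sys/fs/inotify` directory (it does not apply to macOS).

```python
from pathlib import Path

# Values recommended in the Linux section of this guide.
RECOMMENDED = {
    "max_user_watches": 524288,
    "max_user_instances": 1024,
    "max_queued_events": 16384,
}


def check_inotify_limits(base_dir="/proc/sys/fs/inotify"):
    """Return {name: (current, recommended)} for each limit below its recommended value.

    A limit whose file cannot be read or parsed is reported with current=None.
    """
    too_low = {}
    for name, recommended in RECOMMENDED.items():
        try:
            current = int((Path(base_dir) / name).read_text().strip())
        except (OSError, ValueError):
            current = None
        if current is None or current < recommended:
            too_low[name] = (current, recommended)
    return too_low


if __name__ == "__main__":
    problems = check_inotify_limits()
    if not problems:
        print("inotify limits meet the recommended values")
    for name, (current, recommended) in problems.items():
        print(f"fs.inotify.{name}: current={current}, recommended>={recommended}")
```

Any limit the script reports can then be raised with the `sysctl` commands from the Linux section. The `base_dir` parameter exists only so the check can be pointed at a different directory, for example in a test.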