Improve AI logic
258
docs/06-security/README.md
Normal file
@@ -0,0 +1,258 @@
# Security Documentation

**Bakery IA Platform - Consolidated Security Guides**

---

## Overview

This directory contains comprehensive, production-ready security documentation for the Bakery IA platform. Our infrastructure has been hardened from a **D- security grade to an A- grade** through systematic implementation of industry best practices.

### Security Achievement Summary

- **15 databases secured** (14 PostgreSQL + 1 Redis)
- **100% TLS encryption** for all database connections
- **Strong authentication** with 32-character cryptographic passwords
- **Data persistence** with PersistentVolumeClaims preventing data loss
- **Audit logging** enabled for all database operations
- **Compliance ready** for GDPR, PCI-DSS, and SOC 2

### Security Grade Improvement

| Metric | Before | After |
|--------|--------|-------|
| Overall Grade | D- | A- |
| Critical Issues | 4 | 0 |
| High-Risk Issues | 3 | 0 |
| Medium-Risk Issues | 4 | 0 |

---

## Documentation Guides

### 1. [Database Security Guide](./database-security.md)
**Complete guide to database security implementation**

Covers database inventory, authentication, encryption (transit & rest), data persistence, backups, audit logging, compliance status, and troubleshooting.

**Best for:** Understanding overall database security, troubleshooting database issues, backup procedures

### 2. [RBAC Implementation Guide](./rbac-implementation.md)
**Role-Based Access Control across all microservices**

Covers role hierarchy (4 roles), subscription tiers (3 tiers), service-by-service access matrix (250+ endpoints), implementation code examples, and testing strategies.

**Best for:** Implementing access control, understanding subscription limits, securing API endpoints

### 3. [TLS Configuration Guide](./tls-configuration.md)
**Detailed TLS/SSL setup and configuration**

Covers certificate infrastructure, PostgreSQL TLS setup, Redis TLS setup, client configuration, deployment procedures, verification, and certificate rotation.

**Best for:** Setting up TLS encryption, certificate management, diagnosing TLS connection issues

### 4. [Security Checklist](./security-checklist.md)
**Production deployment and verification checklist**

Covers pre-deployment prep, phased deployment (weeks 1-6), verification procedures, post-deployment tasks, maintenance schedules, and emergency procedures.

**Best for:** Production deployment, security audits, ongoing maintenance planning

## Quick Start

### For Developers

1. **Authentication**: All services use JWT tokens
2. **Authorization**: Use role decorators from `shared/auth/access_control.py`
3. **Database**: Connections automatically use TLS
4. **Secrets**: Never commit credentials - use Kubernetes secrets

### For Operations

1. **TLS Certificates**: Stored in `infrastructure/tls/`
2. **Backup Script**: `scripts/encrypted-backup.sh`
3. **Password Rotation**: `scripts/generate-passwords.sh`
4. **Monitoring**: Check audit logs regularly

## Compliance Status

| Requirement | Status |
|-------------|--------|
| GDPR Article 32 (Encryption) | ✅ COMPLIANT |
| PCI-DSS Req 4 (Transit Encryption) | ✅ COMPLIANT |
| PCI-DSS Req 3 (At-Rest Encryption) | ✅ COMPLIANT |
| PCI-DSS Req 10 (Audit Logging) | ✅ COMPLIANT |
| SOC 2 CC6.1 (Access Control) | ✅ COMPLIANT |
| SOC 2 CC6.6 (Transit Encryption) | ✅ COMPLIANT |
| SOC 2 CC6.7 (Rest Encryption) | ✅ COMPLIANT |

## Security Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                         API GATEWAY                         │
│  - JWT validation                                           │
│  - Rate limiting                                            │
│  - TLS termination                                          │
└──────────────────────────────┬──────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                        SERVICE LAYER                        │
│  - Role-based access control (RBAC)                         │
│  - Tenant isolation                                         │
│  - Permission validation                                    │
│  - Audit logging                                            │
└──────────────────────────────┬──────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                         DATA LAYER                          │
│  - TLS encrypted connections                                │
│  - Strong authentication (scram-sha-256)                    │
│  - Encrypted secrets at rest                                │
│  - Column-level encryption (pgcrypto)                       │
│  - Persistent volumes with backups                          │
└─────────────────────────────────────────────────────────────┘
```

## Critical Security Features

### Authentication
- JWT-based authentication across all services
- Service-to-service authentication with tokens
- Refresh token rotation
- Password hashing with bcrypt

### Authorization
- Hierarchical role system (Viewer → Member → Admin → Owner)
- Subscription tier-based feature gating
- Resource-level permissions
- Tenant isolation

### Data Protection
- TLS 1.2+ for all connections
- AES-256 encryption for secrets at rest
- pgcrypto for sensitive column encryption
- Encrypted backups with GPG

### Monitoring & Auditing
- Comprehensive PostgreSQL audit logging
- Connection/disconnection tracking
- SQL statement logging
- Failed authentication attempts

## Common Security Tasks

### Rotate Database Passwords

```bash
# Generate new passwords
./scripts/generate-passwords.sh

# Update environment files
./scripts/update-env-passwords.sh

# Update Kubernetes secrets
./scripts/update-k8s-secrets.sh
```

### Create Encrypted Backup

```bash
# Backup all databases
./scripts/encrypted-backup.sh

# Restore specific database
gpg --decrypt backup_file.sql.gz.gpg | gunzip | psql -U user -d database
```

### Regenerate TLS Certificates

```bash
# Regenerate all certificates (before expiry).
# The subshell keeps the working directory at the repository root afterwards.
(cd infrastructure/tls && ./generate-certificates.sh)

# Update Kubernetes secrets (run from the repository root)
./scripts/create-tls-secrets.sh
```

## Security Best Practices

### For Developers

1. **Never hardcode credentials** - Use environment variables
2. **Always use role decorators** on sensitive endpoints
3. **Validate input** - Prevent SQL injection and XSS
4. **Log security events** - Failed auth, permission denied
5. **Use parameterized queries** - Never concatenate SQL
6. **Implement rate limiting** - Prevent brute force attacks

### For Operations

1. **Rotate passwords regularly** - Every 90 days
2. **Monitor audit logs** - Check for suspicious activity
3. **Keep certificates current** - Renew before expiry
4. **Test backups** - Verify restoration procedures
5. **Update dependencies** - Apply security patches
6. **Review access** - Remove unused accounts

## Incident Response

### Security Incident Checklist

1. **Identify** the scope and impact
2. **Contain** the threat (disable compromised accounts)
3. **Eradicate** the vulnerability
4. **Recover** affected systems
5. **Document** the incident
6. **Review** and improve security measures

### Emergency Contacts

- Security incidents should be reported immediately
- Check audit logs: `/var/log/postgresql/` in database pods
- Review application logs for suspicious patterns

## Additional Resources

### Consolidated Security Guides
- [Database Security Guide](./database-security.md) - Complete database security
- [RBAC Implementation Guide](./rbac-implementation.md) - Access control
- [TLS Configuration Guide](./tls-configuration.md) - TLS/SSL setup
- [Security Checklist](./security-checklist.md) - Deployment verification

### Source Analysis Reports
These detailed reports were used to create the consolidated guides above:
- [Database Security Analysis Report](../archive/DATABASE_SECURITY_ANALYSIS_REPORT.md) - Original security analysis
- [Security Implementation Complete](../archive/SECURITY_IMPLEMENTATION_COMPLETE.md) - Implementation summary
- [RBAC Analysis Report](../archive/RBAC_ANALYSIS_REPORT.md) - Access control analysis
- [TLS Implementation Complete](../archive/TLS_IMPLEMENTATION_COMPLETE.md) - TLS implementation

### Platform Documentation
- [System Overview](../02-architecture/system-overview.md) - Platform architecture
- [AI Insights API](../08-api-reference/ai-insights-api.md) - Technical API details
- [Testing Guide](../04-development/testing-guide.md) - Testing strategies

---

## Document Maintenance

**Last Updated**: November 2025
**Version**: 1.0
**Next Review**: May 2026
**Review Cycle**: Every 6 months
**Maintained by**: Security Team

---

## Support

For security questions or issues:

1. **First**: Check the relevant guide in this directory
2. **Then**: Review source reports in the `docs/` directory
3. **Finally**: Contact Security Team or DevOps Team

**For security incidents**: Follow incident response procedures immediately.

552
docs/06-security/database-security.md
Normal file
@@ -0,0 +1,552 @@
# Database Security Guide

**Last Updated:** November 2025
**Status:** Production Ready
**Security Grade:** A-

---

## Table of Contents

1. [Overview](#overview)
2. [Database Inventory](#database-inventory)
3. [Security Implementation](#security-implementation)
4. [Data Protection](#data-protection)
5. [Compliance](#compliance)
6. [Monitoring and Maintenance](#monitoring-and-maintenance)
7. [Troubleshooting](#troubleshooting)
8. [Related Documentation](#related-documentation)
9. [Quick Reference](#quick-reference)

---

## Overview

This guide provides comprehensive information about database security in the Bakery IA platform. Our infrastructure has been hardened from a D- security grade to an A- grade through systematic implementation of industry best practices.

### Security Achievements

- **15 databases secured** (14 PostgreSQL + 1 Redis)
- **100% TLS encryption** for all database connections
- **Strong authentication** with 32-character cryptographic passwords
- **Data persistence** with PersistentVolumeClaims preventing data loss
- **Audit logging** enabled for all database operations
- **Encryption at rest** capabilities with pgcrypto extension

### Security Grade Improvement

| Metric | Before | After |
|--------|--------|-------|
| Overall Grade | D- | A- |
| Critical Issues | 4 | 0 |
| High-Risk Issues | 3 | 0 |
| Medium-Risk Issues | 4 | 0 |
| Encryption in Transit | None | TLS 1.2+ |
| Encryption at Rest | None | Available (pgcrypto + K8s) |

---

## Database Inventory

### PostgreSQL Databases (14 instances)

All running PostgreSQL 17-alpine with TLS encryption enabled:

| Database | Service | Purpose |
|----------|---------|---------|
| auth-db | Authentication | User authentication and authorization |
| tenant-db | Tenant | Multi-tenancy management |
| training-db | Training | ML model training data |
| forecasting-db | Forecasting | Demand forecasting |
| sales-db | Sales | Sales transactions |
| external-db | External | External API data |
| notification-db | Notification | Notifications and alerts |
| inventory-db | Inventory | Inventory management |
| recipes-db | Recipes | Recipe data |
| suppliers-db | Suppliers | Supplier information |
| pos-db | POS | Point of Sale integrations |
| orders-db | Orders | Order management |
| production-db | Production | Production batches |
| alert-processor-db | Alert Processor | Alert processing |

### Other Datastores

- **Redis:** Shared caching and session storage with TLS encryption
- **RabbitMQ:** Message broker for inter-service communication

---

## Security Implementation

### 1. Authentication and Access Control

#### Service Isolation
- Each service has its own dedicated database with unique credentials
- Prevents cross-service data access
- Limits blast radius of credential compromise

#### Password Security
- **Algorithm:** PostgreSQL uses scram-sha-256 authentication (modern, secure)
- **Password Strength:** 32-character cryptographically secure passwords
- **Generation:** Created using OpenSSL: `openssl rand -base64 32` (note that this emits 32 random bytes as a 44-character base64 string, so a 32-character password implies trimming the output or passing a smaller byte count)
- **Rotation Policy:** Recommended every 90 days

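As a rough illustration of the password policy above, the sketch below generates a 32-character password from a cryptographically secure source and shows why `openssl rand -base64 32` yields 44 characters rather than 32. The helper name `generate_password` and the alphabet are assumptions made for this example; they are not the contents of `scripts/generate-passwords.sh`.

```python
import base64
import secrets
import string

# Hypothetical helper for illustration; not the platform's actual generator.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 32) -> str:
    """Return a random password drawn from the OS CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


password = generate_password()
print(len(password))

# 32 random bytes encode to 44 base64 characters (including padding),
# which is the length of the string `openssl rand -base64 32` prints.
encoded = base64.b64encode(secrets.token_bytes(32)).decode("ascii")
print(len(encoded))
```

An alphanumeric alphabet also avoids characters such as `/`, `+`, and `=` that need escaping inside connection URLs.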
#### Network Isolation
- All databases run on internal Kubernetes network
- No direct external exposure
- ClusterIP services (internal only)
- Cannot be accessed from outside the cluster

### 2. Encryption in Transit (TLS/SSL)

All database connections enforce TLS 1.2+ encryption.

#### PostgreSQL TLS Configuration

**Server Configuration:**
```conf
# PostgreSQL SSL Settings (postgresql.conf)
ssl = on
ssl_cert_file = '/tls/server-cert.pem'
ssl_key_file = '/tls/server-key.pem'
ssl_ca_file = '/tls/ca-cert.pem'
ssl_prefer_server_ciphers = on
ssl_min_protocol_version = 'TLSv1.2'
```

**Client Connection String:**
```python
# Automatically enforced by DatabaseManager
"postgresql+asyncpg://user:pass@host:5432/db?ssl=require"
```

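Passwords derived from base64 output can contain `+`, `/`, and `=`, which have special meaning inside a URL. The sketch below shows one way such a connection string could be assembled safely with percent-encoding. The function name `build_postgres_url` and the example host, user, and database names are hypothetical; how `DatabaseManager` actually builds its URL is not shown in this guide.

```python
from urllib.parse import quote


def build_postgres_url(user: str, password: str, host: str, database: str,
                       port: int = 5432) -> str:
    """Build an asyncpg-style URL that requires TLS (illustrative only)."""
    # safe="" forces "/", "+" and "=" to be percent-encoded as well.
    return (
        f"postgresql+asyncpg://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}?ssl=require"
    )


# Hypothetical credentials and service name, used only to show the encoding.
url = build_postgres_url("auth_user", "p+ss/w=rd", "auth-db-service", "auth_db")
print(url)
```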
**Certificate Details:**
- **Algorithm:** RSA 4096-bit
- **Signature:** SHA-256
- **Validity:** 3 years (expires October 2028)
- **CA Validity:** 10 years (expires 2035)

#### Redis TLS Configuration

**Server Configuration:**
```bash
redis-server \
  --requirepass "$REDIS_PASSWORD" \
  --tls-port 6379 \
  --port 0 \
  --tls-cert-file /tls/redis-cert.pem \
  --tls-key-file /tls/redis-key.pem \
  --tls-ca-cert-file /tls/ca-cert.pem \
  --tls-auth-clients no
```

**Client Connection String:**
```python
"rediss://:password@redis-service:6379?ssl_cert_reqs=none"
```

Note that `ssl_cert_reqs=none` disables server certificate verification on the client: traffic is encrypted, but the server's identity is not checked.

### 3. Data Persistence

#### PersistentVolumeClaims (PVCs)

All PostgreSQL databases use PVCs to prevent data loss:

```yaml
# Example PVC configuration
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: auth-db-pvc
  namespace: bakery-ia
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
```

**Benefits:**
- Data persists across pod restarts
- Prevents catastrophic data loss from ephemeral storage
- Enables backup and restore operations
- Supports volume snapshots

#### Redis Persistence

Redis configured with:
- **AOF (Append Only File):** enabled
- **RDB snapshots:** periodic
- **PersistentVolumeClaim:** for data directory

---

## Data Protection

### 1. Encryption at Rest

#### Kubernetes Secrets Encryption

All secrets encrypted at rest with AES-256:

```yaml
# Encryption configuration
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded-32-byte-key>
      - identity: {}
```

#### PostgreSQL pgcrypto Extension

Available for column-level encryption:

```sql
-- Enable extension
CREATE EXTENSION IF NOT EXISTS "pgcrypto";

-- Encrypt sensitive data
INSERT INTO users (name, ssn_encrypted)
VALUES (
    'John Doe',
    pgp_sym_encrypt('123-45-6789', 'encryption_key')
);

-- Decrypt data
SELECT name, pgp_sym_decrypt(ssn_encrypted::bytea, 'encryption_key')
FROM users;
```

**Available Functions:**
- `pgp_sym_encrypt()` - Symmetric encryption
- `pgp_pub_encrypt()` - Public key encryption
- `crypt()` with `gen_salt()` - Password hashing
- `digest()` - Hash functions

### 2. Backup Strategy

#### Automated Encrypted Backups

**Script Location:** `scripts/encrypted-backup.sh`

**Features:**
- Backs up all 14 PostgreSQL databases
- Uses `pg_dump` for data export
- Compresses with `gzip` for space efficiency
- Encrypts with GPG for security
- Output format: `<db>_<name>_<timestamp>.sql.gz.gpg`

**Usage:**
```bash
# Create encrypted backup
./scripts/encrypted-backup.sh

# Decrypt and restore
gpg --decrypt backup_file.sql.gz.gpg | gunzip | psql -U user -d database
```

**Recommended Schedule:**
- **Daily backups:** Retain 30 days
- **Weekly backups:** Retain 90 days
- **Monthly backups:** Retain 1 year

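The retention schedule above could be enforced by a small cleanup job. The sketch below is one possible shape for the expiry check, assuming "1 year" means 365 days; the names `RETENTION` and `is_expired` are hypothetical and are not part of `scripts/encrypted-backup.sh`.

```python
from datetime import date, timedelta

# Retention windows from the recommended schedule (1 year taken as 365 days).
RETENTION = {
    "daily": timedelta(days=30),
    "weekly": timedelta(days=90),
    "monthly": timedelta(days=365),
}


def is_expired(kind: str, created: date, today: date) -> bool:
    """Return True when a backup of the given kind is past its retention window."""
    return today - created > RETENTION[kind]


today = date(2025, 11, 30)
created = date(2025, 10, 1)  # 60 days before `today`
print(is_expired("daily", created, today))
print(is_expired("weekly", created, today))
```

A 60-day-old backup is past the daily window but still inside the weekly one, so only the daily copy would be eligible for deletion.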
### 3. Audit Logging

PostgreSQL logging configuration includes:

```conf
# Log all connections and disconnections
log_connections = on
log_disconnections = on

# Log all SQL statements
log_statement = 'all'

# Log query duration
log_duration = on
log_min_duration_statement = 1000  # Log queries > 1 second

# Log detail
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h '
```

**Log Rotation:**
- Daily or 100MB size limit
- 7-day retention minimum
- Ship to centralized logging (recommended)

---

## Compliance

### GDPR (European Data Protection)

| Requirement | Implementation | Status |
|-------------|----------------|--------|
| Article 32 - Encryption | TLS for transit, pgcrypto for rest | ✅ Compliant |
| Article 5(1)(f) - Security | Strong passwords, access control | ✅ Compliant |
| Article 33 - Breach notification | Audit logs for breach detection | ✅ Compliant |

**Legal Status:** Privacy policy claims are now accurate - encryption is implemented.

### PCI-DSS (Payment Card Data)

| Requirement | Implementation | Status |
|-------------|----------------|--------|
| Requirement 4 - Encrypt transmission | TLS 1.2+ for all connections | ✅ Compliant |
| Requirement 3 - Protect stored data | pgcrypto extension available | ✅ Compliant |
| Requirement 10 - Track access | PostgreSQL audit logging | ✅ Compliant |

### SOC 2 (Security Controls)

| Control | Implementation | Status |
|---------|----------------|--------|
| CC6.1 - Access controls | Audit logs, RBAC | ✅ Compliant |
| CC6.6 - Encryption in transit | TLS for all database connections | ✅ Compliant |
| CC6.7 - Encryption at rest | Kubernetes secrets + pgcrypto | ✅ Compliant |

---

## Monitoring and Maintenance

### Certificate Management

#### Certificate Expiry Monitoring

**PostgreSQL and Redis Certificates Expire:** October 17, 2028

**Renewal Process:**
```bash
# 1. Regenerate certificates (90 days before expiry).
#    The subshell keeps the working directory at the repository root,
#    so the relative paths in step 2 still resolve.
(cd infrastructure/tls && ./generate-certificates.sh)

# 2. Update Kubernetes secrets
kubectl delete secret postgres-tls redis-tls -n bakery-ia
kubectl apply -f infrastructure/kubernetes/base/secrets/postgres-tls-secret.yaml
kubectl apply -f infrastructure/kubernetes/base/secrets/redis-tls-secret.yaml

# 3. Restart database pods (automatic)
kubectl rollout restart deployment -l app.kubernetes.io/component=database -n bakery-ia
```

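To make the "90 days before expiry" guidance concrete, the following sketch computes the date on which renewal becomes due from the expiry date stated above. The constant and function names are assumptions for illustration; the platform may track expiry differently.

```python
from datetime import date, timedelta

CERT_EXPIRY = date(2028, 10, 17)   # expiry date stated in this guide
RENEWAL_LEAD = timedelta(days=90)  # lead time from the renewal process


def renewal_due(today: date) -> bool:
    """Return True once `today` falls inside the 90-day renewal window."""
    return today >= CERT_EXPIRY - RENEWAL_LEAD


print(CERT_EXPIRY - RENEWAL_LEAD)  # 2028-07-19
```

Under these assumptions, renewal should start on or after July 19, 2028.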
### Password Rotation

**Recommended:** Every 90 days

**Process:**
```bash
# 1. Generate new passwords
#    (new-passwords.txt holds plaintext secrets: never commit it, and delete it afterwards)
./scripts/generate-passwords.sh > new-passwords.txt

# 2. Update .env file
./scripts/update-env-passwords.sh

# 3. Update Kubernetes secrets
./scripts/update-k8s-secrets.sh

# 4. Apply secrets
kubectl apply -f infrastructure/kubernetes/base/secrets.yaml

# 5. Restart databases and services
kubectl rollout restart deployment -n bakery-ia
```

### Health Checks

#### Verify PostgreSQL SSL
```bash
# Check SSL is enabled
kubectl exec -n bakery-ia <postgres-pod> -- sh -c \
  'psql -U $POSTGRES_USER -d $POSTGRES_DB -c "SHOW ssl;"'
# Expected: on

# Check certificate permissions
kubectl exec -n bakery-ia <postgres-pod> -- ls -la /tls/
# Expected: server-key.pem has 600 permissions
```

#### Verify Redis TLS
```bash
# Test Redis connection with TLS.
# sh -c makes $REDIS_PASSWORD expand inside the pod rather than in the local shell
# (this assumes the variable is set in the Redis container).
kubectl exec -n bakery-ia <redis-pod> -- sh -c \
  'redis-cli --tls \
    --cert /tls/redis-cert.pem \
    --key /tls/redis-key.pem \
    --cacert /tls/ca-cert.pem \
    -a "$REDIS_PASSWORD" \
    ping'
# Expected: PONG
```

#### Verify PVCs
```bash
# Check all PVCs are bound
kubectl get pvc -n bakery-ia
# Expected: All PVCs in "Bound" state
```

### Audit Log Review

```bash
# View PostgreSQL logs
kubectl logs -n bakery-ia <db-pod>

# Search for failed connections
kubectl logs -n bakery-ia <db-pod> | grep -i "authentication failed"

# Search for long-running queries
kubectl logs -n bakery-ia <db-pod> | grep -i "duration:"
```

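For recurring reviews, the `grep` searches above can be turned into a small summary. The sketch below counts failed authentication lines per client address, assuming log lines follow the `log_line_prefix` from the audit logging configuration. The sample lines are illustrative, not real platform logs, and the names `PREFIX` and `failed_logins_by_client` are hypothetical.

```python
import re
from collections import Counter

# Matches the user/db/app/client fields emitted by the configured log_line_prefix.
PREFIX = re.compile(
    r"user=(?P<user>[^,]*),db=(?P<db>[^,]*),app=(?P<app>[^,]*),client=(?P<client>\S*)"
)


def failed_logins_by_client(lines):
    """Count 'authentication failed' log lines per client address."""
    counts = Counter()
    for line in lines:
        if "authentication failed" not in line:
            continue
        match = PREFIX.search(line)
        if match:
            counts[match.group("client")] += 1
    return counts


# Illustrative sample data shaped like the configured prefix.
TEMPLATE = (
    "2025-11-01 10:00:00 UTC [101]: [1-1] "
    "user=auth_user,db=auth_db,app=[unknown],client={client} {message}"
)
FAILED = 'FATAL:  password authentication failed for user "auth_user"'
SAMPLE_LINES = [
    TEMPLATE.format(client="10.0.0.5", message=FAILED),
    TEMPLATE.format(client="10.0.0.5", message=FAILED),
    TEMPLATE.format(client="10.0.0.9", message=FAILED),
    TEMPLATE.format(client="10.0.0.7", message="LOG:  connection authorized"),
]

counts = failed_logins_by_client(SAMPLE_LINES)
print(counts.most_common())
```

In practice the input could be the output of `kubectl logs` piped to the script on standard input.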
---

## Troubleshooting

### PostgreSQL Connection Issues

#### Services Can't Connect After Deployment

**Symptom:** Services show SSL/TLS errors in logs

**Solution:**
```bash
# Restart all services to pick up new TLS configuration
kubectl rollout restart deployment -n bakery-ia \
  --selector='app.kubernetes.io/component=service'
```

#### "SSL not supported" Error

**Symptom:** `PostgreSQL server rejected SSL upgrade`

**Solution:**
```bash
# Check if TLS secret exists
kubectl get secret postgres-tls -n bakery-ia

# Check if mounted in pod
kubectl describe pod <db-pod> -n bakery-ia | grep -A 5 "tls-certs"

# Restart database pod
kubectl delete pod <db-pod> -n bakery-ia
```

#### Certificate Permission Denied

**Symptom:** `FATAL: could not load server certificate file`

**Solution:**
```bash
# Check init container logs
kubectl logs -n bakery-ia <pod> -c fix-tls-permissions

# Verify certificate permissions
kubectl exec -n bakery-ia <pod> -- ls -la /tls/
# server-key.pem should have 600 permissions
```

### Redis Connection Issues

#### Connection Timeout

**Symptom:** `SSL handshake is taking longer than 60.0 seconds`

**Solution:**
```bash
# Check Redis logs
kubectl logs -n bakery-ia <redis-pod>

# Test Redis directly
kubectl exec -n bakery-ia <redis-pod> -- redis-cli \
  --tls --cert /tls/redis-cert.pem \
  --key /tls/redis-key.pem \
  --cacert /tls/ca-cert.pem \
  PING
```

### Data Persistence Issues

#### PVC Not Binding

**Symptom:** PVC stuck in "Pending" state

**Solution:**
```bash
# Check PVC status
kubectl describe pvc <pvc-name> -n bakery-ia

# Check storage class
kubectl get storageclass

# For Kind, ensure local-path provisioner is running
kubectl get pods -n local-path-storage
```

---

## Related Documentation

### Security Documentation
- [RBAC Implementation](./rbac-implementation.md) - Role-based access control
- [TLS Configuration](./tls-configuration.md) - TLS/SSL setup details
- [Security Checklist](./security-checklist.md) - Deployment checklist

### Source Reports
- [Database Security Analysis Report](../DATABASE_SECURITY_ANALYSIS_REPORT.md)
- [Security Implementation Complete](../SECURITY_IMPLEMENTATION_COMPLETE.md)

### External References
- [PostgreSQL SSL Documentation](https://www.postgresql.org/docs/17/ssl-tcp.html)
- [Redis TLS Documentation](https://redis.io/docs/manual/security/encryption/)
- [Kubernetes Secrets Encryption](https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/)
- [pgcrypto Documentation](https://www.postgresql.org/docs/17/pgcrypto.html)

---

## Quick Reference

### Common Commands

```bash
# Verify database security
kubectl get pods -n bakery-ia -l app.kubernetes.io/component=database
kubectl get pvc -n bakery-ia
kubectl get secrets -n bakery-ia | grep tls

# Check certificate expiry
kubectl exec -n bakery-ia <postgres-pod> -- \
  openssl x509 -in /tls/server-cert.pem -noout -dates

# View audit logs
kubectl logs -n bakery-ia <db-pod> | tail -n 100

# Restart all databases
kubectl rollout restart deployment -n bakery-ia \
  -l app.kubernetes.io/component=database
```

### Security Validation Checklist

- [ ] All database pods running and healthy
- [ ] All PVCs in "Bound" state
- [ ] TLS certificates mounted with correct permissions
- [ ] PostgreSQL accepts TLS connections
- [ ] Redis accepts TLS connections
- [ ] pgcrypto extension loaded
- [ ] Services connect without TLS errors
- [ ] Audit logs being generated
- [ ] Passwords are strong (32+ characters)
- [ ] Backup script tested and working

---

**Document Version:** 1.0
**Last Review:** November 2025
**Next Review:** May 2026
**Owner:** Security Team

600
docs/06-security/rbac-implementation.md
Normal file
@@ -0,0 +1,600 @@
# Role-Based Access Control (RBAC) Implementation Guide

**Last Updated:** November 2025
**Status:** Implementation in Progress
**Platform:** Bakery-IA Microservices

---

## Table of Contents

1. [Overview](#overview)
2. [Role System Architecture](#role-system-architecture)
3. [Access Control Implementation](#access-control-implementation)
4. [Service-by-Service RBAC Matrix](#service-by-service-rbac-matrix)
5. [Implementation Guidelines](#implementation-guidelines)
6. [Testing Strategy](#testing-strategy)
7. [Related Documentation](#related-documentation)

---

## Overview

This guide provides comprehensive information about implementing Role-Based Access Control (RBAC) across the Bakery-IA platform, consisting of 15 microservices with 250+ API endpoints.

### Key Components

- **4 User Roles:** Viewer → Member → Admin → Owner (hierarchical)
- **3 Subscription Tiers:** Starter → Professional → Enterprise
- **250+ API Endpoints:** Requiring granular access control
- **Tenant Isolation:** All services enforce tenant-level data isolation

### Implementation Status

**Implemented:**
- ✅ JWT authentication across all services
- ✅ Tenant isolation via path parameters
- ✅ Basic admin role checks in auth service
- ✅ Subscription tier checking framework

**In Progress:**
- 🔧 Role decorators on service endpoints
- 🔧 Subscription tier enforcement on premium features
- 🔧 Fine-grained resource permissions
- 🔧 Audit logging for sensitive operations

---

## Role System Architecture

### User Role Hierarchy

Defined in `shared/auth/access_control.py`:

```python
from enum import Enum


class UserRole(Enum):
    VIEWER = "viewer"      # Read-only access
    MEMBER = "member"      # Read + basic write operations
    ADMIN = "admin"        # Full operational access
    OWNER = "owner"        # Full control including tenant settings


ROLE_HIERARCHY = {
    UserRole.VIEWER: 1,
    UserRole.MEMBER: 2,
    UserRole.ADMIN: 3,
    UserRole.OWNER: 4,
}
```

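The numeric ranks in `ROLE_HIERARCHY` make "at least this role" checks a simple comparison. The sketch below restates the enum so that it runs on its own and adds a hypothetical helper, `has_minimum_role`, which is an assumption for illustration and may not match what `shared/auth/access_control.py` actually exposes.

```python
from enum import Enum


class UserRole(Enum):
    VIEWER = "viewer"
    MEMBER = "member"
    ADMIN = "admin"
    OWNER = "owner"


ROLE_HIERARCHY = {
    UserRole.VIEWER: 1,
    UserRole.MEMBER: 2,
    UserRole.ADMIN: 3,
    UserRole.OWNER: 4,
}


def has_minimum_role(user_role: UserRole, required: UserRole) -> bool:
    """Hypothetical helper: True when user_role ranks at or above required."""
    return ROLE_HIERARCHY[user_role] >= ROLE_HIERARCHY[required]


print(has_minimum_role(UserRole.ADMIN, UserRole.MEMBER))
print(has_minimum_role(UserRole.VIEWER, UserRole.MEMBER))
```

Role strings arriving from a JWT claim can be converted with `UserRole("admin")`, which raises `ValueError` for unknown values.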
### Permission Matrix by Action

| Action Type | Viewer | Member | Admin | Owner |
|-------------|--------|--------|-------|-------|
| Read data | ✓ | ✓ | ✓ | ✓ |
| Create records | ✗ | ✓ | ✓ | ✓ |
| Update records | ✗ | ✓ | ✓ | ✓ |
| Delete records | ✗ | ✗ | ✓ | ✓ |
| Manage users | ✗ | ✗ | ✓ | ✓ |
| Configure settings | ✗ | ✗ | ✓ | ✓ |
| Billing/subscription | ✗ | ✗ | ✗ | ✓ |
| Delete tenant | ✗ | ✗ | ✗ | ✓ |

### Subscription Tier System
|
||||
|
||||
```python
|
||||
class SubscriptionTier(Enum):
|
||||
STARTER = "starter" # Basic features
|
||||
PROFESSIONAL = "professional" # Advanced analytics & ML
|
||||
ENTERPRISE = "enterprise" # Full feature set + priority support
|
||||
|
||||
TIER_HIERARCHY = {
|
||||
SubscriptionTier.STARTER: 1,
|
||||
SubscriptionTier.PROFESSIONAL: 2,
|
||||
SubscriptionTier.ENTERPRISE: 3,
|
||||
}
|
||||
```
|
||||
|
||||
### Tier Features Matrix
|
||||
|
||||
| Feature | Starter | Professional | Enterprise |
|---------|---------|--------------|------------|
| Basic Inventory | ✓ | ✓ | ✓ |
| Basic Sales | ✓ | ✓ | ✓ |
| Basic Recipes | ✓ | ✓ | ✓ |
| ML Forecasting | ✓ (7-day) | ✓ (30+ day) | ✓ (unlimited) |
| Model Training | ✓ (1/day, 1k rows) | ✓ (5/day, 10k rows) | ✓ (unlimited) |
| Advanced Analytics | ✗ | ✓ | ✓ |
| Custom Reports | ✗ | ✓ | ✓ |
| Production Optimization | ✓ (basic) | ✓ (advanced) | ✓ (AI-powered) |
| Historical Data | 7 days | 90 days | Unlimited |
| Multi-location | 1 | 2 | Unlimited |
| API Access | ✗ | ✗ | ✓ |
| Priority Support | ✗ | ✗ | ✓ |
| Max Users | 5 | 20 | Unlimited |
| Max Products | 50 | 500 | Unlimited |
|
||||
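The numeric rows of the matrix above (users, products, history, locations) can be kept in one lookup structure so that every service enforces the same caps. The sketch below is illustrative only: `TierLimits`, `TIER_LIMITS`, and `within_limit` are hypothetical names, `None` stands for "Unlimited", and falling back to Starter limits for an unknown tier is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierLimits:
    max_users: Optional[int]      # None means unlimited
    max_products: Optional[int]
    history_days: Optional[int]
    max_locations: Optional[int]


# Values copied from the Tier Features Matrix
TIER_LIMITS = {
    "starter": TierLimits(max_users=5, max_products=50, history_days=7, max_locations=1),
    "professional": TierLimits(max_users=20, max_products=500, history_days=90, max_locations=2),
    "enterprise": TierLimits(max_users=None, max_products=None, history_days=None, max_locations=None),
}


def within_limit(tier: str, limit_name: str, requested: int) -> bool:
    """Return True if `requested` fits under the named cap for this tier."""
    # Assumption: unknown tiers are treated as the most restrictive tier.
    limits = TIER_LIMITS.get(tier, TIER_LIMITS["starter"])
    cap = getattr(limits, limit_name)
    return cap is None or requested <= cap
```

For example, `within_limit("starter", "max_users", 6)` would reject a sixth user on a Starter tenant.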
|
||||
---
|
||||
|
||||
## Access Control Implementation
|
||||
|
||||
### Available Decorators
|
||||
|
||||
The platform provides these decorators in `shared/auth/access_control.py`:
|
||||
|
||||
#### Subscription Tier Enforcement
|
||||
```python
# Require specific subscription tier(s)
@require_subscription_tier(['professional', 'enterprise'])
async def advanced_analytics(...):
    pass

# Convenience decorators
@enterprise_tier_required
async def enterprise_feature(...):
    pass

@analytics_tier_required  # Requires professional or enterprise
async def analytics_endpoint(...):
    pass
```
|
||||
|
||||
#### Role-Based Enforcement
|
||||
```python
# Require specific role(s)
@require_user_role(['admin', 'owner'])
async def delete_resource(...):
    pass

# Convenience decorators
@admin_role_required
async def admin_only(...):
    pass

@owner_role_required
async def owner_only(...):
    pass
```
|
||||
|
||||
#### Combined Enforcement
|
||||
```python
# Require both tier and role
@require_tier_and_role(['professional', 'enterprise'], ['admin', 'owner'])
async def premium_admin_feature(...):
    pass
```
|
||||
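To make the decorator pattern above more concrete, here is a rough sketch of how a role-checking decorator for async handlers could be built. It is not the code in `shared/auth/access_control.py`: the name `require_user_role_sketch`, the `InsufficientRole` exception (a stand-in for an HTTP 403 response), and the assumption that the handler receives a `current_user` keyword argument with a `"role"` key are all hypothetical.

```python
import asyncio
import functools
from typing import Any, Awaitable, Callable, Dict, List


class InsufficientRole(Exception):
    """Stand-in for the HTTP 403 a real service would return."""


def require_user_role_sketch(allowed_roles: List[str]) -> Callable:
    """Build a decorator that rejects callers whose role is not allowed."""

    def decorator(func: Callable[..., Awaitable[Any]]) -> Callable[..., Awaitable[Any]]:
        @functools.wraps(func)
        async def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Assumption: the authenticated user arrives as a keyword argument.
            current_user: Dict[str, Any] = kwargs.get("current_user") or {}
            if current_user.get("role") not in allowed_roles:
                raise InsufficientRole(f"requires one of {allowed_roles}")
            return await func(*args, **kwargs)

        return wrapper

    return decorator


@require_user_role_sketch(["admin", "owner"])
async def delete_resource(resource_id: str, current_user: Dict[str, Any]) -> str:
    return f"deleted {resource_id}"
```

Calling `asyncio.run(delete_resource("r1", current_user={"role": "member"}))` raises `InsufficientRole`, while the same call with an admin user runs the handler. `functools.wraps` preserves the handler's signature metadata, which matters when a framework inspects it for dependency injection.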
|
||||
### FastAPI Dependencies
|
||||
|
||||
Available in `shared/auth/tenant_access.py`:
|
||||
|
||||
```python
from typing import Dict

from fastapi import Depends
from shared.auth.tenant_access import (
    get_current_user_dep,
    verify_tenant_access_dep,
    verify_tenant_permission_dep
)

# Basic authentication
@router.get("/{tenant_id}/resource")
async def get_resource(
    tenant_id: str,
    current_user: Dict = Depends(get_current_user_dep)
):
    pass

# Tenant access verification
@router.get("/{tenant_id}/resource")
async def get_resource(
    tenant_id: str = Depends(verify_tenant_access_dep)
):
    pass

# Resource permission check
@router.delete("/{tenant_id}/resource/{id}")
async def delete_resource(
    tenant_id: str = Depends(verify_tenant_permission_dep("resource", "delete"))
):
    pass
```
|
||||
|
||||
---
|
||||
|
||||
## Service-by-Service RBAC Matrix
|
||||
|
||||
### Authentication Service
|
||||
|
||||
**Critical Operations:**
|
||||
- User deletion requires **Admin** role + audit logging
|
||||
- Password changes should enforce strong password policy
|
||||
- Email verification prevents account takeover
|
||||
|
||||
| Endpoint | Method | Min Role | Min Tier | Notes |
|----------|--------|----------|----------|-------|
| `/register` | POST | Public | Any | Rate limited |
| `/login` | POST | Public | Any | Rate limited (3-5 attempts) |
| `/delete/{user_id}` | DELETE | **Admin** | Any | 🔴 CRITICAL - Audit logged |
| `/change-password` | POST | Authenticated | Any | Own account only |
| `/profile` | GET/PUT | Authenticated | Any | Own account only |
|
||||
|
||||
**Recommendations:**
|
||||
- ✅ IMPLEMENTED: Admin role check on deletion
|
||||
- 🔧 ADD: Rate limiting on login/register
|
||||
- 🔧 ADD: Audit log for user deletion
|
||||
- 🔧 ADD: MFA for admin accounts
|
||||
- 🔧 ADD: Password strength validation
|
||||
|
||||
### Tenant Service
|
||||
|
||||
**Critical Operations:**
|
||||
- Tenant deletion/deactivation (Owner only)
|
||||
- Subscription changes (Owner only)
|
||||
- Role modifications (Admin+, prevent owner changes)
|
||||
- Member removal (Admin+)
|
||||
|
||||
| Endpoint | Method | Min Role | Min Tier | Notes |
|----------|--------|----------|----------|-------|
| `/{tenant_id}` | GET | **Viewer** | Any | Tenant member |
| `/{tenant_id}` | PUT | **Admin** | Any | Admin+ only |
| `/{tenant_id}/deactivate` | POST | **Owner** | Any | 🔴 CRITICAL - Owner only |
| `/{tenant_id}/members` | GET | **Viewer** | Any | View team |
| `/{tenant_id}/members` | POST | **Admin** | Any | Invite users |
| `/{tenant_id}/members/{user_id}/role` | PUT | **Admin** | Any | Change roles |
| `/{tenant_id}/members/{user_id}` | DELETE | **Admin** | Any | 🔴 Remove member |
| `/subscriptions/{tenant_id}/upgrade` | POST | **Owner** | Any | 🔴 CRITICAL |
| `/subscriptions/{tenant_id}/cancel` | POST | **Owner** | Any | 🔴 CRITICAL |
|
||||
|
||||
**Recommendations:**
|
||||
- ✅ IMPLEMENTED: Role checks for member management
|
||||
- 🔧 ADD: Prevent removing the last owner
|
||||
- 🔧 ADD: Prevent owner from changing their own role
|
||||
- 🔧 ADD: Subscription change confirmation
|
||||
- 🔧 ADD: Audit log for all tenant modifications
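The "prevent removing the last owner" recommendation above can be expressed as a small guard that runs before a member is removed or demoted. This is a sketch under stated assumptions: `assert_can_remove_member` and `LastOwnerError` are hypothetical names, and membership is modelled as a plain `user_id -> role` dictionary rather than the real tenant member model.

```python
from typing import Dict


class LastOwnerError(Exception):
    """Raised when an operation would leave a tenant without any owner."""


def assert_can_remove_member(members: Dict[str, str], user_id: str) -> None:
    """Raise LastOwnerError if removing `user_id` would leave no owner.

    `members` maps user_id -> role name ("viewer", "member", "admin", "owner").
    """
    if members.get(user_id) != "owner":
        return  # Removing a non-owner never affects the owner count.
    owner_count = sum(1 for role in members.values() if role == "owner")
    if owner_count <= 1:
        raise LastOwnerError("cannot remove the last owner of a tenant")
```

The same guard could back the role-change endpoint, since demoting the only owner has the same effect as removing them.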
|
||||
|
||||
### Sales Service
|
||||
|
||||
**Critical Operations:**
|
||||
- Sales record deletion (affects financial reports)
|
||||
- Product deletion (affects historical data)
|
||||
- Bulk imports (data integrity)
|
||||
|
||||
| Endpoint | Method | Min Role | Min Tier | Notes |
|----------|--------|----------|----------|-------|
| `/{tenant_id}/sales` | GET | **Viewer** | Any | Read sales data |
| `/{tenant_id}/sales` | POST | **Member** | Any | Create sales |
| `/{tenant_id}/sales/{id}` | DELETE | **Admin** | Any | 🔴 Affects reports |
| `/{tenant_id}/products/{id}` | DELETE | **Admin** | Any | 🔴 Affects history |
| `/{tenant_id}/analytics/*` | GET | **Viewer** | **Professional** | 💰 Premium |
|
||||
|
||||
**Recommendations:**
|
||||
- 🔧 ADD: Soft delete for sales records (audit trail)
|
||||
- 🔧 ADD: Subscription tier check on analytics endpoints
|
||||
- 🔧 ADD: Prevent deletion of products with sales history
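As an illustration of the soft-delete recommendation above, the sketch below marks a sales record as deleted instead of removing it, so the audit trail survives. It uses an in-memory dataclass rather than the service's real database model; `SaleRecord`, `soft_delete`, and `active_sales` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class SaleRecord:
    sale_id: str
    amount: float
    deleted_at: Optional[datetime] = None  # None means the record is live
    deleted_by: Optional[str] = None


def soft_delete(record: SaleRecord, user_id: str) -> None:
    """Mark the record as deleted, keeping the first deletion's metadata."""
    if record.deleted_at is None:
        record.deleted_at = datetime.now(timezone.utc)
        record.deleted_by = user_id


def active_sales(records: List[SaleRecord]) -> List[SaleRecord]:
    """Return only records that have not been soft-deleted."""
    return [r for r in records if r.deleted_at is None]
```

Reports would read through a filter like `active_sales`, while auditors can still see who deleted what and when.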
|
||||
|
||||
### Inventory Service
|
||||
|
||||
**Critical Operations:**
|
||||
- Ingredient deletion (affects recipes)
|
||||
- Manual stock adjustments (inventory manipulation)
|
||||
- Compliance record deletion (regulatory violation)
|
||||
|
||||
| Endpoint | Method | Min Role | Min Tier | Notes |
|----------|--------|----------|----------|-------|
| `/{tenant_id}/ingredients` | GET | **Viewer** | Any | List ingredients |
| `/{tenant_id}/ingredients/{id}` | DELETE | **Admin** | Any | 🔴 Affects recipes |
| `/{tenant_id}/stock/adjustments` | POST | **Admin** | Any | 🔴 Manual adjustment |
| `/{tenant_id}/analytics/*` | GET | **Viewer** | **Professional** | 💰 Premium |
| `/{tenant_id}/reports/cost-analysis` | GET | **Admin** | **Professional** | 💰 Sensitive |
|
||||
|
||||
**Recommendations:**
|
||||
- 🔧 ADD: Prevent deletion of ingredients used in recipes
|
||||
- 🔧 ADD: Audit log for all stock adjustments
|
||||
- 🔧 ADD: Compliance records cannot be deleted
|
||||
- 🔧 ADD: Role check: only Admin+ can see cost data
|
||||
|
||||
### Production Service
|
||||
|
||||
**Critical Operations:**
|
||||
- Batch deletion (affects inventory and tracking)
|
||||
- Schedule changes (affects production timeline)
|
||||
- Quality check modifications (compliance)
|
||||
|
||||
| Endpoint | Method | Min Role | Min Tier | Notes |
|----------|--------|----------|----------|-------|
| `/{tenant_id}/batches` | GET | **Viewer** | Any | View batches |
| `/{tenant_id}/batches/{id}` | DELETE | **Admin** | Any | 🔴 Affects tracking |
| `/{tenant_id}/schedules/{id}` | PUT | **Admin** | Any | Schedule changes |
| `/{tenant_id}/capacity/optimize` | POST | **Admin** | Any | Basic optimization |
| `/{tenant_id}/efficiency-trends` | GET | **Viewer** | **Professional** | 💰 Historical trends |
| `/{tenant_id}/capacity-analysis` | GET | **Admin** | **Professional** | 💰 Advanced analysis |
|
||||
|
||||
**Tier-Based Features:**
|
||||
- **Starter:** Basic capacity, 7-day history, simple optimization
|
||||
- **Professional:** Advanced metrics, 90-day history, advanced algorithms
|
||||
- **Enterprise:** Predictive maintenance, unlimited history, AI-powered
|
||||
|
||||
**Recommendations:**
|
||||
- 🔧 ADD: Optimization depth limits per tier
|
||||
- 🔧 ADD: Historical data limits (7/90/unlimited days)
|
||||
- 🔧 ADD: Prevent deletion of completed batches
|
||||
|
||||
### Forecasting Service
|
||||
|
||||
**Critical Operations:**
|
||||
- Forecast generation (consumes ML resources)
|
||||
- Bulk operations (resource intensive)
|
||||
- Scenario creation (computational cost)
|
||||
|
||||
| Endpoint | Method | Min Role | Min Tier | Notes |
|----------|--------|----------|----------|-------|
| `/{tenant_id}/forecasts` | GET | **Viewer** | Any | View forecasts |
| `/{tenant_id}/forecasts/generate` | POST | **Admin** | Any | Trigger ML forecast |
| `/{tenant_id}/scenarios` | GET | **Viewer** | **Enterprise** | 💰 Scenario modeling |
| `/{tenant_id}/scenarios` | POST | **Admin** | **Enterprise** | 💰 Create scenario |
| `/{tenant_id}/analytics/accuracy` | GET | **Viewer** | **Professional** | 💰 Model metrics |
|
||||
|
||||
**Tier-Based Limits:**
|
||||
- **Starter:** 7-day forecasts, 10/day quota
|
||||
- **Professional:** 30+ day forecasts, 100/day quota, accuracy metrics
|
||||
- **Enterprise:** Unlimited forecasts, scenario modeling, custom parameters
|
||||
|
||||
**Recommendations:**
|
||||
- 🔧 ADD: Forecast horizon limits per tier
|
||||
- 🔧 ADD: Rate limiting based on tier (ML cost)
|
||||
- 🔧 ADD: Quota limits per subscription tier
|
||||
- 🔧 ADD: Scenario modeling only for Enterprise
|
||||
|
||||
### Training Service
|
||||
|
||||
**Critical Operations:**
|
||||
- Model training (expensive ML operations)
|
||||
- Model deployment (affects production forecasts)
|
||||
- Model retraining (overwrites existing models)
|
||||
|
||||
| Endpoint | Method | Min Role | Min Tier | Notes |
|----------|--------|----------|----------|-------|
| `/{tenant_id}/training-jobs` | POST | **Admin** | Any | Start training |
| `/{tenant_id}/training-jobs/{id}/cancel` | POST | **Admin** | Any | Cancel training |
| `/{tenant_id}/models/{id}/deploy` | POST | **Admin** | Any | 🔴 Deploy model |
| `/{tenant_id}/models/{id}/artifacts` | GET | **Admin** | **Enterprise** | 💰 Download artifacts |
| `/ws/{tenant_id}/training` | WebSocket | **Admin** | Any | Real-time updates |
|
||||
|
||||
**Tier-Based Quotas:**
|
||||
- **Starter:** 1 training job/day, 1k rows max, simple Prophet
|
||||
- **Professional:** 5 jobs/day, 10k rows max, model versioning
|
||||
- **Enterprise:** Unlimited jobs, unlimited rows, custom parameters
|
||||
|
||||
**Recommendations:**
|
||||
- 🔧 ADD: Training quota per subscription tier
|
||||
- 🔧 ADD: Dataset size limits per tier
|
||||
- 🔧 ADD: Queue priority based on subscription
|
||||
- 🔧 ADD: Artifact download only for Enterprise
|
||||
|
||||
### Orders Service
|
||||
|
||||
**Critical Operations:**
|
||||
- Order cancellation (affects production and customer)
|
||||
- Customer deletion (GDPR compliance required)
|
||||
- Procurement scheduling (affects inventory)
|
||||
|
||||
| Endpoint | Method | Min Role | Min Tier | Notes |
|----------|--------|----------|----------|-------|
| `/{tenant_id}/orders` | GET | **Viewer** | Any | View orders |
| `/{tenant_id}/orders/{id}/cancel` | POST | **Admin** | Any | 🔴 Cancel order |
| `/{tenant_id}/customers/{id}` | DELETE | **Admin** | Any | 🔴 GDPR compliance |
| `/{tenant_id}/procurement/requirements` | GET | **Admin** | **Professional** | 💰 Planning |
| `/{tenant_id}/procurement/schedule` | POST | **Admin** | **Professional** | 💰 Scheduling |
|
||||
|
||||
**Recommendations:**
|
||||
- 🔧 ADD: Order cancellation requires reason/notes
|
||||
- 🔧 ADD: Customer deletion with GDPR-compliant export
|
||||
- 🔧 ADD: Soft delete for orders (audit trail)
|
||||
|
||||
---
|
||||
|
||||
## Implementation Guidelines
|
||||
|
||||
### Step 1: Add Role Decorators
|
||||
|
||||
```python
from typing import Dict

from fastapi import Depends

from shared.auth.access_control import require_user_role
from shared.auth.tenant_access import get_current_user_dep

@router.delete("/{tenant_id}/sales/{sale_id}")
@require_user_role(['admin', 'owner'])
async def delete_sale(
    tenant_id: str,
    sale_id: str,
    current_user: Dict = Depends(get_current_user_dep)
):
    # Existing logic...
    pass
```
|
||||
|
||||
### Step 2: Add Subscription Tier Checks
|
||||
|
||||
```python
from typing import Dict

from fastapi import Depends, HTTPException

from shared.auth.access_control import require_user_role
from shared.auth.tenant_access import get_current_user_dep
from shared.rate_limit import check_quota

@router.post("/{tenant_id}/forecasts/generate")
@require_user_role(['admin', 'owner'])
async def generate_forecast(
    tenant_id: str,
    horizon_days: int,
    current_user: Dict = Depends(get_current_user_dep)
):
    # Check tier-based limits (unknown tiers fall back to starter limits)
    tier = current_user.get('subscription_tier', 'starter')
    max_horizon = {
        'starter': 7,
        'professional': 90,
        'enterprise': 365
    }
    tier_horizon = max_horizon.get(tier, max_horizon['starter'])

    if horizon_days > tier_horizon:
        raise HTTPException(
            status_code=402,
            detail=f"Forecast horizon limited to {tier_horizon} days for {tier} tier"
        )

    # Check daily quota (None = unlimited)
    daily_quota = {'starter': 10, 'professional': 100, 'enterprise': None}
    tier_quota = daily_quota.get(tier, daily_quota['starter'])
    if not await check_quota(tenant_id, 'forecasts', tier_quota):
        raise HTTPException(
            status_code=429,
            detail=f"Daily forecast quota exceeded for {tier} tier"
        )

    # Existing logic...
```
|
||||
|
||||
### Step 3: Add Audit Logging
|
||||
|
||||
```python
from typing import Dict

from fastapi import Depends

from shared.audit import log_audit_event
from shared.auth.access_control import require_user_role
from shared.auth.tenant_access import get_current_user_dep

@router.delete("/{tenant_id}/customers/{customer_id}")
@require_user_role(['admin', 'owner'])
async def delete_customer(
    tenant_id: str,
    customer_id: str,
    current_user: Dict = Depends(get_current_user_dep)
):
    # Existing deletion logic...

    # Add audit log
    await log_audit_event(
        tenant_id=tenant_id,
        user_id=current_user["user_id"],
        action="customer.delete",
        resource_type="customer",
        resource_id=customer_id,
        severity="high"
    )
```
|
||||
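The fields passed to `log_audit_event` above suggest a structured audit record. The sketch below shows one plausible shape for such a record; it is not the implementation in `shared/audit`. The function name `build_audit_event`, the severity set, and the added `timestamp` field are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict

# Assumed severity levels; the source only shows "high".
ALLOWED_SEVERITIES = {"low", "medium", "high"}


def build_audit_event(
    tenant_id: str,
    user_id: str,
    action: str,
    resource_type: str,
    resource_id: str,
    severity: str = "medium",
) -> Dict[str, Any]:
    """Assemble a JSON-serialisable audit record for a sensitive operation."""
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unknown severity: {severity}")
    return {
        "tenant_id": tenant_id,
        "user_id": user_id,
        "action": action,
        "resource_type": resource_type,
        "resource_id": resource_id,
        "severity": severity,
        # UTC timestamps keep records comparable across services.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def to_log_line(event: Dict[str, Any]) -> str:
    """Serialise with sorted keys so log lines are stable and easy to diff."""
    return json.dumps(event, sort_keys=True)
```

A record built this way can be written to a log stream or a database table without further transformation.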
|
||||
### Step 4: Implement Rate Limiting
|
||||
|
||||
```python
from typing import Dict

from fastapi import Depends, HTTPException

from shared.auth.access_control import require_user_role
from shared.auth.tenant_access import get_current_user_dep
from shared.rate_limit import check_quota

@router.post("/{tenant_id}/training-jobs")
@require_user_role(['admin', 'owner'])
async def create_training_job(
    tenant_id: str,
    dataset_rows: int,
    current_user: Dict = Depends(get_current_user_dep)
):
    tier = current_user.get('subscription_tier', 'starter')

    # Check daily quota (None = unlimited; unknown tiers get starter limits)
    daily_limits = {'starter': 1, 'professional': 5, 'enterprise': None}
    daily_limit = daily_limits.get(tier, daily_limits['starter'])
    if not await check_quota(tenant_id, 'training_jobs', daily_limit, period=86400):
        raise HTTPException(
            status_code=429,
            detail=f"Daily training job limit reached for {tier} tier ({daily_limit}/day)"
        )

    # Check dataset size limit
    dataset_limits = {'starter': 1000, 'professional': 10000, 'enterprise': None}
    dataset_limit = dataset_limits.get(tier, dataset_limits['starter'])
    if dataset_limit is not None and dataset_rows > dataset_limit:
        raise HTTPException(
            status_code=402,
            detail=f"Dataset size limited to {dataset_limit} rows for {tier} tier"
        )

    # Existing logic...
```
|
||||
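The handler above relies on `check_quota` returning `False` once a tenant has used up its allowance for the period. The real helper in `shared/rate_limit` is not shown in this guide; the sketch below is a simplified in-memory, fixed-window stand-in (named `check_quota_sketch` to make that clear) that follows the same call shape. A production version would presumably keep its counters in a shared store such as Redis so that all service replicas see the same counts.

```python
import asyncio
import time
from typing import Dict, Optional, Tuple

# (tenant_id, resource) -> (window_start_epoch_seconds, requests_in_window)
_COUNTERS: Dict[Tuple[str, str], Tuple[float, int]] = {}


async def check_quota_sketch(
    tenant_id: str,
    resource: str,
    limit: Optional[int],
    period: int = 86400,
) -> bool:
    """Return True and count the request if the tenant is under its limit."""
    if limit is None:
        return True  # None means unlimited (e.g. Enterprise tier)

    now = time.time()
    key = (tenant_id, resource)
    window_start, count = _COUNTERS.get(key, (now, 0))

    # Start a fresh window once the previous one has fully elapsed.
    if now - window_start >= period:
        window_start, count = now, 0

    if count >= limit:
        _COUNTERS[key] = (window_start, count)
        return False

    _COUNTERS[key] = (window_start, count + 1)
    return True
```

A fixed window is the simplest scheme; it allows short bursts around window boundaries, which a sliding window or token bucket would smooth out.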
|
||||
---
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
### Unit Tests
|
||||
|
||||
```python
# Test role enforcement
def test_delete_requires_admin_role():
    response = client.delete(
        "/api/v1/tenant123/sales/sale456",
        headers={"Authorization": f"Bearer {member_token}"}
    )
    assert response.status_code == 403
    assert "insufficient_permissions" in response.json()["detail"]["error"]

# Test subscription tier enforcement
def test_forecasting_horizon_limit_starter():
    response = client.post(
        "/api/v1/tenant123/forecasts/generate",
        json={"horizon_days": 30},  # Exceeds 7-day limit
        headers={"Authorization": f"Bearer {starter_user_token}"}
    )
    assert response.status_code == 402  # Payment Required
    assert "limited to 7 days" in response.json()["detail"]

# Test training job quota
def test_training_job_daily_quota_starter():
    # First job succeeds
    response1 = client.post(
        "/api/v1/tenant123/training-jobs",
        json={"dataset_rows": 500},
        headers={"Authorization": f"Bearer {starter_admin_token}"}
    )
    assert response1.status_code == 200

    # Second job on same day fails (1/day limit)
    response2 = client.post(
        "/api/v1/tenant123/training-jobs",
        json={"dataset_rows": 500},
        headers={"Authorization": f"Bearer {starter_admin_token}"}
    )
    assert response2.status_code == 429  # Too Many Requests
```
|
||||
|
||||
### Integration Tests
|
||||
|
||||
```python
# Test tenant isolation
def test_user_cannot_access_other_tenant():
    response = client.get(
        "/api/v1/tenant456/sales",  # Different tenant
        headers={"Authorization": f"Bearer {user_token}"}
    )
    assert response.status_code == 403
```
|
||||
|
||||
### Security Tests
|
||||
|
||||
```python
# Test rate limiting
def test_training_job_rate_limit():
    # Assumes admin_token belongs to a Professional-tier tenant (5 jobs/day),
    # so the 6th request in the loop is the one that must be rejected.
    for i in range(6):
        response = client.post(
            "/api/v1/tenant123/training-jobs",
            json={"dataset_rows": 500},
            headers={"Authorization": f"Bearer {admin_token}"}
        )
    assert response.status_code == 429  # Too Many Requests
```
|
||||
|
||||
---
|
||||
|
||||
## Related Documentation
|
||||
|
||||
### Security Documentation
|
||||
- [Database Security](./database-security.md) - Database security implementation
|
||||
- [TLS Configuration](./tls-configuration.md) - TLS/SSL setup details
|
||||
- [Security Checklist](./security-checklist.md) - Deployment checklist
|
||||
|
||||
### Source Reports
|
||||
- [RBAC Analysis Report](../RBAC_ANALYSIS_REPORT.md) - Complete analysis
|
||||
|
||||
### Code References
|
||||
- `shared/auth/access_control.py` - Role and tier decorators
|
||||
- `shared/auth/tenant_access.py` - FastAPI dependencies
|
||||
- `services/tenant/app/models/tenants.py` - Tenant member model
|
||||
|
||||
---
|
||||
|
||||
**Document Version:** 1.0
|
||||
**Last Review:** November 2025
|
||||
**Next Review:** February 2026
|
||||
**Owner:** Security & Platform Team
|
||||
704
docs/06-security/security-checklist.md
Normal file
@@ -0,0 +1,704 @@
|
||||
# Security Deployment Checklist
|
||||
|
||||
**Last Updated:** November 2025
|
||||
**Status:** Production Deployment Guide
|
||||
**Security Grade Target:** A-
|
||||
|
||||
---
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Overview](#overview)
|
||||
2. [Pre-Deployment Checklist](#pre-deployment-checklist)
|
||||
3. [Deployment Steps](#deployment-steps)
|
||||
4. [Verification Checklist](#verification-checklist)
|
||||
5. [Post-Deployment Tasks](#post-deployment-tasks)
|
||||
6. [Ongoing Maintenance](#ongoing-maintenance)
|
||||
7. [Security Hardening Roadmap](#security-hardening-roadmap)
|
||||
8. [Related Documentation](#related-documentation)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
This checklist ensures all security measures are properly implemented before deploying the Bakery IA platform to production.
|
||||
|
||||
### Security Grade Targets
|
||||
|
||||
| Phase | Security Grade | Timeframe |
|-------|----------------|-----------|
| Pre-Implementation | D- | Baseline |
| Phase 1 Complete | C+ | Week 1-2 |
| Phase 2 Complete | B | Week 3-4 |
| Phase 3 Complete | A- | Week 5-6 |
| Full Hardening | A | Month 3 |
|
||||
|
||||
---
|
||||
|
||||
## Pre-Deployment Checklist
|
||||
|
||||
### Infrastructure Preparation
|
||||
|
||||
#### Certificate Infrastructure
|
||||
- [ ] Generate TLS certificates using `/infrastructure/tls/generate-certificates.sh`
|
||||
- [ ] Verify CA certificate created (10-year validity)
|
||||
- [ ] Verify PostgreSQL server certificates (3-year validity)
|
||||
- [ ] Verify Redis server certificates (3-year validity)
|
||||
- [ ] Store CA private key securely (NOT in version control)
|
||||
- [ ] Document certificate expiry dates (October 2028)
|
||||
|
||||
#### Kubernetes Cluster
|
||||
- [ ] Kubernetes cluster running (Kind, GKE, EKS, or AKS)
|
||||
- [ ] `kubectl` configured and working
|
||||
- [ ] Namespace `bakery-ia` created
|
||||
- [ ] Storage class available for PVCs
|
||||
- [ ] Sufficient resources (CPU: 4+ cores, RAM: 8GB+, Storage: 50GB+)
|
||||
|
||||
#### Secrets Management
|
||||
- [ ] Generate strong passwords (32+ characters): `openssl rand -base64 32` (32 random bytes, output as 44 base64 characters)
|
||||
- [ ] Create `.env` file with new passwords (use `.env.example` as template)
|
||||
- [ ] Update `infrastructure/kubernetes/base/secrets.yaml` with base64-encoded passwords
|
||||
- [ ] Generate AES-256 key for Kubernetes secrets encryption
|
||||
- [ ] **Verify passwords are NOT default values** (`*_pass123` is insecure!)
|
||||
- [ ] Store backup of passwords in secure password manager
|
||||
- [ ] Document password rotation schedule (every 90 days)
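For teams that script secret generation, the password items above can also be handled in Python. This is a minimal sketch, not part of the platform's tooling: `generate_password` and `is_acceptable` are hypothetical helpers, and the default-password check only covers the `*_pass123` pattern named in this checklist.

```python
import secrets

# Suffix of the insecure default passwords called out in this checklist
DEFAULT_SUFFIX = "_pass123"


def generate_password(num_bytes: int = 24) -> str:
    """Return a URL-safe random password.

    24 random bytes encode to exactly 32 URL-safe base64 characters.
    """
    return secrets.token_urlsafe(num_bytes)


def is_acceptable(password: str, min_length: int = 32) -> bool:
    """Reject short passwords and the known default pattern."""
    return len(password) >= min_length and not password.endswith(DEFAULT_SUFFIX)
```

The `secrets` module draws from the operating system's cryptographically secure random source, which is the property the checklist asks for.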
|
||||
|
||||
### Security Configuration Files
|
||||
|
||||
#### Database Security
|
||||
- [ ] PostgreSQL TLS secret created: `postgres-tls-secret.yaml`
|
||||
- [ ] Redis TLS secret created: `redis-tls-secret.yaml`
|
||||
- [ ] PostgreSQL logging ConfigMap created: `postgres-logging-config.yaml`
|
||||
- [ ] PostgreSQL init ConfigMap includes pgcrypto extension
|
||||
|
||||
#### Application Security
|
||||
- [ ] All database URLs include `?ssl=require` parameter
|
||||
- [ ] Redis URLs use `rediss://` protocol
|
||||
- [ ] Service-to-service authentication configured
|
||||
- [ ] CORS configured for frontend
|
||||
- [ ] Rate limiting enabled on authentication endpoints
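The first two items above (forcing `ssl=require` on database URLs and `rediss://` on Redis URLs) are easy to enforce in code at startup. The sketch below uses only the standard library; `enforce_ssl` and `enforce_rediss` are hypothetical helper names, and the example URLs in the usage note are placeholders rather than real connection strings.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def enforce_ssl(database_url: str) -> str:
    """Return the URL with ssl=require set, replacing any existing ssl value."""
    parts = urlsplit(database_url)
    query = dict(parse_qsl(parts.query))
    query["ssl"] = "require"
    return urlunsplit(parts._replace(query=urlencode(query)))


def enforce_rediss(redis_url: str) -> str:
    """Upgrade a plaintext redis:// URL to the TLS rediss:// scheme."""
    parts = urlsplit(redis_url)
    if parts.scheme == "redis":
        parts = parts._replace(scheme="rediss")
    return urlunsplit(parts)
```

For instance, `enforce_ssl("postgresql+asyncpg://user:pw@auth-db:5432/auth_db")` appends `?ssl=require`, and applying it a second time leaves the URL unchanged.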
|
||||
|
||||
---
|
||||
|
||||
## Deployment Steps
|
||||
|
||||
### Phase 1: Database Security (CRITICAL - Week 1)
|
||||
|
||||
**Time Required:** 2-3 hours
|
||||
|
||||
#### Step 1.1: Deploy PersistentVolumeClaims
|
||||
```bash
|
||||
# Verify PVCs exist in database YAML files
|
||||
grep -r "PersistentVolumeClaim" infrastructure/kubernetes/base/components/databases/
|
||||
|
||||
# Apply database deployments (includes PVCs)
|
||||
kubectl apply -f infrastructure/kubernetes/base/components/databases/
|
||||
|
||||
# Verify PVCs are bound
|
||||
kubectl get pvc -n bakery-ia
|
||||
```
|
||||
|
||||
**Expected:** 15 PVCs (14 PostgreSQL + 1 Redis) in "Bound" state
|
||||
|
||||
- [ ] All PostgreSQL PVCs created (2Gi each)
|
||||
- [ ] Redis PVC created
|
||||
- [ ] All PVCs in "Bound" state
|
||||
- [ ] Storage class supports dynamic provisioning
|
||||
|
||||
#### Step 1.2: Deploy TLS Certificates
|
||||
```bash
|
||||
# Create TLS secrets
|
||||
kubectl apply -f infrastructure/kubernetes/base/secrets/postgres-tls-secret.yaml
|
||||
kubectl apply -f infrastructure/kubernetes/base/secrets/redis-tls-secret.yaml
|
||||
|
||||
# Verify secrets created
|
||||
kubectl get secrets -n bakery-ia | grep tls
|
||||
```
|
||||
|
||||
**Expected:** `postgres-tls` and `redis-tls` secrets exist
|
||||
|
||||
- [ ] PostgreSQL TLS secret created
|
||||
- [ ] Redis TLS secret created
|
||||
- [ ] Secrets contain all required keys (cert, key, ca)
|
||||
|
||||
#### Step 1.3: Deploy PostgreSQL Configuration
|
||||
```bash
|
||||
# Apply PostgreSQL logging config
|
||||
kubectl apply -f infrastructure/kubernetes/base/configmaps/postgres-logging-config.yaml
|
||||
|
||||
# Apply PostgreSQL init config (pgcrypto)
|
||||
kubectl apply -f infrastructure/kubernetes/base/configs/postgres-init-config.yaml
|
||||
|
||||
# Verify ConfigMaps
|
||||
kubectl get configmap -n bakery-ia | grep postgres
|
||||
```
|
||||
|
||||
- [ ] PostgreSQL logging ConfigMap created
|
||||
- [ ] PostgreSQL init ConfigMap created (includes pgcrypto)
|
||||
- [ ] Configuration includes SSL settings
|
||||
|
||||
#### Step 1.4: Update Application Secrets
|
||||
```bash
|
||||
# Apply updated secrets with strong passwords
|
||||
kubectl apply -f infrastructure/kubernetes/base/secrets.yaml
|
||||
|
||||
# Verify secrets updated
|
||||
kubectl get secret bakery-ia-secrets -n bakery-ia -o yaml
|
||||
```
|
||||
|
||||
- [ ] All database passwords updated (32+ characters)
|
||||
- [ ] Redis password updated
|
||||
- [ ] JWT secret updated
|
||||
- [ ] Database connection URLs include SSL parameters
|
||||
|
||||
#### Step 1.5: Deploy Databases
|
||||
```bash
|
||||
# Deploy all databases
|
||||
kubectl apply -f infrastructure/kubernetes/base/components/databases/
|
||||
|
||||
# Wait for databases to be ready (may take 5-10 minutes)
|
||||
kubectl wait --for=condition=ready pod -l app.kubernetes.io/component=database -n bakery-ia --timeout=600s
|
||||
|
||||
# Check database pod status
|
||||
kubectl get pods -n bakery-ia -l app.kubernetes.io/component=database
|
||||
```
|
||||
|
||||
**Expected:** All 14 PostgreSQL + 1 Redis pods in "Running" state
|
||||
|
||||
- [ ] All 14 PostgreSQL database pods running
|
||||
- [ ] Redis pod running
|
||||
- [ ] No pod crashes or restarts
|
||||
- [ ] Init containers completed successfully
|
||||
|
||||
### Phase 2: Service Deployment (Week 2)
|
||||
|
||||
#### Step 2.1: Deploy Database Migrations
|
||||
```bash
|
||||
# Apply migration jobs
|
||||
kubectl apply -f infrastructure/kubernetes/base/migrations/
|
||||
|
||||
# Wait for migrations to complete
|
||||
kubectl wait --for=condition=complete job -l app.kubernetes.io/component=migration -n bakery-ia --timeout=600s
|
||||
|
||||
# Check migration status
|
||||
kubectl get jobs -n bakery-ia | grep migration
|
||||
```
|
||||
|
||||
**Expected:** All migration jobs show "COMPLETIONS = 1/1"
|
||||
|
||||
- [ ] All database migration jobs completed successfully
|
||||
- [ ] No migration errors in logs
|
||||
- [ ] Database schemas created
|
||||
|
||||
#### Step 2.2: Deploy Services
|
||||
```bash
|
||||
# Deploy all microservices
|
||||
kubectl apply -f infrastructure/kubernetes/base/components/services/
|
||||
|
||||
# Wait for services to be ready
|
||||
kubectl wait --for=condition=ready pod -l app.kubernetes.io/component=service -n bakery-ia --timeout=600s
|
||||
|
||||
# Check service status
|
||||
kubectl get pods -n bakery-ia -l app.kubernetes.io/component=service
|
||||
```
|
||||
|
||||
**Expected:** All 15 service pods in "Running" state
|
||||
|
||||
- [ ] All microservice pods running
|
||||
- [ ] Services connect to databases with TLS
|
||||
- [ ] No SSL/TLS errors in logs
|
||||
- [ ] Health endpoints responding
|
||||
|
||||
#### Step 2.3: Deploy Gateway and Frontend
|
||||
```bash
|
||||
# Deploy API gateway
|
||||
kubectl apply -f infrastructure/kubernetes/base/components/gateway/
|
||||
|
||||
# Deploy frontend
|
||||
kubectl apply -f infrastructure/kubernetes/base/components/frontend/
|
||||
|
||||
# Check deployment status
|
||||
kubectl get pods -n bakery-ia
|
||||
```
|
||||
|
||||
- [ ] Gateway pod running
|
||||
- [ ] Frontend pod running
|
||||
- [ ] Ingress configured (if applicable)
|
||||
|
||||
### Phase 3: Security Hardening (Week 3-4)
|
||||
|
||||
#### Step 3.1: Enable Kubernetes Secrets Encryption
|
||||
```bash
|
||||
# REQUIRES CLUSTER RECREATION
|
||||
|
||||
# Delete existing cluster (WARNING: destroys all data)
|
||||
kind delete cluster --name bakery-ia-local
|
||||
|
||||
# Create cluster with encryption enabled
|
||||
kind create cluster --config kind-config.yaml
|
||||
|
||||
# Re-deploy entire stack
|
||||
kubectl apply -f infrastructure/kubernetes/base/namespace.yaml
|
||||
./scripts/apply-security-changes.sh
|
||||
```
|
||||
|
||||
- [ ] Encryption configuration file created
|
||||
- [ ] Kind cluster configured with encryption
|
||||
- [ ] All secrets encrypted at rest
|
||||
- [ ] Encryption verified (check kube-apiserver logs)
|
||||
|
||||
#### Step 3.2: Configure Audit Logging
|
||||
```bash
# Verify PostgreSQL logging enabled
kubectl exec -n bakery-ia <postgres-pod> -- sh -c \
  'psql -U $POSTGRES_USER -d $POSTGRES_DB -c "SHOW log_statement;"'

# Should show: all
```
|
||||
|
||||
- [ ] PostgreSQL logs all statements
|
||||
- [ ] Connection logging enabled
|
||||
- [ ] Query duration logging enabled
|
||||
- [ ] Log rotation configured
|
||||
|
||||
#### Step 3.3: Enable pgcrypto Extension
|
||||
```bash
# Verify pgcrypto installed
kubectl exec -n bakery-ia <postgres-pod> -- sh -c \
  'psql -U $POSTGRES_USER -d $POSTGRES_DB -c "SELECT * FROM pg_extension WHERE extname='"'"'pgcrypto'"'"';"'

# Should return one row
```
|
||||
|
||||
- [ ] pgcrypto extension available in all databases
|
||||
- [ ] Encryption functions tested
|
||||
- [ ] Documentation for using column-level encryption provided
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
### Database Security Verification
|
||||
|
||||
#### PostgreSQL TLS
|
||||
```bash
# 1. Verify SSL enabled
kubectl exec -n bakery-ia auth-db-<pod-id> -- sh -c \
  'psql -U auth_user -d auth_db -c "SHOW ssl;"'
# Expected: on

# 2. Verify TLS version
kubectl exec -n bakery-ia auth-db-<pod-id> -- sh -c \
  'psql -U auth_user -d auth_db -c "SHOW ssl_min_protocol_version;"'
# Expected: TLSv1.2

# 3. Verify certificate permissions
kubectl exec -n bakery-ia auth-db-<pod-id> -- ls -la /tls/
# Expected: server-key.pem = 600, server-cert.pem = 644

# 4. Check certificate expiry
kubectl exec -n bakery-ia auth-db-<pod-id> -- \
  openssl x509 -in /tls/server-cert.pem -noout -dates
# Expected: notAfter=Oct 17 00:00:00 2028 GMT
```
|
||||
|
||||
**Verification Checklist:**
|
||||
- [ ] SSL enabled on all 14 PostgreSQL databases
|
||||
- [ ] TLS 1.2+ enforced
|
||||
- [ ] Certificates have correct permissions (key=600, cert=644)
|
||||
- [ ] Certificates valid until 2028
|
||||
- [ ] All certificates owned by postgres user
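The certificate expiry check above can be turned into a number that a monitoring job can alert on. The sketch below parses the `notAfter=` line printed by `openssl x509 -noout -dates` and returns the days remaining; `days_until_expiry` is a hypothetical helper, and month-name parsing with `%b` assumes an English (C) locale.

```python
from datetime import datetime, timezone


def days_until_expiry(openssl_dates_output: str, now: datetime) -> int:
    """Return whole days between `now` and the certificate's notAfter date.

    `openssl_dates_output` is the text printed by `openssl x509 -noout -dates`.
    `now` must be timezone-aware.
    """
    for line in openssl_dates_output.splitlines():
        if line.startswith("notAfter="):
            raw = line.split("=", 1)[1].strip()
            # openssl prints dates like "Oct 17 00:00:00 2028 GMT"
            expiry = datetime.strptime(raw, "%b %d %H:%M:%S %Y %Z")
            expiry = expiry.replace(tzinfo=timezone.utc)
            return (expiry - now).days
    raise ValueError("no notAfter line found")
```

A scheduled job could call this with `datetime.now(timezone.utc)` and raise an alert when the result drops below a chosen threshold, for example 30 days.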
|
||||
|
||||
#### Redis TLS
|
||||
```bash
# 1. Test Redis TLS connection
kubectl exec -n bakery-ia redis-<pod-id> -- redis-cli \
  --tls \
  --cert /tls/redis-cert.pem \
  --key /tls/redis-key.pem \
  --cacert /tls/ca-cert.pem \
  -a <redis-password> \
  ping
# Expected: PONG

# 2. Verify plaintext port disabled
kubectl exec -n bakery-ia redis-<pod-id> -- redis-cli -a <redis-password> ping
# Expected: Connection refused
```
|
||||
|
||||
**Verification Checklist:**
|
||||
- [ ] Redis responds to TLS connections
|
||||
- [ ] Plaintext connections refused
|
||||
- [ ] Password authentication working
|
||||
- [ ] No "wrong version number" errors in logs
|
||||
|
||||
#### Service Connections
|
||||
```bash
|
||||
# 1. Check migration jobs
|
||||
kubectl get jobs -n bakery-ia | grep migration
|
||||
# Expected: All show "1/1" completions
|
||||
|
||||
# 2. Check service logs for SSL enforcement
|
||||
kubectl logs -n bakery-ia auth-service-<pod-id> | grep "SSL enforcement"
|
||||
# Expected: "SSL enforcement added to database URL"
|
||||
|
||||
# 3. Check for connection errors
|
||||
kubectl logs -n bakery-ia auth-service-<pod-id> | grep -i "error" | grep -i "ssl"
|
||||
# Expected: No SSL/TLS errors
|
||||
```
|
||||
|
||||
**Verification Checklist:**
|
||||
- [ ] All migration jobs completed successfully
|
||||
- [ ] Services show SSL enforcement in logs
|
||||
- [ ] No TLS/SSL connection errors
|
||||
- [ ] All services can connect to databases
|
||||
- [ ] Health endpoints return 200 OK
|
||||
|
||||
### Data Persistence Verification
|
||||
|
||||
```bash
|
||||
# 1. Check all PVCs
|
||||
kubectl get pvc -n bakery-ia
|
||||
# Expected: 15 PVCs, all "Bound"
|
||||
|
||||
# 2. Check PVC sizes
|
||||
kubectl get pvc -n bakery-ia -o custom-columns=NAME:.metadata.name,SIZE:.spec.resources.requests.storage
|
||||
# Expected: PostgreSQL=2Gi, Redis=1Gi
|
||||
|
||||
# 3. Test data persistence (restart a database)
|
||||
kubectl delete pod auth-db-<pod-id> -n bakery-ia
|
||||
kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=auth-db -n bakery-ia --timeout=120s
|
||||
# Data should persist after restart
|
||||
```
|
||||
|
||||
**Verification Checklist:**
|
||||
- [ ] All 15 PVCs in "Bound" state
|
||||
- [ ] Correct storage sizes allocated
|
||||
- [ ] Data persists across pod restarts
|
||||
- [ ] No emptyDir volumes for databases
|
||||
|
||||
### Password Security Verification
|
||||
|
||||
```bash
|
||||
# 1. Check password strength
|
||||
kubectl get secret bakery-ia-secrets -n bakery-ia -o jsonpath='{.data.AUTH_DB_PASSWORD}' | base64 -d | wc -c
|
||||
# Expected: 32 or more characters
|
||||
|
||||
# 2. Verify passwords are NOT defaults
|
||||
kubectl get secret bakery-ia-secrets -n bakery-ia -o jsonpath='{.data.AUTH_DB_PASSWORD}' | base64 -d
|
||||
# Should NOT be: auth_pass123
|
||||
```
|
||||
|
||||
**Verification Checklist:**
|
||||
- [ ] All passwords 32+ characters
|
||||
- [ ] Passwords use cryptographically secure random generation
|
||||
- [ ] No default passwords (`*_pass123`) in use
|
||||
- [ ] Passwords backed up in secure location
|
||||
- [ ] Password rotation schedule documented
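The password checks above can also be scripted. The following is a minimal sketch, assuming alphanumeric passwords and treating anything ending in `_pass123` as an old default; the helper names `generate_password` and `passes_policy` are hypothetical and not part of the platform code.

```python
import re
import secrets
import string

# Assumption: alphanumeric passwords only, to avoid URL-escaping issues in connection strings.
ALPHABET = string.ascii_letters + string.digits
# Matches the old default style mentioned in the checklist, e.g. auth_pass123.
DEFAULT_PATTERN = re.compile(r".*_pass123$")


def generate_password(length: int = 32) -> str:
    """Return a random password drawn from the operating system's CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def passes_policy(password: str, min_length: int = 32) -> bool:
    """True when the password is long enough and is not a known default."""
    return len(password) >= min_length and not DEFAULT_PATTERN.match(password)
```

A decoded secret value can be passed to `passes_policy` instead of printing it to the terminal, which keeps the password out of shell history and scrollback.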
|
||||
|
||||
### Compliance Verification
|
||||
|
||||
**GDPR Article 32:**
|
||||
- [ ] Encryption in transit implemented (TLS)
|
||||
- [ ] Encryption at rest available (pgcrypto + K8s)
|
||||
- [ ] Privacy policy claims are accurate
|
||||
- [ ] User data access logging enabled
|
||||
|
||||
**PCI-DSS:**
|
||||
- [ ] Requirement 4: Transmission encryption (TLS) ✓
|
||||
- [ ] Requirement 3.5: Stored data protection (pgcrypto) ✓
|
||||
- [ ] Requirement 10: Access tracking (audit logs) ✓
|
||||
|
||||
**SOC 2:**
|
||||
- [ ] CC6.1: Access controls (RBAC) ✓
|
||||
- [ ] CC6.6: Transit encryption (TLS) ✓
|
||||
- [ ] CC6.7: Rest encryption (K8s + pgcrypto) ✓
|
||||
|
||||
---
|
||||
|
||||
## Post-Deployment Tasks
|
||||
|
||||
### Immediate (First 24 Hours)
|
||||
|
||||
#### Backup Configuration
|
||||
```bash
|
||||
# 1. Test backup script
|
||||
./scripts/encrypted-backup.sh
|
||||
|
||||
# 2. Verify backup created
|
||||
ls -lh /path/to/backups/
|
||||
|
||||
# 3. Test restore process
|
||||
gpg --decrypt backup_file.sql.gz.gpg | gunzip | head -n 10
|
||||
```
|
||||
|
||||
- [ ] Backup script tested and working
|
||||
- [ ] Backups encrypted with GPG
|
||||
- [ ] Restore process documented and tested
|
||||
- [ ] Backup storage location configured
|
||||
- [ ] Backup retention policy defined
|
||||
|
||||
#### Monitoring Setup
|
||||
```bash
|
||||
# 1. Set up certificate expiry monitoring
|
||||
# Add to monitoring system: Alert 90 days before October 2028
|
||||
|
||||
# 2. Set up database health checks
|
||||
# Monitor: Connection count, query performance, disk usage
|
||||
|
||||
# 3. Set up audit log monitoring
|
||||
# Monitor: Failed login attempts, privilege escalations
|
||||
```
|
||||
|
||||
- [ ] Certificate expiry alerts configured
|
||||
- [ ] Database health monitoring enabled
|
||||
- [ ] Audit log monitoring configured
|
||||
- [ ] Security event alerts configured
|
||||
- [ ] Performance monitoring enabled
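The certificate expiry alert can be sketched as a small check over the `notAfter=` line printed by `openssl x509 -noout -dates`. This is an illustrative sketch only: the function names are hypothetical, and the 90-day window is taken from the comment above.

```python
from datetime import datetime, timezone

ALERT_WINDOW_DAYS = 90  # alert window taken from the monitoring note above


def parse_not_after(line: str) -> datetime:
    """Parse a line such as 'notAfter=Oct 17 00:00:00 2028 GMT' into a UTC datetime."""
    value = line.strip().split("=", 1)[1].removesuffix(" GMT")
    return datetime.strptime(value, "%b %d %H:%M:%S %Y").replace(tzinfo=timezone.utc)


def days_until_expiry(not_after: datetime, now: datetime) -> int:
    """Whole days remaining before the certificate expires."""
    return (not_after - now).days


def should_alert(not_after: datetime, now: datetime,
                 window_days: int = ALERT_WINDOW_DAYS) -> bool:
    """True once the certificate is inside the alert window."""
    return days_until_expiry(not_after, now) <= window_days
```

In a scheduled job, `now` would be `datetime.now(timezone.utc)` and the input line would come from the `openssl` command shown in the verification steps.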
|
||||
|
||||
### First Week
|
||||
|
||||
#### Security Audit
|
||||
```bash
|
||||
# 1. Review audit logs
|
||||
kubectl logs -n bakery-ia <db-pod> | grep -i "authentication failed"
|
||||
|
||||
# 2. Review access patterns
|
||||
kubectl logs -n bakery-ia <db-pod> | grep -i "connection received"
|
||||
|
||||
# 3. Check for anomalies
|
||||
kubectl logs -n bakery-ia <db-pod> | grep -iE "(error|warning|fatal)"
|
||||
```
|
||||
|
||||
- [ ] Audit logs reviewed for suspicious activity
|
||||
- [ ] No unauthorized access attempts
|
||||
- [ ] All services connecting properly
|
||||
- [ ] No security warnings in logs
|
||||
|
||||
#### Documentation
|
||||
- [ ] Update runbooks with new security procedures
|
||||
- [ ] Document certificate rotation process
|
||||
- [ ] Document password rotation process
|
||||
- [ ] Update disaster recovery plan
|
||||
- [ ] Share security documentation with team
|
||||
|
||||
### First Month
|
||||
|
||||
#### Access Control Implementation
|
||||
- [ ] Implement role decorators on critical endpoints
|
||||
- [ ] Add subscription tier checks on premium features
|
||||
- [ ] Implement rate limiting on ML operations
|
||||
- [ ] Add audit logging for destructive operations
|
||||
- [ ] Test RBAC enforcement
|
||||
|
||||
#### Backup and Recovery
|
||||
- [ ] Set up automated daily backups (2 AM)
|
||||
- [ ] Configure backup rotation (30/90/365 days)
|
||||
- [ ] Test disaster recovery procedure
|
||||
- [ ] Document recovery time objectives (RTO)
|
||||
- [ ] Document recovery point objectives (RPO)
|
||||
|
||||
---
|
||||
|
||||
## Ongoing Maintenance
|
||||
|
||||
### Daily
|
||||
- [ ] Monitor database health (automated)
|
||||
- [ ] Check backup completion (automated)
|
||||
- [ ] Review critical alerts
|
||||
|
||||
### Weekly
|
||||
- [ ] Review audit logs for anomalies
|
||||
- [ ] Check certificate expiry dates
|
||||
- [ ] Verify backup integrity
|
||||
- [ ] Review access control logs
|
||||
|
||||
### Monthly
|
||||
- [ ] Review security posture
|
||||
- [ ] Update security documentation
|
||||
- [ ] Test backup restore process
|
||||
- [ ] Review and update RBAC policies
|
||||
- [ ] Check for security updates
|
||||
|
||||
### Quarterly (Every 90 Days)
|
||||
- [ ] **Rotate all passwords**
|
||||
- [ ] Review and update security policies
|
||||
- [ ] Conduct security audit
|
||||
- [ ] Update disaster recovery plan
|
||||
- [ ] Review compliance status
|
||||
- [ ] Security team training
|
||||
|
||||
### Annually
|
||||
- [ ] Full security assessment
|
||||
- [ ] Penetration testing
|
||||
- [ ] Compliance audit (GDPR, PCI-DSS, SOC 2)
|
||||
- [ ] Update security roadmap
|
||||
- [ ] Review and update all security documentation
|
||||
|
||||
### Before Certificate Expiry (Oct 2028 - Alert 90 Days Prior)
|
||||
- [ ] Generate new TLS certificates
|
||||
- [ ] Test new certificates in staging
|
||||
- [ ] Schedule maintenance window
|
||||
- [ ] Update Kubernetes secrets
|
||||
- [ ] Restart database pods
|
||||
- [ ] Verify new certificates working
|
||||
- [ ] Update documentation with new expiry dates
|
||||
|
||||
---
|
||||
|
||||
## Security Hardening Roadmap
|
||||
|
||||
### Completed (Security Grade: A-)
|
||||
- ✅ TLS encryption for all database connections
|
||||
- ✅ Strong password policy (32-character passwords)
|
||||
- ✅ Data persistence with PVCs
|
||||
- ✅ Kubernetes secrets encryption
|
||||
- ✅ PostgreSQL audit logging
|
||||
- ✅ pgcrypto extension for encryption at rest
|
||||
- ✅ Automated encrypted backups
|
||||
|
||||
### Phase 1: Critical Security (Weeks 1-2)
|
||||
- [ ] Add role decorators to all deletion endpoints
|
||||
- [ ] Implement owner-only checks for billing/subscription
|
||||
- [ ] Add service-to-service authentication
|
||||
- [ ] Implement audit logging for critical operations
|
||||
- [ ] Add rate limiting on authentication endpoints
|
||||
|
||||
### Phase 2: Premium Feature Gating (Weeks 3-4)
|
||||
- [ ] Implement forecast horizon limits per tier
|
||||
- [ ] Implement training job quotas per tier
|
||||
- [ ] Implement dataset size limits for ML
|
||||
- [ ] Add tier checks to advanced analytics
|
||||
- [ ] Add tier checks to scenario modeling
|
||||
- [ ] Implement usage quota tracking
|
||||
|
||||
### Phase 3: Advanced Access Control (Month 2)
|
||||
- [ ] Fine-grained resource permissions
|
||||
- [ ] Department-based access control
|
||||
- [ ] Approval workflows for critical operations
|
||||
- [ ] Data retention policies
|
||||
- [ ] GDPR data export functionality
|
||||
|
||||
### Phase 4: Infrastructure Hardening (Month 3)
|
||||
- [ ] Network policies for service isolation
|
||||
- [ ] Pod security standards (Pod Security Admission; PodSecurityPolicy was removed in Kubernetes 1.25)
|
||||
- [ ] Resource quotas and limits
|
||||
- [ ] Container image scanning
|
||||
- [ ] Secrets management with HashiCorp Vault (optional)
|
||||
|
||||
### Phase 5: Advanced Features (Month 4-6)
|
||||
- [ ] Mutual TLS (mTLS) for service-to-service
|
||||
- [ ] Database activity monitoring (DAM)
|
||||
- [ ] SIEM integration
|
||||
- [ ] Automated certificate rotation
|
||||
- [ ] Multi-region disaster recovery
|
||||
|
||||
### Long-term (6+ Months)
|
||||
- [ ] Migrate to managed database services (AWS RDS, Cloud SQL)
|
||||
- [ ] Implement HashiCorp Vault for secrets
|
||||
- [ ] Deploy Istio service mesh
|
||||
- [ ] Implement zero-trust networking
|
||||
- [ ] SOC 2 Type II certification
|
||||
|
||||
---
|
||||
|
||||
## Related Documentation
|
||||
|
||||
### Security Guides
|
||||
- [Database Security](./database-security.md) - Complete database security guide
|
||||
- [RBAC Implementation](./rbac-implementation.md) - Access control details
|
||||
- [TLS Configuration](./tls-configuration.md) - TLS/SSL setup guide
|
||||
|
||||
### Source Reports
|
||||
- [Database Security Analysis Report](../DATABASE_SECURITY_ANALYSIS_REPORT.md)
|
||||
- [Security Implementation Complete](../SECURITY_IMPLEMENTATION_COMPLETE.md)
|
||||
- [RBAC Analysis Report](../RBAC_ANALYSIS_REPORT.md)
|
||||
- [TLS Implementation Complete](../TLS_IMPLEMENTATION_COMPLETE.md)
|
||||
|
||||
### Operational Guides
|
||||
- [Backup and Recovery Guide](../operations/backup-recovery.md) (if exists)
|
||||
- [Monitoring Guide](../operations/monitoring.md) (if exists)
|
||||
- [Incident Response Plan](../operations/incident-response.md) (if exists)
|
||||
|
||||
---
|
||||
|
||||
## Quick Reference
|
||||
|
||||
### Common Verification Commands
|
||||
|
||||
```bash
|
||||
# Verify all databases running
|
||||
kubectl get pods -n bakery-ia -l app.kubernetes.io/component=database
|
||||
|
||||
# Verify all PVCs bound
|
||||
kubectl get pvc -n bakery-ia
|
||||
|
||||
# Verify TLS secrets
|
||||
kubectl get secrets -n bakery-ia | grep tls
|
||||
|
||||
# Check certificate expiry
|
||||
kubectl exec -n bakery-ia <pod> -- \
|
||||
openssl x509 -in /tls/server-cert.pem -noout -dates
|
||||
|
||||
# Test database connection
|
||||
kubectl exec -n bakery-ia <pod> -- sh -c \
|
||||
'psql -U $POSTGRES_USER -d $POSTGRES_DB -c "SELECT version();"'
|
||||
|
||||
# Test Redis connection
|
||||
kubectl exec -n bakery-ia <pod> -- sh -c \
  'redis-cli --tls --cert /tls/redis-cert.pem \
    --key /tls/redis-key.pem \
    --cacert /tls/ca-cert.pem \
    -a "$REDIS_PASSWORD" ping'
|
||||
|
||||
# View recent audit logs
|
||||
kubectl logs -n bakery-ia <db-pod> --tail=100
|
||||
|
||||
# Restart all services
|
||||
kubectl rollout restart deployment -n bakery-ia
|
||||
```
|
||||
|
||||
### Emergency Procedures
|
||||
|
||||
**Database Pod Not Starting:**
|
||||
```bash
|
||||
# 1. Check init container logs
|
||||
kubectl logs -n bakery-ia <pod> -c fix-tls-permissions
|
||||
|
||||
# 2. Check main container logs
|
||||
kubectl logs -n bakery-ia <pod>
|
||||
|
||||
# 3. Describe pod for events
|
||||
kubectl describe pod <pod> -n bakery-ia
|
||||
```
|
||||
|
||||
**Services Can't Connect to Database:**
|
||||
```bash
|
||||
# 1. Verify database is listening
|
||||
kubectl exec -n bakery-ia <db-pod> -- netstat -tlnp
|
||||
|
||||
# 2. Check service logs
|
||||
kubectl logs -n bakery-ia <service-pod> | grep -i "database\|error"
|
||||
|
||||
# 3. Restart service
|
||||
kubectl rollout restart deployment/<service> -n bakery-ia
|
||||
```
|
||||
|
||||
**Lost Database Password:**
|
||||
```bash
|
||||
# 1. Read the current password from the Kubernetes secret
|
||||
kubectl get secret bakery-ia-secrets -n bakery-ia -o jsonpath='{.data.AUTH_DB_PASSWORD}' | base64 -d
|
||||
|
||||
# 2. Or check .env file (if available)
|
||||
grep AUTH_DB_PASSWORD .env
|
||||
|
||||
# 3. Last resort: Reset password (requires database restart)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
**Document Version:** 1.0
|
||||
**Last Review:** November 2025
|
||||
**Next Review:** February 2026
|
||||
**Owner:** Security Team
|
||||
**Approval Required:** DevOps Lead, Security Lead
|
||||
738
docs/06-security/tls-configuration.md
Normal file
@@ -0,0 +1,738 @@
|
||||
# TLS/SSL Configuration Guide
|
||||
|
||||
**Last Updated:** November 2025
|
||||
**Status:** Production Ready
|
||||
**Protocol:** TLS 1.2+
|
||||
|
||||
---
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Overview](#overview)
|
||||
2. [Certificate Infrastructure](#certificate-infrastructure)
|
||||
3. [PostgreSQL TLS Configuration](#postgresql-tls-configuration)
|
||||
4. [Redis TLS Configuration](#redis-tls-configuration)
|
||||
5. [Client Configuration](#client-configuration)
|
||||
6. [Deployment](#deployment)
|
||||
7. [Verification](#verification)
|
||||
8. [Troubleshooting](#troubleshooting)
|
||||
9. [Maintenance](#maintenance)
|
||||
10. [Related Documentation](#related-documentation)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
This guide provides detailed information about TLS/SSL implementation for all database and cache connections in the Bakery IA platform.
|
||||
|
||||
### What's Encrypted
|
||||
|
||||
- ✅ **14 PostgreSQL databases** with TLS 1.2+ encryption
|
||||
- ✅ **1 Redis cache** with TLS encryption
|
||||
- ✅ **All microservice connections** to databases
|
||||
- ✅ **Self-signed CA** with 10-year validity
|
||||
- ✅ **Certificate management** via Kubernetes Secrets
|
||||
|
||||
### Security Benefits
|
||||
|
||||
- **Confidentiality:** All data in transit is encrypted
|
||||
- **Integrity:** TLS detects tampering with data in transit; full man-in-the-middle protection also requires clients to verify the server certificate
|
||||
- **Compliance:** Meets PCI-DSS, GDPR, and SOC 2 requirements
|
||||
- **Performance:** Minimal overhead (<5% CPU) with significant security gains
|
||||
|
||||
### Performance Impact
|
||||
|
||||
| Metric | Before | After | Change |
|--------|--------|-------|--------|
| Connection Latency | ~5ms | ~8-10ms | +60% to +100% (acceptable) |
| Query Performance | Baseline | Same | No change |
| Network Throughput | Baseline | -10% to -15% | TLS overhead |
| CPU Usage | Baseline | +2-5% | Encryption cost |
|
||||
|
||||
---
|
||||
|
||||
## Certificate Infrastructure
|
||||
|
||||
### Certificate Hierarchy
|
||||
|
||||
```
Root CA (10-year validity)
├── PostgreSQL Server Certificates (3-year validity)
│   └── Valid for: *.bakery-ia.svc.cluster.local
└── Redis Server Certificate (3-year validity)
    └── Valid for: redis-service.bakery-ia.svc.cluster.local
```
|
||||
|
||||
### Certificate Details
|
||||
|
||||
**Root CA:**
|
||||
- **Algorithm:** RSA 4096-bit
|
||||
- **Signature:** SHA-256
|
||||
- **Validity:** 10 years (expires 2035)
|
||||
- **Common Name:** Bakery IA Internal CA
|
||||
|
||||
**Server Certificates:**
|
||||
- **Algorithm:** RSA 4096-bit
|
||||
- **Signature:** SHA-256
|
||||
- **Validity:** 3 years (expires October 2028)
|
||||
- **Subject Alternative Names:**
|
||||
  - PostgreSQL: `*.bakery-ia.svc.cluster.local`, `localhost`
  - Redis: `redis-service.bakery-ia.svc.cluster.local`, `localhost`
|
||||
|
||||
### Certificate Files
|
||||
|
||||
```
infrastructure/tls/
├── ca/
│   ├── ca-cert.pem              # CA certificate (public)
│   └── ca-key.pem               # CA private key (KEEP SECURE!)
├── postgres/
│   ├── server-cert.pem          # PostgreSQL server certificate
│   ├── server-key.pem           # PostgreSQL private key
│   ├── ca-cert.pem              # CA for client validation
│   └── san.cnf                  # Subject Alternative Names config
├── redis/
│   ├── redis-cert.pem           # Redis server certificate
│   ├── redis-key.pem            # Redis private key
│   ├── ca-cert.pem              # CA for client validation
│   └── san.cnf                  # Subject Alternative Names config
└── generate-certificates.sh     # Regeneration script
```
|
||||
|
||||
### Generating Certificates
|
||||
|
||||
To regenerate certificates (e.g., before expiry):
|
||||
|
||||
```bash
|
||||
cd infrastructure/tls
|
||||
./generate-certificates.sh
|
||||
```
|
||||
|
||||
This script:
|
||||
1. Creates a new Certificate Authority (CA)
|
||||
2. Generates server certificates for PostgreSQL
|
||||
3. Generates server certificates for Redis
|
||||
4. Signs all certificates with the CA
|
||||
5. Outputs certificates in PEM format
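The contents of `generate-certificates.sh` are not shown here, so the following is only a sketch of the kind of `openssl` invocations such a script might run, expressed as argument lists. The validity periods, key size, and CA common name come from the certificate details above; the `.csr` file names and the `v3_req` extensions section in `san.cnf` are assumptions.

```python
CA_DAYS = 3650      # 10-year CA validity
SERVER_DAYS = 1095  # 3-year server certificate validity


def ca_commands(ca_dir: str = "ca") -> list[list[str]]:
    """Commands that create a 4096-bit RSA key and a self-signed CA certificate."""
    key, cert = f"{ca_dir}/ca-key.pem", f"{ca_dir}/ca-cert.pem"
    return [
        ["openssl", "genrsa", "-out", key, "4096"],
        ["openssl", "req", "-x509", "-new", "-key", key, "-sha256",
         "-days", str(CA_DAYS), "-subj", "/CN=Bakery IA Internal CA", "-out", cert],
    ]


def server_commands(name: str, prefix: str, ca_dir: str = "ca") -> list[list[str]]:
    """Commands that create a server key, a CSR, and a CA-signed certificate."""
    key = f"{name}/{prefix}-key.pem"
    csr = f"{name}/{prefix}.csr"        # hypothetical intermediate file
    cert = f"{name}/{prefix}-cert.pem"
    san = f"{name}/san.cnf"
    return [
        ["openssl", "genrsa", "-out", key, "4096"],
        ["openssl", "req", "-new", "-key", key, "-out", csr, "-config", san],
        ["openssl", "x509", "-req", "-in", csr,
         "-CA", f"{ca_dir}/ca-cert.pem", "-CAkey", f"{ca_dir}/ca-key.pem",
         "-CAcreateserial", "-out", cert, "-days", str(SERVER_DAYS), "-sha256",
         "-extensions", "v3_req", "-extfile", san],  # section name is an assumption
    ]
```

Each list could be passed to `subprocess.run(cmd, check=True)` from the `infrastructure/tls` directory, for example `ca_commands() + server_commands("postgres", "server") + server_commands("redis", "redis")`.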
|
||||
|
||||
---
|
||||
|
||||
## PostgreSQL TLS Configuration
|
||||
|
||||
### Server Configuration
|
||||
|
||||
PostgreSQL requires specific configuration to enable TLS:
|
||||
|
||||
**postgresql.conf:**
|
||||
```ini
|
||||
# Network Configuration
|
||||
listen_addresses = '*'
|
||||
port = 5432
|
||||
|
||||
# SSL/TLS Configuration
|
||||
ssl = on
|
||||
ssl_cert_file = '/tls/server-cert.pem'
|
||||
ssl_key_file = '/tls/server-key.pem'
|
||||
ssl_ca_file = '/tls/ca-cert.pem'
|
||||
ssl_prefer_server_ciphers = on
|
||||
ssl_min_protocol_version = 'TLSv1.2'
|
||||
|
||||
# Cipher suites (secure defaults)
|
||||
ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL'
|
||||
```
|
||||
|
||||
### Kubernetes Deployment Configuration
|
||||
|
||||
All 14 PostgreSQL deployments use this structure:
|
||||
|
||||
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: auth-db
  namespace: bakery-ia
spec:
  template:
    spec:
      securityContext:
        fsGroup: 70  # postgres group

      # Init container to fix certificate permissions
      initContainers:
      - name: fix-tls-permissions
        image: busybox:latest
        securityContext:
          runAsUser: 0  # Run as root to chown files
        command: ['sh', '-c']
        args:
        - |
          cp /tls-source/* /tls/
          chmod 600 /tls/server-key.pem
          chmod 644 /tls/server-cert.pem /tls/ca-cert.pem
          chown 70:70 /tls/*
        volumeMounts:
        - name: tls-certs-source
          mountPath: /tls-source
          readOnly: true
        - name: tls-certs-writable
          mountPath: /tls

      # PostgreSQL container
      containers:
      - name: postgres
        image: postgres:17-alpine
        command:
        - docker-entrypoint.sh
        - -c
        - config_file=/etc/postgresql/postgresql.conf
        volumeMounts:
        - name: tls-certs-writable
          mountPath: /tls
        - name: postgres-config
          mountPath: /etc/postgresql
        - name: postgres-data
          mountPath: /var/lib/postgresql/data

      volumes:
      # TLS certificates from Kubernetes Secret (read-only)
      - name: tls-certs-source
        secret:
          secretName: postgres-tls
      # Writable TLS directory (emptyDir)
      - name: tls-certs-writable
        emptyDir: {}
      # PostgreSQL configuration
      - name: postgres-config
        configMap:
          name: postgres-logging-config
      # Data persistence
      - name: postgres-data
        persistentVolumeClaim:
          claimName: auth-db-pvc
```
|
||||
|
||||
### Why Init Container?
|
||||
|
||||
PostgreSQL has strict requirements:
|
||||
1. **Permission Check:** Private key must have 0600 permissions
|
||||
2. **Ownership Check:** Files must be owned by postgres user (UID 70)
|
||||
3. **Kubernetes Limitation:** Secret mounts are read-only with fixed permissions
|
||||
|
||||
**Solution:** Init container copies certificates to emptyDir with correct permissions.
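The permission requirement can be illustrated with a short check. This is a simplified sketch: it only tests that the key grants no group or world access, which is the condition behind the "private key file has group or world access" error; it does not model ownership, and PostgreSQL also accepts group-readable keys when they are owned by root. The function name is hypothetical.

```python
import os
import stat


def key_permissions_ok(path: str) -> bool:
    """True when the file grants no access to group or others (e.g. mode 0600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

A key copied with mode 0644 fails this check, which is why the init container runs `chmod 600` after copying the files into the writable volume.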
|
||||
|
||||
### Kubernetes Secret
|
||||
|
||||
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: postgres-tls
  namespace: bakery-ia
type: Opaque
data:
  server-cert.pem: <base64-encoded-certificate>
  server-key.pem: <base64-encoded-private-key>
  ca-cert.pem: <base64-encoded-ca-certificate>
```
|
||||
|
||||
Create from files:
|
||||
```bash
|
||||
kubectl create secret generic postgres-tls \
|
||||
--from-file=server-cert.pem=infrastructure/tls/postgres/server-cert.pem \
|
||||
--from-file=server-key.pem=infrastructure/tls/postgres/server-key.pem \
|
||||
--from-file=ca-cert.pem=infrastructure/tls/postgres/ca-cert.pem \
|
||||
-n bakery-ia
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Redis TLS Configuration
|
||||
|
||||
### Server Configuration
|
||||
|
||||
Redis TLS is configured via command-line arguments:
|
||||
|
||||
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  namespace: bakery-ia
spec:
  template:
    spec:
      containers:
      - name: redis
        image: redis:7-alpine
        command:
        - redis-server
        - --requirepass
        - $(REDIS_PASSWORD)
        - --tls-port
        - "6379"
        - --port
        - "0"  # Disable non-TLS port
        - --tls-cert-file
        - /tls/redis-cert.pem
        - --tls-key-file
        - /tls/redis-key.pem
        - --tls-ca-cert-file
        - /tls/ca-cert.pem
        - --tls-auth-clients
        - "no"  # Don't require client certificates
        env:
        - name: REDIS_PASSWORD
          valueFrom:
            secretKeyRef:
              name: bakery-ia-secrets
              key: REDIS_PASSWORD
        volumeMounts:
        - name: tls-certs
          mountPath: /tls
          readOnly: true
        - name: redis-data
          mountPath: /data
      volumes:
      - name: tls-certs
        secret:
          secretName: redis-tls
      - name: redis-data
        persistentVolumeClaim:
          claimName: redis-pvc
```
|
||||
|
||||
### Configuration Explained
|
||||
|
||||
- `--tls-port 6379`: Enable TLS on port 6379
|
||||
- `--port 0`: Disable plaintext connections entirely
|
||||
- `--tls-auth-clients no`: Don't require client certificates (use password instead)
|
||||
- `--requirepass`: Require password authentication
|
||||
|
||||
### Kubernetes Secret
|
||||
|
||||
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: redis-tls
  namespace: bakery-ia
type: Opaque
data:
  redis-cert.pem: <base64-encoded-certificate>
  redis-key.pem: <base64-encoded-private-key>
  ca-cert.pem: <base64-encoded-ca-certificate>
```
|
||||
|
||||
Create from files:
|
||||
```bash
|
||||
kubectl create secret generic redis-tls \
|
||||
--from-file=redis-cert.pem=infrastructure/tls/redis/redis-cert.pem \
|
||||
--from-file=redis-key.pem=infrastructure/tls/redis/redis-key.pem \
|
||||
--from-file=ca-cert.pem=infrastructure/tls/redis/ca-cert.pem \
|
||||
-n bakery-ia
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Client Configuration
|
||||
|
||||
### PostgreSQL Client Configuration
|
||||
|
||||
Services connect to PostgreSQL using asyncpg with SSL enforcement.
|
||||
|
||||
**Connection String Format:**
|
||||
```python
|
||||
# Base format
|
||||
postgresql+asyncpg://user:password@host:5432/database
|
||||
|
||||
# With SSL enforcement (automatically added)
|
||||
postgresql+asyncpg://user:password@host:5432/database?ssl=require
|
||||
```
|
||||
|
||||
**Implementation in `shared/database/base.py`:**
|
||||
```python
import logging

logger = logging.getLogger(__name__)


class DatabaseManager:
    def __init__(self, database_url: str):
        # Enforce SSL for PostgreSQL connections unless an ssl option is already set
        if database_url.startswith('postgresql') and 'ssl=' not in database_url:
            separator = '&' if '?' in database_url else '?'
            database_url = f"{database_url}{separator}ssl=require"
            # Log only when the parameter was actually appended
            logger.info("SSL enforcement added to database URL")

        self.database_url = database_url
```
|
||||
|
||||
**Important:** asyncpg uses `ssl=require`, NOT `sslmode=require` (psycopg2 syntax).
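A more defensive variant of this enforcement could parse the query string rather than search for a substring, so that an existing `ssl` value anywhere in the query is respected. The sketch below is illustrative and is not the code in `shared/database/base.py`; the function name `enforce_ssl` is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def enforce_ssl(database_url: str, mode: str = "require") -> str:
    """Add ssl=<mode> to PostgreSQL URLs that do not already set an ssl option."""
    parts = urlsplit(database_url)
    if not parts.scheme.startswith("postgresql"):
        return database_url  # leave non-PostgreSQL URLs untouched
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.setdefault("ssl", mode)  # keep an explicit value if one is present
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Because the query is parsed into key/value pairs, a URL such as `...?application_name=auth&ssl=verify-full` is returned unchanged instead of receiving a second `ssl` parameter.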
|
||||
|
||||
### Redis Client Configuration
|
||||
|
||||
Services connect to Redis using TLS protocol.
|
||||
|
||||
**Connection String Format:**
|
||||
```python
|
||||
# Base format (without TLS)
|
||||
redis://:password@redis-service:6379
|
||||
|
||||
# With TLS (rediss:// protocol)
|
||||
rediss://:password@redis-service:6379?ssl_cert_reqs=none
|
||||
```
|
||||
|
||||
**Implementation in `shared/config/base.py`:**
|
||||
```python
import os


class BaseConfig:
    @property
    def REDIS_URL(self) -> str:
        redis_host = os.getenv("REDIS_HOST", "redis-service")
        redis_port = os.getenv("REDIS_PORT", "6379")
        redis_password = os.getenv("REDIS_PASSWORD", "")
        redis_tls_enabled = os.getenv("REDIS_TLS_ENABLED", "true").lower() == "true"

        if redis_tls_enabled:
            # Use rediss:// for TLS
            protocol = "rediss"
            ssl_params = "?ssl_cert_reqs=none"  # Don't verify self-signed certs
        else:
            protocol = "redis"
            ssl_params = ""

        password_part = f":{redis_password}@" if redis_password else ""
        return f"{protocol}://{password_part}{redis_host}:{redis_port}{ssl_params}"
```
|
||||
|
||||
**Why `ssl_cert_reqs=none`?**
|
||||
- We use self-signed certificates for internal cluster communication
|
||||
- Certificate validation would require distributing CA cert to all services
|
||||
- Network isolation provides adequate security within cluster
|
||||
- For external connections, use `ssl_cert_reqs=required` with proper CA
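For the verified case, the URL would need to carry both the stricter `ssl_cert_reqs` value and a path to the CA certificate. The sketch below builds both variants; it assumes the Redis client accepts an `ssl_ca_certs` query parameter (as redis-py's URL parsing is understood to), and the function name and default CA path are hypothetical.

```python
from urllib.parse import quote, urlencode


def redis_tls_url(host: str, port: int, password: str = "",
                  verify: bool = False, ca_path: str = "/tls/ca-cert.pem") -> str:
    """Build a rediss:// URL, optionally requiring certificate verification."""
    if verify:
        # Assumption: the client reads the CA bundle location from ssl_ca_certs.
        params = {"ssl_cert_reqs": "required", "ssl_ca_certs": ca_path}
    else:
        params = {"ssl_cert_reqs": "none"}
    # Percent-encode the password so characters like '@' or '/' cannot break the URL.
    auth = f":{quote(password, safe='')}@" if password else ""
    return f"rediss://{auth}{host}:{port}?{urlencode(params, safe='/')}"
```

Switching `verify` on would also require mounting `ca-cert.pem` into every service pod, which is the distribution cost the list above refers to.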
|
||||
|
||||
---
|
||||
|
||||
## Deployment
|
||||
|
||||
### Full Deployment Process
|
||||
|
||||
#### Option 1: Fresh Cluster (Recommended)
|
||||
|
||||
```bash
|
||||
# 1. Delete existing cluster (if any)
|
||||
kind delete cluster --name bakery-ia-local
|
||||
|
||||
# 2. Create new cluster with encryption enabled
|
||||
kind create cluster --config kind-config.yaml
|
||||
|
||||
# 3. Create namespace
|
||||
kubectl apply -f infrastructure/kubernetes/base/namespace.yaml
|
||||
|
||||
# 4. Create TLS secrets
|
||||
kubectl apply -f infrastructure/kubernetes/base/secrets/postgres-tls-secret.yaml
|
||||
kubectl apply -f infrastructure/kubernetes/base/secrets/redis-tls-secret.yaml
|
||||
|
||||
# 5. Create ConfigMap with PostgreSQL config
|
||||
kubectl apply -f infrastructure/kubernetes/base/configmaps/postgres-logging-config.yaml
|
||||
|
||||
# 6. Deploy databases
|
||||
kubectl apply -f infrastructure/kubernetes/base/components/databases/
|
||||
|
||||
# 7. Deploy services
|
||||
kubectl apply -f infrastructure/kubernetes/base/
|
||||
```
|
||||
|
||||
#### Option 2: Update Existing Cluster
|
||||
|
||||
```bash
|
||||
# 1. Apply TLS secrets
|
||||
kubectl apply -f infrastructure/kubernetes/base/secrets/postgres-tls-secret.yaml
|
||||
kubectl apply -f infrastructure/kubernetes/base/secrets/redis-tls-secret.yaml
|
||||
|
||||
# 2. Apply PostgreSQL config
|
||||
kubectl apply -f infrastructure/kubernetes/base/configmaps/postgres-logging-config.yaml
|
||||
|
||||
# 3. Update database deployments
|
||||
kubectl apply -f infrastructure/kubernetes/base/components/databases/
|
||||
|
||||
# 4. Restart all services to pick up new TLS configuration
|
||||
kubectl rollout restart deployment -n bakery-ia \
|
||||
--selector='app.kubernetes.io/component=service'
|
||||
```
|
||||
|
||||
### Applying Changes Script
|
||||
|
||||
A convenience script is provided:
|
||||
|
||||
```bash
|
||||
./scripts/apply-security-changes.sh
|
||||
```
|
||||
|
||||
This script:
|
||||
1. Applies TLS secrets
|
||||
2. Applies ConfigMaps
|
||||
3. Updates database deployments
|
||||
4. Waits for pods to be ready
|
||||
5. Restarts services
|
||||
|
||||
---
|
||||
|
||||
## Verification
|
||||
|
||||
### Verify PostgreSQL TLS
|
||||
|
||||
```bash
|
||||
# 1. Check SSL is enabled
|
||||
kubectl exec -n bakery-ia <postgres-pod> -- sh -c \
|
||||
'psql -U $POSTGRES_USER -d $POSTGRES_DB -c "SHOW ssl;"'
|
||||
# Expected output: on
|
||||
|
||||
# 2. Check TLS protocol version
|
||||
kubectl exec -n bakery-ia <postgres-pod> -- sh -c \
|
||||
'psql -U $POSTGRES_USER -d $POSTGRES_DB -c "SHOW ssl_min_protocol_version;"'
|
||||
# Expected output: TLSv1.2
|
||||
|
||||
# 3. Check listening on all interfaces
|
||||
kubectl exec -n bakery-ia <postgres-pod> -- sh -c \
|
||||
'psql -U $POSTGRES_USER -d $POSTGRES_DB -c "SHOW listen_addresses;"'
|
||||
# Expected output: *
|
||||
|
||||
# 4. Check certificate permissions
|
||||
kubectl exec -n bakery-ia <postgres-pod> -- ls -la /tls/
|
||||
# Expected output:
|
||||
# -rw------- 1 postgres postgres ... server-key.pem
|
||||
# -rw-r--r-- 1 postgres postgres ... server-cert.pem
|
||||
# -rw-r--r-- 1 postgres postgres ... ca-cert.pem
|
||||
|
||||
# 5. Verify certificate details
|
||||
kubectl exec -n bakery-ia <postgres-pod> -- \
|
||||
openssl x509 -in /tls/server-cert.pem -noout -dates
|
||||
# Shows NotBefore and NotAfter dates
|
||||
```
|
||||
|
||||
### Verify Redis TLS
|
||||
|
||||
```bash
|
||||
# 1. Check Redis is running
|
||||
kubectl get pods -n bakery-ia -l app.kubernetes.io/name=redis
|
||||
# Expected: STATUS = Running
|
||||
|
||||
# 2. Check Redis logs for TLS initialization
|
||||
kubectl logs -n bakery-ia <redis-pod> | grep -i "tls"
|
||||
# Should show TLS port enabled, no "wrong version number" errors
|
||||
|
||||
# 3. Test Redis connection with TLS
|
||||
kubectl exec -n bakery-ia <redis-pod> -- sh -c \
  'redis-cli --tls \
    --cert /tls/redis-cert.pem \
    --key /tls/redis-key.pem \
    --cacert /tls/ca-cert.pem \
    -a "$REDIS_PASSWORD" ping'
|
||||
# Expected output: PONG
|
||||
|
||||
# 4. Verify TLS-only (plaintext disabled)
|
||||
kubectl exec -n bakery-ia <redis-pod> -- sh -c 'redis-cli -a "$REDIS_PASSWORD" ping'
# Expected: the command fails (e.g. connection reset or closed by the server); port 6379 accepts TLS only
|
||||
```
|
||||
|
||||
### Verify Service Connections
|
||||
|
||||
```bash
|
||||
# 1. Check migration jobs completed successfully
|
||||
kubectl get jobs -n bakery-ia | grep migration
|
||||
# All should show "COMPLETIONS = 1/1"
|
||||
|
||||
# 2. Check service logs for SSL enforcement
|
||||
kubectl logs -n bakery-ia <service-pod> | grep "SSL enforcement"
|
||||
# Should show: "SSL enforcement added to database URL"
|
||||
|
||||
# 3. Check for connection errors
|
||||
kubectl logs -n bakery-ia <service-pod> | grep -i "error"
|
||||
# Should NOT show TLS/SSL related errors
|
||||
|
||||
# 4. Test service endpoint (run curl in a second terminal while the port-forward is active)
|
||||
kubectl port-forward -n bakery-ia svc/auth-service 8001:8001
|
||||
curl http://localhost:8001/health
|
||||
# Should return healthy status
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### PostgreSQL Won't Start
|
||||
|
||||
#### Symptom: "could not load server certificate file"
|
||||
|
||||
**Check init container logs:**
|
||||
```bash
|
||||
kubectl logs -n bakery-ia <pod> -c fix-tls-permissions
|
||||
```
|
||||
|
||||
**Check certificate permissions:**
|
||||
```bash
|
||||
kubectl exec -n bakery-ia <pod> -- ls -la /tls/
|
||||
```
|
||||
|
||||
**Expected:**
|
||||
- server-key.pem: 600 (rw-------)
|
||||
- server-cert.pem: 644 (rw-r--r--)
|
||||
- ca-cert.pem: 644 (rw-r--r--)
|
||||
- Owned by: postgres:postgres (70:70)
|
||||
|
||||
#### Symptom: "private key file has group or world access"
|
||||
|
||||
**Cause:** server-key.pem permissions too permissive
|
||||
|
||||
**Fix:** Init container should set chmod 600 on private key:
|
||||
```bash
|
||||
chmod 600 /tls/server-key.pem
|
||||
```
|
||||
|
||||
#### Symptom: "external-db-service:5432 - no response"
|
||||
|
||||
**Cause:** PostgreSQL not listening on network interfaces
|
||||
|
||||
**Check:**
|
||||
```bash
|
||||
kubectl exec -n bakery-ia <pod> -- sh -c \
|
||||
'psql -U $POSTGRES_USER -d $POSTGRES_DB -c "SHOW listen_addresses;"'
|
||||
```
|
||||
|
||||
**Should be:** `*` (all interfaces)
|
||||
|
||||
**Fix:** Ensure `listen_addresses = '*'` in postgresql.conf
|
||||
|
||||
### Services Can't Connect
|
||||
|
||||
#### Symptom: "connect() got an unexpected keyword argument 'sslmode'"
|
||||
|
||||
**Cause:** Using psycopg2 syntax with asyncpg
|
||||
|
||||
**Fix:** Use `ssl=require`, not `sslmode=require`, in the connection string
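
If a connection string still carries the psycopg2-style parameter, the rename can be applied with a small filter. This is a sketch only: `to_asyncpg_url` is a hypothetical helper, and the URL scheme, host, and database names in the example are assumptions, not values from the deployment.

```bash
# Hypothetical helper: rename the `sslmode` query parameter to `ssl`,
# leaving the rest of the URL untouched.
to_asyncpg_url() {
  printf '%s\n' "$1" | sed -E 's/([?&])sslmode=/\1ssl=/'
}

# Example with made-up credentials and host names.
to_asyncpg_url 'postgresql+asyncpg://user:pass@auth-db:5432/auth?sslmode=require'
```

The pattern only matches `sslmode=` when it directly follows `?` or `&`, so other parameters and URLs without the option pass through unchanged.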
|
||||
|
||||
#### Symptom: "SSL not supported by this database"
|
||||
|
||||
**Cause:** PostgreSQL not configured for SSL
|
||||
|
||||
**Check PostgreSQL logs:**
|
||||
```bash
|
||||
kubectl logs -n bakery-ia <db-pod>
|
||||
```
|
||||
|
||||
**Verify SSL configuration:**
|
||||
```bash
|
||||
kubectl exec -n bakery-ia <db-pod> -- sh -c \
|
||||
'psql -U $POSTGRES_USER -d $POSTGRES_DB -c "SHOW ssl;"'
|
||||
```
|
||||
|
||||
### Redis Connection Issues
|
||||
|
||||
#### Symptom: "SSL handshake is taking longer than 60.0 seconds"
|
||||
|
||||
**Cause:** Self-signed certificate validation issue
|
||||
|
||||
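**Fix:** Use `ssl_cert_reqs=none` in the Redis connection string. This skips certificate verification, so the connection is still encrypted but the server's identity is not checked.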
|
||||
|
||||
#### Symptom: "wrong version number" in Redis logs
|
||||
|
||||
**Cause:** Client trying to connect without TLS to TLS-only port
|
||||
|
||||
**Check client configuration:**
|
||||
```bash
|
||||
kubectl logs -n bakery-ia <service-pod> | grep "REDIS_URL"
|
||||
```
|
||||
|
||||
**Should use:** `rediss://` protocol (note double 's')
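
A quick scheme check can catch this before a service is deployed. The sketch below uses a hypothetical helper, `redis_url_uses_tls`, and an example URL that is not taken from the deployment.

```bash
# Hypothetical helper: succeed only when the URL uses the TLS scheme.
redis_url_uses_tls() {
  case "$1" in
    rediss://*) return 0 ;;
    *)          return 1 ;;
  esac
}

url='rediss://:password@redis-service:6379/0'   # example value only
if redis_url_uses_tls "$url"; then
  echo "TLS scheme in use"
else
  echo "plain redis:// scheme; a TLS-only port will reject this client"
fi
```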
|
||||
|
||||
---
|
||||
|
||||
## Maintenance
|
||||
|
||||
### Certificate Rotation
|
||||
|
||||
Certificates expire October 2028. Rotate **90 days before expiry**.
|
||||
|
||||
**Process:**
|
||||
```bash
|
||||
# 1. Generate new certificates
|
||||
cd infrastructure/tls
|
||||
./generate-certificates.sh
|
||||
|
||||
# 2. Update Kubernetes secrets
|
||||
kubectl delete secret postgres-tls redis-tls -n bakery-ia
|
||||
kubectl create secret generic postgres-tls \
|
||||
--from-file=server-cert.pem=postgres/server-cert.pem \
|
||||
--from-file=server-key.pem=postgres/server-key.pem \
|
||||
--from-file=ca-cert.pem=postgres/ca-cert.pem \
|
||||
-n bakery-ia
|
||||
kubectl create secret generic redis-tls \
|
||||
--from-file=redis-cert.pem=redis/redis-cert.pem \
|
||||
--from-file=redis-key.pem=redis/redis-key.pem \
|
||||
--from-file=ca-cert.pem=redis/ca-cert.pem \
|
||||
-n bakery-ia
|
||||
|
||||
# 3. Restart database and cache pods so they mount the new certificates
|
||||
kubectl rollout restart deployment -n bakery-ia \
|
||||
-l app.kubernetes.io/component=database
|
||||
kubectl rollout restart deployment -n bakery-ia \
|
||||
-l app.kubernetes.io/component=cache
|
||||
```
|
||||
|
||||
### Certificate Expiry Monitoring
|
||||
|
||||
Set up monitoring to alert 90 days before expiry:
|
||||
|
||||
```bash
|
||||
# Check certificate expiry date
|
||||
kubectl exec -n bakery-ia <postgres-pod> -- \
|
||||
openssl x509 -in /tls/server-cert.pem -noout -enddate
|
||||
|
||||
# Output: notAfter=Oct 17 00:00:00 2028 GMT
|
||||
```
|
||||
|
||||
**Recommended:** Create a Kubernetes CronJob to check expiry monthly.
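
One possible shape for the check such a CronJob could run is sketched below. The helper names `days_until_expiry` and `check_expiry` are hypothetical, and the script assumes GNU `date` (for `-d`); BusyBox `date` parses dates differently.

```bash
# Hypothetical helper: whole days between "now" and an openssl notAfter value.
# Assumes GNU date; the second argument is an optional epoch override for testing.
days_until_expiry() {
  local not_after="${1#notAfter=}"        # accept the raw `-enddate` output line
  local now="${2:-$(date -u +%s)}"
  local expiry
  expiry="$(date -u -d "$not_after" +%s)" || return 2
  echo $(( (expiry - now) / 86400 ))
}

# Succeeds while at least 90 days remain, fails once the rotation window opens.
check_expiry() {
  local days
  days="$(days_until_expiry "$@")" || return 2
  echo "certificate expires in $days days"
  [ "$days" -ge 90 ]
}

# Example with the notAfter value shown above.
check_expiry 'notAfter=Oct 17 00:00:00 2028 GMT' || echo "rotation window reached"
```

Where only a pass/fail answer is needed, `openssl x509 -in /tls/server-cert.pem -noout -checkend 7776000` gives the same 90-day check directly: it exits non-zero if the certificate expires within that many seconds.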
|
||||
|
||||
### Upgrading to Mutual TLS (mTLS)
|
||||
|
||||
For enhanced security, require client certificates:
|
||||
|
||||
**PostgreSQL:**
|
||||
```ini
|
||||
# postgresql.conf
|
||||
ssl_ca_file = '/tls/ca-cert.pem'
|
||||
# Clients must present a valid certificate only where pg_hba.conf sets
# clientcert=verify-full (or verify-ca) on the matching hostssl entry
|
||||
```
|
||||
|
||||
**Redis:**
|
||||
```bash
|
||||
redis-server \
|
||||
--tls-auth-clients yes # Change from "no"
|
||||
# Other args...
|
||||
```
|
||||
|
||||
**Clients would need:**
|
||||
- Client certificate signed by CA
|
||||
- Client private key
|
||||
- CA certificate
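
For a libpq-based client such as `psql`, the three items above map onto the standard `sslcert`, `sslkey`, and `sslrootcert` connection parameters. The sketch below only assembles the connection string; `mtls_conninfo`, the file names, the `/tls/client` directory, and the host, database, and user names are all assumptions for illustration.

```bash
# Hypothetical helper: build a libpq connection string that presents a client certificate.
# sslcert, sslkey, and sslrootcert are standard libpq parameters; the file layout is assumed.
mtls_conninfo() {
  local host="$1" db="$2" user="$3" dir="$4"
  printf 'host=%s dbname=%s user=%s sslmode=verify-full sslcert=%s/client-cert.pem sslkey=%s/client-key.pem sslrootcert=%s/ca-cert.pem\n' \
    "$host" "$db" "$user" "$dir" "$dir" "$dir"
}

# Example with made-up names.
mtls_conninfo auth-db-service auth_db auth_user /tls/client
```

The result can be passed straight to `psql`, for example `psql "$(mtls_conninfo auth-db-service auth_db auth_user /tls/client)"`. Note that `sslmode=verify-full` also requires the server certificate to match the host name used, and that asyncpg-based services configure client certificates through their own SSL options rather than this string.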
|
||||
|
||||
---
|
||||
|
||||
## Related Documentation
|
||||
|
||||
### Security Documentation
|
||||
- [Database Security](./database-security.md) - Complete database security guide
|
||||
- [RBAC Implementation](./rbac-implementation.md) - Access control
|
||||
- [Security Checklist](./security-checklist.md) - Deployment verification
|
||||
|
||||
### Source Documentation
|
||||
- [TLS Implementation Complete](../TLS_IMPLEMENTATION_COMPLETE.md)
|
||||
- [Security Implementation Complete](../SECURITY_IMPLEMENTATION_COMPLETE.md)
|
||||
|
||||
### External References
|
||||
- [PostgreSQL SSL/TLS Documentation](https://www.postgresql.org/docs/17/ssl-tcp.html)
|
||||
- [Redis TLS Documentation](https://redis.io/docs/manual/security/encryption/)
|
||||
- [TLS Best Practices](https://ssl-config.mozilla.org/)
|
||||
|
||||
---
|
||||
|
||||
**Document Version:** 1.0
|
||||
**Last Review:** November 2025
|
||||
**Next Review:** May 2026
|
||||
**Owner:** Security Team
|
||||