# Completion Checklist - Tenant & User Deletion System

**Current Status:** 75% Complete
**Time to 100%:** ~4 hours implementation + 2 days testing

---

## Phase 1: Complete Remaining Services (1.5 hours)

### POS Service (30 minutes)

- [ ] Create `services/pos/app/services/tenant_deletion_service.py`
- [ ] Copy template from QUICK_START_REMAINING_SERVICES.md
- [ ] Import models: POSConfiguration, POSTransaction, POSSession
- [ ] Implement `get_tenant_data_preview()`
- [ ] Implement `delete_tenant_data()` with correct order:
  - [ ] 1. POSTransaction
  - [ ] 2. POSSession
  - [ ] 3. POSConfiguration

- [ ] Add endpoints to `services/pos/app/api/{router}.py`
  - [ ] DELETE /tenant/{tenant_id}
  - [ ] GET /tenant/{tenant_id}/deletion-preview

- [ ] Test manually:
  ```bash
  curl -X GET "http://localhost:8000/api/v1/pos/tenant/{id}/deletion-preview"
  curl -X DELETE "http://localhost:8000/api/v1/pos/tenant/{id}"
  ```
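
The deletion order listed for the POS service (transactions, then sessions, then configuration) presumably removes dependent rows before the rows they refer to. A minimal sketch of that ordering, assuming hypothetical table names (`pos_transactions`, `pos_sessions`, `pos_configurations`) and using the standard-library `sqlite3` module as a stand-in for the real database layer; the template in QUICK_START_REMAINING_SERVICES.md may look different:

```python
import sqlite3

# Hypothetical table names, listed in deletion order; the real models are
# POSTransaction, POSSession, and POSConfiguration.
DELETION_ORDER = ["pos_transactions", "pos_sessions", "pos_configurations"]


def delete_tenant_data(conn: sqlite3.Connection, tenant_id: str) -> dict:
    """Delete one tenant's rows table by table and return per-table counts."""
    deleted = {}
    with conn:  # commit on success, roll back if any DELETE raises
        for table in DELETION_ORDER:
            # Table names come from the fixed list above, never from user input.
            cursor = conn.execute(
                f"DELETE FROM {table} WHERE tenant_id = ?", (tenant_id,)
            )
            deleted[table] = cursor.rowcount
    return deleted


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding sample rows for two tenants."""
    conn = sqlite3.connect(":memory:")
    for table in DELETION_ORDER:
        conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, tenant_id TEXT)")
    conn.executemany(
        "INSERT INTO pos_transactions (tenant_id) VALUES (?)",
        [("tenant-123",), ("tenant-123",), ("tenant-456",)],
    )
    conn.execute("INSERT INTO pos_sessions (tenant_id) VALUES ('tenant-123')")
    conn.execute("INSERT INTO pos_configurations (tenant_id) VALUES ('tenant-123')")
    conn.commit()
    return conn
```

Running all deletes inside one transaction means a failure partway through leaves the tenant's data untouched rather than half-deleted.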

### External Service (30 minutes)

- [ ] Create `services/external/app/services/tenant_deletion_service.py`
- [ ] Copy template
- [ ] Import models: ExternalDataCache, APIKeyUsage
- [ ] Implement `get_tenant_data_preview()`
- [ ] Implement `delete_tenant_data()` with order:
  - [ ] 1. APIKeyUsage
  - [ ] 2. ExternalDataCache

- [ ] Add endpoints to `services/external/app/api/{router}.py`
  - [ ] DELETE /tenant/{tenant_id}
  - [ ] GET /tenant/{tenant_id}/deletion-preview

- [ ] Test manually
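
The deletion-preview endpoint presumably reports what a deletion would remove without removing anything. A minimal sketch of such a dry-run count for the External service, assuming hypothetical table names (`api_key_usage`, `external_data_cache`) and a plain dictionary as the return shape, with `sqlite3` standing in for the real database layer:

```python
import sqlite3

# Hypothetical table names standing in for APIKeyUsage and ExternalDataCache.
PREVIEW_TABLES = ["api_key_usage", "external_data_cache"]


def get_tenant_data_preview(conn: sqlite3.Connection, tenant_id: str) -> dict:
    """Count the rows a deletion would remove, without deleting anything."""
    preview = {}
    for table in PREVIEW_TABLES:
        row = conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE tenant_id = ?", (tenant_id,)
        ).fetchone()
        preview[table] = row[0]
    return preview


def build_preview_db() -> sqlite3.Connection:
    """Create an in-memory database with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    for table in PREVIEW_TABLES:
        conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, tenant_id TEXT)")
    conn.executemany(
        "INSERT INTO api_key_usage (tenant_id) VALUES (?)",
        [("tenant-123",), ("tenant-123",), ("tenant-456",)],
    )
    conn.execute("INSERT INTO external_data_cache (tenant_id) VALUES ('tenant-123')")
    conn.commit()
    return conn
```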

### Alert Processor Service (30 minutes)

- [ ] Create `services/alert_processor/app/services/tenant_deletion_service.py`
- [ ] Copy template
- [ ] Import models: Alert, AlertRule, AlertHistory
- [ ] Implement `get_tenant_data_preview()`
- [ ] Implement `delete_tenant_data()` with order:
  - [ ] 1. AlertHistory
  - [ ] 2. Alert
  - [ ] 3. AlertRule

- [ ] Add endpoints to `services/alert_processor/app/api/{router}.py`
  - [ ] DELETE /tenant/{tenant_id}
  - [ ] GET /tenant/{tenant_id}/deletion-preview

- [ ] Test manually

---

## Phase 2: Refactor Existing Services (2.5 hours)

### Forecasting Service (45 minutes)

- [ ] Review existing deletion logic in forecasting service
- [ ] Create new `services/forecasting/app/services/tenant_deletion_service.py`
  - [ ] Extend BaseTenantDataDeletionService
  - [ ] Move existing logic into standard pattern
  - [ ] Import models: Forecast, PredictionBatch, etc.

- [ ] Update endpoints to use new pattern
  - [ ] Replace existing DELETE logic
  - [ ] Add deletion-preview endpoint

- [ ] Test both endpoints
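
Moving a service onto the standard pattern means subclassing a shared base and filling in the same two methods. The sketch below uses a hypothetical stand-in, `BaseDeletionServiceSketch`, because the real `BaseTenantDataDeletionService` interface is not shown in this checklist; the in-memory `store` and the use of model names as dictionary keys are likewise assumptions made for illustration:

```python
import asyncio
from abc import ABC, abstractmethod


class BaseDeletionServiceSketch(ABC):
    """Hypothetical stand-in for BaseTenantDataDeletionService."""

    def __init__(self, service_name: str):
        self.service_name = service_name

    @abstractmethod
    async def get_tenant_data_preview(self, tenant_id: str) -> dict:
        """Return per-model counts of what would be deleted."""

    @abstractmethod
    async def delete_tenant_data(self, tenant_id: str) -> dict:
        """Delete the tenant's data and return per-model counts."""


class ForecastingDeletionSketch(BaseDeletionServiceSketch):
    """Toy subclass backed by a dict of {model_name: [tenant_id, ...]}."""

    def __init__(self, store: dict):
        super().__init__("forecasting")
        self.store = store

    async def get_tenant_data_preview(self, tenant_id: str) -> dict:
        return {name: rows.count(tenant_id) for name, rows in self.store.items()}

    async def delete_tenant_data(self, tenant_id: str) -> dict:
        # Reuse the preview to report how many entries each model loses.
        deleted = await self.get_tenant_data_preview(tenant_id)
        for name in list(self.store):
            self.store[name] = [t for t in self.store[name] if t != tenant_id]
        return deleted
```

Because both methods are abstract, a subclass that forgets one of them fails at instantiation rather than at the first deletion request.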

### Training Service (45 minutes)

- [ ] Review existing deletion logic
- [ ] Create new `services/training/app/services/tenant_deletion_service.py`
  - [ ] Extend BaseTenantDataDeletionService
  - [ ] Move existing logic into standard pattern
  - [ ] Import models: TrainingJob, TrainedModel, ModelArtifact

- [ ] Update endpoints to use new pattern

- [ ] Test both endpoints

### Notification Service (45 minutes)

- [ ] Review existing deletion logic
- [ ] Create new `services/notification/app/services/tenant_deletion_service.py`
  - [ ] Extend BaseTenantDataDeletionService
  - [ ] Move existing logic into standard pattern
  - [ ] Import models: Notification, NotificationPreference, etc.

- [ ] Update endpoints to use new pattern

- [ ] Test both endpoints

---

## Phase 3: Integration (2 hours)

### Update Auth Service

- [ ] Open `services/auth/app/services/admin_delete.py`

- [ ] Import DeletionOrchestrator:
  ```python
  from app.services.deletion_orchestrator import DeletionOrchestrator
  ```

- [ ] Update `_delete_tenant_data()` method:
  ```python
  async def _delete_tenant_data(self, tenant_id: str):
      # NOTE: tenant_info is not defined in this snippet; load it or pass it
      # in before making this call.
      orchestrator = DeletionOrchestrator(auth_token=self.get_service_token())
      job = await orchestrator.orchestrate_tenant_deletion(
          tenant_id=tenant_id,
          tenant_name=tenant_info.get("name"),
          initiated_by=self.requesting_user_id
      )
      return job.to_dict()
  ```

- [ ] Remove old manual service calls

- [ ] Test complete user deletion flow

### Verify Service URLs

- [ ] Check orchestrator SERVICE_DELETION_ENDPOINTS
- [ ] Update URLs for your environment:
  - [ ] Development: localhost ports
  - [ ] Staging: service names
  - [ ] Production: service names
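
One way to keep the per-environment URLs in one place is to build the endpoint map from an environment name. The sketch below is an illustration under several assumptions: the `DEPLOY_ENV` variable, the port numbers, and the `*-service` host names are all hypothetical, and the real `SERVICE_DELETION_ENDPOINTS` structure in the orchestrator may differ:

```python
import os

# Hypothetical local ports for development; adjust to the real setup.
DEV_PORTS = {"pos": 8001, "external": 8002, "alert_processor": 8003}


def build_deletion_endpoints(environment: str) -> dict:
    """Map each service to its tenant-deletion URL template for one environment."""
    endpoints = {}
    for service, port in DEV_PORTS.items():
        if environment == "development":
            base = f"http://localhost:{port}"
        else:
            # Staging and production address services by name (assumed DNS names).
            base = f"http://{service.replace('_', '-')}-service:8000"
        endpoints[service] = f"{base}/api/v1/{service}/tenant/{{tenant_id}}"
    return endpoints


# DEPLOY_ENV is a hypothetical variable name; default to development.
SERVICE_DELETION_ENDPOINTS = build_deletion_endpoints(
    os.environ.get("DEPLOY_ENV", "development")
)
```

Each value is a template, so the caller fills in the tenant with `url.format(tenant_id=...)` at request time.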

---

## Phase 4: Testing (2 days)

### Unit Tests (Day 1)

- [ ] Test TenantDataDeletionResult
  ```python
  def test_deletion_result_creation():
      result = TenantDataDeletionResult("tenant-123", "test-service")
      assert result.tenant_id == "tenant-123"
      assert result.success is True
  ```

- [ ] Test BaseTenantDataDeletionService
  ```python
  async def test_safe_delete_handles_errors():
      # Test error handling
      ...
  ```

- [ ] Test each service deletion class
  ```python
  async def test_orders_deletion():
      # Create test data
      # Call delete_tenant_data()
      # Verify data deleted
      ...
  ```

- [ ] Test DeletionOrchestrator
  ```python
  async def test_orchestrator_parallel_execution():
      # Mock service responses
      # Verify all called
      ...
  ```

- [ ] Test DeletionJob tracking
  ```python
  def test_job_status_tracking():
      # Create job
      # Check status transitions
      ...
  ```

### Integration Tests (Day 1-2)

- [ ] Test tenant deletion endpoint
  ```python
  async def test_delete_tenant_endpoint():
      response = await client.delete(f"/api/v1/tenants/{tenant_id}")
      assert response.status_code == 200
  ```

- [ ] Test service-to-service calls
  ```python
  async def test_orders_deletion_via_orchestrator():
      # Create tenant with orders
      # Delete tenant
      # Verify orders deleted
      ...
  ```

- [ ] Test CASCADE deletes
  ```python
  async def test_cascade_deletes_children():
      # Create parent with children
      # Delete parent
      # Verify children also deleted
      ...
  ```

- [ ] Test error handling
  ```python
  async def test_partial_failure_handling():
      # Mock one service failure
      # Verify job shows failure
      # Verify other services succeeded
      ...
  ```
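
The partial-failure test depends on one failing service not cancelling the calls to the others. A minimal sketch of that behaviour using `asyncio.gather(..., return_exceptions=True)`; the `call_service` coroutine, the status strings, and the result shape are hypothetical, and the real DeletionOrchestrator may coordinate its calls differently:

```python
import asyncio


async def call_service(name: str, tenant_id: str, fail: bool = False) -> dict:
    """Hypothetical stand-in for one service's DELETE call."""
    await asyncio.sleep(0)  # yield control, as a real HTTP call would
    if fail:
        raise RuntimeError(f"{name} unavailable")
    return {"service": name, "tenant_id": tenant_id, "success": True}


async def delete_everywhere(tenant_id: str, services: dict) -> dict:
    """Call every service concurrently and record successes and failures.

    `services` maps a service name to a flag that forces a simulated failure.
    """
    names = list(services)
    outcomes = await asyncio.gather(
        *(call_service(n, tenant_id, fail=services[n]) for n in names),
        return_exceptions=True,  # a failure becomes a result, not a crash
    )
    results = {}
    for name, outcome in zip(names, outcomes):
        if isinstance(outcome, Exception):
            results[name] = {"success": False, "error": str(outcome)}
        else:
            results[name] = outcome
    all_ok = all(r["success"] for r in results.values())
    return {"status": "completed" if all_ok else "failed", "service_results": results}
```

With `return_exceptions=True`, the job can report a failure overall while still recording which services did succeed, which is what the test above checks for.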

### E2E Tests (Day 2)

- [ ] Test complete tenant deletion
  ```python
  async def test_complete_tenant_deletion():
      # Create tenant with data in all services
      # Delete tenant
      # Verify all data deleted
      # Check deletion job status
      ...
  ```

- [ ] Test complete user deletion
  ```python
  async def test_user_deletion_with_owned_tenants():
      # Create user with owned tenants
      # Create other admins
      # Delete user
      # Verify ownership transferred
      # Verify user data deleted
      ...
  ```

- [ ] Test owner deletion with tenant deletion
  ```python
  async def test_owner_deletion_no_other_admins():
      # Create user with tenant (no other admins)
      # Delete user
      # Verify tenant deleted
      # Verify all cascade deletes
      ...
  ```

### Manual Testing (Throughout)

- [ ] Test with small dataset (<100 records)
- [ ] Test with medium dataset (1,000 records)
- [ ] Test with large dataset (10,000+ records)
- [ ] Measure performance
- [ ] Verify database queries are efficient
- [ ] Check logs for errors
- [ ] Verify audit trail

---

## Phase 5: Database Persistence (1 day)

### Create Migration

- [ ] Create deletion_jobs table:
  ```sql
  CREATE TABLE deletion_jobs (
      id UUID PRIMARY KEY,
      tenant_id UUID NOT NULL,
      tenant_name VARCHAR(255),
      initiated_by UUID,
      status VARCHAR(50) NOT NULL,
      service_results JSONB,
      total_items_deleted INTEGER DEFAULT 0,
      started_at TIMESTAMP WITH TIME ZONE,
      completed_at TIMESTAMP WITH TIME ZONE,
      error_log TEXT[],
      created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
  );

  CREATE INDEX idx_deletion_jobs_tenant ON deletion_jobs(tenant_id);
  CREATE INDEX idx_deletion_jobs_status ON deletion_jobs(status);
  CREATE INDEX idx_deletion_jobs_initiated ON deletion_jobs(initiated_by);
  ```

- [ ] Run migration in dev
- [ ] Run migration in staging

### Update Orchestrator

- [ ] Add database session to DeletionOrchestrator
- [ ] Save job to database in orchestrate_tenant_deletion()
- [ ] Update job status in database
- [ ] Query jobs from database in get_job_status()
- [ ] Query jobs from database in list_jobs()
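
The save, update, and query steps can be sketched against a simplified table. This is an illustration only: it uses the standard-library `sqlite3` module with TEXT columns in place of the PostgreSQL `UUID` and `JSONB` types, keeps just four of the columns, and the function names and the `"pending"` and `"completed"` status values are assumptions rather than the orchestrator's actual API:

```python
import json
import sqlite3
import uuid


def open_job_store() -> sqlite3.Connection:
    """Create an in-memory stand-in for the deletion_jobs table."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute(
        """CREATE TABLE deletion_jobs (
               id TEXT PRIMARY KEY,
               tenant_id TEXT NOT NULL,
               status TEXT NOT NULL,
               service_results TEXT
           )"""
    )
    return conn


def save_job(conn: sqlite3.Connection, tenant_id: str) -> str:
    """Insert a new job row and return its generated id."""
    job_id = str(uuid.uuid4())
    with conn:
        conn.execute(
            "INSERT INTO deletion_jobs (id, tenant_id, status) VALUES (?, ?, ?)",
            (job_id, tenant_id, "pending"),
        )
    return job_id


def update_job_status(conn, job_id: str, status: str, service_results: dict) -> None:
    """Record the new status and the per-service results as JSON text."""
    with conn:
        conn.execute(
            "UPDATE deletion_jobs SET status = ?, service_results = ? WHERE id = ?",
            (status, json.dumps(service_results), job_id),
        )


def list_jobs(conn, tenant_id=None, status=None, limit: int = 100) -> list:
    """Return jobs as dicts, optionally filtered by tenant and status."""
    clauses, params = [], []
    if tenant_id is not None:
        clauses.append("tenant_id = ?")
        params.append(tenant_id)
    if status is not None:
        clauses.append("status = ?")
        params.append(status)
    where = f" WHERE {' AND '.join(clauses)}" if clauses else ""
    rows = conn.execute(
        f"SELECT * FROM deletion_jobs{where} LIMIT ?", (*params, limit)
    ).fetchall()
    return [dict(row) for row in rows]
```

Persisting each status change as it happens lets a job's state survive a restart of the orchestrating service, unlike state held only in memory.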

### Add Job API Endpoints

- [ ] Create `services/auth/app/api/deletion_jobs.py`
  ```python
  from typing import Optional

  # Assumes FastAPI, as the decorator style suggests.
  from fastapi import APIRouter

  router = APIRouter()

  @router.get("/deletion-jobs/{job_id}")
  async def get_job_status(job_id: str):
      # Query from database
      ...

  @router.get("/deletion-jobs")
  async def list_deletion_jobs(
      tenant_id: Optional[str] = None,
      status: Optional[str] = None,
      limit: int = 100
  ):
      # Query from database with filters
      ...
  ```

- [ ] Test job status endpoints

---

## Phase 6: Production Prep (2 days)

### Performance Testing

- [ ] Create test dataset with 100K records
- [ ] Run deletion and measure time
- [ ] Identify bottlenecks
- [ ] Optimize slow queries
- [ ] Add batch processing if needed
- [ ] Re-test and verify improvement
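
If measurement shows that one large DELETE is the bottleneck, batching is one option. A minimal sketch, assuming a hypothetical `orders` table and batch size, with `sqlite3` standing in for the real database:

```python
import sqlite3


def delete_in_batches(
    conn: sqlite3.Connection, tenant_id: str, batch_size: int = 1000
) -> int:
    """Delete a tenant's rows in fixed-size batches; return the total removed."""
    total = 0
    while True:
        with conn:  # one short transaction per batch
            cursor = conn.execute(
                "DELETE FROM orders WHERE id IN ("
                "  SELECT id FROM orders WHERE tenant_id = ? LIMIT ?"
                ")",
                (tenant_id, batch_size),
            )
        if cursor.rowcount == 0:
            break
        total += cursor.rowcount
    return total


def build_orders_db(rows_for_tenant: int) -> sqlite3.Connection:
    """Create an in-memory orders table with one large and one small tenant."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, tenant_id TEXT)")
    conn.executemany(
        "INSERT INTO orders (tenant_id) VALUES (?)",
        [("tenant-123",)] * rows_for_tenant + [("tenant-456",)] * 5,
    )
    conn.commit()
    return conn
```

Each batch commits separately, which keeps transactions short but means a failure partway through leaves the tenant partially deleted, so the job status needs to reflect that.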

### Monitoring Setup

- [ ] Add Prometheus metrics:
  ```python
  deletion_duration_seconds = Histogram(...)
  deletion_items_deleted = Counter(...)
  deletion_errors_total = Counter(...)
  deletion_jobs_status = Gauge(...)
  ```

- [ ] Create Grafana dashboard:
  - [ ] Active deletions gauge
  - [ ] Deletion rate graph
  - [ ] Error rate graph
  - [ ] Average duration graph
  - [ ] Items deleted by service

- [ ] Configure alerts:
  - [ ] Alert if deletion >5 minutes
  - [ ] Alert if >10% error rate
  - [ ] Alert on service timeouts

### Documentation Updates

- [ ] Update API documentation
- [ ] Create operations runbook
- [ ] Document rollback procedures
- [ ] Create troubleshooting guide

### Rollout Plan

- [ ] Deploy to dev environment
- [ ] Run full test suite
- [ ] Deploy to staging
- [ ] Run smoke tests
- [ ] Deploy to production with feature flag
- [ ] Monitor for 24 hours
- [ ] Enable for all tenants

---

## Phase 7: Optional Enhancements (Future)

### Soft Delete (2 days)

- [ ] Add deleted_at column to tenants table
- [ ] Implement 30-day retention
- [ ] Add restoration endpoint
- [ ] Add cleanup job for expired deletions
- [ ] Update queries to filter deleted tenants
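
The soft-delete items come down to three checks on `deleted_at`: whether a tenant can still be restored, whether the cleanup job may remove it, and whether ordinary queries should see it. A minimal sketch of those checks; the function names and the dictionary shape for a tenant are hypothetical:

```python
from datetime import datetime, timedelta
from typing import Optional

RETENTION = timedelta(days=30)  # retention window from the checklist


def is_restorable(deleted_at: Optional[datetime], now: datetime) -> bool:
    """A soft-deleted tenant can be restored until the retention window ends."""
    return deleted_at is not None and now - deleted_at < RETENTION


def is_expired(deleted_at: Optional[datetime], now: datetime) -> bool:
    """The cleanup job may hard-delete once the retention window has passed."""
    return deleted_at is not None and now - deleted_at >= RETENTION


def active_tenants(tenants: list) -> list:
    """Filter out soft-deleted tenants, as regular queries would."""
    return [t for t in tenants if t["deleted_at"] is None]
```

Passing `now` in explicitly, rather than reading the clock inside each function, makes the 30-day boundary easy to test.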

### Advanced Features (1 week)

- [ ] WebSocket progress updates
- [ ] Email notifications on completion
- [ ] Deletion reports (PDF download)
- [ ] Scheduled deletions
- [ ] Deletion preview aggregation

---

## Sign-Off Checklist

### Code Quality

- [ ] All services implemented
- [ ] All endpoints tested
- [ ] No compiler warnings
- [ ] Code reviewed
- [ ] Documentation complete

### Testing

- [ ] Unit tests passing (>80% coverage)
- [ ] Integration tests passing
- [ ] E2E tests passing
- [ ] Performance tests passing
- [ ] Manual testing complete

### Production Readiness

- [ ] Monitoring configured
- [ ] Alerts configured
- [ ] Logging verified
- [ ] Rollback plan documented
- [ ] Runbook created

### Security & Compliance

- [ ] Authorization verified
- [ ] Audit logging enabled
- [ ] GDPR compliance verified
- [ ] Data retention policy documented
- [ ] Security review completed

---

## Quick Reference

### Files to Create (3 new services):
1. `services/pos/app/services/tenant_deletion_service.py`
2. `services/external/app/services/tenant_deletion_service.py`
3. `services/alert_processor/app/services/tenant_deletion_service.py`

### Files to Modify (3 refactored services):
1. `services/forecasting/app/services/tenant_deletion_service.py`
2. `services/training/app/services/tenant_deletion_service.py`
3. `services/notification/app/services/tenant_deletion_service.py`

### Files to Update (integration):
1. `services/auth/app/services/admin_delete.py`

### Tests to Write (~50 tests):
- 10 unit tests (base classes)
- 24 service-specific tests (2 per service × 12 services)
- 10 integration tests
- 6 E2E tests

### Time Estimate:
- Implementation: 4 hours
- Testing: 2 days
- Deployment: 2 days
- **Total: ~5 days**

---

## Success Criteria

✅ All 12 services have deletion logic
✅ All deletion endpoints working
✅ Orchestrator coordinating successfully
✅ Job tracking persisted to database
✅ All tests passing
✅ Performance acceptable (<5 min for large tenants)
✅ Monitoring in place
✅ Documentation complete
✅ Production deployment successful

---

**Keep this checklist handy and mark items as you complete them!**

**Remember:** Templates and examples are in QUICK_START_REMAINING_SERVICES.md