bakery-ia/docs/archive/COMPLETION_CHECKLIST.md
2025-11-05 13:34:56 +01:00


Completion Checklist - Tenant & User Deletion System

Current Status: 75% Complete

Time to 100%: ~4 hours implementation + 2 days testing


Phase 1: Complete Remaining Services (1.5 hours)

POS Service (30 minutes)

  • Create services/pos/app/services/tenant_deletion_service.py

    • Copy template from QUICK_START_REMAINING_SERVICES.md
    • Import models: POSConfiguration, POSTransaction, POSSession
    • Implement get_tenant_data_preview()
    • Implement delete_tenant_data() with correct order:
      • 1. POSTransaction
      • 2. POSSession
      • 3. POSConfiguration
  • Add endpoints to services/pos/app/api/{router}.py

    • DELETE /tenant/{tenant_id}
    • GET /tenant/{tenant_id}/deletion-preview
  • Test manually:

    curl -X GET "http://localhost:8000/api/v1/pos/tenant/{id}/deletion-preview"
    curl -X DELETE "http://localhost:8000/api/v1/pos/tenant/{id}"
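
The child-first deletion order above can be sketched against an in-memory SQLite database. This is not the template from QUICK_START_REMAINING_SERVICES.md: the table names, the `DELETION_ORDER` list, and the function signatures are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical table names standing in for POSTransaction, POSSession and
# POSConfiguration. They are listed in the order the checklist deletes them.
DELETION_ORDER = ["pos_transactions", "pos_sessions", "pos_configurations"]


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one table per entry in DELETION_ORDER."""
    conn = sqlite3.connect(":memory:")
    for table in DELETION_ORDER:
        conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, tenant_id TEXT)")
    return conn


def get_tenant_data_preview(conn: sqlite3.Connection, tenant_id: str) -> dict[str, int]:
    """Count the rows a deletion would remove, per table, without deleting anything."""
    return {
        table: conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE tenant_id = ?", (tenant_id,)
        ).fetchone()[0]
        for table in DELETION_ORDER
    }


def delete_tenant_data(conn: sqlite3.Connection, tenant_id: str) -> dict[str, int]:
    """Delete the tenant's rows table by table, all inside one transaction."""
    deleted = {}
    with conn:  # commits on success, rolls back if any DELETE raises
        for table in DELETION_ORDER:
            cur = conn.execute(f"DELETE FROM {table} WHERE tenant_id = ?", (tenant_id,))
            deleted[table] = cur.rowcount
    return deleted
```

Comparing the preview counts with the counts returned by the delete is a quick manual check that the two methods agree.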
    

External Service (30 minutes)

  • Create services/external/app/services/tenant_deletion_service.py

    • Copy template
    • Import models: ExternalDataCache, APIKeyUsage
    • Implement get_tenant_data_preview()
    • Implement delete_tenant_data() with order:
      • 1. APIKeyUsage
      • 2. ExternalDataCache
  • Add endpoints to services/external/app/api/{router}.py

    • DELETE /tenant/{tenant_id}
    • GET /tenant/{tenant_id}/deletion-preview
  • Test manually

Alert Processor Service (30 minutes)

  • Create services/alert_processor/app/services/tenant_deletion_service.py

    • Copy template
    • Import models: Alert, AlertRule, AlertHistory
    • Implement get_tenant_data_preview()
    • Implement delete_tenant_data() with order:
      • 1. AlertHistory
      • 2. Alert
      • 3. AlertRule
  • Add endpoints to services/alert_processor/app/api/{router}.py

    • DELETE /tenant/{tenant_id}
    • GET /tenant/{tenant_id}/deletion-preview
  • Test manually


Phase 2: Refactor Existing Services (2.5 hours)

Forecasting Service (45 minutes)

  • Review existing deletion logic in forecasting service

  • Create new services/forecasting/app/services/tenant_deletion_service.py

    • Extend BaseTenantDataDeletionService
    • Move existing logic into standard pattern
    • Import models: Forecast, PredictionBatch, etc.
  • Update endpoints to use new pattern

    • Replace existing DELETE logic
    • Add deletion-preview endpoint
  • Test both endpoints
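
The "extend the base class" step can be pictured roughly as follows. The real `BaseTenantDataDeletionService` and `TenantDataDeletionResult` interfaces are not shown in this checklist, so the classes below are stand-ins with assumed fields and method signatures, and plain lists replace the database tables.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class DeletionResultSketch:
    """Stand-in for TenantDataDeletionResult; the real fields may differ."""
    tenant_id: str
    service_name: str
    deleted_counts: dict = field(default_factory=dict)
    errors: list = field(default_factory=list)

    @property
    def success(self) -> bool:
        return not self.errors


class BaseDeletionServiceSketch(ABC):
    """Stand-in for BaseTenantDataDeletionService (assumed interface)."""
    service_name = "base"

    @abstractmethod
    async def get_tenant_data_preview(self, tenant_id: str) -> dict: ...

    @abstractmethod
    async def delete_tenant_data(self, tenant_id: str) -> DeletionResultSketch: ...


class ForecastingDeletionSketch(BaseDeletionServiceSketch):
    service_name = "forecasting"

    def __init__(self, rows: dict):
        # rows maps a table name to the tenant_id of each stored row
        self.rows = rows

    async def get_tenant_data_preview(self, tenant_id: str) -> dict:
        return {name: ids.count(tenant_id) for name, ids in self.rows.items()}

    async def delete_tenant_data(self, tenant_id: str) -> DeletionResultSketch:
        result = DeletionResultSketch(tenant_id, self.service_name)
        for name, ids in list(self.rows.items()):
            result.deleted_counts[name] = ids.count(tenant_id)
            self.rows[name] = [t for t in ids if t != tenant_id]
        return result
```

Keeping the preview and delete methods on one class per service lets the existing endpoint logic move over without changing what the endpoints return.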

Training Service (45 minutes)

  • Review existing deletion logic

  • Create new services/training/app/services/tenant_deletion_service.py

    • Extend BaseTenantDataDeletionService
    • Move existing logic into standard pattern
    • Import models: TrainingJob, TrainedModel, ModelArtifact
  • Update endpoints to use new pattern

  • Test both endpoints

Notification Service (45 minutes)

  • Review existing deletion logic

  • Create new services/notification/app/services/tenant_deletion_service.py

    • Extend BaseTenantDataDeletionService
    • Move existing logic into standard pattern
    • Import models: Notification, NotificationPreference, etc.
  • Update endpoints to use new pattern

  • Test both endpoints


Phase 3: Integration (2 hours)

Update Auth Service

  • Open services/auth/app/services/admin_delete.py

  • Import DeletionOrchestrator:

    from app.services.deletion_orchestrator import DeletionOrchestrator
    
  • Update _delete_tenant_data() method:

    async def _delete_tenant_data(self, tenant_id: str):
        # tenant_info is assumed to be loaded before this point; the lookup is not shown here
        orchestrator = DeletionOrchestrator(auth_token=self.get_service_token())
        job = await orchestrator.orchestrate_tenant_deletion(
            tenant_id=tenant_id,
            tenant_name=tenant_info.get("name"),
            initiated_by=self.requesting_user_id
        )
        return job.to_dict()
    
  • Remove old manual service calls

  • Test complete user deletion flow

Verify Service URLs

  • Check orchestrator SERVICE_DELETION_ENDPOINTS
  • Update URLs for your environment:
    • Development: localhost ports
    • Staging: service names
    • Production: service names
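
One possible way to keep per-environment URLs in a single place is sketched below. The host names, ports, and the `build_deletion_endpoints` helper are hypothetical; only the `/api/v1/pos/tenant/{id}` path shape comes from the curl examples in Phase 1, and the same shape is assumed for other services.

```python
# Hypothetical base URLs; substitute the values from your own deployment.
SERVICE_HOSTS = {
    "development": {
        "pos": "http://localhost:8001",
        "external": "http://localhost:8002",
    },
    "staging": {
        "pos": "http://pos-service:8000",
        "external": "http://external-service:8000",
    },
    "production": {
        "pos": "http://pos-service:8000",
        "external": "http://external-service:8000",
    },
}


def build_deletion_endpoints(environment: str) -> dict[str, str]:
    """Return one URL template per service, with a {tenant_id} placeholder."""
    try:
        hosts = SERVICE_HOSTS[environment]
    except KeyError:
        raise ValueError(f"Unknown environment: {environment!r}") from None
    return {
        name: f"{base}/api/v1/{name}/tenant/{{tenant_id}}"
        for name, base in hosts.items()
    }
```

Failing fast on an unknown environment name avoids silently sending deletion requests to the wrong hosts.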

Phase 4: Testing (2 days)

Unit Tests (Day 1)

  • Test TenantDataDeletionResult

    def test_deletion_result_creation():
        result = TenantDataDeletionResult("tenant-123", "test-service")
        assert result.tenant_id == "tenant-123"
        assert result.success is True
    
  • Test BaseTenantDataDeletionService

    async def test_safe_delete_handles_errors():
        # Test error handling
        ...
    
  • Test each service deletion class

    async def test_orders_deletion():
        # Create test data
        # Call delete_tenant_data()
        # Verify data deleted
        ...
    
  • Test DeletionOrchestrator

    async def test_orchestrator_parallel_execution():
        # Mock service responses
        # Verify all called
        ...
    
  • Test DeletionJob tracking

    def test_job_status_tracking():
        # Create job
        # Check status transitions
        ...
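
A status-transition test needs some notion of which transitions are legal. The sketch below is one way to model that. The status names, the `ALLOWED` table, and the `JobSketch` class are assumptions, since the real `DeletionJob` class is not shown in this checklist.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class JobStatus(str, Enum):
    # Hypothetical status values; the real job may use different names.
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"


# Which statuses may follow each status. Terminal statuses allow nothing.
ALLOWED = {
    JobStatus.PENDING: {JobStatus.IN_PROGRESS},
    JobStatus.IN_PROGRESS: {JobStatus.COMPLETED, JobStatus.FAILED},
    JobStatus.COMPLETED: set(),
    JobStatus.FAILED: set(),
}


@dataclass
class JobSketch:
    tenant_id: str
    status: JobStatus = JobStatus.PENDING
    started_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None

    def transition(self, new_status: JobStatus) -> None:
        """Move to new_status, rejecting transitions not listed in ALLOWED."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(f"{self.status.value} -> {new_status.value} is not allowed")
        now = datetime.now(timezone.utc)
        if new_status is JobStatus.IN_PROGRESS:
            self.started_at = now
        else:
            self.completed_at = now
        self.status = new_status
```

A test can then walk a job through the happy path and assert that leaving a terminal status raises.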
    

Integration Tests (Day 1-2)

  • Test tenant deletion endpoint

    async def test_delete_tenant_endpoint():
        response = await client.delete(f"/api/v1/tenants/{tenant_id}")
        assert response.status_code == 200
    
  • Test service-to-service calls

    async def test_orders_deletion_via_orchestrator():
        # Create tenant with orders
        # Delete tenant
        # Verify orders deleted
        ...
    
  • Test CASCADE deletes

    async def test_cascade_deletes_children():
        # Create parent with children
        # Delete parent
        # Verify children also deleted
        ...
    
  • Test error handling

    async def test_partial_failure_handling():
        # Mock one service failure
        # Verify job shows failure
        # Verify other services succeeded
        ...
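
The partial-failure case can be illustrated with `asyncio.gather(..., return_exceptions=True)`, which lets one failing call surface as a value instead of cancelling the others. Whether `DeletionOrchestrator` is built this way is not stated here; `call_service` and `delete_everywhere` are hypothetical stand-ins for the real HTTP calls.

```python
import asyncio


async def call_service(name: str, fail: bool = False) -> int:
    """Stand-in for an HTTP DELETE to one service; returns the items deleted."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    if fail:
        raise RuntimeError(f"{name} unavailable")
    return 3


async def delete_everywhere(services: dict[str, bool]) -> dict[str, dict]:
    """Call every service concurrently and record success or failure per service.

    services maps a service name to whether its fake call should fail.
    """
    names = list(services)
    outcomes = await asyncio.gather(
        *(call_service(name, fail=services[name]) for name in names),
        return_exceptions=True,  # exceptions come back as results
    )
    report = {}
    for name, outcome in zip(names, outcomes):
        if isinstance(outcome, Exception):
            report[name] = {"success": False, "error": str(outcome)}
        else:
            report[name] = {"success": True, "deleted": outcome}
    return report
```

With this shape, the test mocks one failing service and asserts that the report marks it as failed while the others still report their counts.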
    

E2E Tests (Day 2)

  • Test complete tenant deletion

    async def test_complete_tenant_deletion():
        # Create tenant with data in all services
        # Delete tenant
        # Verify all data deleted
        # Check deletion job status
        ...
    
  • Test complete user deletion

    async def test_user_deletion_with_owned_tenants():
        # Create user with owned tenants
        # Create other admins
        # Delete user
        # Verify ownership transferred
        # Verify user data deleted
        ...
    
  • Test owner deletion with tenant deletion

    async def test_owner_deletion_no_other_admins():
        # Create user with tenant (no other admins)
        # Delete user
        # Verify tenant deleted
        # Verify all cascade deletes
        ...
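
The two owner-deletion scenarios above come down to one decision per owned tenant: transfer ownership if another admin exists, otherwise delete the tenant. A minimal sketch of that decision follows. The `plan_owner_deletion` helper is hypothetical, and picking the first remaining admin as the new owner is an assumption, not the documented rule.

```python
from typing import Optional


def plan_owner_deletion(
    owned_tenants: dict[str, list[str]], user_id: str
) -> dict[str, tuple[str, Optional[str]]]:
    """Decide what happens to each tenant owned by the user being deleted.

    owned_tenants maps tenant_id to the ids of that tenant's admins, which
    may include the owner. Returns ("transfer", new_owner_id) or
    ("delete", None) per tenant.
    """
    plan = {}
    for tenant_id, admins in owned_tenants.items():
        other_admins = [admin for admin in admins if admin != user_id]
        if other_admins:
            # Assumption: the first remaining admin becomes the owner.
            plan[tenant_id] = ("transfer", other_admins[0])
        else:
            plan[tenant_id] = ("delete", None)
    return plan
```

The E2E tests can assert on the resulting plan first, then verify that the transfer or cascade deletion actually took place.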
    

Manual Testing (Throughout)

  • Test with small dataset (<100 records)
  • Test with medium dataset (1,000 records)
  • Test with large dataset (10,000+ records)
  • Measure performance
  • Verify database queries are efficient
  • Check logs for errors
  • Verify audit trail

Phase 5: Database Persistence (1 day)

Create Migration

  • Create deletion_jobs table:

    CREATE TABLE deletion_jobs (
        id UUID PRIMARY KEY,
        tenant_id UUID NOT NULL,
        tenant_name VARCHAR(255),
        initiated_by UUID,
        status VARCHAR(50) NOT NULL,
        service_results JSONB,
        total_items_deleted INTEGER DEFAULT 0,
        started_at TIMESTAMP WITH TIME ZONE,
        completed_at TIMESTAMP WITH TIME ZONE,
        error_log TEXT[],
        created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
    );
    
    CREATE INDEX idx_deletion_jobs_tenant ON deletion_jobs(tenant_id);
    CREATE INDEX idx_deletion_jobs_status ON deletion_jobs(status);
    CREATE INDEX idx_deletion_jobs_initiated ON deletion_jobs(initiated_by);
    
  • Run migration in dev

  • Run migration in staging

Update Orchestrator

  • Add database session to DeletionOrchestrator
  • Save job to database in orchestrate_tenant_deletion()
  • Update job status in database
  • Query jobs from database in get_job_status()
  • Query jobs from database in list_jobs()

Add Job API Endpoints

  • Create services/auth/app/api/deletion_jobs.py

    from typing import Optional

    # router is assumed to be the API router object already defined for this module

    @router.get("/deletion-jobs/{job_id}")
    async def get_job_status(job_id: str):
        # Query from database
        ...

    @router.get("/deletion-jobs")
    async def list_deletion_jobs(
        tenant_id: Optional[str] = None,
        status: Optional[str] = None,
        limit: int = 100
    ):
        # Query from database with filters
        ...
    
  • Test job status endpoints
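
Before wiring up the database, the intended filter semantics for `get_job_status()` and `list_jobs()` can be pinned down with an in-memory stand-in. `JobRecord` and `InMemoryJobStore` are hypothetical names, and the status strings are illustrative. A real implementation would query the `deletion_jobs` table instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JobRecord:
    """Subset of the deletion_jobs columns, enough to show the filters."""
    id: str
    tenant_id: str
    status: str


class InMemoryJobStore:
    """Hypothetical stand-in for the database-backed job queries."""

    def __init__(self) -> None:
        self._jobs: dict[str, JobRecord] = {}

    def save(self, job: JobRecord) -> None:
        self._jobs[job.id] = job  # insert or overwrite, like an upsert

    def get_job_status(self, job_id: str) -> Optional[str]:
        job = self._jobs.get(job_id)
        return job.status if job else None

    def list_jobs(
        self,
        tenant_id: Optional[str] = None,
        status: Optional[str] = None,
        limit: int = 100,
    ) -> list[JobRecord]:
        """Return jobs matching every filter that is not None, capped at limit."""
        matches = [
            job
            for job in self._jobs.values()
            if (tenant_id is None or job.tenant_id == tenant_id)
            and (status is None or job.status == status)
        ]
        return matches[:limit]
```

Treating `None` as "no filter" matches the optional query parameters on the list endpoint.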


Phase 6: Production Prep (2 days)

Performance Testing

  • Create test dataset with 100K records
  • Run deletion and measure time
  • Identify bottlenecks
  • Optimize slow queries
  • Add batch processing if needed
  • Re-test and verify improvement
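
If a single large DELETE turns out to be the bottleneck, batching is one option. The sketch below uses SQLite from the standard library so it can run anywhere. The `delete_in_batches` helper and the `records` table used with it are hypothetical, and PostgreSQL has no `rowid`, so the primary key would be needed there instead.

```python
import sqlite3


def delete_in_batches(
    conn: sqlite3.Connection, table: str, tenant_id: str, batch_size: int = 1000
) -> int:
    """Delete a tenant's rows in chunks, committing after each chunk.

    table must be a trusted identifier: it is interpolated into the SQL.
    Returns the total number of rows deleted.
    """
    total = 0
    while True:
        with conn:  # one short transaction per batch
            cur = conn.execute(
                f"DELETE FROM {table} WHERE rowid IN ("
                f"SELECT rowid FROM {table} WHERE tenant_id = ? LIMIT ?)",
                (tenant_id, batch_size),
            )
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total
```

Shorter transactions hold locks for less time, at the cost of the deletion no longer being atomic across batches.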

Monitoring Setup

  • Add Prometheus metrics:

    deletion_duration_seconds = Histogram(...)
    deletion_items_deleted = Counter(...)
    deletion_errors_total = Counter(...)
    deletion_jobs_status = Gauge(...)
    
  • Create Grafana dashboard:

    • Active deletions gauge
    • Deletion rate graph
    • Error rate graph
    • Average duration graph
    • Items deleted by service
  • Configure alerts:

    • Alert if deletion >5 minutes
    • Alert if >10% error rate
    • Alert if service timeouts

Documentation Updates

  • Update API documentation
  • Create operations runbook
  • Document rollback procedures
  • Create troubleshooting guide

Rollout Plan

  • Deploy to dev environment
  • Run full test suite
  • Deploy to staging
  • Run smoke tests
  • Deploy to production with feature flag
  • Monitor for 24 hours
  • Enable for all tenants

Phase 7: Optional Enhancements (Future)

Soft Delete (2 days)

  • Add deleted_at column to tenants table
  • Implement 30-day retention
  • Add restoration endpoint
  • Add cleanup job for expired deletions
  • Update queries to filter deleted tenants
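
The 30-day retention rule can be expressed as two small predicates: one for the restoration endpoint and one for the cleanup job. The function names are hypothetical, and treating exactly 30 days as expired is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

RETENTION = timedelta(days=30)


def is_restorable(deleted_at: Optional[datetime], now: datetime) -> bool:
    """True while a soft-deleted tenant is still inside the retention window."""
    return deleted_at is not None and now - deleted_at < RETENTION


def is_expired(deleted_at: Optional[datetime], now: datetime) -> bool:
    """True once the retention window has passed and cleanup may hard-delete."""
    return deleted_at is not None and now - deleted_at >= RETENTION
```

Passing `now` in explicitly keeps both checks easy to test with fixed timestamps.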

Advanced Features (1 week)

  • WebSocket progress updates
  • Email notifications on completion
  • Deletion reports (PDF download)
  • Scheduled deletions
  • Deletion preview aggregation

Sign-Off Checklist

Code Quality

  • All services implemented
  • All endpoints tested
  • No compiler warnings
  • Code reviewed
  • Documentation complete

Testing

  • Unit tests passing (>80% coverage)
  • Integration tests passing
  • E2E tests passing
  • Performance tests passing
  • Manual testing complete

Production Readiness

  • Monitoring configured
  • Alerts configured
  • Logging verified
  • Rollback plan documented
  • Runbook created

Security & Compliance

  • Authorization verified
  • Audit logging enabled
  • GDPR compliance verified
  • Data retention policy documented
  • Security review completed

Quick Reference

Files to Create (3 new services):

  1. services/pos/app/services/tenant_deletion_service.py
  2. services/external/app/services/tenant_deletion_service.py
  3. services/alert_processor/app/services/tenant_deletion_service.py

Files to Create During Refactoring (3 existing services):

  1. services/forecasting/app/services/tenant_deletion_service.py
  2. services/training/app/services/tenant_deletion_service.py
  3. services/notification/app/services/tenant_deletion_service.py

Files to Update (integration):

  1. services/auth/app/services/admin_delete.py

Tests to Write (~50 tests):

  • 10 unit tests (base classes)
  • 24 service-specific tests (2 per service × 12 services)
  • 10 integration tests
  • 6 E2E tests

Time Estimate:

  • Implementation: 4 hours
  • Testing: 2 days
  • Deployment: 2 days
  • Total: ~5 days

Success Criteria

  • All 12 services have deletion logic
  • All deletion endpoints working
  • Orchestrator coordinating successfully
  • Job tracking persisted to database
  • All tests passing
  • Performance acceptable (<5 min for large tenants)
  • Monitoring in place
  • Documentation complete
  • Production deployment successful


Keep this checklist handy and mark items as you complete them!

Remember: Templates and examples are in QUICK_START_REMAINING_SERVICES.md