POI Detection System - Implementation Documentation
Overview
The POI (Point of Interest) Detection System is a comprehensive location-based feature engineering solution for bakery demand forecasting. It automatically detects nearby points of interest (schools, offices, transport hubs, competitors, etc.) and generates ML features that improve prediction accuracy for location-specific demand patterns.
System Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Bakery SaaS Platform │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ External Data Service (POI MODULE) │ │
│ ├──────────────────────────────────────────────────────────┤ │
│ │ POI Detection Service → Overpass API (OpenStreetMap) │ │
│ │ POI Feature Selector → Relevance Filtering │ │
│ │ Competitor Analyzer → Competitive Pressure Modeling │ │
│ │ POI Cache Service → Redis (90-day TTL) │ │
│ │ TenantPOIContext → PostgreSQL Storage │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │ │
│ │ POI Features │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Training Service (ENHANCED) │ │
│ ├──────────────────────────────────────────────────────────┤ │
│ │ Training Data Orchestrator → Fetches POI Features │ │
│ │ Data Processor → Merges POI Features into Training Data │ │
│ │ Prophet + XGBoost Trainer → Uses POI as Regressors │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │ │
│ │ Trained Models │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Forecasting Service (ENHANCED) │ │
│ ├──────────────────────────────────────────────────────────┤ │
│ │ POI Feature Service → Fetches POI Features │ │
│ │ Prediction Engine → Uses Same POI Features as Training │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
Implementation Status
✅ Phase 1: Core POI Detection Infrastructure (COMPLETED)
Files Created:
- services/external/app/models/poi_context.py - POI context data model
- services/external/app/core/poi_config.py - POI categories and configuration
- services/external/app/services/poi_detection_service.py - POI detection via Overpass API
- services/external/app/services/poi_feature_selector.py - Feature relevance filtering
- services/external/app/services/competitor_analyzer.py - Competitive pressure analysis
- services/external/app/cache/poi_cache_service.py - Redis caching layer
- services/external/app/repositories/poi_context_repository.py - Data access layer
- services/external/app/api/poi_context.py - REST API endpoints
- services/external/app/core/redis_client.py - Redis client accessor
- services/external/migrations/versions/20251110_1554_add_poi_context.py - Database migration
Files Modified:
- services/external/app/main.py - Added POI router and table
- services/external/requirements.txt - Added overpy dependency
Key Features:
- 9 POI categories: schools, offices, gyms/sports, residential, tourism, competitors, transport hubs, coworking, retail
- Research-based search radii (400m-1000m) per category
- Multi-tier feature engineering:
  - Tier 1: Proximity-weighted scores (PRIMARY)
  - Tier 2: Distance band counts (0-100m, 100-300m, 300-500m, 500-1000m)
  - Tier 3: Distance to nearest POI
  - Tier 4: Binary flags
- Feature relevance thresholds to filter low-signal features
- Competitive pressure modeling with market classification
- 90-day Redis cache with 180-day refresh cycle
- Complete REST API for detection, retrieval, refresh, deletion
✅ Phase 2: ML Training Pipeline Integration (COMPLETED)
Files Created:
services/training/app/ml/poi_feature_integrator.py- POI feature integration for training
Files Modified:
- services/training/app/services/training_orchestrator.py:
  - Added poi_features to TrainingDataSet
  - Added POIFeatureIntegrator initialization
  - Modified _collect_external_data to fetch POI features concurrently
  - Added _collect_poi_features method
  - Updated TrainingDataSet creation to include POI features
- services/training/app/ml/data_processor.py:
  - Added poi_features parameter to prepare_training_data
  - Added _add_poi_features method
  - Integrated POI features into training data preparation flow
  - Added poi_features parameter to prepare_prediction_features
  - Added POI features to prediction feature generation
- services/training/app/ml/trainer.py:
  - Updated training calls to pass poi_features from training_dataset
  - Updated test data preparation to include POI features
Key Features:
- Automatic POI feature fetching during training data preparation
- POI features added as static columns (broadcast to all dates)
- Concurrent fetching with weather and traffic data
- Graceful fallback if POI service unavailable
- Feature consistency between training and testing
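The "static columns" idea above can be sketched in a few lines. This is a minimal illustration, not the project's _add_poi_features implementation: the function name, the row layout (a list of dicts rather than a dataframe), and the sample values are assumptions made for the example.

```python
from typing import Dict, List


def add_poi_features(daily_rows: List[dict], poi_features: Dict[str, float]) -> List[dict]:
    """Return copies of the daily rows with every POI feature attached.

    POI features describe the location, not the date, so each row
    receives the same values. An empty feature dict leaves the rows
    unchanged, which mirrors the graceful-fallback case.
    """
    if not poi_features:
        return [dict(row) for row in daily_rows]
    return [{**row, **poi_features} for row in daily_rows]


# Hypothetical sample data for illustration only.
rows = [
    {"date": "2025-01-01", "quantity": 120},
    {"date": "2025-01-02", "quantity": 135},
]
features = {"poi_schools_proximity_score": 3.45, "poi_schools_count_0_100m": 2}
enriched = add_poi_features(rows, features)
```

Every row in `enriched` carries identical POI values, while the input rows are left untouched.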
✅ Phase 3: Forecasting Service Integration (COMPLETED)
Files Created:
services/forecasting/app/services/poi_feature_service.py- POI feature service for forecasting
Files Modified:
- services/forecasting/app/ml/predictor.py:
  - Added POIFeatureService initialization
  - Modified _prepare_prophet_dataframe to fetch POI features
  - Ensured feature parity between training and prediction
Key Features:
- POI features fetched from External service for each prediction
- Same POI features used in both training and prediction (consistency)
- Automatic feature retrieval based on tenant_id
- Graceful handling of missing POI context
✅ Phase 4: Frontend POI Visualization (COMPLETED)
Status: Complete frontend implementation with geocoding and visualization
Files Created:
- frontend/src/types/poi.ts - Complete TypeScript type definitions with POI_CATEGORY_METADATA
- frontend/src/services/api/poiContextApi.ts - API client for POI operations
- frontend/src/services/api/geocodingApi.ts - Geocoding API client (Nominatim)
- frontend/src/hooks/usePOIContext.ts - React hook for POI state management
- frontend/src/hooks/useAddressAutocomplete.ts - Address autocomplete hook with debouncing
- frontend/src/components/ui/AddressAutocomplete.tsx - Reusable address input component
- frontend/src/components/domain/settings/POIMap.tsx - Interactive Leaflet map with POI markers
- frontend/src/components/domain/settings/POISummaryCard.tsx - POI summary statistics card
- frontend/src/components/domain/settings/POICategoryAccordion.tsx - Expandable category details
- frontend/src/components/domain/settings/POIContextView.tsx - Main POI management view
- frontend/src/components/domain/onboarding/steps/POIDetectionStep.tsx - Onboarding wizard step
Key Features:
- Address autocomplete with real-time suggestions (Nominatim API)
- Interactive map with color-coded POI markers by category
- Distance rings visualization (100m, 300m, 500m)
- Detailed category analysis with distance distribution
- Automatic POI detection during onboarding
- POI refresh functionality with competitive insights
- Full TypeScript type safety
- Map with bakery marker at center
- Color-coded POI markers by category
- Distance rings (100m, 300m, 500m)
- Expandable category accordions with details
- Refresh button for manual POI re-detection
- Integration into Settings page and Onboarding wizard
✅ Phase 5: Background Refresh Jobs & Geocoding (COMPLETED)
Status: Complete implementation of periodic POI refresh and address geocoding
Files Created (Background Jobs):
- services/external/app/models/poi_refresh_job.py - POI refresh job data model
- services/external/app/services/poi_refresh_service.py - POI refresh job management service
- services/external/app/services/poi_scheduler.py - Background scheduler for periodic refresh
- services/external/app/api/poi_refresh_jobs.py - REST API for job management
- services/external/migrations/versions/20251110_1801_df9709132952_add_poi_refresh_jobs_table.py - Database migration
Files Created (Geocoding):
- services/external/app/services/nominatim_service.py - Nominatim geocoding service
- services/external/app/api/geocoding.py - Geocoding REST API endpoints
Files Modified:
- services/external/app/main.py - Integrated scheduler startup/shutdown, added routers
- services/external/app/api/poi_context.py - Auto-schedules refresh job after POI detection
Key Features - Background Refresh:
- Automatic 6-month refresh cycle: Jobs scheduled 180 days after initial POI detection
- Hourly scheduler: Checks for pending jobs every hour and executes them
- Change detection: Analyzes differences between old and new POI results
- Retry logic: Up to 3 attempts with 1-hour retry delay
- Concurrent execution: Configurable max concurrent jobs (default: 5)
- Job tracking: Complete audit trail with status, timestamps, results, errors
- Manual triggers: API endpoints for immediate job execution
- Auto-scheduling: Next refresh automatically scheduled on completion
Key Features - Geocoding:
- Address autocomplete: Real-time suggestions from Nominatim API
- Forward geocoding: Convert address to coordinates
- Reverse geocoding: Convert coordinates to address
- Rate limiting: Respects 1 req/sec for public Nominatim API
- Production ready: Easy switch to self-hosted Nominatim instance
- Country filtering: Default to Spain (configurable)
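The 1 req/sec limit mentioned above can be enforced with a small gate placed in front of each outgoing request. The class below is a hedged sketch rather than the code in nominatim_service.py: the class name, its methods, and the injectable clock and sleep parameters are all assumptions made so the behaviour is easy to test.

```python
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Blocks so that consecutive calls are at least `min_interval` seconds apart."""

    def __init__(self, min_interval: float = 1.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None  # time of the previous request

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return seconds waited."""
        waited = 0.0
        if self._last_call is not None:
            remaining = self.min_interval - (self._clock() - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        self._last_call = self._clock()
        return waited
```

A caller would invoke `limiter.wait()` immediately before each request to the public Nominatim endpoint. A self-hosted instance could use a much smaller interval or skip the limiter entirely.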
Background Job API Endpoints:
- POST /api/v1/poi-refresh-jobs/schedule - Schedule a refresh job
- GET /api/v1/poi-refresh-jobs/{job_id} - Get job details
- GET /api/v1/poi-refresh-jobs/tenant/{tenant_id} - Get tenant's jobs
- POST /api/v1/poi-refresh-jobs/{job_id}/execute - Manually execute job
- GET /api/v1/poi-refresh-jobs/pending - Get pending jobs
- POST /api/v1/poi-refresh-jobs/process-pending - Process all pending jobs
- POST /api/v1/poi-refresh-jobs/trigger-scheduler - Trigger immediate scheduler check
- GET /api/v1/poi-refresh-jobs/scheduler/status - Get scheduler status
Geocoding API Endpoints:
- GET /api/v1/geocoding/search?q={query} - Address search/autocomplete
- GET /api/v1/geocoding/geocode?address={address} - Forward geocoding
- GET /api/v1/geocoding/reverse?lat={lat}&lon={lon} - Reverse geocoding
- GET /api/v1/geocoding/validate?lat={lat}&lon={lon} - Coordinate validation
- GET /api/v1/geocoding/health - Service health check
Scheduler Lifecycle:
- Startup: Scheduler automatically starts with External service
- Runtime: Runs in background, checking every 3600 seconds (1 hour)
- Shutdown: Gracefully stops when service shuts down
- Immediate check: Can be triggered via API for testing/debugging
POI Categories & Configuration
Detected Categories
| Category | OSM Query | Search Radius | Weight | Impact |
|---|---|---|---|---|
| Schools | amenity~"school\|kindergarten\|university" | 500m | 1.5 | Morning drop-off rush |
| Offices | office | 800m | 1.3 | Weekday lunch demand |
| Gyms/Sports | leisure~"fitness_centre\|sports_centre" | 600m | 0.8 | Morning/evening activity |
| Residential | building~"residential\|apartments" | 400m | 1.0 | Base demand |
| Tourism | tourism~"attraction\|museum\|hotel" | 1000m | 1.2 | Tourist foot traffic |
| Competitors | shop~"bakery\|pastry" | 1000m | -0.5 | Competition pressure |
| Transport Hubs | railway~"station\|subway_entrance" | 800m | 1.4 | Commuter traffic |
| Coworking | amenity="coworking_space" | 600m | 1.1 | Flexible workers |
| Retail | shop | 500m | 0.9 | General foot traffic |
Feature Relevance Thresholds
Features are only included in ML models if they pass relevance criteria:
Example - Schools:
- min_proximity_score: 0.5 (moderate proximity required)
- max_distance_to_nearest_m: 500 (must be within 500m)
- min_count: 1 (at least 1 school)
If a bakery has no schools within 500m → school features NOT added (prevents noise)
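The schools example can be read as a three-part check. The sketch below assumes the three criteria are combined with a logical AND, which the text does not state explicitly. The `RelevanceThreshold` and `is_relevant` names are hypothetical and not taken from poi_feature_selector.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelevanceThreshold:
    min_proximity_score: float
    max_distance_to_nearest_m: float
    min_count: int


# Threshold values from the schools example above.
SCHOOLS = RelevanceThreshold(
    min_proximity_score=0.5,
    max_distance_to_nearest_m=500,
    min_count=1,
)


def is_relevant(proximity_score: float, distance_to_nearest_m: float,
                count: int, threshold: RelevanceThreshold) -> bool:
    """Keep a category only when it satisfies all three criteria (assumed AND)."""
    return (
        count >= threshold.min_count
        and distance_to_nearest_m <= threshold.max_distance_to_nearest_m
        and proximity_score >= threshold.min_proximity_score
    )
```

With these thresholds, a bakery with two schools 200m away passes, while a bakery with no schools in range fails on `min_count` and its school features are dropped.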
Feature Engineering Strategy
Hybrid Multi-Tier Approach
Research Basis: Academic studies (2023-2024) show single-method approaches underperform
Tier 1: Proximity-Weighted Scores (PRIMARY)
proximity_score = Σ(1 / (1 + distance_km)) for each POI
weighted_proximity_score = proximity_score × category.weight
Example:
- Bakery 200m from 5 schools: score = 5 × (1/1.2) = 4.17
- Bakery 100m from 1 school: score = 1 × (1/1.1) = 0.91
- The first bakery has the higher school impact despite being farther away, because the score sums over every nearby school.
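The Tier 1 formula and the two example bakeries can be reproduced directly. This is a minimal sketch of the formula as written above. The function names are illustrative, and distances are assumed to be given in metres and converted to kilometres inside the function.

```python
from typing import Iterable


def proximity_score(distances_m: Iterable[float]) -> float:
    """Sum of 1 / (1 + distance_km) over all POIs in one category."""
    return sum(1.0 / (1.0 + d / 1000.0) for d in distances_m)


def weighted_proximity_score(distances_m: Iterable[float], weight: float) -> float:
    """Proximity score scaled by the category weight (e.g. 1.5 for schools)."""
    return proximity_score(distances_m) * weight


five_schools = proximity_score([200] * 5)  # 5 × (1 / 1.2)
one_school = proximity_score([100])        # 1 × (1 / 1.1)
```

Rounded to two decimals, `five_schools` is 4.17 and `one_school` is 0.91, matching the example.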
Tier 2: Distance Band Counts
count_0_100m = count(POIs within 100m)
count_100_300m = count(POIs within 100-300m)
count_300_500m = count(POIs within 300-500m)
count_500_1000m = count(POIs within 500-1000m)
Tier 3: Distance to Nearest
distance_to_nearest_m = min(distances)
Tier 4: Binary Flags
has_within_100m = any(distance <= 100m)
has_within_300m = any(distance <= 300m)
has_within_500m = any(distance <= 500m)
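Tiers 2 to 4 can all be derived from one list of distances. The sketch below makes two assumptions that the definitions above leave open: band upper edges are inclusive (so a POI at exactly 100m counts in the 0-100m band, consistent with the `<=` in the binary flags), and the nearest distance is None when a category has no POIs. The function name is hypothetical.

```python
from typing import Dict, List, Union

FeatureValue = Union[int, float, bool, None]

# Band edges in metres, taken from the Tier 2 definitions.
BANDS = [(0, 100), (100, 300), (300, 500), (500, 1000)]


def distance_features(distances_m: List[float]) -> Dict[str, FeatureValue]:
    features: Dict[str, FeatureValue] = {}

    # Tier 2: counts per band (upper edge inclusive, lower edge exclusive
    # except for the first band, which starts at 0).
    for low, high in BANDS:
        features[f"count_{low}_{high}m"] = sum(
            1 for d in distances_m if d <= high and (d > low or low == 0)
        )

    # Tier 3: distance to the nearest POI, or None for an empty category.
    features["distance_to_nearest_m"] = min(distances_m) if distances_m else None

    # Tier 4: binary flags.
    for limit in (100, 300, 500):
        features[f"has_within_{limit}m"] = any(d <= limit for d in distances_m)

    return features
```

Calling `distance_features([85, 100, 250, 480, 900])` yields two POIs in the first band, one in each of the others, a nearest distance of 85, and all three flags set.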
Competitive Pressure Modeling
Special treatment for competitor bakeries:
Zones:
- Direct (<100m): -1.0 multiplier per competitor (strong negative)
- Nearby (100-500m): -0.5 multiplier (moderate negative)
- Market (500-1000m):
- If 5+ bakeries → +0.3 (bakery district = destination area)
- If 2-4 bakeries → -0.2 (competitive market)
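One way to turn the zone rules into a single score is sketched below. Several details are assumptions, since the rules above do not fix them: the nearby multiplier is applied per competitor (as stated for the direct zone), the market-zone adjustment is applied once rather than per bakery, fewer than two market-zone bakeries contribute nothing, and zone boundaries are half-open. The function name is hypothetical and this is not the code in competitor_analyzer.py.

```python
from typing import List


def competitive_pressure_score(competitor_distances_m: List[float]) -> float:
    direct = sum(1 for d in competitor_distances_m if d < 100)
    nearby = sum(1 for d in competitor_distances_m if 100 <= d < 500)
    market = sum(1 for d in competitor_distances_m if 500 <= d <= 1000)

    # Direct and nearby zones: negative contribution per competitor.
    score = -1.0 * direct - 0.5 * nearby

    # Market zone: a dense cluster is treated as a destination area,
    # a small cluster as a competitive market.
    if market >= 5:
        score += 0.3
    elif market >= 2:
        score -= 0.2
    return score
```

Under these assumptions, one direct competitor plus one nearby competitor gives -1.5, the same value as the `competitive_pressure_score` in the sample detection response.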
API Endpoints
POST /api/v1/poi-context/{tenant_id}/detect
Detect POIs for a tenant's bakery location.
Query Parameters:
- latitude (float, required): Bakery latitude
- longitude (float, required): Bakery longitude
- force_refresh (bool, optional): Force re-detection, skip cache
Response:
{
  "status": "success",
  "source": "detection", // or "cache"
  "poi_context": {
    "id": "uuid",
    "tenant_id": "uuid",
    "location": {"latitude": 40.4168, "longitude": -3.7038},
    "total_pois_detected": 42,
    "high_impact_categories": ["schools", "transport_hubs"],
    "ml_features": {
      "poi_schools_proximity_score": 3.45,
      "poi_schools_count_0_100m": 2,
      "poi_schools_distance_to_nearest_m": 85.0,
      // ... 81+ more features
    }
  },
  "feature_selection": {
    "relevant_categories": ["schools", "transport_hubs", "offices"],
    "relevance_report": [...]
  },
  "competitor_analysis": {
    "competitive_pressure_score": -1.5,
    "direct_competitors_count": 1,
    "competitive_zone": "high_competition",
    "market_type": "competitive_market"
  },
  "competitive_insights": [
    "⚠️ High competition: 1 direct competitor(s) within 100m. Focus on differentiation and quality."
  ]
}
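As a usage illustration, the request URL for the detect endpoint can be assembled from the path and query parameters listed above. The base URL and the placeholder tenant ID below are assumptions, and the helper name is hypothetical. Only the path and the parameter names come from this document.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # hypothetical host for the External service


def build_detect_url(tenant_id: str, latitude: float, longitude: float,
                     force_refresh: bool = False) -> str:
    """Build the URL for POST /api/v1/poi-context/{tenant_id}/detect."""
    query = urlencode({
        "latitude": latitude,
        "longitude": longitude,
        "force_refresh": str(force_refresh).lower(),
    })
    return f"{BASE_URL}/api/v1/poi-context/{tenant_id}/detect?{query}"


url = build_detect_url("TENANT_ID", 40.4168, -3.7038)
```

The resulting URL can then be sent as a POST with any HTTP client, for example `urllib.request.Request(url, method="POST")`.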
GET /api/v1/poi-context/{tenant_id}
Retrieve stored POI context for a tenant.
Response:
{
  "poi_context": {...},
  "is_stale": false,
  "needs_refresh": false
}
POST /api/v1/poi-context/{tenant_id}/refresh
Refresh POI context (re-detect POIs).
DELETE /api/v1/poi-context/{tenant_id}
Delete POI context for a tenant.
GET /api/v1/poi-context/{tenant_id}/feature-importance
Get feature importance summary.
GET /api/v1/poi-context/{tenant_id}/competitor-analysis
Get detailed competitor analysis.
GET /api/v1/poi-context/health
Check POI detection service health (Overpass API accessibility).
GET /api/v1/poi-context/cache/stats
Get cache statistics (key count, memory usage).
Database Schema
Table: tenant_poi_contexts
CREATE TABLE tenant_poi_contexts (
    id UUID PRIMARY KEY,
    tenant_id UUID UNIQUE NOT NULL,

    -- Location
    latitude FLOAT NOT NULL,
    longitude FLOAT NOT NULL,

    -- POI Detection Data
    poi_detection_results JSONB NOT NULL DEFAULT '{}',
    ml_features JSONB NOT NULL DEFAULT '{}',
    total_pois_detected INTEGER DEFAULT 0,
    high_impact_categories JSONB DEFAULT '[]',
    relevant_categories JSONB DEFAULT '[]',

    -- Detection Metadata
    detection_timestamp TIMESTAMP WITH TIME ZONE NOT NULL,
    detection_source VARCHAR(50) DEFAULT 'overpass_api',
    detection_status VARCHAR(20) DEFAULT 'completed',
    detection_error VARCHAR(500),

    -- Refresh Strategy
    next_refresh_date TIMESTAMP WITH TIME ZONE,
    refresh_interval_days INTEGER DEFAULT 180,
    last_refreshed_at TIMESTAMP WITH TIME ZONE,

    -- Timestamps
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
CREATE INDEX idx_tenant_poi_location ON tenant_poi_contexts (latitude, longitude);
CREATE INDEX idx_tenant_poi_refresh ON tenant_poi_contexts (next_refresh_date);
CREATE INDEX idx_tenant_poi_status ON tenant_poi_contexts (detection_status);
ML Model Integration
Training Pipeline
POI features are automatically fetched and integrated during training:
# TrainingDataOrchestrator fetches POI features
poi_features = await poi_feature_integrator.fetch_poi_features(
    tenant_id=tenant_id,
    latitude=lat,
    longitude=lon,
)

# Features added to TrainingDataSet
training_dataset = TrainingDataSet(
    sales_data=filtered_sales,
    weather_data=weather_data,
    traffic_data=traffic_data,
    poi_features=poi_features,  # NEW
    # ... remaining fields unchanged
)

# Data processor merges POI features into training data
daily_sales = self._add_poi_features(daily_sales, poi_features)

# Prophet model uses POI features as regressors
for feature_name in poi_features:
    model.add_regressor(feature_name, mode='additive')
Forecasting Pipeline
POI features are fetched and used for predictions:
# POI Feature Service retrieves features
poi_features = await poi_feature_service.get_poi_features(tenant_id)

# Features added to prediction dataframe
df = await data_processor.prepare_prediction_features(
    future_dates=future_dates,
    weather_forecast=weather_df,
    poi_features=poi_features,  # SAME features as training
    # ... remaining arguments unchanged
)

# Prophet generates forecast with POI features
forecast = model.predict(df)
Feature Consistency
Critical: POI features MUST be identical in training and prediction!
- Training: POI features fetched from External service
- Prediction: POI features fetched from External service (same tenant)
- Features are static (location-based, don't vary by date)
- Storing the features in TenantPOIContext ensures consistency
Performance Optimizations
Caching Strategy
Redis Cache:
- TTL: 90 days
- Cache key: Rounded coordinates (4 decimals ≈ 10m precision)
- Allows reuse for bakeries in close proximity
- Reduces Overpass API load
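The rounded-coordinate cache key can be sketched as follows. The "poi:" prefix and the key layout are assumptions for illustration. Only the 4-decimal rounding and the 90-day TTL come from this document.

```python
CACHE_TTL_SECONDS = 90 * 24 * 60 * 60  # 90 days, as stated above


def poi_cache_key(latitude: float, longitude: float, precision: int = 4) -> str:
    """Build a cache key from coordinates rounded to `precision` decimals.

    The "poi:" prefix is a hypothetical naming choice.
    """
    return f"poi:{latitude:.{precision}f}:{longitude:.{precision}f}"


key_a = poi_cache_key(40.41681, -3.70379)
key_b = poi_cache_key(40.41684, -3.70384)
```

Both calls produce the same key, so two bakeries a few metres apart share one cached detection result.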
Database Storage:
- POI context stored in PostgreSQL
- Refresh cycle: 180 days (6 months)
- Background job refreshes stale contexts
API Rate Limiting
Overpass API:
- Public endpoint: Rate limited
- Retry logic: 3 attempts with 2-second delay
- Timeout: 30 seconds per query
- Concurrent queries: All POI categories fetched in parallel
Recommendation: Self-host Overpass API instance for production
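The retry and timeout settings listed above can be expressed as a small async wrapper. This is a sketch under assumptions: the `with_retries` helper and its parameters are hypothetical, and it catches all exceptions for brevity, where a real client would catch only network and timeout errors.

```python
import asyncio
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def with_retries(query: Callable[[], Awaitable[T]], attempts: int = 3,
                       delay_s: float = 2.0, timeout_s: float = 30.0) -> T:
    """Run `query` up to `attempts` times, waiting `delay_s` between tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[BaseException] = None
    for attempt in range(1, attempts + 1):
        try:
            # Each attempt is bounded by the per-query timeout.
            return await asyncio.wait_for(query(), timeout=timeout_s)
        except Exception as exc:  # narrowed to specific errors in real code
            last_error = exc
            if attempt < attempts:
                await asyncio.sleep(delay_s)
    assert last_error is not None
    raise last_error
```

Parallel category fetching then amounts to wrapping each category query in `with_retries` and awaiting them together with `asyncio.gather`.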
Testing & Validation
Model Performance Impact
Expected improvements with POI features:
- MAPE improvement: 5-10% for bakeries with significant POI presence
- Accuracy maintained: For bakeries with no relevant POIs (features filtered out)
- Feature count: 81+ POI features per bakery (if all categories relevant)
A/B Testing
Compare models with and without POI features:
# Model A: Without POI features
model_a = train_model(sales, weather, traffic)
# Model B: With POI features
model_b = train_model(sales, weather, traffic, poi_features)
# Compare MAPE, MAE, R² score
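For the comparison step, MAE and MAPE can be computed with plain Python. The helper names and the numbers below are made-up illustrations, not measured results, and skipping zero-sales days in MAPE is an assumption about how division by zero is handled.

```python
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error in percent, skipping zero-sales days."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


# Illustrative numbers only.
actual = [100, 120, 80]
forecast_a = [90, 130, 100]  # hypothetical model without POI features
forecast_b = [95, 125, 90]   # hypothetical model with POI features

improvement = mape(actual, forecast_a) - mape(actual, forecast_b)
```

A positive `improvement` means Model B has the lower MAPE on the held-out period. The comparison should use the same test dates for both models.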
Troubleshooting
Common Issues
1. No POI context found
- Cause: POI detection not run during onboarding
- Fix: Call the /api/v1/poi-context/{tenant_id}/detect endpoint
2. Overpass API timeout
- Cause: API overload or network issues
- Fix: Retry mechanism handles this automatically; check health endpoint
3. POI features not in model
- Cause: Feature relevance thresholds filter out low-signal features
- Fix: Expected behavior; check relevance report
4. Feature count mismatch between training and prediction
- Cause: POI context refreshed between training and prediction
- Fix: Models store feature manifest; prediction uses same features
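The manifest idea in issue 4 can be sketched as a projection of the current POI features onto the names recorded at training time. The function name, the manifest format (a list of feature names), and the choice to fill missing features with 0.0 are assumptions, not the project's confirmed behaviour.

```python
from typing import Dict, List


def align_to_manifest(poi_features: Dict[str, float], manifest: List[str],
                      fill_value: float = 0.0) -> Dict[str, float]:
    """Return exactly the features named in the manifest, in manifest order.

    Features added by a later refresh are dropped. Features that have
    disappeared are filled with `fill_value` (an assumed default).
    """
    return {name: poi_features.get(name, fill_value) for name in manifest}


# Hypothetical manifest saved with the model, and features after a refresh.
manifest = ["poi_schools_proximity_score", "poi_offices_proximity_score"]
refreshed = {"poi_schools_proximity_score": 3.1, "poi_retail_proximity_score": 2.0}
aligned = align_to_manifest(refreshed, manifest)
```

The aligned dict always has the same keys as the trained model's regressors, so the prediction dataframe keeps the column set the model expects.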
Future Enhancements
- Neighborhood Clustering
  - Group bakeries by neighborhood type (business district, residential, tourist)
  - Reduce from 81+ individual features to 4-5 cluster features
  - Enable transfer learning across similar neighborhoods
- Automated POI Verification
  - User confirmation of auto-detected POIs
  - Manual addition/removal of POIs
- Temporal POI Features
  - School session times (morning vs. afternoon)
  - Office hours variations (hybrid work)
  - Event-based POIs (concerts, sports matches)
- Multi-City Support
  - City-specific POI weights
  - Regional calendar integration (school holidays vary by region)
- POI Change Detection
  - Monitor for new POIs (e.g., new school opens)
  - Automatic re-training when significant POI changes detected
References
Academic Research
- "Gravity models for potential spatial healthcare access measurement" (2023)
- "What determines travel time and distance decay in spatial interaction" (2024)
- "Location Profiling for Retail-Site Recommendation Using Machine Learning" (2024)
- "Predicting ride-hailing passenger demand: A POI-based adaptive clustering" (2024)
Technical Documentation
- Overpass API: https://wiki.openstreetmap.org/wiki/Overpass_API
- OpenStreetMap Tags: https://wiki.openstreetmap.org/wiki/Map_features
- Facebook Prophet: https://facebook.github.io/prophet/
License & Attribution
POI data from OpenStreetMap contributors (© OpenStreetMap contributors).
Licensed under the Open Database License (ODbL).