# Timezone-Aware Datetime Fix
**Date:** 2025-10-09
**Status:** ✅ RESOLVED
## Problem
Error in forecasting service logs:
```
[error] Failed to get cached prediction
error=can't compare offset-naive and offset-aware datetimes
```
## Root Cause
The forecasting service models declare every timestamp column as `DateTime(timezone=True)`, so values loaded from the database are timezone-aware `datetime` objects. The code, however, called `datetime.utcnow()` throughout, which returns timezone-naive objects.
When the two kinds are compared with an ordering operator (e.g., checking whether a cache entry has expired), Python raises:
```
TypeError: can't compare offset-naive and offset-aware datetimes
```
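The failure can be reproduced without a database. The sketch below uses fixed example values: `expires_at` stands in for a value loaded from a `DateTime(timezone=True)` column, and `naive_now` for what `datetime.utcnow()` would return. The `compare` helper is hypothetical and exists only to capture the error message.

```python
from datetime import datetime, timezone

# Stand-in for a value read from a DateTime(timezone=True) column (aware).
expires_at = datetime(2025, 10, 9, 10, 0, 0, tzinfo=timezone.utc)

# Same shape as the result of datetime.utcnow(): no tzinfo attached (naive).
naive_now = datetime(2025, 10, 9, 9, 10, 37)

# Same shape as the result of datetime.now(timezone.utc) (aware).
aware_now = datetime(2025, 10, 9, 9, 10, 37, tzinfo=timezone.utc)


def compare(a, b):
    """Return `a < b`, or the TypeError message if the comparison fails."""
    try:
        return a < b
    except TypeError as exc:
        return str(exc)


naive_result = compare(expires_at, naive_now)   # the TypeError message
aware_result = compare(expires_at, aware_now)   # a plain bool
print(naive_result)
print(aware_result)
```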
## Database Schema
All datetime columns in forecasting service models use `DateTime(timezone=True)`:
```python
# From app/models/predictions.py
class PredictionCache(Base):
    forecast_date = Column(DateTime(timezone=True), nullable=False)
    created_at = Column(DateTime(timezone=True), default=lambda: datetime.now(timezone.utc))
    expires_at = Column(DateTime(timezone=True), nullable=False)  # ← Compared with datetime.utcnow()
    # ... other columns

class ModelPerformanceMetric(Base):
    evaluation_date = Column(DateTime(timezone=True), nullable=False)
    created_at = Column(DateTime(timezone=True), default=lambda: datetime.now(timezone.utc))
    # ... other columns

# From app/models/forecasts.py
class Forecast(Base):
    forecast_date = Column(DateTime(timezone=True), nullable=False, index=True)
    created_at = Column(DateTime(timezone=True), default=lambda: datetime.now(timezone.utc))

class PredictionBatch(Base):
    requested_at = Column(DateTime(timezone=True), default=lambda: datetime.now(timezone.utc))
    completed_at = Column(DateTime(timezone=True))
```
## Solution
Replaced all `datetime.utcnow()` calls with `datetime.now(timezone.utc)` throughout the forecasting service.
### Before (BROKEN):
```python
# Returns timezone-naive datetime
cache_entry.expires_at < datetime.utcnow() # TypeError!
```
### After (WORKING):
```python
# Returns timezone-aware datetime
cache_entry.expires_at < datetime.now(timezone.utc) # Works!
```
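One way to keep such a check from regressing is to wrap it in a small function that rejects naive inputs up front. The following is a minimal sketch, not the service's actual code: `is_expired` and its `now` parameter are hypothetical names introduced here for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_expired(expires_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True once `expires_at` has passed. Both values must be aware."""
    if now is None:
        now = datetime.now(timezone.utc)
    for name, value in (("expires_at", expires_at), ("now", now)):
        # A datetime is naive if it has no tzinfo or tzinfo yields no offset.
        if value.tzinfo is None or value.utcoffset() is None:
            raise ValueError(f"{name} must be timezone-aware, got {value!r}")
    return expires_at < now


# Passing `now` explicitly keeps the example deterministic.
fixed_now = datetime(2025, 10, 9, 9, 0, tzinfo=timezone.utc)
print(is_expired(fixed_now - timedelta(minutes=1), now=fixed_now))  # True
print(is_expired(fixed_now + timedelta(hours=24), now=fixed_now))   # False
```

Raising `ValueError` with the offending value is a design choice in this sketch: it surfaces a naive datetime at the call site with a clearer message than the generic `TypeError`.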
## Files Fixed
### 1. Import statements updated
Added `timezone` to imports in all affected files:
```python
from datetime import datetime, timedelta, timezone
```
### 2. All datetime.utcnow() replaced
Fixed in 9 files across the forecasting service:
1. **[services/forecasting/app/repositories/prediction_cache_repository.py](services/forecasting/app/repositories/prediction_cache_repository.py)**
   - Line 53: Cache expiration time calculation
   - Line 105: Cache expiry check (the main error)
   - Line 175: Cleanup expired cache entries
   - Line 212: Cache statistics query
2. **[services/forecasting/app/repositories/prediction_batch_repository.py](services/forecasting/app/repositories/prediction_batch_repository.py)**
   - Lines 84, 113, 143, 184: Batch completion timestamps
   - Line 273: Recent activity queries
   - Line 318: Cleanup old batches
   - Line 357: Batch progress calculations
3. **[services/forecasting/app/repositories/forecast_repository.py](services/forecasting/app/repositories/forecast_repository.py)**
   - Lines 162, 241: Forecast accuracy and trend analysis date ranges
4. **[services/forecasting/app/repositories/performance_metric_repository.py](services/forecasting/app/repositories/performance_metric_repository.py)**
   - Line 101: Performance trends date range calculation
5. **[services/forecasting/app/repositories/base.py](services/forecasting/app/repositories/base.py)**
   - Lines 116, 118: Recent records queries
   - Lines 124, 159, 161: Cleanup and statistics
6. **[services/forecasting/app/services/forecasting_service.py](services/forecasting/app/services/forecasting_service.py)**
   - Lines 292, 365, 393, 409, 447, 553: Processing time calculations and timestamps
7. **[services/forecasting/app/api/forecasting_operations.py](services/forecasting/app/api/forecasting_operations.py)**
   - Line 274: API response timestamps
8. **[services/forecasting/app/api/scenario_operations.py](services/forecasting/app/api/scenario_operations.py)**
   - Lines 68, 134, 163: Scenario simulation timestamps
9. **[services/forecasting/app/services/messaging.py](services/forecasting/app/services/messaging.py)**
   - Message timestamps
## Verification
```bash
# Before fix
$ grep -r "datetime\.utcnow()" services/forecasting/app --include="*.py" | wc -l
20
# After fix
$ grep -r "datetime\.utcnow()" services/forecasting/app --include="*.py" | wc -l
0
```
## Why This Matters
### Timezone-Naive (datetime.utcnow())
```python
>>> datetime.utcnow()
datetime.datetime(2025, 10, 9, 9, 10, 37, 123456) # No timezone info
```
### Timezone-Aware (datetime.now(timezone.utc))
```python
>>> datetime.now(timezone.utc)
datetime.datetime(2025, 10, 9, 9, 10, 37, 123456, tzinfo=datetime.timezone.utc) # Has timezone
```
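PostgreSQL stores `DateTime(timezone=True)` columns as `TIMESTAMP WITH TIME ZONE`, and values read back arrive as timezone-aware `datetime` objects. Ordering comparisons (`<`, `>`, `<=`, `>=`) between these and timezone-naive datetimes raise `TypeError`. In addition, `datetime.utcnow()` is deprecated as of Python 3.12 in favor of `datetime.now(timezone.utc)`.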
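A short sketch of how the two kinds behave under comparison, using fixed example values. The `+02:00` offset is an arbitrary illustration and not a setting taken from the service. Note that equality between a naive and an aware datetime does not raise; it simply evaluates to `False`, so a mixed `==` check can fail silently where `<` fails loudly.

```python
from datetime import datetime, timedelta, timezone

utc_time = datetime(2025, 10, 9, 9, 10, 37, tzinfo=timezone.utc)

# Fixed +02:00 offset, chosen only for illustration.
plus_two = timezone(timedelta(hours=2))
local_time = datetime(2025, 10, 9, 11, 10, 37, tzinfo=plus_two)

# Same wall-clock fields as utc_time, but without tzinfo.
naive_time = datetime(2025, 10, 9, 9, 10, 37)

# Aware datetimes compare by the instant they represent, not by wall-clock fields.
same_instant = utc_time == local_time

# Equality between naive and aware never raises; the values are just unequal.
naive_equal = utc_time == naive_time

print(naive_time.tzinfo)     # None
print(utc_time.utcoffset())  # 0:00:00
print(same_instant, naive_equal)
```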
## Impact
This fix resolves:
- ✅ Cache expiration checks
- ✅ Batch status updates
- ✅ Performance metric queries
- ✅ Forecast analytics date ranges
- ✅ Cleanup operations
- ✅ Recent activity queries
## Best Practice
**Always use timezone-aware datetimes with PostgreSQL `DateTime(timezone=True)` columns:**
```python
# ✅ GOOD
created_at = Column(DateTime(timezone=True), default=lambda: datetime.now(timezone.utc))
expires_at = datetime.now(timezone.utc) + timedelta(hours=24)
if record.created_at < datetime.now(timezone.utc):
    ...
# ❌ BAD
created_at = Column(DateTime(timezone=True), default=datetime.utcnow)  # No timezone!
expires_at = datetime.utcnow() + timedelta(hours=24)  # Naive!
if record.created_at < datetime.utcnow():  # TypeError!
    ...
```
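A possible way to enforce this convention is to route all "current time" calls through one helper and to normalize any datetime that arrives from outside the service. The sketch below is illustrative only: `utc_now` and `as_utc` are hypothetical names, and `as_utc` assumes that a naive input already represents UTC, which must be verified for each data source.

```python
from datetime import datetime, timedelta, timezone


def utc_now() -> datetime:
    """Single entry point for the current time, always timezone-aware."""
    return datetime.now(timezone.utc)


def as_utc(value: datetime) -> datetime:
    """Normalize a datetime to aware UTC.

    ASSUMPTION: a naive value is already expressed in UTC, so the zone is
    attached without shifting the clock fields. Aware values are converted.
    """
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


legacy = datetime(2025, 10, 9, 9, 10, 37)  # naive, e.g. from old code
normalized = as_utc(legacy)

# An aware value in another offset is converted to the same instant in UTC.
shifted = datetime(2025, 10, 9, 11, 10, 37, tzinfo=timezone(timedelta(hours=2)))
converted = as_utc(shifted)
print(normalized, converted)
```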
## Additional Issue Found and Fixed
### Local Import Shadowing
After the initial fix, a new error appeared:
```
[error] Multi-day forecast generation failed
error=cannot access local variable 'timezone' where it is not associated with a value
```
**Cause:** In `forecasting_service.py` line 428, a local import inside a conditional block shadowed the module-level import:
```python
# Module level (line 9)
from datetime import datetime, date, timedelta, timezone

# Inside function (line 428) - PROBLEM
if day_offset > 0:
    from datetime import timedelta, timezone  # ← Makes `timezone` LOCAL to the whole function
    current_date = current_date + timedelta(days=day_offset)

# Later in same function (line 447)
processing_time = (datetime.now(timezone.utc) - start_time)  # ← Error if the import above never ran
```
Python determines a name's scope per function at compile time. Because `timezone` is bound by an import somewhere in the function body, it is treated as a local variable throughout the entire function, and the module-level `timezone` is hidden. That local is only assigned when the `if` branch actually executes. Whenever the branch is skipped (`day_offset` is 0 or less), line 447 reads a local variable that was never assigned, and Python raises `UnboundLocalError`: "cannot access local variable 'timezone' where it is not associated with a value".
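A minimal reproduction, independent of the service code, shows that the import makes the name local to the whole function and that the failure appears only when the branch is skipped. The function name `stamp` is hypothetical.

```python
from datetime import datetime, timezone


def stamp(day_offset: int) -> datetime:
    if day_offset > 0:
        # This import binds `timezone` as a local name for the WHOLE function,
        # hiding the module-level import above.
        from datetime import timezone
    # When day_offset <= 0 the import never ran, so the local is unbound here.
    return datetime.now(timezone.utc)


ok = stamp(1)  # branch taken: the local `timezone` is bound, call succeeds

try:
    stamp(0)   # branch skipped: reading the unbound local fails
    error = None
except UnboundLocalError as exc:
    error = str(exc)

print(ok.tzinfo)
print(error)
```

The exact wording of the message depends on the Python version; the text quoted in the log matches Python 3.11 and later.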
**Fix:** Removed the redundant local import since `timezone` is already imported at module level:
```python
# Before (BROKEN)
if day_offset > 0:
    from datetime import timedelta, timezone
    current_date = current_date + timedelta(days=day_offset)

# After (WORKING)
if day_offset > 0:
    current_date = current_date + timedelta(days=day_offset)
```
**File:** [services/forecasting/app/services/forecasting_service.py](services/forecasting/app/services/forecasting_service.py#L427-L428)
## Deployment
```bash
# Restart forecasting service to apply changes
kubectl -n bakery-ia rollout restart deployment forecasting-service
# Monitor for errors
kubectl -n bakery-ia logs -f deployment/forecasting-service | grep -E "(can't compare|cannot access)"
```
## Related Issues
This same issue may exist in other services. Search for:
```bash
# Find services using timezone-aware columns
grep -r "DateTime(timezone=True)" services/*/app/models --include="*.py"
# Find services using datetime.utcnow()
grep -r "datetime\.utcnow()" services/*/app --include="*.py"
```
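The `grep` pattern above only matches call sites written with parentheses, so a bare reference such as `default=datetime.utcnow` would be missed. As a complement, the following sketch walks the syntax tree and reports every `.utcnow` attribute access. The function names `find_utcnow_references` and `scan_tree` are hypothetical, and the sample source is made up for the example.

```python
import ast
from pathlib import Path


def find_utcnow_references(source: str) -> list[int]:
    """Return the line numbers of every `<expr>.utcnow` attribute access."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Attribute) and node.attr == "utcnow"
    )


def scan_tree(root: str) -> dict[str, list[int]]:
    """Map each .py file under `root` to the lines that reference utcnow."""
    results = {}
    for path in Path(root).rglob("*.py"):
        lines = find_utcnow_references(path.read_text(encoding="utf-8"))
        if lines:
            results[str(path)] = lines
    return results


# The sample is only parsed, never executed, so undefined names are fine.
sample = '''
from datetime import datetime
created_at = Column(DateTime(timezone=True), default=datetime.utcnow)
now = datetime.utcnow()
ok = datetime.now(timezone.utc)
'''
hits = find_utcnow_references(sample)
print(hits)  # [3, 4]
```

Calling `scan_tree("services")` from the repository root would apply the same check to every service, assuming the directory layout shown in the commands above.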
## References
- Python datetime docs: https://docs.python.org/3/library/datetime.html#aware-and-naive-objects
- SQLAlchemy DateTime: https://docs.sqlalchemy.org/en/20/core/type_basics.html#sqlalchemy.types.DateTime
- PostgreSQL TIMESTAMP WITH TIME ZONE: https://www.postgresql.org/docs/current/datatype-datetime.html