# Timezone-Aware Datetime Fix

**Date:** 2025-10-09
**Status:** ✅ RESOLVED

## Problem

Error in forecasting service logs:
```
[error] Failed to get cached prediction
error=can't compare offset-naive and offset-aware datetimes
```

## Root Cause

The forecasting service database uses `DateTime(timezone=True)` for all timestamp columns, so values loaded from them are timezone-aware `datetime` objects. However, the code was using `datetime.utcnow()` throughout, which returns timezone-naive `datetime` objects.

Ordering comparisons between the two kinds (e.g., checking whether a cache entry has expired) make Python raise:
```
TypeError: can't compare offset-naive and offset-aware datetimes
```

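The failure can be reproduced with the standard library alone. The sketch below is illustrative and not taken from the service: `expires_at` stands in for a value read from a `DateTime(timezone=True)` column, `naive_now` stands in for what `datetime.utcnow()` returns, and `compare_error_message` is a hypothetical helper added only to capture the exception text.

```python
from datetime import datetime, timezone

# Stand-in for a value loaded from a DateTime(timezone=True) column (aware).
expires_at = datetime(2025, 10, 9, 12, 0, tzinfo=timezone.utc)

# Stand-in for the result of datetime.utcnow(): no tzinfo attached (naive).
naive_now = datetime(2025, 10, 9, 9, 0)


def compare_error_message(aware: datetime, naive: datetime):
    """Return the TypeError text raised by an ordering comparison, or None."""
    try:
        aware < naive
    except TypeError as exc:
        return str(exc)
    return None


message = compare_error_message(expires_at, naive_now)
print(message)
```

Only ordering comparisons (`<`, `<=`, `>`, `>=`) raise. An equality check between a naive and an aware datetime does not raise and simply evaluates to `False`, which can hide the same mismatch in other code paths.
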
## Database Schema

All datetime columns in forecasting service models use `DateTime(timezone=True)`:

```python
# From app/models/predictions.py
class PredictionCache(Base):
    forecast_date = Column(DateTime(timezone=True), nullable=False)
    created_at = Column(DateTime(timezone=True), default=lambda: datetime.now(timezone.utc))
    expires_at = Column(DateTime(timezone=True), nullable=False)  # ← Compared with datetime.utcnow()
    # ... other columns

class ModelPerformanceMetric(Base):
    evaluation_date = Column(DateTime(timezone=True), nullable=False)
    created_at = Column(DateTime(timezone=True), default=lambda: datetime.now(timezone.utc))
    # ... other columns

# From app/models/forecasts.py
class Forecast(Base):
    forecast_date = Column(DateTime(timezone=True), nullable=False, index=True)
    created_at = Column(DateTime(timezone=True), default=lambda: datetime.now(timezone.utc))

class PredictionBatch(Base):
    requested_at = Column(DateTime(timezone=True), default=lambda: datetime.now(timezone.utc))
    completed_at = Column(DateTime(timezone=True))
```

## Solution

Replaced all `datetime.utcnow()` calls with `datetime.now(timezone.utc)` throughout the forecasting service.

### Before (BROKEN):
```python
# Returns timezone-naive datetime
cache_entry.expires_at < datetime.utcnow()  # ❌ TypeError!
```

### After (WORKING):
```python
# Returns timezone-aware datetime
cache_entry.expires_at < datetime.now(timezone.utc)  # ✅ Works!
```

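One way to keep the expiry comparison in a single place is a small helper. The following is a minimal sketch under stated assumptions: `is_expired` and its `now` parameter are hypothetical names introduced here for illustration, not functions from the forecasting service, and rejecting naive input is a design choice of this sketch rather than documented behavior of the repository code.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_expired(expires_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True if an aware expiry timestamp lies in the past.

    `now` can be injected for testing; it defaults to the current UTC time.
    """
    # A naive datetime reports no UTC offset; fail loudly instead of guessing.
    if expires_at.utcoffset() is None:
        raise ValueError("expires_at must be timezone-aware")
    current = now if now is not None else datetime.now(timezone.utc)
    return expires_at < current


# Usage with fixed, aware timestamps.
reference = datetime(2025, 10, 9, 9, 0, tzinfo=timezone.utc)
print(is_expired(reference - timedelta(hours=1), now=reference))  # past expiry
print(is_expired(reference + timedelta(hours=24), now=reference))  # future expiry
```

Passing `now` explicitly makes the check deterministic in tests, and the explicit `ValueError` surfaces a naive value at the call site instead of deep inside a comparison.
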
## Files Fixed

### 1. Import statements updated
Added `timezone` to imports in all affected files:
```python
from datetime import datetime, timedelta, timezone
```

### 2. All datetime.utcnow() replaced
Fixed in 9 files across the forecasting service:

1. **[services/forecasting/app/repositories/prediction_cache_repository.py](services/forecasting/app/repositories/prediction_cache_repository.py)**
   - Line 53: Cache expiration time calculation
   - Line 105: Cache expiry check (the main error)
   - Line 175: Cleanup expired cache entries
   - Line 212: Cache statistics query

2. **[services/forecasting/app/repositories/prediction_batch_repository.py](services/forecasting/app/repositories/prediction_batch_repository.py)**
   - Lines 84, 113, 143, 184: Batch completion timestamps
   - Line 273: Recent activity queries
   - Line 318: Cleanup old batches
   - Line 357: Batch progress calculations

3. **[services/forecasting/app/repositories/forecast_repository.py](services/forecasting/app/repositories/forecast_repository.py)**
   - Lines 162, 241: Forecast accuracy and trend analysis date ranges

4. **[services/forecasting/app/repositories/performance_metric_repository.py](services/forecasting/app/repositories/performance_metric_repository.py)**
   - Line 101: Performance trends date range calculation

5. **[services/forecasting/app/repositories/base.py](services/forecasting/app/repositories/base.py)**
   - Lines 116, 118: Recent records queries
   - Lines 124, 159, 161: Cleanup and statistics

6. **[services/forecasting/app/services/forecasting_service.py](services/forecasting/app/services/forecasting_service.py)**
   - Lines 292, 365, 393, 409, 447, 553: Processing time calculations and timestamps

7. **[services/forecasting/app/api/forecasting_operations.py](services/forecasting/app/api/forecasting_operations.py)**
   - Line 274: API response timestamps

8. **[services/forecasting/app/api/scenario_operations.py](services/forecasting/app/api/scenario_operations.py)**
   - Lines 68, 134, 163: Scenario simulation timestamps

9. **[services/forecasting/app/services/messaging.py](services/forecasting/app/services/messaging.py)**
   - Message timestamps

## Verification

```bash
# Before fix
$ grep -r "datetime\.utcnow()" services/forecasting/app --include="*.py" | wc -l
20

# After fix
$ grep -r "datetime\.utcnow()" services/forecasting/app --include="*.py" | wc -l
0
```

## Why This Matters

### Timezone-Naive (datetime.utcnow())
```python
>>> datetime.utcnow()
datetime.datetime(2025, 10, 9, 9, 10, 37, 123456)  # No timezone info
```

### Timezone-Aware (datetime.now(timezone.utc))
```python
>>> datetime.now(timezone.utc)
datetime.datetime(2025, 10, 9, 9, 10, 37, 123456, tzinfo=datetime.timezone.utc)  # Has timezone
```

On PostgreSQL, `DateTime(timezone=True)` maps to `TIMESTAMP WITH TIME ZONE`, and values read back from such columns arrive as timezone-aware `datetime` objects. Ordering comparisons between these and timezone-naive datetimes raise `TypeError`.

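The distinction can also be checked programmatically. The sketch below follows the definition in the Python documentation (a datetime is aware when `tzinfo` is set and `utcoffset()` returns a value); `is_aware` is a hypothetical helper name used only for illustration, and the timestamps are sample values.

```python
from datetime import datetime, timezone


def is_aware(value: datetime) -> bool:
    """True when the datetime carries a usable UTC offset."""
    return value.tzinfo is not None and value.utcoffset() is not None


naive = datetime(2025, 10, 9, 9, 10, 37)
aware = datetime(2025, 10, 9, 9, 10, 37, tzinfo=timezone.utc)

print(is_aware(naive), is_aware(aware))

# If a naive value is known to represent UTC, attach the zone explicitly.
# replace() keeps the same clock reading and only labels it as UTC.
labelled = naive.replace(tzinfo=timezone.utc)
print(labelled == aware)
```

Labelling with `replace(tzinfo=timezone.utc)` is only correct when the naive value really was produced as UTC, as is the case for anything that came from `datetime.utcnow()`.
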
## Impact

This fix resolves failures in:
- ✅ Cache expiration checks
- ✅ Batch status updates
- ✅ Performance metric queries
- ✅ Forecast analytics date ranges
- ✅ Cleanup operations
- ✅ Recent activity queries

## Best Practice

**Always use timezone-aware datetimes with PostgreSQL `DateTime(timezone=True)` columns.** Note that `datetime.utcnow()` is also deprecated as of Python 3.12 in favor of `datetime.now(timezone.utc)`.

```python
# ✅ GOOD
created_at = Column(DateTime(timezone=True), default=lambda: datetime.now(timezone.utc))
expires_at = datetime.now(timezone.utc) + timedelta(hours=24)
if record.created_at < datetime.now(timezone.utc):
    ...

# ❌ BAD
created_at = Column(DateTime(timezone=True), default=datetime.utcnow)  # No timezone!
expires_at = datetime.utcnow() + timedelta(hours=24)  # Naive!
if record.created_at < datetime.utcnow():  # TypeError!
    ...
```

## Additional Issue Found and Fixed

### Local Import Shadowing

After the initial fix, a new error appeared:
```
[error] Multi-day forecast generation failed
error=cannot access local variable 'timezone' where it is not associated with a value
```

**Cause:** In `forecasting_service.py` line 428, there was a local import inside a conditional block that shadowed the module-level import:

```python
# Module level (line 9)
from datetime import datetime, date, timedelta, timezone

# Inside function (line 428) - PROBLEM
if day_offset > 0:
    from datetime import timedelta, timezone  # ← Makes `timezone` LOCAL to the whole function
    current_date = current_date + timedelta(days=day_offset)

# Later in same function (line 447)
processing_time = (datetime.now(timezone.utc) - start_time)  # ← Error when the `if` branch was skipped
```

An `import` statement inside a function binds its names as local variables, and Python decides at compile time that a name bound anywhere in a function is local throughout that function. The import on line 428 therefore makes `timezone` local to the entire function and hides the module-level import. The local name is only bound when the `if` block actually runs. When `day_offset` is 0 the block is skipped, `timezone` is never assigned, and line 447 raises `UnboundLocalError`: "cannot access local variable 'timezone' where it is not associated with a value".

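The scoping behavior can be reproduced in isolation. This is a minimal sketch, not the service code: `stamp_after_offset` and `outcome` are hypothetical names, and the function body only mirrors the shape of the conditional import described above.

```python
from datetime import date, datetime, timedelta, timezone


def stamp_after_offset(day_offset: int):
    current_date = date(2025, 10, 9)
    if day_offset > 0:
        # This import binds `timedelta` and `timezone` as locals of the whole
        # function, but only assigns them when this branch executes.
        from datetime import timedelta, timezone
        current_date = current_date + timedelta(days=day_offset)
    # With day_offset == 0 the local `timezone` was never assigned.
    return current_date, datetime.now(timezone.utc)


def outcome(day_offset: int) -> str:
    """Report whether the call succeeds or which exception type it raises."""
    try:
        stamp_after_offset(day_offset)
    except UnboundLocalError as exc:
        return type(exc).__name__
    return "ok"


print(outcome(1))  # branch taken, local import executed
print(outcome(0))  # branch skipped, local name unbound
```

The call only fails when the branch is skipped, which matches the error appearing during multi-day forecast generation, where the first day has an offset of 0.
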
**Fix:** Removed the redundant local import since `timezone` is already imported at module level:

```python
# Before (BROKEN)
if day_offset > 0:
    from datetime import timedelta, timezone
    current_date = current_date + timedelta(days=day_offset)

# After (WORKING)
if day_offset > 0:
    current_date = current_date + timedelta(days=day_offset)
```

**File:** [services/forecasting/app/services/forecasting_service.py](services/forecasting/app/services/forecasting_service.py#L427-L428)

## Deployment

```bash
# Restart forecasting service to apply changes
kubectl -n bakery-ia rollout restart deployment forecasting-service

# Monitor for errors
kubectl -n bakery-ia logs -f deployment/forecasting-service | grep -E "(can't compare|cannot access)"
```

## Related Issues

This same issue may exist in other services. Search for:
```bash
# Find services using timezone-aware columns
grep -r "DateTime(timezone=True)" services/*/app/models --include="*.py"

# Find services using datetime.utcnow, with or without call parentheses
# (also catches column defaults such as default=datetime.utcnow)
grep -rn "datetime\.utcnow" services/*/app --include="*.py"
```

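A text search can also match comments and strings. As a complementary check, the sketch below walks the syntax tree of a source string and reports every attribute reference named `utcnow`, whether it is called or passed as a callable. `find_utcnow_references` is a hypothetical helper written for this note, and `sample` is made-up input rather than code from any service.

```python
import ast
from typing import List


def find_utcnow_references(source: str) -> List[int]:
    """Return sorted line numbers where an attribute named `utcnow` is referenced."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Attribute) and node.attr == "utcnow"
    )


sample = '''\
from datetime import datetime
created = datetime.utcnow()
default = datetime.utcnow
ok = datetime.now()
'''

print(find_utcnow_references(sample))
```

Applied to real files, the same function could be fed the contents of each `*.py` file under a service directory; matching on the attribute name alone means it would also flag `utcnow` attributes on unrelated objects, so results still need a quick manual review.
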
## References

- Python datetime docs: https://docs.python.org/3/library/datetime.html#aware-and-naive-objects
- SQLAlchemy DateTime: https://docs.sqlalchemy.org/en/20/core/type_basics.html#sqlalchemy.types.DateTime
- PostgreSQL TIMESTAMP WITH TIME ZONE: https://www.postgresql.org/docs/current/datatype-datetime.html