electricity-forecasting

# Electricity Forecasting Framework

## Overview

This skill provides end-to-end support for electricity load/demand forecasting projects, from data preprocessing to model deployment. It covers traditional statistical methods, modern machine learning approaches, and state-of-the-art deep learning architectures.

## Quick Start

### 1. Define Your Forecasting Task

| Horizon | Type | Typical Use |
|---------|------|-------------|
| 1-48 hours | Short-term (STLF) | Grid operations, unit commitment |
| 1 week - 1 month | Medium-term | Maintenance scheduling, fuel planning |
| 1-12 months | Long-term (LTLF) | Capacity planning, infrastructure investment |

### 2. Prepare Your Data

```bash
# Run the data preparation script
python scripts/prepare_data.py --input raw_load.csv --output processed/
```

Required data columns:

- `timestamp`: Datetime index (hourly or sub-hourly)
- `load`: Target variable (MW or kWh)
- `temperature`: Weather feature (°C)
- Optional: humidity, wind_speed, solar_radiation, holiday_flag

### 3. Select Your Model

See [references/model-selection.md](references/model-selection.md) for detailed guidance.

**Quick recommendation:**

- **Baseline**: Start with `persistence` or `seasonal-naive`
- **Production STLF**: Use `XGBoost` or `LightGBM` with weather features
- **Research/SOTA**: Try `Temporal Fusion Transformer (TFT)` or `iTransformer`

### 4. Train and Evaluate

```bash
python scripts/train_model.py --model xgboost --data processed/ --horizon 24
```

Key metrics to track:

- **MAPE** (%): Mean Absolute Percentage Error - business interpretability
- **RMSE** (MW): Root Mean Square Error - penalizes large errors
- **MAE** (MW): Mean Absolute Error - robust to outliers
- **Coverage** (%): Prediction interval coverage probability

## Core Workflows

### Data Preprocessing

1. **Load raw data** with proper datetime parsing
2. **Handle missing values**: Forward-fill for short gaps, interpolate for longer
3. **Feature engineering**:
   - Temporal: hour, day_of_week, month, is_weekend, is_holiday
   - Lag features: load_t-1, load_t-24, load_t-168 (weekly)
   - Rolling stats: rolling_mean_24h, rolling_std_7d
   - Weather: temperature, humidity, apparent_temperature
4. **Normalization**: RobustScaler or MinMaxScaler for deep learning models

See [references/feature-engineering.md](references/feature-engineering.md) for complete feature list.

### Model Training

```python
# Example training workflow
from electricity_forecasting import ForecastPipeline

pipeline = ForecastPipeline(
    model_type="xgboost",
    horizon=24,
    lookback=168  # 1 week of history
)
pipeline.fit(train_data, val_data)
predictions, uncertainty = pipeline.predict(test_data)
metrics = pipeline.evaluate(predictions, actuals)
```

### Hyperparameter Tuning

Use `scripts/hyperparameter_search.py` for automated tuning:

```bash
python scripts/hyperparameter_search.py \
  --model lightgbm \
  --data processed/ \
  --n-trials 50 \
  --study-name stlf-tuning
```

### Uncertainty Quantification

For risk-aware decision making:

- **Quantile Regression**: Predict multiple quantiles (0.1, 0.5, 0.9)
- **Conformal Prediction**: Distribution-free uncertainty bounds
- **Ensemble Methods**: Model disagreement as uncertainty proxy
- **Monte Carlo Dropout**: For neural networks

See [references/uncertainty.md](references/uncertainty.md) for implementation details.

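
To make the conformal prediction option above more concrete, here is a minimal sketch of split conformal prediction using only NumPy. It is not the implementation in `references/uncertainty.md`; the function names `split_conformal_interval` and `interval_coverage` are hypothetical, and the sketch assumes you have already set aside a calibration set with point forecasts from any fitted model. The coverage guarantee relies on exchangeability between calibration and test residuals, which load series only approximate, so treat the resulting intervals as a starting point.

```python
import math

import numpy as np


def split_conformal_interval(cal_actual, cal_pred, new_pred, alpha=0.1):
    """Symmetric split-conformal interval around point forecasts (hypothetical helper)."""
    cal_actual = np.asarray(cal_actual, dtype=float)
    cal_pred = np.asarray(cal_pred, dtype=float)
    new_pred = np.asarray(new_pred, dtype=float)

    # Nonconformity scores: absolute residuals on the held-out calibration set
    scores = np.sort(np.abs(cal_actual - cal_pred))
    n = scores.size

    # Rank of the finite-sample-corrected (1 - alpha) quantile
    k = math.ceil((n + 1) * (1 - alpha))
    # With too few calibration points the interval is unbounded
    q = scores[k - 1] if k <= n else np.inf

    return new_pred - q, new_pred + q


def interval_coverage(actual, lower, upper):
    """Fraction of actual values that fall inside [lower, upper]."""
    actual = np.asarray(actual, dtype=float)
    return float(np.mean((actual >= lower) & (actual <= upper)))
```

Calling `interval_coverage` on a test window gives the Coverage metric listed under Key metrics; a value well below `1 - alpha` suggests the calibration residuals no longer represent current conditions.
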
## Model Reference

### Statistical Models

| Model | Best For | Pros | Cons |
|-------|----------|------|------|
| ARIMA | Stable series | Interpretable, fast | Assumes linearity |
| SARIMA | Strong seasonality | Captures daily/weekly patterns | Manual parameter tuning |
| Prophet | Multiple seasonalities | Handles holidays well | Less accurate for STLF |
| TBATS | Complex seasonality | Automatic parameter selection | Slower training |

### Machine Learning Models

| Model | Best For | Pros | Cons |
|-------|----------|------|------|
| XGBoost | Production STLF | Fast, accurate, handles missing values | No native uncertainty |
| LightGBM | Large datasets | Faster than XGBoost, memory efficient | Sensitive to hyperparameters |
| Random Forest | Baseline ML | Robust, easy to tune | Lower accuracy than boosting |
| CatBoost | Categorical features | Handles categoricals natively | Slower training |

### Deep Learning Models

| Model | Best For | Pros | Cons |
|-------|----------|------|------|
| LSTM | Sequential patterns | Captures long-term dependencies | Slow training, hard to tune |
| GRU | Similar to LSTM | Faster convergence | Similar limitations |
| Transformer | Long sequences | Parallel training, attention | Data-hungry, complex |
| TFT | Multi-horizon | Interpretable attention, uncertainty | Complex implementation |
| N-BEATS | Pure deep learning | Strong baseline, interpretable | Less flexible than TFT |
| iTransformer | SOTA performance | Inverted transformer architecture | Recent, less battle-tested |

See [references/deep-learning-models.md](references/deep-learning-models.md) for architecture details and PyTorch implementations.

## Evaluation Best Practices

### Time Series Cross-Validation

Never use random k-fold: shuffled folds let the model train on observations that come after the ones it is tested on. Use an expanding or sliding window instead:

```python
# Expanding window CV
from sklearn.model_selection import TimeSeriesSplit

# `data` is assumed to be a time-ordered pandas DataFrame
tscv = TimeSeriesSplit(n_splits=5, test_size=168)  # 1 week test
for train_idx, test_idx in tscv.split(data):
    # Positional indexing; data[train_idx] would select columns on a DataFrame
    train, test = data.iloc[train_idx], data.iloc[test_idx]
    # Train and evaluate
```

### Backtesting Framework

```bash
python scripts/backtest.py \
  --model xgboost \
  --data processed/ \
  --cv-splits 5 \
  --horizon 24 \
  --metrics mape,rmse,mae
```

### Benchmark Comparison

Always compare against:

1. **Persistence**: load_t = load_t-1
2. **Seasonal Naive**: load_t = load_t-24 (for hourly data)
3. **Weekly Naive**: load_t = load_t-168

## Deployment

### Production Pipeline

1. **Model serialization**: Save with joblib or ONNX
2. **Feature pipeline**: Ensure identical preprocessing at inference
3. **Scheduling**: Cron or Airflow for automated forecasts
4. **Monitoring**: Track forecast drift and retrain triggers

See [references/deployment.md](references/deployment.md) for MLOps patterns.

### Real-time Inference

```python
from electricity_forecasting import DeploymentModel

model = DeploymentModel.load("models/xgboost-stlf.joblib")
# prepare_features and latest_data are assumed to come from your feature pipeline
features = prepare_features(latest_data)
prediction = model.predict(features, return_uncertainty=True)
```

## Common Pitfalls

1. **Data leakage**: Ensure no future information in features
2. **Holiday handling**: Special days need explicit modeling
3. **Temperature nonlinearity**: Use heating/cooling degree days
4. **Concept drift**: Retrain quarterly or when MAPE degrades by more than 20%
5. **Peak prediction**: Models often under-predict peaks; consider quantile loss

## Resources

- [Feature Engineering Guide](references/feature-engineering.md)
- [Model Selection Guide](references/model-selection.md)
- [Deep Learning Architectures](references/deep-learning-models.md)
- [Uncertainty Quantification](references/uncertainty.md)
- [Deployment Patterns](references/deployment.md)
- [Datasets Reference](references/datasets.md)

## Scripts

| Script | Purpose |
|--------|---------|
| `scripts/prepare_data.py` | Data cleaning and feature engineering |
| `scripts/train_model.py` | Model training with validation |
| `scripts/hyperparameter_search.py` | Automated hyperparameter optimization |
| `scripts/backtest.py` | Time series cross-validation |
| `scripts/evaluate.py` | Comprehensive metric calculation |
| `scripts/deploy_model.py` | Export model for production |

## Example Usage

```bash
# Complete workflow example

# 1. Prepare data
python scripts/prepare_data.py --input data/load_2024.csv --output data/processed/

# 2. Train model
python scripts/train_model.py --model lightgbm --data data/processed/ --horizon 48

# 3. Hyperparameter tuning
python scripts/hyperparameter_search.py --model lightgbm --data data/processed/ --n-trials 100

# 4. Backtest
python scripts/backtest.py --model lightgbm-best --data data/processed/ --cv-splits 5

# 5. Deploy
python scripts/deploy_model.py --model lightgbm-best --output models/production/
```
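
As a complement to the script-based workflow, the naive benchmarks and point metrics described under Evaluation Best Practices can be sanity-checked by hand. The following is a minimal NumPy sketch, not the logic inside `scripts/evaluate.py`; `naive_forecast` and `point_metrics` are hypothetical names, the input is assumed to be a gap-free hourly series, and MAPE is assumed to be computed only where the actual load is non-zero.

```python
import numpy as np


def naive_forecast(load, lag):
    """Forecast each step with the value `lag` steps earlier (lag must be >= 1).

    lag=1 is persistence, lag=24 seasonal naive, lag=168 weekly naive
    for hourly data. The first `lag` steps have no history and stay NaN.
    """
    load = np.asarray(load, dtype=float)
    forecast = np.full(load.shape, np.nan)
    forecast[lag:] = load[:-lag]
    return forecast


def point_metrics(actual, pred):
    """MAPE (%), RMSE and MAE over the steps that have a forecast."""
    actual = np.asarray(actual, dtype=float)
    pred = np.asarray(pred, dtype=float)
    mask = ~np.isnan(pred)  # skip warm-up steps with no forecast
    err = actual[mask] - pred[mask]
    return {
        "mape": float(np.mean(np.abs(err / actual[mask])) * 100),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
    }


# Synthetic series with a perfect daily cycle, for illustration only
hourly = np.tile(np.arange(1.0, 25.0), 3)
for name, lag in [("persistence", 1), ("seasonal_naive", 24)]:
    print(name, point_metrics(hourly, naive_forecast(hourly, lag)))
```

Any candidate model should beat these numbers on the same test window before it is worth tuning or deploying.
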
