Risk Quantification & Hedging Implications

1 Portfolio Definition

This analysis quantifies the spot-price revenue risk facing a stylised, fully unhedged renewable energy portfolio in the DE-LU bidding zone. The portfolio is defined as:

Show code
tibble(
  Parameter = c("Installed capacity", "Technology", "Bidding zone",
                 "Market position", "Forward hedging", "Comparison zone"),
  Value     = c(glue("{params$portfolio_mw} MW"), params$portfolio_tech,
                params$portfolio_zone, "Long spot (generator)",
                "None — fully exposed to day-ahead price",
                params$comparison_zone)
) |>
  gt() |>
  tab_header(title = "Stylised Portfolio Parameters") |>
  cols_align(align = "left")
Stylised Portfolio Parameters
Parameter Value
Installed capacity 400 MW
Technology wind_onshore
Bidding zone DE-LU
Market position Long spot (generator)
Forward hedging None — fully exposed to day-ahead price
Comparison zone NO_4
NoteWhat does “fully unhedged” mean?

A wind farm that sells all output on the day-ahead spot market earns revenue = price × generation in each delivery hour. Both sides of this equation are weather-driven: wind speed determines generation volume, while the aggregate wind and solar fleet drives the renewable share that (per Stage 3) is the dominant price factor in summer. This creates a natural short hedge — high wind hours produce more MWh but depress the price via the merit order effect. The simulation below quantifies how much residual risk remains after this embedded correlation.

2 Monte Carlo Framework

2.1 Feature Engineering

We reconstruct the same feature set used in the Stage 3 price model: renewable share, residual load, net export balance, TTF gas price (level and daily change), thermal generation, solar irradiance, centred temperature, renewable ramp rate, holidays, hour, day-of-week, and season.

Show code
# ── Aggregate cross-border flows to hourly net export balance ────────
# Identify DE-LU flow columns
export_cols <- names(flows_raw) |>
  str_subset("^DE_LU->") |>
  str_subset("datetime", negate = TRUE)

import_cols <- names(flows_raw) |>
  str_subset("->DE_LU$")

flows_delu <- flows_raw |>
  select(datetime_utc, all_of(c(export_cols, import_cols))) |>
  mutate(datetime_utc = floor_date(datetime_utc, "hour")) |>
  group_by(datetime_utc) |>
  summarise(
    across(everything(), \(x) mean(x, na.rm = TRUE)),
    .groups = "drop"
  ) |>
  mutate(
    total_exports = rowSums(pick(all_of(export_cols)), na.rm = TRUE),
    total_imports = rowSums(pick(all_of(import_cols)), na.rm = TRUE),
    net_export_mw = total_exports - total_imports
  ) |>
  select(datetime_utc, net_export_mw)
Show code
# ── Join all data sources into a single scenario table ───────────────
scenario_df <- smard |>
  # Derived generation features
  mutate(
    wind_total  = wind_onshore + wind_offshore,
    solar_total = solar,
    renewable_gen = wind_total + solar_total,
    renewable_share = renewable_gen / total_load,
    residual_load = total_load - renewable_gen,
    thermal_gen = biomass + gas + hard_coal + lignite,
    date = as_date(datetime_utc),
    hour = hour(datetime_utc),
    dow  = wday(datetime_utc, label = TRUE, week_start = 1),
    season = assign_season(datetime_utc),
    month = factor(month(datetime_utc, label = TRUE), ordered = FALSE),
    is_holiday = as.integer(date %in% holiday_dates)
  ) |>
  # Join ERA5 weather (ssrd, temperature)
  left_join(
    era5_delu,
    by = "datetime_utc"
  ) |>
  # Join cross-border flows
  left_join(flows_delu, by = "datetime_utc") |>
  # Join TTF gas price (daily → hourly via date, includes ttf_change)
  left_join(ttf, by = "date") |>
  arrange(datetime_utc) |>
  # Renewable ramp rate (hour-over-hour change)
  mutate(renewable_ramp = renewable_share - lag(renewable_share))

cat("Scenario table:", nrow(scenario_df), "rows,",
    sum(complete.cases(scenario_df)), "complete cases\n")
Scenario table: 52584 rows, 24554 complete cases
Show code
# ── Impute missing values (matching 03_prices.qmd strategy) ──────────
# Kalman smoother for time-series features; TTF forward/back-filled
scenario_df <- scenario_df |>
  arrange(datetime_utc) |>
  mutate(
    renewable_share    = na_kalman(renewable_share, model = "StructTS"),
    residual_load      = na_kalman(residual_load, model = "StructTS"),
    net_export_mw      = na_kalman(net_export_mw, model = "StructTS"),
    thermal_gen        = na_kalman(thermal_gen, model = "StructTS"),
    wind_speed_100m    = na_kalman(wind_speed_100m, model = "StructTS"),
    wind_speed_10m     = na_kalman(wind_speed_10m, model = "StructTS"),
    temperature_2m     = na_kalman(temperature_2m, model = "StructTS"),
    ssrd_wm2           = na_kalman(ssrd_wm2, model = "StructTS"),
    temp_centered      = na_kalman(temp_centered, model = "StructTS"),
    temp_squared       = na_kalman(temp_squared, model = "StructTS"),
    ttf_eur_mwh        = zoo::na.locf(zoo::na.locf(ttf_eur_mwh, na.rm = FALSE),
                                       fromLast = TRUE),
    ttf_change         = zoo::na.locf(zoo::na.locf(ttf_change, na.rm = FALSE),
                                       fromLast = TRUE)
  ) |>
  # Re-derive renewable ramp after imputation; fill leading NA
  mutate(
    renewable_ramp = renewable_share - lag(renewable_share),
    renewable_ramp = replace_na(renewable_ramp, 0)
  )

n_remaining_na <- scenario_df |>
  select(renewable_share, residual_load, net_export_mw, ttf_eur_mwh,
         ttf_change, thermal_gen, ssrd_wm2, temp_centered, temp_squared) |>
  summarise(across(everything(), \(x) sum(is.na(x))))

cat("Remaining NAs after imputation:\n")
Remaining NAs after imputation:
Show code
print(n_remaining_na)
# A tibble: 1 × 9
  renewable_share residual_load net_export_mw ttf_eur_mwh ttf_change thermal_gen
            <int>         <int>         <int>       <int>      <int>       <int>
1               0             0             0           0          0           0
# ℹ 3 more variables: ssrd_wm2 <int>, temp_centered <int>, temp_squared <int>

2.2 Predict Prices & Generation

We pre-compute predicted prices and portfolio generation for every historical hour before resampling. This keeps the Monte Carlo loop fast — it only needs to sample and sum pre-computed daily revenues.

Variant A (weather-only): TTF and net exports fixed at their seasonal medians; TTF change set to zero. Isolates the pure weather-driven component of price risk.

Variant B (full historical): All features drawn from the same historical day. Captures total risk including gas price and cross-border flow co-movement.

Realized: Actual historical prices (price_de_lu) multiplied by predicted generation. This is the ground-truth revenue stream — what the portfolio would have earned on the spot market. Risk metrics (VaR/CVaR) are computed on realized revenue because it represents actual market outcomes rather than model predictions.

Show code
# ── Compute seasonal medians for Variant A ───────────────────────────
seasonal_medians <- scenario_df |>
  group_by(season) |>
  summarise(
    ttf_median     = median(ttf_eur_mwh, na.rm = TRUE),
    exports_median = median(net_export_mw, na.rm = TRUE),
    .groups = "drop"
  )

seasonal_medians |>
  gt() |>
  fmt_number(columns = c(ttf_median, exports_median), decimals = 1) |>
  cols_label(
    season         = "Season",
    ttf_median     = "TTF Median (EUR/MWh)",
    exports_median = "Net Exports Median (MW)"
  ) |>
  tab_header(title = "Conditional Values for Weather-Only Variant")
Conditional Values for Weather-Only Variant
Season TTF Median (EUR/MWh) Net Exports Median (MW)
Winter 41.2 3,556.2
Spring 31.8 −378.9
Summer 34.2 −2,376.7
Autumn 40.8 963.4
Show code
# ── Variant B: predict with original features ────────────────────────
scenario_df <- scenario_df |>
  mutate(pred_price_b = predict(price_model, new_data = scenario_df)$.pred)

# ── Variant A: swap in seasonal medians, then predict ────────────────
scenario_a <- scenario_df |>
  select(-ttf_eur_mwh, -net_export_mw, -ttf_change) |>
  left_join(seasonal_medians, by = "season") |>
  rename(
    ttf_eur_mwh    = ttf_median,
    net_export_mw  = exports_median
  ) |>
  mutate(ttf_change = 0)  # No gas price movement under weather-only

scenario_df <- scenario_df |>
  mutate(pred_price_a = predict(price_model, new_data = scenario_a)$.pred)

# ── Quantile predictions (q05 / q95) ────────────────────────────────
scenario_df <- scenario_df |>
  mutate(
    pred_q05 = predict(price_xgb_q05, new_data = scenario_df)$.pred,
    pred_q95 = predict(price_xgb_q95, new_data = scenario_df)$.pred
  )
Show code
# ── Portfolio generation via Stage 1 CF model ────────────────────────
# CF models (02_weather.qmd) expect hour and month as factors,
# while the price model (03_prices.qmd) expects hour as integer.
# Convert only for the CF prediction.
cf_input <- scenario_df |>
  mutate(
    hour  = factor(hour),
    month = factor(month)
  )

scenario_df <- scenario_df |>
  mutate(
    pred_cf = predict(cf_wind_onshore, new_data = cf_input)$.pred,
    pred_cf = pmax(pred_cf, 0),
    # Generation in MWh per hour (hourly data, so MW = MWh/h)
    portfolio_gen_mwh = pred_cf * params$portfolio_mw
  )

rm(cf_input)

# ── Hourly revenue: price × generation ───────────────────────────────
scenario_df <- scenario_df |>
  mutate(
    revenue_a        = pred_price_a * portfolio_gen_mwh,
    revenue_b        = pred_price_b * portfolio_gen_mwh,
    revenue_realized = price_de_lu  * portfolio_gen_mwh,
    revenue_q05      = pred_q05     * portfolio_gen_mwh,
    revenue_q95      = pred_q95     * portfolio_gen_mwh
  )

cat("Predicted CF range:",
    round(range(scenario_df$pred_cf, na.rm = TRUE), 3), "\n")
Predicted CF range: 0 0.745 
Show code
cat("Mean portfolio generation:",
    round(mean(scenario_df$portfolio_gen_mwh, na.rm = TRUE), 1), "MWh/h\n")
Mean portfolio generation: 82 MWh/h

2.3 Daily Revenue Aggregation

Show code
# ── Aggregate hourly revenue to daily totals ─────────────────────────
daily_revenue <- scenario_df |>
  filter(!is.na(revenue_a), !is.na(revenue_b), !is.na(revenue_realized)) |>
  group_by(date, season, month) |>
  summarise(
    daily_rev_a        = sum(revenue_a),
    daily_rev_b        = sum(revenue_b),
    daily_rev_realized = sum(revenue_realized),
    daily_gen_mwh      = sum(portfolio_gen_mwh),
    daily_price_a      = mean(pred_price_a),
    daily_price_b      = mean(pred_price_b),
    daily_price_real   = mean(price_de_lu),
    n_hours            = n(),
    .groups = "drop"
  ) |>
  # Keep only complete days (24 hours)
  filter(n_hours == 24)

cat("Complete days:", nrow(daily_revenue), "\n")
Complete days: 2190 
Show code
cat("Days by season:\n")
Days by season:
Show code
print(count(daily_revenue, season))
# A tibble: 4 × 2
  season     n
  <fct>  <int>
1 Winter   540
2 Spring   552
3 Summer   552
4 Autumn   546

2.4 Block Bootstrap Simulation

We use a 5-day block bootstrap instead of sampling individual days independently. By resampling non-overlapping blocks of 5 consecutive days, the simulation preserves the serial correlation in weather patterns (multi-day wind droughts, persistent high-pressure systems) that drives the heaviest revenue losses. Each synthetic month consists of 6 blocks × 5 days = 30 days, constructed independently for each season from that season’s historical pool.

Show code
set.seed(2026)
n_sim            <- params$n_sim
block_size       <- 5
n_blocks         <- params$n_days_per_month %/% block_size   # 6 blocks × 5 days = 30

# ── Create non-overlapping 5-day blocks within each season ───────────
daily_revenue <- daily_revenue |>
  group_by(season) |>
  arrange(date, .by_group = TRUE) |>
  mutate(block_id = (row_number() - 1) %/% block_size) |>
  ungroup()

# ── Aggregate to block-level totals ──────────────────────────────────
block_revenue <- daily_revenue |>
  group_by(season, block_id) |>
  summarise(
    block_rev_a        = sum(daily_rev_a),
    block_rev_b        = sum(daily_rev_b),
    block_rev_realized = sum(daily_rev_realized),
    block_gen_mwh      = sum(daily_gen_mwh),
    block_price_a      = mean(daily_price_a),
    block_price_b      = mean(daily_price_b),
    block_price_real   = mean(daily_price_real),
    block_days         = n(),
    .groups = "drop"
  ) |>
  # Keep only complete blocks (exactly 5 days)
  filter(block_days == block_size)

cat("Complete 5-day blocks by season:\n")
Complete 5-day blocks by season:
Show code
print(count(block_revenue, season))
# A tibble: 4 × 2
  season     n
  <fct>  <int>
1 Winter   108
2 Spring   110
3 Summer   110
4 Autumn   109
Show code
# ── Bootstrap: sample blocks with replacement ────────────────────────
sim_results <- block_revenue |>
  group_by(season) |>
  group_modify(\(data, key) {
    n_pool <- nrow(data)

    # Matrix of sampled row indices: n_blocks × n_sim
    idx <- matrix(
      sample.int(n_pool, size = n_blocks * n_sim, replace = TRUE),
      nrow = n_blocks, ncol = n_sim
    )

    tibble(
      sim_id = 1:n_sim,
      # Monthly revenue (EUR) — sum of sampled block revenues
      monthly_rev_a        = colSums(matrix(data$block_rev_a[idx],
                                            nrow = n_blocks)),
      monthly_rev_b        = colSums(matrix(data$block_rev_b[idx],
                                            nrow = n_blocks)),
      monthly_rev_realized = colSums(matrix(data$block_rev_realized[idx],
                                            nrow = n_blocks)),
      # Monthly generation (MWh)
      monthly_gen          = colSums(matrix(data$block_gen_mwh[idx],
                                            nrow = n_blocks)),
      # Average daily price (EUR/MWh) — mean of block averages
      avg_price_a          = colMeans(matrix(data$block_price_a[idx],
                                             nrow = n_blocks)),
      avg_price_b          = colMeans(matrix(data$block_price_b[idx],
                                             nrow = n_blocks)),
      avg_price_real       = colMeans(matrix(data$block_price_real[idx],
                                             nrow = n_blocks))
    )
  }) |>
  ungroup()

cat("Simulation complete:", nrow(sim_results), "rows\n")
Simulation complete: 40000 rows
Show code
cat("(", n_sim, "simulations ×", n_distinct(sim_results$season), "seasons )\n")
( 10000 simulations × 4 seasons )

2.5 Variant Comparison

Show code
sim_long <- sim_results |>
  pivot_longer(
    cols = c(monthly_rev_a, monthly_rev_b, monthly_rev_realized),
    names_to  = "variant",
    values_to = "monthly_revenue"
  ) |>
  mutate(
    variant = recode(variant,
      monthly_rev_a        = "Weather-only",
      monthly_rev_b        = "Full historical",
      monthly_rev_realized = "Realized"
    ),
    monthly_revenue_meur = monthly_revenue / 1e6
  )

ggplot(sim_long, aes(x = monthly_revenue_meur, fill = variant)) +
  geom_density(alpha = 0.35, colour = NA) +
  geom_vline(
    data = sim_long |>
      group_by(season, variant) |>
      summarise(med = median(monthly_revenue_meur), .groups = "drop"),
    aes(xintercept = med, colour = variant),
    linetype = "dashed", linewidth = 0.6
  ) +
  facet_wrap(~ season, scales = "free_x", ncol = 2) +
  scale_fill_manual(values = variant_colours) +
  scale_colour_manual(values = variant_colours) +
  labs(
    title    = glue("{params$portfolio_mw} MW {params$portfolio_tech} — Monthly Revenue Distribution"),
    subtitle = "Dashed lines show median. Realized = actual prices × predicted generation.",
    x = "Monthly Revenue (M EUR)",
    y = "Density",
    fill = "Variant", colour = "Variant"
  )
Figure 1: Monthly revenue distributions by season and simulation variant. Realized uses actual historical prices; the model-based variants show how much risk the weather-only and full models capture.
Show code
sim_price_long <- sim_results |>
  pivot_longer(
    cols = c(avg_price_a, avg_price_b, avg_price_real),
    names_to  = "variant",
    values_to = "avg_price"
  ) |>
  mutate(
    variant = recode(variant,
      avg_price_a    = "Weather-only",
      avg_price_b    = "Full historical",
      avg_price_real = "Realized"
    )
  )

ggplot(sim_price_long, aes(x = avg_price, fill = variant)) +
  geom_density(alpha = 0.35, colour = NA) +
  facet_wrap(~ season, scales = "free_x", ncol = 2) +
  scale_fill_manual(values = variant_colours) +
  labs(
    title    = "Simulated Average Monthly Spot Price",
    subtitle = "Weather-only isolates renewable variability; Realized shows actual market outcomes",
    x = "Average Monthly Price (EUR/MWh)",
    y = "Density",
    fill = "Variant"
  )
Figure 2: Simulated average monthly price by season and variant
Show code
# ── Summary table: how do the three variants compare? ────────────────
variant_summary <- sim_results |>
  group_by(season) |>
  summarise(
    sd_rev_a        = sd(monthly_rev_a) / 1e6,
    sd_rev_b        = sd(monthly_rev_b) / 1e6,
    sd_rev_realized = sd(monthly_rev_realized) / 1e6,
    sd_ratio_ba     = sd(monthly_rev_b) / sd(monthly_rev_a),
    cvar95_a        = compute_cvar(monthly_rev_a, 0.95) / 1e6,
    cvar95_b        = compute_cvar(monthly_rev_b, 0.95) / 1e6,
    cvar95_realized = compute_cvar(monthly_rev_realized, 0.95) / 1e6,
    .groups = "drop"
  )

variant_summary |>
  gt() |>
  fmt_number(columns = c(sd_rev_a, sd_rev_b, sd_rev_realized,
                          cvar95_a, cvar95_b, cvar95_realized), decimals = 2) |>
  fmt_number(columns = sd_ratio_ba, decimals = 2) |>
  cols_label(
    season          = "Season",
    sd_rev_a        = "SD Weather (M EUR)",
    sd_rev_b        = "SD Full (M EUR)",
    sd_rev_realized = "SD Realized (M EUR)",
    sd_ratio_ba     = "SD Ratio (Full/Weather)",
    cvar95_a        = "CVaR 95% Weather",
    cvar95_b        = "CVaR 95% Full",
    cvar95_realized = "CVaR 95% Realized"
  ) |>
  tab_header(
    title    = "Revenue Risk: Three Variants Compared",
    subtitle = glue("{params$portfolio_mw} MW {params$portfolio_tech} — Monthly Revenue (M EUR)")
  ) |>
  tab_footnote("Realized uses actual historical prices × predicted generation. SD ratio > 1 indicates additional risk from TTF and cross-border flow variability.")
Revenue Risk: Three Variants Compared
400 MW wind_onshore — Monthly Revenue (M EUR)
Season SD Weather (M EUR) SD Full (M EUR) SD Realized (M EUR) SD Ratio (Full/Weather) CVaR 95% Weather CVaR 95% Full CVaR 95% Realized
Winter 0.91 1.67 1.56 1.85 4.52 3.30 3.75
Spring 0.61 1.25 1.17 2.04 2.45 1.98 1.94
Summer 0.37 1.12 1.26 3.01 2.08 1.78 1.57
Autumn 0.66 1.62 1.40 2.47 2.69 2.25 3.04
Realized uses actual historical prices × predicted generation. SD ratio > 1 indicates additional risk from TTF and cross-border flow variability.

3 Risk Metrics

How much could a bad month cost? The table below answers that for each season, using realized revenue (actual historical prices × predicted generation) as the basis for risk metrics. This ensures VaR and CVaR reflect the revenue distribution that would have actually been observed, rather than model-filtered predictions that smooth out tail events.

Show code
risk_metrics <- sim_results |>
  group_by(season) |>
  summarise(
    median_rev     = median(monthly_rev_realized) / 1e6,
    mean_rev       = mean(monthly_rev_realized) / 1e6,
    sd_rev         = sd(monthly_rev_realized) / 1e6,
    var_95         = compute_var(monthly_rev_realized, 0.95) / 1e6,
    var_99         = compute_var(monthly_rev_realized, 0.99) / 1e6,
    cvar_95        = compute_cvar(monthly_rev_realized, 0.95) / 1e6,
    cvar_99        = compute_cvar(monthly_rev_realized, 0.99) / 1e6,
    weather_share  = sd(monthly_rev_a) / sd(monthly_rev_realized),
    .groups = "drop"
  )

risk_metrics |>
  gt() |>
  fmt_number(columns = c(median_rev, mean_rev, sd_rev,
                          var_95, var_99, cvar_95, cvar_99), decimals = 2) |>
  fmt_percent(columns = weather_share, decimals = 0) |>
  cols_label(
    season        = "Season",
    median_rev    = "Median",
    mean_rev      = "Mean",
    sd_rev        = "Std Dev",
    var_95        = "VaR 95%",
    var_99        = "VaR 99%",
    cvar_95       = "CVaR 95%",
    cvar_99       = "CVaR 99%",
    weather_share = "Weather Share"
  ) |>
  tab_header(
    title    = glue("Monthly Revenue Risk — {params$portfolio_mw} MW Onshore Wind in {params$portfolio_zone}"),
    subtitle = "All values in M EUR. Based on realized revenue (actual prices × predicted generation)."
  ) |>
  tab_spanner(label = "Central Estimate", columns = c(median_rev, mean_rev)) |>
  tab_spanner(label = "Spread", columns = c(sd_rev)) |>
  tab_spanner(label = "Downside Risk", columns = c(var_95, var_99, cvar_95, cvar_99)) |>
  tab_footnote(
    "Weather Share = fraction of total revenue volatility attributable to wind variability alone (weather-only SD / realized SD). The remainder comes from gas price and cross-border flow movements."
  )
Monthly Revenue Risk — 400 MW Onshore Wind in DE-LU
All values in M EUR. Based on realized revenue (actual prices × predicted generation).
Season
Central Estimate
Spread
Downside Risk
Weather Share
Median Mean Std Dev VaR 95% VaR 99% CVaR 95% CVaR 99%
Winter 6.52 6.60 1.56 4.19 3.47 3.75 3.15 58%
Spring 3.77 3.89 1.17 2.22 1.73 1.94 1.55 52%
Summer 3.32 3.47 1.26 1.76 1.45 1.57 1.31 30%
Autumn 5.47 5.58 1.40 3.45 2.78 3.04 2.49 47%
Weather Share = fraction of total revenue volatility attributable to wind variability alone (weather-only SD / realized SD). The remainder comes from gas price and cross-border flow movements.
TipReading the table

Pick a row — say Winter. The median month brings in the figure shown under Median. But in the worst 5% of months (CVaR 95%), revenue drops to the CVaR 95% figure. The Weather Share column is the key link back to Stage 3: it tells you how much of that revenue swing comes purely from wind variability versus gas and trade flow uncertainty. A low weather share means hedging weather alone won’t cover most of your risk.

4 Cross-Zone Analysis

Can a generator reduce risk by selling into multiple bidding zones? If prices in different zones move independently, geographic diversification works well. If they move in lockstep — driven by the same weather systems sweeping across Europe — the benefit is limited.

4.1 Price Correlation Across European Bidding Zones

Show code
# ── Hourly price correlation matrix across all zones ─────────────────
price_matrix <- prices_all |>
  select(-datetime_utc) |>
  cor(use = "pairwise.complete.obs")

# ── Reorder by hierarchical clustering for visual clarity ────────────
hc <- hclust(as.dist(1 - price_matrix))
price_matrix_ordered <- price_matrix[hc$order, hc$order]
Show code
cor_long <- price_matrix_ordered |>
  as.data.frame() |>
  rownames_to_column("zone_1") |>
  pivot_longer(-zone_1, names_to = "zone_2", values_to = "correlation") |>
  mutate(
    zone_1 = factor(zone_1, levels = rownames(price_matrix_ordered)),
    zone_2 = factor(zone_2, levels = rownames(price_matrix_ordered))
  )

ggplot(cor_long, aes(x = zone_1, y = zone_2, fill = correlation)) +
  geom_tile(colour = "white", linewidth = 0.3) +
  geom_text(aes(label = sprintf("%.2f", correlation)),
            size = 2.5, colour = "grey20") +
  scale_fill_distiller(
    palette  = "YlOrRd",
    direction = 1,
    limits   = c(0, 1),
    name     = "Correlation"
  ) +
  labs(
    title    = "How Closely Do European Electricity Prices Move Together?",
    subtitle = "High correlation = limited diversification from cross-border hedging",
    x = NULL, y = NULL
  ) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 9),
    axis.text.y = element_text(size = 9)
  )
Figure 3: Hourly day-ahead price correlations across European bidding zones (2020–2025)

4.2 Geographic View — Price Linkages Across Europe

Show code
library(plotly)

# ── Zone centroids (approximate geographic centres) ──────────────────
zone_coords <- tribble(
  ~zone,   ~lat,  ~lon,  ~label,
  "DE_LU", 51.0,  10.0,  "DE-LU",
  "FR",    46.5,   2.5,  "FR",
  "NL",    52.0,   5.5,  "NL",
  "BE",    50.8,   4.5,  "BE",
  "AT",    47.5,  14.0,  "AT",
  "DK_1",  56.0,   9.5,  "DK1",
  "DK_2",  55.5,  12.0,  "DK2",
  "NO_1",  60.0,  10.5,  "NO1",
  "NO_2",  59.0,   6.0,  "NO2",
  "NO_3",  63.5,  10.5,  "NO3",
  "NO_4",  68.0,  15.5,  "NO4",
  "NO_5",  60.5,   6.5,  "NO5",
  "SE_1",  66.0,  18.0,  "SE1",
  "SE_2",  63.0,  16.0,  "SE2",
  "SE_3",  59.0,  16.0,  "SE3",
  "SE_4",  56.5,  14.5,  "SE4",
  "FI",    62.0,  26.0,  "FI"
)

# ── DE-LU correlations ──────────────────────────────────────────────
delu_corr <- tibble(
  zone       = rownames(price_matrix),
  corr_de_lu = price_matrix[, "DE_LU"]
) |>
  filter(zone != "DE_LU") |>
  left_join(zone_coords, by = "zone")

delu_point <- zone_coords |> filter(zone == "DE_LU")

# ── Colour mapping (sequential warm) ────────────────────────────────
corr_to_colour <- colorRamp(c("#F7F7F7", "#FDDBC7", "#EF8A62", "#B2182B"))

delu_corr <- delu_corr |>
  mutate(
    colour_rgb = map_chr(corr_de_lu, \(x) {
      rgb_vals <- corr_to_colour(x)
      sprintf("rgb(%d,%d,%d)",
              round(rgb_vals[1]), round(rgb_vals[2]), round(rgb_vals[3]))
    }),
    hover_text = paste0(
      "<b>", label, "</b><br>",
      "Correlation with DE-LU: <b>", sprintf("%.2f", corr_de_lu), "</b>"
    )
  )

# ── Country-level averages for choropleth background ────────────────
zone_to_iso3 <- tribble(
  ~zone, ~iso3,
  "FR", "FRA", "NL", "NLD", "BE", "BEL", "AT", "AUT",
  "DK_1", "DNK", "DK_2", "DNK",
  "NO_1", "NOR", "NO_2", "NOR", "NO_3", "NOR", "NO_4", "NOR", "NO_5", "NOR",
  "SE_1", "SWE", "SE_2", "SWE", "SE_3", "SWE", "SE_4", "SWE",
  "FI", "FIN"
)

country_corrs <- delu_corr |>
  left_join(zone_to_iso3, by = "zone") |>
  group_by(iso3) |>
  summarise(corr_de_lu = mean(corr_de_lu), .groups = "drop") |>
  bind_rows(tibble(iso3 = "DEU", corr_de_lu = 1.0))

# ── Build plotly geo map (responsive sizing) ──────────────────────────
fig <- plot_geo(height = 700) |>
  # Choropleth background — country polygons filled by correlation
  add_trace(
    type       = "choropleth",
    locations  = country_corrs$iso3,
    z          = country_corrs$corr_de_lu,
    colorscale = list(
      c(0,    "rgb(222,235,247)"),
      c(0.25, "rgb(247,247,247)"),
      c(0.5,  "rgb(253,208,162)"),
      c(0.75, "rgb(227,120,71)"),
      c(1,    "rgb(178,24,43)")
    ),
    zmin = 0.15, zmax = 1,
    marker     = list(line = list(color = "rgb(180,180,180)", width = 1)),
    colorbar   = list(title = "Correlation", len = 0.4, y = 0.5),
    hoverinfo  = "none",
    showlegend = FALSE
  ) |>
  layout(
    title = list(
      text = "How Correlated Are European Electricity Prices with Germany?",
      font = list(size = 16)
    ),
    geo = list(
      scope = "europe",
      projection = list(type = "mercator"),
      lonaxis = list(range = c(-5, 30)),
      lataxis = list(range = c(44, 72)),
      showland = TRUE, landcolor = "rgb(235,235,235)",
      showocean = TRUE, oceancolor = "rgb(225,235,245)",
      showcountries = TRUE, countrycolor = "rgb(180,180,180)",
      showlakes = TRUE, lakecolor = "rgb(225,235,245)",
      showcoastlines = TRUE, coastlinecolor = "rgb(180,180,180)",
      resolution = 50
    ),
    showlegend = FALSE,
    autosize = TRUE,
    margin = list(t = 60, b = 20)
  )

# ── Connection lines from DE-LU to each zone ────────────────────────
for (i in seq_len(nrow(delu_corr))) {
  row <- delu_corr[i, ]
  fig <- fig |>
    add_trace(
      type = "scattergeo", mode = "lines",
      lon = c(delu_point$lon, row$lon),
      lat = c(delu_point$lat, row$lat),
      line = list(
        width = row$corr_de_lu * 4,
        color = row$colour_rgb
      ),
      opacity    = 0.9,
      hoverinfo  = "none",
      showlegend = FALSE
    )
}

# ── Zone markers (sized and coloured by correlation) ─────────────────
fig <- fig |>
  add_trace(
    type = "scattergeo", mode = "markers+text",
    data = delu_corr,
    lon = ~lon, lat = ~lat,
    marker = list(
      size = ~corr_de_lu * 25 + 5,
      color = ~colour_rgb,
      line = list(width = 1.5, color = "white")
    ),
    text = ~paste0(label, " (", sprintf("%.2f", corr_de_lu), ")"),
    textposition = "top center",
    textfont = list(size = 10, color = "rgb(50,50,50)"),
    hovertext = ~hover_text,
    hoverinfo = "text",
    showlegend = FALSE
  )

# ── DE-LU reference marker ──────────────────────────────────────────
fig <- fig |>
  add_trace(
    type = "scattergeo", mode = "markers+text",
    lon = delu_point$lon, lat = delu_point$lat,
    marker = list(
      size = 16, color = "#B2182B", symbol = "diamond",
      line = list(width = 2, color = "white")
    ),
    text = "DE-LU", textposition = "top center",
    textfont = list(size = 12, color = "black", family = "Arial"),
    hovertext = "<b>DE-LU</b><br>Reference zone",
    hoverinfo = "text",
    showlegend = FALSE
  )

fig
Figure 4: Price correlation with DE-LU across European bidding zones — hover for details
NoteWhat the map tells us

Zones physically close to Germany (AT, NL, BE) show very high price correlation — their markets are tightly coupled through interconnectors. The Nordic zones (NO, SE, FI) are less correlated, meaning a portfolio spanning both CWE and the Nordics would capture some diversification benefit. But even the lowest correlations are well above zero: no European zone offers a true hedge against German price moves.

4.3 Seasonal Price Correlation

Price linkages aren’t fixed — they shift with season as weather patterns and demand fundamentals change.

Show code
comparison_zones <- c("AT", "NL", "FR", "BE", "DK_1", "DK_2",
                       "NO_1", "NO_2", "SE_3", "SE_4", "FI")

seasonal_corr <- prices_all |>
  mutate(season = assign_season(datetime_utc)) |>
  group_by(season) |>
  group_modify(\(data, key) {
    corrs <- data |>
      select(all_of(c("DE_LU", comparison_zones))) |>
      cor(use = "pairwise.complete.obs")

    tibble(
      zone        = comparison_zones,
      correlation = corrs["DE_LU", comparison_zones]
    )
  }) |>
  ungroup()

ggplot(seasonal_corr, aes(x = reorder(zone, -correlation),
                           y = correlation, fill = season)) +
  geom_col(position = position_dodge(width = 0.7), width = 0.6) +
  scale_fill_manual(values = season_colours) +
  geom_hline(yintercept = c(0.5, 0.75), linetype = "dotted",
             colour = "grey50") +
  labs(
    title    = "Price Correlation with DE-LU Shifts by Season",
    subtitle = "CWE neighbours stay tightly linked year-round; Nordic correlation varies more",
    x = NULL, y = "Correlation with DE-LU", fill = "Season"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
Figure 5: DE-LU price correlation with neighbouring zones varies by season

5 NO_4 Comparison

NO_4 serves as a natural comparison to DE-LU. By fitting the same regression structure to a zone with a different generation mix, we can see how market composition shapes the seasonal risk profile.

5.1 Feature Engineering

Show code
# ── Clean comparison zone generation: standardise names, hourly agg ──
# ENTSO-E columns use spaces; select what's available
available_gen_cols <- names(comp_gen_raw) |>
  str_subset("^(Wind Onshore|Wind Offshore|Solar|Fossil Gas|datetime_utc)$")

comp_gen <- comp_gen_raw |>
  select(all_of(available_gen_cols))

# Standardise names
comp_gen <- comp_gen |>
  rename_with(\(x) case_when(
    x == "Wind Onshore"  ~ "wind_onshore",
    x == "Wind Offshore" ~ "wind_offshore",
    x == "Solar"         ~ "solar",
    x == "Fossil Gas"    ~ "fossil_gas",
    TRUE                 ~ x
  ))

# Add zeros for missing columns (e.g. some zones lack offshore wind)
for (col in c("wind_onshore", "wind_offshore", "solar", "fossil_gas")) {
  if (!col %in% names(comp_gen)) comp_gen[[col]] <- 0
}

# Aggregate to hourly (generation file may be sub-hourly)
comp_gen <- comp_gen |>
  mutate(datetime_utc = floor_date(datetime_utc, "hour")) |>
  group_by(datetime_utc) |>
  summarise(across(everything(), \(x) mean(x, na.rm = TRUE)),
            .groups = "drop")

# ── Comparison zone net exports ──────────────────────────────────────
comp_export_cols <- names(flows_raw) |>
  str_subset(glue("^{comp_zone}->")) |>
  str_subset("datetime", negate = TRUE)

comp_import_cols <- names(flows_raw) |>
  str_subset(glue("->{comp_zone}$"))

flows_comp <- flows_raw |>
  select(datetime_utc, all_of(c(comp_export_cols, comp_import_cols))) |>
  mutate(datetime_utc = floor_date(datetime_utc, "hour")) |>
  group_by(datetime_utc) |>
  summarise(across(everything(), \(x) mean(x, na.rm = TRUE)),
            .groups = "drop") |>
  mutate(
    total_exports = rowSums(pick(all_of(comp_export_cols)), na.rm = TRUE),
    total_imports = rowSums(pick(all_of(comp_import_cols)), na.rm = TRUE),
    net_export_mw = total_exports - total_imports
  ) |>
  select(datetime_utc, net_export_mw)

# ── Comparison zone prices ───────────────────────────────────────────
comp_prices <- prices_all |>
  select(datetime_utc, comp_price = all_of(comp_zone))

# ── Assemble comparison scenario table ───────────────────────────────
comp_df <- comp_gen |>
  left_join(comp_load, by = "datetime_utc") |>
  mutate(
    renewable_gen   = wind_onshore + wind_offshore + solar,
    renewable_share = renewable_gen / total_load,
    residual_load   = total_load - renewable_gen,
    date   = as_date(datetime_utc),
    hour   = hour(datetime_utc),
    dow    = wday(datetime_utc, label = TRUE, week_start = 1),
    season = assign_season(datetime_utc)
  ) |>
  left_join(flows_comp, by = "datetime_utc") |>
  left_join(comp_prices, by = "datetime_utc") |>
  left_join(ttf |> select(date, ttf_eur_mwh), by = "date") |>
  arrange(datetime_utc) |>
  mutate(
    renewable_share = na_kalman(renewable_share, model = "StructTS"),
    residual_load   = na_kalman(residual_load, model = "StructTS"),
    net_export_mw   = na_kalman(net_export_mw, model = "StructTS"),
    ttf_eur_mwh     = zoo::na.locf(zoo::na.locf(ttf_eur_mwh, na.rm = FALSE),
                                    fromLast = TRUE)
  ) |>
  filter(!is.na(comp_price))

cat(glue("{comp_zone} scenario table: {nrow(comp_df)} rows, ",
         "{sum(complete.cases(comp_df))} complete cases\n"))
NO_4 scenario table: 52608 rows, 52608 complete cases

5.2 Seasonal Variance Attribution

We fit the same model structure as Stage 3 — OLS with the same feature groups — within each season. Partial R² measures how much price variance each feature group explains: drop the feature, see how much R² falls.

Show code
# ── Partial R² by feature group, by season ───────────────────────────
feature_groups <- list(
  "Renewable share" = "renewable_share",
  "Residual load"   = "residual_load",
  "Net exports"     = "net_export_mw",
  "TTF gas price"   = "ttf_eur_mwh",
  "Hour of day"     = "factor(hour)",
  "Day of week"     = "dow"
)

full_formula <- comp_price ~ renewable_share + residual_load +
  net_export_mw + ttf_eur_mwh + factor(hour) + dow

# ── Helper: compute partial R² for one season ───────────────────────
compute_partial_r2 <- function(data, full_form, groups) {
  full_fit <- lm(full_form, data = data)
  full_r2  <- summary(full_fit)$r.squared

  map_dfr(names(groups), \(gname) {
    drop_term <- groups[[gname]]
    # Build reduced formula by removing the term
    reduced_form <- update(full_form, paste("~ . -", drop_term))
    reduced_fit  <- lm(reduced_form, data = data)
    reduced_r2   <- summary(reduced_fit)$r.squared

    tibble(
      feature_group = gname,
      partial_r2    = full_r2 - reduced_r2
    )
  }) |>
    mutate(full_r2 = full_r2)
}

comp_partial <- map_dfr(levels(comp_df$season), \(s) {
  sdata <- comp_df |> filter(season == s)
  compute_partial_r2(sdata, full_formula, feature_groups) |>
    mutate(season = s, zone = comp_zone)
})

# ── Same analysis for DE-LU (using scenario_df from earlier) ────────
delu_formula <- price_de_lu ~ renewable_share + residual_load +
  net_export_mw + ttf_eur_mwh + factor(hour) + dow

delu_partial <- map_dfr(levels(scenario_df$season), \(s) {
  sdata <- scenario_df |> filter(season == s)
  compute_partial_r2(sdata, delu_formula, feature_groups) |>
    mutate(season = s, zone = "DE_LU")
})

# ── Combine for plotting ────────────────────────────────────────────
partial_r2_combined <- bind_rows(delu_partial, comp_partial)
Show code
ggplot(partial_r2_combined,
       aes(x = feature_group, y = partial_r2, fill = zone)) +
  geom_col(position = position_dodge(width = 0.7), width = 0.6) +
  facet_wrap(~ season, ncol = 2) +
  scale_fill_manual(
    values = c("DE_LU" = "#4393C3", setNames("#D6604D", comp_zone)),
    labels = c("DE_LU" = "DE-LU", setNames(comp_zone, comp_zone))
  ) +
  scale_y_continuous(labels = percent_format()) +
  labs(
    title    = glue("What Drives Prices? DE-LU vs {comp_zone}"),
    subtitle = "Partial R² — the price variance explained by each feature group, by season",
    x = NULL, y = "Partial R² (share of price variance explained)",
    fill = "Zone"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
Figure 6: Seasonal variance attribution — which factors drive prices in each zone?
Show code
# ── Summary: dominant factor per zone × season ──────────────────────
r2_summary <- partial_r2_combined |>
  group_by(zone, season) |>
  slice_max(partial_r2, n = 1) |>
  ungroup() |>
  select(Zone = zone, Season = season,
         `Dominant Factor` = feature_group,
         `Partial R²` = partial_r2, `Full R²` = full_r2)

r2_summary |>
  gt() |>
  fmt_percent(columns = c(`Partial R²`, `Full R²`), decimals = 1) |>
  tab_header(
    title    = glue("Dominant Price Driver by Season — DE-LU vs {comp_zone}"),
    subtitle = "The single feature group explaining the most price variance in each season"
  )
Dominant Price Driver by Season — DE-LU vs NO_4
The single feature group explaining the most price variance in each season
Zone Season Dominant Factor Partial R² Full R²
DE_LU Autumn TTF gas price 47.5% 76.1%
DE_LU Spring TTF gas price 51.2% 85.2%
DE_LU Summer TTF gas price 68.5% 89.5%
DE_LU Winter TTF gas price 49.1% 80.1%
NO_4 Autumn TTF gas price 5.8% 17.0%
NO_4 Spring Residual load 4.9% 8.9%
NO_4 Summer TTF gas price 3.2% 8.6%
NO_4 Winter TTF gas price 9.9% 22.3%
NoteInterpreting the comparison

If NO_4 has higher wind penetration than Germany, we’d expect renewable share to dominate price variance across more seasons — not just summer. Conversely, if gas generation plays a smaller role, TTF should matter less. These structural differences determine what a hedging strategy needs to cover in each market.

5.3 Post-Crisis Comparison (2023-2025)

The full-sample analysis is dominated by the 2022 energy crisis. Restricting to the post-crisis period (2023-2025) isolates the current market regime — the environment a hedging desk faces today. This reveals whether the variance attribution structure has shifted as gas prices stabilised and renewable capacity expanded.

Show code
post_start <- as.POSIXct("2023-01-01", tz = "UTC")

# ── Post-crisis DE-LU ────────────────────────────────────────────────
delu_partial_post <- map_dfr(levels(scenario_df$season), \(s) {
  sdata <- scenario_df |>
    filter(season == s, datetime_utc >= post_start)
  if (nrow(sdata) < 100) return(tibble())
  compute_partial_r2(sdata, delu_formula, feature_groups) |>
    mutate(season = s, zone = "DE_LU")
})

# ── Post-crisis comparison zone ──────────────────────────────────────
comp_partial_post <- map_dfr(levels(comp_df$season), \(s) {
  sdata <- comp_df |>
    filter(season == s, datetime_utc >= post_start)
  if (nrow(sdata) < 100) return(tibble())
  compute_partial_r2(sdata, full_formula, feature_groups) |>
    mutate(season = s, zone = comp_zone)
})

partial_r2_post <- bind_rows(delu_partial_post, comp_partial_post)
Show code
ggplot(partial_r2_post,
       aes(x = feature_group, y = partial_r2, fill = zone)) +
  geom_col(position = position_dodge(width = 0.7), width = 0.6) +
  facet_wrap(~ season, ncol = 2) +
  scale_fill_manual(
    values = c("DE_LU" = "#4393C3", setNames("#D6604D", comp_zone)),
    labels = c("DE_LU" = "DE-LU", setNames(comp_zone, comp_zone))
  ) +
  scale_y_continuous(labels = percent_format()) +
  labs(
    title    = glue("Post-Crisis Variance Attribution — DE-LU vs {comp_zone} (2023-2025)"),
    subtitle = "Partial R² — same structure as the full-sample analysis, restricted to the current regime",
    x = NULL, y = "Partial R² (share of price variance explained)",
    fill = "Zone"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
Figure 7: Post-crisis (2023-2025): variance attribution shifts as gas prices stabilise. Renewable share and residual load gain importance relative to TTF.
Show code
# ── Summary: how does the dominant driver change post-crisis? ────────
r2_post_summary <- partial_r2_post |>
  group_by(zone, season) |>
  slice_max(partial_r2, n = 1) |>
  ungroup() |>
  select(Zone = zone, Season = season,
         `Dominant Factor` = feature_group,
         `Partial R²` = partial_r2, `Full R²` = full_r2)

r2_post_summary |>
  gt() |>
  fmt_percent(columns = c(`Partial R²`, `Full R²`), decimals = 1) |>
  tab_header(
    title    = glue("Post-Crisis: Dominant Price Driver — DE-LU vs {comp_zone} (2023-2025)"),
    subtitle = "With gas prices stabilised, renewable variability becomes the dominant factor in more seasons"
  )
Table 1
Post-Crisis: Dominant Price Driver — DE-LU vs NO_4 (2023-2025)
With gas prices stabilised, renewable variability becomes the dominant factor in more seasons
Zone Season Dominant Factor Partial R² Full R²
DE_LU Autumn Hour of day 3.5% 71.8%
DE_LU Spring TTF gas price 4.1% 83.1%
DE_LU Summer Residual load 1.8% 79.5%
DE_LU Winter TTF gas price 5.8% 69.6%
NO_4 Autumn Residual load 17.0% 33.6%
NO_4 Spring Residual load 10.1% 19.8%
NO_4 Summer Residual load 4.0% 18.0%
NO_4 Winter Residual load 3.6% 17.4%
NoteFull-sample vs post-crisis

Comparing these results with the full-sample analysis above reveals the 2022 crisis effect. In the full sample, TTF dominates across all seasons and both zones. In the post-crisis period, the variance attribution spreads across renewable share, residual load, and temporal patterns — this is the market structure that determines hedging strategy today.

6 Hedging Implications

6.1 What the Risk Structure Means for Hedging

The analysis above reveals a clear seasonal pattern in what drives revenue risk for a wind portfolio:

Show code
# ── Synthesise key findings into a narrative table ──────────────────
hedging_df <- risk_metrics |>
  left_join(
    variant_summary |> select(season, sd_ratio_ba),
    by = "season"
  ) |>
  mutate(
    weather_pct  = scales::percent(weather_share, accuracy = 1),
    market_pct   = scales::percent(1 - weather_share, accuracy = 1),
    recommendation = case_when(
      weather_share > 0.6 ~ "Weather derivatives or volume hedging most effective",
      weather_share < 0.4 ~ "Gas price exposure management (TTF swaps) is priority",
      TRUE                ~ "Blended approach: both weather and gas hedges needed"
    )
  ) |>
  select(
    Season           = season,
    `Monthly CVaR 95% (M EUR)` = cvar_95,
    `Weather-Driven`  = weather_pct,
    `Market-Driven`   = market_pct,
    `Hedging Focus`   = recommendation
  )

hedging_df |>
  gt() |>
  fmt_number(columns = `Monthly CVaR 95% (M EUR)`, decimals = 2) |>
  tab_header(
    title    = "Seasonal Hedging Strategy Guide",
    subtitle = glue("{params$portfolio_mw} MW onshore wind in {params$portfolio_zone}")
  ) |>
  tab_footnote("Weather-Driven = share of revenue volatility from wind variability (weather-only SD / realized SD); Market-Driven = gas + cross-border flows.")
Seasonal Hedging Strategy Guide
400 MW onshore wind in DE-LU
Season Monthly CVaR 95% (M EUR) Weather-Driven Market-Driven Hedging Focus
Winter 3.75 58% 42% Blended approach: both weather and gas hedges needed
Spring 1.94 52% 48% Blended approach: both weather and gas hedges needed
Summer 1.57 30% 70% Gas price exposure management (TTF swaps) is priority
Autumn 3.04 47% 53% Blended approach: both weather and gas hedges needed
Weather-Driven = share of revenue volatility from wind variability (weather-only SD / realized SD); Market-Driven = gas + cross-border flows.

6.2 Cross-Border Capacity as a Risk Tool

The ENTSO-E Forward Capacity Allocation (FCA) regulation governs how cross-border transmission rights are allocated — the framework that determines whether generators can actually access neighbouring markets for hedging purposes.

Our cross-zone analysis shows why this matters:

  • CWE neighbours (AT, NL, BE, FR) are highly correlated with DE-LU. Buying transmission rights into these zones provides limited price diversification — prices move together. The primary value of CWE interconnection is physical delivery optionality, not price hedging.

  • Nordic zones (NO, SE, FI) show lower correlation, meaning interconnector capacity to the Nordics offers genuine portfolio diversification. However, this is constrained by physical capacity and allocation mechanisms under the FCA framework.

  • Seasonal variation in correlation means the diversification benefit itself is not constant. A hedging strategy that relies on Nordic decorrelation in summer may find that correlation tightens in winter when the entire continent faces cold weather simultaneously.

ImportantThe policy punchline

Forward capacity allocation isn’t just a grid operations question — it’s a risk management tool. The FCA framework directly affects whether renewable generators can access the cross-border diversification that our correlation analysis shows exists in the Nordics. Market participants bidding for long-term transmission rights should account for the seasonal correlation structure documented here.

7 Technical Notes

7.1 Simulation Convergence

Are 10,000 simulations enough for stable risk estimates? The plot below shows how CVaR 95% on realized revenue stabilises as the number of draws increases.

Show code
# ── Convergence check: compute running CVaR on realized revenue ──────
winter_rev <- sim_results |>
  filter(season == "Winter") |>
  arrange(sim_id) |>
  pull(monthly_rev_realized)

# Compute CVaR at increasing sample sizes (every 50 draws)
check_points <- seq(200, length(winter_rev), by = 50)

convergence <- tibble(
  running_n    = check_points,
  running_cvar = map_dbl(check_points, \(n) {
    compute_cvar(winter_rev[1:n], 0.95) / 1e6
  })
)

ggplot(convergence, aes(x = running_n, y = running_cvar)) +
  geom_line(colour = "#4393C3", linewidth = 0.5) +
  geom_hline(
    yintercept = compute_cvar(
      sim_results |> filter(season == "Winter") |> pull(monthly_rev_realized),
      0.95
    ) / 1e6,
    linetype = "dashed", colour = "#B2182B"
  ) +
  labs(
    title    = "Simulation Convergence — Winter CVaR 95% (Realized Revenue)",
    subtitle = "The red dashed line shows the final estimate; convergence is reached well before 10,000 draws",
    x = "Number of Simulations",
    y = "Running CVaR 95% (M EUR)"
  )
Figure 8: CVaR 95% on realized revenue converges well before the full simulation count

7.2 Assumptions & Limitations

This analysis makes several deliberate simplifications:

  • Fully unhedged portfolio. In practice, most generators hedge 60–90% of expected output via forward contracts or PPAs. The unhedged case represents the maximum spot exposure — the risk baseline against which any hedging strategy is measured.
  • No balancing costs, grid fees, or curtailment. Real net revenue is lower than price × generation.
  • No outages or maintenance. Generation is modelled as purely weather-driven via the Stage 1 capacity factor models.
  • Single technology, single zone. Real portfolios diversify across technologies and geographies — exactly the kind of diversification our cross-zone analysis quantifies.
  • Realized-price VaR/CVaR. Risk metrics are computed on realized revenue (actual historical prices × predicted generation) rather than model-predicted prices. This avoids the problem of model predictions smoothing out tail events — extreme price spikes and negative price hours are captured at their actual magnitude. The trade-off is that realized revenue includes noise that the model would filter, but for risk measurement, capturing the full tail is more important than noise reduction.
  • 5-day block bootstrap. By resampling blocks of 5 consecutive days rather than individual days, the simulation preserves intra-week serial correlation in weather patterns. Multi-day wind droughts and persistent high-pressure systems are the primary drivers of the heaviest revenue losses; an iid daily bootstrap would understate tail risk by breaking these sequences apart. The block size of 5 days balances correlation preservation against the number of unique blocks available per season.
  • Historical resampling and the 2022 energy crisis. The simulation draws from the full 2020–2025 period, including the 2022 gas price spike. In Variant A (weather-only), this has minimal impact — TTF is fixed at seasonal medians. In Variant B and the realized pathway, the 2022 period widens the tails significantly. This is arguably appropriate: risk metrics should reflect scenarios that have actually occurred. The walk-forward cross-validation in Stage 3 confirms the price model generalises across both crisis and non-crisis regimes.
  • Price model is linear (lasso). Extreme prices driven by scarcity events or negative pricing during high-wind periods may be underrepresented by the lasso. The XGBoost benchmark in Stage 3 captures these non-linearities better; quantile XGBoost predictions (q05/q95) are available alongside the lasso for distributional analysis.

7.3 Data Sources

  • SMARD (Bundesnetzagentur): DE-LU generation, load, and day-ahead price. SMARD.de, CC BY 4.0.
  • ENTSO-E Transparency Platform: Multi-zone prices, cross-border flows, and generation by type. Used per ENTSO-E terms of service.
  • Copernicus ERA5 Reanalysis: Hourly wind speed and solar irradiance. Hersbach et al. (2020), doi:10.24381/cds.adbb2d47.
  • Yahoo Finance: TTF natural gas front-month futures (ICE Dutch TTF, EUR/MWh).