Risk Quantification & Hedging Implications

1 Portfolio Definition

This analysis quantifies the spot-price revenue risk facing a stylised, fully unhedged renewable energy portfolio in the DE-LU bidding zone. The portfolio is defined as:

Show code

tibble(
  Parameter = c("Installed capacity", "Technology", "Bidding zone",
                 "Market position", "Forward hedging", "Comparison zone"),
  Value     = c(glue("{params$portfolio_mw} MW"), params$portfolio_tech,
                params$portfolio_zone, "Long spot (generator)",
                "None — fully exposed to day-ahead price",
                params$comparison_zone)
) |>
  gt() |>
  tab_header(title = "Stylised Portfolio Parameters") |>
  cols_align(align = "left")

Parameter	Value
Stylised Portfolio Parameters
Installed capacity	400 MW
Technology	wind_onshore
Bidding zone	DE-LU
Market position	Long spot (generator)
Forward hedging	None — fully exposed to day-ahead price
Comparison zone	NO_4

What does “fully unhedged” mean?

A wind farm that sells all output on the day-ahead spot market earns revenue = price × generation in each delivery hour. Both sides of this equation are weather-driven: wind speed determines generation volume, while the aggregate wind and solar fleet drives the renewable share that (per Stage 3) is the dominant price factor in summer. This creates a natural short hedge — high wind hours produce more MWh but depress the price via the merit order effect. The simulation below quantifies how much residual risk remains after this embedded correlation.

2 Monte Carlo Framework

2.1 Feature Engineering

We reconstruct the same feature set used in the Stage 3 price model: renewable share, residual load, net export balance, TTF gas price (level and daily change), thermal generation, solar irradiance, centred temperature, renewable ramp rate, holidays, hour, day-of-week, and season.

Show code

# ── Aggregate cross-border flows to hourly net export balance ────────
# Identify DE-LU flow columns
export_cols <- names(flows_raw) |>
  str_subset("^DE_LU->") |>
  str_subset("datetime", negate = TRUE)

import_cols <- names(flows_raw) |>
  str_subset("->DE_LU$")

flows_delu <- flows_raw |>
  select(datetime_utc, all_of(c(export_cols, import_cols))) |>
  mutate(datetime_utc = floor_date(datetime_utc, "hour")) |>
  group_by(datetime_utc) |>
  summarise(
    across(everything(), \(x) mean(x, na.rm = TRUE)),
    .groups = "drop"
  ) |>
  mutate(
    total_exports = rowSums(pick(all_of(export_cols)), na.rm = TRUE),
    total_imports = rowSums(pick(all_of(import_cols)), na.rm = TRUE),
    net_export_mw = total_exports - total_imports
  ) |>
  select(datetime_utc, net_export_mw)

Show code

# ── Join all data sources into a single scenario table ───────────────
scenario_df <- smard |>
  # Derived generation features
  mutate(
    wind_total  = wind_onshore + wind_offshore,
    solar_total = solar,
    renewable_gen = wind_total + solar_total,
    renewable_share = renewable_gen / total_load,
    residual_load = total_load - renewable_gen,
    thermal_gen = biomass + gas + hard_coal + lignite,
    date = as_date(datetime_utc),
    hour = hour(datetime_utc),
    dow  = wday(datetime_utc, label = TRUE, week_start = 1),
    season = assign_season(datetime_utc),
    month = factor(month(datetime_utc, label = TRUE), ordered = FALSE),
    is_holiday = as.integer(date %in% holiday_dates)
  ) |>
  # Join ERA5 weather (ssrd, temperature)
  left_join(
    era5_delu,
    by = "datetime_utc"
  ) |>
  # Join cross-border flows
  left_join(flows_delu, by = "datetime_utc") |>
  # Join TTF gas price (daily → hourly via date, includes ttf_change)
  left_join(ttf, by = "date") |>
  arrange(datetime_utc) |>
  # Renewable ramp rate (hour-over-hour change)
  mutate(renewable_ramp = renewable_share - lag(renewable_share))

cat("Scenario table:", nrow(scenario_df), "rows,",
    sum(complete.cases(scenario_df)), "complete cases\n")

Scenario table: 52584 rows, 24554 complete cases

Show code

# ── Impute missing values (matching 03_prices.qmd strategy) ──────────
# Kalman smoother for time-series features; TTF forward/back-filled
scenario_df <- scenario_df |>
  arrange(datetime_utc) |>
  mutate(
    renewable_share    = na_kalman(renewable_share, model = "StructTS"),
    residual_load      = na_kalman(residual_load, model = "StructTS"),
    net_export_mw      = na_kalman(net_export_mw, model = "StructTS"),
    thermal_gen        = na_kalman(thermal_gen, model = "StructTS"),
    wind_speed_100m    = na_kalman(wind_speed_100m, model = "StructTS"),
    wind_speed_10m     = na_kalman(wind_speed_10m, model = "StructTS"),
    temperature_2m     = na_kalman(temperature_2m, model = "StructTS"),
    ssrd_wm2           = na_kalman(ssrd_wm2, model = "StructTS"),
    temp_centered      = na_kalman(temp_centered, model = "StructTS"),
    temp_squared       = na_kalman(temp_squared, model = "StructTS"),
    ttf_eur_mwh        = zoo::na.locf(zoo::na.locf(ttf_eur_mwh, na.rm = FALSE),
                                       fromLast = TRUE),
    ttf_change         = zoo::na.locf(zoo::na.locf(ttf_change, na.rm = FALSE),
                                       fromLast = TRUE)
  ) |>
  # Re-derive renewable ramp after imputation; fill leading NA
  mutate(
    renewable_ramp = renewable_share - lag(renewable_share),
    renewable_ramp = replace_na(renewable_ramp, 0)
  )

n_remaining_na <- scenario_df |>
  select(renewable_share, residual_load, net_export_mw, ttf_eur_mwh,
         ttf_change, thermal_gen, ssrd_wm2, temp_centered, temp_squared) |>
  summarise(across(everything(), \(x) sum(is.na(x))))

cat("Remaining NAs after imputation:\n")

Remaining NAs after imputation:

Show code

print(n_remaining_na)

# A tibble: 1 × 9
  renewable_share residual_load net_export_mw ttf_eur_mwh ttf_change thermal_gen
            <int>         <int>         <int>       <int>      <int>       <int>
1               0             0             0           0          0           0
# ℹ 3 more variables: ssrd_wm2 <int>, temp_centered <int>, temp_squared <int>

2.2 Predict Prices & Generation

We pre-compute predicted prices and portfolio generation for every historical hour before resampling. This keeps the Monte Carlo loop fast — it only needs to sample and sum pre-computed daily revenues.

Variant A (weather-only): TTF and net exports fixed at their seasonal medians; TTF change set to zero. Isolates the pure weather-driven component of price risk.

Variant B (full historical): All features drawn from the same historical day. Captures total risk including gas price and cross-border flow co-movement.

Realized: Actual historical prices (price_de_lu) multiplied by predicted generation. This is the ground-truth revenue stream — what the portfolio would have earned on the spot market. Risk metrics (VaR/CVaR) are computed on realized revenue because it represents actual market outcomes rather than model predictions.

Show code

# ── Compute seasonal medians for Variant A ───────────────────────────
seasonal_medians <- scenario_df |>
  group_by(season) |>
  summarise(
    ttf_median     = median(ttf_eur_mwh, na.rm = TRUE),
    exports_median = median(net_export_mw, na.rm = TRUE),
    .groups = "drop"
  )

seasonal_medians |>
  gt() |>
  fmt_number(columns = c(ttf_median, exports_median), decimals = 1) |>
  cols_label(
    season         = "Season",
    ttf_median     = "TTF Median (EUR/MWh)",
    exports_median = "Net Exports Median (MW)"
  ) |>
  tab_header(title = "Conditional Values for Weather-Only Variant")

Season	TTF Median (EUR/MWh)	Net Exports Median (MW)
Conditional Values for Weather-Only Variant
Winter	41.2	3,556.2
Spring	31.8	−378.9
Summer	34.2	−2,376.7
Autumn	40.8	963.4

Show code

# ── Variant B: predict with original features ────────────────────────
scenario_df <- scenario_df |>
  mutate(pred_price_b = predict(price_model, new_data = scenario_df)$.pred)

# ── Variant A: swap in seasonal medians, then predict ────────────────
scenario_a <- scenario_df |>
  select(-ttf_eur_mwh, -net_export_mw, -ttf_change) |>
  left_join(seasonal_medians, by = "season") |>
  rename(
    ttf_eur_mwh    = ttf_median,
    net_export_mw  = exports_median
  ) |>
  mutate(ttf_change = 0)  # No gas price movement under weather-only

scenario_df <- scenario_df |>
  mutate(pred_price_a = predict(price_model, new_data = scenario_a)$.pred)

# ── Quantile predictions (q05 / q95) ────────────────────────────────
scenario_df <- scenario_df |>
  mutate(
    pred_q05 = predict(price_xgb_q05, new_data = scenario_df)$.pred,
    pred_q95 = predict(price_xgb_q95, new_data = scenario_df)$.pred
  )

Show code

# ── Portfolio generation via Stage 1 CF model ────────────────────────
# CF models (02_weather.qmd) expect hour and month as factors,
# while the price model (03_prices.qmd) expects hour as integer.
# Convert only for the CF prediction.
cf_input <- scenario_df |>
  mutate(
    hour  = factor(hour),
    month = factor(month)
  )

scenario_df <- scenario_df |>
  mutate(
    pred_cf = predict(cf_wind_onshore, new_data = cf_input)$.pred,
    pred_cf = pmax(pred_cf, 0),
    # Generation in MWh per hour (hourly data, so MW = MWh/h)
    portfolio_gen_mwh = pred_cf * params$portfolio_mw
  )

rm(cf_input)

# ── Hourly revenue: price × generation ───────────────────────────────
scenario_df <- scenario_df |>
  mutate(
    revenue_a        = pred_price_a * portfolio_gen_mwh,
    revenue_b        = pred_price_b * portfolio_gen_mwh,
    revenue_realized = price_de_lu  * portfolio_gen_mwh,
    revenue_q05      = pred_q05     * portfolio_gen_mwh,
    revenue_q95      = pred_q95     * portfolio_gen_mwh
  )

cat("Predicted CF range:",
    round(range(scenario_df$pred_cf, na.rm = TRUE), 3), "\n")

Predicted CF range: 0 0.745

Show code

cat("Mean portfolio generation:",
    round(mean(scenario_df$portfolio_gen_mwh, na.rm = TRUE), 1), "MWh/h\n")

Mean portfolio generation: 82 MWh/h

2.3 Daily Revenue Aggregation

Show code

# ── Aggregate hourly revenue to daily totals ─────────────────────────
daily_revenue <- scenario_df |>
  filter(!is.na(revenue_a), !is.na(revenue_b), !is.na(revenue_realized)) |>
  group_by(date, season, month) |>
  summarise(
    daily_rev_a        = sum(revenue_a),
    daily_rev_b        = sum(revenue_b),
    daily_rev_realized = sum(revenue_realized),
    daily_gen_mwh      = sum(portfolio_gen_mwh),
    daily_price_a      = mean(pred_price_a),
    daily_price_b      = mean(pred_price_b),
    daily_price_real   = mean(price_de_lu),
    n_hours            = n(),
    .groups = "drop"
  ) |>
  # Keep only complete days (24 hours)
  filter(n_hours == 24)

cat("Complete days:", nrow(daily_revenue), "\n")

Complete days: 2190

Show code

cat("Days by season:\n")

Days by season:

Show code

print(count(daily_revenue, season))

# A tibble: 4 × 2
  season     n
  <fct>  <int>
1 Winter   540
2 Spring   552
3 Summer   552
4 Autumn   546

2.4 Block Bootstrap Simulation

We use a 5-day block bootstrap instead of sampling individual days independently. By resampling non-overlapping blocks of 5 consecutive days, the simulation preserves the serial correlation in weather patterns (multi-day wind droughts, persistent high-pressure systems) that drives the heaviest revenue losses. Each synthetic month consists of 6 blocks × 5 days = 30 days, constructed independently for each season from that season’s historical pool.

Show code

set.seed(2026)
n_sim            <- params$n_sim
block_size       <- 5
n_blocks         <- params$n_days_per_month %/% block_size   # 6 blocks × 5 days = 30

# ── Create non-overlapping 5-day blocks within each season ───────────
daily_revenue <- daily_revenue |>
  group_by(season) |>
  arrange(date, .by_group = TRUE) |>
  mutate(block_id = (row_number() - 1) %/% block_size) |>
  ungroup()

# ── Aggregate to block-level totals ──────────────────────────────────
block_revenue <- daily_revenue |>
  group_by(season, block_id) |>
  summarise(
    block_rev_a        = sum(daily_rev_a),
    block_rev_b        = sum(daily_rev_b),
    block_rev_realized = sum(daily_rev_realized),
    block_gen_mwh      = sum(daily_gen_mwh),
    block_price_a      = mean(daily_price_a),
    block_price_b      = mean(daily_price_b),
    block_price_real   = mean(daily_price_real),
    block_days         = n(),
    .groups = "drop"
  ) |>
  # Keep only complete blocks (exactly 5 days)
  filter(block_days == block_size)

cat("Complete 5-day blocks by season:\n")

Complete 5-day blocks by season:

Show code

print(count(block_revenue, season))

# A tibble: 4 × 2
  season     n
  <fct>  <int>
1 Winter   108
2 Spring   110
3 Summer   110
4 Autumn   109

Show code

# ── Bootstrap: sample blocks with replacement ────────────────────────
sim_results <- block_revenue |>
  group_by(season) |>
  group_modify(\(data, key) {
    n_pool <- nrow(data)

    # Matrix of sampled row indices: n_blocks × n_sim
    idx <- matrix(
      sample.int(n_pool, size = n_blocks * n_sim, replace = TRUE),
      nrow = n_blocks, ncol = n_sim
    )

    tibble(
      sim_id = 1:n_sim,
      # Monthly revenue (EUR) — sum of sampled block revenues
      monthly_rev_a        = colSums(matrix(data$block_rev_a[idx],
                                            nrow = n_blocks)),
      monthly_rev_b        = colSums(matrix(data$block_rev_b[idx],
                                            nrow = n_blocks)),
      monthly_rev_realized = colSums(matrix(data$block_rev_realized[idx],
                                            nrow = n_blocks)),
      # Monthly generation (MWh)
      monthly_gen          = colSums(matrix(data$block_gen_mwh[idx],
                                            nrow = n_blocks)),
      # Average daily price (EUR/MWh) — mean of block averages
      avg_price_a          = colMeans(matrix(data$block_price_a[idx],
                                             nrow = n_blocks)),
      avg_price_b          = colMeans(matrix(data$block_price_b[idx],
                                             nrow = n_blocks)),
      avg_price_real       = colMeans(matrix(data$block_price_real[idx],
                                             nrow = n_blocks))
    )
  }) |>
  ungroup()

cat("Simulation complete:", nrow(sim_results), "rows\n")

Simulation complete: 40000 rows

Show code

cat("(", n_sim, "simulations ×", n_distinct(sim_results$season), "seasons )\n")

( 10000 simulations × 4 seasons )

2.5 Variant Comparison

Show code

sim_long <- sim_results |>
  pivot_longer(
    cols = c(monthly_rev_a, monthly_rev_b, monthly_rev_realized),
    names_to  = "variant",
    values_to = "monthly_revenue"
  ) |>
  mutate(
    variant = recode(variant,
      monthly_rev_a        = "Weather-only",
      monthly_rev_b        = "Full historical",
      monthly_rev_realized = "Realized"
    ),
    monthly_revenue_meur = monthly_revenue / 1e6
  )

ggplot(sim_long, aes(x = monthly_revenue_meur, fill = variant)) +
  geom_density(alpha = 0.35, colour = NA) +
  geom_vline(
    data = sim_long |>
      group_by(season, variant) |>
      summarise(med = median(monthly_revenue_meur), .groups = "drop"),
    aes(xintercept = med, colour = variant),
    linetype = "dashed", linewidth = 0.6
  ) +
  facet_wrap(~ season, scales = "free_x", ncol = 2) +
  scale_fill_manual(values = variant_colours) +
  scale_colour_manual(values = variant_colours) +
  labs(
    title    = glue("{params$portfolio_mw} MW {params$portfolio_tech} — Monthly Revenue Distribution"),
    subtitle = "Dashed lines show median. Realized = actual prices × predicted generation.",
    x = "Monthly Revenue (M EUR)",
    y = "Density",
    fill = "Variant", colour = "Variant"
  )

Figure 1: Monthly revenue distributions by season and simulation variant. Realized uses actual historical prices; the model-based variants show how much risk the weather-only and full models capture.

Show code

sim_price_long <- sim_results |>
  pivot_longer(
    cols = c(avg_price_a, avg_price_b, avg_price_real),
    names_to  = "variant",
    values_to = "avg_price"
  ) |>
  mutate(
    variant = recode(variant,
      avg_price_a    = "Weather-only",
      avg_price_b    = "Full historical",
      avg_price_real = "Realized"
    )
  )

ggplot(sim_price_long, aes(x = avg_price, fill = variant)) +
  geom_density(alpha = 0.35, colour = NA) +
  facet_wrap(~ season, scales = "free_x", ncol = 2) +
  scale_fill_manual(values = variant_colours) +
  labs(
    title    = "Simulated Average Monthly Spot Price",
    subtitle = "Weather-only isolates renewable variability; Realized shows actual market outcomes",
    x = "Average Monthly Price (EUR/MWh)",
    y = "Density",
    fill = "Variant"
  )

Figure 2: Simulated average monthly price by season and variant

Show code

# ── Summary table: how do the three variants compare? ────────────────
variant_summary <- sim_results |>
  group_by(season) |>
  summarise(
    sd_rev_a        = sd(monthly_rev_a) / 1e6,
    sd_rev_b        = sd(monthly_rev_b) / 1e6,
    sd_rev_realized = sd(monthly_rev_realized) / 1e6,
    sd_ratio_ba     = sd(monthly_rev_b) / sd(monthly_rev_a),
    cvar95_a        = compute_cvar(monthly_rev_a, 0.95) / 1e6,
    cvar95_b        = compute_cvar(monthly_rev_b, 0.95) / 1e6,
    cvar95_realized = compute_cvar(monthly_rev_realized, 0.95) / 1e6,
    .groups = "drop"
  )

variant_summary |>
  gt() |>
  fmt_number(columns = c(sd_rev_a, sd_rev_b, sd_rev_realized,
                          cvar95_a, cvar95_b, cvar95_realized), decimals = 2) |>
  fmt_number(columns = sd_ratio_ba, decimals = 2) |>
  cols_label(
    season          = "Season",
    sd_rev_a        = "SD Weather (M EUR)",
    sd_rev_b        = "SD Full (M EUR)",
    sd_rev_realized = "SD Realized (M EUR)",
    sd_ratio_ba     = "SD Ratio (Full/Weather)",
    cvar95_a        = "CVaR 95% Weather",
    cvar95_b        = "CVaR 95% Full",
    cvar95_realized = "CVaR 95% Realized"
  ) |>
  tab_header(
    title    = "Revenue Risk: Three Variants Compared",
    subtitle = glue("{params$portfolio_mw} MW {params$portfolio_tech} — Monthly Revenue (M EUR)")
  ) |>
  tab_footnote("Realized uses actual historical prices × predicted generation. SD ratio > 1 indicates additional risk from TTF and cross-border flow variability.")

Season	SD Weather (M EUR)	SD Full (M EUR)	SD Realized (M EUR)	SD Ratio (Full/Weather)	CVaR 95% Weather	CVaR 95% Full	CVaR 95% Realized
Revenue Risk: Three Variants Compared
400 MW wind_onshore — Monthly Revenue (M EUR)
Winter	0.91	1.67	1.56	1.85	4.52	3.30	3.75
Spring	0.61	1.25	1.17	2.04	2.45	1.98	1.94
Summer	0.37	1.12	1.26	3.01	2.08	1.78	1.57
Autumn	0.66	1.62	1.40	2.47	2.69	2.25	3.04
Realized uses actual historical prices × predicted generation. SD ratio > 1 indicates additional risk from TTF and cross-border flow variability.

3 Risk Metrics

How much could a bad month cost? The table below answers that for each season, using realized revenue (actual historical prices × predicted generation) as the basis for risk metrics. This ensures VaR and CVaR reflect the revenue distribution that would have actually been observed, rather than model-filtered predictions that smooth out tail events.

Show code

risk_metrics <- sim_results |>
  group_by(season) |>
  summarise(
    median_rev     = median(monthly_rev_realized) / 1e6,
    mean_rev       = mean(monthly_rev_realized) / 1e6,
    sd_rev         = sd(monthly_rev_realized) / 1e6,
    var_95         = compute_var(monthly_rev_realized, 0.95) / 1e6,
    var_99         = compute_var(monthly_rev_realized, 0.99) / 1e6,
    cvar_95        = compute_cvar(monthly_rev_realized, 0.95) / 1e6,
    cvar_99        = compute_cvar(monthly_rev_realized, 0.99) / 1e6,
    weather_share  = sd(monthly_rev_a) / sd(monthly_rev_realized),
    .groups = "drop"
  )

risk_metrics |>
  gt() |>
  fmt_number(columns = c(median_rev, mean_rev, sd_rev,
                          var_95, var_99, cvar_95, cvar_99), decimals = 2) |>
  fmt_percent(columns = weather_share, decimals = 0) |>
  cols_label(
    season        = "Season",
    median_rev    = "Median",
    mean_rev      = "Mean",
    sd_rev        = "Std Dev",
    var_95        = "VaR 95%",
    var_99        = "VaR 99%",
    cvar_95       = "CVaR 95%",
    cvar_99       = "CVaR 99%",
    weather_share = "Weather Share"
  ) |>
  tab_header(
    title    = glue("Monthly Revenue Risk — {params$portfolio_mw} MW Onshore Wind in {params$portfolio_zone}"),
    subtitle = "All values in M EUR. Based on realized revenue (actual prices × predicted generation)."
  ) |>
  tab_spanner(label = "Central Estimate", columns = c(median_rev, mean_rev)) |>
  tab_spanner(label = "Spread", columns = c(sd_rev)) |>
  tab_spanner(label = "Downside Risk", columns = c(var_95, var_99, cvar_95, cvar_99)) |>
  tab_footnote(
    "Weather Share = fraction of total revenue volatility attributable to wind variability alone (weather-only SD / realized SD). The remainder comes from gas price and cross-border flow movements."
  )

Season	Central Estimate		Spread	Downside Risk				Weather Share
Monthly Revenue Risk — 400 MW Onshore Wind in DE-LU
All values in M EUR. Based on realized revenue (actual prices × predicted generation).
Season	Median	Mean	Std Dev	VaR 95%	VaR 99%	CVaR 95%	CVaR 99%	Weather Share
Winter	6.52	6.60	1.56	4.19	3.47	3.75	3.15	58%
Spring	3.77	3.89	1.17	2.22	1.73	1.94	1.55	52%
Summer	3.32	3.47	1.26	1.76	1.45	1.57	1.31	30%
Autumn	5.47	5.58	1.40	3.45	2.78	3.04	2.49	47%
Weather Share = fraction of total revenue volatility attributable to wind variability alone (weather-only SD / realized SD). The remainder comes from gas price and cross-border flow movements.

Reading the table

Pick a row — say Winter. The median month brings in the figure shown under Median. But in the worst 5% of months (CVaR 95%), revenue drops to the CVaR 95% figure. The Weather Share column is the key link back to Stage 3: it tells you how much of that revenue swing comes purely from wind variability versus gas and trade flow uncertainty. A low weather share means hedging weather alone won’t cover most of your risk.

4 Cross-Zone Analysis

Can a generator reduce risk by selling into multiple bidding zones? If prices in different zones move independently, geographic diversification works well. If they move in lockstep — driven by the same weather systems sweeping across Europe — the benefit is limited.

4.1 Price Correlation Across European Bidding Zones

Show code

# ── Hourly price correlation matrix across all zones ─────────────────
price_matrix <- prices_all |>
  select(-datetime_utc) |>
  cor(use = "pairwise.complete.obs")

# ── Reorder by hierarchical clustering for visual clarity ────────────
hc <- hclust(as.dist(1 - price_matrix))
price_matrix_ordered <- price_matrix[hc$order, hc$order]

Show code

cor_long <- price_matrix_ordered |>
  as.data.frame() |>
  rownames_to_column("zone_1") |>
  pivot_longer(-zone_1, names_to = "zone_2", values_to = "correlation") |>
  mutate(
    zone_1 = factor(zone_1, levels = rownames(price_matrix_ordered)),
    zone_2 = factor(zone_2, levels = rownames(price_matrix_ordered))
  )

ggplot(cor_long, aes(x = zone_1, y = zone_2, fill = correlation)) +
  geom_tile(colour = "white", linewidth = 0.3) +
  geom_text(aes(label = sprintf("%.2f", correlation)),
            size = 2.5, colour = "grey20") +
  scale_fill_distiller(
    palette  = "YlOrRd",
    direction = 1,
    limits   = c(0, 1),
    name     = "Correlation"
  ) +
  labs(
    title    = "How Closely Do European Electricity Prices Move Together?",
    subtitle = "High correlation = limited diversification from cross-border hedging",
    x = NULL, y = NULL
  ) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 9),
    axis.text.y = element_text(size = 9)
  )

Figure 3: Hourly day-ahead price correlations across European bidding zones (2020–2025)

4.2 Geographic View — Price Linkages Across Europe

Show code

library(plotly)

# ── Zone centroids (approximate geographic centres) ──────────────────
zone_coords <- tribble(
  ~zone,   ~lat,  ~lon,  ~label,
  "DE_LU", 51.0,  10.0,  "DE-LU",
  "FR",    46.5,   2.5,  "FR",
  "NL",    52.0,   5.5,  "NL",
  "BE",    50.8,   4.5,  "BE",
  "AT",    47.5,  14.0,  "AT",
  "DK_1",  56.0,   9.5,  "DK1",
  "DK_2",  55.5,  12.0,  "DK2",
  "NO_1",  60.0,  10.5,  "NO1",
  "NO_2",  59.0,   6.0,  "NO2",
  "NO_3",  63.5,  10.5,  "NO3",
  "NO_4",  68.0,  15.5,  "NO4",
  "NO_5",  60.5,   6.5,  "NO5",
  "SE_1",  66.0,  18.0,  "SE1",
  "SE_2",  63.0,  16.0,  "SE2",
  "SE_3",  59.0,  16.0,  "SE3",
  "SE_4",  56.5,  14.5,  "SE4",
  "FI",    62.0,  26.0,  "FI"
)

# ── DE-LU correlations ──────────────────────────────────────────────
delu_corr <- tibble(
  zone       = rownames(price_matrix),
  corr_de_lu = price_matrix[, "DE_LU"]
) |>
  filter(zone != "DE_LU") |>
  left_join(zone_coords, by = "zone")

delu_point <- zone_coords |> filter(zone == "DE_LU")

# ── Colour mapping (sequential warm) ────────────────────────────────
corr_to_colour <- colorRamp(c("#F7F7F7", "#FDDBC7", "#EF8A62", "#B2182B"))

delu_corr <- delu_corr |>
  mutate(
    colour_rgb = map_chr(corr_de_lu, \(x) {
      rgb_vals <- corr_to_colour(x)
      sprintf("rgb(%d,%d,%d)",
              round(rgb_vals[1]), round(rgb_vals[2]), round(rgb_vals[3]))
    }),
    hover_text = paste0(
      "<b>", label, "</b><br>",
      "Correlation with DE-LU: <b>", sprintf("%.2f", corr_de_lu), "</b>"
    )
  )

# ── Country-level averages for choropleth background ────────────────
zone_to_iso3 <- tribble(
  ~zone, ~iso3,
  "FR", "FRA", "NL", "NLD", "BE", "BEL", "AT", "AUT",
  "DK_1", "DNK", "DK_2", "DNK",
  "NO_1", "NOR", "NO_2", "NOR", "NO_3", "NOR", "NO_4", "NOR", "NO_5", "NOR",
  "SE_1", "SWE", "SE_2", "SWE", "SE_3", "SWE", "SE_4", "SWE",
  "FI", "FIN"
)

country_corrs <- delu_corr |>
  left_join(zone_to_iso3, by = "zone") |>
  group_by(iso3) |>
  summarise(corr_de_lu = mean(corr_de_lu), .groups = "drop") |>
  bind_rows(tibble(iso3 = "DEU", corr_de_lu = 1.0))

# ── Build plotly geo map (responsive sizing) ──────────────────────────
fig <- plot_geo(height = 700) |>
  # Choropleth background — country polygons filled by correlation
  add_trace(
    type       = "choropleth",
    locations  = country_corrs$iso3,
    z          = country_corrs$corr_de_lu,
    colorscale = list(
      c(0,    "rgb(222,235,247)"),
      c(0.25, "rgb(247,247,247)"),
      c(0.5,  "rgb(253,208,162)"),
      c(0.75, "rgb(227,120,71)"),
      c(1,    "rgb(178,24,43)")
    ),
    zmin = 0.15, zmax = 1,
    marker     = list(line = list(color = "rgb(180,180,180)", width = 1)),
    colorbar   = list(title = "Correlation", len = 0.4, y = 0.5),
    hoverinfo  = "none",
    showlegend = FALSE
  ) |>
  layout(
    title = list(
      text = "How Correlated Are European Electricity Prices with Germany?",
      font = list(size = 16)
    ),
    geo = list(
      scope = "europe",
      projection = list(type = "mercator"),
      lonaxis = list(range = c(-5, 30)),
      lataxis = list(range = c(44, 72)),
      showland = TRUE, landcolor = "rgb(235,235,235)",
      showocean = TRUE, oceancolor = "rgb(225,235,245)",
      showcountries = TRUE, countrycolor = "rgb(180,180,180)",
      showlakes = TRUE, lakecolor = "rgb(225,235,245)",
      showcoastlines = TRUE, coastlinecolor = "rgb(180,180,180)",
      resolution = 50
    ),
    showlegend = FALSE,
    autosize = TRUE,
    margin = list(t = 60, b = 20)
  )

# ── Connection lines from DE-LU to each zone ────────────────────────
for (i in seq_len(nrow(delu_corr))) {
  row <- delu_corr[i, ]
  fig <- fig |>
    add_trace(
      type = "scattergeo", mode = "lines",
      lon = c(delu_point$lon, row$lon),
      lat = c(delu_point$lat, row$lat),
      line = list(
        width = row$corr_de_lu * 4,
        color = row$colour_rgb
      ),
      opacity    = 0.9,
      hoverinfo  = "none",
      showlegend = FALSE
    )
}

# ── Zone markers (sized and coloured by correlation) ─────────────────
fig <- fig |>
  add_trace(
    type = "scattergeo", mode = "markers+text",
    data = delu_corr,
    lon = ~lon, lat = ~lat,
    marker = list(
      size = ~corr_de_lu * 25 + 5,
      color = ~colour_rgb,
      line = list(width = 1.5, color = "white")
    ),
    text = ~paste0(label, " (", sprintf("%.2f", corr_de_lu), ")"),
    textposition = "top center",
    textfont = list(size = 10, color = "rgb(50,50,50)"),
    hovertext = ~hover_text,
    hoverinfo = "text",
    showlegend = FALSE
  )

# ── DE-LU reference marker ──────────────────────────────────────────
fig <- fig |>
  add_trace(
    type = "scattergeo", mode = "markers+text",
    lon = delu_point$lon, lat = delu_point$lat,
    marker = list(
      size = 16, color = "#B2182B", symbol = "diamond",
      line = list(width = 2, color = "white")
    ),
    text = "DE-LU", textposition = "top center",
    textfont = list(size = 12, color = "black", family = "Arial"),
    hovertext = "<b>DE-LU</b><br>Reference zone",
    hoverinfo = "text",
    showlegend = FALSE
  )

fig

Figure 4: Price correlation with DE-LU across European bidding zones — hover for details

What the map tells us

Zones physically close to Germany (AT, NL, BE) show very high price correlation — their markets are tightly coupled through interconnectors. The Nordic zones (NO, SE, FI) are less correlated, meaning a portfolio spanning both CWE and the Nordics would capture some diversification benefit. But even the lowest correlations are well above zero: no European zone offers a true hedge against German price moves.

4.3 Seasonal Price Correlation

Price linkages aren’t fixed — they shift with season as weather patterns and demand fundamentals change.

Show code

comparison_zones <- c("AT", "NL", "FR", "BE", "DK_1", "DK_2",
                       "NO_1", "NO_2", "SE_3", "SE_4", "FI")

seasonal_corr <- prices_all |>
  mutate(season = assign_season(datetime_utc)) |>
  group_by(season) |>
  group_modify(\(data, key) {
    corrs <- data |>
      select(all_of(c("DE_LU", comparison_zones))) |>
      cor(use = "pairwise.complete.obs")

    tibble(
      zone        = comparison_zones,
      correlation = corrs["DE_LU", comparison_zones]
    )
  }) |>
  ungroup()

ggplot(seasonal_corr, aes(x = reorder(zone, -correlation),
                           y = correlation, fill = season)) +
  geom_col(position = position_dodge(width = 0.7), width = 0.6) +
  scale_fill_manual(values = season_colours) +
  geom_hline(yintercept = c(0.5, 0.75), linetype = "dotted",
             colour = "grey50") +
  labs(
    title    = "Price Correlation with DE-LU Shifts by Season",
    subtitle = "CWE neighbours stay tightly linked year-round; Nordic correlation varies more",
    x = NULL, y = "Correlation with DE-LU", fill = "Season"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Figure 5: DE-LU price correlation with neighbouring zones varies by season

5 NO_4 Comparison

NO_4 serves as a natural comparison to DE-LU. By fitting the same regression structure to a zone with a different generation mix, we can see how market composition shapes the seasonal risk profile.

5.1 Feature Engineering

Show code

# ── Clean comparison zone generation: standardise names, hourly agg ──
# ENTSO-E columns use spaces; select what's available
available_gen_cols <- names(comp_gen_raw) |>
  str_subset("^(Wind Onshore|Wind Offshore|Solar|Fossil Gas|datetime_utc)$")

comp_gen <- comp_gen_raw |>
  select(all_of(available_gen_cols))

# Standardise names
comp_gen <- comp_gen |>
  rename_with(\(x) case_when(
    x == "Wind Onshore"  ~ "wind_onshore",
    x == "Wind Offshore" ~ "wind_offshore",
    x == "Solar"         ~ "solar",
    x == "Fossil Gas"    ~ "fossil_gas",
    TRUE                 ~ x
  ))

# Add zeros for missing columns (e.g. some zones lack offshore wind)
for (col in c("wind_onshore", "wind_offshore", "solar", "fossil_gas")) {
  if (!col %in% names(comp_gen)) comp_gen[[col]] <- 0
}

# Aggregate to hourly (generation file may be sub-hourly)
comp_gen <- comp_gen |>
  mutate(datetime_utc = floor_date(datetime_utc, "hour")) |>
  group_by(datetime_utc) |>
  summarise(across(everything(), \(x) mean(x, na.rm = TRUE)),
            .groups = "drop")

# ── Comparison zone net exports ──────────────────────────────────────
comp_export_cols <- names(flows_raw) |>
  str_subset(glue("^{comp_zone}->")) |>
  str_subset("datetime", negate = TRUE)

comp_import_cols <- names(flows_raw) |>
  str_subset(glue("->{comp_zone}$"))

flows_comp <- flows_raw |>
  select(datetime_utc, all_of(c(comp_export_cols, comp_import_cols))) |>
  mutate(datetime_utc = floor_date(datetime_utc, "hour")) |>
  group_by(datetime_utc) |>
  summarise(across(everything(), \(x) mean(x, na.rm = TRUE)),
            .groups = "drop") |>
  mutate(
    total_exports = rowSums(pick(all_of(comp_export_cols)), na.rm = TRUE),
    total_imports = rowSums(pick(all_of(comp_import_cols)), na.rm = TRUE),
    net_export_mw = total_exports - total_imports
  ) |>
  select(datetime_utc, net_export_mw)

# ── Comparison zone prices ───────────────────────────────────────────
comp_prices <- prices_all |>
  select(datetime_utc, comp_price = all_of(comp_zone))

# ── Assemble comparison scenario table ───────────────────────────────
comp_df <- comp_gen |>
  left_join(comp_load, by = "datetime_utc") |>
  mutate(
    renewable_gen   = wind_onshore + wind_offshore + solar,
    renewable_share = renewable_gen / total_load,
    residual_load   = total_load - renewable_gen,
    date   = as_date(datetime_utc),
    hour   = hour(datetime_utc),
    dow    = wday(datetime_utc, label = TRUE, week_start = 1),
    season = assign_season(datetime_utc)
  ) |>
  left_join(flows_comp, by = "datetime_utc") |>
  left_join(comp_prices, by = "datetime_utc") |>
  left_join(ttf |> select(date, ttf_eur_mwh), by = "date") |>
  arrange(datetime_utc) |>
  mutate(
    renewable_share = na_kalman(renewable_share, model = "StructTS"),
    residual_load   = na_kalman(residual_load, model = "StructTS"),
    net_export_mw   = na_kalman(net_export_mw, model = "StructTS"),
    ttf_eur_mwh     = zoo::na.locf(zoo::na.locf(ttf_eur_mwh, na.rm = FALSE),
                                    fromLast = TRUE)
  ) |>
  filter(!is.na(comp_price))

cat(glue("{comp_zone} scenario table: {nrow(comp_df)} rows, ",
         "{sum(complete.cases(comp_df))} complete cases\n"))

NO_4 scenario table: 52608 rows, 52608 complete cases

5.2 Seasonal Variance Attribution

We fit the same model structure as Stage 3 — OLS with the same feature groups — within each season. Partial R² measures how much price variance each feature group explains: drop the feature, see how much R² falls.

Show code

# ── Partial R² by feature group, by season ───────────────────────────
feature_groups <- list(
  "Renewable share" = "renewable_share",
  "Residual load"   = "residual_load",
  "Net exports"     = "net_export_mw",
  "TTF gas price"   = "ttf_eur_mwh",
  "Hour of day"     = "factor(hour)",
  "Day of week"     = "dow"
)

full_formula <- comp_price ~ renewable_share + residual_load +
  net_export_mw + ttf_eur_mwh + factor(hour) + dow

# ── Helper: compute partial R² for one season ───────────────────────
compute_partial_r2 <- function(data, full_form, groups) {
  full_fit <- lm(full_form, data = data)
  full_r2  <- summary(full_fit)$r.squared

  map_dfr(names(groups), \(gname) {
    drop_term <- groups[[gname]]
    # Build reduced formula by removing the term
    reduced_form <- update(full_form, paste("~ . -", drop_term))
    reduced_fit  <- lm(reduced_form, data = data)
    reduced_r2   <- summary(reduced_fit)$r.squared

    tibble(
      feature_group = gname,
      partial_r2    = full_r2 - reduced_r2
    )
  }) |>
    mutate(full_r2 = full_r2)
}

comp_partial <- map_dfr(levels(comp_df$season), \(s) {
  sdata <- comp_df |> filter(season == s)
  compute_partial_r2(sdata, full_formula, feature_groups) |>
    mutate(season = s, zone = comp_zone)
})

# ── Same analysis for DE-LU (using scenario_df from earlier) ────────
delu_formula <- price_de_lu ~ renewable_share + residual_load +
  net_export_mw + ttf_eur_mwh + factor(hour) + dow

delu_partial <- map_dfr(levels(scenario_df$season), \(s) {
  sdata <- scenario_df |> filter(season == s)
  compute_partial_r2(sdata, delu_formula, feature_groups) |>
    mutate(season = s, zone = "DE_LU")
})

# ── Combine for plotting ────────────────────────────────────────────
partial_r2_combined <- bind_rows(delu_partial, comp_partial)

Show code

ggplot(partial_r2_combined,
       aes(x = feature_group, y = partial_r2, fill = zone)) +
  geom_col(position = position_dodge(width = 0.7), width = 0.6) +
  facet_wrap(~ season, ncol = 2) +
  scale_fill_manual(
    values = c("DE_LU" = "#4393C3", setNames("#D6604D", comp_zone)),
    labels = c("DE_LU" = "DE-LU", setNames(comp_zone, comp_zone))
  ) +
  scale_y_continuous(labels = percent_format()) +
  labs(
    title    = glue("What Drives Prices? DE-LU vs {comp_zone}"),
    subtitle = "Partial R² — the price variance explained by each feature group, by season",
    x = NULL, y = "Partial R² (share of price variance explained)",
    fill = "Zone"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Figure 6: Seasonal variance attribution — which factors drive prices in each zone?

Show code

# ── Summary: dominant factor per zone × season ──────────────────────
r2_summary <- partial_r2_combined |>
  group_by(zone, season) |>
  slice_max(partial_r2, n = 1) |>
  ungroup() |>
  select(Zone = zone, Season = season,
         `Dominant Factor` = feature_group,
         `Partial R²` = partial_r2, `Full R²` = full_r2)

r2_summary |>
  gt() |>
  fmt_percent(columns = c(`Partial R²`, `Full R²`), decimals = 1) |>
  tab_header(
    title    = glue("Dominant Price Driver by Season — DE-LU vs {comp_zone}"),
    subtitle = "The single feature group explaining the most price variance in each season"
  )

Zone	Season	Dominant Factor	Partial R²	Full R²
Dominant Price Driver by Season — DE-LU vs NO_4
The single feature group explaining the most price variance in each season
DE_LU	Autumn	TTF gas price	47.5%	76.1%
DE_LU	Spring	TTF gas price	51.2%	85.2%
DE_LU	Summer	TTF gas price	68.5%	89.5%
DE_LU	Winter	TTF gas price	49.1%	80.1%
NO_4	Autumn	TTF gas price	5.8%	17.0%
NO_4	Spring	Residual load	4.9%	8.9%
NO_4	Summer	TTF gas price	3.2%	8.6%
NO_4	Winter	TTF gas price	9.9%	22.3%

Interpreting the comparison

If NO_4 has higher wind penetration than Germany, we’d expect renewable share to dominate price variance across more seasons — not just summer. Conversely, if gas generation plays a smaller role, TTF should matter less. These structural differences determine what a hedging strategy needs to cover in each market.

5.3 Post-Crisis Comparison (2023-2025)

The full-sample analysis is dominated by the 2022 energy crisis. Restricting to the post-crisis period (2023-2025) isolates the current market regime — the environment a hedging desk faces today. This reveals whether the variance attribution structure has shifted as gas prices stabilised and renewable capacity expanded.

Show code

post_start <- as.POSIXct("2023-01-01", tz = "UTC")

# ── Post-crisis DE-LU ────────────────────────────────────────────────
delu_partial_post <- map_dfr(levels(scenario_df$season), \(s) {
  sdata <- scenario_df |>
    filter(season == s, datetime_utc >= post_start)
  if (nrow(sdata) < 100) return(tibble())
  compute_partial_r2(sdata, delu_formula, feature_groups) |>
    mutate(season = s, zone = "DE_LU")
})

# ── Post-crisis comparison zone ──────────────────────────────────────
comp_partial_post <- map_dfr(levels(comp_df$season), \(s) {
  sdata <- comp_df |>
    filter(season == s, datetime_utc >= post_start)
  if (nrow(sdata) < 100) return(tibble())
  compute_partial_r2(sdata, full_formula, feature_groups) |>
    mutate(season = s, zone = comp_zone)
})

partial_r2_post <- bind_rows(delu_partial_post, comp_partial_post)

Show code

ggplot(partial_r2_post,
       aes(x = feature_group, y = partial_r2, fill = zone)) +
  geom_col(position = position_dodge(width = 0.7), width = 0.6) +
  facet_wrap(~ season, ncol = 2) +
  scale_fill_manual(
    values = c("DE_LU" = "#4393C3", setNames("#D6604D", comp_zone)),
    labels = c("DE_LU" = "DE-LU", setNames(comp_zone, comp_zone))
  ) +
  scale_y_continuous(labels = percent_format()) +
  labs(
    title    = glue("Post-Crisis Variance Attribution — DE-LU vs {comp_zone} (2023-2025)"),
    subtitle = "Partial R² — same structure as the full-sample analysis, restricted to the current regime",
    x = NULL, y = "Partial R² (share of price variance explained)",
    fill = "Zone"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Figure 7: Post-crisis (2023-2025): variance attribution shifts as gas prices stabilise. Renewable share and residual load gain importance relative to TTF.

Show code

# ── Summary: how does the dominant driver change post-crisis? ────────
r2_post_summary <- partial_r2_post |>
  group_by(zone, season) |>
  slice_max(partial_r2, n = 1) |>
  ungroup() |>
  select(Zone = zone, Season = season,
         `Dominant Factor` = feature_group,
         `Partial R²` = partial_r2, `Full R²` = full_r2)

r2_post_summary |>
  gt() |>
  fmt_percent(columns = c(`Partial R²`, `Full R²`), decimals = 1) |>
  tab_header(
    title    = glue("Post-Crisis: Dominant Price Driver — DE-LU vs {comp_zone} (2023-2025)"),
    subtitle = "With gas prices stabilised, renewable variability becomes the dominant factor in more seasons"
  )

Table 1

Zone	Season	Dominant Factor	Partial R²	Full R²
Post-Crisis: Dominant Price Driver — DE-LU vs NO_4 (2023-2025)
With gas prices stabilised, renewable variability becomes the dominant factor in more seasons
DE_LU	Autumn	Hour of day	3.5%	71.8%
DE_LU	Spring	TTF gas price	4.1%	83.1%
DE_LU	Summer	Residual load	1.8%	79.5%
DE_LU	Winter	TTF gas price	5.8%	69.6%
NO_4	Autumn	Residual load	17.0%	33.6%
NO_4	Spring	Residual load	10.1%	19.8%
NO_4	Summer	Residual load	4.0%	18.0%
NO_4	Winter	Residual load	3.6%	17.4%

Full-sample vs post-crisis

Comparing these results with the full-sample analysis above reveals the 2022 crisis effect. In the full sample, TTF dominates across all seasons and both zones. In the post-crisis period, the variance attribution spreads across renewable share, residual load, and temporal patterns — this is the market structure that determines hedging strategy today.

6 Hedging Implications

6.1 What the Risk Structure Means for Hedging

The analysis above reveals a clear seasonal pattern in what drives revenue risk for a wind portfolio:

Show code

# ── Synthesise key findings into a narrative table ──────────────────
hedging_df <- risk_metrics |>
  left_join(
    variant_summary |> select(season, sd_ratio_ba),
    by = "season"
  ) |>
  mutate(
    weather_pct  = scales::percent(weather_share, accuracy = 1),
    market_pct   = scales::percent(1 - weather_share, accuracy = 1),
    recommendation = case_when(
      weather_share > 0.6 ~ "Weather derivatives or volume hedging most effective",
      weather_share < 0.4 ~ "Gas price exposure management (TTF swaps) is priority",
      TRUE                ~ "Blended approach: both weather and gas hedges needed"
    )
  ) |>
  select(
    Season           = season,
    `Monthly CVaR 95% (M EUR)` = cvar_95,
    `Weather-Driven`  = weather_pct,
    `Market-Driven`   = market_pct,
    `Hedging Focus`   = recommendation
  )

hedging_df |>
  gt() |>
  fmt_number(columns = `Monthly CVaR 95% (M EUR)`, decimals = 2) |>
  tab_header(
    title    = "Seasonal Hedging Strategy Guide",
    subtitle = glue("{params$portfolio_mw} MW onshore wind in {params$portfolio_zone}")
  ) |>
  tab_footnote("Weather-Driven = share of revenue volatility from wind variability (weather-only SD / realized SD); Market-Driven = gas + cross-border flows.")

Season	Monthly CVaR 95% (M EUR)	Weather-Driven	Market-Driven	Hedging Focus
Seasonal Hedging Strategy Guide
400 MW onshore wind in DE-LU
Winter	3.75	58%	42%	Blended approach: both weather and gas hedges needed
Spring	1.94	52%	48%	Blended approach: both weather and gas hedges needed
Summer	1.57	30%	70%	Gas price exposure management (TTF swaps) is priority
Autumn	3.04	47%	53%	Blended approach: both weather and gas hedges needed
Weather-Driven = share of revenue volatility from wind variability (weather-only SD / realized SD); Market-Driven = gas + cross-border flows.

6.2 Cross-Border Capacity as a Risk Tool

The ENTSO-E Forward Capacity Allocation (FCA) regulation governs how cross-border transmission rights are allocated — the framework that determines whether generators can actually access neighbouring markets for hedging purposes.

Our cross-zone analysis shows why this matters:

CWE neighbours (AT, NL, BE, FR) are highly correlated with DE-LU. Buying transmission rights into these zones provides limited price diversification — prices move together. The primary value of CWE interconnection is physical delivery optionality, not price hedging.
Nordic zones (NO, SE, FI) show lower correlation, meaning interconnector capacity to the Nordics offers genuine portfolio diversification. However, this is constrained by physical capacity and allocation mechanisms under the FCA framework.
Seasonal variation in correlation means the diversification benefit itself is not constant. A hedging strategy that relies on Nordic decorrelation in summer may find that correlation tightens in winter when the entire continent faces cold weather simultaneously.

The policy punchline

Forward capacity allocation isn’t just a grid operations question — it’s a risk management tool. The FCA framework directly affects whether renewable generators can access the cross-border diversification that our correlation analysis shows exists in the Nordics. Market participants bidding for long-term transmission rights should account for the seasonal correlation structure documented here.

7 Technical Notes

7.1 Simulation Convergence

Are 10,000 simulations enough for stable risk estimates? The plot below shows how CVaR 95% on realized revenue stabilises as the number of draws increases.

Show code

# ── Convergence check: compute running CVaR on realized revenue ──────
winter_rev <- sim_results |>
  filter(season == "Winter") |>
  arrange(sim_id) |>
  pull(monthly_rev_realized)

# Compute CVaR at increasing sample sizes (every 50 draws)
check_points <- seq(200, length(winter_rev), by = 50)

convergence <- tibble(
  running_n    = check_points,
  running_cvar = map_dbl(check_points, \(n) {
    compute_cvar(winter_rev[1:n], 0.95) / 1e6
  })
)

ggplot(convergence, aes(x = running_n, y = running_cvar)) +
  geom_line(colour = "#4393C3", linewidth = 0.5) +
  geom_hline(
    yintercept = compute_cvar(
      sim_results |> filter(season == "Winter") |> pull(monthly_rev_realized),
      0.95
    ) / 1e6,
    linetype = "dashed", colour = "#B2182B"
  ) +
  labs(
    title    = "Simulation Convergence — Winter CVaR 95% (Realized Revenue)",
    subtitle = "The red dashed line shows the final estimate; convergence is reached well before 10,000 draws",
    x = "Number of Simulations",
    y = "Running CVaR 95% (M EUR)"
  )

Figure 8: CVaR 95% on realized revenue converges well before the full simulation count

7.2 Assumptions & Limitations

This analysis makes several deliberate simplifications:

Fully unhedged portfolio. In practice, most generators hedge 60–90% of expected output via forward contracts or PPAs. The unhedged case represents the maximum spot exposure — the risk baseline against which any hedging strategy is measured.
No balancing costs, grid fees, or curtailment. Real net revenue is lower than price × generation.
No outages or maintenance. Generation is modelled as purely weather-driven via the Stage 1 capacity factor models.
Single technology, single zone. Real portfolios diversify across technologies and geographies — exactly the kind of diversification our cross-zone analysis quantifies.
Realized-price VaR/CVaR. Risk metrics are computed on realized revenue (actual historical prices × predicted generation) rather than model-predicted prices. This avoids the problem of model predictions smoothing out tail events — extreme price spikes and negative price hours are captured at their actual magnitude. The trade-off is that realized revenue includes noise that the model would filter, but for risk measurement, capturing the full tail is more important than noise reduction.
5-day block bootstrap. By resampling blocks of 5 consecutive days rather than individual days, the simulation preserves intra-week serial correlation in weather patterns. Multi-day wind droughts and persistent high-pressure systems are the primary drivers of the heaviest revenue losses; an iid daily bootstrap would understate tail risk by breaking these sequences apart. The block size of 5 days balances correlation preservation against the number of unique blocks available per season.
Historical resampling and the 2022 energy crisis. The simulation draws from the full 2020–2025 period, including the 2022 gas price spike. In Variant A (weather-only), this has minimal impact — TTF is fixed at seasonal medians. In Variant B and the realized pathway, the 2022 period widens the tails significantly. This is arguably appropriate: risk metrics should reflect scenarios that have actually occurred. The walk-forward cross-validation in Stage 3 confirms the price model generalises across both crisis and non-crisis regimes.
Price model is linear (lasso). Extreme prices driven by scarcity events or negative pricing during high-wind periods may be underrepresented by the lasso. The XGBoost benchmark in Stage 3 captures these non-linearities better; quantile XGBoost predictions (q05/q95) are available alongside the lasso for distributional analysis.

7.3 Data Sources

SMARD (Bundesnetzagentur): DE-LU generation, load, and day-ahead price. SMARD.de, CC BY 4.0.
ENTSO-E Transparency Platform: Multi-zone prices, cross-border flows, and generation by type. Used per ENTSO-E terms of service.
Copernicus ERA5 Reanalysis: Hourly wind speed and solar irradiance. Hersbach et al. (2020), doi:10.24381/cds.adbb2d47.
Yahoo Finance: TTF natural gas front-month futures (ICE Dutch TTF, EUR/MWh).