Results¶

Estimation results are stored in three files: the point estimates (nowcasts), the performance metrics (RMSE), and the out-of-sample estimates.

In [3]:
import numpy as np
import pandas as pd
import os
date_str = "20260511"
PATH_OUTPUT = f"{os.getcwd()}/../output/{date_str}/"

forecast_file     = f"{date_str} 3_nwcst Tab fcst.xlsx"   # model forecasts / nowcasts
performance_file  = f"{date_str} 3_nwcst Tab perf.xlsx"   # model performance metrics
outofsample_file  = f"{date_str} 3_nwcst Tab oos.xlsx"    # full OOS prediction panel


results_table = pd.read_excel(f"{PATH_OUTPUT}/{forecast_file}", index_col='date')
results_table.index = pd.to_datetime(results_table.index, format="%Y-%m-%d")
performance_table = pd.read_excel(f"{PATH_OUTPUT}/{performance_file}", index_col='Vintage')
#outofsample_table = pd.read_excel(f"{PATH_OUTPUT}/{outofsample_file}", index_col='date')

Inspect the performance table. It shows each model's prediction error (RMSE) broken down by forecasting horizon ("Vintage"): negative numbers mean the data were already published when the model ran; positive numbers mean the model was predicting quarters that had not yet been released.
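As a reminder of what the RMSE column measures, a minimal sketch with made-up predictions and actuals (illustrative only, not the model output):

```python
import numpy as np

# Hypothetical nowcasts and realized values for one model/vintage
pred   = np.array([0.5, -0.2, 1.1, 0.0])
actual = np.array([0.3,  0.1, 0.9, -0.4])

# Root mean squared error: square root of the average squared error
rmse = np.sqrt(np.mean((pred - actual) ** 2))
print(round(rmse, 4))
```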

In [4]:
performance_table
Out[4]:
RMSE estimator spec
Vintage
-2 2.298953 OLS lars46
-1 2.291649 OLS lars46
0 2.624861 OLS lars46
1 12.721925 OLS lars46
2 13.272481 OLS lars46
... ... ... ...
-2 2.600459 LSTM lars49
-1 2.520544 LSTM lars49
0 2.603486 LSTM lars49
1 2.549008 LSTM lars49
2 2.649067 LSTM lars49

80 rows × 3 columns

Check which model specifications are present in the performance data.

In [5]:
performance_table['spec'].unique()
Out[5]:
array(['lars46', 'lars49'], dtype=object)

Inspect the results table. This contains the actual nowcast values produced by each model, for each quarter and each forecasting horizon.

In [6]:
results_table
Out[6]:
actuals two_back one_back zero_back one_ahead two_ahead estimator spec
date
2023-06-01 -0.066758 -0.419722 -0.332563 -1.330659 -35.239000 -36.206200 OLS lars46
2023-09-01 1.988200 -0.308410 -0.357085 -0.358990 -7.257542 -1.693524 OLS lars46
2023-12-01 -0.778219 -0.228725 0.150708 0.873922 2.111830 1.235066 OLS lars46
2024-03-01 -0.054704 -0.075191 0.255990 0.315120 -1.867632 0.578832 OLS lars46
2024-06-01 -0.997453 -0.277602 -0.212376 0.435814 -4.414146 -5.344480 OLS lars46
... ... ... ... ... ... ... ... ...
2024-12-01 1.691583 -0.187005 0.239369 0.297889 0.241903 0.444719 LSTM lars49
2025-03-01 1.086629 -0.026154 -0.705539 -1.088655 -0.908419 -1.005068 LSTM lars49
2025-06-01 0.255034 0.926614 0.676982 0.648158 0.395492 0.603667 LSTM lars49
2025-09-01 0.350465 0.315954 0.193476 -0.264647 -0.701200 -0.409720 LSTM lars49
2025-12-01 -7.272064 0.345186 0.046084 0.148406 -0.016349 0.528473 LSTM lars49

176 rows × 8 columns

Check which model specifications are present in the results data.

In [7]:
results_table['spec'].unique()
Out[7]:
array(['lars46', 'lars49'], dtype=object)

Presenting Results¶

Set up plotting libraries and color palette. We import the tools needed to draw charts and define the IDB color scheme used throughout.

In [8]:
import matplotlib.ticker as ticker
import matplotlib.pyplot as plt
import seaborn as sns

# Define the color scheme
idb_blue = "#005BAC"
idb_light_blue = "#7FB3E6"

Average RMSE by publication lag¶

In [9]:
all_means = performance_table.groupby(performance_table.index)['RMSE'].mean()


fig, axes = plt.subplots(1, 1, figsize=(8, 5))  # adjust figsize as you want
# If axes is a single AxesSubplot (not an array), make it a list for consistency
axes = [axes]

# Plot
sns.lineplot(data=performance_table, x='Vintage', y='RMSE', ax=axes[0], color=idb_blue)

axes[0].xaxis.set_major_locator(ticker.MaxNLocator(integer=True))
axes[0].set_title("")
axes[0].tick_params(axis='x', rotation=45)
plt.tight_layout()

# Save figure:
plt.savefig(f'{PATH_OUTPUT}/Figure Vintage.png')
plt.show()

Boxplot by publication lag¶

In [10]:
df = performance_table.reset_index()

fig, ax = plt.subplots(figsize=(8, 5))

# If you want a specific order, define it here; otherwise seaborn will infer it.
# order = sorted(df['Vintage'].unique())
order = None

sns.boxplot(
    data=df,
    x='Vintage',
    y='RMSE',
    order=order,
    ax=ax,
    boxprops=dict(color=idb_light_blue),
    whiskerprops=dict(color=idb_blue),
    capprops=dict(color=idb_blue),
    medianprops=dict(color='black'),
    flierprops=dict(markerfacecolor=idb_light_blue, markeredgecolor=idb_blue,
                    marker='o', markersize=6)
)

ax.set_title("")
ax.tick_params(axis='x', rotation=45)
plt.tight_layout()
plt.show()
c:\Users\guerr\anaconda3\envs\nwcst\Lib\site-packages\seaborn\categorical.py:700: UserWarning: Setting the 'color' property will override the edgecolor or facecolor properties.
  artists = ax.bxp(**boxplot_kws)

Same boxplot with mean markers added. The dot in each box shows the average RMSE for that publication lag, making it easier to compare central tendency across vintages.

In [11]:
df = performance_table.reset_index()

fig, ax = plt.subplots(figsize=(8, 5))

# If you want a specific order, define it here; otherwise seaborn will infer it.
# order = sorted(df['Vintage'].unique())
order = None

sns.boxplot(
    data=df,
    x='Vintage',
    y='RMSE',
    order=order,
    ax=ax,
    boxprops=dict(color=idb_light_blue),
    whiskerprops=dict(color=idb_blue),
    capprops=dict(color=idb_blue),
    medianprops=dict(color='black'),
    flierprops=dict(markerfacecolor=idb_light_blue, markeredgecolor=idb_blue,
                    marker='o', markersize=6)
)

# Get the actual category order used on the axis (this is key)
cats = [t.get_text() for t in ax.get_xticklabels()]

# Means computed in the same category space as the plot
means = df.groupby('Vintage', dropna=False)['RMSE'].mean()

# Map tick labels -> x positions
xpos = np.arange(len(cats))

# Overlay means (match tick labels carefully)
ymeans = []
for c in cats:
    # Try direct label
    if c in means.index:
        ymeans.append(means.loc[c])
        continue
    # Try numeric label -> numeric index match (common when lag is int/float)
    try:
        cnum = float(c)
        hit = None
        for idx in means.index:
            try:
                if float(idx) == cnum:
                    hit = means.loc[idx]
                    break
            except Exception:
                pass
        ymeans.append(hit)
    except Exception:
        ymeans.append(None)

# Plot only the ones we found
mask = [v is not None and np.isfinite(v) for v in ymeans]
ax.scatter(
    xpos[mask],
    np.array(ymeans, dtype=float)[mask],
    color='black',
    marker='o',
    s=50,
    zorder=10,
    edgecolor='k',
    linewidth=0.4,
    label='Mean'
)

# Clean legend (only "Mean" once)
handles, labels = ax.get_legend_handles_labels()
by_label = dict(zip(labels, handles))
ax.legend(by_label.values(), by_label.keys())

ax.set_title("")
ax.tick_params(axis='x', rotation=45)
plt.tight_layout()
plt.show()

Histogram: distribution of errors¶

In [12]:
human_mean = performance_table[performance_table['spec']=='human1']['RMSE'].mean()

plt.figure(figsize=(8, 6))

# Plot with Seaborn, no edge lines
sns.histplot(performance_table['RMSE'], bins=30, kde=False, color=idb_blue, edgecolor=None)

# Human benchmark line
plt.axvline(human_mean, color='black', linestyle=':', linewidth=2, label=f'Human mean = {human_mean:.2f}')

plt.title("")
plt.xlabel("Root Mean Squared Error (RMSE)")
plt.ylabel("Frequency")
plt.legend()
plt.tight_layout()
plt.show()

Filter out extreme outliers using the vintage-specific median RMSE as the threshold

In [13]:
# Per-vintage median RMSE, used as the outlier threshold below
spec_rmse_mean = pd.DataFrame(performance_table.groupby('Vintage')['RMSE'].median())
spec_rmse_mean = spec_rmse_mean.rename(columns={'RMSE': 'threshold'})
spec_rmse_mean
Out[13]:
threshold
Vintage
-2 2.511964
-1 2.517836
0 2.613635
1 3.779224
2 7.888829

Apply the outlier filter and redraw the histogram. Models whose RMSE exceeds the vintage-specific median are dropped. The histogram now shows the cleaned error distribution alongside the human-forecaster benchmark.

In [14]:
performance_table = performance_table.merge(spec_rmse_mean, on='Vintage', how='left')

# ++++++++++++++++++++++++++++++++++++++++++++++++++++
# ++++++++++++++++++++++++++++++++++++++++++++++++++++
filtered = performance_table[performance_table['RMSE'] < performance_table['threshold']]
# ++++++++++++++++++++++++++++++++++++++++++++++++++++
# ++++++++++++++++++++++++++++++++++++++++++++++++++++


human_mean = performance_table[performance_table['spec']=='human1']['RMSE'].mean()

plt.figure(figsize=(8, 6))

# Plot with Seaborn, no edge lines
sns.histplot(filtered['RMSE'], bins=30, kde=False, color=idb_blue, edgecolor=None)

# Human benchmark line
plt.axvline(human_mean, color='black', linestyle=':', linewidth=2, label=f'Human mean = {human_mean:.2f}')

plt.title("")
plt.xlabel("Root Mean Squared Error (RMSE)")
plt.ylabel("Frequency")
plt.legend()
plt.tight_layout()
plt.show()

Histogram conditional on publication lag¶

In [15]:
human_means = (
    performance_table[performance_table['spec'] == 'human1']
    .groupby('Vintage')
    .mean(numeric_only=True)
)

# Mapping of vintages
lags = {
    -2: 'two_back',
    -1: 'one_back',
     0: 'zero_back',
     1: 'one_ahead',
     2: 'two_ahead'
}

fig, axes = plt.subplots(nrows=3, ncols=2, figsize=(14, 12), sharex=False, sharey=False)
axes = axes.flatten()

# ------------------------------------------------
# Loop over each vintage and plot histogram
for i, (V, N) in enumerate(lags.items()):
    ax = axes[i]

    # Filter data for this vintage
    subset = filtered[filtered.index == V]

    # Histogram
    sns.histplot(subset['RMSE'], bins=30, kde=False, color=idb_blue, edgecolor=None, ax=ax)

    # Median line
    rmse_median = subset['RMSE'].median()
    ax.axvline(rmse_median, color='black', linestyle='dashed', linewidth=2, label=f'Median = {rmse_median:.2f}')

    # Human benchmark
    human_mean = human_means[human_means.index==V]['RMSE'].mean()
    ax.axvline(human_mean, color='black', linestyle=':', linewidth=2, label=f'Human mean = {human_mean:.2f}')

    # Titles
    ax.set_title(f'Information set p={V}', fontsize=12)
    ax.set_ylabel('Frequency')
    ax.legend(fontsize=8, loc='upper right')

# ------------------------------------------------
# Last panel: pooled distribution
ax = axes[-1]
sns.histplot(filtered['RMSE'], bins=30, kde=False, color=idb_blue, edgecolor=None, ax=ax)

rmse_median = filtered['RMSE'].median()
ax.axvline(rmse_median, color='black', linestyle='dashed', linewidth=2, label=f'Median = {rmse_median:.2f}')

human_mean = performance_table[performance_table['spec'] == 'human1']['RMSE'].mean()
ax.axvline(human_mean, color='black', linestyle=':', linewidth=2, label=f'Human mean = {human_mean:.2f}')

ax.set_title('All Vintages', fontsize=12)
ax.set_ylabel('Frequency')
ax.legend(fontsize=8, loc='upper left')

# ------------------------------------------------
# X-labels on bottom row
for ax in axes[-3:]:
    ax.set_xlabel("Root Mean Squared Error (RMSE)")

plt.tight_layout()
plt.show()

Boxplot (outliers management)¶

In [16]:
# First we define reference points
human_p = performance_table[performance_table['spec']=='human1']
human_p = human_p.groupby(human_p.index)['RMSE'].mean()
all_means = filtered.groupby(filtered.index)['RMSE'].mean()


fig, ax = plt.subplots(figsize=(8, 5))
sns.boxplot(
    data=filtered.reset_index(),
    x='Vintage',
    y='RMSE',
    ax=ax,
    boxprops=dict(color=idb_light_blue),
    whiskerprops=dict(color=idb_blue),
    capprops=dict(color=idb_blue),
    medianprops=dict(color='black'),
    flierprops=dict(markerfacecolor=idb_light_blue, markeredgecolor=idb_blue, marker='o', markersize=6)
)

# Ensure the figure is rendered so tick labels are available
fig.canvas.draw()   # <- important

xticks = ax.get_xticks()
xtlabels = [t.get_text() for t in ax.get_xticklabels()]

# compute means (keep original dtype)
all_means = filtered.reset_index().groupby('Vintage')['RMSE'].mean()

def find_mean_for_label(lbl):
    # direct match
    if lbl in all_means.index:
        return all_means.loc[lbl]
    # try numeric match
    try:
        lbl_num = float(lbl)
        for idx in all_means.index:
            try:
                if float(idx) == lbl_num:
                    return all_means.loc[idx]
            except Exception:
                continue
    except Exception:
        pass
    # fallback: match by string representation
    for idx in all_means.index:
        if str(idx) == lbl:
            return all_means.loc[idx]
    return None

# plot mean markers aligned to the xtick positions
for i, lbl in enumerate(xtlabels):
    m = find_mean_for_label(lbl)
    if m is not None:
        ax.scatter(xticks[i], m, color='black', marker='o', s=50, zorder=10,
                   label='Mean' if i == 0 else "", edgecolor='k', linewidth=0.4)

# overlay human points: match similarly
for i, vintage in enumerate(human_p.index):
    # try to find which xtick corresponds
    for j, lbl in enumerate(xtlabels):
        try:
            if str(vintage) == lbl or float(vintage) == float(lbl):
                ax.scatter(xticks[j], human_p[vintage], color='black', marker='v', s=50,
                           label='Human' if i == 0 else "")
                break
        except Exception:
            if str(vintage) == lbl:
                ax.scatter(xticks[j], human_p[vintage], color='black', marker='v', s=50,
                           label='Human' if i == 0 else "")
                break

# clean legend
handles, labels = ax.get_legend_handles_labels()
unique = dict(zip(labels, handles))
ax.legend(unique.values(), unique.keys())

ax.set_title("")
ax.tick_params(axis='x', rotation=45)
plt.tight_layout()
plt.show()

Nowcast outputs:¶

Results/RMSE merge¶

Filter out results from under-performing models

In [17]:
# -----------------------------------------------
# Reshape results to long format
df = results_table.reset_index().melt(
    id_vars=["date", 'spec', 'estimator'],
    value_vars=["two_back", "one_back", "zero_back", "one_ahead", "two_ahead"],
    var_name="horizon",
    value_name="value"
)

# Make Vintage column based on horizon -2, -1, 0, 1, 2
df['Vintage'] = df['horizon'].map({
    'two_back': -2,
    'one_back': -1,
    'zero_back': 0,
    'one_ahead': 1,
    'two_ahead': 2
})
df
Out[17]:
date spec estimator horizon value Vintage
0 2023-06-01 lars46 OLS two_back -0.419722 -2
1 2023-09-01 lars46 OLS two_back -0.308410 -2
2 2023-12-01 lars46 OLS two_back -0.228725 -2
3 2024-03-01 lars46 OLS two_back -0.075191 -2
4 2024-06-01 lars46 OLS two_back -0.277602 -2
... ... ... ... ... ... ...
875 2024-12-01 lars49 LSTM two_ahead 0.444719 2
876 2025-03-01 lars49 LSTM two_ahead -1.005068 2
877 2025-06-01 lars49 LSTM two_ahead 0.603667 2
878 2025-09-01 lars49 LSTM two_ahead -0.409720 2
879 2025-12-01 lars49 LSTM two_ahead 0.528473 2

880 rows × 6 columns

Attach each model's RMSE to its nowcast values. We join the performance table onto the long-format results so that every prediction row carries the corresponding model error — needed for the quality filter below.
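The join keys must line up exactly for every row to receive an RMSE. A small sketch (toy frames, not the real tables) showing how pandas' `indicator` option flags rows that failed to match:

```python
import pandas as pd

# Toy long-format nowcasts and a toy performance table (illustrative only)
nowcasts = pd.DataFrame({
    'Vintage':   [-2, -2, 0],
    'spec':      ['lars46', 'lars49', 'lars46'],
    'estimator': ['OLS', 'LSTM', 'OLS'],
    'value':     [0.1, -0.3, 0.5],
})
perf = pd.DataFrame({
    'Vintage':   [-2, 0],
    'spec':      ['lars46', 'lars46'],
    'estimator': ['OLS', 'OLS'],
    'RMSE':      [2.3, 2.6],
})

merged = nowcasts.merge(perf, on=['Vintage', 'spec', 'estimator'],
                        how='left', indicator=True)

# Rows whose keys had no counterpart in perf carry _merge == 'left_only'
unmatched = merged[merged['_merge'] == 'left_only']
print(len(unmatched))  # the lars49/LSTM row has no RMSE
```

Running the same check on the real merge (before the quality filter) is a quick way to catch key mismatches such as stray whitespace in `spec`.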

In [18]:
p = performance_table.reset_index()
results_w = (
    df  # safe even if index is unnamed
    .rename(columns=str.strip)
    .merge(
        p.rename(columns=str.strip)[['Vintage','spec','estimator','RMSE']],
        on=['Vintage','spec','estimator'],
        how='left'
    )
)
results_w
Out[18]:
date spec estimator horizon value Vintage RMSE
0 2023-06-01 lars46 OLS two_back -0.419722 -2 2.298953
1 2023-09-01 lars46 OLS two_back -0.308410 -2 2.298953
2 2023-12-01 lars46 OLS two_back -0.228725 -2 2.298953
3 2024-03-01 lars46 OLS two_back -0.075191 -2 2.298953
4 2024-06-01 lars46 OLS two_back -0.277602 -2 2.298953
... ... ... ... ... ... ... ...
875 2024-12-01 lars49 LSTM two_ahead 0.444719 2 2.649067
876 2025-03-01 lars49 LSTM two_ahead -1.005068 2 2.649067
877 2025-06-01 lars49 LSTM two_ahead 0.603667 2 2.649067
878 2025-09-01 lars49 LSTM two_ahead -0.409720 2 2.649067
879 2025-12-01 lars49 LSTM two_ahead 0.528473 2 2.649067

880 rows × 7 columns

Keep only the better-performing models. For each forecasting horizon we discard models whose RMSE is above the horizon average, retaining only those that have predicted well historically.
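A minimal sketch of the map-based filter on toy data (values are made up): `.map` broadcasts each row's vintage-level mean back onto the rows, so the comparison is row-wise against the right group statistic.

```python
import pandas as pd

df = pd.DataFrame({
    'Vintage': [-2, -2, 2, 2],
    'RMSE':    [2.0, 3.0, 5.0, 9.0],
})

# Per-vintage mean RMSE: -2 -> 2.5, 2 -> 7.0
mean_rmse = df.groupby('Vintage')['RMSE'].mean()

# Broadcast each row's vintage mean via .map, then keep rows at or below it
kept = df[df['RMSE'] <= df['Vintage'].map(mean_rmse)]
print(kept['RMSE'].tolist())  # [2.0, 5.0]
```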

In [19]:
mean_rmse = results_w.groupby('Vintage')['RMSE'].mean()

# Keep rows with RMSE <= Vintage-specific mean
filtered = results_w[
    results_w['RMSE'] <= results_w['Vintage'].map(mean_rmse)
]
filtered
Out[19]:
date spec estimator horizon value Vintage RMSE
0 2023-06-01 lars46 OLS two_back -0.419722 -2 2.298953
1 2023-09-01 lars46 OLS two_back -0.308410 -2 2.298953
2 2023-12-01 lars46 OLS two_back -0.228725 -2 2.298953
3 2024-03-01 lars46 OLS two_back -0.075191 -2 2.298953
4 2024-06-01 lars46 OLS two_back -0.277602 -2 2.298953
... ... ... ... ... ... ... ...
875 2024-12-01 lars49 LSTM two_ahead 0.444719 2 2.649067
876 2025-03-01 lars49 LSTM two_ahead -1.005068 2 2.649067
877 2025-06-01 lars49 LSTM two_ahead 0.603667 2 2.649067
878 2025-09-01 lars49 LSTM two_ahead -0.409720 2 2.649067
879 2025-12-01 lars49 LSTM two_ahead 0.528473 2 2.649067

583 rows × 7 columns

Ridge plot¶

In [20]:
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

df = filtered[['date', 'horizon', 'Vintage', 'value']]
observed = pd.DataFrame(results_table['actuals'].groupby(results_table.index).mean())
actuals_dict = observed['actuals'].to_dict()
df
Out[20]:
date horizon Vintage value
0 2023-06-01 two_back -2 -0.419722
1 2023-09-01 two_back -2 -0.308410
2 2023-12-01 two_back -2 -0.228725
3 2024-03-01 two_back -2 -0.075191
4 2024-06-01 two_back -2 -0.277602
... ... ... ... ...
875 2024-12-01 two_ahead 2 0.444719
876 2025-03-01 two_ahead 2 -1.005068
877 2025-06-01 two_ahead 2 0.603667
878 2025-09-01 two_ahead 2 -0.409720
879 2025-12-01 two_ahead 2 0.528473

583 rows × 4 columns

Draw the ridge plot. Each row is one quarter; the colored curves show how the nowcast values are spread across models for each horizon. This reveals how much uncertainty exists at different information sets.

In [21]:
# Build long records of (date, horizon, value) from the filtered nowcasts
dates = df["date"].unique()
horizons = ["two_back", "one_back", "zero_back", "one_ahead", "two_ahead"]
data = []
for date in dates:
    for horizon in horizons:
        # Collect all model nowcasts for this quarter/horizon pair
        values = df[(df['date']==date) & (df['horizon']==horizon)].value
        data.extend([(date, horizon, v) for v in values])

df = pd.DataFrame(data, columns=["date", "horizon", "value"])

horizon_labels = {
    "two_back": "p=-2",
    "one_back": "p=-1",
    "zero_back": "p=0",
    "one_ahead": "p=1",
    "two_ahead": "p=2"
}
df["horizon"] = df["horizon"].map(horizon_labels)

sns.set_theme(style="white", rc={"axes.facecolor": (0, 0, 0, 0)})

# Generate a range of colors for the horizons
horizons_unique = df["horizon"].unique()
n_horizons = len(horizons_unique)
pal = sns.cubehelix_palette(n_horizons, rot=-.25, light=.7)

# A shared y-axis keeps the ridge densities on a common scale
g = sns.FacetGrid(
    df,
    row="date",
    hue="horizon",
    aspect=5,
    height=2,
    palette=pal,
    sharey=True
)

# KDE distributions
g.map(
    sns.kdeplot,  
    "value",  
    bw_adjust=0.7,  
    clip_on=False,  
    fill=True,  
    alpha=0.6,  
    linewidth=1.2,
    common_norm=True
)

# Contour lines
g.map(
    sns.kdeplot,  
    "value",  
    bw_adjust=0.7,  
    clip_on=False,  
    color="w",  
    lw=0.5
)

# Adjust spacing between ridges
g.fig.subplots_adjust(hspace=-0.2)

# Y-axis labels as dates, remove density scale
for ax, date in zip(g.axes.flat, df["date"].unique()):
    ax.set_ylabel(date.strftime("%Y-%m-%d"), rotation=0, labelpad=50, ha="right", va="center", fontsize=18)
    ax.set_yticks([])
    ax.axhline(y=0, color="gray", lw=0.6, ls="--", alpha=0.7)

# Overlay the observed (actual) value for each quarter as a vertical line
sorted_dates = sorted(df["date"].unique())
for ax, date in zip(g.axes.flat, sorted_dates):
    actual_value = actuals_dict.get(date)
    if actual_value is not None:
        # Draw a short vertical marker at the observed value; with
        # ax.transData both coordinates are in data units, so the line
        # spans 0 to 0.9 on the density scale.
        line = plt.Line2D(
            [actual_value, actual_value],  # x: observed value
            [0, 0.9],                      # y: height of the marker
            transform=ax.transData,
            color='black', linestyle='-', lw=1, alpha=0.8
        )
        # Add the line to the current axes.
        ax.add_line(line)
     
# Cleanup axes
g.set(xlabel="GDP growth QoQ%")
g.set_titles("")
g.despine(bottom=True, left=True)

# Custom white legend
palette_dict = dict(zip(horizons_unique, pal))
handles = [mpatches.Patch(color=palette_dict[h], label=h) for h in horizons_unique]
plt.legend(
    handles=handles,  
    title="Information",  
    title_fontsize=18,  
    fontsize=18,  
    bbox_to_anchor=(1.2, 4),  
    loc="upper center",
    frameon=True,
    facecolor="white",
    edgecolor="gray",
    framealpha=1
)

# Show the figure
plt.show()

Weighted Nowcast¶

We will weight the nowcast by the inverse of the RMSE. Effectively, we will compute:

$$ \hat{v} = \frac{\sum_{i=1}^{N} w_i \, v_i}{\sum_{i=1}^{N} w_i} = \frac{\sum_{i=1}^{N} \dfrac{v_i}{\mathrm{RMSE}_i}} {\sum_{i=1}^{N} \dfrac{1}{\mathrm{RMSE}_i}} $$

We can also estimate a weighted confidence interval using the same weights.

$$ \mathrm{SE}(\hat{v}) = \frac{1}{\sqrt{\sum_{i=1}^{N} w_i}} = \left( \sum_{i=1}^{N} \frac{1}{\mathrm{RMSE}_i} \right)^{-1/2} $$

$$ \hat{v} \pm 1.96 \cdot \mathrm{SE}(\hat{v}) $$
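A small numeric check of the two formulas, assuming three hypothetical models:

```python
import numpy as np

# Hypothetical nowcasts and historical RMSEs for three models
v    = np.array([0.4, 0.1, -0.2])
rmse = np.array([2.0, 2.5, 4.0])

w = 1.0 / rmse                      # inverse-RMSE weights
v_hat = np.sum(w * v) / np.sum(w)   # weighted nowcast
se    = 1.0 / np.sqrt(np.sum(w))    # standard error of the weighted mean

# 95% confidence interval around the weighted nowcast
ci = (v_hat - 1.96 * se, v_hat + 1.96 * se)
print(round(v_hat, 4), round(se, 4))
```

Note the weighting rewards low-RMSE models: the model with RMSE 2.0 carries twice the weight of the model with RMSE 4.0.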

In [22]:
filtered = filtered.copy()
filtered['date'] = pd.to_datetime(filtered['date'])
filtered['weight'] = 1 / filtered['RMSE']
filtered
Out[22]:
date spec estimator horizon value Vintage RMSE weight
0 2023-06-01 lars46 OLS two_back -0.419722 -2 2.298953 0.434981
1 2023-09-01 lars46 OLS two_back -0.308410 -2 2.298953 0.434981
2 2023-12-01 lars46 OLS two_back -0.228725 -2 2.298953 0.434981
3 2024-03-01 lars46 OLS two_back -0.075191 -2 2.298953 0.434981
4 2024-06-01 lars46 OLS two_back -0.277602 -2 2.298953 0.434981
... ... ... ... ... ... ... ... ...
875 2024-12-01 lars49 LSTM two_ahead 0.444719 2 2.649067 0.377491
876 2025-03-01 lars49 LSTM two_ahead -1.005068 2 2.649067 0.377491
877 2025-06-01 lars49 LSTM two_ahead 0.603667 2 2.649067 0.377491
878 2025-09-01 lars49 LSTM two_ahead -0.409720 2 2.649067 0.377491
879 2025-12-01 lars49 LSTM two_ahead 0.528473 2 2.649067 0.377491

583 rows × 8 columns

Pull out the observed (actual) GDP values. These are the true outcomes the models were trying to predict — we will compare nowcasts against them in the next steps.

In [23]:
df = results_table[['actuals']]
df = df.groupby('date').mean()
df.index = pd.to_datetime(df.index)
df
Out[23]:
actuals
date
2023-06-01 -0.066758
2023-09-01 1.988200
2023-12-01 -0.778219
2024-03-01 -0.054704
2024-06-01 -0.997453
2024-09-01 -1.055556
2024-12-01 1.691583
2025-03-01 1.086629
2025-06-01 0.255034
2025-09-01 0.350465
2025-12-01 -7.272064

Compute the RMSE-weighted average nowcast for the earliest horizon (two quarters before release). Models with lower historical errors receive more weight, so the combined forecast relies more on models that have been accurate in the past.

In [24]:
# ---------------------------------------------------
V = -2
N = "two_back"

d = filtered.loc[filtered['Vintage'] == V]
weighted = (
    (d['value'] * d['weight']).groupby(d['date']).sum()
    / d.groupby(d['date'])['weight'].sum()
).reset_index(name=N).set_index('date')


df = df.merge(weighted, left_index=True, right_index=True, how='left')
df
Out[24]:
actuals two_back
date
2023-06-01 -0.066758 0.121110
2023-09-01 1.988200 0.129354
2023-12-01 -0.778219 0.115386
2024-03-01 -0.054704 0.200174
2024-06-01 -0.997453 0.022680
2024-09-01 -1.055556 -0.008797
2024-12-01 1.691583 0.080847
2025-03-01 1.086629 -0.123339
2025-06-01 0.255034 -0.170563
2025-09-01 0.350465 0.081047
2025-12-01 -7.272064 -0.154390

Repeat the weighted average for the remaining four forecasting horizons, adding each as a new column to the table.

In [25]:
# ---------------------------------------------------
V = -1
N = "one_back"

d = filtered.loc[filtered['Vintage'] == V]
weighted = (
    (d['value'] * d['weight']).groupby(d['date']).sum()
    / d.groupby(d['date'])['weight'].sum()
).reset_index(name=N).set_index('date')

df = df.merge(weighted, left_index=True, right_index=True, how='left')

# ---------------------------------------------------
V = 0
N = "zero_back"

d = filtered.loc[filtered['Vintage'] == V]
weighted = (
    (d['value'] * d['weight']).groupby(d['date']).sum()
    / d.groupby(d['date'])['weight'].sum()
).reset_index(name=N).set_index('date')

df = df.merge(weighted, left_index=True, right_index=True, how='left')

# ---------------------------------------------------
V = 1
N = "one_ahead"

d = filtered.loc[filtered['Vintage'] == V]
weighted = (
    (d['value'] * d['weight']).groupby(d['date']).sum()
    / d.groupby(d['date'])['weight'].sum()
).reset_index(name=N).set_index('date')

df = df.merge(weighted, left_index=True, right_index=True, how='left')

# ---------------------------------------------------
V = 2
N = "two_ahead"

d = filtered.loc[filtered['Vintage'] == V]
weighted = (
    (d['value'] * d['weight']).groupby(d['date']).sum()
    / d.groupby(d['date'])['weight'].sum()
).reset_index(name=N).set_index('date')

df = df.merge(weighted, left_index=True, right_index=True, how='left')

df
Out[25]:
actuals two_back one_back zero_back one_ahead two_ahead
date
2023-06-01 -0.066758 0.121110 0.260364 -0.108956 0.290239 0.507318
2023-09-01 1.988200 0.129354 0.101427 0.095867 0.031705 0.458202
2023-12-01 -0.778219 0.115386 0.432420 0.643221 0.705854 0.571956
2024-03-01 -0.054704 0.200174 0.359627 -0.016257 -0.148880 -0.032345
2024-06-01 -0.997453 0.022680 0.080296 0.274846 1.075381 0.317263
2024-09-01 -1.055556 -0.008797 -0.301066 -0.384276 0.068972 0.107290
2024-12-01 1.691583 0.080847 0.228743 0.387874 0.125630 0.164198
2025-03-01 1.086629 -0.123339 -0.125767 -0.160577 0.274556 0.131708
2025-06-01 0.255034 -0.170563 0.192939 0.253161 0.364079 0.389159
2025-09-01 0.350465 0.081047 0.159874 0.029010 0.505154 0.589125
2025-12-01 -7.272064 -0.154390 -0.104595 0.288565 1.771885 0.338105

Alternatively, the same computation can be written as a single loop, starting from scratch:

In [26]:
horizons = {
    'two_back': -2,
    'one_back': -1,
    'zero_back': 0,
    'one_ahead': 1,
    'two_ahead': 2
}

filtered = filtered.copy()
filtered['date'] = pd.to_datetime(filtered['date'])
filtered['weight'] = 1 / filtered['RMSE']

df = results_table[['actuals']]
df = df.groupby('date').mean()
df.index = pd.to_datetime(df.index)
df
Out[26]:
actuals
date
2023-06-01 -0.066758
2023-09-01 1.988200
2023-12-01 -0.778219
2024-03-01 -0.054704
2024-06-01 -0.997453
2024-09-01 -1.055556
2024-12-01 1.691583
2025-03-01 1.086629
2025-06-01 0.255034
2025-09-01 0.350465
2025-12-01 -7.272064

Compact loop version: compute the RMSE-weighted average nowcast for all five horizons at once. This replaces the step-by-step approach above and produces the same result more concisely.

In [27]:
horizons = {
    'two_back': -2,
    'one_back': -1,
    'zero_back': 0,
    'one_ahead': 1,
    'two_ahead': 2
}

for N, V in horizons.items():
    d = filtered.loc[filtered['Vintage'] == V]

    weighted = (
        (d['value'] * d['weight']).groupby(d['date']).sum()
        / d.groupby(d['date'])['weight'].sum()
    ).reset_index(name=N).set_index('date')

    df = df.merge(weighted, left_index=True, right_index=True, how='left')

df
Out[27]:
actuals two_back one_back zero_back one_ahead two_ahead
date
2023-06-01 -0.066758 0.121110 0.260364 -0.108956 0.290239 0.507318
2023-09-01 1.988200 0.129354 0.101427 0.095867 0.031705 0.458202
2023-12-01 -0.778219 0.115386 0.432420 0.643221 0.705854 0.571956
2024-03-01 -0.054704 0.200174 0.359627 -0.016257 -0.148880 -0.032345
2024-06-01 -0.997453 0.022680 0.080296 0.274846 1.075381 0.317263
2024-09-01 -1.055556 -0.008797 -0.301066 -0.384276 0.068972 0.107290
2024-12-01 1.691583 0.080847 0.228743 0.387874 0.125630 0.164198
2025-03-01 1.086629 -0.123339 -0.125767 -0.160577 0.274556 0.131708
2025-06-01 0.255034 -0.170563 0.192939 0.253161 0.364079 0.389159
2025-09-01 0.350465 0.081047 0.159874 0.029010 0.505154 0.589125
2025-12-01 -7.272064 -0.154390 -0.104595 0.288565 1.771885 0.338105

Graph results¶

In [28]:
plot_df = df.rename(columns={
    'actuals': 'observed',
    'two_back': 'p=-2',
    'one_back': 'p=-1',
    'zero_back': 'p=0',
    'one_ahead': 'p=1',
    'two_ahead': 'p=2'
})
plot_df.index = pd.PeriodIndex(plot_df.index, freq='Q').astype(str)
fig, ax = plt.subplots(figsize=(10,6))

# Plot 'observed' as a black dashed line, no marker
plot_df['observed'].plot(
    ax=ax, color='black', linestyle='-', linewidth=2, label='observed'
)

# Define line styles and markers for other lines
line_styles = ['--', '--', '-.', ':', (0, (3, 5, 1, 5))]  # various dash styles
markers = ['x', 'o', '^', 'v', 'D']  # x, circle, triangles up/down, diamond

# Plot the rest with unique line styles and markers
for idx, col in enumerate(['p=-2', 'p=-1', 'p=0', 'p=1', 'p=2']):
    plot_df[col].plot(
        ax=ax,
        color=idb_blue,
        linestyle=line_styles[idx],
        marker=markers[idx],
        linewidth=1.5,
        markersize=5,
        label=col
    )

# Labels and grid
ax.set_title('')
ax.set_xlabel('Quarter', fontsize=12)
ax.set_ylabel('QoQ GDP Growth (%)', fontsize=12)
ax.legend(title='Publication Lag', fontsize=12, title_fontsize=12)
plt.tight_layout()

plt.show()

Weighted Nowcast with Confidence Intervals¶

In [29]:
df = results_table[['actuals']].groupby('date').mean()
df
Out[29]:
actuals
date
2023-06-01 -0.066758
2023-09-01 1.988200
2023-12-01 -0.778219
2024-03-01 -0.054704
2024-06-01 -0.997453
2024-09-01 -1.055556
2024-12-01 1.691583
2025-03-01 1.086629
2025-06-01 0.255034
2025-09-01 0.350465
2025-12-01 -7.272064

Compute weighted average nowcasts and 95% confidence intervals for all horizons. For each horizon we calculate the weighted mean, its standard error (SE), and the interval mean +/- 1.96 x SE.

In [30]:
lags = {
    -2: 'two_back',
    -1: 'one_back',
     0: 'zero_back',
     1: 'one_ahead',
     2: 'two_ahead'
}

for V, N in lags.items():
    # Work from a fresh copy of the filtered panel each iteration
    New_Table = filtered.copy()
    d = New_Table.loc[New_Table['Vintage'] == V]

    g = d.groupby('date')

    meanN = (
        (d['value'] * d['weight']).groupby(d['date']).sum()
        / g['weight'].sum()
    ).rename(N)

    seN = (1 / np.sqrt(g['weight'].sum())).rename(f'se_{N}')

    weighted = pd.concat([meanN, seN], axis=1)

    # Compute 95% confidence interval bounds
    weighted[f'ci_lower_{N}'] = weighted[N] - 1.96 * weighted[f'se_{N}']
    weighted[f'ci_upper_{N}'] = weighted[N] + 1.96 * weighted[f'se_{N}']

    # Concatenate the new columns to the main df
    df = df.join(weighted[[N, f'se_{N}', f'ci_lower_{N}', f'ci_upper_{N}']])

df.index = pd.PeriodIndex(df.index, freq='Q').astype(str)
df
Out[30]:
actuals two_back se_two_back ci_lower_two_back ci_upper_two_back one_back se_one_back ci_lower_one_back ci_upper_one_back zero_back ... ci_lower_zero_back ci_upper_zero_back one_ahead se_one_ahead ci_lower_one_ahead ci_upper_one_ahead two_ahead se_two_ahead ci_lower_two_ahead ci_upper_two_ahead
date
2023Q2 -0.066758 0.121110 0.516343 -0.890923 1.133142 0.260364 0.448102 -0.617915 1.138644 -0.108956 ... -1.009857 0.791945 0.290239 0.501652 -0.693000 1.273478 0.507318 0.565639 -0.601334 1.615970
2023Q3 1.988200 0.129354 0.516343 -0.882678 1.141387 0.101427 0.448102 -0.776852 0.979707 0.095867 ... -0.805034 0.996768 0.031705 0.501652 -0.951534 1.014944 0.458202 0.565639 -0.650450 1.566854
2023Q4 -0.778219 0.115386 0.516343 -0.896646 1.127419 0.432420 0.448102 -0.445860 1.310699 0.643221 ... -0.257680 1.544122 0.705854 0.501652 -0.277385 1.689092 0.571956 0.565639 -0.536696 1.680608
2024Q1 -0.054704 0.200174 0.516343 -0.811858 1.212207 0.359627 0.448102 -0.518653 1.237906 -0.016257 ... -0.917158 0.884644 -0.148880 0.501652 -1.132119 0.834358 -0.032345 0.565639 -1.140997 1.076307
2024Q2 -0.997453 0.022680 0.516343 -0.989353 1.034712 0.080296 0.448102 -0.797984 0.958575 0.274846 ... -0.626055 1.175747 1.075381 0.501652 0.092143 2.058620 0.317263 0.565639 -0.791388 1.425915
2024Q3 -1.055556 -0.008797 0.516343 -1.020829 1.003235 -0.301066 0.448102 -1.179345 0.577214 -0.384276 ... -1.285177 0.516625 0.068972 0.501652 -0.914267 1.052211 0.107290 0.565639 -1.001362 1.215942
2024Q4 1.691583 0.080847 0.516343 -0.931186 1.092879 0.228743 0.448102 -0.649536 1.107023 0.387874 ... -0.513027 1.288775 0.125630 0.501652 -0.857609 1.108868 0.164198 0.565639 -0.944453 1.272850
2025Q1 1.086629 -0.123339 0.516343 -1.135371 0.888693 -0.125767 0.448102 -1.004047 0.752512 -0.160577 ... -1.061478 0.740324 0.274556 0.501652 -0.708682 1.257795 0.131708 0.565639 -0.976944 1.240360
2025Q2 0.255034 -0.170563 0.516343 -1.182595 0.841470 0.192939 0.448102 -0.685341 1.071218 0.253161 ... -0.647740 1.154062 0.364079 0.501652 -0.619159 1.347318 0.389159 0.565639 -0.719493 1.497810
2025Q3 0.350465 0.081047 0.516343 -0.930985 1.093080 0.159874 0.448102 -0.718405 1.038154 0.029010 ... -0.871891 0.929911 0.505154 0.501652 -0.478084 1.488393 0.589125 0.565639 -0.519527 1.697777
2025Q4 -7.272064 -0.154390 0.516343 -1.166422 0.857643 -0.104595 0.448102 -0.982875 0.773684 0.288565 ... -0.612336 1.189466 1.771885 0.501652 0.788646 2.755123 0.338105 0.565639 -0.770547 1.446756

11 rows × 21 columns

Plot the weighted nowcast with confidence bands, one panel per horizon. The shaded area shows the 95% interval around the weighted average, illustrating how uncertainty changes across forecasting horizons.

In [31]:
lags = {
    -2: 'two_back',
    -1: 'one_back',
     0: 'zero_back',
     1: 'one_ahead',
     2: 'two_ahead'
}

fig, axes = plt.subplots(nrows=5, ncols=1, figsize=(10, 20), sharex=True)

# ------------------------------------------------
# Loop over each vintage and plot
for ax, (V, N) in zip(axes, lags.items()):
    # Plot actuals
    ax.plot(df.index, df['actuals'], color='black', linestyle='-', label='Observed')
    
    # Plot nowcast for vintage N
    ax.plot(df.index, df[N], color=idb_blue, linestyle='--', marker='x', label='Weighted Nowcast')
    
    # Plot confidence interval as a shaded area
    ax.fill_between(
        df.index,
        df[f'ci_lower_{N}'],
        df[f'ci_upper_{N}'],
        color=idb_light_blue,
        alpha=0.4,
        label='95% CI'
    )
    
    # Titles and labels
    ax.set_title(f'Information set p={V}', fontsize=12)
    ax.set_ylabel('QoQ GDP Growth (%)')
    ax.legend(fontsize=9, loc='upper right')

axes[-1].set_xlabel('Quarter')

plt.tight_layout()
plt.show()
[Figure: weighted nowcasts with 95% confidence bands, one panel per information set p = -2 … 2]

Out of sample¶

In [57]:
import os
import pandas as pd
import numpy as np
# Fall back to a default run date if this cell is run standalone
try:
    date_str
except NameError:
    date_str = "20260505"


PATH_OUTPUT = os.path.join(os.getcwd(), "..", "output", date_str)
PATH_DATA = os.path.join(os.getcwd(), "..", "data")


import matplotlib.ticker as ticker
import matplotlib.pyplot as plt
import seaborn as sns
'''
# Define the color scheme
idb_blue = "#005BAC"
idb_light_blue = "#7FB3E6"

# Load results
forecast_file = f"{date_str} 3_nwcst Tab fcst.xlsx"
performance_file = f"{date_str} 3_nwcst Tab perf.xlsx"
outofsample_file = f"{date_str} 3_nwcst Tab oos.xlsx"


results_table = pd.read_excel(f"{PATH_OUTPUT}/{forecast_file}", index_col='date', parse_dates=True)
results_table.index = pd.to_datetime(results_table.index, format="%Y-%m-%d")
performance_table = pd.read_excel(f"{PATH_OUTPUT}/{performance_file}", parse_dates=True)
'''

forecast = filtered.copy()
forecast = forecast.dropna(how='all', axis=0)

performance = performance_table.copy()
performance = performance.dropna(how='any', axis=0)
performance['inv_rmse'] = 1 / (performance['RMSE']** 2)
performance['weight'] = performance.groupby('Vintage')['inv_rmse'].transform(lambda x: x / x.sum())

performance.reset_index(drop=False, inplace=True)


performance = performance[['Vintage', 'estimator', 'spec', 'weight']]
performance = performance.set_index('Vintage')
# Map numeric vintages to horizon names in a single pass (avoids order-sensitive string replacement)
performance.index = performance.index.map({
    -2: 'two_back', -1: 'one_back', 0: 'zero_back', 1: 'one_ahead', 2: 'two_ahead'
})
performance = performance.reset_index()

performance.rename(columns={'Vintage': 'horizon'}, inplace=True)

#forecast = forecast.reset_index().melt(id_vars=['date', 'actuals', 'estimator', 'spec'],
#        value_vars=['two_back', 'one_back', 'zero_back', 'one_ahead', 'two_ahead'],
#        var_name='Vintage',
#        value_name='value')
forecast = forecast.drop(columns=['weight'], errors='ignore')
forecast = forecast.merge(performance, on=['horizon', 'estimator', 'spec'])
forecast = forecast.dropna(how='all', axis=0)
forecast = forecast[~forecast['value'].isna()]
forecast['value'] = ( ((1+ ( forecast['value']/100) )**4)-1 )*100
performance
Out[57]:
horizon estimator spec weight
0 two_back OLS lars46 0.074801
1 one_back OLS lars46 0.076518
2 zero_back OLS lars46 0.064945
3 one_ahead OLS lars46 0.004265
4 two_ahead OLS lars46 0.004540
... ... ... ... ...
75 two_back LSTM lars49 0.058461
76 one_back LSTM lars49 0.063251
77 zero_back LSTM lars49 0.066015
78 one_ahead LSTM lars49 0.106243
79 two_ahead LSTM lars49 0.113962

80 rows × 4 columns

Build the nowcast summary statistics. Load observed GDP growth, then compute the simple average, GBT-only, and RMSE-weighted nowcast for each quarter in the evaluation sample.
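The inverse-MSE weighting behind the RMSE-weighted nowcast can be sketched on made-up RMSEs (model names and values below are illustrative only):

```python
import pandas as pd

# Illustrative per-model RMSEs
perf = pd.DataFrame({"model": ["OLS", "RF", "GBT"], "RMSE": [2.0, 2.5, 4.0]})

# Inverse squared RMSE, normalized so the weights sum to one
perf["inv_rmse"] = 1 / perf["RMSE"] ** 2
perf["weight"] = perf["inv_rmse"] / perf["inv_rmse"].sum()
```

More accurate models (lower RMSE) receive proportionally larger weights, and doubling the RMSE quarters the raw weight.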

In [58]:
data = pd.read_csv(f"{PATH_DATA}/data.csv", parse_dates=["date"], index_col='date')
data = data[['RGDP0000']].dropna()

from statsmodels.tsa.seasonal import seasonal_decompose
data['RGDP0000'] = seasonal_decompose(
    data['RGDP0000'], model='additive', extrapolate_trend='freq', period=4
).trend

'''
data = pd.read_csv(f"{PATH_FORMAT}/data_full.csv", parse_dates=["date"], index_col='date')['RGDP0000'].dropna()
'''
gdp = data.resample('QE').agg({'RGDP0000': 'mean'})  # 'QE' = quarter-end ('Q' is deprecated)
gdp.index = gdp.index.to_period('Q').asfreq('M').to_timestamp()
gdp = (gdp.pct_change(periods=1)).dropna()*100
gdp.rename(columns = { 'RGDP0000' : 'actuals' } , inplace=True)
gdp['actuals'] = (((1+gdp['actuals']/100)**4) - 1)*100

actuals = pd.DataFrame(gdp['actuals'].groupby(gdp.index).mean())

avg = pd.DataFrame(forecast.groupby('date')['value'].mean()).rename(columns = {'value' : 'avg'})

est_fcst = pd.DataFrame( forecast.groupby(['date', 'estimator'])['value'].mean() ).rename(columns = {'value' : 'avg'}).reset_index()
gbt_fcst = est_fcst[est_fcst['estimator']=='GBT'].set_index('date')[['avg']].rename(columns = {'avg' : 'GBT'})

avg = pd.concat([avg, gbt_fcst], axis=1)

wavg = (
    forecast.groupby('date')[['value', 'weight']]
    .apply(lambda g: np.average(g['value'], weights=g['weight']) if g['weight'].sum() > 0 else np.nan)
    .rename('weighted_avg')
    .to_frame()
)

avg = pd.concat([avg, wavg], axis=1)
avg.index = pd.to_datetime(avg.index)
avg
Out[58]:
avg GBT weighted_avg
date
2023-06-01 0.670382 2.252378 1.055284
2023-09-01 0.411530 1.445939 0.812534
2023-12-01 2.226002 2.414542 1.994482
2024-03-01 0.354004 2.142580 0.214557
2024-06-01 2.016769 1.439138 1.321492
2024-09-01 -0.407287 0.949355 -0.387306
2024-12-01 0.690723 2.106194 0.909163
2025-03-01 0.067927 1.357796 0.043184
2025-06-01 0.862448 2.065746 0.950449
2025-09-01 1.123059 2.920250 1.178173
2025-12-01 3.148724 1.637126 1.443217

Load the out-of-sample prediction panel and filter it by quality. We read all model predictions for the nowcast horizon, convert them to annualized growth rates, and keep only models that also appear in the filtered in-sample performance panel.
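The annualization applied to `nowcast` compounds the quarterly rate over four quarters; for example:

```python
# 0.5% QoQ growth compounded over four quarters
g_q = 0.5
g_ann = ((1 + g_q / 100) ** 4 - 1) * 100
```

The compounded rate exceeds the naive 4 × 0.5 = 2.0 by a small amount, and the gap widens with larger quarterly rates.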

In [60]:
out_of_sample = pd.read_excel(f"{PATH_OUTPUT}/{outofsample_file}", parse_dates=True)
out_of_sample['date'] = pd.to_datetime(out_of_sample['date'])
out_of_sample['nowcast'] = (((1+out_of_sample['nowcast']/100)**4) - 1)*100


# Filter out of sample by RMSE
filtered_grouped = (
    filtered[filtered['Vintage'] == 2]
    .groupby(['spec', 'estimator'], as_index=False)
    .agg({
        'RMSE': 'mean',
        #'threshold': 'mean'
    })
)

out_of_sample = out_of_sample.merge(
    filtered_grouped,
    on=['spec', 'estimator'],
    how='left'
)
out_of_sample.dropna(subset=['RMSE'], inplace=True)
#out_of_sample = out_of_sample[out_of_sample['RMSE'] <= out_of_sample['threshold']]
out_of_sample
Out[60]:
date nowcast estimator spec RMSE
4 2026-03-01 11.655263 DT lars46 2.549988
5 2026-03-01 1.608246 RF lars46 2.564108
6 2026-03-01 2.647425 GBT lars46 2.554940
11 2026-03-01 11.655263 DT lars49 2.566734
12 2026-03-01 2.722249 RF lars49 2.559070
13 2026-03-01 2.416893 GBT lars49 2.599546

Compute the weighted average and confidence interval for the out-of-sample nowcast. Models are weighted by their inverse squared historical RMSE, and a 5th–95th percentile band across model predictions provides the interval.
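The percentile band is simply a cross-model quantile per date; a minimal sketch with made-up nowcasts (the values below are illustrative):

```python
import pandas as pd

# Five hypothetical model nowcasts for a single quarter
nc = pd.DataFrame({
    "date": ["2026-03-01"] * 5,
    "nowcast": [1.6, 2.4, 2.6, 2.7, 11.7],
})

# 5th and 95th percentiles across models, per date
band = (
    nc.groupby("date")["nowcast"]
    .quantile([0.05, 0.95])
    .unstack()
    .rename(columns={0.05: "ci_lower", 0.95: "ci_upper"})
)
```

With five models, pandas' default linear interpolation puts the bounds near the extreme predictions, so a single outlier model widens the band noticeably.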

In [61]:
#lower_bound = out_of_sample['nowcast'].quantile(0.10)
#upper_bound = out_of_sample['nowcast'].quantile(0.90)
#out_of_sample['nowcast'] = out_of_sample['nowcast'].where((out_of_sample['nowcast'] >= lower_bound) & (out_of_sample['nowcast'] <= upper_bound), np.nan)
'''
quantiles = out_of_sample.groupby('date')['nowcast'].quantile([0.10, 0.90]).unstack()
out_of_sample = out_of_sample.merge(quantiles, on='date', suffixes=('', '_quantile'))
out_of_sample['nowcast'] = out_of_sample['nowcast'].where(
    (out_of_sample['nowcast'] >= out_of_sample[0.10]) & 
    (out_of_sample['nowcast'] <= out_of_sample[0.90]), 
    np.nan
)
out_of_sample = out_of_sample.drop(columns=[0.10, 0.90])
'''

model_weights = performance.groupby(['estimator', 'spec'])['weight'].mean().reset_index()
out_of_sample = out_of_sample.merge(model_weights, on=['estimator', 'spec']).set_index('date')

oos_avg = pd.DataFrame(out_of_sample.groupby(out_of_sample.index)['nowcast'].mean()).rename(columns = {'nowcast' : 'avg'})

oos_gbt = pd.DataFrame(out_of_sample.groupby(['date', 'estimator'])['nowcast'].mean().reset_index()).set_index('date').rename(columns = {'nowcast' : 'GBT'})
oos_gbt = pd.DataFrame(oos_gbt[oos_gbt['estimator'] == 'GBT']['GBT'])
oos_avg = pd.concat([oos_avg, oos_gbt], axis=1)

out_of_sample = out_of_sample.dropna(how='any', axis=0)
wavg = (
    out_of_sample.groupby(out_of_sample.index)
    .apply(lambda g: np.average(g['nowcast'], weights=g['weight']) if g['weight'].sum() > 0 else np.nan)
    .rename('weighted_avg')
    .to_frame()
)
oos_avg = pd.concat([oos_avg, wavg], axis=1)

dfrange = out_of_sample.groupby(out_of_sample.index)['nowcast'].agg(min='min', max='max').reset_index().set_index('date')
df_ci = (
    out_of_sample
    .groupby(out_of_sample.index)['nowcast']
    .quantile([0.05, 0.95])
    .unstack()
    .rename(columns={0.05: 'ci_lower', 0.95: 'ci_upper'})
)
oos_avg = pd.concat([oos_avg, dfrange, df_ci], axis=1)

oos_avg.index = pd.to_datetime(oos_avg.index)
oos_avg
Out[61]:
avg GBT weighted_avg min max ci_lower ci_upper
date
2026-03-01 5.45089 2.532159 5.453093 1.608246 11.655263 1.810408 11.655263

Combine in-sample and out-of-sample nowcast summaries into a single table, merging columns with the same name.
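Because the two summary frames share column names, a plain `concat` leaves duplicate columns; grouping the transposed frame by column name collapses them, and `min_count=1` keeps a NaN only where neither frame has a value. A toy sketch (dates and values are illustrative):

```python
import pandas as pd

# Two frames sharing the 'avg' column on disjoint dates
a = pd.DataFrame({"avg": [1.0]}, index=pd.to_datetime(["2025-12-01"]))
b = pd.DataFrame({"avg": [2.0], "max": [3.0]}, index=pd.to_datetime(["2026-03-01"]))

combined = pd.concat([a, b], axis=1)               # columns: avg, avg, max
combined = combined.T.groupby(level=0).sum(min_count=1).T
```

The transpose-groupby pattern is the non-deprecated equivalent of `groupby(level=0, axis=1)`.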

In [62]:
data = pd.concat( [ avg , oos_avg ] , axis=1 )
data = data.T.groupby(level=0).sum(min_count=1).T
data
Out[62]:
GBT avg ci_lower ci_upper max min weighted_avg
date
2023-06-01 2.252378 0.670382 NaN NaN NaN NaN 1.055284
2023-09-01 1.445939 0.411530 NaN NaN NaN NaN 0.812534
2023-12-01 2.414542 2.226002 NaN NaN NaN NaN 1.994482
2024-03-01 2.142580 0.354004 NaN NaN NaN NaN 0.214557
2024-06-01 1.439138 2.016769 NaN NaN NaN NaN 1.321492
2024-09-01 0.949355 -0.407287 NaN NaN NaN NaN -0.387306
2024-12-01 2.106194 0.690723 NaN NaN NaN NaN 0.909163
2025-03-01 1.357796 0.067927 NaN NaN NaN NaN 0.043184
2025-06-01 2.065746 0.862448 NaN NaN NaN NaN 0.950449
2025-09-01 2.920250 1.123059 NaN NaN NaN NaN 1.178173
2025-12-01 1.637126 3.148724 NaN NaN NaN NaN 1.443217
2026-03-01 2.532159 5.450890 1.810408 11.655263 11.655263 1.608246 5.453093

Add the observed GDP growth series to the combined table, so we can compare nowcasts against actual outcomes side by side.

In [63]:
data = pd.concat([ data, actuals ] , axis=1)
data = pd.concat([ data, gdp ] , axis=1)
data
Out[63]:
GBT avg ci_lower ci_upper max min weighted_avg actuals actuals
date
2015-06-01 NaN NaN NaN NaN NaN NaN NaN 3.018851 3.018851
2015-09-01 NaN NaN NaN NaN NaN NaN NaN 3.625360 3.625360
2015-12-01 NaN NaN NaN NaN NaN NaN NaN 1.640289 1.640289
2016-03-01 NaN NaN NaN NaN NaN NaN NaN 3.231093 3.231093
2016-06-01 NaN NaN NaN NaN NaN NaN NaN 3.894294 3.894294
2016-09-01 NaN NaN NaN NaN NaN NaN NaN 2.801642 2.801642
2016-12-01 NaN NaN NaN NaN NaN NaN NaN 1.756087 1.756087
2017-03-01 NaN NaN NaN NaN NaN NaN NaN 2.300944 2.300944
2017-06-01 NaN NaN NaN NaN NaN NaN NaN 2.832481 2.832481
2017-09-01 NaN NaN NaN NaN NaN NaN NaN 2.612075 2.612075
2017-12-01 NaN NaN NaN NaN NaN NaN NaN 2.673566 2.673566
2018-03-01 NaN NaN NaN NaN NaN NaN NaN 2.524350 2.524350
2018-06-01 NaN NaN NaN NaN NaN NaN NaN 2.359652 2.359652
2018-09-01 NaN NaN NaN NaN NaN NaN NaN 2.515393 2.515393
2018-12-01 NaN NaN NaN NaN NaN NaN NaN 2.325332 2.325332
2019-03-01 NaN NaN NaN NaN NaN NaN NaN 1.876144 1.876144
2019-06-01 NaN NaN NaN NaN NaN NaN NaN 1.295256 1.295256
2019-09-01 NaN NaN NaN NaN NaN NaN NaN 0.620019 0.620019
2019-12-01 NaN NaN NaN NaN NaN NaN NaN -8.044961 -8.044961
2020-03-01 NaN NaN NaN NaN NaN NaN NaN -12.641871 -12.641871
2020-06-01 NaN NaN NaN NaN NaN NaN NaN -8.487563 -8.487563
2020-09-01 NaN NaN NaN NaN NaN NaN NaN -7.088338 -7.088338
2020-12-01 NaN NaN NaN NaN NaN NaN NaN 3.957524 3.957524
2021-03-01 NaN NaN NaN NaN NaN NaN NaN 10.977186 10.977186
2021-06-01 NaN NaN NaN NaN NaN NaN NaN 7.542204 7.542204
2021-09-01 NaN NaN NaN NaN NaN NaN NaN 7.974837 7.974837
2021-12-01 NaN NaN NaN NaN NaN NaN NaN 7.132166 7.132166
2022-03-01 NaN NaN NaN NaN NaN NaN NaN 7.071148 7.071148
2022-06-01 NaN NaN NaN NaN NaN NaN NaN 5.733753 5.733753
2022-09-01 NaN NaN NaN NaN NaN NaN NaN 4.120901 4.120901
2022-12-01 NaN NaN NaN NaN NaN NaN NaN 3.341701 3.341701
2023-03-01 NaN NaN NaN NaN NaN NaN NaN 2.492350 2.492350
2023-06-01 2.252378 0.670382 NaN NaN NaN NaN 1.055284 2.149795 2.149795
2023-09-01 1.445939 0.411530 NaN NaN NaN NaN 0.812534 1.412103 1.412103
2023-12-01 2.414542 2.226002 NaN NaN NaN NaN 1.994482 0.597036 0.597036
2024-03-01 2.142580 0.354004 NaN NaN NaN NaN 0.214557 -1.370031 -1.370031
2024-06-01 1.439138 2.016769 NaN NaN NaN NaN 1.321492 -1.657113 -1.657113
2024-09-01 0.949355 -0.407287 NaN NaN NaN NaN -0.387306 0.129506 0.129506
2024-12-01 2.106194 0.690723 NaN NaN NaN NaN 0.909163 1.341081 1.341081
2025-03-01 1.357796 0.067927 NaN NaN NaN NaN 0.043184 2.689039 2.689039
2025-06-01 2.065746 0.862448 NaN NaN NaN NaN 0.950449 -1.154368 -1.154368
2025-09-01 2.920250 1.123059 NaN NaN NaN NaN 1.178173 3.293706 3.293706
2025-12-01 1.637126 3.148724 NaN NaN NaN NaN 1.443217 1.360150 1.360150
2026-03-01 2.532159 5.450890 1.810408 11.655263 11.655263 1.608246 5.453093 NaN NaN

Draw the final nowcast chart. It shows observed GDP growth (black), the simple average nowcast (red dotted), and the RMSE-weighted nowcast (indigo dashed) with a grey uncertainty band. The sample is restricted to 2022 onwards.

In [64]:
import matplotlib.pyplot as plt

dataframe = data.copy()

# Ensure the last observed point has no interval
last_valid_idx = dataframe['actuals'].last_valid_index()
dataframe.at[last_valid_idx, 'ci_lower'] = dataframe.at[last_valid_idx, 'weighted_avg']
dataframe.at[last_valid_idx, 'ci_upper'] = dataframe.at[last_valid_idx, 'weighted_avg']

# Restrict sample
dataframe = dataframe[dataframe.index > "2022-01-01"]

plt.figure(figsize=(10, 6))

# Confidence interval band
plt.fill_between(
    dataframe.index,
    dataframe['ci_lower'],
    dataframe['ci_upper'],
    color='grey',
    alpha=0.15,
    edgecolor='none',
    label="_nolegend_"
)

# Observed GDP
plt.plot(
    dataframe.index,
    dataframe['actuals'],
    color='black',
    marker='>',
    linewidth=2,
    label='Observed GDP'
)

# Average nowcast
plt.plot(
    dataframe.index,
    dataframe['avg'],
    color='crimson',
    linestyle=":",
    marker='x',
    markersize=5,
    label='Average Nowcast'
)

# Weighted nowcast
plt.plot(
    dataframe.index,
    dataframe['weighted_avg'],
    color='indigo',
    linestyle="--",
    marker='x',
    markersize=8,
    label='RMSE-weighted Nowcast'
)

# Zero line
plt.axhline(y=0, color='black', linestyle='--', linewidth=1)

# Labels
plt.xlabel('Date', fontsize=12)
plt.ylabel('Real GDP Annualized Growth Rate (QoQ%)', fontsize=12)

# Rotate x labels
plt.xticks(rotation=45)

# Remove duplicate legend entries (safety)
handles, labels = plt.gca().get_legend_handles_labels()
by_label = dict(zip(labels, handles))
plt.legend(by_label.values(), by_label.keys(), loc='lower left')

# Layout and save
plt.tight_layout()
plt.savefig(f"{PATH_OUTPUT}/{date_str} 4_rslt Fig out_of_sample.png",
            dpi=300,
            bbox_inches='tight')

plt.show()
[Figure: observed GDP growth vs. average and RMSE-weighted nowcasts, with grey uncertainty band]