Introduction¶
This notebook provides an example of using calibrated CNit v2.1 parameters to emulate the TRENDY-v11 DGVMs. Note that this notebook uses some data files with relative paths, which may not work on local machines. If you download and rerun this notebook, you must first obtain the external data and then update the file paths accordingly.
The data are available in: https://doi.org/10.5281/zenodo.18116574
- The TRENDY DGVM-calibrated parameters: cali_trendy_best_fitting.csv
- The TRENDY S0-S3 forcing data: TRENDY_cali_forcing.nc
- The TRENDY S0-S3 outputs: TRENDY_cali_target.nc
- The notebook itself: CNit_example_TRENDY_DGVMs_emulation.ipynb
TRENDY DGVMs are used in the Global Carbon Budget to estimate the potential land carbon sink. They are also used to attribute net biome production to environmental drivers. Briefly, the S3 experiment is the historical run. For details on the TRENDY project and experiments, see: https://doi.org/10.1029/2024GB008102
%load_ext nb_black
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import xarray as xr
import os
from cnit import CNitModel, CNitModelConfig, CNitExpConfig
from cnit import Q # Local pint.quantity class for units handling
Set default plotting properties¶
plt.rcParams.update(
{
"font.size": 6, # Default font size for all text
"axes.titlesize": 6, # Axes title
"axes.labelsize": 6, # Axes labels
"xtick.labelsize": 6, # X-axis tick labels
"ytick.labelsize": 6, # Y-axis tick labels
"legend.fontsize": 6, # Legend text
"figure.titlesize": 6, # Figure title
"lines.linewidth": 0.5, # Default line width
"figure.figsize": (9 / 2.54, 6 / 2.54), # Default figure size in cm
"figure.dpi": 150, # Default DPI for figures
"figure.constrained_layout.use": True, # Use constrained layout by default
"xtick.major.size": 1, # X-axis major tick length
"ytick.major.size": 1, # Y-axis major tick length
}
)
Read the forcing data and DGVM outputs¶
Note the path need to be changed before running
res_path = os.path.dirname(os.getcwd()) + "/results/"
data_path = os.path.dirname(os.getcwd()) + "/data/"
trendy_forcing = xr.open_dataset(data_path + "TRENDY_cali_forcing.nc")
trendy_target = xr.open_dataset(data_path + "TRENDY_cali_target.nc")
trendy_target = trendy_target.sel(time=np.arange(1700, 2021))
trendy_forcing = trendy_forcing.sel(time=np.arange(1700, 2021))
Read the calibrated parameters¶
Note: The path needs to be updated before running.
The calibrated parameters are stored in a CSV file, with parameter names as the index and different emulated DGVMs as columns, along with a unit column.
cali_para_df = pd.read_csv(res_path + "cali_trendy_best_fitting.csv", index_col=0)
mdl_list = [i for i in cali_para_df.columns.to_list() if "unit" not in i]
print(mdl_list)
['CABLEPOP', 'CLASSIC', 'CLM5.0', 'DLEM', 'ED', 'ELM', 'IBIS', 'ISAM', 'ISBACTRIP', 'JSBACH', 'JULES', 'LPJ-GUESS', 'LPJml', 'LPJwsl', 'OCN', 'ORCHIDEE', 'SDGVM', 'VISIT', 'YIBS', 'lpxqs']
Check the valid time range of the DGVM outputs and align the initial state using the S3 experiment as the reference.¶
- Some DGVMs provide data for different experimental periods.
- Their initial states for S0–S3 also differ in some cases, which is not reasonable if they start from the same spin-up.
state_var = [
"CplsP",
"CplsL",
"CplsS",
"CplsPLS",
"NplsP",
"NplsL",
"NplsS",
"NplsM",
"NplsPLS",
"NplsPLSM",
]
for mdl in mdl_list:
valid_time = trendy_target["time"][
~trendy_target["CflxNPP"].sel(model=mdl, exp="S3").isnull()
].data
print(mdl, valid_time[0])
for exp in trendy_target.exp.data:
gap = trendy_target.sel(
model=mdl, exp="S3", time=valid_time[0]
) - trendy_target.sel(model=mdl, exp=exp, time=valid_time[0])
trendy_target[state_var].loc[dict(model=mdl, exp=exp, time=valid_time)] = (
trendy_target[state_var].loc[dict(model=mdl, exp=exp, time=valid_time)]
+ gap[state_var]
)
CABLEPOP 1700 CLASSIC 1701 CLM5.0 1701 DLEM 1700 ED 1700 ELM 1700 IBIS 1700 ISAM 1700 ISBACTRIP 1700 JSBACH 1700 JULES 1700 LPJ-GUESS 1700 LPJml 1700 LPJwsl 1700 OCN 1700 ORCHIDEE 1700 SDGVM 1700 VISIT 1860 YIBS 1700 lpxqs 1700
Config models and experiments¶
- We use a simple dictionary for model configurations.
- We use CNitExpConfig for experiment configurations.
mdl_cfg_by_mdl = {}
for mdl in mdl_list:
col_df = cali_para_df[["unit", mdl]].dropna()
mdl_cfg_by_mdl[mdl] = {
para: Q(col_df.loc[para, mdl], col_df.loc[para, "unit"])
for para in col_df.index
}
# For simplicity, we turn off the nitrogen pool calculation
mdl_cfg_by_mdl[mdl]["switch_Npls"] = [0, 0, 0, 0]
exp_cfgs_by_mdl = {}
for mdl in mdl_list:
exp_cfgs_by_mdl[mdl] = []
valid_time = trendy_target["time"][
~trendy_target["CflxNPP"].sel(model=mdl, exp="S3").isnull()
].data
for exp in trendy_forcing.exp.data:
exp_forcing = trendy_forcing.sel(exp=exp, time=valid_time)
switch_N = (
Q(0, "1")
if ("Conly" in exp or cali_para_df.loc["NplsP0", mdl] == 0)
else Q(1, "1")
)
exp_cfg = CNitExpConfig(
name=exp,
switch_N=switch_N,
time_axis=Q(exp_forcing["time"].data, exp_forcing["time"].unit),
dT_s=Q(exp_forcing["dT"].data, exp_forcing["dT"].unit),
CO2_s=Q(exp_forcing["CO2"].data, exp_forcing["CO2"].unit),
CemsLUnet_s=Q(exp_forcing["CemsLUnet"].data, exp_forcing["CemsLUnet"].unit),
CemsLUgrs_s=Q(exp_forcing["CemsLUgrs"].data, exp_forcing["CemsLUgrs"].unit),
NflxAD_s=Q(exp_forcing["NflxAD"].data, exp_forcing["NflxAD"].unit),
NflxFT_s=Q(exp_forcing["NflxFT"].data, exp_forcing["NflxFT"].unit),
NemsLUnet_s=Q(exp_forcing["NemsLUnet"].data, exp_forcing["NemsLUnet"].unit),
NemsLUgrs_s=Q(exp_forcing["NemsLUgrs"].data, exp_forcing["NemsLUgrs"].unit),
NemsLUmin_s=Q(exp_forcing["NemsLUmin"].data, exp_forcing["NemsLUmin"].unit),
)
exp_cfgs_by_mdl[mdl].append(exp_cfg)
Run experiments with emulated DGVMs¶
- Here we run the S1-S3 experiments with the DGVM-calibrated CNit (i.e., DGVM-specific CNit).
- S0 is just the preindustrial control.
%%time
res = []
for mdl in mdl_list:
mdl_cfg = mdl_cfg_by_mdl[mdl]
exp_cfgs = exp_cfgs_by_mdl[mdl]
cnit_mdl = CNitModel.from_dict(mdl_cfg)
mdl_res = cnit_mdl.run_experiments(exp_cfgs)
mdl_res = mdl_res.assign_coords(model=mdl).expand_dims("model")
res.append(mdl_res)
print("finished:", mdl)
finished: CABLEPOP finished: CLASSIC finished: CLM5.0 finished: DLEM finished: ED finished: ELM finished: IBIS finished: ISAM finished: ISBACTRIP finished: JSBACH finished: JULES finished: LPJ-GUESS finished: LPJml finished: LPJwsl finished: OCN finished: ORCHIDEE finished: SDGVM finished: VISIT finished: YIBS finished: lpxqs CPU times: user 21.9 s, sys: 88.8 ms, total: 21.9 s Wall time: 22 s
res_ds = xr.concat(res, dim="model")
Visualize results¶
The results below show the agreement between DGVM outputs and CNit emulations, demonstrating the emulation performance of the calibrated CNit.
time = res_ds.time.data
def plot_single_model(axs, var, mdl):
for ax, exp in zip(axs, ["S1", "S2", "S3"]):
data_ = trendy_target[var].sel(model=mdl, exp=exp)
mdl_ = res_ds[var].sel(model=mdl, experiment=exp)
if np.nansum(data_) != 0:
ax.plot(time, data_, linewidth=1)
ax.plot(time, mdl_, "--", linewidth=1)
ax.text(
0.1,
0.9,
exp,
transform=ax.transAxes,
ha="center",
va="center",
fontsize=8,
)
else:
ax.remove()
axs[0].set_title(mdl, fontsize=8)
var = "CflxNPP"
experiments = ["S1", "S2", "S3"]
n_model_per_row = 4 # number of models per row
n_models = len(mdl_list)
n_rows = int(np.ceil(n_models / n_model_per_row)) # use np.ceil
n_cols = n_model_per_row * len(experiments)
fig, axs = plt.subplots(
n_rows, n_cols, figsize=(3 * n_cols / 2.54, 3 * n_rows / 2.54), squeeze=False
)
for idx, mdl in enumerate(mdl_list):
row = idx // n_model_per_row
col_start = (idx % n_model_per_row) * len(experiments)
axs_row = axs[row, col_start : col_start + len(experiments)]
plot_single_model(axs_row, var, mdl)
axs[0, 0].legend(["model", "CNit"], frameon=0, loc="center left")
# fig.savefig("npp cnit vs dgvms.jpg", dpi=450, bbox_inches="tight", pad_inches=0.02)
<matplotlib.legend.Legend at 0x166f8dc70>
var = "CplsPLS"
experiments = ["S1", "S2", "S3"]
n_model_per_row = 4 # number of models per row
n_models = len(mdl_list)
n_rows = int(np.ceil(n_models / n_model_per_row)) # use np.ceil
n_cols = n_model_per_row * len(experiments)
fig, axs = plt.subplots(
n_rows, n_cols, figsize=(3 * n_cols / 2.54, 3 * n_rows / 2.54), squeeze=False
)
for idx, mdl in enumerate(mdl_list):
row = idx // n_model_per_row
col_start = (idx % n_model_per_row) * len(experiments)
axs_row = axs[row, col_start : col_start + len(experiments)]
plot_single_model(axs_row, var, mdl)
axs[0, 0].legend(["model", "CNit"], frameon=0, loc="center left")
# fig.savefig("cland cnit vs dgvms.jpg", dpi=450, bbox_inches="tight", pad_inches=0.02)
<matplotlib.legend.Legend at 0x168142510>
var = "CflxNetPLS"
experiments = ["S1", "S2", "S3"]
n_model_per_row = 4 # number of models per row
n_models = len(mdl_list)
n_rows = int(np.ceil(n_models / n_model_per_row)) # use np.ceil
n_cols = n_model_per_row * len(experiments)
fig, axs = plt.subplots(
n_rows, n_cols, figsize=(3 * n_cols / 2.54, 3 * n_rows / 2.54), squeeze=False
)
for idx, mdl in enumerate(mdl_list):
row = idx // n_model_per_row
col_start = (idx % n_model_per_row) * len(experiments)
axs_row = axs[row, col_start : col_start + len(experiments)]
plot_single_model(axs_row, var, mdl)
axs[0, 0].legend(["model", "CNit"], frameon=0, loc="center left")
# fig.savefig("nbp cnit vs dgvms.jpg", dpi=450, bbox_inches="tight", pad_inches=0.02)
<matplotlib.legend.Legend at 0x175292510>
Run additional experiments¶
- A key strength of CNit is that, after calibration, it can run large ensembles of experiments, filling the gap where complex models (such as DGVMs or ESMs) cannot feasibly run large ensembles due to computational and time constraints.
- Here, we use the emulated CABLEPOP to run a new experiment, S3_double_warming, in which historical warming is doubled to examine how the carbon cycle responds.
cnit_cable = CNitModel.from_dict(mdl_cfg_by_mdl["CABLEPOP"])
exp_forcing = trendy_forcing.sel(exp="S3")
exp_cfg = CNitExpConfig(
name="S3_double_warming",
switch_N=Q(1, "1"),
time_axis=Q(exp_forcing["time"].data, exp_forcing["time"].unit),
dT_s=Q(exp_forcing["dT"].data * 2, exp_forcing["dT"].unit),
CO2_s=Q(exp_forcing["CO2"].data, exp_forcing["CO2"].unit),
CemsLUnet_s=Q(exp_forcing["CemsLUnet"].data, exp_forcing["CemsLUnet"].unit),
CemsLUgrs_s=Q(exp_forcing["CemsLUgrs"].data, exp_forcing["CemsLUgrs"].unit),
NflxAD_s=Q(exp_forcing["NflxAD"].data, exp_forcing["NflxAD"].unit),
NflxFT_s=Q(exp_forcing["NflxFT"].data, exp_forcing["NflxFT"].unit),
NemsLUnet_s=Q(exp_forcing["NemsLUnet"].data, exp_forcing["NemsLUnet"].unit),
NemsLUgrs_s=Q(exp_forcing["NemsLUgrs"].data, exp_forcing["NemsLUgrs"].unit),
NemsLUmin_s=Q(exp_forcing["NemsLUmin"].data, exp_forcing["NemsLUmin"].unit),
)
res_s3_double_warming = cnit_cable.run_experiments([exp_cfg])
res_s3_double_warming = res_s3_double_warming.sel(experiment="S3_double_warming")
res_s3 = res_ds.sel(model="CABLEPOP", experiment="S3")
fig, axs = plt.subplots(1, 2, figsize=(10 / 2.54, 4 / 2.54))
axs[0].plot(time, res_s3["CplsPLS"], label="S3")
axs[0].plot(time, res_s3_double_warming["CplsPLS"], label="S3_double_warming")
axs[0].set_ylabel("Land carbon (GtC)")
axs[1].plot(time, res_s3["CflxNetPLS"], label="S3")
axs[1].plot(time, res_s3_double_warming["CflxNetPLS"], label="S3_double_warming")
axs[1].set_ylabel("Net biome produciton (GtC/yr)")
axs[0].legend()
<matplotlib.legend.Legend at 0x175d85160>