TUTORIAL Prerequisites 1: Downloading required data for first use of pyaesa

[ ]:
from pyaesa import set_workspace, download_pop_gdp, download_mrio, download_ar6

All of the following datasets should be downloaded once, on first use of the package. Later runs reuse the on-disk archives, which substantially reduces runtime. A refresh is needed only to update the historical population, GDP, and GHG emissions datasets when more recent years become available.

Recommended citations are provided below for each data source.

Before starting …

Prerequisites 0: Set workspace

[ ]:
# Windows example; update this path before running.
set_workspace(r"C:\Users\username\Documents\aesa_workspace")

# macOS example; update this path before running.
# set_workspace("/Users/username/Documents/aesa_workspace")
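Hard-coded paths have to be edited on every machine. As a minimal cross-platform sketch, the workspace path can instead be derived from the user's home directory with the standard library. The helper name default_workspace is hypothetical, the folder name mirrors the examples above, and passing the path as a string is an assumption based on those examples.

```python
from pathlib import Path


def default_workspace(folder_name: str = "aesa_workspace") -> Path:
    """Return <home>/Documents/<folder_name> (hypothetical helper)."""
    return Path.home() / "Documents" / folder_name


workspace = default_workspace()
print(workspace)

# Assumption: set_workspace accepts the path as a string, as in the examples above.
# set_workspace(str(workspace))
```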

Prerequisites 1: Download data

The package separates raw data download from data processing. The download functions only fetch the source files that the processing functions later consume.

This notebook covers the full Download family:

  • download_pop_gdp(...)

  • download_mrio(...)

  • download_ar6(...)

Downstream reuse summary:

| Function | Main outputs | Reused later by |
| --- | --- | --- |
| download_pop_gdp | raw World Bank / IMF Taiwan / SSP population and GDP data | process_pop_gdp(...) |
| download_mrio | raw EXIOBASE and OECD MRIO archives | process_mrio(...) |
| download_ar6 | raw AR6 scenario explorer + historical emissions data | process_ar6(...) and dynamic AR6 climate change carrying capacities (CC) / allocated carrying capacities (aCC) / ASR workflows that create or reuse processed AR6 outputs through their AR6 CC prerequisites |

Population and GDP PPP data: download_pop_gdp(...)

Description

What the function does and what later functions reuse

download_pop_gdp(...) retrieves the raw historical and scenario population / GDP inputs that are later harmonized by process_pop_gdp(...). The processed outputs are then reused by deterministic_asocc(...) and by all deterministic or Monte Carlo downstream workflows that rely on population, GDP, or GDP per capita based allocation logic.

Historical data (World Bank)

Population and GDP (PPP) data are retrieved from the World Bank World Development Indicators database.

As of 13/05/2026, years 1995 to 2024 are covered (the download function retrieves more recent years when they become available).

N.B.

In the World Bank database, Taiwan (TWN) is included within China (CHN), whereas it is reported separately in the SSP scenarios and MRIO tables. To ensure consistency across datasets, Taiwan (TWN) data are downloaded from the International Monetary Fund World Economic Outlook database. The Taiwan values are removed from the World Bank China (CHN) data at a later stage, during data processing.
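The adjustment described above can be pictured with a toy calculation. This is only a sketch: it assumes that the removal amounts to subtracting the IMF Taiwan value from the World Bank China value for the same year and indicator, the function name is hypothetical, and the numbers are placeholders rather than real data.

```python
def remove_taiwan_from_china(china_incl_taiwan: float, taiwan: float) -> float:
    """Return the China value with the Taiwan contribution removed (assumed subtraction)."""
    if taiwan > china_incl_taiwan:
        raise ValueError("Taiwan value exceeds the combined China value")
    return china_incl_taiwan - taiwan


# Placeholder population values in millions of people (not real data).
china_only = remove_taiwan_from_china(china_incl_taiwan=1000.0, taiwan=20.0)
print(china_only)  # 980.0
```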

Prospective data (SSP scenarios)

Population and GDP (PPP) scenario data are retrieved from the IIASA SSP Scenario Explorer.

Years 2025 to 2100 are covered. Downstream workflows use historical World Bank data whenever they are available for a given year, even if SSP data also exist for that year.

References
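The precedence rule above can be sketched as a merge in which historical values overwrite scenario values for overlapping years. The function name and the numbers are illustrative placeholders, and this is not the package's implementation.

```python
def merge_with_historical_priority(historical: dict[int, float],
                                   scenario: dict[int, float]) -> dict[int, float]:
    """Combine two year -> value series, preferring historical values on overlap."""
    merged = dict(scenario)
    merged.update(historical)  # historical wins wherever both series have a year
    return dict(sorted(merged.items()))


# Placeholder values (not real data).
world_bank = {2024: 10.0, 2025: 10.5}
ssp = {2025: 11.0, 2030: 12.0}

series = merge_with_historical_priority(world_bank, ssp)
print(series)  # {2024: 10.0, 2025: 10.5, 2030: 12.0}
```

In this toy case the SSP value for 2025 is discarded because a historical value exists for that year, while 2030 is taken from the scenario series.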

  • KC, S., Moradhvaj, Potančoková, M., Adhikari, S., Yildiz, D., Mamolo, M., Sobotka, T., Zeman, K., Abel, G., Lutz, W., & Goujon, A. (2024). Wittgenstein Center (WIC) Population and Human Capital Projections - 2023 (Version V13) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.10618931

  • Crespo Cuaresma, Jesús (2017). Income projections for climate change research: A framework based on human capital dynamics. Global Environmental Change, 42, 226–236.

Public argument checklist

The table lists all arguments; the same definitions are available in the function docstring.

Green items = default if omitted.

Do not write green items when the default is intended.

download_pop_gdp(…) arguments

| Argument | Description |
| --- | --- |
| past_years | If True, include World Bank and IMF Taiwan historical population/GDP raw files. Default True. |
| future_years | If True, include SSP future population/GDP raw files. Default True. |
| refresh | If True, clear and rebuild only the selected raw population and GDP tables. past_years=True refreshes the World Bank and IMF Taiwan raw CSV files. future_years=True refreshes the SSP raw CSV file. Processed population and GDP outputs and project outputs are not refreshed. Defaults to False. |

Running download_pop_gdp(...)

[ ]:
# Download both historical and future population / GDP inputs.
download_pop_gdp()
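When newer World Bank years are published, only the historical raw files need to be rebuilt. The sketch below combines the arguments from the checklist for that case. Passing them as keyword arguments is an assumption based on the argument names, and the wrapper function name is hypothetical. The import sits inside the function so that defining it does not trigger a download.

```python
def refresh_historical_pop_gdp() -> None:
    """Re-download only the World Bank and IMF Taiwan raw files."""
    from pyaesa import download_pop_gdp

    # future_years=False leaves the SSP raw file untouched;
    # refresh=True clears and rebuilds the selected (historical) raw tables.
    download_pop_gdp(future_years=False, refresh=True)


# Call explicitly when an update is wanted:
# refresh_historical_pop_gdp()
```

As stated in the checklist, processed population and GDP outputs are not refreshed by this call.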

MRIOs: download_mrio(...)

Description

What the function does and what later functions reuse

The downloaded MRIO archives are used by process_mrio(...). The processed MRIO outputs are then reused as follows:

  • deterministic_asocc(...) reuses processed MRIO outputs for allocated shares.

  • deterministic_io_lca(...) and uncertainty_io_lca(...) reuse processed MRIO outputs for IO-LCA computation and later ASR LCA workflows.

  • Deterministic and uncertainty aCC / ASR workflows depend on those same upstream outputs whenever they reuse allocation and/or IO-LCA results.

Available MRIO sources

| Source key | Historical temporal coverage | Notes |
| --- | --- | --- |
| exiobase_3102_ixi | 1995-2024 | EE MRIO: EXIOBASE ixi option; 2023 and 2024 are nowcasted |
| exiobase_3102_pxp | 1995-2024 | EE MRIO: EXIOBASE pxp option; 2023 and 2024 are nowcasted |
| oecd_v2025 | 1995-2022 | MRIO: OECD ICIO ixi |

EXIOBASE 3.9.6 is also available as exiobase_396_ixi and exiobase_396_pxp for 1995-2022.

EXIOBASE

EXIOBASE 3.10.2

EXIOBASE 3.10.2 matrices are retrieved from https://doi.org/10.5281/zenodo.20051562 (Stadler et al., 2026) via the Python package PyMRIO (Stadler, 2021). Both the ixi (industry by industry) and pxp (product by product) variants can be downloaded.

Years 1995-2024 are covered, with 2023 and 2024 provided as nowcast years.

References

  • Stadler, K. (2021). Pymrio: A Python-Based Multi-Regional Input-Output Analysis Toolbox. Journal of Open Research Software, 9(1). https://doi.org/10.5334/jors.251

  • Stadler, K., Wood, R., Bulavskaya, T., Södersten, C.-J., Simas, M., Schmidt, S., Usubiaga, A., Acosta-Fernández, J., Kuenen, J., Bruckner, M., Giljum, S., Lutter, S., Merciai, S., Schmidt, J. H., Theurl, M. C., Plutzar, C., Kastner, T., Eisenmenger, N., Erb, K.-H., … Tukker, A. (2018). EXIOBASE 3: Developing a Time Series of Detailed Environmentally Extended Multi-Regional Input-Output Tables. Journal of Industrial Ecology, 22(3), 502–515. https://doi.org/10.1111/jiec.12715

  • Stadler, K., Wood, R., Bulavskaya, T., Södersten, C.-J., Simas, M., Schmidt, S., Usubiaga, A., Acosta-Fernández, J., Kuenen, J., Bruckner, M., Giljum, S., Lutter, S., Merciai, S., Schmidt, J. H., Theurl, M. C., Plutzar, C., Kastner, T., Eisenmenger, N., Erb, K.-H., … Tukker, A. (2026). EXIOBASE 3 (3.10.2) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.20051562

Reference

  • Stadler, K., Wood, R., Bulavskaya, T., Södersten, C.-J., Simas, M., Schmidt, S., Usubiaga, A., Acosta-Fernández, J., Kuenen, J., Bruckner, M., Giljum, S., Lutter, S., Merciai, S., Schmidt, J. H., Theurl, M. C., Plutzar, C., Kastner, T., Eisenmenger, N., Erb, K.-H., … Tukker, A. (2025). EXIOBASE 3 (3.9.6) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.15689391

OECD ICIO

OECD ICIO v2025

OECD ICIO v2025 matrices are retrieved from https://www.oecd.org/en/data/datasets/inter-country-input-output-tables.html via an adaptation of the Python package PyMRIO (Stadler, 2021).

OECD ICIO is classified as an ixi (industry by industry) MRIO. It contains only economic data and no LCI (environmental stressor) information.

References

  • Yamano, N. et al. (2023). “Development of the OECD Inter-Country Input-Output Database 2023.” OECD Science, Technology and Industry Working Papers, No. 2023/08, OECD Publishing, Paris. https://doi.org/10.1787/5a5d0665-en

  • Stadler, K. (2021). Pymrio – A Python-Based Multi-Regional Input-Output Analysis Toolbox. Journal of Open Research Software, 9(1). https://doi.org/10.5334/jors.251

Public argument checklist

The table lists all arguments; the same definitions are available in the function docstring.

Green items = default if omitted.

Do not write green items when the default is intended.

download_mrio(…) arguments

| Argument | Description |
| --- | --- |
| source | MRIO source key ("exiobase_396_ixi", "exiobase_396_pxp", "exiobase_3102_ixi", "exiobase_3102_pxp", or "oecd_v2025"). |
| years | Optional year selection. Accepted forms are None, one integer year, a range, or a sequence of integer years. Defaults to None, which selects all supported years for source: EXIOBASE 3.9.6 uses 1995 to 2022, EXIOBASE 3.10.2 uses 1995 to 2024, and OECD ICIO v2025 uses 1995 to 2022. |
| refresh | If True, download the selected raw MRIO archive scope again to replace previous downloads. For EXIOBASE, the scope is each requested year archive under the selected source and system raw folder. For OECD ICIO, the scope is the OECD bundle containing each requested year, so refreshing one year can replace every yearly CSV extracted from that bundle. Processed MRIO outputs and project outputs are not refreshed. Defaults to False. |

Running download_mrio(...)

[ ]:
# Download all available exiobase_3102_ixi years (1995-2024)
download_mrio("exiobase_3102_ixi")

[ ]:
# Download all available exiobase_3102_pxp years (1995-2024)
# download_mrio("exiobase_3102_pxp")

[ ]:
# Download all available oecd_v2025 years (1995-2022)
download_mrio("oecd_v2025")
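The years argument described in the checklist accepts several forms. The sketch below illustrates how such a selection could expand into a concrete list of years, using the coverage stated in the checklist. It is an illustration of the documented semantics, not pyaesa code: resolve_years and SUPPORTED_YEARS are hypothetical names, "a range" is assumed to mean a Python range object, and raising an error for unsupported years is an assumption.

```python
from typing import Iterable, Optional, Union

# Coverage as stated in the argument checklist.
SUPPORTED_YEARS = {
    "exiobase_396_ixi": range(1995, 2023),
    "exiobase_396_pxp": range(1995, 2023),
    "exiobase_3102_ixi": range(1995, 2025),
    "exiobase_3102_pxp": range(1995, 2025),
    "oecd_v2025": range(1995, 2023),
}


def resolve_years(source: str,
                  years: Optional[Union[int, Iterable[int]]] = None) -> list:
    """Expand a years selection into a sorted list (illustrative, not pyaesa code)."""
    supported = SUPPORTED_YEARS[source]
    if years is None:
        return list(supported)  # None selects every supported year
    selected = [years] if isinstance(years, int) else sorted(set(years))
    unsupported = [year for year in selected if year not in supported]
    if unsupported:
        # Assumption: out-of-coverage years are treated as an error.
        raise ValueError(f"{source} does not cover {unsupported}")
    return selected


print(resolve_years("oecd_v2025", 2020))                      # [2020]
print(resolve_years("exiobase_3102_ixi", range(2022, 2025)))  # [2022, 2023, 2024]
print(len(resolve_years("exiobase_396_pxp")))                 # 28

# Corresponding download call, assuming years is accepted as a keyword argument:
# download_mrio("exiobase_3102_ixi", years=range(2022, 2025))
```

Restricting years in this way can be useful for a first test run, since each EXIOBASE year is a separate archive.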

Climate change IPCC AR6 scenario data: download_ar6(...)

Skip this section if the study does not require a dynamic climate change carrying capacity.

Description

What the function does and what later functions reuse

download_ar6(...) retrieves the raw IPCC AR6 Scenario Explorer climate pathways together with the historical GHG/CO2 emissions datasets later reused by process_ar6(...).

This function is needed only when the study will use dynamic climate change carrying capacities rather than only static carrying capacities (i.e., EF3.1: Sala et al. (2020) or Planetary boundaries: Steffen et al. (2015), Sakschewski and Caesar et al. (2025)).

The downloaded raw files are consumed by:

  • process_ar6(...)

whose processed outputs are in turn reused by:

  • deterministic_ar6_cc(...)

  • uncertainty_ar6_cc(...)

  • dynamic deterministic_acc(...)

  • dynamic uncertainty_acc(...)

  • dynamic deterministic_asr(...)

  • dynamic uncertainty_asr(...)

Source

The function retrieves the AR6 public Scenario Explorer table hosted by IIASA, together with the PRIMAP and Global Carbon Budget datasets used to construct the historical GHG and CO2 baselines in process_ar6(...). AR6 categories included: C1-C8; SSPs included: SSP1-SSP5. Downstream, process_ar6(...) defaults to the recommended Paris-aligned C1 through C4 scope, although additional categories can be included.

When download_ar6() runs, the package also writes a companion TXT file with the recommended source citations and usage notes: pyaesa/data_raw/carrying_capacities/dynamic_climate_change_ar6/recommended_citations_data_sources_and_usage.txt.
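Once the download has run, the companion file can be read back for citation purposes. The sketch below assumes that the relative path quoted above is resolved against the workspace root, which the text does not state explicitly; the helper name read_ar6_citations is hypothetical.

```python
from pathlib import Path
from typing import Optional

# Relative path quoted in the text above.
CITATIONS_RELATIVE_PATH = Path(
    "pyaesa/data_raw/carrying_capacities/dynamic_climate_change_ar6/"
    "recommended_citations_data_sources_and_usage.txt"
)


def read_ar6_citations(workspace: Path) -> Optional[str]:
    """Return the companion TXT content, or None if it has not been written yet."""
    # Assumption: the relative path is resolved against the workspace root.
    path = workspace / CITATIONS_RELATIVE_PATH
    if not path.is_file():
        return None
    return path.read_text(encoding="utf-8")


# Example with the macOS workspace path used earlier in this tutorial:
# text = read_ar6_citations(Path("/Users/username/Documents/aesa_workspace"))
```

Returning None rather than raising keeps the helper safe to call before download_ar6() has been run.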

Public argument checklist

The table lists all arguments; the same definitions are available in the function docstring.

Green items = default if omitted.

Do not write green items when the default is intended.

download_ar6(…) arguments

| Argument | Description |
| --- | --- |
| refresh | If True, clears only the AR6 raw output scope, then downloads it again. Processed AR6 outputs, dynamic carrying capacity outputs, and project outputs are not refreshed. Defaults to False. |
| manager_url | IIASA Scenario Explorer manager endpoint. |

Running download_ar6(...)

[ ]:
download_ar6()

What to do next

Continue with tutorials/core_prerequisites/2_process_data.ipynb to convert the raw archive files into processed data ready for the AESA workflow.