Access data from any source

ESMValCore supports a modular system for reading data from various data sources. In the future, this module may be extended with support for writing output data.

The interface is defined in the esmvalcore.io.protocol module; each of the other modules here implements it for a particular data source.

esmvalcore.io

A modular system for reading input data from various sources.

An input data source can be defined in the configuration by using esmvalcore.config.CFG, for example:

>>> from esmvalcore.config import CFG
>>> CFG["projects"]["CMIP6"]["data"]["local"] = {
...     "type": "esmvalcore.local.LocalDataSource",
...     "rootpath": "~/climate_data",
...     "dirname_template": "{project}/{activity}/{institute}/{dataset}/{exp}/{ensemble}/{mip}/{short_name}/{grid}/{version}",
...     "filename_template": "{short_name}_{mip}_{dataset}_{exp}_{ensemble}_{grid}*.nc",
... }

or as a YAML configuration file:

projects:
  CMIP6:
    data:
      local:
        type: "esmvalcore.local.LocalDataSource"
        rootpath: "~/climate_data"
        dirname_template: "{project}/{activity}/{institute}/{dataset}/{exp}/{ensemble}/{mip}/{short_name}/{grid}/{version}"
        filename_template: "{short_name}_{mip}_{dataset}_{exp}_{ensemble}_{grid}*.nc"

where CMIP6 is a project and local is a unique name describing the data source. The data source type (esmvalcore.local.LocalDataSource in the example above) needs to implement the esmvalcore.io.protocol.DataSource protocol. Any remaining key-value pairs in the configuration (rootpath, dirname_template, and filename_template in this example) are passed as keyword arguments to the data source when it is created.
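The way a configuration entry maps onto a data source object can be sketched as follows. This is a minimal illustration, not ESMValCore's actual implementation: the helper name `create_data_source` is hypothetical, and `types.SimpleNamespace` from the standard library stands in for a real data source class so that the snippet runs without ESMValCore installed.

```python
from importlib import import_module
from typing import Any


def create_data_source(settings: dict[str, Any]) -> Any:
    """Instantiate the class named by ``type`` with the remaining settings.

    Hypothetical helper, for illustration only.
    """
    kwargs = dict(settings)  # copy so the caller's configuration is not mutated
    # Split "package.module.ClassName" into module path and class name.
    module_name, _, class_name = kwargs.pop("type").rpartition(".")
    cls = getattr(import_module(module_name), class_name)
    # Everything except "type" is passed on as keyword arguments.
    return cls(**kwargs)


# types.SimpleNamespace is a stand-in for a real data source class.
source = create_data_source(
    {
        "type": "types.SimpleNamespace",
        "rootpath": "~/climate_data",
        "filename_template": "{short_name}_*.nc",
    }
)
print(source.rootpath)
```

The `type` key selects the class and is consumed by the helper; the other keys arrive at the constructor unchanged, which is why they must match the parameters the data source class accepts.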

If there are multiple data sources configured for a project, deduplication of search results happens based on the esmvalcore.io.protocol.DataElement.name attribute and the "version" facet in esmvalcore.io.protocol.DataElement.facets of the data elements provided by the data sources. If no version facet is specified in the search, the latest version will be used. If there is a tie, the data element provided by the data source with the lowest value of esmvalcore.io.protocol.DataSource.priority is chosen.
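The deduplication rule described above can be illustrated with a small, self-contained sketch. It is not the library's code: the `Element` class and the `deduplicate` function are hypothetical, each element is assumed to carry the priority of the data source that provided it, and version strings are assumed to sort lexicographically (as date-like strings such as "v20200101" do). The sketch covers the case where no version facet is specified in the search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Hypothetical stand-in for a data element found by a data source."""

    name: str
    version: str
    priority: int  # priority of the data source that provided the element


def deduplicate(elements: list[Element]) -> list[Element]:
    """Keep one element per name: latest version, then lowest priority."""
    best: dict[str, Element] = {}
    for element in elements:
        current = best.get(element.name)
        if current is None:
            best[element.name] = element
            continue
        newer = element.version > current.version
        wins_tie = (
            element.version == current.version
            and element.priority < current.priority
        )
        if newer or wins_tie:
            best[element.name] = element
    return list(best.values())


found = [
    Element("tas.nc", "v20190101", priority=1),
    Element("tas.nc", "v20200101", priority=2),
    Element("tas.nc", "v20200101", priority=1),
    Element("pr.nc", "v20190101", priority=2),
]
for element in deduplicate(found):
    print(element)
```

Here the two "tas.nc" elements with version "v20200101" tie on version, so the one from the source with priority 1 is kept; "pr.nc" has no duplicate and is kept as-is.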

Functions:

load_data_sources(session[, project])

Get the list of available data sources.

esmvalcore.io.load_data_sources(session: Session, project: str | None = None) → list[DataSource]

Get the list of available data sources.

If no priority is configured for a data source, the default priority of 1 is used.

Parameters:
  • session (Session) – The configuration.

  • project (str | None) – If specified, only data sources for this project are returned.

Returns:

A list of available data sources.

Return type:

list of DataSource

Raises:

ValueError – If the project or its settings are not found in the configuration.