Guidebook -- Data Providers
A “Provider” is a Python class that understands how to access a certain kind of LCA data source. Providers implement the BasicArchive (antelope_core.archives.basic_archive.BasicArchive) interface, which (though poorly specified) provides a set of useful functions for storing and retrieving entities.
The EntityStore
The EntityStore is essentially a big hash table that allows you to retrieve an “entity” from a reference string. An entity store has a handful of properties, of which the most useful are:
ref - the semantic reference for the data collection (this becomes the origin of a data set)
source - a file, directory, or URL that contains the data collection
static - a boolean property that indicates (if true) that the entire data collection must be loaded at once
Its core utilities are:
retrieve_or_fetch_entity(entity_id), which loads an entity by its id and stores it in local memory
_add(entity, ref), which stores entities in a way that handles validation and retrieval
__getitem__, which retrieves an already-loaded entity by its reference or UUID
load_all(), which loads all content
serialize(), which writes the entity store to a JSON file
It is a partially-abstract class, and the following routines must be implemented by providers:
_load_all(), which is called by load_all()
_fetch(), which is called by retrieve_or_fetch_entity() when the entity is not known locally
In addition, each provider is responsible for constructing valid entities.
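The division of labor between the store and its providers can be sketched with a toy class. This is a minimal illustration, not the antelope_core implementation: ToyEntityStore, DictProvider, _make_entity, the constructor signatures, and the choice to return None for an unknown key are all assumptions made for the example. Only the method names listed above come from the text.

```python
from abc import ABC, abstractmethod


class ToyEntityStore(ABC):
    """Hypothetical stand-in for an entity store; NOT the antelope_core class."""

    def __init__(self, source, ref=None, static=False):
        self.source = source      # file, directory, or URL holding the data
        self.ref = ref or source  # semantic reference; becomes each entity's origin
        self.static = static      # True: the whole collection must be loaded at once
        self._entities = {}       # the "big hash table"

    def __getitem__(self, key):
        # Returns only already-loaded entities (None if unknown: an assumption).
        return self._entities.get(key)

    def _add(self, entity, ref):
        # Minimal validation: refuse to overwrite an existing reference.
        if ref in self._entities:
            raise KeyError(f"duplicate reference: {ref}")
        self._entities[ref] = entity

    def retrieve_or_fetch_entity(self, entity_id):
        entity = self[entity_id]
        if entity is None:
            entity = self._fetch(entity_id)  # provider-specific hook
        return entity

    def load_all(self):
        self._load_all()  # provider-specific hook

    @abstractmethod
    def _fetch(self, entity_id): ...

    @abstractmethod
    def _load_all(self): ...


class DictProvider(ToyEntityStore):
    """Toy provider whose 'data source' is an in-memory dict of records."""

    def __init__(self, records, **kwargs):
        super().__init__(source="in-memory", **kwargs)
        self._records = records

    def _make_entity(self, entity_id):
        # The provider is responsible for constructing the entity itself.
        return {"id": entity_id, "origin": self.ref, **self._records[entity_id]}

    def _fetch(self, entity_id):
        if entity_id not in self._records:
            return None
        entity = self._make_entity(entity_id)
        self._add(entity, entity_id)
        return entity

    def _load_all(self):
        for entity_id in self._records:
            if self[entity_id] is None:
                self._fetch(entity_id)


provider = DictProvider(
    {"f1": {"name": "carbon dioxide"}, "f2": {"name": "methane"}},
    ref="toy.flows",
)
```

In this sketch, provider["f1"] finds nothing until retrieve_or_fetch_entity("f1") or load_all() has been called, which mirrors the distinction between __getitem__ and the fetching routines described above.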
The provider infrastructure is some of the oldest Python code in the archive. Please don’t judge.
The BasicArchive implements the key features of the EntityStore for flows and quantities, and introduces a search() function. A BasicArchive can be saved to and restored from JSON, and forms the base class for all other providers. The LcArchive adds processes to the list of supported entities, and is used for all Life Cycle data sources that contain processes.
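The relationship between the two classes can be pictured with a small stand-alone sketch. Everything here is hypothetical (ToyBasicArchive, ToyLcArchive, the entity_types attribute, the add, to_json and from_json methods, and the JSON layout); it illustrates the idea of a base container that a subclass extends with one more entity type, and does not reproduce the real antelope_core signatures.

```python
import json


class ToyBasicArchive:
    """Hypothetical sketch: stores flows and quantities only."""

    entity_types = ("quantity", "flow")

    def __init__(self, ref):
        self.ref = ref
        self._entities = {}

    def add(self, entity_type, entity_id, **properties):
        if entity_type not in self.entity_types:
            raise TypeError(f"{type(self).__name__} cannot store {entity_type!r}")
        self._entities[entity_id] = {"entityType": entity_type, **properties}

    def search(self, entity_type, **filters):
        # Case-insensitive substring match on every requested property.
        for entity_id, ent in self._entities.items():
            if ent["entityType"] != entity_type:
                continue
            if all(str(v).lower() in str(ent.get(k, "")).lower()
                   for k, v in filters.items()):
                yield entity_id

    def to_json(self):
        return json.dumps({"ref": self.ref, "entities": self._entities})

    @classmethod
    def from_json(cls, text):
        data = json.loads(text)
        archive = cls(data["ref"])
        archive._entities = data["entities"]
        return archive


class ToyLcArchive(ToyBasicArchive):
    """Adds processes to the list of supported entity types."""

    entity_types = ToyBasicArchive.entity_types + ("process",)


basic = ToyBasicArchive("toy.basic")
basic.add("flow", "f1", name="Carbon dioxide")
restored = ToyBasicArchive.from_json(basic.to_json())

lc = ToyLcArchive("toy.lc")
lc.add("process", "p1", name="steel production")
```

Here the basic container rejects a process, while the subclass accepts it and inherits search and the JSON round trip unchanged.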
Default Providers
The providers built into antelope_core are:
BasicArchive (generic container for quantities and flows)
LcArchive (adds processes)
Background (the “trivial” background engine, for accessing files containing rolled-up datasets)
OpenLcaJsonLdArchive (for OpenLCA .zip files)
EcoinventLcia (for ecoinvent-issued LCIA tables)
Traci21Factors (for accessing the TRACI 2.1 spreadsheet)
XdbClient (for connecting to xdb background data servers)
Adding lxml support ($ pip install lxml) allows XML-based providers to be loaded:
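Before relying on an XML-based provider, it can help to confirm that lxml is importable in the current environment. The check below uses only the standard library; the helper name xml_providers_available is made up for this example, and the snippet says nothing about how antelope_core itself detects lxml.

```python
from importlib.util import find_spec


def xml_providers_available():
    """Return True if the lxml package can be imported in this environment."""
    return find_spec("lxml") is not None


if not xml_providers_available():
    print("lxml not found; install it with: pip install lxml")
```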
EcospoldV1Archive for ecospold v1, including ecoinvent 2.2 and old-style US LCI
EcospoldV2Archive for ecospold v2, including ecoinvent 3.x databases
IlcdArchive for ILCD datasets (note: this has not been maintained for some time)
IlcdLcia is a subclass of IlcdArchive and adds the capability to read stored LCIA results
Adding antelope_background introduces Tarjan ordering:
TarjanBackground performs partial ordering of databases and constructs allocated LCI matrices
Adding antelope_foreground introduces foreground modeling capacity:
LcForeground provides the capability to create and save fragments, and also stores catalog references to other data sources
OryxClient (for connecting to oryx foreground data servers)