xlsx_editor
Class for interchanging entities with an excel spreadsheet.
There are two basic parts to this operation: * we want to be able to write a subset of entities in an archive to an excel worksheet whose name is the same as the entity_type. We do not want to mix different entity types in the same worksheet. We also only want to include metadata-- quantitative data (exchanges and flow characterizations) are not meant to be manipulated in this way. The export to Excel should include the required columns to the leftmost, followed by nonrequired properties. For a given entity/property pair, if the entity lacks that property the entry should be blank.
- We want to be able to read in metadata from a spreadsheet as well- creating new entities as needed, and modifying
existing entities nondestructively according to a policy with two options:
- 'defer': if a property exists for a given entity, let its current value stand. If a property does not exist and the incoming specification is non-null, assign it..
- 'overwrite': Assign all incoming non-null specifications
For the first one, the only capability of the archive that is needed is (arguably) external to the feature-- namely, the ability to generate the entities. Therefore I think it should be a utility function and not an object at all.
For the second one here, this seems like more a capability of an archive rather than a separate class. I think this should be a context mgr. We (for now) want to be strict about the sheet naming- let's take that flexibility away from the user.
The required fields should be: external_ref, *signature_fields. uuid is supported but not required
XlsxArchiveUpdater
Bases: XlsxUpdater
__init__(archive, xlrd_like, merge='defer', quiet=True)
Updates entity meta-information from a spreadsheet created from above. external_ref is used as a key; if an entity is not found, a new one is created. 'uuid' and reference fields are only used for creation of new entities. Note: uses the default sheet names of 'flow' and 'quantity'
Parameters:
Name | Type | Description | Default |
---|---|---|---|
archive |
|
required | |
xlrd_like |
XLRD workbook-like object OR filename of XLS file |
required | |
merge |
['defer'] - do not overwrite existing properties ['overwrite'] - do overwrite existing properties |
'defer'
|
Returns:
Type | Description |
---|---|
|
XlsxEntityTypeWriter
Bases: object
XlsxUpdater
Bases: object
This class uses the contents of a properly formatted XLS file (or XLS-file-like object) to create or update entities in an archive.
Uses sheet names that match the archive's "_entity_types" property, so 'flow' and 'quantity' nominally. 'flows' and 'quantities' also work.
Each sheet must be in strict tabular format, with the first row being headers and each subsequent row being one entity.
The only required column is 'external_ref', though 'name' and 'reference' are recommended for new entities. NOTE: for the time being, quantities require a 'referenceUnit' column and flows require a 'referenceQuantity' column, though the hope is to eliminate that requirement in the future. For quantities, the 'referenceUnit' column should be a unit string; for flows the 'referenceQuantity' column should be a known quantity signifier (external_ref, link, or canonical name recognized by the term manager)
Optional columns include 'uuid' and 'origin'
All other columns are assigned as properties.
FLOW PROPERTIES If there is a sheet called 'flowproperties', then it is interpreted as a list of flow characterizations. This sheet can have the following columns: - 'flow' - 'ref_quantity' - (optional) if present, used only as a check against the flow's native reference_entity - 'ref_unit' - (optional) a convertible unit in ref_quantity - 'quantity' - the quantity being characterized - 'unit' - (optional) a convertible unit in quantity - 'value' - the characterization factor - 'context' - (optional) context for the characterization
CELL CONTENTS Cells are read as strings, except for the following: - if a cell's first character is '!', the subsequent characters are interpreted as an entity reference - if a cell's first character is '*', the subsequent characters are EVALUATED, so obviously this is terribly insecure, but it was intended to allow people to store lists, sets, and dicts.
MERGE STRATEGY The updater has two merge strategies: * "defer": (default) If an entity exists and has a property already defined, defer to the existing property * "overwrite": Any non-null incoming property is assigned to the entity, overwriting any existing value.
__enter__()
Return self object to use with "with" statement.
__init__(xlrd_like, merge='defer', quiet=True)
Updates entity meta-information from a spreadsheet created from above. external_ref is used as a key; if an entity is not found, a new one is created. 'uuid' and reference fields are only used for creation of new entities. Note: uses the default sheet names of 'flow' and 'quantity'
Parameters:
Name | Type | Description | Default |
---|---|---|---|
xlrd_like |
XLRD workbook-like object OR filename of XLS file |
required | |
merge |
['defer'] - do not overwrite existing properties ['overwrite'] - do overwrite existing properties |
'defer'
|
Returns:
Type | Description |
---|---|
|