deriva.core package

Submodules

deriva.core.annotation module

Definitions and implementation for validating ERMrest schema annotations.

deriva.core.annotation.validate(model_obj, tag_name=None, validate_model_names=True)[source]

Validate the annotation(s) of the model object.

Parameters:
  • model_obj – model object container of annotations
  • tag_name – tag name of the annotation to validate, if none, will validate all known annotations
  • validate_model_names – validate model names used in annotations
Returns:

a list of validation errors, if any

deriva.core.base_cli module

class deriva.core.base_cli.BaseCLI(description, epilog, version='1.5.0', hostname_required=False, config_file_required=False)[source]

Bases: object

parse_cli()[source]
remove_options(options)[source]
class deriva.core.base_cli.KeyValuePairArgs(option_strings, dest, nargs=None, **kwargs)[source]

Bases: argparse.Action

deriva.core.datapath module

Definitions and implementations for data-path expressions to query and manipulate (insert, update, delete) catalog data.

exception deriva.core.datapath.DataPathException(message, reason=None)[source]

Bases: Exception

Exception in a datapath expression.

class deriva.core.datapath.Min(arg)[source]

Bases: deriva.core.datapath.AggregateFunction

Aggregate function for minimum non-NULL value.

class deriva.core.datapath.Max(arg)[source]

Bases: deriva.core.datapath.AggregateFunction

Aggregate function for maximum non-NULL value.

class deriva.core.datapath.Sum(arg)[source]

Bases: deriva.core.datapath.AggregateFunction

Aggregate function for sum of non-NULL values.

class deriva.core.datapath.Avg(arg)[source]

Bases: deriva.core.datapath.AggregateFunction

Aggregate function for average of non-NULL values.

class deriva.core.datapath.Cnt(arg)[source]

Bases: deriva.core.datapath.AggregateFunction

Aggregate function for count of non-NULL values.

class deriva.core.datapath.CntD(arg)[source]

Bases: deriva.core.datapath.AggregateFunction

Aggregate function for count of distinct non-NULL values.

class deriva.core.datapath.Array(arg)[source]

Bases: deriva.core.datapath.AggregateFunction

Aggregate function for an array containing all values (including NULL).

class deriva.core.datapath.ArrayD(arg)[source]

Bases: deriva.core.datapath.AggregateFunction

Aggregate function for an array containing distinct values (including NULL).
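The NULL handling described for the aggregate functions above can be illustrated with a plain-Python analogy. This is only a sketch of the documented semantics over an in-memory list, where None stands in for NULL. It does not use the datapath API, and the helper names (non_null, cnt, cnt_d, array, array_d) are hypothetical.

```python
def non_null(values):
    """Drop NULL (None) entries, as the non-NULL aggregates are documented to do."""
    return [v for v in values if v is not None]


def cnt(values):
    """Count of non-NULL values (analogous to Cnt)."""
    return len(non_null(values))


def cnt_d(values):
    """Count of distinct non-NULL values (analogous to CntD)."""
    return len(set(non_null(values)))


def array(values):
    """All values, including NULL (analogous to Array)."""
    return list(values)


def array_d(values):
    """Distinct values, including NULL (analogous to ArrayD); order is kept only for readability."""
    seen, out = set(), []
    for v in values:
        if v not in seen:
            seen.add(v)
            out.append(v)
    return out


values = [3, None, 3, 5]
summary = {
    "min": min(non_null(values)),
    "max": max(non_null(values)),
    "sum": sum(non_null(values)),
    "avg": sum(non_null(values)) / len(non_null(values)),
    "cnt": cnt(values),
    "cnt_d": cnt_d(values),
}
```

In this analogy only Array and ArrayD retain the None entry; every other aggregate ignores it.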

class deriva.core.datapath.Bin(arg, nbins, minval=None, maxval=None)[source]

Bases: deriva.core.datapath.AggregateFunction

Binning function.

deriva.core.deriva_binding module

class deriva.core.deriva_binding.DerivaBinding(scheme, server, credentials=None, caching=True, session_config=None)[source]

Bases: object

This is a base-class for implementation purposes. Not useful for clients.

static check_path(path)[source]
delete(path, headers={}, guard_response=None)[source]

Perform DELETE request, returning response object.

Arguments:

  path: the path within this bound server
  headers: headers to set in request
  guard_response: expected current resource state, as a previously seen response object

Uses guard_response to build appropriate ‘if-match’ header to assure change is only applied to expected state.

Raises ConcurrentUpdate for 412 status.
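The guard mechanism described above is ordinary HTTP optimistic concurrency control. The following is a minimal sketch of the idea, assuming the guard is represented by the ETag string of a previously seen response. The names guarded_headers, check_precondition, and ConcurrentUpdateSketch are hypothetical and are not part of deriva.core.

```python
class ConcurrentUpdateSketch(Exception):
    """Hypothetical stand-in for the error raised on a 412 response."""


def guarded_headers(headers, guard_etag=None):
    """Return a copy of headers with 'if-match' set from a previously seen ETag."""
    result = dict(headers)  # copy so the caller's dict is not mutated
    if guard_etag is not None:
        result["if-match"] = guard_etag
    return result


def check_precondition(status_code):
    """Raise if the server reports that the guarded state no longer matches."""
    if status_code == 412:
        raise ConcurrentUpdateSketch("resource changed since the guard response was seen")
    return status_code
```

With a guard in place, the server applies the change only if the resource still has the ETag the client last saw; otherwise it answers 412 and the client should re-read the resource before retrying.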

get(path, headers={}, raise_not_modified=False, stream=False)[source]

Perform GET request, returning response object.

Arguments:

  path: the path within this bound server
  headers: headers to set in request
  raise_not_modified: raise HTTPError for 304 response status when True
  stream: whether to defer content retrieval to streaming access mode on the response object

May consult built-in cache and apply ‘if-none-match’ request header unless input headers already include ‘if-none-match’ or ‘if-match’. On cache hit, returns cached response unless raise_not_modified=True.

Caching of new results is disabled when stream=True.
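The header rule described above can be sketched as a small pure function. This is an illustration of the documented rule only, assuming the cache stores an ETag per path; conditional_get_headers is a hypothetical name, not a deriva.core function.

```python
def conditional_get_headers(headers, cached_etag):
    """Add 'if-none-match' from the cache unless the caller already set a precondition."""
    result = dict(headers)
    present = {name.lower() for name in result}
    caller_set_precondition = bool({"if-none-match", "if-match"} & present)
    if cached_etag is not None and not caller_set_precondition:
        result["if-none-match"] = cached_etag
    return result
```

A 304 response to such a request signals that the cached copy is still current, which is what allows the cached response to be returned on a cache hit.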

get_authn_session()[source]
get_server_uri()[source]
head(path, headers={}, raise_not_modified=False)[source]

Perform HEAD request, returning response object.

Arguments:

  path: the path within this bound server
  headers: headers to set in request
  raise_not_modified: raise HTTPError for 304 response status when True

May consult built-in cache and apply ‘if-none-match’ request header unless input headers already include ‘if-none-match’ or ‘if-match’. On cache hit, returns cached response unless raise_not_modified=True. Cached response may include content retrieved by GET on the same resource.

post(path, data=None, json=None, headers={})[source]

Perform POST request, returning response object.

Arguments:

  path: the path within this bound server
  data: a buffer or file-like content value
  json: data to serialize as JSON content
  headers: headers to set in request

Raises ConcurrentUpdate for 412 status.

post_authn_session(credentials)[source]
put(path, data=None, json=None, headers={}, guard_response=None)[source]

Perform PUT request, returning response object.

Arguments:

  path: the path within this bound server
  data: a buffer or file-like content value
  json: data to serialize as JSON content
  headers: headers to set in request
  guard_response: expected current resource state, as a previously seen response object

Uses guard_response to build appropriate ‘if-match’ header to assure change is only applied to expected state.

Raises ConcurrentUpdate for 412 status.

set_credentials(credentials, server)[source]
class deriva.core.deriva_binding.DerivaClientContext(*args, **kwargs)[source]

Bases: dict

Represent Deriva-Client-Context header content.

Well-known keys (originally defined for Chaise):

  • cid: client application ID i.e. program name
  • wid: window ID i.e. request stream ID
  • pid: page ID i.e. sub-stream ID
  • action: UX action embodied by request(s)
  • table: table upon which application is focused (if any)
  • uinit: True if request initiated by user action

Default values to use process-wide:

  • cid: os.path.basename(sys.argv[0]) if available
  • wid: a random UUID

The process-wide defaults MAY be customized by mutating DerivaClientContext.defaults prior to constructing instances.

encoded()[source]

Encode self as string suitable for Deriva-Client-Context HTTP header.

merged(overrides)[source]
prune()[source]

Prune redundant keys to shorten context representation.

Keys with None value or uinit=False are unnecessary as these are implicitly assumed when absent.
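The pruning rule above amounts to a dictionary filter. The following is a minimal sketch of that rule on a plain dict; pruned is a hypothetical helper name and this is not the class's own implementation.

```python
def pruned(context):
    """Return a copy without keys whose absence already implies their value."""
    return {
        key: value
        for key, value in context.items()
        if value is not None and not (key == "uinit" and value is False)
    }


example = {"cid": "my-script", "table": None, "uinit": False}
shortened = pruned(example)
```

Dropping these keys shortens the header without changing its meaning, because a reader of the header treats a missing key as None and a missing uinit as False.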

set_defaults(defaults=None)[source]

Set default key-values in self if key not already set.

exception deriva.core.deriva_binding.DerivaPathError[source]

Bases: ValueError

deriva.core.deriva_server module

class deriva.core.deriva_server.DerivaServer(scheme, server, credentials=None, caching=True, session_config=None)[source]

Bases: deriva.core.deriva_binding.DerivaBinding

Persistent handle for a Deriva server.

connect_ermrest(catalog_id, snaptime=None)[source]

Connect to an ERMrest catalog.

Arguments:
  catalog_id: e.g., ‘1’ or ‘1@2PM-DGYP-56Z4’
  snaptime: e.g., ‘2PM-DGYP-56Z4’ (optional)
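The two example forms of catalog_id suggest that a snapshot time may be embedded after an ‘@’ separator. The sketch below shows one way such an identifier could be split. The helper split_catalog_id is hypothetical, and giving an explicit snaptime argument precedence over an embedded one is an assumption made for illustration, not documented behavior.

```python
def split_catalog_id(catalog_id, snaptime=None):
    """Split 'ID' or 'ID@SNAPTIME' into (id, snaptime-or-None)."""
    base, _, embedded = str(catalog_id).partition("@")
    # Assumption: an explicitly passed snaptime wins over an embedded one.
    return base, (snaptime or embedded or None)
```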
create_ermrest_catalog()[source]

Create an ERMrest catalog.

deriva.core.ermrest_catalog module

class deriva.core.ermrest_catalog.ErmrestCatalog(scheme, server, catalog_id, credentials=None, caching=True, session_config=None)[source]

Bases: deriva.core.deriva_binding.DerivaBinding

Persistent handle for an ERMrest catalog.

Provides basic REST client for HTTP methods on arbitrary paths. Caller has to understand ERMrest APIs and compose appropriate paths, headers, and/or content.

Additional utility methods provided for accessing catalog metadata.

catalog_id
clone_catalog(dst_catalog=None, copy_data=True, copy_annotations=True, copy_policy=True, truncate_after=True, exclude_schemas=None)[source]

Clone this catalog’s content into dst_catalog, creating a new catalog if needed.

Parameters:
  • dst_catalog – Destination catalog or None to request creation of new destination (default).
  • copy_data – Copy table contents when True (default).
  • copy_annotations – Copy annotations when True (default).
  • copy_policy – Copy access-control policies when True (default).
  • truncate_after – Truncate destination history after cloning when True (default).
  • exclude_schemas – A list of schema names to exclude from the cloning process.

When dst_catalog is provided, attempt an idempotent clone, assuming content MAY be partially cloned already using the same parameters. This routine uses a table-level annotation “tag:isrd.isi.edu,2018:clone-state” to save progress markers which help it restart efficiently if interrupted.

Cloning preserves source row RID values so that any RID-based foreign keys are still valid. It is not generally advisable to try to merge more than one source into the same clone, nor to clone on top of rows generated locally in the destination, since this could cause duplicate RID conflicts.

Truncation after cloning avoids retaining incremental snapshots which contain partial clones.

delete(path, headers={}, guard_response=None)[source]

Perform DELETE request, returning response object.

Arguments:

  path: the path within this bound catalog
  headers: headers to set in request
  guard_response: expected current resource state, as a previously seen response object

Uses guard_response to build appropriate ‘if-match’ header to assure change is only applied to expected state.

Raises ConcurrentUpdate for 412 status.

delete_ermrest_catalog(really=False)[source]

Perform DELETE request, destroying catalog on server.

Arguments:
really: delete when True, abort when False (default)
exists()[source]

Simple boolean test for catalog existence.

Returns: True if the catalog exists, False if not (404); otherwise raises an exception
getAsFile(path, destfilename, headers={}, callback=None, delete_if_empty=False, paged=False, page_size=100000)[source]

Retrieve catalog data streamed to a destination file. The caller is responsible for cleaning up the file even on error, when the file may or may not exist. If “delete_if_empty” is True, the file will be inspected for “empty” content. In the case of json/json-stream content, the presence of a single empty JSON object will be tested for. In the case of CSV content, the file will be parsed with a CSV reader to determine that only a single header line and no row data is present.
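The CSV emptiness test described above can be sketched with the standard csv module. This is an illustration of the idea on an in-memory string rather than the method's actual code; csv_has_no_rows is a hypothetical helper name.

```python
import csv
import io


def csv_has_no_rows(text):
    """True when the CSV text holds a header line and no data rows."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)     # first record, if any
    first_row = next(reader, None)  # second record, if any
    return header is not None and first_row is None
```

Parsing with a CSV reader, rather than counting newlines, keeps the check correct when a quoted header field itself contains a line break.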

getCatalogModel()[source]
getCatalogSchema()[source]
getDefaultColumns(row, table, exclude=None, quote_url=True)[source]
getPathBuilder()[source]

Returns the ‘path builder’ interface for this catalog.

getTableColumns(fq_table_name)[source]
getTableSchema(fq_table_name)[source]
latest_snapshot()[source]

Gets a handle to this catalog’s latest snapshot.

static splitQualifiedCatalogName(name)[source]
table_schemas = {}
validateRowColumns(row, fq_tableName)[source]
exception deriva.core.ermrest_catalog.ErmrestCatalogMutationError[source]

Bases: Exception

class deriva.core.ermrest_catalog.ErmrestSnapshot(scheme, server, catalog_id, snaptime, credentials=None, caching=True, session_config=None)[source]

Bases: deriva.core.ermrest_catalog.ErmrestCatalog

Persistent handle for an ERMrest catalog snapshot.

Inherits from ErmrestCatalog and provides the same interfaces, except that the interfaces are now bound to a fixed snapshot of the catalog.

snaptime

The snaptime for this catalog snapshot instance.

deriva.core.ermrest_config module

deriva.core.ermrest_model module

class deriva.core.ermrest_model.ArrayType(type_doc)[source]

Bases: deriva.core.ermrest_model.Type

Named array type.

prejson(prune=True)[source]
class deriva.core.ermrest_model.Column(table, column_doc)[source]

Bases: object

Named column.

alter(name=<deriva.core.ermrest_model.NoChange object>, type=<deriva.core.ermrest_model.NoChange object>, nullok=<deriva.core.ermrest_model.NoChange object>, default=<deriva.core.ermrest_model.NoChange object>, comment=<deriva.core.ermrest_model.NoChange object>, acls=<deriva.core.ermrest_model.NoChange object>, acl_bindings=<deriva.core.ermrest_model.NoChange object>, annotations=<deriva.core.ermrest_model.NoChange object>)[source]

Alter existing schema definition.

Parameters:
  • name – Replacement column name (default nochange)
  • type – Replacement Type instance (default nochange)
  • nullok – Replacement nullok value (default nochange)
  • default – Replacement default value (default nochange)
  • comment – Replacement comment (default nochange)
  • acls – Replacement ACL configuration (default nochange)
  • acl_bindings – Replacement ACL bindings (default nochange)
  • annotations – Replacement annotations (default nochange)

Returns self (to allow for optional chained access).

apply(existing=None)[source]

Apply configuration to corresponding column in catalog unless existing already matches.

Parameters:
  • existing – An instance comparable to self, or None to apply configuration unconditionally.

The state of self.comment, self.annotations, self.acls, and self.acl_bindings will be applied to the server unless they match their corresponding state in existing.

asset

Convenience property for managing content of object annotation tag:isrd.isi.edu,2017:asset

catalog
clear(clear_comment=False)[source]

Clear all configuration in column

NOTE: as a backwards-compatible heuristic, comments are retained by default so that a typical configuration-management client does not strip useful documentation from existing models.

column_display

Convenience property for managing content of object annotation tag:isrd.isi.edu,2016:column-display

classmethod define(cname, ctype, nullok=True, default=None, comment=None, acls={}, acl_bindings={}, annotations={})[source]

Build a column definition.

display

Convenience property for managing content of object annotation tag:misd.isi.edu,2015:display

drop()[source]

Remove this column from the remote database.

generated

Convenience property for managing presence of annotation tag:isrd.isi.edu,2016:generated

immutable

Convenience property for managing presence of annotation tag:isrd.isi.edu,2016:immutable

prejson(prune=True)[source]

Produce a representation of configuration as generic Python data structures

prejson_colref()[source]
uri_path

URI to this model resource.

class deriva.core.ermrest_model.DomainType(type_doc)[source]

Bases: deriva.core.ermrest_model.Type

Named domain type.

prejson(prune=True)[source]
class deriva.core.ermrest_model.ForeignKey(table, fkey_doc)[source]

Bases: object

Named foreign key.

alter(constraint_name=<deriva.core.ermrest_model.NoChange object>, on_update=<deriva.core.ermrest_model.NoChange object>, on_delete=<deriva.core.ermrest_model.NoChange object>, comment=<deriva.core.ermrest_model.NoChange object>, acls=<deriva.core.ermrest_model.NoChange object>, acl_bindings=<deriva.core.ermrest_model.NoChange object>, annotations=<deriva.core.ermrest_model.NoChange object>)[source]

Alter existing schema definition.

Parameters:
  • constraint_name – Replacement constraint name string
  • on_update – Replacement on-update action string
  • on_delete – Replacement on-delete action string
  • comment – Replacement comment (default nochange)
  • acls – Replacement ACL configuration (default nochange)
  • acl_bindings – Replacement ACL bindings (default nochange)
  • annotations – Replacement annotations (default nochange)

Returns self (to allow for optional chained access).

apply(existing=None)[source]

Apply configuration to corresponding foreign key in catalog unless existing already matches.

Parameters:
  • existing – An instance comparable to self, or None to apply configuration unconditionally.

The state of self.comment, self.annotations, self.acls, and self.acl_bindings will be applied to the server unless they match their corresponding state in existing.

catalog
clear(clear_comment=False)[source]

Clear all configuration in foreign key

NOTE: as a backwards-compatible heuristic, comments are retained by default so that a typical configuration-management client does not strip useful documentation from existing models.

column_map

Mapping of foreign_key_columns elements to referenced_columns elements.

columns

Sugared access to self.column_definitions

classmethod define(fk_colnames, pk_sname, pk_tname, pk_colnames, on_update='NO ACTION', on_delete='NO ACTION', constraint_names=[], comment=None, acls={}, acl_bindings={}, annotations={})[source]
digest_referenced_columns(model)[source]

Finish construction deferred until model is known with all tables.

drop()[source]

Remove this foreign key from the remote database.

foreign_key

Convenience property for managing content of object annotation tag:isrd.isi.edu,2016:foreign-key

name

Constraint name (schemaobj, name_str) used in API dictionaries.

name_in_model(model)[source]

Constraint name (schemaobj, name_str) used in API dictionaries fetching schema from model.

While self.name works as a key within the same model tree, self.name_in_model(dstmodel) works in dstmodel tree by finding the equivalent schemaobj in that model via schema name lookup.

names

Constraint names field as seen in JSON document.

prejson(prune=True)[source]

Produce a representation of configuration as generic Python data structures

uri_path

URI to this model resource.

class deriva.core.ermrest_model.Key(table, key_doc)[source]

Bases: object

Named key.

alter(constraint_name=<deriva.core.ermrest_model.NoChange object>, comment=<deriva.core.ermrest_model.NoChange object>, annotations=<deriva.core.ermrest_model.NoChange object>)[source]

Alter existing schema definition.

Parameters:
  • constraint_name – Unqualified constraint name string
  • comment – Replacement comment (default nochange)
  • annotations – Replacement annotations (default nochange)

Returns self (to allow for optional chained access).

apply(existing=None)[source]

Apply configuration to corresponding key in catalog unless existing already matches.

Parameters:
  • existing – An instance comparable to self, or None to apply configuration unconditionally.

The state of self.comment and self.annotations will be applied to the server unless they match their corresponding state in existing.

catalog
clear(clear_comment=False)[source]

Clear all configuration in key

NOTE: as a backwards-compatible heuristic, comments are retained by default so that a typical configuration-management client does not strip useful documentation from existing models.

columns

Sugared access to self.unique_columns

classmethod define(colnames, constraint_names=[], comment=None, annotations={})[source]

Build a key definition.

drop()[source]

Remove this key from the remote database.

name

Constraint name (schemaobj, name_str) used in API dictionaries.

name_in_model(model)[source]

Constraint name (schemaobj, name_str) used in API dictionaries fetching schema from model.

While self.name works as a key within the same model tree, self.name_in_model(dstmodel) works in dstmodel tree by finding the equivalent schemaobj in that model via schema name lookup.

names

Constraint names field as seen in JSON document.

prejson(prune=True)[source]

Produce a representation of configuration as generic Python data structures

uri_path

URI to this model resource.

class deriva.core.ermrest_model.KeyedList(l)[source]

Bases: list

Keyed list.

append(e)[source]

Append element to list and record its key.

class deriva.core.ermrest_model.Model(catalog, model_doc)[source]

Bases: object

Top-level catalog model.

apply(existing=None)[source]

Apply catalog configuration to catalog unless existing already matches.

Parameters:
  • existing – An instance comparable to self.

The configuration in self will be applied recursively to the corresponding model nodes in schema.

If existing is not provided (default), the current whole configuration will be retrieved from the catalog and used automatically to determine whether the configuration goals under this Model tree are already met or need to be remotely applied.

bulk_upload

Convenience property for managing content of object annotation tag:isrd.isi.edu,2017:bulk-upload

catalog
clear(clear_comment=False)[source]

Clear all configuration in catalog and children.

NOTE: as a backwards-compatible heuristic, comments are retained by default so that a typical configuration-management client does not strip useful documentation from existing models.

column(sname, tname, cname)[source]

Return column configuration for column with given name.

create_schema(schema_def)[source]

Add a new schema to this model in the remote database based on schema_def.

Returns a new Schema instance based on the server-supplied representation of the newly created schema.

The returned Schema is also added to self.schemas.

digest_fkeys()[source]

Finish second-pass digestion of foreign key definitions using full model w/ all schemas and tables.

display

Convenience property for managing content of object annotation tag:misd.isi.edu,2015:display

fkey(constraint_name_pair)[source]

Return configuration for foreign key with given name pair.

Accepts (schema_name, constraint_name) pairs as found in many faceting annotations and (schema_obj, constraint_name) pairs as found in fkey.name fields.

classmethod fromcatalog(catalog)[source]

Retrieve catalog config as a Model management object.

classmethod fromfile(catalog, schema_file)[source]

Deserialize a JSON schema file as a Model management object.

prejson(prune=True)[source]

Produce a representation of configuration as generic Python data structures

table(sname, tname)[source]

Return table configuration for table with given name.

uri_path

URI to this model resource.

class deriva.core.ermrest_model.NoChange[source]

Bases: object

Special class used to distinguish no-change default arguments to methods.

Values for no-change are distinct from all valid values for these arguments.
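The reason for a dedicated no-change marker is that None is itself a meaningful value for arguments such as comment or default, so it cannot double as "leave unchanged". The sentinel pattern can be sketched as follows; NoChangeSketch and alter_comment are hypothetical names used only for illustration.

```python
class NoChangeSketch:
    """Marker type whose instances mean 'argument not supplied'."""


_NOCHANGE = NoChangeSketch()


def alter_comment(current, comment=_NOCHANGE):
    """Return the comment after an alter call that may or may not replace it."""
    if comment is _NOCHANGE:
        return current  # argument omitted: keep the existing value
    return comment      # any supplied value, including None, replaces it
```

Here alter_comment("old") keeps the existing comment, while alter_comment("old", None) deliberately clears it.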

class deriva.core.ermrest_model.Schema(model, sname, schema_doc)[source]

Bases: object

Named schema.

alter(schema_name=<deriva.core.ermrest_model.NoChange object>, comment=<deriva.core.ermrest_model.NoChange object>, acls=<deriva.core.ermrest_model.NoChange object>, annotations=<deriva.core.ermrest_model.NoChange object>)[source]

Alter existing schema definition.

Parameters:
  • schema_name – Replacement schema name (default nochange)
  • comment – Replacement comment (default nochange)
  • acls – Replacement ACL configuration (default nochange)
  • annotations – Replacement annotations (default nochange)

Returns self (to allow for optional chained access).

apply(existing=None)[source]

Apply configuration to corresponding schema in catalog unless existing already matches.

Parameters:
  • existing – An instance comparable to self, or None to apply configuration unconditionally.

The state of self.comment, self.annotations, and self.acls will be applied to the server unless they match their corresponding state in existing.

catalog
clear(clear_comment=False)[source]

Clear all configuration in schema and children.

NOTE: as a backwards-compatible heuristic, comments are retained by default so that a typical configuration-management client does not strip useful documentation from existing models.

create_table(table_def)[source]

Add a new table to this schema in the remote database based on table_def.

Returns a new Table instance based on the server-supplied representation of the newly created table.

The returned Table is also added to self.tables.

classmethod define(sname, comment=None, acls={}, annotations={})[source]

Build a schema definition.

display

Convenience property for managing content of object annotation tag:misd.isi.edu,2015:display

drop()[source]

Remove this schema from the remote database.

prejson(prune=True)[source]

Produce native Python representation of schema, suitable for JSON serialization.

uri_path

URI to this model resource.

class deriva.core.ermrest_model.Table(schema, tname, table_doc)[source]

Bases: object

Named table.

alter(schema_name=<deriva.core.ermrest_model.NoChange object>, table_name=<deriva.core.ermrest_model.NoChange object>, comment=<deriva.core.ermrest_model.NoChange object>, acls=<deriva.core.ermrest_model.NoChange object>, acl_bindings=<deriva.core.ermrest_model.NoChange object>, annotations=<deriva.core.ermrest_model.NoChange object>)[source]

Alter existing schema definition.

Parameters:
  • schema_name – Destination schema name (default nochange)
  • table_name – Replacement table name (default nochange)
  • comment – Replacement comment (default nochange)
  • acls – Replacement ACL configuration (default nochange)
  • acl_bindings – Replacement ACL bindings (default nochange)
  • annotations – Replacement annotations (default nochange)

A change of schema name is a transfer of the existing table to an existing destination schema (not a rename of the current containing schema).

Returns self (to allow for optional chained access).

alternatives

Convenience property for managing content of object annotation tag:isrd.isi.edu,2016:table-alternatives

apply(existing=None)[source]

Apply configuration to corresponding table in catalog unless existing already matches.

Parameters:
  • existing – An instance comparable to self, or None to apply configuration unconditionally.

The state of self.comment, self.annotations, self.acls, and self.acl_bindings will be applied to the server unless they match their corresponding state in existing.

catalog
clear(clear_comment=False)[source]

Clear all configuration in table and children.

NOTE: as a backwards-compatible heuristic, comments are retained by default so that a typical configuration-management client does not strip useful documentation from existing models.

columns

Sugared access to self.column_definitions

create_column(column_def)[source]

Add a new column to this table in the remote database based on column_def.

Returns a new Column instance based on the server-supplied representation of the new column, and adds it to self.column_definitions too.

create_fkey(fkey_def)[source]

Add a new foreign key to this table in the remote database based on fkey_def.

Returns a new ForeignKey instance based on the server-supplied representation of the new foreign key, and adds it to self.fkeys too.

create_key(key_def)[source]

Add a new key to this table in the remote database based on key_def.

Returns a new Key instance based on the server-supplied representation of the new key, and adds it to self.keys too.

classmethod define(tname, column_defs=[], key_defs=[], fkey_defs=[], comment=None, acls={}, acl_bindings={}, annotations={}, provide_system=True)[source]

Build a table definition.

Parameters:
  • tname – the name of the newly defined table
  • column_defs – a list of Column.define() results for extra or overridden column definitions
  • key_defs – a list of Key.define() results for extra or overridden key constraint definitions
  • fkey_defs – a list of ForeignKey.define() results for foreign key definitions
  • comment – a comment string for the table
  • acls – a dictionary of ACLs for specific access modes
  • acl_bindings – a dictionary of dynamic ACL bindings
  • annotations – a dictionary of annotations
  • provide_system – whether to inject standard system column definitions when missing from column_defs
classmethod define_asset(sname, tname, hatrac_template=None, column_defs=[], key_defs=[], fkey_defs=[], comment=None, acls={}, acl_bindings={}, annotations={}, provide_system=True)[source]

Build an asset table definition.

Parameters:
  • sname – the name of the schema for the asset table
  • tname – the name of the newly defined table
  • hatrac_template – template for the hatrac URL. The template undergoes substitution, so it can include elements such as {{{MD5}}} or {{{Filename}}}. The default template puts files in

    /hatrac/schema_name/table_name/md5.filename

    where the filename and md5 values are computed on upload and schema_name and table_name are the values of the provided arguments. If the value is set to False, no hatrac_template is used.

  • column_defs – a list of Column.define() results for extra or overridden column definitions
  • key_defs – a list of Key.define() results for extra or overridden key constraint definitions
  • fkey_defs – a list of ForeignKey.define() results for foreign key definitions
  • comment – a comment string for the table
  • acls – a dictionary of ACLs for specific access modes
  • acl_bindings – a dictionary of dynamic ACL bindings
  • annotations – a dictionary of annotations
  • provide_system – whether to inject standard system column definitions when missing from column_defs

These core asset table columns are generated automatically if absent from the input column_defs.

  • Filename: ermrest_curie, unique not null, default curie template “%s:{RID}” % curie_prefix
  • URL: Location of the asset, unique not null. Default template is:
    /hatrac/sname/tname/{{{MD5}}}.{{{Filename}}} where tname is the name of the asset table.
  • Length: Length of the asset.
  • MD5: text
  • Description: markdown, not null

However, caller-supplied definitions override the default.

In addition to creating the columns, this function also creates an asset annotation on the URL column to facilitate use of the table by Chaise.
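The default URL template described above can be sketched with plain string operations. Both helpers are hypothetical, and the simple str.replace rendering is a stand-in for whatever template substitution is applied on upload.

```python
def default_hatrac_template(sname, tname):
    """Compose the default URL template from the schema and table names."""
    return "/hatrac/%s/%s/{{{MD5}}}.{{{Filename}}}" % (sname, tname)


def render(template, md5, filename):
    """Illustrative substitution of the two placeholders (not the real template engine)."""
    return template.replace("{{{MD5}}}", md5).replace("{{{Filename}}}", filename)


# Hypothetical schema, table, checksum, and file name.
template = default_hatrac_template("Lab", "Image")
url = render(template, "d41d8", "scan.png")
```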

classmethod define_vocabulary(tname, curie_template, uri_template='/id/{RID}', column_defs=[], key_defs=[], fkey_defs=[], comment=None, acls={}, acl_bindings={}, annotations={}, provide_system=True)[source]

Build a vocabulary table definition.

Parameters:
  • tname – the name of the newly defined table
  • curie_template – the RID-based template for the CURIE of locally-defined terms, e.g. ‘MYPROJECT:{RID}’
  • uri_template – the RID-based template for the URI of locally-defined terms, e.g. ‘https://server.example.org/id/{RID}’
  • column_defs – a list of Column.define() results for extra or overridden column definitions
  • key_defs – a list of Key.define() results for extra or overridden key constraint definitions
  • fkey_defs – a list of ForeignKey.define() results for foreign key definitions
  • comment – a comment string for the table
  • acls – a dictionary of ACLs for specific access modes
  • acl_bindings – a dictionary of dynamic ACL bindings
  • annotations – a dictionary of annotations
  • provide_system – whether to inject standard system column definitions when missing from column_defs

These core vocabulary columns are generated automatically if absent from the input column_defs.

  • ID: ermrest_curie, unique not null, default curie template “%s:{RID}” % curie_prefix
  • URI: ermrest_uri, unique not null, default URI template “/id/{RID}”
  • Name: text, unique not null
  • Description: markdown, not null
  • Synonyms: text[]

However, caller-supplied definitions override the default.
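The RID-based templates above can be read as patterns in which {RID} is replaced by each row's RID. The sketch below illustrates that reading with Python string formatting. The RID value is made up, and the use of str.format is an assumption for illustration rather than a description of how the substitution is actually performed.

```python
curie_template = "MYPROJECT:{RID}"  # example value from the parameter description
uri_template = "/id/{RID}"          # documented default

rid = "1-ABC0"  # hypothetical RID
term_id = curie_template.format(RID=rid)
term_uri = uri_template.format(RID=rid)
```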

display

Convenience property for managing content of object annotation tag:misd.isi.edu,2015:display

drop()[source]

Remove this table from the remote database.

fkey_by_column_map(from_to_map, raise_nomatch=True)[source]

Return fkey from self.foreign_keys with matching {referencing: referenced} column mapping.

from_to_map: dict-like mapping with items() method yielding (from_col, to_col) pairs
raise_nomatch: when True, raise KeyError on non-match, else return None

fkeys_by_columns(from_columns, partial=False, raise_nomatch=True)[source]

Iterable of fkeys from self.foreign_keys with matching columns.

from_columns: iterable of referencing column instances or column names
partial: include fkeys which cover a superset of from_columns
raise_nomatch: when True, raise KeyError on empty iterable

generated

Convenience property for managing presence of annotation tag:isrd.isi.edu,2016:generated

immutable

Convenience property for managing presence of annotation tag:isrd.isi.edu,2016:immutable

is_association(min_arity=2, max_arity=2, unqualified=True, pure=True, no_overlap=True)[source]

Return (truthy) integer arity if self is a matching association, else False.

min_arity: minimum number of associated fkeys (default 2)
max_arity: maximum number of associated fkeys (default 2) or None
unqualified: reject qualified associations when True (default True)
pure: reject impure associations when True (default True)
no_overlap: reject overlapping associations when True (default True)

The default behavior with no arguments is to test for pure, unqualified, non-overlapping, binary associations.

An association is comprised of several foreign keys which are covered by a non-nullable composite row key. This allows specific combinations of foreign keys to appear at most once.

The arity of an association is the number of foreign keys being associated. A typical binary association has arity=2.

An unqualified association contains only the foreign key material in its row key. Conversely, a qualified association mixes in other material which means that a specific combination of foreign keys may repeat with different qualifiers.

A pure association contains only row key material. Conversely, an impure association includes additional metadata columns not covered by the row key. Unlike qualifiers, impure metadata merely decorates an association without augmenting its identifying characteristics.

A non-overlapping association does not share any columns between multiple foreign keys. This means that all combinations of foreign keys are possible. Conversely, an overlapping association shares some columns between multiple foreign keys, potentially limiting the combinations which can be represented in an association row.

These tests ignore the five ERMrest system columns and any corresponding constraints.
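The definitions above can be made concrete with a simplified sketch. It is not the library's implementation: it assumes a single, already chosen non-nullable row key, represents foreign keys as sets of column names, and uses the hypothetical name association_arity.

```python
SYSTEM_COLUMNS = {"RID", "RCT", "RMT", "RCB", "RMB"}


def association_arity(columns, fkeys, row_key, min_arity=2, max_arity=2,
                      unqualified=True, pure=True, no_overlap=True):
    """Return the arity of a matching association, else False (simplified)."""
    columns = set(columns) - SYSTEM_COLUMNS      # system columns are ignored
    row_key = set(row_key)
    # Only foreign keys fully covered by the row key take part in the association.
    covered = [set(fk) for fk in fkeys if set(fk) <= row_key]
    arity = len(covered)
    if arity < min_arity or (max_arity is not None and arity > max_arity):
        return False
    fk_cols = set().union(*covered)
    if unqualified and row_key - fk_cols:        # key holds non-fkey qualifiers
        return False
    if pure and columns - row_key:               # extra metadata columns
        return False
    if no_overlap and sum(len(fk) for fk in covered) != len(fk_cols):
        return False                             # some column is shared by fkeys
    return arity


system = set(SYSTEM_COLUMNS)
binary = association_arity(system | {"a", "b"}, [{"a"}, {"b"}], {"a", "b"})
impure = association_arity(system | {"a", "b", "note"}, [{"a"}, {"b"}], {"a", "b"})
impure_ok = association_arity(system | {"a", "b", "note"}, [{"a"}, {"b"}],
                              {"a", "b"}, pure=False)
qualified = association_arity(system | {"a", "b", "role"}, [{"a"}, {"b"}],
                              {"a", "b", "role"})
overlapping = association_arity(system | {"a", "b", "x"},
                                [{"a", "x"}, {"b", "x"}], {"a", "b", "x"})
```

Each example table differs from the plain binary association in exactly one respect, so each rejection can be traced to a single flag.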

key_by_columns(unique_columns, raise_nomatch=True)[source]

Return key from self.keys with matching unique columns.

unique_columns: iterable of column instances or column names
raise_nomatch: for True, raise KeyError on non-match, else return None

prejson(prune=True)[source]
classmethod system_column_defs(custom=[])[source]

Build standard system column definitions, merging optional custom definitions.

classmethod system_key_defs(custom=[])[source]

Build standard system key definitions, merging optional custom definitions.

table_display

Convenience property for managing content of object annotation tag:isrd.isi.edu,2016:table-display

uri_path

URI to this model element.

visible_columns

Convenience property for managing content of object annotation tag:isrd.isi.edu,2016:visible-columns

visible_foreign_keys

Convenience property for managing content of object annotation tag:isrd.isi.edu,2016:visible-foreign-keys

class deriva.core.ermrest_model.Type(type_doc)[source]

Bases: object

Named type.

prejson(prune=True)[source]
deriva.core.ermrest_model.equivalent(doc1, doc2, method=None)[source]

Determine whether two dict/array/literal documents are structurally equivalent.

deriva.core.ermrest_model.make_type(type_doc)[source]

Create instance of Type, DomainType, or ArrayType as appropriate for type_doc.

deriva.core.ermrest_model.object_annotation(tag_uri)[source]

Decorator to establish property getter/setter/deleter for object annotations.

Usage example:

    @object_annotation(tag.display)
    def display(self): pass

The stub method will be discarded.

deriva.core.ermrest_model.presence_annotation(tag_uri)[source]

Decorator to establish property getter/setter/deleter for presence annotations.

Usage example:

    @presence_annotation(tag.generated)
    def generated(self): pass

The stub method will be discarded.
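One way such a decorator can work is sketched below. This is a stand-alone illustration, not the library's code: presence_annotation_sketch and DemoColumn are hypothetical names, and storing None as the annotation value is an assumption.

```python
def presence_annotation_sketch(tag_uri):
    """Build a property reporting whether tag_uri is present in self.annotations."""
    def decorator(stub):
        # The stub only lends its docstring; its body is never called.
        def getter(self):
            return tag_uri in self.annotations

        def setter(self, value):
            if value:
                self.annotations[tag_uri] = None
            else:
                self.annotations.pop(tag_uri, None)

        def deleter(self):
            self.annotations.pop(tag_uri, None)

        return property(getter, setter, deleter, doc=stub.__doc__)
    return decorator


GENERATED = "tag:isrd.isi.edu,2016:generated"


class DemoColumn:
    """Hypothetical model element that only carries an annotations dict."""

    def __init__(self):
        self.annotations = {}

    @presence_annotation_sketch(GENERATED)
    def generated(self): pass


col = DemoColumn()
col.generated = True
```

Assigning a truthy value adds the tag and assigning a falsy value removes it, so the property reads as a simple boolean.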

deriva.core.ermrest_model.strip_nochange(d)[source]

deriva.core.hatrac_cli module

class deriva.core.hatrac_cli.DerivaHatracCLI(description, epilog)[source]

Bases: deriva.core.base_cli.BaseCLI

Deriva Hatrac Command-line Interface.

delacl(args)[source]

Implements the delacl sub-command.

delobj(args)[source]

Implements the delobj sub-command.

getacl(args)[source]

Implements the getacl sub-command.

getobj(args)[source]

Implements the getobj sub-command.

list(args)[source]

Implements the list sub-command.

main()[source]

Main routine of the CLI.

mkdir(args)[source]

Implements the mkdir sub-command.

putobj(args)[source]

Implements the putobj sub-command.

rmdir(args)[source]

Implements the rmdir sub-command.

setacl(args)[source]

Implements the setacl sub-command.

exception deriva.core.hatrac_cli.DerivaHatracCLIException(message)[source]

Bases: Exception

Base exception class for DerivaHatracCLI.

exception deriva.core.hatrac_cli.ResourceException(message, cause)[source]

Bases: deriva.core.hatrac_cli.DerivaHatracCLIException

Remote resource exception.

exception deriva.core.hatrac_cli.UsageException(message)[source]

Bases: deriva.core.hatrac_cli.DerivaHatracCLIException

Usage exception.

deriva.core.hatrac_cli.main()[source]

deriva.core.hatrac_store module

exception deriva.core.hatrac_store.HatracHashMismatch[source]

Bases: ValueError

exception deriva.core.hatrac_store.HatracJobAborted[source]

Bases: Exception

exception deriva.core.hatrac_store.HatracJobPaused[source]

Bases: Exception

exception deriva.core.hatrac_store.HatracJobTimeout[source]

Bases: Exception

class deriva.core.hatrac_store.HatracStore(scheme, server, credentials=None, session_config=None)[source]

Bases: deriva.core.deriva_binding.DerivaBinding

cancel_upload_job(path, job_id)[source]
content_equals(path, filename=None, md5=None, sha256=None)[source]

Check whether a remote object's content equals at least one of the specified input file, input md5, or input sha256, by comparing MD5 or SHA256 hashes.

Returns: True if and only if the object exists and its MD5 or SHA256 hash matches the hash of the input file or the passed MD5 or SHA256 parameters.
create_namespace(namespace_path, parents=True)[source]

Create a namespace.

create_upload_job(path, file_path, md5, sha256, create_parents=True, chunk_size=10485760, content_type=None, content_disposition=None)[source]
del_acl(resource_name, access, role=None)[source]

Delete the object or namespace ACL resource.

del_obj(path)[source]

Delete an object.

delete_namespace(namespace_path)[source]

Delete a namespace.

finalize_upload_job(path, job_id)[source]
get_acl(resource_name, access=None, role=None)[source]

Get the object or namespace ACL resource.

get_obj(path, headers={}, destfilename=None, callback=None)[source]

Retrieve resource optionally streamed to destination file.

If destfilename is provided, download content to a file with that name. The caller is responsible for cleaning up the file even on error, when the file may or may not exist.

If hatrac provides a Content-MD5 response header, the resulting download file will be hash-verified on success or raise HatracHashMismatch on errors. This is not verified when destfilename is None, as the client must instead consume and validate content directly from the response object.

get_upload_job(path, job_id)[source]
is_valid_namespace(namespace_path)[source]

Check if a namespace already exists.

put_loc(path, file_path, headers={}, md5=None, sha256=None, content_type=None, content_disposition=None, chunked=False, chunk_size=10485760, create_parents=True, allow_versioning=True, callback=None, cancel_job_on_error=True)[source]
Parameters:
  • path
  • file_path
  • headers
  • md5
  • sha256
  • content_type
  • content_disposition
  • chunked
  • chunk_size
  • create_parents
  • allow_versioning
  • callback
  • cancel_job_on_error
Returns:

put_obj(path, data, headers={}, md5=None, sha256=None, parents=True, content_type=None, content_disposition=None, allow_versioning=True)[source]

Idempotent upload of object, returning object location URI.

Arguments:
  path: name of object
  data: filename or seekable file-like object
  headers: additional headers
  md5: a base64 encoded md5 digest may be provided in order to skip the automatic hash computation
  sha256: a base64 encoded sha256 digest may be provided in order to skip the automatic hash computation
  parents: automatically create parent namespace(s) if missing
  content_type: the content-type of the object (optional)
  content_disposition: the preferred content-disposition of the object (optional)
  allow_versioning: reject with NotModified if content already exists (optional)

Automatically computes and sends Content-MD5 if no digests provided.

If an object-version already exists under the same name with the same Content-MD5, that location is returned instead of creating a new one.
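Callers who want to supply the digests themselves could compute them along these lines. This is a sketch using only the Python standard library; the helper name base64_digests is hypothetical, and passing the digests as ASCII strings is an assumption.

```python
import base64
import hashlib
import os
import tempfile


def base64_digests(file_path, chunk_size=1024 * 1024):
    """Return (md5, sha256) of a file as base64-encoded ASCII strings."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(file_path, "rb") as f:
        # Read in chunks so large files are hashed in a single pass
        # without loading them into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return (base64.b64encode(md5.digest()).decode("ascii"),
            base64.b64encode(sha256.digest()).decode("ascii"))


# Demonstrate on a small temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
md5_b64, sha256_b64 = base64_digests(tmp.name)
os.remove(tmp.name)
```

The resulting values would presumably be passed as the md5 and sha256 arguments so that the automatic hash computation is skipped.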

put_obj_chunked(path, file_path, job_id, chunk_size=10485760, callback=None, start_chunk=0, cancel_job_on_error=True)[source]
retrieve_namespace(namespace_path)[source]

Retrieve a namespace.

set_acl(resource_name, access, roles, add_role=False)[source]

Set the object or namespace ACL resource.

If add_role is True, the operation adds a single role to the ACL; otherwise it attempts to replace all of the ACL's roles. This option is only valid when a list of one role is given.

deriva.core.polling_ermrest_catalog module

class deriva.core.polling_ermrest_catalog.PollingErmrestCatalog(scheme, server, catalog_id, credentials={}, caching=True, session_config=None, amqp_server=None)[source]

Bases: deriva.core.ermrest_catalog.ErmrestCatalog

Persistent handle for an ERMrest catalog.

Provides a higher-level state_change_once() idiom to efficiently find candidate rows, transform them, and apply updates.

Provides a higher-level blocking_poll() idiom to efficiently poll a catalog, using AMQP to optimize polling where possible. (AMQP is currently limited to clients on localhost of catalog in practice.)

These features can be composed to implement condition-action agents with domain-specific logic, e.g.

    catalog = PollingErmrestCatalog(...)
    idle_etag = None

    def look_for_work():
        global idle_etag
        idle_etag, batch = catalog.state_change_once(
            # claim up to 5 items per batch
            '/entity/Foo/state=actionable?limit=5',
            '/attributegroup/Foo/id;state',
            lambda row: {'id': row['id'], 'state': 'claimed'},
            idle_etag
        )
        for candidate, update in batch:
            # assume we have free rein on claimed candidates
            # using state=claimed as a semaphore
            revision = candidate.copy()
            revision['state'] = update['state']
            ...  # do agent work
            revision['state'] = 'complete'
            catalog.put('/entity/Foo', [revision])

    catalog.blocking_poll(look_for_work)

blocking_poll(look_for_work, polling_seconds=600, coalesce_seconds=0.1)[source]

Use ERMrest change-notice monitoring to optimize polled work processing.

Client-provided look_for_work function finds actual work in ERMrest and processes it. We only optimize the scheduling of this work.

Run look_for_work() whenever there might be more work in ERMrest.

If look_for_work() returns True, assume there is more work.

If look_for_work() returns non-True, wait for ERMrest change-notice or polling_seconds timeout before looking again (whichever comes first).

On any change-monitoring communication error, assume there might be more work and restart the monitoring process.

Other exceptions abort the blocking_poll() call.
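The scheduling rule described above can be sketched without any messaging infrastructure. In this illustration a threading.Event stands in for the ERMrest change notice. The function name blocking_poll_sketch and the max_waits parameter (added so the example terminates) are hypothetical. Coalescing and communication-error handling are left out.

```python
import threading


def blocking_poll_sketch(look_for_work, change_notice, polling_seconds=600,
                         max_waits=None):
    """Run look_for_work() whenever there might be more work; return wait count."""
    waits = 0
    while True:
        # Keep working for as long as look_for_work() reports more work.
        while look_for_work() is True:
            pass
        if max_waits is not None and waits >= max_waits:
            return waits
        # Sleep until a change notice arrives or the polling timeout expires,
        # whichever comes first.
        change_notice.wait(timeout=polling_seconds)
        change_notice.clear()
        waits += 1


pending = ["a", "b", "c"]
done = []


def look_for_work():
    """Process one pending item and report whether more remain."""
    if pending:
        done.append(pending.pop(0))
    return bool(pending)


waits = blocking_poll_sketch(look_for_work, threading.Event(),
                             polling_seconds=0.01, max_waits=1)
```

All three items are processed back to back without waiting, and the loop only sleeps once look_for_work() stops returning True.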

state_change_once(query_datapath, update_datapath, row_transform_func, idle_etag=None)[source]

Perform generic conditional state update via GET-PUT sequence.

Arguments:
  query_datapath: a query for candidate rows
  update_datapath: an update to consume update rows
  row_transform_func: maps candidate to update rows
  idle_etag: no-op if table is still in this state

Returns: (idle_etag, [(candidate, update)…])
  idle_etag: value to thread to future calls
  [(candidate, update)…]: each row that was updated

Exceptions from the transform or update process will abort without returning results.

  1. GET query_datapath to get candidate row(s)
  2. apply row_transform_func(row) to get updated content
  3. PUT update_datapath to deliver transformed content – discards rows transformed to None

Uses opportunistic concurrency control with ETag, If-Match, etc. for safety.
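The GET-transform-PUT sequence can be sketched against an in-memory table. Everything here is a stand-in: FakeTable, its version-counter ETag, and state_change_once_sketch are hypothetical. The choice of which idle_etag to return in each case is an assumption, not the library's documented behavior.

```python
class FakeTable:
    """In-memory stand-in for a catalog table with a version-based ETag."""

    def __init__(self, rows):
        self.rows = {row["id"]: dict(row) for row in rows}
        self.version = 0

    def get(self, state):
        """Return (etag, copies of the rows currently in the given state)."""
        matches = [dict(r) for r in self.rows.values() if r["state"] == state]
        return str(self.version), matches

    def put(self, updates, if_match):
        """Apply updates only if the table is unchanged since if_match."""
        if if_match != str(self.version):
            raise RuntimeError("precondition failed: table changed")
        for update in updates:
            self.rows[update["id"]].update(update)
        self.version += 1


def state_change_once_sketch(table, state, row_transform_func, idle_etag=None):
    etag, candidates = table.get(state)               # 1. GET candidate rows
    if idle_etag is not None and etag == idle_etag:
        return idle_etag, []                          # table still idle: no-op
    plan = [(row, row_transform_func(row)) for row in candidates]   # 2. transform
    plan = [(row, upd) for row, upd in plan if upd is not None]     # drop None
    if not plan:
        return etag, []                               # nothing to update
    table.put([upd for _, upd in plan], if_match=etag)              # 3. guarded PUT
    return None, plan                                 # table changed; look again


table = FakeTable([{"id": 1, "state": "actionable"},
                   {"id": 2, "state": "complete"}])
claim = lambda row: {"id": row["id"], "state": "claimed"}
first = state_change_once_sketch(table, "actionable", claim)
second = state_change_once_sketch(table, "actionable", claim, first[0])
third = state_change_once_sketch(table, "actionable", claim, second[0])
```

The first call claims the actionable row, the second finds nothing and reports the idle ETag, and the third short-circuits because the table is still in that state.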

Module contents

deriva.core.get_credential(host, credential_file=DEFAULT_CREDENTIAL_FILE, globus_credential_file=DEFAULT_GLOBUS_CREDENTIAL_FILE, config_file=DEFAULT_CONFIG_FILE, requested_scope=None, force_scope_lookup=False, match_scope_tag="deriva-all")[source]

This function is used to get authorization credentials (in dict form) for use with various deriva-py API calls which take it as a parameter. A user must have already authenticated to the target host using either deriva-auth or deriva-globus-auth-utils login prior to calling this function, or the credential set for the host will not be found.

Parameters:
  • host – The hostname to retrieve the credential set for.
  • credential_file – Optional path to non-default location of the webauthn cookie credential file.
  • globus_credential_file – Optional path to non-default location of the GlobusAuth bearer token store.
  • config_file – Optional path to the non-default location of the deriva-py config file.
  • requested_scope – Optional, specific scope request string for the given host. If not specified, the webauthn service on the host will be queried to determine the host-to-scope mapping that should be used.
  • force_scope_lookup – Optional parameter to force the webauthn scope query and update the cached value in the deriva-py config file. A scope lookup will always be performed the first time a host-to-scope mapping is needed and is not already present in the configuration file for a given host.
  • match_scope_tag – In the case that a host-to-scope mapping request returns multiple scopes, this is the key value (“tag”) to match against in the result dict. By convention, the default is set to “deriva-all”, which is the expected response from webauthn.
Returns:

A dict containing credential authorization values mapped by authorization type

deriva.core.read_credential(credential_file=DEFAULT_CREDENTIAL_FILE, create_default=False, default=DEFAULT_CREDENTIAL)[source]
deriva.core.write_credential(credential_file=DEFAULT_CREDENTIAL_FILE, credential=DEFAULT_CREDENTIAL)[source]
deriva.core.read_config(config_file=DEFAULT_CONFIG_FILE, create_default=False, default=DEFAULT_CONFIG)[source]
deriva.core.write_config(config_file=DEFAULT_CONFIG_FILE, config=DEFAULT_CONFIG)[source]
core.DEFAULT_CONFIG_PATH = System-dependent default path to the configuration directory.
core.DEFAULT_CONFIG_FILE = System-dependent default path to the config file.
core.DEFAULT_CREDENTIAL_FILE = System-dependent default path to the credential file.
core.DEFAULT_GLOBUS_CREDENTIAL_FILE = System-dependent default path to the Globus Auth credential file.