deriva.core package

Submodules

deriva.core.base_cli module

class deriva.core.base_cli.BaseCLI(description, epilog, version=None)[source]

Bases: object

parse_cli()[source]
remove_options(options)[source]
class deriva.core.base_cli.KeyValuePairArgs(option_strings, dest, nargs=None, **kwargs)[source]

Bases: argparse.Action

deriva.core.datapath module

class deriva.core.datapath.Catalog(model_doc, **kwargs)[source]

Bases: object

Handle to a Catalog.

class deriva.core.datapath.Column(sname, tname, column_doc, **kwargs)[source]

Bases: object

Represents a column in a table.

ciregexp(other)[source]
fqname

the url encoded fully qualified name

instancename
regexp(other)[source]
ts(other)[source]
uname

the url encoded name

class deriva.core.datapath.DataPath(root)[source]

Bases: object

Represents an arbitrary data path.

context
delete()[source]

Deletes the entity set referenced by the data path.

entities(*attributes, **renamed_attributes)[source]

Returns the entity set computed by this data path. Optionally, the caller may specify the attributes to be included in the entity set. The attributes may be from the current context of the path or from a linked table instance. Columns may be renamed in the output and will take the name of the keyword parameter used. If no attributes are specified, the entity set will contain whole entities of the type of the path’s context.

Parameters:
  • attributes – a list of Columns.
  • renamed_attributes – a list of renamed Columns.

Returns: an entity set

filter(filter_expression)[source]

Filters the path based on the specified formula.

Parameters:
  • filter_expression – should be a valid Predicate object

Returns: self

Links this path with another table. At present, the implementation only supports single column keys.

Parameters:
  • right – the right hand table of the link expression
  • on – an equality comparison between keys and foreign keys
  • join_type – the join type of this link, which may be ‘left’, ‘right’, or ‘full’ for outer joins, or ‘’ (the default) for an inner join

Returns: self

uri
exception deriva.core.datapath.DataPathException(message, reason=None)[source]

Bases: Exception

DataPath exception

class deriva.core.datapath.EntitySet(uri, fetcher_fn)[source]

Bases: object

A set of entities. The EntitySet is produced by a path. The results may be explicitly fetched. The EntitySet behaves like a container. If the EntitySet has not been fetched explicitly, on first use of container operations, it will be implicitly fetched from the catalog.
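The implicit-fetch behaviour described above can be illustrated with a minimal sketch. This is not the library's implementation: `LazyEntitySet` and `fake_fetcher` are hypothetical names, and the fetcher is assumed to be a callable that takes a `limit` and returns a list of rows.

```python
class LazyEntitySet:
    """Hypothetical stand-in for an entity set that fetches on first use."""

    def __init__(self, fetcher_fn):
        self._fetcher_fn = fetcher_fn  # assumed: callable(limit) -> list of rows
        self._results = None           # None means "not fetched yet"

    def fetch(self, limit=None):
        """Explicitly fetch rows, then return self so calls can be chained."""
        self._results = self._fetcher_fn(limit)
        return self

    def _ensure_fetched(self):
        # Container operations call this so the first use triggers a fetch.
        if self._results is None:
            self.fetch()

    def __len__(self):
        self._ensure_fetched()
        return len(self._results)

    def __getitem__(self, item):
        self._ensure_fetched()
        return self._results[item]

    def __iter__(self):
        self._ensure_fetched()
        return iter(self._results)


calls = []


def fake_fetcher(limit):
    # Stands in for a catalog request and records each call for inspection.
    calls.append(limit)
    rows = [{"RID": "1-A"}, {"RID": "1-B"}, {"RID": "1-C"}]
    return rows if limit is None else rows[:limit]


entities = LazyEntitySet(fake_fetcher)
first = entities[0]    # first container operation triggers the implicit fetch
count = len(entities)  # later operations reuse the rows already fetched
```

In this sketch the fetcher runs once for the two container operations; an explicit `fetch(limit=...)` call replaces the cached rows.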

dataframe

Pandas DataFrame representation of this path.

fetch(limit=None)[source]

Fetches the entities from the catalog.

Parameters:
  • limit – maximum number of entities to fetch from the catalog.

Returns: self

class deriva.core.datapath.Filter(r, formula)[source]

Bases: deriva.core.datapath.PathOperator

class deriva.core.datapath.FilterPredicate(lop, op, rop)[source]

Bases: deriva.core.datapath.Predicate

class deriva.core.datapath.JunctionPredicate(left, op, right)[source]

Bases: deriva.core.datapath.Predicate

Bases: deriva.core.datapath.PathOperator

class deriva.core.datapath.NegationPredicate(child)[source]

Bases: deriva.core.datapath.Predicate

class deriva.core.datapath.PathOperator(r)[source]

Bases: object

class deriva.core.datapath.Predicate[source]

Bases: object

class deriva.core.datapath.Project(r, attributes, renamed_attributes)[source]

Bases: deriva.core.datapath.PathOperator

class deriva.core.datapath.ResetContext(r, alias)[source]

Bases: deriva.core.datapath.PathOperator

class deriva.core.datapath.Root(r)[source]

Bases: deriva.core.datapath.PathOperator

class deriva.core.datapath.Schema(sname, schema_doc, **kwargs)[source]

Bases: object

Represents a Schema.

class deriva.core.datapath.Table(sname, tname, table_doc, **kwargs)[source]

Bases: object

Represents a Table.

alias(alias_name)[source]

Returns a table alias object.

Parameters:
  • alias_name – a string to use as the alias name

entities(*attributes, **renamed_attributes)[source]
filter(filter_expression)[source]
fqname

the url encoded fully qualified name

fromname
insert(entities, defaults=None, add_system_defaults=True)[source]

Inserts entities into the table.

Parameters:
  • entities – an iterable collection of entities (i.e., rows) to be inserted into the table.
  • defaults – optional, set of column names to be assigned the default expression value.
  • add_system_defaults – flag to add system columns to the set of default columns.

Returns: newly created entities.

instancename
path

Always a new DataPath instance that is rooted at this table. Note that this table will be automatically aliased using its own table name.

uname

the url encoded name

update(entities, correlation={'RID'}, targets=None)[source]

Update entities of a table.

For more information see the ERMrest protocol for the attributegroup interface. By default, this method will correlate the input data (entities) based on the RID column of the table. If targets is not set, the method uses as targets all column names found in the first row of the entities input that are neither in the correlation set nor defined as ‘system columns’ by ERMrest.

Parameters:
  • entities – an iterable collection of entities (i.e., rows) to be updated in the table.
  • correlation – an iterable collection of column names used to correlate input set to the set of rows to be updated in the catalog. E.g., {‘col name’} or {mytable.mycolumn} will work if you pass a Column object.
  • targets – an iterable collection of column names used as the targets of the update operation.

Returns: EntitySet of updated entities as returned by the corresponding ERMrest interface.
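The default choice of targets described for update() can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `default_targets` is a hypothetical helper, and the system column set is assumed to be the ERMrest columns RID, RCT, RMT, RCB, and RMB.

```python
# ERMrest system columns, excluded from the default targets (assumed set).
SYSTEM_COLUMNS = frozenset({"RID", "RCT", "RMT", "RCB", "RMB"})


def default_targets(entities, correlation=frozenset({"RID"})):
    """Derive update targets from the first row when none are given (sketch)."""
    first_row = next(iter(entities))
    return {
        name
        for name in first_row
        if name not in correlation and name not in SYSTEM_COLUMNS
    }


rows = [
    {"RID": "1-A", "RMT": "2020-01-01", "title": "Draft", "status": "open"},
    {"RID": "1-B", "RMT": "2020-01-02", "title": "Final", "status": "closed"},
]

# Correlating on RID (the default) leaves the non-system columns as targets.
targets = default_targets(rows)

# Correlating on another column removes it from the targets as well.
targets_by_title = default_targets(rows, correlation={"title"})
```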

uri
class deriva.core.datapath.TableAlias(base_table, alias_name)[source]

Bases: deriva.core.datapath.Table

Represents a table alias.

entities(*attributes, **renamed_attributes)[source]
fqname

the url encoded fully qualified name

fromname
instancename
path

Returns the parent path for this alias.

uname

the url encoded name

uri
deriva.core.datapath.from_catalog(catalog)[source]

Creates a datapath.Catalog object from an ErmrestCatalog object.

Parameters:
  • catalog – an ErmrestCatalog object

Returns: a datapath.Catalog object

deriva.core.datapath.logger = <Logger deriva.core.datapath (WARNING)>

Logger for this module

deriva.core.deriva_binding module

class deriva.core.deriva_binding.DerivaBinding(scheme, server, credentials=None, caching=True, session_config=None)[source]

Bases: object

This is a base-class for implementation purposes. Not useful for clients.

static check_path(path)[source]
delete(path, headers={}, guard_response=None)[source]

Perform DELETE request, returning response object.

Arguments:

path: the path within this bound server
headers: headers to set in request
guard_response: expected current resource state as previously seen response object.

Uses guard_response to build appropriate ‘if-match’ header to assure change is only applied to expected state.

Raises ConcurrentUpdate for 412 status.
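One way to picture the guard_response mechanism is the sketch below. It assumes the previously seen response exposes a `headers` mapping with an `ETag` entry, which is not stated above; `guarded_headers` is a hypothetical helper, not part of the documented API.

```python
from types import SimpleNamespace


def guarded_headers(headers, guard_response=None):
    """Return a copy of headers with If-Match taken from a prior response."""
    merged = dict(headers)
    if guard_response is not None:
        # Assumption: the earlier response carries the resource state as an ETag.
        etag = guard_response.headers.get("ETag")
        if etag is not None:
            merged["If-Match"] = etag
    return merged


# A stand-in for a response object seen earlier for the same resource.
previous = SimpleNamespace(headers={"ETag": '"abc123"'})

request_headers = guarded_headers({"Accept": "application/json"}, previous)
```

With such a header in place, a server that supports conditional requests answers 412 when the resource no longer matches the expected state, which is the case reported above as ConcurrentUpdate.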

get(path, headers={}, raise_not_modified=False, stream=False)[source]

Perform GET request, returning response object.

Arguments:

path: the path within this bound server
headers: headers to set in request
raise_not_modified: raise HTTPError for 304 response status when True.
stream: whether to defer content retrieval to streaming access mode on response object.

May consult built-in cache and apply ‘if-none-match’ request header unless input headers already include ‘if-none-match’ or ‘if-match’. On cache hit, returns cached response unless raise_not_modified=True.

Caching of new results is disabled when stream=True.
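The cache consultation described for get() can be sketched roughly as below. Everything here is hypothetical: `CachingGetter`, the `send` callable, and the `(status, etag, body)` tuple are illustrative stand-ins, and the raise_not_modified and stream options are left out for brevity.

```python
class CachingGetter:
    """Toy client that revalidates cached bodies with If-None-Match."""

    def __init__(self, send):
        self._send = send  # assumed: callable(path, headers) -> (status, etag, body)
        self._cache = {}   # path -> (etag, body)

    def get(self, path, headers=None):
        headers = dict(headers or {})
        cached = self._cache.get(path)
        supplied = {name.lower() for name in headers}
        # Skip the cache when the caller already sent a conditional header.
        use_cache = cached is not None and not ({"if-none-match", "if-match"} & supplied)
        if use_cache:
            headers["If-None-Match"] = cached[0]
        status, etag, body = self._send(path, headers)
        if status == 304 and use_cache:
            return cached[1]  # cache hit: reuse the stored body
        if status == 200 and etag is not None:
            self._cache[path] = (etag, body)
        return body


requests_seen = []


def fake_send(path, headers):
    # Stand-in server: one resource whose current ETag is "v1".
    requests_seen.append(dict(headers))
    if headers.get("If-None-Match") == '"v1"':
        return 304, '"v1"', None
    return 200, '"v1"', "payload"


getter = CachingGetter(fake_send)
first = getter.get("/schema")   # miss: body is stored with its ETag
second = getter.get("/schema")  # hit: revalidated, cached body returned
```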

get_authn_session()[source]
get_server_uri()[source]
head(path, headers={}, raise_not_modified=False)[source]

Perform HEAD request, returning response object.

Arguments:

path: the path within this bound server
headers: headers to set in request
raise_not_modified: raise HTTPError for 304 response status when True.

May consult built-in cache and apply ‘if-none-match’ request header unless input headers already include ‘if-none-match’ or ‘if-match’. On cache hit, returns cached response unless raise_not_modified=True. Cached response may include content retrieved by GET on the same resource.

post(path, data=None, json=None, headers={})[source]

Perform POST request, returning response object.

Arguments:
path: the path within this bound server
data: a buffer or file-like content value
json: data to serialize as JSON content
headers: headers to set in request

Raises ConcurrentUpdate for 412 status.

post_authn_session(credentials)[source]
put(path, data=None, json=None, headers={}, guard_response=None)[source]

Perform PUT request, returning response object.

Arguments:

path: the path within this bound server
data: a buffer or file-like content value
json: data to serialize as JSON content
headers: headers to set in request
guard_response: expected current resource state as previously seen response object.

Uses guard_response to build appropriate ‘if-match’ header to assure change is only applied to expected state.

Raises ConcurrentUpdate for 412 status.

set_credentials(credentials, server)[source]
exception deriva.core.deriva_binding.DerivaPathError[source]

Bases: ValueError

deriva.core.deriva_server module

class deriva.core.deriva_server.DerivaServer(scheme, server, credentials=None, caching=True, session_config=None)[source]

Bases: deriva.core.deriva_binding.DerivaBinding

Persistent handle for a Deriva server.

connect_ermrest(catalog_id, snaptime=None)[source]

Connect to an ERMrest catalog.

Arguments:
catalog_id: e.g., ‘1’ or ‘1@2PM-DGYP-56Z4’
snaptime: e.g., ‘2PM-DGYP-56Z4’ (optional)
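
The two argument forms above suggest that a snapshot time can either be embedded in the catalog identifier after an ‘@’ or be passed separately. A minimal sketch of separating the two parts follows; `split_catalog_id` is a hypothetical helper, and giving the embedded value priority over the separate argument is an assumption rather than documented behaviour.

```python
def split_catalog_id(catalog_id, snaptime=None):
    """Split 'ID@SNAPTIME' into (ID, SNAPTIME); fall back to the argument."""
    base, separator, embedded = str(catalog_id).partition("@")
    if separator:
        # Assumption: an embedded snaptime takes priority over the argument.
        return base, embedded
    return base, snaptime


plain = split_catalog_id("1")
embedded = split_catalog_id("1@2PM-DGYP-56Z4")
separate = split_catalog_id("1", snaptime="2PM-DGYP-56Z4")
```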
create_ermrest_catalog()[source]

Create an ERMrest catalog.

deriva.core.ermrest_catalog module

class deriva.core.ermrest_catalog.ErmrestCatalog(scheme, server, catalog_id, credentials=None, caching=True, session_config=None)[source]

Bases: deriva.core.deriva_binding.DerivaBinding

Persistent handle for an ERMrest catalog.

Provides basic REST client for HTTP methods on arbitrary paths. Caller has to understand ERMrest APIs and compose appropriate paths, headers, and/or content.

Additional utility methods provided for accessing catalog metadata.

applyCatalogConfig(config)[source]
clone_catalog(dst_catalog=None, copy_data=True, copy_annotations=True, copy_policy=True, truncate_after=True)[source]

Clone this catalog’s content into dst_catalog, creating a new catalog if needed.

Parameters:
  • dst_catalog – Destination catalog or None to request creation of new destination (default).
  • copy_data – Copy table contents when True (default).
  • copy_annotations – Copy annotations when True (default).
  • copy_policy – Copy access-control policies when True (default).
  • truncate_after – Truncate destination history after cloning when True (default).

When dst_catalog is provided, attempt an idempotent clone, assuming content MAY be partially cloned already using the same parameters. This routine uses a table-level annotation “tag:isrd.isi.edu,2018:clone-state” to save progress markers which help it restart efficiently if interrupted.

Cloning preserves source row RID values so that any RID-based foreign keys are still valid. It is not generally advisable to try to merge more than one source into the same clone, nor to clone on top of rows generated locally in the destination, since this could cause duplicate RID conflicts.

Truncation after cloning avoids retaining incremental snapshots which contain partial clones.
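
The restartable, RID-preserving clone described above can be pictured with a toy model. This sketch works on plain dictionaries rather than catalogs; `resumable_clone`, the `"done"` marker value, and the data layout are all hypothetical and say nothing about the real contents of the clone-state annotation.

```python
def resumable_clone(source, dest, clone_state):
    """Copy rows table by table, skipping tables already marked as done."""
    for table_name, rows in source.items():
        if clone_state.get(table_name) == "done":
            continue  # progress marker: this table was finished earlier
        dest_rows = dest.setdefault(table_name, {})
        for row in rows:
            # Keying by the source RID makes re-copying a row harmless.
            dest_rows[row["RID"]] = dict(row)
        clone_state[table_name] = "done"
    return dest


source = {
    "Sample": [{"RID": "1-A", "name": "a"}, {"RID": "1-B", "name": "b"}],
    "Image": [{"RID": "2-A", "sample": "1-A"}],
}
# The destination already holds one row from an interrupted earlier run.
dest = {"Sample": {"1-A": {"RID": "1-A", "name": "a"}}}
state = {}

resumable_clone(source, dest, state)
snapshot = {name: dict(rows) for name, rows in dest.items()}
resumable_clone(source, dest, state)  # second run changes nothing
```

Because rows keep their source RID values, the foreign key from Image to Sample still resolves in the destination, and rows created locally under the same RID would be overwritten, which is the conflict the text warns about.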

delete(path, headers={}, guard_response=None)[source]

Perform DELETE request, returning response object.

Arguments:

path: the path within this bound catalog
headers: headers to set in request
guard_response: expected current resource state as previously seen response object.

Uses guard_response to build appropriate ‘if-match’ header to assure change is only applied to expected state.

Raises ConcurrentUpdate for 412 status.

delete_ermrest_catalog(really=False)[source]

Perform DELETE request, destroying catalog on server.

Arguments:
really: delete when True, abort when False (default)
getAsFile(path, destfilename, headers={}, callback=None)[source]

Retrieve catalog data streamed to destination file. Caller is responsible for cleaning up the file even on error, when the file may or may not exist.

getCatalogConfig()[source]
getCatalogModel()[source]
getCatalogSchema()[source]
getDefaultColumns(row, table, exclude=None, quote_url=True)[source]
getPathBuilder()[source]

Returns the ‘path builder’ interface for this catalog.

getTableColumns(fq_table_name)[source]
getTableSchema(fq_table_name)[source]
latest_snapshot()[source]

Gets a handle to this catalog’s latest snapshot.

static splitQualifiedCatalogName(name)[source]
table_schemas = {}
validateRowColumns(row, fq_tableName)[source]
exception deriva.core.ermrest_catalog.ErmrestCatalogMutationError[source]

Bases: Exception

class deriva.core.ermrest_catalog.ErmrestSnapshot(scheme, server, catalog_id, snaptime, credentials=None, caching=True, session_config=None)[source]

Bases: deriva.core.ermrest_catalog.ErmrestCatalog

Persistent handle for an ERMrest catalog snapshot.

Inherits from ErmrestCatalog and provides the same interfaces, except that the interfaces are now bound to a fixed snapshot of the catalog.

snaptime

The snaptime for this catalog snapshot instance.

deriva.core.ermrest_config module

class deriva.core.ermrest_config.AttrDict[source]

Bases: dict

Dictionary with optional attribute-based lookup.

For keys that are valid attributes, self.key is equivalent to self[key].
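
A minimal sketch of a dictionary with attribute-based lookup is shown below. `AttrDictSketch` is a hypothetical class written for illustration and may differ from the actual AttrDict in details.

```python
class AttrDictSketch(dict):
    """Dictionary whose keys can also be read as attributes (sketch)."""

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal attribute lookup fails,
        # so real dict attributes such as .items keep their usual meaning.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


config = AttrDictSketch(owner="alice")
config["items"] = ["x"]

by_attribute = config.owner   # same value as config["owner"]
by_key = config["items"]      # a key that shadows a dict method needs [] access
```

This shows why the equivalence is limited to keys that are valid attributes: a key such as "items" collides with an existing dict method and is only reachable by subscript.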

class deriva.core.ermrest_config.CatalogColumn(sname, tname, column_doc, **kwargs)[source]

Bases: deriva.core.ermrest_config.NodeConfigAclBinding

Column-level configuration management.

acl_bindings: column-level dynamic ACL bindings
acls: column-level ACL configuration
annotations: column-level annotations
name: name of column

Convenience access to common annotations:
self.asset: tag.asset object
self.column_display: tag.column_display object
self.display: tag.display object
self.generated: treat tag.generated as a boolean
self.immutable: treat tag.immutable as a boolean
asset
column_display
prejson(prune=True)[source]

Produce a representation of configuration as generic Python data structures

class deriva.core.ermrest_config.CatalogConfig(model_doc, **kwargs)[source]

Bases: deriva.core.ermrest_config.NodeConfigAcl

Top-level catalog configuration management.

acls: catalog-level ACL configuration
annotations: catalog-level annotations
schemas: all schemas in catalog, by name

apply(catalog, existing=None)[source]

Apply catalog configuration to catalog unless existing already matches.

Parameters:
  • catalog – The ErmrestCatalog instance to which configuration will be applied.
  • existing – An instance comparable to self.

The configuration in self will be applied recursively to the corresponding model nodes in catalog. For each node, the comment, annotations, acls, and/or acl_bindings will be applied where applicable.

If existing is not provided (default), the current whole configuration will be retrieved from the catalog and used automatically to determine whether the configuration goals under this CatalogConfig instance are already met or need to be remotely applied.

clear()[source]

Clear all configuration in catalog and children.

column(sname, tname, cname)[source]

Return column configuration for column with given name.

classmethod fromcatalog(catalog)[source]

Retrieve catalog config as a CatalogConfig management object.

prejson(prune=True)[source]

Produce a representation of configuration as generic Python data structures

table(sname, tname)[source]

Return table configuration for table with given name.

class deriva.core.ermrest_config.CatalogForeignKey(sname, tname, fkey_doc, **kwargs)[source]

Bases: deriva.core.ermrest_config.NodeConfigAclBinding

Foreign key-level configuration management.

acl_bindings: foreign key-level acl-bindings
acls: foreign key-level acls
annotations: foreign key-level annotations

foreign_key
prejson(prune=True)[source]

Produce a representation of configuration as generic Python data structures

class deriva.core.ermrest_config.CatalogKey(sname, tname, key_doc, **kwargs)[source]

Bases: deriva.core.ermrest_config.NodeConfig

Key-level configuration management.

annotations: key-level annotations
names: name(s) of key constraint

prejson(prune=True)[source]

Produce a representation of configuration as generic Python data structures

class deriva.core.ermrest_config.CatalogSchema(sname, schema_doc, **kwargs)[source]

Bases: deriva.core.ermrest_config.NodeConfigAcl

Schema-level configuration management.

acls: schema-level ACL configuration
annotations: schema-level annotations
tables: all tables in schema, by name

Convenience access for common annotations:
self.display: access mutable tag.display object
apply(catalog, existing=None)[source]

Apply schema configuration to catalog unless existing already matches.

Parameters:
  • catalog – The ErmrestCatalog instance to which configuration will be applied.
  • existing – An instance comparable to self.

The configuration in self will be applied recursively to the corresponding model nodes in catalog. For each node, the comment, annotations, acls, and/or acl_bindings will be applied where applicable unless existing value is equivalent.

clear()[source]

Clear all configuration in schema and children.

prejson(prune=True)[source]

Produce a representation of configuration as generic Python data structures

class deriva.core.ermrest_config.CatalogTable(sname, tname, table_doc, **kwargs)[source]

Bases: deriva.core.ermrest_config.NodeConfigAclBinding

Table-level configuration management.

acl_bindings: table-level dynamic ACL bindings
acls: table-level ACL configuration
annotations: table-level annotations
column_definitions: columns in table

Convenience access to common annotations:
self.alternatives: tag.table_alternatives object
self.display: tag.display object
self.generated: treat tag.generated as a boolean
self.immutable: treat tag.immutable as a boolean
self.table_display: tag.table_display object
self.visible_columns: tag.visible_columns object
self.visible_foreign_keys: tag.visible_foreign_keys object
alternatives
apply(catalog, existing=None)[source]

Apply table configuration to catalog unless existing already matches.

Parameters:
  • catalog – The ErmrestCatalog instance to which configuration will be applied.
  • existing – An instance comparable to self.

The configuration in self will be applied recursively to the corresponding model nodes in catalog. For each node, the comment, annotations, acls, and/or acl_bindings will be applied where applicable unless existing is supplied and is equivalent.

clear()[source]

Clear all configuration in table and children.

prejson(prune=True)[source]

Produce a representation of configuration as generic Python data structures

table_display
visible_columns
visible_foreign_keys
class deriva.core.ermrest_config.KeyedList(l)[source]

Bases: list

Keyed list.

append(e)[source]

Append element to list and record its key.

class deriva.core.ermrest_config.MultiKeyedList(l)[source]

Bases: list

Multi-keyed list.

append(e)[source]

Append element to list and record its keys.

class deriva.core.ermrest_config.NodeConfig(uri_path, node_doc)[source]

Bases: object

Generic model document node configuration management.

annotations: map of annotations for node by key
comment: comment string or None (if supported by sub-class)

Convenience access for common annotations:
self.display: access mutable tag.display object
self.generated: treat tag.generated as a boolean
self.immutable: treat tag.immutable as a boolean
annotation_obj(tag)[source]

Generic access to annotation object under given tag.

Returns object stored under tag in node’s annotations, so that side-effects applied to it will affect the annotation.

If annotation is not yet present, an empty object is added and returned.

annotation_presence(tag)[source]

Return True if annotation is present for given tag, False otherwise.

apply(catalog, existing=None)[source]

Apply configuration to corresponding node in catalog unless existing already matches.

Parameters:
  • catalog – The ErmrestCatalog instance to which configuration will be applied.
  • existing – An instance comparable to self, or None to apply configuration unconditionally.

The configuration in self.comment and self.annotations will be applied to the remote model node corresponding to self, unless existing node configuration is supplied and is equivalent.

clear(clear_comment=False)[source]

Clear existing annotations on node, also clearing comment if clear_comment is True.

NOTE: as a backwards-compatible heuristic, comments are retained by default so that a typical configuration-management client does not strip useful documentation from existing models.

comment

Comment on this node in model, if supported.

Raises TypeError if accessed when unsupported, e.g. on top-level catalog objects.

display
generated
immutable
prejson()[source]

Produce a representation of configuration as generic Python data structures

set_annotation_presence(tag, value)[source]

Add or remove annotation with given tag depending on boolean presence value.

True: add or replace tag with None value
False: remove tag if it exists
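
The annotation helpers documented for this class can be sketched against a plain dictionary of annotations. The function bodies below are illustrative guesses that follow the descriptions above, and the two tag strings are hypothetical placeholders rather than real annotation keys.

```python
DISPLAY_TAG = "tag:example.org,2024:display"      # hypothetical tag
GENERATED_TAG = "tag:example.org,2024:generated"  # hypothetical tag


def annotation_obj(annotations, tag):
    """Return the object stored under tag, adding an empty one if absent."""
    return annotations.setdefault(tag, {})


def annotation_presence(annotations, tag):
    """Return True if an annotation is present for tag, False otherwise."""
    return tag in annotations


def set_annotation_presence(annotations, tag, value):
    """Add or replace tag with None when value is true, else remove it."""
    if value:
        annotations[tag] = None
    else:
        annotations.pop(tag, None)


annotations = {}

# The returned object is the stored one, so mutating it edits the annotation.
annotation_obj(annotations, DISPLAY_TAG)["name"] = "Sample"

# Presence-style annotations carry no payload; only the key matters.
set_annotation_presence(annotations, GENERATED_TAG, True)
is_generated = annotation_presence(annotations, GENERATED_TAG)
```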

class deriva.core.ermrest_config.NodeConfigAcl(uri_path, node_doc)[source]

Bases: deriva.core.ermrest_config.NodeConfig

Generic model acl-bearing document node configuration management.

acls: map of acls for node by key
annotations: map of annotations for node by key
comment: comment string or None (if supported by sub-class)

Convenience access for common annotations:
self.display: access mutable tag.display object
self.generated: treat tag.generated as a boolean
self.immutable: treat tag.immutable as a boolean
apply(catalog, existing=None)[source]

Apply configuration to corresponding node in catalog unless existing already matches.

Parameters:
  • catalog – The ErmrestCatalog instance to which configuration will be applied.
  • existing – An instance comparable to self, or None to apply configuration unconditionally.

The configuration in self.comment, self.annotations, and self.acls will be applied to the remote model node corresponding to self, unless existing node configuration is supplied and is equivalent.

clear()[source]

Clear existing acls and annotations on node.

prejson()[source]

Produce a representation of configuration as generic Python data structures

class deriva.core.ermrest_config.NodeConfigAclBinding(uri_path, node_doc)[source]

Bases: deriva.core.ermrest_config.NodeConfigAcl

Generic model acl_binding-bearing document node configuration management.

acl_bindings: map of acl bindings for node by key
acls: map of acls for node by key
annotations: map of annotations for node by key
comment: comment string or None (if supported by sub-class)

Convenience access for common annotations:
self.display: access mutable tag.display object
self.generated: treat tag.generated as a boolean
self.immutable: treat tag.immutable as a boolean
apply(catalog, existing=None)[source]

Apply configuration to corresponding node in catalog unless existing already matches.

Parameters:
  • catalog – The ErmrestCatalog instance to which configuration will be applied.
  • existing – An instance comparable to self, or None to apply configuration unconditionally.

The configuration in self.comment, self.annotations, self.acls, and self.acl_bindings will be applied to the remote model node corresponding to self, unless existing node configuration is supplied and is equivalent.

clear()[source]

Clear existing acl_bindings, acls, and annotations on node.

prejson()[source]

Produce a representation of configuration as generic Python data structures

deriva.core.ermrest_config.equivalent(doc1, doc2, method=None)[source]

Determine whether two dict/array/literal documents are structurally equivalent.

deriva.core.ermrest_model module

class deriva.core.ermrest_model.ArrayType(type_doc, **kwargs)[source]

Bases: deriva.core.ermrest_model.Type

Named array type.

prejson(prune=True)[source]
class deriva.core.ermrest_model.Column(sname, tname, column_doc, **kwargs)[source]

Bases: deriva.core.ermrest_config.CatalogColumn

Named column.

classmethod define(cname, ctype, nullok=True, default=None, comment=None, acls={}, acl_bindings={}, annotations={})[source]

Build a column definition.

delete(catalog, table=None)[source]

Remove this column from the remote database.

Also remove this column from the local table object (if provided).

Parameters:
  • catalog – an ErmrestCatalog object
  • table – a Table object or None
prejson(prune=True)[source]

Produce a representation of configuration as generic Python data structures

class deriva.core.ermrest_model.DomainType(type_doc, **kwargs)[source]

Bases: deriva.core.ermrest_model.Type

Named domain type.

prejson(prune=True)[source]
class deriva.core.ermrest_model.ForeignKey(sname, tname, fkey_doc, **kwargs)[source]

Bases: deriva.core.ermrest_config.CatalogForeignKey

Named foreign key.

classmethod define(fk_colnames, pk_sname, pk_tname, pk_colnames, on_update='NO ACTION', on_delete='NO ACTION', constraint_names=[], comment=None, acls={}, acl_bindings={}, annotations={})[source]
delete(catalog, table=None)[source]

Remove this foreign key from the remote database.

Also remove this foreign key from the local table object (if provided).

Parameters:
  • catalog – an ErmrestCatalog object
  • table – a Table object or None
prejson(prune=True)[source]

Produce a representation of configuration as generic Python data structures

class deriva.core.ermrest_model.Key(sname, tname, key_doc, **kwargs)[source]

Bases: deriva.core.ermrest_config.CatalogKey

Named key.

classmethod define(colnames, constraint_names=[], comment=None, annotations={})[source]

Build a key definition.

delete(catalog, table=None)[source]

Remove this key from the remote database.

Also remove this key from the local table object (if provided).

Parameters:
  • catalog – an ErmrestCatalog object
  • table – a Table object or None
prejson(prune=True)[source]

Produce a representation of configuration as generic Python data structures

class deriva.core.ermrest_model.Model(model_doc, **kwargs)[source]

Bases: deriva.core.ermrest_config.CatalogConfig

Top-level catalog model.

create_schema(catalog, schema_def)[source]

Add a new schema to this model in the remote database based on schema_def.

Returns a new Schema instance based on the server-supplied representation of the newly created schema.

The returned Schema is also added to self.schemas.

update_referenced_by()[source]

Introspects the ‘foreign_keys’ and updates the ‘referenced_by’ properties on the ‘Table’ objects.

Parameters:
  • model – an ERMrest model object

class deriva.core.ermrest_model.Schema(sname, schema_doc, **kwargs)[source]

Bases: deriva.core.ermrest_config.CatalogSchema

Named schema.

create_table(catalog, table_def)[source]

Add a new table to this schema in the remote database based on table_def.

Returns a new Table instance based on the server-supplied representation of the newly created table.

The returned Table is also added to self.tables.

classmethod define(sname, comment=None, acls={}, annotations={})[source]

Build a schema definition.

delete(catalog, model=None)[source]

Remove this schema from the remote database.

Also remove this schema from the local model object (if provided).

Parameters:
  • catalog – an ErmrestCatalog object
  • model – a Model object or None
prejson(prune=True)[source]

Produce a representation of configuration as generic Python data structures

class deriva.core.ermrest_model.Table(sname, tname, table_doc, **kwargs)[source]

Bases: deriva.core.ermrest_config.CatalogTable

Named table.

create_column(catalog, column_def)[source]

Add a new column to this table in the remote database based on column_def.

Returns a new Column instance based on the server-supplied representation of the new column, and adds it to self.column_definitions too.

create_fkey(catalog, fkey_def)[source]

Add a new foreign key to this table in the remote database based on fkey_def.

Returns a new ForeignKey instance based on the server-supplied representation of the new foreign key, and adds it to self.fkeys too.

create_key(catalog, key_def)[source]

Add a new key to this table in the remote database based on key_def.

Returns a new Key instance based on the server-supplied representation of the new key, and adds it to self.keys too.

classmethod define(tname, column_defs=[], key_defs=[], fkey_defs=[], comment=None, acls={}, acl_bindings={}, annotations={}, provide_system=True)[source]

Build a table definition.

Parameters:
  • tname – the name of the newly defined table
  • column_defs – a list of Column.define() results for extra or overridden column definitions
  • key_defs – a list of Key.define() results for extra or overridden key constraint definitions
  • fkey_defs – a list of ForeignKey.define() results for foreign key definitions
  • comment – a comment string for the table
  • acls – a dictionary of ACLs for specific access modes
  • acl_bindings – a dictionary of dynamic ACL bindings
  • annotations – a dictionary of annotations
  • provide_system – whether to inject standard system column definitions when missing from column_defs
classmethod define_vocabulary(tname, curie_template, uri_template='/id/{RID}', column_defs=[], key_defs=[], fkey_defs=[], comment=None, acls={}, acl_bindings={}, annotations={}, provide_system=True)[source]

Build a vocabulary table definition.

Parameters:
  • tname – the name of the newly defined table
  • curie_template – the RID-based template for the CURIE of locally-defined terms, e.g. ‘MYPROJECT:{RID}’
  • uri_template – the RID-based template for the URI of locally-defined terms, e.g. ‘https://server.example.org/id/{RID}’
  • column_defs – a list of Column.define() results for extra or overridden column definitions
  • key_defs – a list of Key.define() results for extra or overridden key constraint definitions
  • fkey_defs – a list of ForeignKey.define() results for foreign key definitions
  • comment – a comment string for the table
  • acls – a dictionary of ACLs for specific access modes
  • acl_bindings – a dictionary of dynamic ACL bindings
  • annotations – a dictionary of annotations
  • provide_system – whether to inject standard system column definitions when missing from column_defs

These core vocabulary columns are generated automatically if absent from the input column_defs.

  • id: ermrest_curie, unique not null, default curie template “%s:{RID}” % curie_prefix
  • uri: ermrest_uri, unique not null, default URI template “/id/{RID}”
  • name: text, unique not null
  • description: markdown, not null
  • synonyms: text[]

However, caller-supplied definitions override the default.
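
To make the template parameters concrete, the sketch below substitutes a RID into the templates quoted above. It only illustrates the substitution: `expand_template` is a hypothetical helper, the RID value is made up, and where the expansion actually takes place is not stated in this section.

```python
def expand_template(template, rid):
    """Fill the {RID} placeholder of a CURIE or URI template (sketch)."""
    return template.format(RID=rid)


# The default CURIE template is described above as "%s:{RID}" % curie_prefix.
curie_prefix = "MYPROJECT"
default_curie_template = "%s:{RID}" % curie_prefix

curie = expand_template(default_curie_template, "1-ABC")
uri = expand_template("/id/{RID}", "1-ABC")
```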

delete(catalog, schema=None)[source]

Remove this table from the remote database.

Also remove this table from the local schema object (if provided).

Parameters:
  • catalog – an ErmrestCatalog object
  • schema – a Schema object or None
prejson(prune=True)[source]

Produce a representation of configuration as generic Python data structures

classmethod system_column_defs(custom=[])[source]

Build standard system column definitions, merging optional custom definitions.

classmethod system_key_defs(custom=[])[source]

Build standard system key definitions, merging optional custom definitions.

class deriva.core.ermrest_model.Type(type_doc, **kwargs)[source]

Bases: object

Named type.

prejson(prune=True)[source]
deriva.core.ermrest_model.make_type(type_doc, **kwargs)[source]

Create instance of Type, DomainType, or ArrayType as appropriate for type_doc.

deriva.core.hatrac_cli module

class deriva.core.hatrac_cli.DerivaHatracCLI(description, epilog)[source]

Bases: deriva.core.base_cli.BaseCLI

Deriva Hatrac Command-line Interface.

delacl(args)[source]

Implements the delacl sub-command.

delobj(args)[source]

Implements the delobj sub-command.

getacl(args)[source]

Implements the getacl sub-command.

getobj(args)[source]

Implements the getobj sub-command.

list(args)[source]

Implements the list sub-command.

main()[source]

Main routine of the CLI.

mkdir(args)[source]

Implements the mkdir sub-command.

putobj(args)[source]

Implements the putobj sub-command.

rmdir(args)[source]

Implements the rmdir sub-command.

setacl(args)[source]

Implements the setacl sub-command.

exception deriva.core.hatrac_cli.DerivaHatracCLIException(message)[source]

Bases: Exception

Base exception class for DerivaHatracCLI.

exception deriva.core.hatrac_cli.ResourceException(message, cause)[source]

Bases: deriva.core.hatrac_cli.DerivaHatracCLIException

Remote resource exception.

exception deriva.core.hatrac_cli.UsageException(message)[source]

Bases: deriva.core.hatrac_cli.DerivaHatracCLIException

Usage exception.

deriva.core.hatrac_cli.main()[source]

deriva.core.hatrac_store module

exception deriva.core.hatrac_store.HatracHashMismatch[source]

Bases: ValueError

exception deriva.core.hatrac_store.HatracJobAborted[source]

Bases: Exception

exception deriva.core.hatrac_store.HatracJobPaused[source]

Bases: Exception

exception deriva.core.hatrac_store.HatracJobTimeout[source]

Bases: Exception

class deriva.core.hatrac_store.HatracStore(scheme, server, credentials=None, session_config=None)[source]

Bases: deriva.core.deriva_binding.DerivaBinding

cancel_upload_job(path, job_id)[source]
content_equals(path, filename=None, md5=None, sha256=None)[source]

Check whether a remote object’s content equals at least one of the specified inputs (input file, input md5, or input sha256) by comparing MD5 or SHA256 hashes. :return: True if and only if the object exists and its MD5 or SHA256 hash matches the hash of the input file or the passed md5 or sha256 parameter.
create_namespace(namespace_path, parents=True)[source]

Create a namespace.

create_upload_job(path, file_path, md5, sha256, create_parents=True, chunk_size=5760000, content_type=None, content_disposition=None)[source]
del_acl(resource_name, access, role=None)[source]

Delete the object or namespace ACL resource.

del_obj(path)[source]

Delete an object.

delete_namespace(namespace_path)[source]

Delete a namespace.

finalize_upload_job(path, job_id)[source]
get_acl(resource_name, access=None, role=None)[source]

Get the object or namespace ACL resource.

get_obj(path, headers={}, destfilename=None, callback=None)[source]

Retrieve resource optionally streamed to destination file.

If destfilename is provided, download content to a file with that name. The caller is responsible for cleaning up the file even on error, when the file may or may not exist.

If hatrac provides a Content-MD5 response header, the downloaded file is hash-verified, and HatracHashMismatch is raised if the hashes do not match. No verification is performed when destfilename is None, as the client must instead consume and validate content directly from the response object.
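
When destfilename is None and the caller validates content itself, the check amounts to comparing a base64-encoded MD5 digest with the Content-MD5 header value. The sketch below illustrates that comparison under stated assumptions: `file_md5_base64` and `verify_download` are hypothetical helpers, not part of deriva.core, and a plain ValueError stands in for HatracHashMismatch (which is documented above as a ValueError subclass).

```python
import base64
import hashlib


def file_md5_base64(filename, block_size=1024 * 1024):
    """Return the base64-encoded MD5 digest of a file, read in blocks."""
    digest = hashlib.md5()
    with open(filename, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return base64.b64encode(digest.digest()).decode("ascii")


def verify_download(filename, content_md5):
    """Compare a file's digest with a Content-MD5 header value.

    Hypothetical helper: raises ValueError on mismatch, standing in
    for the library's HatracHashMismatch.
    """
    actual = file_md5_base64(filename)
    if actual != content_md5:
        raise ValueError(
            "MD5 mismatch: expected %s, got %s" % (content_md5, actual)
        )
    return actual
```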

get_upload_job(path, job_id)[source]
is_valid_namespace(namespace_path)[source]

Check if a namespace already exists.

put_loc(path, file_path, headers={}, md5=None, sha256=None, content_type=None, content_disposition=None, chunked=False, chunk_size=5760000, create_parents=True, allow_versioning=True, callback=None)[source]
Parameters:
  • path
  • file_path
  • headers
  • md5
  • sha256
  • content_type
  • content_disposition
  • chunked
  • chunk_size
  • create_parents
  • allow_versioning
  • callback
Returns:

put_obj(path, data, headers={}, md5=None, sha256=None, parents=True)[source]

Idempotent upload of object, returning object location URI.

Arguments:
  • path – name of object
  • data – filename or seekable file-like object
  • headers – additional headers
  • md5 – a base64 encoded md5 digest may be provided in order to skip the automatic hash computation
  • sha256 – a base64 encoded sha256 digest may be provided in order to skip the automatic hash computation
  • parents – automatically create parent namespace(s) if missing

Automatically computes and sends Content-MD5 if no digests provided.

If an object-version already exists under the same name with the same Content-MD5, that location is returned instead of creating a new one.

put_obj_chunked(path, file_path, job_id, chunk_size=5760000, callback=None, start_chunk=0)[source]
retrieve_namespace(namespace_path)[source]

Retrieve a namespace.

set_acl(resource_name, access, roles, add_role=False)[source]

Set the object or namespace ACL resource.

If ‘add_role’ is True, the operation adds a single role to the ACL; otherwise it attempts to replace all of the ACL’s roles. The add_role option is only valid when a list of one role is given.
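
The difference between the two modes can be modeled on a plain list of roles. This is only a sketch of the semantics described above, with no network calls: `apply_set_acl` is a hypothetical function, and the choice to raise ValueError for an invalid add_role call is an assumption of the example.

```python
def apply_set_acl(acl, roles, add_role=False):
    """Return the role list that results from a set_acl-style request.

    Hypothetical model of the documented semantics:
    - add_role=True: add the single given role to the existing ACL
    - add_role=False: replace the whole ACL with the given roles
    """
    if add_role:
        if len(roles) != 1:
            # Assumption: an invalid combination is rejected outright.
            raise ValueError("add_role requires a list of exactly one role")
        role = roles[0]
        return acl if role in acl else acl + [role]
    return list(roles)
```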

deriva.core.polling_ermrest_catalog module

class deriva.core.polling_ermrest_catalog.PollingErmrestCatalog(scheme, server, catalog_id, credentials={}, caching=True, session_config=None)[source]

Bases: deriva.core.ermrest_catalog.ErmrestCatalog

Persistent handle for an ERMrest catalog.

Provides a higher-level state_change_once() idiom to efficiently find candidate rows, transform them, and apply updates.

Provides a higher-level blocking_poll() idiom to efficiently poll a catalog, using AMQP to optimize polling where possible. (AMQP is currently limited to clients on localhost of catalog in practice.)

These features can be composed to implement condition-action agents with domain-specific logic, e.g.

    catalog = PollingErmrestCatalog(...)
    idle_etag = None

    def look_for_work():
        global idle_etag
        idle_etag, batch = catalog.state_change_once(
            # claim up to 5 items per batch
            '/entity/Foo/state=actionable?limit=5',
            '/attributegroup/Foo/id;state',
            lambda row: {'id': row['id'], 'state': 'claimed'},
            idle_etag
        )
        for candidate, update in batch:
            # assume we have free rein on claimed candidates
            # using state=claimed as a semaphore
            revision = candidate.copy()
            revision['state'] = update['state']
            ...  # do agent work
            revision['state'] = 'complete'
            catalog.put('/entity/Foo', [revision])

    catalog.blocking_poll(look_for_work)

blocking_poll(look_for_work, polling_seconds=600, coalesce_seconds=0.1)[source]

Use ERMrest change-notice monitoring to optimize polled work processing.

Client-provided look_for_work function finds actual work in ERMrest and processes it. We only optimize the scheduling of this work.

Run look_for_work() whenever there might be more work in ERMrest.

If look_for_work() returns True, assume there is more work.

If look_for_work() returns non-True, wait for ERMrest change-notice or polling_seconds timeout before looking again (whichever comes first).

On any change-monitoring communication error, assume there might be more work and restart the monitoring process.

Other exceptions abort the blocking_poll() call.
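
The scheduling rule described above can be sketched without any network code. In this illustration a `threading.Event` stands in for the ERMrest change notice, and `max_rounds` is a hypothetical bound added so the loop can terminate in a demonstration. Both are assumptions of the sketch, and coalesce_seconds handling and error recovery are left out.

```python
import threading


def poll_loop(look_for_work, change_notice, polling_seconds=600, max_rounds=None):
    """Model of the documented scheduling rule.

    change_notice: a threading.Event standing in for a change notice.
    max_rounds: hypothetical limit on idle waits, for demonstration only.
    Returns the number of rounds completed.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        rounds += 1
        # A True result means "assume there is more work": look again at once.
        while look_for_work() is True:
            pass
        # Non-True result: wait for a notice or the timeout, whichever is first.
        change_notice.wait(timeout=polling_seconds)
        change_notice.clear()
    return rounds
```

Another thread calling `change_notice.set()` wakes the loop early, which mirrors how a change notice would cut the polling interval short.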

state_change_once(query_datapath, update_datapath, row_transform_func, idle_etag=None)[source]

Perform generic conditional state update via GET-PUT sequence.

Arguments:
  • query_datapath – a query for candidate rows
  • update_datapath – an update to consume update rows
  • row_transform_func – maps candidate to update rows
  • idle_etag – no-op if table is still in this state

Returns: (idle_etag, [(candidate, update)…])
  • idle_etag – value to thread to future calls
  • [(candidate, update)…] – each row that was updated

Exceptions from the transform or update process will abort without returning results.

  1. GET query_datapath to get candidate row(s)
  2. apply row_transform_func(row) to get updated content
  3. PUT update_datapath to deliver transformed content – discards rows transformed to None

Uses optimistic concurrency control with ETag, If-Match, etc. for safety.
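
The GET-transform-PUT sequence can be illustrated against an in-memory stand-in for a table. Everything in this sketch is hypothetical: `FakeTable`, its version-number ETag, the `select` predicate (which replaces a query datapath), and the choice to return None as idle_etag after an update so that the next call looks for work again. It is not the library's implementation.

```python
class FakeTable:
    """In-memory stand-in for a catalog table with a version-based ETag."""

    def __init__(self, rows):
        self.rows = {r["id"]: dict(r) for r in rows}
        self.version = 0

    def get(self):
        # Return the current ETag together with copies of all rows.
        return str(self.version), [dict(r) for r in self.rows.values()]

    def put(self, updates, if_match):
        # Reject the write if the table changed since the matching GET.
        if if_match != str(self.version):
            raise ValueError("precondition failed: table changed since GET")
        for u in updates:
            self.rows[u["id"]].update(u)
        self.version += 1


def state_change_once_sketch(table, select, row_transform_func, idle_etag=None):
    etag, rows = table.get()                      # 1. GET candidate rows
    if etag == idle_etag:
        return idle_etag, []                      # unchanged table: no-op
    pairs = [(c, row_transform_func(c)) for c in rows if select(c)]  # 2. transform
    pairs = [(c, u) for c, u in pairs if u is not None]  # drop rows mapped to None
    if not pairs:
        return etag, []                           # idle: remember this state
    table.put([u for _, u in pairs], if_match=etag)      # 3. conditional PUT
    return None, pairs                            # sketch choice: look again next time
```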

Module contents

exception deriva.core.ConcurrentUpdate[source]

Bases: ValueError

exception deriva.core.NotModified[source]

Bases: ValueError

deriva.core.resource_path(relative_path, default='/Users/cristinawilliams/Sites/division-docs/deriva-docs/docs-src')[source]

Required to find bundled data at runtime in PyInstaller single-file executable mode.

deriva.core.urlquote(s, safe='')[source]

Quote all reserved characters according to RFC3986 unless told otherwise.

The standard library's urllib quote function (urllib.parse.quote in Python 3) has a surprising default, safe='/', which excludes ‘/’ from quoting even though it is a reserved character. We would never want this when encoding elements in Deriva REST API URLs, so this wrapper changes the default to have no declared safe characters.
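
The effect of the changed default is easy to see with the standard library alone. The sketch below assumes Python 3; `urlquote_sketch` is a hypothetical stand-in that mirrors the documented signature rather than the library's actual code.

```python
from urllib.parse import quote


def urlquote_sketch(s, safe=""):
    """Percent-encode s, treating no reserved character as safe by default."""
    return quote(s, safe=safe)


# The standard default leaves "/" alone; the wrapper's default encodes it.
print(quote("a/b"))                  # a/b
print(urlquote_sketch("a/b"))        # a%2Fb
print(urlquote_sketch("Foo Bar:1"))  # Foo%20Bar%3A1
```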

deriva.core.get_credential(host, credential_file=DEFAULT_CREDENTIAL_FILE)[source]
deriva.core.read_credential(credential_file=DEFAULT_CREDENTIAL_FILE, create_default=False, default=DEFAULT_CREDENTIAL)[source]
deriva.core.write_credential(credential_file=DEFAULT_CREDENTIAL_FILE, credential=DEFAULT_CREDENTIAL)[source]
deriva.core.read_config(config_file=DEFAULT_CONFIG_FILE, create_default=False, default=DEFAULT_CONFIG)[source]
deriva.core.write_config(config_file=DEFAULT_CONFIG_FILE, config=DEFAULT_CONFIG)[source]
core.DEFAULT_CONFIG_PATH = System-dependent default path to the configuration directory.
core.DEFAULT_CONFIG_FILE = System-dependent default path to the config file.
core.DEFAULT_CREDENTIAL_FILE = System-dependent default path to the credential file.