Managing Data in DERIVA with deriva-client
The deriva-client package bundles an application suite of Python-based
client software for use with the DERIVA platform. These tools provide functions such as:
- Authentication services for programmatic and non-browser-based application access.
- Bulk import and export of catalog assets and (meta)data.
- Catalog configuration, mutation and administration.
- Tools for working with bdbags, a file container format used by DERIVA for the import and export of data.
Installed Applications
Command-Line Interface (CLI) applications
Executable Name | Description |
---|---|
bdbag | The bdbag application provides a variety of functions for working with BagIt file archives, a file packaging format used by DERIVA for data export. This format is created by the DERIVA web applications when exporting data sets using the BDBAG option. |
bdbag-utils | The bdbag-utils application simplifies some of the more repetitive and programmable tasks associated with creating and maintaining bags. |
deriva-acl-config | The deriva-acl-config utility reads a configuration file and uses it to set ACLs for an ERMRest catalog (or for a schema or table within that catalog). |
deriva-annotation-config | The deriva-annotation-config utility reads a configuration file and uses it to set annotations for an ERMRest catalog (or for a schema or table within that catalog). |
deriva-annotation-dump | Outputs the current set of annotations in use for the specified catalog in JSON format. |
deriva-annotation-rollback | Rolls back the entire annotation hierarchy for the specified catalog to a point in time specified by a catalog snapshot ID. |
deriva-catalog-config | The deriva-catalog-config application provides functions to set up catalog schema and tables with a standard baseline annotation and ACL configuration. |
deriva-catalog-dump | The deriva-catalog-dump application provides functions to dump the current configuration of a catalog as a set of deriva-py scripts. The scripts are pure deriva-py and have placeholder variables to set annotations, ACLs, and ACL bindings. |
deriva-csv | The deriva-csv application provides functions to upload CSV or other table-like data to a catalog, with options to create a new table, validate input data, and upload data. |
deriva-download-cli | The deriva-download-cli is used for orchestrating the bulk export of tabular data (stored in ERMRest catalogs) and the download of asset data (stored in Hatrac, or another supported HTTP-accessible object store). |
deriva-hatrac-cli | The deriva-hatrac-cli is a command-line utility for interacting directly with the DERIVA Hatrac object store. |
deriva-upload-cli | The deriva-upload-cli provides batch upload functionality for both catalog (ERMRest) and asset (Hatrac) data. This application is generally used for automating the bulk transfer of data to DERIVA servers. |
deriva-sitemap-cli | The deriva-sitemap-cli utility creates a sitemap containing record entries for all publicly-readable rows in one or more ERMRest tables. |
deriva-globus-auth-utils | The deriva-globus-auth-utils provides numerous utility functions for working with the Globus Auth API, in addition to Globus Auth Native App login functionality. |
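As a quick post-install check, the executable names listed in the table above can be looked up on the current PATH. The following is a minimal sketch using only the Python standard library; the helper name `find_deriva_tools` is illustrative and is not part of deriva-client.

```python
import shutil

# Executable names copied from the CLI table above.
CLI_TOOLS = [
    "bdbag",
    "bdbag-utils",
    "deriva-acl-config",
    "deriva-annotation-config",
    "deriva-annotation-dump",
    "deriva-annotation-rollback",
    "deriva-catalog-config",
    "deriva-catalog-dump",
    "deriva-csv",
    "deriva-download-cli",
    "deriva-hatrac-cli",
    "deriva-upload-cli",
    "deriva-sitemap-cli",
    "deriva-globus-auth-utils",
]


def find_deriva_tools(names=CLI_TOOLS):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_deriva_tools().items():
        print("{:28} {}".format(name, path or "not found"))
```

If a tool is reported as not found after installation, the most likely cause is that the virtual environment is not active or that the user-level scripts directory is not on PATH.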
Graphical User Interface (GUI) applications
Executable Name | Application Name | Description |
---|---|---|
deriva-auth | DERIVA Authentication Agent | Provides credential authentication and refresh services for one or more DERIVA servers. This application is intended to be run in the background after the user completes the login sequence for each server. |
deriva-upload | DERIVA Upload Utility | Provides batch upload functionality for both catalog and asset data. This application is an interactive tool used for the bulk transfer of data to DERIVA servers. |
Installer packages for Windows and MacOSX
Pre-packaged installers of deriva-client for Windows and MacOSX are
available.
These installer packages include a bundled Python interpreter and all
other software dependencies and are recommended for Windows and MacOSX
users who are looking for a more traditional “turnkey” installation that
does not require them to install Python and manage Python software package
installations.
Installing deriva-client from PyPI via pip
For users who already have the base Python interpreter installed and are
comfortable installing Python software via the pip application,
deriva-client can be easily installed along with all of its dependencies
directly from PyPI using basic pip commands. For those users who wish to
write programs against the various APIs included in deriva-client, this is
the recommended installation method.
Installation Prerequisites
- A Python 3.5.4 or greater system installation is required. The latest stable version of Python is recommended.
- Verify that the appropriate Python 3 interpreter can be invoked from a
command shell using the python3 command. This can be tested simply with the following command:
python3 --version
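The same prerequisite can also be checked from inside the interpreter, which is useful in setup scripts. This is a minimal sketch; the function name `meets_minimum` is an illustrative choice, and the minimum version is the one stated in the prerequisites above.

```python
import sys

# Minimum version taken from the prerequisites above.
MINIMUM = (3, 5, 4)


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the (major, minor, micro) version is at least `minimum`."""
    return tuple(version_info[:3]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print("Python {} satisfies the prerequisite".format(current))
    else:
        sys.exit("Python {} is older than the required 3.5.4".format(current))
```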
Installation Quickstart
The following commands can be used to perform a venv-based virtual
environment installation to the current working directory.
Mac/Linux
The following commands assume a BASH (or compatible) command shell is
used. For a different command interpreter (e.g. CSH), invoke the source
command on the appropriate activation script in the virtual environment’s bin
directory.
python3 -m venv ./deriva-client-venv
source ./deriva-client-venv/bin/activate
python3 -m pip install --upgrade pip setuptools wheel
pip install deriva-client
Important Note: For MacOSX users running Python 3.5.x with pip version < 9.0.3
If you encounter the following error:
Could not fetch URL https://pypi.python.org/simple/pip/:
There was a problem confirming the ssl certificate:
[SSL: TLSV1_ALERT_PROTOCOL_VERSION] tlsv1 alert protocol version (_ssl.c:720) - skipping
This error means that you cannot update pip, setuptools, and wheel
via the command provided above. You can work around this error by issuing the
following commands instead, and then continue with the installation procedure as described.
curl https://bootstrap.pypa.io/get-pip.py | python3
pip install --upgrade setuptools
Windows
The following commands assume a Windows Command Prompt command shell is used. For a
PowerShell shell, the activate.ps1 activation script should be invoked instead.
python3 -m venv .\deriva-client-venv
.\deriva-client-venv\Scripts\activate
python3 -m pip install --upgrade pip setuptools wheel
pip install deriva-client
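The quickstart steps for both platforms can also be scripted without activating the environment, by calling the environment's own interpreter directly. The sketch below uses the standard library `venv` module and assumes the default directory layout (`bin` on Mac/Linux, `Scripts` on Windows); the function names are illustrative and not part of deriva-client.

```python
import os
import subprocess
import venv
from pathlib import Path


def venv_python(env_dir):
    """Return the path of the interpreter inside a venv (the layout differs on Windows)."""
    env_dir = Path(env_dir)
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def install_deriva_client(env_dir="deriva-client-venv"):
    """Create a venv in `env_dir` and install deriva-client into it."""
    # Equivalent to: python3 -m venv ./deriva-client-venv
    venv.create(str(env_dir), with_pip=True)
    python = str(venv_python(env_dir))
    # Running pip through the venv's interpreter makes activation unnecessary.
    subprocess.check_call(
        [python, "-m", "pip", "install", "--upgrade", "pip", "setuptools", "wheel"]
    )
    subprocess.check_call([python, "-m", "pip", "install", "deriva-client"])


if __name__ == "__main__":
    install_deriva_client()
```

Activation is still needed afterwards to use the installed command-line tools by name from an interactive shell.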
IMPORTANT NOTE: Python virtual environments versus user environments
While a virtual environment installation is generally the safest way to install and isolate multiple software packages, it also must be activated before use and deactivated after use. If this requirement is too cumbersome, the recommended alternative is to install the software into a user environment instead. See the complete installation procedure below for more information.
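When it is unclear which kind of environment is currently in effect, the interpreter itself can report it. The following is a minimal sketch using the standard library; the function names are illustrative, and the `real_prefix` check is included on the assumption that older virtualenv releases may be in use.

```python
import site
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a venv or virtualenv."""
    # venv sets sys.base_prefix to the parent installation;
    # older virtualenv releases set sys.real_prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def describe_environment():
    """Summarize where packages would be installed for this interpreter."""
    return {
        "virtualenv": in_virtualenv(),
        "prefix": sys.prefix,
        # Target directory for `pip install --user`.
        "user_site": site.getusersitepackages(),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print("{:12} {}".format(key, value))
```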
Installation Procedure
- For MacOSX and Linux systems which include Python as a core part of the
operating system, it is highly recommended to install this software
into a virtual environment or a user environment, so that it does not interfere or conflict
with the operating system’s Python installation. The native Python3 venv module, the virtualenv package from PyPI, or the Anaconda Distribution environment are all suitable for use as virtual environments.
- Instead of using a virtual environment, it is also possible to
install the software into a user environment using the --user argument when invoking pip install.
- Recent versions of pip, setuptools, and wheel are recommended. If these components are already installed, updating them to the latest versions available is optional.
Installation Sequence
1. Create and/or activate the target virtual environment, if any. This step is specific to the type of virtual environment being used.
2. Update pip, setuptools, and wheel (optional).
   - For virtual environments, execute the following (ensure the environment is active):
     python -m pip install --upgrade pip setuptools wheel
   - For user environments, execute the following:
     python3 -m pip install --user --upgrade pip setuptools wheel
   - For Linux system Python installations, it is recommended to use the system’s package manager, such as dnf, apt, or yum, to update the following packages: python3-pip, python3-setuptools, and python3-wheel.
3. Install deriva-client directly from PyPI using the pip install command.
   - For virtual environments, execute the following (ensure the environment is active):
     pip install deriva-client
   - For user environments, execute the following:
     pip3 install --user deriva-client
   - For system-wide Python installations (only do this if you understand the complexities involved):
     pip3 install deriva-client
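The three install variants above differ only in whether the `--user` flag is passed. A minimal sketch of a helper that builds the matching command line is shown below; the function name and the `mode` values are illustrative assumptions, and the helper invokes pip through the running interpreter rather than through a `pip` or `pip3` executable.

```python
import sys


def pip_install_command(package="deriva-client", mode="venv"):
    """Build the pip command for a 'venv', 'user', or 'system' installation."""
    if mode not in ("venv", "user", "system"):
        raise ValueError("mode must be 'venv', 'user', or 'system'")
    # `<interpreter> -m pip` guarantees pip belongs to the interpreter in use.
    command = [sys.executable, "-m", "pip", "install"]
    if mode == "user":
        command.append("--user")
    command.append(package)
    return command


if __name__ == "__main__":
    print(" ".join(pip_install_command(mode="user")))
```

The resulting list can be passed to `subprocess.check_call` to perform the installation.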
IMPORTANT NOTES: Using pip to install software into system-wide Python locations
- Many newer Linux (as well as MacOSX) distributions contain both Python2
and Python3 installed alongside each other. In these environments, both
the python interpreter and pip are symbolically linked to the system default version, which in general results in python and pip being linked to the Python2 versions.
- Python3 versions are commonly accessed via python3 and pip3. If you are working outside of a Python3 virtual environment and installing either to the system-wide Python location (not recommended) or a user-based location (e.g. with the pip --user argument), then you must substitute pip3 for pip when issuing pip installation commands.
- Also note that when installing into the system Python location via pip on Linux/MacOSX, the commands must be run as root or the sudo command must be prefixed to the command line.
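To see which interpreter and pip each command name actually resolves to on a given system, the symbolic links described above can be followed programmatically. This is a minimal sketch using the standard library; the function name `resolve_commands` is illustrative.

```python
import os
import shutil


def resolve_commands(names=("python", "python3", "pip", "pip3")):
    """Map each command name to the real file it resolves to, or None if absent."""
    resolved = {}
    for name in names:
        path = shutil.which(name)
        # realpath follows symbolic links to the underlying executable.
        resolved[name] = os.path.realpath(path) if path else None
    return resolved


if __name__ == "__main__":
    for name, target in resolve_commands().items():
        print("{:8} -> {}".format(name, target or "not found"))
```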
Managing data with the datapath API (deriva-py)
The deriva-py package (part of deriva-client) also includes a
Python API for a programmatic interface for ERMRest.
The datapath module in particular is an interface for building ERMRest “data paths” and retrieving data from ERMRest catalogs. It also supports data manipulation (insert, update, delete). In its present form, the module provides a limited
programmatic interface to ERMRest.
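A minimal sketch of retrieving rows through a data path is shown below. It assumes that deriva-py is installed and that `ErmrestCatalog`, `get_credential`, and `getPathBuilder` behave as in current deriva-py releases; the hostname, catalog ID, schema name, and table name in the usage section are hypothetical placeholders and must be replaced with values from a real deployment.

```python
def fetch_rows(hostname, catalog_id, schema_name, table_name, limit=10):
    """Fetch up to `limit` rows from one table of an ERMRest catalog."""
    # Imported here so this module still loads when deriva-py is not installed.
    from deriva.core import ErmrestCatalog, get_credential

    # Looks up a stored credential for the host (may be None for anonymous access).
    credential = get_credential(hostname)
    catalog = ErmrestCatalog("https", hostname, catalog_id, credentials=credential)

    # The path builder exposes the catalog's schemas and tables.
    path_builder = catalog.getPathBuilder()
    table = path_builder.schemas[schema_name].tables[table_name]

    # Build a data path rooted at the table and retrieve whole entities.
    results = table.path.entities()
    results.fetch(limit=limit)
    return list(results)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for row in fetch_rows("deriva.example.org", "1", "isa", "dataset", limit=5):
        print(row)
```

Only retrieval is sketched here; the insert, update, and delete operations mentioned above are not shown.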
Reference Documentation
Source Code
The source code for the primary components of deriva-client can be found at the links below: