DataPath :: Data Update Example¶
This notebook demonstrates how to perform simple data manipulations.
[1]:
from deriva.core import ErmrestCatalog, get_credential
This example uses a development server with a throw away catalog. You will not have sufficient permissions to be able to run this example. This notebook is for documentation purpose only.
[2]:
scheme = 'https'
hostname = 'dev.facebase.org'
catalog_number = 1
Use DERIVA-Auth to get a credential
or use None
if your catalog allows anonymous access.
[3]:
credential = get_credential(hostname)
Now, connect to your catalog and the pathbuilder
interface for the catalog.
[4]:
assert scheme == 'http' or scheme == 'https', "Invalid http scheme used."
assert isinstance(hostname, str), "Hostname not set."
assert isinstance(catalog_number, int), "Invalid catalog number"
catalog = ErmrestCatalog(scheme, hostname, catalog_number, credential)
pb = catalog.getPathBuilder()
For this example, we will create or modify entities of the “Dataset” table of a catalog that uses the FaceBase data model.
[5]:
dataset = pb.isa.dataset
dataset
[5]:
Table name: 'dataset'
List of columns:
Column name: 'id' Type: serial4 Comment: 'None'
Column name: 'accession' Type: text Comment: 'None'
Column name: 'title' Type: text Comment: 'None'
Column name: 'project' Type: int8 Comment: 'None'
Column name: 'funding' Type: text Comment: 'None'
Column name: 'summary' Type: text Comment: 'None'
Column name: 'description' Type: markdown Comment: 'None'
Column name: 'view_gene_summary' Type: text Comment: 'None'
Column name: 'mouse_genetic' Type: text Comment: 'None'
Column name: 'human_anatomic' Type: text Comment: 'None'
Column name: 'study_design' Type: markdown Comment: 'None'
Column name: 'release_date' Type: date Comment: 'None'
Column name: 'status' Type: text Comment: 'None'
Column name: 'gene_summary' Type: int4 Comment: 'None'
Column name: 'thumbnail' Type: int4 Comment: 'None'
Column name: 'show_in_jbrowse' Type: boolean Comment: 'None'
Column name: '_keywords' Type: text Comment: 'None'
Column name: 'RID' Type: ermrest_rid Comment: 'System-generated unique row ID.'
Column name: 'RCB' Type: ermrest_rcb Comment: 'System-generated row created by user provenance.'
Column name: 'RMB' Type: ermrest_rmb Comment: 'System-generated row modified by user provenance.'
Column name: 'RCT' Type: ermrest_rct Comment: 'System-generated row creation timestamp.'
Column name: 'RMT' Type: ermrest_rmt Comment: 'System-generated row modification timestamp'
Insert example¶
Here we will insert an entity into the dataset table.
[6]:
new_entity = {
'title': 'A test dataset by derivapy',
'description': 'This was created by the deriva-py API.',
'project': 311
}
entities = dataset.insert([new_entity], defaults={'id', 'accession'})
The insert operation returns the inserted entities, which now have any system generated attributes filled in.
[7]:
list(entities)
[7]:
[{'RCB': 'https://auth.globus.org/bb256144-d274-11e5-adb1-13a4cc43acbd',
'RCT': '2018-05-25T14:16:20.951563-07:00',
'RID': 572108,
'RMB': 'https://auth.globus.org/bb256144-d274-11e5-adb1-13a4cc43acbd',
'RMT': '2018-05-25T14:16:20.951563-07:00',
'_keywords': None,
'accession': 'FB00000974',
'description': 'This was created by the deriva-py API.',
'funding': None,
'gene_summary': None,
'human_anatomic': None,
'id': 14219,
'mouse_genetic': None,
'project': 311,
'release_date': None,
'show_in_jbrowse': None,
'status': None,
'study_design': None,
'summary': None,
'thumbnail': None,
'title': 'A test dataset by derivapy',
'view_gene_summary': None}]
Update example¶
Here we will change the description for the entity we inserted and update it in the catalog.
[8]:
entities[0]['description'] = 'A test dataset that was updated by derivapy'
[9]:
updated_entities = dataset.update(entities)
Similar to the insert operation, the update operation also returns the updated entities. Notice that the system-managed ‘RMT’ (Row Modified Timestamp) attribute has been update too.
[10]:
list(updated_entities)
[10]:
[{'RCB': 'https://auth.globus.org/bb256144-d274-11e5-adb1-13a4cc43acbd',
'RCT': '2018-05-25T14:16:20.951563-07:00',
'RID': 572108,
'RMB': 'https://auth.globus.org/bb256144-d274-11e5-adb1-13a4cc43acbd',
'RMT': '2018-05-25T14:16:25.63306-07:00',
'_keywords': None,
'accession': 'FB00000974',
'description': 'A test dataset that was updated by derivapy',
'funding': None,
'gene_summary': None,
'human_anatomic': None,
'id': 14219,
'mouse_genetic': None,
'project': 311,
'release_date': None,
'show_in_jbrowse': None,
'status': None,
'study_design': None,
'summary': None,
'thumbnail': None,
'title': 'A test dataset by derivapy',
'view_gene_summary': None}]
Update with custom correlation and targets specified¶
You can also specify which columns to use to correlate the input with the existing rows in the table and which columsn to be the targets of the update. Per the ERMrest protocol, extra data in the update payload (entities
) will be ignored. The inputs must be iterable
s of strings or objects that implement the __str__
method.
[11]:
entities[0]['description'] = 'Yet another update using derivapy'
entities[0]['title'] = 'And a title change'
updated_entities = dataset.update(entities, [dataset.id], [dataset.description, 'title'])
[12]:
list(updated_entities)
[12]:
[{'description': 'Yet another update using derivapy',
'id': 14219,
'title': 'And a title change'}]
Delete example¶
Unlike insert
and update
which are performed within the context of a table, the delete
operation is performed within the context of a data path.
We know the RID
from above, which is a single-column key for the entities in the dataset
(and any other EMRrest) table. We can use this attribute to form a path to the newly inserted and updated entity.
Note: Any filters could be used in this example; we do not have to use a key column only. We use the key only because we want to delete that specific entity which we just created. If we wanted to, we could link addition tables and apply additional filters to delete entities computed from a complex path.
[13]:
path = dataset.filter(dataset.RID == entities[0]['RID'])
On successful delete, no content will be returned.
[14]:
path.delete()