{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# DataPath :: Data Update Example\n", "This notebook demonstrates how to perform simple data manipulations." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from deriva.core import ErmrestCatalog, get_credential" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This example uses a development server with a throw away catalog. You *will not* have sufficient permissions to be able to run this example. This notebook is for documentation purpose only." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "scheme = 'https'\n", "hostname = 'dev.facebase.org'\n", "catalog_number = 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use DERIVA-Auth to get a `credential` or use `None` if your catalog allows anonymous access." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "credential = get_credential(hostname)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, connect to your catalog and the `pathbuilder` interface for the catalog." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "assert scheme == 'http' or scheme == 'https', \"Invalid http scheme used.\"\n", "assert isinstance(hostname, str), \"Hostname not set.\"\n", "assert isinstance(catalog_number, int), \"Invalid catalog number\"\n", "catalog = ErmrestCatalog(scheme, hostname, catalog_number, credential)\n", "pb = catalog.getPathBuilder()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this example, we will create or modify entities of the \"Dataset\" table of a catalog that uses the FaceBase data model." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Table name: 'dataset'\n", "List of columns:\n", " Column name: 'id'\tType: serial4\tComment: 'None'\n", " Column name: 'accession'\tType: text\tComment: 'None'\n", " Column name: 'title'\tType: text\tComment: 'None'\n", " Column name: 'project'\tType: int8\tComment: 'None'\n", " Column name: 'funding'\tType: text\tComment: 'None'\n", " Column name: 'summary'\tType: text\tComment: 'None'\n", " Column name: 'description'\tType: markdown\tComment: 'None'\n", " Column name: 'view_gene_summary'\tType: text\tComment: 'None'\n", " Column name: 'mouse_genetic'\tType: text\tComment: 'None'\n", " Column name: 'human_anatomic'\tType: text\tComment: 'None'\n", " Column name: 'study_design'\tType: markdown\tComment: 'None'\n", " Column name: 'release_date'\tType: date\tComment: 'None'\n", " Column name: 'status'\tType: text\tComment: 'None'\n", " Column name: 'gene_summary'\tType: int4\tComment: 'None'\n", " Column name: 'thumbnail'\tType: int4\tComment: 'None'\n", " Column name: 'show_in_jbrowse'\tType: boolean\tComment: 'None'\n", " Column name: '_keywords'\tType: text\tComment: 'None'\n", " Column name: 'RID'\tType: ermrest_rid\tComment: 'System-generated unique row ID.'\n", " Column name: 'RCB'\tType: ermrest_rcb\tComment: 'System-generated row created by user provenance.'\n", " Column name: 'RMB'\tType: ermrest_rmb\tComment: 'System-generated row modified by user provenance.'\n", " Column name: 'RCT'\tType: ermrest_rct\tComment: 'System-generated row creation timestamp.'\n", " Column name: 'RMT'\tType: ermrest_rmt\tComment: 'System-generated row modification timestamp'" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dataset = pb.isa.dataset\n", "dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Insert example\n", "Here we will insert an entity into the dataset table." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": true }, "outputs": [], "source": [ "new_entity = {\n", " 'title': 'A test dataset by derivapy', \n", " 'description': 'This was created by the deriva-py API.',\n", " 'project': 311\n", "}\n", "entities = dataset.insert([new_entity], defaults={'id', 'accession'})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The insert operation returns the inserted entities, which now have any system generated attributes filled in." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'RCB': 'https://auth.globus.org/bb256144-d274-11e5-adb1-13a4cc43acbd',\n", " 'RCT': '2018-05-25T14:16:20.951563-07:00',\n", " 'RID': 572108,\n", " 'RMB': 'https://auth.globus.org/bb256144-d274-11e5-adb1-13a4cc43acbd',\n", " 'RMT': '2018-05-25T14:16:20.951563-07:00',\n", " '_keywords': None,\n", " 'accession': 'FB00000974',\n", " 'description': 'This was created by the deriva-py API.',\n", " 'funding': None,\n", " 'gene_summary': None,\n", " 'human_anatomic': None,\n", " 'id': 14219,\n", " 'mouse_genetic': None,\n", " 'project': 311,\n", " 'release_date': None,\n", " 'show_in_jbrowse': None,\n", " 'status': None,\n", " 'study_design': None,\n", " 'summary': None,\n", " 'thumbnail': None,\n", " 'title': 'A test dataset by derivapy',\n", " 'view_gene_summary': None}]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(entities)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Update example\n", "Here we will change the description for the entity we inserted and update it in the catalog." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": true }, "outputs": [], "source": [ "entities[0]['description'] = 'A test dataset that was updated by derivapy'" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "updated_entities = dataset.update(entities)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similar to the insert operation, the update operation also returns the updated entities. Notice that the system-managed 'RMT' (Row Modified Timestamp) attribute has been update too." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'RCB': 'https://auth.globus.org/bb256144-d274-11e5-adb1-13a4cc43acbd',\n", " 'RCT': '2018-05-25T14:16:20.951563-07:00',\n", " 'RID': 572108,\n", " 'RMB': 'https://auth.globus.org/bb256144-d274-11e5-adb1-13a4cc43acbd',\n", " 'RMT': '2018-05-25T14:16:25.63306-07:00',\n", " '_keywords': None,\n", " 'accession': 'FB00000974',\n", " 'description': 'A test dataset that was updated by derivapy',\n", " 'funding': None,\n", " 'gene_summary': None,\n", " 'human_anatomic': None,\n", " 'id': 14219,\n", " 'mouse_genetic': None,\n", " 'project': 311,\n", " 'release_date': None,\n", " 'show_in_jbrowse': None,\n", " 'status': None,\n", " 'study_design': None,\n", " 'summary': None,\n", " 'thumbnail': None,\n", " 'title': 'A test dataset by derivapy',\n", " 'view_gene_summary': None}]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(updated_entities)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Update with custom correlation and targets specified\n", "You can also specify which columns to use to correlate the input with the existing rows in the table and which columsn to be the targets of the update. Per the ERMrest protocol, extra data in the update payload (`entities`) will be ignored. The inputs must be `iterable`s of strings or objects that implement the `__str__` method." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": true }, "outputs": [], "source": [ "entities[0]['description'] = 'Yet another update using derivapy'\n", "entities[0]['title'] = 'And a title change'\n", "updated_entities = dataset.update(entities, [dataset.id], [dataset.description, 'title'])" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'description': 'Yet another update using derivapy',\n", " 'id': 14219,\n", " 'title': 'And a title change'}]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(updated_entities)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Delete example\n", "Unlike `insert` and `update` which are performed within the context of a table, the `delete` operation is performed within the context of a data path.\n", "\n", "We know the `RID` from above, which is a single-column key for the entities in the `dataset` (and any other EMRrest) table. We can use this attribute to form a path to the newly inserted and updated entity.\n", "\n", "Note: Any filters could be used in this example; we do not have to use a key column only. We use the key only because we want to delete that specific entity which we just created. If we wanted to, we could link addition tables and apply additional filters to delete entities computed from a _complex_ path." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": true }, "outputs": [], "source": [ "path = dataset.filter(dataset.RID == entities[0]['RID'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "On successful delete, no content will be returned." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": true }, "outputs": [], "source": [ "path.delete()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.2" } }, "nbformat": 4, "nbformat_minor": 2 }