REST API reference: deriva/export
The deriva/export service endpoint provides export functionality from an ERMrest catalog via a REST API.
The service is generally co-located on the same server as the ERMrest catalog for which it provides data services.
Exporting Files
This API endpoint performs one or more queries against an ERMrest catalog and creates a corresponding individual result file for each query. The result files can then be retrieved via GET. URLs for the result files are returned in the response body as Content-Type: text/uri-list.
Export file(s)
Executes one or more ERMrest queries and writes the results to individual files.
URL
/deriva/export/file
Method:
POST
URL Params
None
Data Params
The input data is composed of a JSON object with the following form:
root (object)
Variable | Type | Inclusion | Description |
---|---|---|---|
catalog | catalog | required | A catalog object. See below. |
catalog (object)
Variable | Type | Inclusion | Description |
---|---|---|---|
host | string | required | The hostname (and port) of the ERMrest service to query. |
catalog_id | string | optional | The catalog identifier, e.g. 1. |
username | string | optional | If username is not specified, authentication is assumed to come from the webauthn authentication context stored in the caller's cookie. |
password | string | optional | The password field is only used when username is specified. |
queries | array[query] | required | An array of query objects. See below. |
query (object)
Parent Object | Variable | Type | Inclusion | Description | Interpolatable |
---|---|---|---|---|---|
query | processor | string | required | A string value used to select one of the built-in query output processor formats. Valid values are env, csv, json, json-stream, download, or fetch. | No |
query | processor_type | string | optional | A fully qualified Python class name declaring an external processor class instance to use. If this parameter is present, it OVERRIDES the default value mapped to the specified processor. This class MUST be derived from the base class deriva.transfer.download.processors.BaseDownloadProcessor. For example, "processor_type": "deriva.transfer.download.processors.CSVDownloadProcessor". | No |
query | processor_params | object | required | An extensible JSON object that contains processor implementation-specific parameters. | No |
processor_params | query_path | string | required | A string representing the actual ERMrest query path to be used in the HTTP(S) GET request. It SHOULD already be percent-encoded per RFC 3986 if it contains any characters outside of the unreserved set. | Yes |
processor_params | output_path | string | required | A POSIX-compliant path fragment indicating the target location of the retrieved data relative to the specified base download directory. | Yes |
processor_params | output_filename | string | optional | A POSIX-compliant path fragment indicating the OVERRIDE filename of the retrieved data relative to the specified base download directory and the value of output_path, if any. | Yes |
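The object layout described in the tables above can be assembled programmatically. The following is a minimal sketch, not part of the service: `buildFileExportRequest` and its `options` argument are hypothetical names, and the checks cover only the fields the tables mark as required.

```javascript
// Hypothetical helper (not part of the service): assembles the request body
// for POST /deriva/export/file and checks the fields marked "required" above.
function buildFileExportRequest(host, queries, options) {
  options = options || {};
  if (typeof host !== "string" || host === "") {
    throw new Error("catalog.host is required");
  }
  if (!Array.isArray(queries)) {
    throw new Error("catalog.queries must be an array of query objects");
  }
  queries.forEach(function (query, i) {
    var params = query.processor_params;
    if (typeof query.processor !== "string") {
      throw new Error("queries[" + i + "].processor is required");
    }
    if (!params || typeof params.query_path !== "string" ||
        typeof params.output_path !== "string") {
      throw new Error("queries[" + i + "].processor_params needs query_path and output_path");
    }
  });

  var catalog = { host: host, queries: queries };
  if (options.catalog_id !== undefined) {
    catalog.catalog_id = String(options.catalog_id);
  }
  // password is only used when username is specified, so it is dropped otherwise.
  if (options.username) {
    catalog.username = options.username;
    catalog.password = options.password || "";
  }
  return { catalog: catalog };
}

var request = buildFileExportRequest("localhost:8080", [
  {
    processor: "csv",
    processor_params: {
      query_path: "/entity/A:=dev:subject/A1:=snp_v/snp_id=rs6265",
      output_path: "genotypes"
    }
  }
], { catalog_id: 1 });
console.log(JSON.stringify(request, null, 2));
```

Validating on the client side is a design choice made here for illustration; the service presumably reports malformed input itself with a 400 response.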
Success Response:
Code: 200
Content:
http://localhost:8080/deriva/export/file/9ad15e5b-9c2c-4faf-8829-05fa8252c8bc/genotypes.csv
http://localhost:8080/deriva/export/file/9ad15e5b-9c2c-4faf-8829-05fa8252c8bc/phenotypes.csv
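Because the response body is a text/uri-list rather than JSON, a client has to split it into individual URLs itself. The sketch below shows one way to do that; `parseUriList` is a hypothetical helper name, and the handling of `#` comment lines follows the general text/uri-list convention rather than anything this service is documented to emit.

```javascript
// Parse a text/uri-list response body into an array of URL strings.
// In the text/uri-list format, lines beginning with "#" are comments and
// lines are terminated by CRLF; a bare LF is tolerated here as well.
function parseUriList(body) {
  return body
    .split(/\r?\n/)
    .map(function (line) { return line.trim(); })
    .filter(function (line) {
      return line.length > 0 && line.charAt(0) !== "#";
    });
}

// Sample body mirroring the success response shown above.
var sampleBody =
  "http://localhost:8080/deriva/export/file/9ad15e5b-9c2c-4faf-8829-05fa8252c8bc/genotypes.csv\r\n" +
  "http://localhost:8080/deriva/export/file/9ad15e5b-9c2c-4faf-8829-05fa8252c8bc/phenotypes.csv\r\n";

var urls = parseUriList(sampleBody);
urls.forEach(function (url) {
  console.log(url);
});
```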
Error Responses:
- 404: NOT FOUND
- 401: UNAUTHORIZED
- 400: BAD REQUEST
- 500: INTERNAL SERVER ERROR
Sample Call:
var exportParameters =
{
  "catalog":
  {
    "host": "http://localhost:8080",
    "catalog_id": "1",
    "username": "devuser",
    "password": "devpass",
    "queries": [
      {
        "processor": "csv",
        "processor_params": {
          "query_path": "/entity/A:=dev:subject/A1:=snp_v/snp_id=rs6265",
          "output_path": "genotypes"
        }
      },
      {
        "processor": "csv",
        "processor_params": {
          "query_path": "/entity/A:=dev:subject/A1:=snp_v/snp_id=rs6265/$A/B:=dev:subject_phenotypes_v",
          "output_path": "phenotypes"
        }
      }
    ]
  }
};
// Send the parameters as a JSON request body; the response is a text/uri-list, not JSON.
$.ajax({
  url: "/deriva/export/file",
  type: "POST",
  contentType: "application/json",
  data: JSON.stringify(exportParameters),
  dataType: "text",
  success: function(r) {
    console.log(r);
  }
});
Retrieve exported file(s)
Retrieves a file previously created by a POST.
URL
/deriva/export/file/<id>/<filename>
Method:
GET
URL Params
Required:
id=[string]
Optional:
filename=[string]
- This argument is required when the uri-list returned from POST contains more than one entry. If it is not specified and there is more than one file result, a 400 Bad Request is returned.
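The rule for when filename may be omitted can be checked on the client before issuing the GET. The following is a minimal sketch under that reading; `exportedFileUrl` is a hypothetical helper, and `entryCount` is assumed to be the number of entries in the uri-list returned by the POST.

```javascript
// Build the retrieval path for an exported file.
// entryCount is the number of URLs in the uri-list returned by the POST.
function exportedFileUrl(id, entryCount, filename) {
  if (!id) {
    throw new Error("id is required");
  }
  var base = "/deriva/export/file/" + encodeURIComponent(id);
  if (filename) {
    return base + "/" + encodeURIComponent(filename);
  }
  if (entryCount > 1) {
    // The service would answer 400 Bad Request in this case, so fail early.
    throw new Error("filename is required when the export produced more than one file");
  }
  return base;
}

var exportId = "9ad15e5b-9c2c-4faf-8829-05fa8252c8bc";
console.log(exportedFileUrl(exportId, 2, "genotypes.csv"));
console.log(exportedFileUrl(exportId, 1));
```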
Data Params
None
Error Responses:
- 404: NOT FOUND
- 403: FORBIDDEN
- 401: UNAUTHORIZED
- 400: BAD REQUEST
- 500: INTERNAL SERVER ERROR
Sample Call:
$.ajax({
  url: "/deriva/export/file/9ad15e5b-9c2c-4faf-8829-05fa8252c8bc/genotypes.csv",
  type: "GET",
  success: function(r) {
    console.log(r);
  }
});
Retrieve log file for exported file(s)
Retrieves the log file generated by an invocation of POST.
URL
/deriva/export/file/<id>/log
Method:
GET
Data Params
None
Error Responses:
- 404: NOT FOUND
- 403: FORBIDDEN
- 401: UNAUTHORIZED
- 400: BAD REQUEST
- 500: INTERNAL SERVER ERROR
Sample Call:
$.ajax({
  url: "/deriva/export/file/9ad15e5b-9c2c-4faf-8829-05fa8252c8bc/log",
  type: "GET",
  success: function(r) {
    console.log(r);
  }
});
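Since the log URL shares the export id with the result URLs, it can be derived from any URL returned by the POST. The sketch below assumes the id is the path segment directly after /deriva/export/file/ or /deriva/export/bdbag/, as the URL patterns in this reference suggest; `logUrlFor` is a hypothetical helper name.

```javascript
// Derive the log URL from a result URL returned by an export POST.
// Assumes the export id is the path segment following "file/" or "bdbag/".
function logUrlFor(resultUrl) {
  var pattern = /^(.*\/deriva\/export\/(?:file|bdbag)\/[^\/]+)(?:\/.*)?$/;
  var match = pattern.exec(resultUrl);
  if (!match) {
    throw new Error("not a deriva/export result URL: " + resultUrl);
  }
  return match[1] + "/log";
}

console.log(logUrlFor(
  "http://localhost:8080/deriva/export/file/9ad15e5b-9c2c-4faf-8829-05fa8252c8bc/genotypes.csv"
));
```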
Exporting Bags
This API endpoint performs one or more queries against an ERMrest catalog and creates a BDBag archive file, which can then be retrieved via GET. The URL for the result bag is returned in the response body as Content-Type: text/uri-list.
Export bag
Executes one or more ERMrest queries and writes the results to a BDBag.
URL
/deriva/export/bdbag
Method:
POST
URL Params
None
Data Params
The input data is composed of a JSON object with the following form:
root (object)
Variable | Type | Inclusion | Description |
---|---|---|---|
bag | bag | required | A bag object. See below. |
catalog | catalog | required | A catalog object. See below. |
bag (object)
Variable | Type | Inclusion | Description |
---|---|---|---|
bag_name | string | required | The base file name of the bag. An appropriate extension will be added to the base depending on the archive type selected. |
bag_archiver | string, enum ["zip", "tgz", "bz2", "tar"] | required | The archive format used to serialize the result bag. |
bag_metadata | object | optional | A simple 'dictionary' object consisting of key-value pairs. The only supported primitive type for values is string. The metadata object will be written directly to the bag's bag-info.txt file. |
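The constraints on the bag object can be checked before sending a request. The following is a minimal sketch: `validateBag` is a hypothetical helper, and it assumes the enumerated values "zip", "tgz", "bz2", and "tar" apply to bag_archiver (the archive format) rather than to bag_name.

```javascript
// Archive formats listed in the bag table; assumed to constrain bag_archiver.
var BAG_ARCHIVERS = ["zip", "tgz", "bz2", "tar"];

// Return a list of problems found in a bag object (empty when it looks valid).
function validateBag(bag) {
  var errors = [];
  if (typeof bag.bag_name !== "string" || bag.bag_name === "") {
    errors.push("bag_name is required and must be a string");
  }
  if (BAG_ARCHIVERS.indexOf(bag.bag_archiver) === -1) {
    errors.push("bag_archiver must be one of: " + BAG_ARCHIVERS.join(", "));
  }
  var metadata = bag.bag_metadata;
  if (metadata !== undefined) {
    if (metadata === null || typeof metadata !== "object" || Array.isArray(metadata)) {
      errors.push("bag_metadata must be an object of key-value pairs");
    } else {
      // Only string values are supported for metadata entries.
      Object.keys(metadata).forEach(function (key) {
        if (typeof metadata[key] !== "string") {
          errors.push("bag_metadata." + key + " must be a string");
        }
      });
    }
  }
  return errors;
}

console.log(validateBag({
  bag_name: "sample-bag",
  bag_archiver: "zip",
  bag_metadata: { "Internal-Sender-Identifier": "USC-ISI-IRSD" }
}));
```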
catalog (object)
Variable | Type | Inclusion | Description |
---|---|---|---|
host | string | required | The hostname (and port) of the ERMrest service to query. |
catalog_id | string | optional | The catalog identifier, e.g. 1. |
username | string | optional | If username is not specified, authentication is assumed to come from the webauthn authentication context stored in the caller's cookie. |
password | string | optional | The password field is only used when username is specified. |
queries | array[query] | required | An array of query objects. See below. |
query (object)
Parent Object | Variable | Type | Inclusion | Description | Interpolatable |
---|---|---|---|---|---|
query | processor | string | required | A string value used to select one of the built-in query output processor formats. Valid values are env, csv, json, json-stream, download, or fetch. | No |
query | processor_type | string | optional | A fully qualified Python class name declaring an external processor class instance to use. If this parameter is present, it OVERRIDES the default value mapped to the specified processor. This class MUST be derived from the base class deriva.transfer.download.processors.BaseDownloadProcessor. For example, "processor_type": "deriva.transfer.download.processors.CSVDownloadProcessor". | No |
query | processor_params | object | required | An extensible JSON object that contains processor implementation-specific parameters. | No |
processor_params | query_path | string | required | A string representing the actual ERMrest query path to be used in the HTTP(S) GET request. It SHOULD already be percent-encoded per RFC 3986 if it contains any characters outside of the unreserved set. | Yes |
processor_params | output_path | string | required | A POSIX-compliant path fragment indicating the target location of the retrieved data relative to the specified base download directory. | Yes |
processor_params | output_filename | string | optional | A POSIX-compliant path fragment indicating the OVERRIDE filename of the retrieved data relative to the specified base download directory and the value of output_path, if any. | Yes |
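The requirement that query_path be percent-encoded applies to the literal values embedded in the path, not to the ERMrest path syntax itself. The sketch below encodes a single value so that only RFC 3986 unreserved characters remain; `encodeSegment` is a hypothetical helper, and the `name` column and its value are made up for illustration.

```javascript
// Percent-encode one literal value for use inside a query_path.
// encodeURIComponent leaves ! ' ( ) * unescaped, so those are escaped
// explicitly to keep only RFC 3986 unreserved characters in the result.
function encodeSegment(value) {
  return encodeURIComponent(value).replace(/[!'()*]/g, function (c) {
    return "%" + c.charCodeAt(0).toString(16).toUpperCase();
  });
}

// "name" is a hypothetical column; only the value is encoded, so the
// path separators and the ":=" alias syntax stay intact.
var queryPath = "/entity/A:=dev:subject/name=" + encodeSegment("Smith & Sons (UK)");
console.log(queryPath);
```

Encoding the whole path in one call would also escape the `/`, `:`, and `=` characters that ERMrest relies on, which is why the sketch encodes values individually.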
Success Response:
Code: 200
Content:
http://localhost:8080/deriva/export/bdbag/9ad15e5b-9c2c-4faf-8829-05fa8252c8bc
Error Responses:
- 404: NOT FOUND
- 401: UNAUTHORIZED
- 400: BAD REQUEST
- 500: INTERNAL SERVER ERROR
Sample Call:
var exportParameters =
{
  "bag":
  {
    "bag_name": "sample-bag",
    "bag_archiver": "zip",
    "bag_metadata":
    {
      "Source-Organization": "USC Information Sciences Institute, Informatics Systems Research Division",
      "Contact-Name": "Mike D'Arcy",
      "External-Description": "A bag containing a sample PheWas cohort for downstream analysis.",
      "Internal-Sender-Identifier": "USC-ISI-IRSD"
    }
  },
  "catalog":
  {
    "host": "https://localhost:8080",
    "catalog_id": "1",
    "username": "",
    "password": "",
    "queries":
    [
      {
        "processor": "csv",
        "processor_params": {
          "query_path": "/entity/A:=pnc:subject/A1:=snp_v/snp_id=rs6265/$A/B:=pnc:metrics_v",
          "output_path": "metrics"
        }
      },
      {
        "processor": "csv",
        "processor_params": {
          "query_path": "/entity/A:=pnc:subject/A1:=snp_v/snp_id=rs6265",
          "output_path": "genotypes"
        }
      },
      {
        "processor": "csv",
        "processor_params": {
          "query_path": "/entity/A:=pnc:subject/A1:=snp_v/snp_id=rs6265/$A/B:=pnc:subject_phenotypes_v",
          "output_path": "phenotypes"
        }
      },
      {
        "processor": "fetch",
        "processor_params": {
          "query_path": "/attribute/A:=pnc:subject/A1:=snp_v/snp_id=rs6265/$A/B:=pnc:image_files/filename::ciregexp::0mm.mgh/url:=B:uri,length:=B:bytes,filename:=B:filepath,sha256:=B:sha256sum",
          "output_path": "images"
        }
      }
    ]
  }
};
// Send the parameters as a JSON request body; the response is a text/uri-list, not JSON.
$.ajax({
  url: "/deriva/export/bdbag",
  type: "POST",
  contentType: "application/json",
  data: JSON.stringify(exportParameters),
  dataType: "text",
  success: function(r) {
    console.log(r);
  }
});
Retrieve an exported bag
Retrieves a bag previously created by a POST.
URL
/deriva/export/bdbag/<id>
Method:
GET
Data Params
None
Error Responses:
- 404: NOT FOUND
- 403: FORBIDDEN
- 401: UNAUTHORIZED
- 400: BAD REQUEST
- 500: INTERNAL SERVER ERROR
Sample Call:
$.ajax({
  url: "/deriva/export/bdbag/9ad15e5b-9c2c-4faf-8829-05fa8252c8bc",
  type: "GET",
  success: function(r) {
    console.log(r);
  }
});
Retrieve log file for exported bag
Retrieves the log file generated by an invocation of POST.
URL
/deriva/export/bdbag/<id>/log
Method:
GET
Data Params
None
Error Responses:
- 404: NOT FOUND
- 403: FORBIDDEN
- 401: UNAUTHORIZED
- 400: BAD REQUEST
- 500: INTERNAL SERVER ERROR
Sample Call:
$.ajax({
  url: "/deriva/export/bdbag/9ad15e5b-9c2c-4faf-8829-05fa8252c8bc/log",
  type: "GET",
  success: function(r) {
    console.log(r);
  }
});