REST API documentation

Introduction

This document describes the REST APIs provided by the Harmonizome. These APIs are for developers who want to integrate the Harmonizome's data into their own applications or who want to run batch scripts against the data.

If you're comfortable with Python, you can use the Python API wrapper instead of calling the URLs directly. See Quick Start in Python.

Quick Start in Python

  • Download this Python module: harmonizomeapi.py.
  • Call get with a supported Entity. For example:
>>> from harmonizomeapi import Harmonizome, Entity
>>> pid_dataset = Harmonizome.get(Entity.DATASET, name='PID pathways')
  • To get a list of entities of a given type, omit the name.
>>> entity_list = Harmonizome.get(Entity.GENE)
  • To get the next page of the same entity list, pass the response object to the next function.
>>> more = Harmonizome.next(entity_list)

Traversing the URLs

These APIs provide direct access to the data via URL paths and were designed to be used without any knowledge beyond the base URL. The base URL returns a list of the available data entities:

GET /Harmonizome/api/1.0

{
    "version": 1.0,
    "entities": [
        {
            "entity": "attribute",
            "href": "/api/1.0/attribute"
        },
        {
            "entity": "dataset",
            "href": "/api/1.0/dataset"
        },
        {
            "entity": "gene",
            "href": "/api/1.0/gene"
        },
        {
            "entity": "gene set",
            "href": "/api/1.0/gene_set"
        },
        ...
    ]
}

Any entity's href property can be requested for more information:

GET /Harmonizome/api/1.0/gene

{
    "count": 56720,
    "selection": [0, 100],
    "next": "/api/1.0/gene?cursor=100",
    "entities": [
        {
            "symbol": "LOC105377913",
            "href": "/api/1.0/gene/LOC105377913"
        },
        {
            "symbol": "LOC105377912",
            "href": "/api/1.0/gene/LOC105377912"
        },
        {
            "symbol": "LOC105377911",
            "href": "/api/1.0/gene/LOC105377911"
        },
        ...
    ]
}
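
The traversal described above can be sketched in Python using only the standard library. This is a minimal illustration, not part of the official wrapper. HOST is a placeholder that must be replaced with the server hosting the Harmonizome, and the helper names (to_url, fetch_json, entity_hrefs) are hypothetical. The sketch assumes, based on the examples in this document, that request paths begin with /Harmonizome while href values in responses begin with /api/1.0.

```python
import json
from urllib.request import urlopen

# Assumption: placeholder host; replace with the server you are querying.
HOST = "https://example.org"
# Request paths in this document start with /Harmonizome, while href values
# in responses start with /api/1.0, so the prefix is re-added here.
PREFIX = "/Harmonizome"


def to_url(href, host=HOST):
    """Turn an href such as '/api/1.0/gene' into a full request URL."""
    return host + PREFIX + href


def fetch_json(href, host=HOST):
    """GET an href and decode the JSON body of the response."""
    with urlopen(to_url(href, host)) as response:
        return json.loads(response.read().decode("utf-8"))


def entity_hrefs(listing):
    """Map each entity name in the base-URL response to its href."""
    return {item["entity"]: item["href"] for item in listing["entities"]}
```

With these helpers, entity_hrefs(fetch_json("/api/1.0")) would give a dictionary of entity names to hrefs, and any of those hrefs can be passed back to fetch_json.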

The Cursor

To minimize database queries and request times, this API paginates large result sets using a technique called "cursoring". Add a cursor query parameter to a GET request to retrieve the selection of data starting at that position:

GET /Harmonizome/api/1.0/gene?cursor=3141

If no cursor is provided, it defaults to 0. Each response contains at most 100 results, and its next property links to the next selection of data.
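
As a rough sketch, cursoring through a full result set amounts to following each response's next link until none remains. The code below is an illustration under assumptions: HOST is a placeholder, iter_entities and fetch_json are hypothetical names, and the last page is assumed to omit the next property (this document does not state how the final page is signalled). The fetch argument can be swapped out, which keeps the paging logic testable without a network connection.

```python
import json
from urllib.request import urlopen

HOST = "https://example.org"  # assumption: placeholder host
PREFIX = "/Harmonizome"       # hrefs in responses omit this path prefix


def fetch_json(href):
    """GET an href and decode the JSON body of the response."""
    with urlopen(HOST + PREFIX + href) as response:
        return json.loads(response.read().decode("utf-8"))


def iter_entities(first_href, fetch=fetch_json, max_pages=None):
    """Yield entities page by page, following each response's `next` link.

    Stops when a page has no `next` value (assumed to mark the last page)
    or after `max_pages` pages, whichever comes first.
    """
    href = first_href
    pages = 0
    while href and (max_pages is None or pages < max_pages):
        page = fetch(href)
        yield from page["entities"]
        href = page.get("next")
        pages += 1
```

For example, iter_entities("/api/1.0/gene", max_pages=3) would yield at most 300 genes, since each page holds at most 100 results.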

Entities

Entities provide actual data and have no href property. For example, the request

GET /Harmonizome/api/1.0/gene/nanog

returns:

{
    "symbol": "NANOG",
    "name": "Nanog homeobox",
    "ncbiEntrezGeneId": 79923,
    "ncbiEntrezGeneUrl": "http://www.ncbi.nlm.nih.gov/gene/79923",
    ...
}

Note that the geneSets property is an array of entities; therefore, each gene set has its own href property which can be traversed.
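
One way to discover what can be traversed from an entity is to collect every href found anywhere in its decoded JSON. The helper below is a hypothetical sketch (find_hrefs is not part of the API or the wrapper). It makes no assumption about the exact shape of geneSets beyond nested objects carrying an href string.

```python
def find_hrefs(value):
    """Recursively collect every `href` string in a decoded JSON value."""
    found = []
    if isinstance(value, dict):
        for key, item in value.items():
            if key == "href" and isinstance(item, str):
                found.append(item)
            else:
                # Descend into nested objects and arrays.
                found.extend(find_hrefs(item))
    elif isinstance(value, list):
        for item in value:
            found.extend(find_hrefs(item))
    return found
```

Each returned href can then be requested in the same way as the hrefs in the entity lists.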

Associations

Every gene in the database has functional associations. These associations are expensive to compute and slow to render in the browser, so by default the gene endpoint omits them. To include the associations in the response, add the query parameter showAssociations=true. For example:

GET /Harmonizome/api/1.0/gene/nanog?showAssociations=true

{
    "symbol": "NANOG",
    ...
    "associations": [
        {
            "geneSet": {
                "name": "V/Allen Brain Atlas Adult Human Brain Tissue Gene Expression Profiles",
                "href": "/api/1.0/gene_set/V/Allen+Brain+Atlas+Adult+Human+Brain+Tissue+Gene+Expression+Profiles"
            },
            "thresholdValue": 1,
            "standardizedValue": 1.33291
        },
        ...