RESTful API

To accommodate batch queries, RVS accepts REpresentational State Transfer (REST) requests to obtain data for different resource types, including population frequencies, impacts such as protein changes, and computational predictions. Supported arguments are gene, chromosomal location, dbSNP ID, phenotype (disease), and allele-specific variant key.

A REST request to RVS combines a requested resource (frequencies, predicted functional impact, annotated phenotypes, or literature references) with an argument, which can be one or more genes, dbSNP IDs, coordinates, and so on. For example, you can request all observation frequencies for variants in a given gene. Any combination of a resource with an argument is supported. Additional optional arguments filter the results, for instance by considering only variants that result in a CDS or protein change, or by limiting results to information on the canonical transcript.

Results will be returned in JavaScript Object Notation (JSON) format.

We kindly ask you to be sensible about using this option: do not submit hundreds of requests at once, and wait five seconds between requests. We reserve the right to block individual IP addresses if we detect inappropriate bulk access.
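The pacing requested above could be sketched in a client roughly as follows. This is a minimal illustration using only the Python standard library. The function names (fetch_json, fetch_all) and the 180-second timeout are assumptions made for this example and are not part of RVS. Nothing is assumed about the structure of the returned JSON.

```python
import json
import time
import urllib.request

MIN_DELAY_SECONDS = 5.0   # pause between requests, as requested above
TIMEOUT_SECONDS = 180     # assumption: generous timeout for slow queries


def fetch_json(url):
    """Fetch one URL and decode the JSON body (no schema is assumed)."""
    with urllib.request.urlopen(url, timeout=TIMEOUT_SECONDS) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_all(urls, fetch=fetch_json, sleep=time.sleep, delay=MIN_DELAY_SECONDS):
    """Fetch URLs strictly one at a time, sleeping `delay` seconds in between."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay)  # no pause is needed before the very first request
        results.append(fetch(url))
    return results
```

Passing fetch and sleep as parameters keeps the loop easy to test without touching the network; in normal use the defaults apply.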

Resources

A resource refers to the general data type that you are requesting for a list of variants.

  • frequency: returns ancestry-specific allele frequencies for requested variants
  • impact: returns immediate effects of the variants on the DNA, CDS, and protein level, such as 'missense' or 'frameshift' mutation, exon number, and transcript
  • prediction: provides about a dozen scores from SIFT, PolyPhen-2, GWAVA, CADD, and others
  • disease: returns any known phenotype association: clinical significance and pharmacogenetics
  • literature: returns literature references (PubMed IDs) discussing mutations in the requested gene, region, etc.

Arguments

An argument specifies the variants for which you want to obtain the given resource. You can select variants by dbSNP, by chromosomal location, by gene, by associated disease, or by using our allele-specific variant key.

  • gene refers to the HUGO symbol
  • dbsnp may include or skip the 'rs' prefix, but cannot contain allele-specific information, as in "rs123456A>G".
  • If you provide a disease as an argument, we will search RVS for any variant that is associated with a disease (based on ClinVar, OMIM, and literature) where that disease contains the name provided (case insensitive): for example 'epilep' will return variants implicated in "Epileptic encephalopathy" as well as "Idiopathic epilepsy, generalised". Querying for 'epilepsy', on the other hand, will only return the latter example.
    We are working on integrating our current data with ICD-9 or ICD-10 codes, Disease Ontology concepts, and/or UMLS/MeSH entries, in order to expand queries.
  • region refers to a chromosomal coordinate or a region; regions are limited to the first 100,000bp. Examples for valid coordinates are "chrX:10000-12000", "22:10083", and "chrMT:4990-5010".
  • vkey is one or more variant keys, each specifying a chromosomal location and allele; see here for details and a Python package.
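To catch malformed dbsnp and region arguments before sending a request, a client might validate them locally. The sketch below is one possible approach. The helper names and the regular expressions are assumptions for illustration, and RVS itself may accept or reject inputs differently. In particular, counting the 100,000 bp limit inclusively from the start coordinate is a guess.

```python
import re

# Optional "rs" prefix followed by digits only; allele suffixes are rejected.
_DBSNP_PATTERN = re.compile(r"^(?:rs)?(\d+)$", re.IGNORECASE)
# Optional "chr" prefix, chromosome name, start, and optional end coordinate.
_REGION_PATTERN = re.compile(r"^(?:chr)?([0-9A-Za-z]+):(\d+)(?:-(\d+))?$")

MAX_REGION_BP = 100_000  # regions are limited to the first 100,000 bp


def normalize_dbsnp(value):
    """Return the ID with an 'rs' prefix, or raise ValueError if malformed."""
    match = _DBSNP_PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"not a plain dbSNP ID: {value!r}")
    return "rs" + match.group(1)


def parse_region(value):
    """Split 'chrX:10000-12000' or '22:10083' into (chromosome, start, end)."""
    match = _REGION_PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"not a valid region: {value!r}")
    chromosome, start = match.group(1), int(match.group(2))
    end = start if match.group(3) is None else int(match.group(3))
    if end < start:
        raise ValueError(f"region end precedes start: {value!r}")
    return chromosome, start, end


def exceeds_region_limit(start, end, limit=MAX_REGION_BP):
    """True if the region is longer than the limit (inclusive count assumed)."""
    return (end - start + 1) > limit
```

With these helpers, normalize_dbsnp("2088578") gives "rs2088578", while normalize_dbsnp("rs123456A>G") raises ValueError because of the allele suffix.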

Examples

Resource: frequency
Example: https://rvs.u.hpc.mssm.edu/rest/frequency/dbsnp/rs139420557
Example: https://rvs.u.hpc.mssm.edu/rest/frequency/gene/AGRN
Example: https://rvs.u.hpc.mssm.edu/rest/frequency/vkey/_103c8k03c8k013
Example: https://rvs.u.hpc.mssm.edu/rest/frequency/disease/epilepsy
Example: https://rvs.u.hpc.mssm.edu/rest/frequency/region/chr1:109137-109503

Resource: disease
Example: https://rvs.u.hpc.mssm.edu/rest/disease/dbsnp/rs41307846
Example: https://rvs.u.hpc.mssm.edu/rest/disease/gene/MDM2
Example: https://rvs.u.hpc.mssm.edu/rest/disease/vkey/_107TRI07TRI01

Resource: prediction
Example: https://rvs.u.hpc.mssm.edu/rest/prediction/dbsnp/rs139420557
Example: https://rvs.u.hpc.mssm.edu/rest/prediction/gene/MDM2
Example: https://rvs.u.hpc.mssm.edu/rest/prediction/vkey/_103dHW03dHW012
Example: https://rvs.u.hpc.mssm.edu/rest/prediction/disease/epilepsy

Resource: impact
Example: https://rvs.u.hpc.mssm.edu/rest/impact/dbsnp/rs139420557
Example: https://rvs.u.hpc.mssm.edu/rest/impact/vkey/_103dHW03dHW012,_78Mhjb8Mhjb01
Example: https://rvs.u.hpc.mssm.edu/rest/impact/gene/MDM2

Resource: literature
Example: https://rvs.u.hpc.mssm.edu/rest/literature/dbsnp/2088578
Example: https://rvs.u.hpc.mssm.edu/rest/literature/gene/MDM2
Example: https://rvs.u.hpc.mssm.edu/rest/literature/disease/epilepsy
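The examples above all follow the pattern /rest/<resource>/<argument type>/<value(s)>. A small helper can assemble such URLs. This is a sketch only: build_url is a hypothetical name, the base URL is taken from the examples, and percent-encoding of special characters (such as spaces in disease names) is an assumption about how a client would typically pass them.

```python
from urllib.parse import quote

BASE_URL = "https://rvs.u.hpc.mssm.edu/rest"  # taken from the examples above
RESOURCES = {"frequency", "impact", "prediction", "disease", "literature"}
ARGUMENT_TYPES = {"gene", "dbsnp", "vkey", "disease", "region"}


def build_url(resource, argument_type, values):
    """Build a request URL from a resource, an argument type, and value(s)."""
    if resource not in RESOURCES:
        raise ValueError(f"unknown resource: {resource!r}")
    if argument_type not in ARGUMENT_TYPES:
        raise ValueError(f"unknown argument type: {argument_type!r}")
    if isinstance(values, str):
        values = [values]  # a single value is treated as a one-element list
    joined = ",".join(values)  # multiple values are comma-separated
    # Keep ',' and ':' readable as in the examples; escape everything else
    # that is not URL-safe (assumption: the server decodes percent-escapes).
    return f"{BASE_URL}/{resource}/{argument_type}/{quote(joined, safe=',:')}"
```

For instance, build_url("impact", "vkey", ["_103dHW03dHW012", "_78Mhjb8Mhjb01"]) reproduces the two-key impact example listed above.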

Queries that involve entire genes as arguments can take 1-2 minutes to return results.
Queries that use a chromosomal region as an argument will be limited to the first 100,000 bp that fall into the given region.
Sources for literature references are: dbSNP, literature mining (SETH), ClinVar, OMIM, COSMIC, SwissVar, HGMD, and PharmGKB (the last two only for Mount Sinai users). Literature mining sources (prefix "TEXT") indicate where the variant was found: abstract, full text, or supplementary data.

Optional arguments

Optional arguments, used in addition to one of gene, dbsnp, vkey, disease, or region, further specify which data you want to retrieve and thereby limit the results. For example, you can limit the results to variants that have an impact on the protein sequence, excluding intronic and intergenic variants. To do so, add /proteinchange to the end of the query. Note that proteinchange includes silent mutations in addition to true amino acid sequence alterations.

Optional arguments: proteinchange, cdschange, canonical, canonicaluniprot, canonicalensembl, limit
Example: https://rvs.u.hpc.mssm.edu/rest/disease/gene/MDM2/proteinchange
Example: https://rvs.u.hpc.mssm.edu/rest/frequency/gene/TP53/cdschange/limit
Example: https://rvs.u.hpc.mssm.edu/rest/frequency/gene/AGRN/proteinchange/limit/250
  • The optional argument canonical refers to transcripts that are marked as canonical in either Ensembl or UniProt (isoform). If you are looking for transcripts that are canonical in both sources, you need to specify canonicalensembl and canonicaluniprot at the same time.
  • Providing limit as an optional argument restricts the output to the first 100 results. To change the maximum number of results, supply a value as the next argument (for example, "/limit/10000").
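Appending optional arguments to a request URL could look like the following sketch. The function name add_options and its parameters are hypothetical. Placing limit last mirrors the examples above, but whether the order of optional arguments matters to the server is not stated here and is assumed not to.

```python
OPTIONAL_FLAGS = (
    "proteinchange",
    "cdschange",
    "canonical",
    "canonicaluniprot",
    "canonicalensembl",
)


def add_options(url, flags=(), use_limit=False, max_results=None):
    """Append optional arguments to a request URL.

    use_limit=True adds a bare /limit (first 100 results);
    max_results=N adds /limit/N instead.
    """
    parts = [url.rstrip("/")]
    for flag in flags:
        if flag not in OPTIONAL_FLAGS:
            raise ValueError(f"unknown optional argument: {flag!r}")
        parts.append(flag)
    if max_results is not None:
        parts.extend(["limit", str(int(max_results))])
    elif use_limit:
        parts.append("limit")
    return "/".join(parts)
```

Requesting transcripts that are canonical in both sources would then mean passing flags=["canonicalensembl", "canonicaluniprot"].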

Batch queries

Each request supports up to ten genes, 100 dbSNP IDs, 100 vkeys, or one disease term. Genes, dbSNP IDs, and vkeys must be comma-separated without whitespace. A comma in a disease term will be interpreted literally. Requests with more than the maximum number of genes, dbSNP IDs, or vkeys will be cut to the maximum allowed.

Example: https://rvs.u.hpc.mssm.edu/rest/disease/gene/MDM2,TP53,BRAF/proteinchange/canonicalensembl
This request returns the disease associations for all variants matching proteinchange in the canonical Ensembl transcripts of MDM2, TP53, and BRAF.
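Because requests exceeding the per-request maximum are cut off, a client with a longer list may want to split it into several requests itself. A minimal sketch follows; batch_values is a hypothetical helper, and the limits are the ones stated above.

```python
MAX_PER_REQUEST = {"gene": 10, "dbsnp": 100, "vkey": 100}


def batch_values(argument_type, values):
    """Yield comma-separated chunks that respect the per-request maximum."""
    size = MAX_PER_REQUEST[argument_type]
    cleaned = [value.strip() for value in values]  # no whitespace allowed
    for start in range(0, len(cleaned), size):
        yield ",".join(cleaned[start:start + size])
```

Each yielded string can be used as the value part of one request URL; a list of 23 genes, for example, yields three chunks of 10, 10, and 3 genes. Disease terms are not batched, since each request accepts only one.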

We provide an example implementation for a Perl client to annotate VCF files with ExAC allele frequencies, based on RVS REST API calls. Please check the readme file for instructions, as you may need to also install REST and JSON Perl modules.