PubMed contains millions of publications that co-mention terms that describe drugs with other
biomedical terms such as genes or diseases. DrugShot is a web-based server application that
enables users to enter any biomedical search term into a simple input form to receive ranked
lists of drugs and small molecules based on their relevance to the search term. To produce ranked
lists of drugs, DrugShot cross-references returned PubMed IDs with DrugRIF or AutoRIF, curated resources of
drug-PMID associations, to produce an associated compound list where each compound is ranked
according to total co-mentions with the search term from shared PubMed IDs. Additionally, lists of
compounds predicted to be associated with the search terms are generated based on co-occurrence in
the literature and signature similarity from L1000 drug-induced gene expression profiles. Through its search
functionality and abstraction of drug lists from different sources, DrugShot can facilitate hypothesis
generation by suggesting small molecule sets related to any biomedical term of interest.
DrugShot is a search engine that accepts any search term to return a list of small molecules that are mostly associated with the search terms.
It can be used to identify novel associations between small molecules and biological mechanisms and processes. A free text search is redirected
to a PubMed search via the NCBI E-utilities API for retrieving publications that match the search terms. The resulting list of publications
is cross-referenced with DrugRIF or AutoRIF to convert PMIDs to lists of small molecules. DrugRIF and AutoRIF enlist associations between publications
and small molecules. DrugShot returns the list of small molecules ranked by their frequency of mentions within the publications returned by the search terms.
In addition, the returned small molecules from the initial query are supplanted with additional small molecules based on literature co-mentions and
L1000 signature similarity.
Below we explain how the underlying datasets used by DrugShot were generated, and how to make decisions about the various options available
for querying DrugShot.
Underlying Data Sets
Below we provide a short introduction about the underlying datasets used by DrugShot. These datasets are available from the download section.
Here we have a short introduction of the datasets that are being used by DrugShot. The data used by DrugShot are
AutoRIF contains associations between small molecules and publications. AutoRIF is automatically generated, and it currently contains 7,992,745
associations. We constructed AutoRIF by querying the NCBI E-utilities API with drug names (each matched with a MeSH identifier) and collected the PMIDs for each query.
DrugRIF is an alternative file that contains associations between small molecules and publications. DrugRIF is automatically generated, and it currently contains 1,996,791
associations. We constructed DrugRIF by querying PubMed with InChIKeys (each with an associated small molecule name) and collected the PMIDs for each query.
Drug-drug L1000 Signature Similarity Matrix from SEP-L1000
The drug-drug signature similarity matrix used by DrugShot contain pairwise correlations between drug-induced gene expression profiles.
The correlations were calculated using cosine similarity between drug-induced gene expression signatures calculated using the Characteristic Direction method.
This pairwise drug similarity matrix is used to transitively associate drug sets returned from the original DrugShot query with
their most correlated co-expressed drugs.
Drug-drug Literature Co-mentions Matrices
The AutoRIF and DrugRIF drug-drug literature co-mentions matrices contain pairwise drug-drug similarity based on the co-occurrence counts of drugs in each respective association file.
The matrices contains the counts of how often two drugs co-occur in the same query list.
Submitting Queries to DrugShot
The query box of DrugShot contains two free text input fields where users can enter multiple search terms in combination. The top field is
for terms that will be converted to associations with drugs. The bottom free text input field is for terms that the user wishes to exclude
from the search. For example, the figure above describes how to conduct a search if we want to identify drugs that are most relevant to the
search term “liver fibrosis” but not "cancer".
The input field labeled "Top Associated Drugs to Make Predictions" sets the number of top drugs from the query to use for generating the
predicted lists. This value can be changed later after the query completes.
The query time varies from instantly returned results to about 30 seconds. The time for the query to complete is mostly dependent on the
number of associated publications found for the search terms. A very general search term such as "cancer" or “diabetes” will result in a
longer wait time than a more specific search term such as "hair loss". Below the free text input fields, several sample queries are provided.
Clicking on these terms triggers a query.
Associated Drug Table
The associated drug table contains the same information that is displayed in the scatter plot. The table will appear once the query results are
returned. The number of drugs returned by the query can vary, depending on how many drugs are associated with the search terms. The
slider above the table enables users to set the threshold for how many drugs to include for predicting additional drugs. In this example, the
top 50 associated drugs are used to perform the prediction.
Predicted Drug Tables
Predicting additional drugs that should be associated with the search terms is performed using the drug-drug L1000 signature similarity matrix from SEP-L1000
and the drug-drug literature co-mentions matrix from DrugRIF. The results table can be viewed by selecting the corresponding tab. The prediction
tables list the top 200 drugs most associated with the drug set from the associated drug table on the left.
The data from both tables can be downloaded in a variety of formats. The drug set within each table can be directly submitted to DrugEnrichr
for further analysis.
Drug Set Augmentation
The drug set augmentation page lets users upload a set of drugs. DrugShot can compute the group similarity of the drug set to all drugs in the given drug-drug similarity matrix.
DrugShot will identify the top 200 drugs ranked by their similarity score. In the case of the drug signature similarity matrix, the average cosine similarity between the input drug set
to the drugs in the signature similarity matrix is calculated. In the case of the literature co-mentions matrix, the average PubMedID co-mentions between the input drug set and drugs in the
literature co-mentions matrix are calculated.
The novelty of drugs is also reported here by returning the number of publications per drug. Drugs are grouped into four quantiles of novelty based on the
publication count of drugs in DrugRIF. A drug is rare if it has 7 or fewer publications. A drug is uncommon when it has between 8 and 25 publications.
Common drugs have 26 to 87 publications. All drugs with at least 88 publications are considered very common.
These bins divide the drugs into equisized bins.
Please acknowledge DrugShot in your publications by citing the following reference:
DrugShot: querying biomedical search terms to retrieve prioritized lists of small molecules
Eryk Kropiwnicki, Alexander Lachmann, Daniel J. B. Clarke,
Zhuorui Xie, Kathleen M. Jagodnik and Avi Ma'ayan
BMC Bioinformatics 23, 76 (2022). https://doi.org/10.1186/s12859-022-04590-5
Article number: 76, 19 February 2022
The DrugShot source code is available from GitHub under the Apache License 2.0. Commercial users should contact Mount Sinai Innovation Partners at MSIPInfo@mssm.edu for licensing.
The DrugShot source code is available on GitHub at https://github.com/MaayanLab/drugshot
DrugShot is not to be used for treating or diagnosing human subjects. DrugShot or any documents available from this server are provided as is
without any warranty of any kind, either express, implied, or statutory, including, but not limited to, any implied warranties of merchantability,
fitness for particular purpose and freedom from infringement, or that DrugShot or any documents available from this server will be error free.
The Ma'ayan Laboratory makes no representations that the use of DrugShot or any documents available from this server will not infringe any patent or
proprietary rights of third parties. In no event will the Ma'ayan Laboratory or any of its members be liable for any damages, including but not limited
to direct, indirect, special or consequential damages, arising out of, resulting from, or in any way connected with the use of DrugShot or documents
available from this server.
Should you have any questions regarding DrugShot please don't hesitate to contact us