Download Section

The download section contains all data that was compiled for Geneshot. It contains various gene pair similarity matrices as well as preprocessed GeneRIF and AutoRIF files. Gene associations to biological terms are retrieved using the NCBI E-utilities API for PubMed.


GeneRIF was downloaded and its dates were replaced with the publication dates derived from the PubMed IDs. The file contains 396,020 gene associations to PubMed IDs.

14 MB


AutoRIF was built by querying PubMed with all human gene symbols using the NCBI e-utilities API. All PubMed IDs matching the gene symbol query with their associated publication date are contained in this file. The file contains 4,908,396 gene associations to PubMed IDs.

120 MB
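
As a rough illustration of the kind of query described above, the following Python sketch builds an NCBI E-utilities ESearch request for a single gene symbol and reads the PubMed IDs from the JSON response. The exact search terms, field restrictions, and result limits used to build AutoRIF are not stated here, so the function names, the plain gene symbol query, and the `retmax` default are assumptions made for illustration. Retrieving the publication dates would need a further request that is not shown.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public ESearch endpoint of the NCBI E-utilities API.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(gene_symbol, retmax=100):
    """Build an ESearch URL that searches PubMed for a gene symbol.

    Using the bare symbol as the search term is an assumption; the
    actual AutoRIF query may have been more specific.
    """
    params = {
        "db": "pubmed",
        "term": gene_symbol,
        "retmode": "json",
        "retmax": retmax,
    }
    return EUTILS_ESEARCH + "?" + urlencode(params)


def parse_pubmed_ids(response_text):
    """Extract the list of PubMed IDs from an ESearch JSON response."""
    return json.loads(response_text)["esearchresult"]["idlist"]


def fetch_pubmed_ids(gene_symbol, retmax=100):
    """Query PubMed (requires network access) and return matching PubMed IDs."""
    with urlopen(build_esearch_url(gene_symbol, retmax)) as response:
        return parse_pubmed_ids(response.read().decode("utf-8"))
```

Calling `fetch_pubmed_ids("TP53", retmax=5)` would return up to five PubMed IDs as strings. Querying every human gene symbol this way should respect the NCBI request rate limits.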

Pairwise Gene Correlation

The pairwise gene co-expression matrix was calculated from human ARCHS4 gene expression samples spanning a diverse set of cellular backgrounds. The gene counts were quantile normalized before calculating the Pearson correlation coefficient.

Correlation matrix
2.3 GB
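
The two steps described above, quantile normalization followed by Pearson correlation, can be sketched in plain Python on a toy genes-by-samples count matrix. This is a minimal illustration and not the pipeline used to produce the downloadable matrix: the function names are hypothetical, ties are broken by position rather than averaged, and a matrix of the real size would call for an array library instead of nested lists.

```python
from math import sqrt


def quantile_normalize(matrix):
    """Quantile normalize the columns (samples) of a genes x samples matrix."""
    n_genes, n_samples = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_samples)]
    # For each rank k, average the k-th smallest value across all samples.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(col[k] for col in sorted_cols) / n_samples for k in range(n_genes)
    ]
    normalized = [[0.0] * n_samples for _ in range(n_genes)]
    for j, col in enumerate(columns):
        # Gene indices ordered by value; ties are broken by position
        # (a simplification, assumed here for brevity).
        order = sorted(range(n_genes), key=lambda i: col[i])
        for rank, i in enumerate(order):
            normalized[i][j] = rank_means[rank]
    return normalized


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(matrix):
    """Gene-by-gene Pearson correlations after quantile normalization."""
    normalized = quantile_normalize(matrix)
    return [[pearson(a, b) for b in normalized] for a in normalized]


# Toy example: 3 genes (rows) measured in 3 samples (columns).
counts = [[5, 4, 3], [2, 1, 4], [3, 4, 6]]
corr = correlation_matrix(counts)
```

After normalization every sample column holds the same set of values, so differences in sequencing depth between samples no longer drive the correlations.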

Pairwise Gene Co-occurrence

We calculated the co-occurrence of genes using user-submitted gene lists from Enrichr.

Co-occurrence matrix
1.5 GB
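
To make the idea of gene co-occurrence concrete, the sketch below counts how often each pair of genes appears together in the same gene list. It assumes raw pair counts; whether the downloadable matrix was further normalized is not stated here, and the function name and example lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(gene_lists):
    """Count, for each gene pair, the number of lists containing both genes."""
    counts = Counter()
    for gene_list in gene_lists:
        # Deduplicate and sort so each pair has one canonical ordering.
        unique_genes = sorted(set(gene_list))
        counts.update(combinations(unique_genes, 2))
    return counts


# Toy gene lists standing in for user submissions.
gene_lists = [
    ["TP53", "BRCA1", "MYC"],
    ["TP53", "MYC"],
    ["BRCA1", "EGFR"],
]
pair_counts = cooccurrence_counts(gene_lists)
```

Pairs are stored as alphabetically sorted tuples, so `pair_counts[("MYC", "TP53")]` gives the number of lists that contain both genes.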

Should you have any questions regarding the data, please don't hesitate to contact us.