What's New in Harmonizome 3.0?

New Datasets

Disease or Phenotype Associations

  • DepMap CRISPR Gene Dependency
    • Dependency scores for cell lines following single-gene knockouts
    • 15,946 genes, 1,095 cell lines, 697,098 associations
  • DisGeNET Gene-Disease Associations
    • Gene-disease associations sourced from curated repositories, GWAS catalogues, animal models and the scientific literature
    • 15,960 genes, 15,709 diseases, 652,358 associations
  • DisGeNET Gene-Phenotype Associations
    • Gene-phenotype associations sourced from curated repositories, GWAS catalogues, animal models and the scientific literature
    • 14,002 genes, 6,832 phenotypes, 196,561 associations
  • IMPC Knockout Mouse Phenotypes
    • Observed phenotypes of mice following gene knockout
    • 6,763 genes, 667 phenotypes, 36,451 associations
  • MGI Mouse Phenotype Associations 2023
    • Observed phenotypes of transgenic mice collected from mouse phenotyping studies, updated for 2023
    • 12,894 genes, 10,234 phenotypes, 201,654 associations

Genomics

  • ChEA Transcription Factor Targets 2022
    • Transcription factor targets from published ChIP-chip, ChIP-seq, and other transcription factor functional studies, updated for 2022
    • 17,962 genes, 757 transcription factors, 917,047 associations

Physical Interactions

Proteomics

Structural or Functional Annotations

Transcriptomics

New Download Type

Knowledge Graph Serializations

A compressed folder containing serialized dataset associations ready for knowledge graph ingestion in multiple formats:

  • RDF
  • gobp23.rdf
    @prefix gene: ncbi.nlm.nih.gov/gene/
    @prefix RO: purl.obolibrary.org/RO_
    @prefix GO: amigo.geneontology.org/amigo/term/GO:

    gene:28655 RO:0000056 GO:0050830 .
    gene:728637 RO:0000056 GO:0010789 .
    gene:728637 RO:0000056 GO:0045143 .
    gene:728637 RO:0000056 GO:0051754 .
    gene:1564 RO:0000056 GO:0042178 .
    gene:101928527 RO:0000056 GO:1900101 .
    gene:100130520 RO:0000056 GO:0030593 .
    gene:550643 RO:0000056 GO:0000956 .
    gene:550643 RO:0000056 GO:0010607 .
    gene:100507043 RO:0000056 GO:0032469 .
    gene:100507043 RO:0000056 GO:0035774 .
    gene:100507043 RO:0000056 GO:0045665 .
    gene:100507043 RO:0000056 GO:0048812 .
    gene:100507043 RO:0000056 GO:0051480 .
    gene:101929726 RO:0000056 GO:0007520 .
    gene:101929726 RO:0000056 GO:0014905 .
    ...
  • JSON
  • gobp23.json
    {
        "Version": "1",
        "nodes": {
            "28655": {
                "type": "gene",
                "properties": {
                    "id": 28655,
                    "label": "TRAV27"
                }
            },
            ...,
            "GO:0002222": {
                "type": "biological process",
                "properties": {
                    "id": "GO:0002222",
                    "label": "stimulatory killer cell immunoglobulin-like receptor signaling pathway"
                }
            }
        },
        "edges": [
            {
                "source": 28655,
                "relation": "participates in",
                "target": "GO:0050830",
                "properties": {
                    "id": "28655:GO:0050830",
                    "source_id": 28655,
                    "source_label": "TRAV27",
                    "target_label": "defense response to Gram-positive bacterium",
                    "target_id": "GO:0050830",
                    "directed": true,
                    "threshold": 1
                }
            },
            ...,
            {
                "source": 728637,
                "relation": "participates in",
                "target": "GO:0010789",
                "properties": {
                    "id": "728637:GO:0010789",
                    "source_id": 728637,
                    "source_label": "MEIKIN",
                    "target_label": "meiotic sister chromatid cohesion involved in meiosis I",
                    "target_id": "GO:0010789",
                    "directed": true,
                    "threshold": 1
                }
            }
        ]
    }
  • TSV
  • gobp23_tsv/nodes.tsv
        namespace    id         label
    0   NCBI Entrez  28655      TRAV27
    1   NCBI Entrez  728637     MEIKIN
    2   NCBI Entrez  1564       CYP2D7
    3   NCBI Entrez  101928527  PIGBOS1
    4   NCBI Entrez  100130520  CD300H
  • gobp23_tsv/edges.tsv
        source  relation         target      threshold
    0   28655   participates in  GO:0050830  1
    1   728637  participates in  GO:0010789  1
    2   728637  participates in  GO:0045143  1
    3   728637  participates in  GO:0051754  1
    4   1564    participates in  GO:0042178  1
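
As a rough illustration of how the TSV serialization might be consumed, the sketch below groups edge targets by source and relation using only the Python standard library. It assumes the file is tab-delimited and that its first, unnamed column is a row index; neither detail is confirmed here, and the `load_edges` helper is hypothetical. The sample rows are copied from the edges.tsv excerpt above.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the edges.tsv excerpt. Assumption: the real file is
# tab-delimited and its first, unnamed column is a row index.
EDGES_TSV = (
    "\tsource\trelation\ttarget\tthreshold\n"
    "0\t28655\tparticipates in\tGO:0050830\t1\n"
    "1\t728637\tparticipates in\tGO:0010789\t1\n"
    "2\t728637\tparticipates in\tGO:0045143\t1\n"
    "3\t728637\tparticipates in\tGO:0051754\t1\n"
    "4\t1564\tparticipates in\tGO:0042178\t1\n"
)


def load_edges(handle):
    """Group edge targets by (source, relation); the index column is ignored."""
    reader = csv.DictReader(handle, delimiter="\t")
    grouped = defaultdict(list)
    for row in reader:
        grouped[(row["source"], row["relation"])].append(row["target"])
    return dict(grouped)


# In practice the handle would be an open file such as gobp23_tsv/edges.tsv.
edges = load_edges(io.StringIO(EDGES_TSV))
for (source, relation), targets in edges.items():
    print(source, relation, targets)
```

Identifiers are kept as strings so that Entrez gene IDs and GO terms can share one lookup key type.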

New Visualizations

Dataset Pair Visualizations

  • Hierarchically clustered heat maps to compare the similarity of attributes from two datasets
  • Added over 1,000 new dataset pair attribute similarity heat maps

Visualization of Tabula Sapiens Gene-Cell Type Associations and PANTHER Pathways

UMAP

  • Interactive cluster plots created using TF-IDF vectorization of gene sets
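
To make the TF-IDF step concrete, here is a minimal sketch that treats each gene set as a document and each gene as a term. The exact weighting used for the Harmonizome plots is not stated here, so the classic `tf * log(N / df)` form is an assumption, as are the `tfidf` helper and the toy set names. The UMAP projection itself is omitted.

```python
import math


def tfidf(gene_sets):
    """Return {set_name: {gene: weight}}, treating each gene set as a document."""
    n_sets = len(gene_sets)

    # Document frequency: the number of gene sets that contain each gene.
    doc_freq = {}
    for genes in gene_sets.values():
        for gene in set(genes):
            doc_freq[gene] = doc_freq.get(gene, 0) + 1

    vectors = {}
    for name, genes in gene_sets.items():
        unique = set(genes)
        # A gene occurs at most once per set, so its term frequency is 1 / set size.
        vectors[name] = {
            gene: (1 / len(unique)) * math.log(n_sets / doc_freq[gene])
            for gene in unique
        }
    return vectors


# Hypothetical gene sets built from gene symbols that appear in the examples above.
toy_sets = {
    "set_a": ["TRAV27", "MEIKIN"],
    "set_b": ["MEIKIN", "CYP2D7"],
    "set_c": ["MEIKIN", "CD300H"],
}
vectors = tfidf(toy_sets)
```

Under this weighting, a gene shared by every set (MEIKIN in the toy data) receives a weight of zero, so the resulting vectors are driven by the genes that distinguish one set from another.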

Documentation

New documentation helps users navigate the site and use each of the visualization and analysis modules. Screenshots from each page illustrate and explain useful features.

Dataset Pair Crossing

A tool to compare the similarity of attributes from two datasets based on the similarity of their associated gene sets. Selecting two datasets will return a table of gene sets from those datasets with information about their overlapping genes. Visit the Cross page to begin crossing datasets.
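
The idea of crossing two datasets can be sketched as follows. This is not the Harmonizome implementation: the `cross` helper, the toy attribute names, and the use of the Jaccard index as the similarity score are all assumptions made for illustration.

```python
from itertools import product


def cross(dataset_a, dataset_b):
    """List overlapping gene-set pairs from two datasets, most similar first."""
    rows = []
    for (name_a, genes_a), (name_b, genes_b) in product(
        dataset_a.items(), dataset_b.items()
    ):
        shared = set(genes_a) & set(genes_b)
        if not shared:
            continue  # Pairs with no genes in common are left out of the table.
        union = set(genes_a) | set(genes_b)
        rows.append(
            {
                "set_a": name_a,
                "set_b": name_b,
                "overlap": sorted(shared),
                "jaccard": len(shared) / len(union),
            }
        )
    rows.sort(key=lambda row: row["jaccard"], reverse=True)
    return rows


# Hypothetical attributes and gene sets, for illustration only.
dataset_a = {"attr_1": {"TRAV27", "MEIKIN", "CYP2D7"}, "attr_2": {"PIGBOS1"}}
dataset_b = {"attr_x": {"MEIKIN", "CYP2D7", "CD300H"}, "attr_y": {"CD300H"}}
rows = cross(dataset_a, dataset_b)
```

Each row names one gene set from each dataset together with the genes they share, which mirrors the kind of table described above.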

Chatbot

An AI chatbot powered by OpenAI's GPT-4o model. Using natural language, users can interact with the Harmonizome knowledgebase to find connections between genes and attributes from different datasets and resources. Visit the Chatbot page to begin chatting.

Harmonizome KG

A knowledge graph application that serves the serialized Harmonizome datasets from a Neo4j database. We have developed a UI that allows users to access this information and create customized subnetworks without writing any code. Subnetworks can be created by entering a search term, selecting the types of edges available from the different resources integrated in Harmonizome, and selecting pairs of terms to find paths between nodes. Visit the Knowledge Graph to begin creating subnetworks.
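
Finding a path between a pair of terms can be illustrated with a breadth-first search over an in-memory edge list. This is only a stand-in for what a graph database query would do, not the application's actual method; the `shortest_path` helper is hypothetical, and the edges are taken from the serialization examples above.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Breadth-first search over an undirected view of (source, target) edges."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)

    previous = {start: None}  # Maps each visited node to the node it was reached from.
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the predecessor links back to the start, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None  # The two terms are not connected.


kg_edges = [
    ("MEIKIN", "GO:0010789"),
    ("MEIKIN", "GO:0045143"),
    ("TRAV27", "GO:0050830"),
]
path = shortest_path(kg_edges, "GO:0010789", "GO:0045143")
```

Here the two GO terms are linked through the gene MEIKIN, while a search between TRAV27 and MEIKIN returns `None` because this small edge list does not connect them.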