ARCHS4
Computational Tool
FAIR Metrics
2 evaluations
General Information
Description: ARCHS4 provides access to gene counts from HiSeq 2000 and HiSeq 2500 platforms for human and mouse experiments from GEO and SRA.
Homepage: http://amp.pharm.mssm.edu/archs4/
Publications
Massive Mining of Publicly Available RNA-seq Data from Human and Mouse
Alexander Lachmann; Denis Torre; Alexandra B. Keenan; Kathleen M. Jagodnik; Hyojin J. Lee; Moshe C. Silverstein; Lily Wang; Avi Ma'ayan
RNA-sequencing (RNA-seq) is currently the leading technology for genome-wide transcript quantification. While the volume of RNA-seq data is rapidly increasing, the currently publicly available RNA-seq data is provided mostly in raw form, with small portions processed non-uniformly. This is mainly because the computational demand, particularly for the alignment step, is a significant barrier for global and integrative retrospective analyses. To address this challenge, we developed all RNA-seq and ChIP-seq sample and signature search (ARCHS4), a web resource that makes the majority of previously published RNA-seq data from human and mouse freely available at the gene count level. Such uniformly processed data enables easy integration for downstream analyses. For developing the ARCHS4 resource, all available FASTQ files from RNA-seq experiments were retrieved from the Gene Expression Omnibus (GEO) and aligned using a cloud-based infrastructure. In total 137,792 samples are accessible through ARCHS4 with 72,363 mouse and 65,429 human samples. Through efficient use of cloud resources and dockerized deployment of the sequencing pipeline, the alignment cost per sample is reduced to less than one cent. ARCHS4 is updated automatically by adding newly published samples to the database as they become available. Additionally, the ARCHS4 web interface provides intuitive exploration of the processed data through querying tools, interactive visualization, and gene landing pages that provide average expression across cell lines and tissues, top co-expressed genes, and predicted biological functions and protein-protein interactions for each gene based on prior knowledge combined with co-expression. Benchmarking the quality of these predictions, co-expression correlation data created from ARCHS4 outperforms co-expression data created from other major gene expression data repositories such as GTEx and CCLE. ARCHS4 is freely accessible from: http://amp.pharm.mssm.edu/archs4.
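The abstract above mentions top co-expressed genes derived from gene-level counts. As a rough illustration of that idea only, the sketch below ranks genes by Pearson correlation with a query gene across samples. The function name `top_coexpressed`, the log2(x + 1) transform, the choice of Pearson correlation, and the toy matrix are all assumptions made for this example; they are not taken from the ARCHS4 pipeline.

```python
import numpy as np


def top_coexpressed(counts, gene_names, query, k=3):
    """Return the k genes most correlated with `query` across samples.

    counts: array-like of shape (genes, samples) holding gene counts.
    This is a hypothetical helper, not part of any ARCHS4 API.
    """
    # log2(x + 1) dampens the influence of very highly expressed genes
    # (an assumed normalization, chosen only for this sketch).
    logged = np.log2(np.asarray(counts, dtype=float) + 1.0)
    # Pearson correlation between every pair of gene rows.
    corr = np.corrcoef(logged)
    idx = gene_names.index(query)
    # Sort by correlation with the query gene, highest first,
    # and skip the query gene itself.
    order = [i for i in np.argsort(-corr[idx]) if i != idx]
    return [(gene_names[i], float(corr[idx, i])) for i in order[:k]]


# Toy data: 4 genes measured in 4 samples (made-up numbers).
genes = ["A", "B", "C", "D"]
counts = [
    [10, 20, 30, 40],
    [11, 19, 33, 41],
    [40, 30, 20, 10],
    [5, 50, 5, 50],
]
ranked = top_coexpressed(counts, genes, "A", k=3)
```

In the toy matrix, gene B tracks gene A closely and gene C runs opposite to it, so B is ranked first and C last. On a real count matrix, genes with constant expression would yield undefined correlations and would need to be filtered out beforehand.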
Canned Analyses generated by the tool

Interactive heatmap visualization of RNA-seq dataset GSE16256
Highly interactive web-based heatmap visualization of the top 500 most variable genes in t...
Interactive heatmap visualization of RNA-seq dataset GSE17312
Highly interactive web-based heatmap visualization of the top 500 most variable genes in t...
Interactive heatmap visualization of RNA-seq dataset GSE18927
Highly interactive web-based heatmap visualization of the top 500 most variable genes in t...
Interactive heatmap visualization of RNA-seq dataset GSE22959
Highly interactive web-based heatmap visualization of the top 500 most variable genes in t...
Interactive heatmap visualization of RNA-seq dataset GSE24565
Highly interactive web-based heatmap visualization of the top 500 most variable genes in t...
Interactive heatmap visualization of RNA-seq dataset GSE26880
Highly interactive web-based heatmap visualization of the top 500 most variable genes in t...
Interactive heatmap visualization of RNA-seq dataset GSE27016
Highly interactive web-based heatmap visualization of the top 500 most variable genes in t...
Interactive heatmap visualization of RNA-seq dataset GSE27452
Highly interactive web-based heatmap visualization of the top 500 most variable genes in t...
Interactive heatmap visualization of RNA-seq dataset GSE28115
Highly interactive web-based heatmap visualization of the top 500 most variable genes in t...
Interactive heatmap visualization of RNA-seq dataset GSE29278
Highly interactive web-based heatmap visualization of the top 500 most variable genes in t...
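Each canned analysis listed above is described as a heatmap of the top 500 most variable genes in a dataset. A minimal sketch of that selection step, assuming a genes-by-samples count matrix, might look as follows. The helper name `most_variable_genes`, the log2(x + 1) transform, and the use of plain variance as the variability measure are assumptions for illustration; the listing does not state how the tool ranks genes.

```python
import numpy as np


def most_variable_genes(counts, gene_names, n=500):
    """Select the n genes with the highest variance across samples.

    counts: array-like of shape (genes, samples).
    Returns the selected gene names and the matching log-transformed rows,
    which could then be passed to a heatmap routine.
    Hypothetical helper for illustration only.
    """
    # Assumed normalization: log2(x + 1) on the counts.
    logged = np.log2(np.asarray(counts, dtype=float) + 1.0)
    # Variance of each gene (row) across samples.
    variances = logged.var(axis=1)
    # Stable sort on negated variance: highest variance first,
    # ties keep their original order.
    order = np.argsort(-variances, kind="stable")[:n]
    return [gene_names[i] for i in order], logged[order]


# Toy example with 3 genes and 4 samples (made-up numbers).
names = ["G1", "G2", "G3"]
toy = [
    [5, 5, 5, 5],
    [1, 100, 1, 100],
    [10, 12, 11, 13],
]
selected, matrix = most_variable_genes(toy, names, n=2)
```

With the toy input, the constant gene G1 is dropped and the two remaining genes are returned in order of decreasing variance.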
Datasets analyzed by the tool

GSE65926
Circular RNAs in the mammalian brain are highly abundant, conserved, and dynamically expressed
4
GSE53386
Single-cell RNA-Seq transcriptome analysis of circular RNAs in mouse embryos
4
GSE50445
IVT-seq reveals extreme bias in RNA-sequencing
3
GSE55504
Domains of genomewide gene expression dysregulation in Down syndrome [RNA-seq]
3
GSE65621
Asymmetry of STAT action in driving IL-27 and IL-6 transcriptional outputs and cytokine specificity
3
GSE58679
Human hepatocyte metaplasia in injured humanized mouse livers
2
GSE80136
Whole transcriptome splicing analysis in isogenic lung epithelial and adenocarcinoma cell lines with or without a recurrent splicing factor mutation, ...
2
GSE54473
Transcriptionally repressive histone modifying enzymes a role in olfactory receptor expression
2
GSE74977
The LRF/ZBTB7A transcription factor is a BCL11A-independent repressor of fetal hemoglobin
2
GSE86903
Comparative principles of DNA methylation reprogramming during human and mouse in vitro primordial germ cell specification [Mouse and Human RNA-seq an...
2