Guidance for RNA-seq co-expression network construction and analysis: safety in numbers 


We have made thresholded versions of the aggregate networks described in the paper available as R data files (.Rdata).



Each aggregate network is the average of individual co-expression networks, each built from a single expression-based experiment (e.g., a tissue transcriptome, a case-control disease study, or a cell time course). For each experiment, we take the expression levels of each gene or transcript across samples or conditions and calculate the Spearman correlation coefficient for all gene pairs. This generates a weighted co-expression network: each gene is a node, and the correlation value (rho) of each gene pair is the weight of the edge between them. The edge weights are then ranked and standardized so that each edge has a weight from 0 to 1. To aggregate these individual co-expression networks, a standard set of genes must be used across all experiments. For microarray data, this can be achieved by using a single platform for all co-expression networks. For RNA-seq data, a common set of genes/transcripts (e.g., a GTF file from GENCODE) ensures that the reads are mapped to a common genome or transcriptome.
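The steps above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not the code used for the paper: the function names are hypothetical, the standardization (dividing each edge rank by the number of gene pairs) is one assumed way to obtain weights between 0 and 1, and the sketch assumes every experiment measures the same genes and that no gene is constant across samples.

```python
from itertools import combinations
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start..end
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; applied to ranks it gives Spearman's rho."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5  # assumes neither gene is constant


def coexpression_network(expression):
    """expression: dict mapping gene -> expression values across samples.
    Returns dict mapping (gene_a, gene_b) -> rank-standardized weight."""
    ranked = {gene: average_ranks(vals) for gene, vals in expression.items()}
    pairs = list(combinations(sorted(expression), 2))
    rho = [pearson(ranked[a], ranked[b]) for a, b in pairs]
    edge_ranks = average_ranks(rho)
    # Assumed standardization: rank divided by the number of gene pairs.
    return {pair: r / len(pairs) for pair, r in zip(pairs, edge_ranks)}


def aggregate(networks):
    """Average edge weights across networks built on the same gene set."""
    return {pair: mean(net[pair] for net in networks) for pair in networks[0]}


# Toy data: two experiments, three genes, four samples each.
experiment_1 = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 9], "C": [4, 3, 2, 1]}
experiment_2 = {"A": [1, 2, 3, 4], "B": [4, 3, 2, 1], "C": [1, 3, 2, 4]}
toy_aggregate = aggregate(
    [coexpression_network(experiment_1), coexpression_network(experiment_2)]
)
```

Ranking the correlations within each experiment before averaging puts all networks on a common scale, so an experiment with generally high correlations does not dominate the aggregate.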

We have made available two aggregate networks, each binarized by thresholding at the top 0.5% of interactions: edges in the top 0.5% are given a weight of 1 and all remaining edges a weight of 0. The RNA-seq aggregate network is an aggregate of 50 individual co-expression networks across 1,970 samples and has 30,705 nodes (Ensembl IDs). The microarray aggregate network is an aggregate of 43 individual co-expression networks across 5,134 samples, with 20,283 nodes (Entrez IDs). The microarray experiments were all from the ‘Affymetrix Human Genome U133 Plus 2.0 Array’ platform (GPL570), using only the probes that mapped to a single gene. Median expression values were used for genes that were mapped to by multiple probes. For more details, see our paper.
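As a rough illustration of the binarization step, the following Python sketch keeps the top fraction of edges by weight. The function name is hypothetical, and the handling of rounding and ties (edges tied with the cutoff weight are all kept, and at least one edge is always kept) is an assumption rather than a description of how the released files were produced.

```python
def binarize_top_fraction(weights, fraction=0.005):
    """weights: dict mapping gene pair -> aggregate edge weight.
    Returns dict mapping gene pair -> 1 for the top `fraction` of edges, else 0."""
    # Assumption: round to the nearest whole number of edges, keeping at least one.
    n_keep = max(1, round(fraction * len(weights)))
    cutoff = sorted(weights.values(), reverse=True)[n_keep - 1]
    # Assumption: edges tied with the cutoff weight are all kept.
    return {pair: int(w >= cutoff) for pair, w in weights.items()}


# Toy data: 1,000 edges with distinct weights, so the top 0.5% is 5 edges.
toy_weights = {("gene", i): i / 1000 for i in range(1000)}
toy_binary = binarize_top_fraction(toy_weights)
```

Because the cutoff is a fraction of all edges rather than a fixed weight, the resulting binary network has the same density regardless of how the aggregate weights are distributed.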


RNA-seq aggregate network: binary network thresholded to the top 0.5% of gene connections
Filename: rnaseq.aggregate.thresh.spearman.Rdata
Objects: rnaseq.agg, genes


Microarray aggregate network: binary network thresholded to the top 0.5% of gene connections
Filename: microarray.aggregate.thresh.spearman.Rdata
Objects: micr.agg, genes