A detailed description of the data generation method can be found in Hoye et al., J Neurosci (2017) and He et al., Neuron (2012). Click the hyperlink to access the publication on PubMed. Please cite if you use data derived from the publication.
Each individual bar graph can be downloaded as either a .png or .pdf. A table generated under the “Compare Cell Types Enrichment” header is downloadable as a .csv at the bottom of the table. The entirety of the raw data table is downloadable under the “Download Data” header.
Significance between any one cell type and another was calculated by empirical Bayes, adjusting the p-value for multiple comparisons with the Benjamini-Hochberg correction. This cannot be performed when comparing one cell type to all others as this is not a pairwise comparison. Rather, the R package pSI was used to determine miRNAs enriched in specific cell populations across a large number of profiles.
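For readers unfamiliar with the Benjamini-Hochberg correction, the adjustment step can be sketched as follows. This is a minimal illustration only, not the code used to build the database: the function name `benjamini_hochberg` and the example p-values are hypothetical, and the empirical Bayes step that produces the raw p-values is not shown.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is never larger than the one ranked above it.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four miRNAs in one pairwise comparison.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
```

Each raw p-value is scaled by the number of tests divided by its rank, and the result is then capped so that the adjusted values stay in the same order as the raw ones.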
Under the “Search by Cell Type” tab, compare the cell type of interest to all other cell types. Then click the column header for expression in the cell type of interest; this sorts the table so that the miRNAs most highly expressed in that cell type appear at the top.
A conserved miRNA is one whose sequence is similar across many species. Generally speaking, a more highly conserved miRNA strand is more likely to be stable and therefore loaded into the RISC complex to serve as a functional regulator. While the data were all generated in mouse models, conservation scores can potentially inform on the likelihood of a shared phenotype across multiple organisms. Conservation data were taken from the TargetScan “miR Family” data table. TargetScan defines conservation at three levels: broadly conserved (conserved across most vertebrates, usually to zebrafish), conserved (conserved across most mammals, but usually not beyond placental mammals) and poorly conserved (all other combinations).
The data from Hoye 2017 used the 5p/3p strand notation, whereas the He 2012 data originally used the star-strand notation. Where possible, we converted the latter to the 5p/3p notation, using miRBase and miRDB as guides. We hope this improves clarity and specificity by removing ambiguous identification. We did not change any data during this reformatting process.
Please reach out to either Dr. Joseph Dougherty (jdougherty@wustl.edu) or Dr. Timothy Miller (miller.t@wustl.edu) with questions related to this database.
Within each bar graph, miRNA expression is normalized to the tissue-wide Ago. The tissue-wide Ago’s expression level is therefore set to 1.0 and serves as the exogenous control. All other cell types are displayed as expression relative to this control.
Relative Fold Change describes the expression of a miRNA relative to an exogenous control. It reflects the relative abundance of a given miRNA, not the absolute abundance. Ct (cycle threshold) is determined by the absolute starting concentration of the miRNA and is therefore a closer measure of absolute abundance. Ct is still not a direct measure of absolute abundance, but it is the closest proxy available in the absence of an FPKM equivalent.
Ct is naturally on a log2 scale, so a difference in Ct corresponds directly to a log2 fold change (log2FC, the base-2 logarithm of the fold change). The log2FC is calculated as the difference in Ct between Cell Type 1 and Cell Type 2. A positive log2FC means that Cell Type 1 has higher expression of the given miRNA, and a negative log2FC means that Cell Type 2 does. Each additional 1.0 log2FC unit indicates a further two-fold enrichment.
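The relationship between Ct, log2FC and fold enrichment can be illustrated with a short sketch. The function names and Ct values below are hypothetical, and the sign convention (Ct of Cell Type 2 minus Ct of Cell Type 1) is an assumption inferred from the statement that a positive log2FC means higher expression in Cell Type 1, since a lower Ct indicates higher expression. Any normalization applied before the comparison is not shown.

```python
def log2_fold_change(ct_cell_type_1, ct_cell_type_2):
    """log2FC of Cell Type 1 relative to Cell Type 2 from their Ct values."""
    # Assumed sign convention: a lower Ct means higher expression, so
    # subtracting Ct1 from Ct2 is positive when Cell Type 1 expresses more.
    return ct_cell_type_2 - ct_cell_type_1


def fold_enrichment(log2fc):
    """Convert a log2FC value into a linear fold enrichment."""
    return 2.0 ** log2fc


# Hypothetical Ct values for one miRNA in two cell types.
log2fc = log2_fold_change(24.0, 27.0)
fold = fold_enrichment(log2fc)
```

With these example values, Cell Type 1 crosses the threshold three cycles earlier, which corresponds to a log2FC of 3.0 and an eight-fold enrichment.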
Only samples from these four cell types and two tissue types were run on microarrays from the same lot. Microarrays can only be analyzed together if they were performed side by side in this fashion. Because of this, no additional cell types will be added to the existing dataset. Future assays may be performed, or further datasets may be incorporated, to fill in important gaps in website functionality. The original publication included data from CNP-Cre (for oligodendrocytes); however, this was a secondary assay performed to verify the enrichment of motor neuron-enriched miRNA and was not run on microarrays from the same lot.
If the cycle threshold exceeded 35, the sample was considered not to express the given miRNA. To account for non-specific background associated with the myc immunoprecipitation, miRNA expression was also measured by an identical myc immunoprecipitation in non-transgenic animals. For a cell type to express a given miRNA above the background value, the biological replicate with the highest Ct (lowest expression) must be at least 2 Ct lower than the non-transgenic median Ct.
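The two criteria above can be combined into a single check, sketched below. This is an illustration under stated assumptions rather than the original analysis code: the function name `is_expressed`, its parameters and the example Ct values are hypothetical, the Ct cutoff of 35 is applied to the weakest replicate, and the 2 Ct margin is treated as a minimum.

```python
from statistics import median


def is_expressed(replicate_cts, nontransgenic_cts, ct_cutoff=35.0, margin=2.0):
    """Decide whether a cell type expresses a miRNA above background."""
    # The replicate with the highest Ct has the lowest expression.
    weakest_ct = max(replicate_cts)
    # Assumption: a Ct above the cutoff in the weakest replicate counts
    # as no expression for the cell type.
    if weakest_ct > ct_cutoff:
        return False
    # The weakest replicate must sit at least `margin` cycles below the
    # median Ct of the non-transgenic background samples.
    return weakest_ct <= median(nontransgenic_cts) - margin


# Hypothetical Ct values: three replicates and three background samples.
clearly_expressed = is_expressed([26.0, 27.5, 28.0], [33.0, 34.0, 32.0])
near_background = is_expressed([30.0, 32.0], [33.0, 33.0])
```

Using the weakest replicate makes the call conservative, because a single replicate close to background is enough to reject expression.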
While the neuron vs. glia dataset was generated using Taqman microarrays, the He et al. dataset was generated using small RNA sequencing. Some miRNAs that did not have corresponding probes on the microarray will therefore be detectable by sequencing. Conversely, He et al. contains data from different brain regions and only from neuronal cell types. It is therefore possible that no reads were detected because the miRNA is only expressed in more caudal regions or in non-neuronal cell types. The difference in data generation also explains the difference in units. While microarrays are quantified as Ct, similar to qPCR, small RNA sequencing is quantified here using CPM (counts per million).
Because the data were generated via small RNA sequencing, miRNA abundance is measured in CPM. CPM is a normalized number of reads and can therefore range from 0 well into the hundreds of thousands. We chose to display this information primarily on a log scale, to capture as much of this large range as possible. However, the log scale can compress differences and make them seem non-significant. For this reason, we display the CPM for each sample below its corresponding bar and on the secondary y-axis. We also changed the y-axis minima so that they start at a value on the order of magnitude of the number of reads, rather than at 0.0. These steps help to visualize differences between cell types.
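As a rough illustration of how CPM relates to raw read counts and why a log scale is used, consider the sketch below. The function name, miRNA labels and counts are hypothetical, and the normalization shown is the generic counts-per-million definition, which may differ in detail from the pipeline used for the He et al. data.

```python
import math


def counts_per_million(raw_counts):
    """Scale raw read counts so that one sample's values sum to one million."""
    total = sum(raw_counts.values())
    return {name: count * 1e6 / total for name, count in raw_counts.items()}


# Hypothetical raw read counts for two miRNAs in one sample.
raw_counts = {"miR-A": 50, "miR-B": 950}
cpm = counts_per_million(raw_counts)

# On a log10 axis, a 19-fold difference in CPM spans just over one unit.
log10_cpm = {name: math.log10(value) for name, value in cpm.items()}
```

Here the two CPM values differ 19-fold, yet their log10 values differ by less than 1.3, which is why the untransformed CPM is also shown alongside each bar.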
A miRNA was considered to be expressed in a given sample if its CPM reliably exceeded 1. Background data from non-transgenic animals were not available for this dataset.