Computational miRNA research tools and databases are often used mainly in downstream analysis by other computational pipelines
rather than directly by biologists. miRNAVISA is a web application for interrogating and comparing miRNA families for hypothesis
generation, and for comparing the per-species chromosomal distribution of miRNA genes in different families. The design and
implementation of miRNAVISA target both biological and computational scientists.
The results generated using miRNAVISA are useful, for example, for understanding and contrasting closely related species in evolutionary terms,
assessing the abundance and general properties of family-specific miRNA-directed regulation in these species, and evaluating common
properties that may be shared among miRNA families. Biological investigations enabled by miRNAVISA include, but are
not limited to, the following.
Enhancing a hypothesis-driven analysis of the properties of miRNA, miRNA gene and miRNA families.
Investigating the chromosomal distribution of family-annotated miRNA and miRNA genes in different species, e.g. for downstream examination and testing of miRNA regulation models (miRNA promoters).
Evaluating the diversity of different miRNA families across species or a lineage.
Summarizing the miRNA knowledge database (miRBase) given a set of specific species of interest.
Jointly interrogating and integrating the genomic distribution of miRNA, miRNA gene organization (e.g. co-location on specific chromosomes), and sequence and/or structure organization of miRNA genes and the functions of their associated mature miRNA that may be encoded and implied by miRNA family categories.
Evaluating specific attributes and characteristics, such as the chromosomal distribution of miRNA genes, that may define inter-miRNA family relationships and the existence of miRNA subclasses comprising different miRNA families [clan(s) of families].
Examining and analyzing the (up/down) expression trends of the functional mature miRNA in a specific miRNA family, based on integrated experimental evidence contained in different public miRNA databases.
Assessing the link of family-specific mature miRNA and their genes to different diseases and biochemical pathways.
Assessing the yet-to-be-determined function of family-annotated miRNA genes given the known function(s) of miRNA and miRNA genes in the same family.
Improving in vitro and in vivo experimental design given the results summarized in miRNAVISA inter-species comparisons and/or intra-species miRNA gene/family maps.
Evaluating DNA strand preferences of miRNA genes in a specific miRNA family.
The term miRNAVISA derives from a concatenation of miRNA and VISA. The term VISA connotes both the "visualizations" of data to allow easy correlation of analyzed variables (species, chromosomes, miRNA families) and "entry" into the exciting miRNA world.
The website offers access to three tools that are listed in the main menu: Inter-species comparisons, Intra-species comparisons, and Family Query.
Under the Inter-species comparisons menu, the user can select a combination of two or more species together with at least two families. The user can also opt
to use the automated selection buttons, which pick the most popular species (in terms of miRNA gene numbers) and the largest miRNA
families in the latest release of the miRBase database, i.e. the central repository for miRNA research.
In the example below, chicken (Gallus gallus) and mouse (Mus musculus) are selected in combination with the
miRNA families mir-515, mir-548, mir-466 and mir-663. The user can select a species or miRNA family by clicking on the button
that carries its name.
After the query is submitted, the program generates graphical output in the Scalable Vector Graphics (SVG) format. Two buttons are
provided for either downloading or displaying the graph. Clicking on the "Download graph" button downloads a compressed SVG
picture, and clicking on the "Display graph" button opens a new window with the graph (see below) for perusal.
The graph shows comparisons of miRNA gene distribution based on user input of miRNA families and species, relative to all
other species in the miRBase database.
The results in Figure 3a imply that:
This figure summarizes the data at miRBase [Release 19 (R19), dated August 2012].
The miRBase R19 database contains 21,264 miRNA genes from 193 species as indicated on the y-axis.
There are 15,554 (298 + 15,256) family-annotated miRNA genes in miRBase R19, belonging to 1,543 miRNA families. Only the 298 miRNA
genes belonging to the four selected families are analyzed and summarized in this figure.
This figure shows the distribution of miRNA genes in the miRNA families indicated on the x-axis across the 193 species.
The output from miRNAVISA indicates that none of the 150, 116 and 16 miRNA genes annotated to the miRNA families mir-515, mir-548 and
mir-663, respectively, belong to the two selected species.
In fact, these families may be specific to primate species, as illustrated in Figure 3b (below) and elaborated in the manuscript.
All 16 miRNA genes belonging to the miRNA family mir-466 are specific to mouse and chicken, but the majority (94%) of these have
been observed in mouse (Mus musculus). Only one gene of this family has been annotated in the chicken genome (Gallus gallus).
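The counts quoted for Figure 3a can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative, and the 15-to-1 split for mir-466 is inferred from the statement that only one gene of that family is annotated in chicken.

```python
# Counts as quoted above for miRBase R19 (variable names are illustrative).
selected_family_genes = {"mir-515": 150, "mir-548": 116, "mir-663": 16, "mir-466": 16}
other_family_genes = 15256        # family-annotated genes outside the four selected families
family_annotated_total = 15554    # all family-annotated genes in R19

# The four selected families account for the 298 genes shown in the figure.
selected_total = sum(selected_family_genes.values())
assert selected_total == 298
assert selected_total + other_family_genes == family_annotated_total

# mir-466: one gene in chicken, so the remaining 15 of 16 are in mouse (inferred).
mir466 = selected_family_genes["mir-466"]
mouse_share = (mir466 - 1) / mir466
print(f"mir-466 genes in mouse: {mouse_share:.0%}")
```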
The Intra-species comparisons menu offers the same user interface as the Inter-species comparisons menu. However, the Intra-species
comparisons menu allows the selection of only one species and at least two but no more than 50 miRNA families. The user can also opt to
use the automated selection buttons for the human genome (the default genome) to assess the distribution of miRNA genes belonging to the
largest (top) miRNA families that are specific to the selected genome.
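The selection rule above (exactly one species, between 2 and 50 families) can be expressed as a small check. This is a minimal sketch and not miRNAVISA's own validation code; the function name and messages are hypothetical.

```python
MIN_FAMILIES = 2   # lower limit stated in the text
MAX_FAMILIES = 50  # upper limit stated in the text

def check_intra_species_selection(species, families):
    """Return a list of problems with the selection; an empty list means it is valid."""
    problems = []
    if len(species) != 1:
        problems.append("select exactly one species")
    # Duplicated family names are counted once.
    unique_families = set(families)
    if not MIN_FAMILIES <= len(unique_families) <= MAX_FAMILIES:
        problems.append(
            f"select between {MIN_FAMILIES} and {MAX_FAMILIES} miRNA families"
        )
    return problems

print(check_intra_species_selection(["Homo sapiens"], ["mir-515", "mir-548"]))
print(check_intra_species_selection(["Homo sapiens", "Mus musculus"], ["mir-515"]))
```

The first call passes both checks and prints an empty list; the second reports both problems.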
The graphical output, a genome miRNA genes/family map, shows the chromosomal distribution (rows) and per-family distribution
(columns). The miRNA genes that are not mapped to any chromosome are reported under a predefined label 'Chunks/NoMap'. Figure 4
illustrates the genomic distribution of family-annotated miRNA genes in the human genome.
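The structure of such a map can be sketched as a count per (chromosome, family) pair, with unmapped genes collected under the 'Chunks/NoMap' label. The function name and the input records below are placeholders for illustration, not miRNAVISA code or miRBase data.

```python
from collections import Counter

NO_MAP_LABEL = "Chunks/NoMap"  # label quoted in the text for unmapped genes

def build_family_map(genes):
    """Count genes per (chromosome, family).

    `genes` is an iterable of (family, chromosome) pairs; chromosome is None
    when the gene is not mapped to any chromosome.
    """
    counts = Counter()
    for family, chromosome in genes:
        row = chromosome if chromosome else NO_MAP_LABEL
        counts[(row, family)] += 1
    return counts

# Placeholder records for illustration only; not taken from miRBase.
demo_genes = [("fam-A", "1"), ("fam-A", "1"), ("fam-A", None), ("fam-B", "X")]
family_map = build_family_map(demo_genes)
for (row, family), n in sorted(family_map.items()):
    print(row, family, n)
```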
There are 1,600 registered human miRNA genes in miRBase R19. About 61.2% (367 plus 612 genes, i.e. 979 of 1,600) are classified into different miRNA families.
Figure 4 shows genome-wide distribution of miRNA genes in the 50 largest miRNA families. Moreover, it is apparent from the figure that:
The most populous miRNA family in the human genome is mir-548, and its genes are distributed across different chromosomes.
Most human miRNA genes are located on chromosomes 1 (135 genes) and X (113 genes).
Genes in miRNA families such as mir-515 (chromosome 19), mir-154 (chromosome 14), mir-379, mir-743, mir-329, mir-368, mir-500
and mir-188 are each co-located on a specific chromosome.
Some co-located miRNA genes on chromosome 14, belonging to different miRNA families, exist as a large cluster, as explained in the manuscript.
Several hypotheses can be formulated based on the results in Figures 3 and 4, as discussed in the manuscript.
The Family Query menu allows a keyword search based on the standard miRBase nomenclature for the names of miRNA, miRNA genes, and their
families. The names/keywords are case-insensitive and can include:
miRNA gene family name e.g. mir-515, or
miRNA gene family accession e.g. MIPF0000020, or
miRNA gene (hairpin) name e.g. hsa-mir-524 and hsa-mir-519a-2, and/or
hairpin accession e.g. MI0003160 and MI0003182.
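The four keyword types listed above can be told apart by their shape. The sketch below is an assumption based only on the examples given here (it covers just the `mir-` forms shown and is not miRNAVISA's own parser); the function name and patterns are hypothetical.

```python
import re

# Patterns inferred from the examples above; matching ignores case, as the text states.
PATTERNS = [
    ("family_accession", re.compile(r"MIPF\d{7}", re.IGNORECASE)),   # e.g. MIPF0000020
    ("hairpin_accession", re.compile(r"MI\d{7}", re.IGNORECASE)),    # e.g. MI0003160
    ("hairpin_name", re.compile(r"[a-z]{3,4}-mir-\w+(-\d+)?", re.IGNORECASE)),  # e.g. hsa-mir-519a-2
    ("family_name", re.compile(r"mir-\w+", re.IGNORECASE)),          # e.g. mir-515
]

def classify_keyword(keyword):
    """Return the keyword type, or None when no pattern matches the whole keyword."""
    text = keyword.strip()
    for kind, pattern in PATTERNS:
        if pattern.fullmatch(text):
            return kind
    return None

for example in ["mir-515", "MIPF0000020", "hsa-mir-524", "hsa-mir-519a-2", "mi0003160"]:
    print(example, "->", classify_keyword(example))
```

A keyword that matches none of the patterns yields `None`, which corresponds to the unsuccessful-query case.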
An error message is returned for unsuccessful keyword queries. The output of a successful miRNA family query includes:
A table showing the summary statistics for a searched keyword and details of the miRNA genes associated with the keyword, such as
expression trends and annotated functions, if any, of the functional ~22 nt mature miRNA hosted on these genes (see Figure 5).
A graph showing the distribution across species of the family-specific miRNA genes associated with the searched keyword (see Figure 6).
DNA strand preferences of the miRNA genes in a given family associated with the searched keyword (see Figure 7).
Figure 5 gives an overview of a query report for the keyword search "mir-515". The top section of the table reports the statistics
for the miRNA gene family name and accession numbers. The bottom section of the table gives the gene accession number, gene name,
chromosome, strand, species name, associated function, associated diseases and terms, expression status (up/down) and the reference
database in which additional information about the gene is given in detail.
The contents of the reported tables can be filtered by typing letters in the "Search" box, for example a unique name
of interest to the user. The contents of the table are adjusted with each letter typed. The
reported rows can also be sorted by clicking on a column name. The user can also sort the results in a table
using multiple columns by holding down the Shift key and clicking on the desired columns.
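The same filtering and multi-column sorting can be reproduced offline on a downloaded table. The sketch below mimics the behaviour described above and is not the code behind the web page; the function names, column names and rows are placeholders.

```python
def filter_rows(rows, text):
    """Keep rows in which any cell contains `text`, ignoring case."""
    needle = text.lower()
    return [
        row for row in rows
        if any(needle in str(value).lower() for value in row.values())
    ]

def sort_rows(rows, columns):
    """Sort by several columns at once; earlier columns take priority."""
    return sorted(rows, key=lambda row: tuple(row[c] for c in columns))

# Placeholder rows for illustration only; not real miRNAVISA output.
rows = [
    {"gene": "demo-mir-2", "chromosome": "19", "strand": 1},
    {"gene": "demo-mir-1", "chromosome": "19", "strand": -1},
    {"gene": "demo-mir-3", "chromosome": "14", "strand": 1},
]

print(filter_rows(rows, "MIR-3"))
print([row["gene"] for row in sort_rows(rows, ["chromosome", "gene"])])
```

Sorting on a tuple of column values corresponds to Shift-clicking several column names in order.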
The first graph (picture below) shows the distribution by species of the searched miRNA family.
Keyword "mir-515" query report generated by miRNAVISA showing a detailed summary of the miRNA genes constituting the miRNA family mir-515, their location in different genomes, expression trends, their associated diseases or biochemical pathways and links to different databases where reference to the experimental evidence is given.
Figure 6 shows the distribution of miRNA genes with both known (mapped) and unknown (unmapped) chromosomal coordinates in each
species. The right panel of Figure 7 compares the DNA strand preferences of mapped miRNA genes. The numbers of miRNA
genes that have been mapped on the positive (+1; FWD) strand and the negative (-1; REV) strand are shown as red and blue bars, respectively.
The number of unmapped miRNA genes in each species is reported in the left panel (green bars) of Figure 7.
The blue bars in Figure 7 are explained in the manuscript, along with the inference that genes in the miRNA family mir-515 are preferentially
transcribed from the +1 DNA strand. All mir-515 family genes have been reported only in primate species (Figure 6).
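The per-species tallies behind a strand-preference plot of this kind can be sketched as follows. The function name and the input records are placeholders, and the encoding of strands as 1, -1 or None is an assumption made for the example.

```python
from collections import defaultdict

def strand_counts(genes):
    """Tally genes per species as forward (+1), reverse (-1) or unmapped (None)."""
    counts = defaultdict(lambda: {"fwd": 0, "rev": 0, "unmapped": 0})
    for species, strand in genes:
        if strand == 1:
            counts[species]["fwd"] += 1
        elif strand == -1:
            counts[species]["rev"] += 1
        else:
            counts[species]["unmapped"] += 1
    return dict(counts)

# Placeholder records for illustration only; not taken from miRBase.
demo = [("species A", 1), ("species A", 1), ("species A", -1), ("species B", None)]
result = strand_counts(demo)
for species, tally in result.items():
    print(species, tally)
```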
The user can download all the results of a query for their own use. These results are packaged in one archive file. All the graphs are
in the Scalable Vector Graphics (SVG) file format and the tables are tab-separated files that can easily be imported into any
Timothy Kevin Kuria Kamanu, Aleksandar Radovanovic, John A. C. Archer, Vladimir B.
Bajic, Exploration of miRNA families for hypotheses generation