How can I visualise features or blast data? How can I visualise sequencing data in a workflow? Build several visualisations in JBrowse. Have basic familiarity with moving around JBrowse, and loading several data tracks
JBrowse is a fast, embeddable genome browser built completely with JavaScript and HTML5, with optional run-once data formatting tools written in Perl.
The Galaxy tool accepts data in many formats:
The JBrowse tool has an incredibly extensive number of options, more than anyone needs most of the time.
The data for today is a subset of real datasets from E. coli MG1655 strain K-12
Create and name a new history.
Import the datasets:
https://zenodo.org/record/3591856/files/blastp%20genes.gff3
https://zenodo.org/record/3591856/files/blastp%20vs%20swissprot.xml
https://zenodo.org/record/3591856/files/dna%20sequencing.bam
https://zenodo.org/record/3591856/files/dna%20sequencing%20coverage.bw
https://zenodo.org/record/3591856/files/genes%20(de%20novo).gff3
https://zenodo.org/record/3591856/files/genes%20(NCBI).gff3
https://zenodo.org/record/3591856/files/genome.fa
https://zenodo.org/record/3591856/files/RNA-Seq%20coverage%201.bw
https://zenodo.org/record/3591856/files/RNA-Seq%20coverage%202.bw
https://zenodo.org/record/3591856/files/variants.vcf
JBrowse
Tool with the following parameters:
Double clicking will zoom in on the genome, or you can use the magnifying glass icons to zoom in our out
JBrowse
tool takes track names directly from file namesJBrowse
tool and it will produce a new JBrowse instance with corrected names.All of the track types in the JBrowse tool support a wide array of features. We’ve only looked at a simple track with default options, however there are more tools available to us to help create user-friendly JBrowse instances that can embed rich data.
JBrowse
with the following parameters:
View the output. The lighter coloured genes on the denovo track are coloured based on the confidence score given to the prediction by AUGUSTUS. NCBI genes do not have scores, these are more like an official gene set.
21,200
, either manually, or by copying and pasting the location block: NC_000913.3:18351..24780
Right click on the yaaY
gene, and click the See protein at NCBI menu option.
This menu option is dynamic, try it with a few other features from the genes (NCBI).gff3
track. These features have a locus_tag
and the menu button we added will open a URL to an NCBI search page for the value of this locus_tag
attribute
In JBrowse
terminology the right click menu of some features is called the contextual menus.
You can customize this menu to add new links and options that will be useful to the user. These links can be templated with variables based on metadata of the feature that the user clicked upon.
There are more valid values for templating than mentioned in the help. When you click on a feature in the JBrowse instance, it will present all of the properties of the feature. Any of the top level properties can be used directly in your templating
Contextual menus can be used to link to more than just NCBI.
This is the next major category of data that people wish to visualize: sequencing, coverage, and variation.
The sequencing data can be of any type, and does not necessarily need to be sequencing data, as long as the results are formatted as BAM files.
Here we will add several coverage tracks (essentially a line plot along the genome, associating a position with a value) with various visualisation options like scaling and display type. Each of these visualisation options can be useful in different situations, but it largely is a matter of preference or what you are used to seeing.
Next we will add a sequencing dataset, a BAM file which maps some sequencing reads against various locations along the genome. JBrowse helpfully highlights which reads have mapping issues, and any changes in bases between the reads and the genome.
We Autogenerate a SNP track, which produces an extra track we can enable in JBrowse. This track reads the same BAM file used for visualising reads, and then produces a SNP and coverage visualisation.
NB: This only works for small BAM files, if your files are large (>200 Mb),
then you should consider generating these coverage and SNP tracks by other means
(e.g. bamCoverage
and FreeBayes
or similar tools) as it will be significantly faster.
JBrowse
with the following parameters:
Execute and then explore the resulting data.
Try:
The Blast visualisation module requires that you have a gff3 formatted set of features which you then exported as DNA or protein, and blasted.
The reason is easy to understand: when you extract DNA/protein sequences for Blasting, this process looses information about where these sequences were along the genome. The results from Blast retains the identifiers from the DNA/protein sequences, so we need to map these identifiers, to proper features with locations.
The best way to accomplish this is through the gffread
tool which can cleanup a gff3 file,
and export various features, optionally translating them.
With these outputs, the cleaned features and fasta formatted sequences, you can Blast
the sequences,
and then supply the resulting Blast XML outputs in addition to the cleaned features,
allowing a script to re-associate these Blast results to their original locations along the genome.
JBrowse
with the following parameters:
This does not exhaustively cover JBrowse
, and the tool is more extensible than can be easily documented,
but hopefully these examples are illustrative and can give you some ideas about it