Introduction to DNA Features Viewer
This article introduces DNA Features Viewer, a Python library for visualizing DNA features. Created by the Edinburgh Genome Foundry, it generates clear plots of annotated DNA sequences loaded from GenBank files, GFF files, or Biopython SeqRecords.
What is DNA Features Viewer?
DNA Features Viewer is designed to automatically produce straightforward and comprehensible plots, even for sequences that contain numerous overlapping features and lengthy labels. It integrates seamlessly with Matplotlib and Biopython, and the resulting visualizations can be exported in multiple formats, including PNG, JPEG, SVG, and PDF. Such flexibility makes it an ideal choice for report generation, academic publications, and Laboratory Information Management Systems (LIMS) interfaces.
Installation
Setting up DNA Features Viewer is straightforward. If you have pip installed, a single terminal command is all it takes:
pip install dna_features_viewer
Alternatively, you can install it by unzipping the source code and running the following command:
python setup.py install
To utilize the Bokeh features, which enable interactive plots, you will need to install Bokeh and Pandas:
pip install bokeh pandas
For parsing GFF files, the bcbio-gff library is required:
pip install bcbio-gff
Examples of Use
Basic Plots
Here's a straightforward demonstration, defining features manually:
from dna_features_viewer import GraphicFeature, GraphicRecord
features = [
    GraphicFeature(start=0, end=20, strand=+1, color="#ffd700", label="Small feature"),
    GraphicFeature(start=20, end=500, strand=+1, color="#ffcccc", label="Gene 1 with a very long name"),
    GraphicFeature(start=400, end=700, strand=-1, color="#cffccc", label="Gene 2"),
    GraphicFeature(start=600, end=900, strand=+1, color="#ccccff", label="Gene 3")
]
record = GraphicRecord(sequence_length=1000, features=features)
record.plot(figure_width=5)
For circular plots, simply replace GraphicRecord with CircularGraphicRecord.
Interactive and Advanced Plots
Besides the static visualizations, DNA Features Viewer supports interactive plots via Bokeh, presenting an engaging browser-based experience. Users can also plot nucleotide or amino acid sequences alongside the visualizations:
sequence = "ATGCATGCATGCATGCATGCATGCATGC"
record = GraphicRecord(sequence=sequence, features=[
    GraphicFeature(start=5, end=10, strand=+1, color='#ffcccc'),
    GraphicFeature(start=8, end=15, strand=+1, color='#ccccff')
])
ax, _ = record.plot(figure_width=5)
record.plot_sequence(ax)
record.plot_translation(ax, (8, 23), fontdict={'weight': 'bold'})
Using Data from Files
DNA Features Viewer works directly with Biopython, so Biopython records and the contents of GenBank or GFF files can be plotted with very little code. For instance:
from dna_features_viewer import BiopythonTranslator
graphic_record = BiopythonTranslator().translate_record("my_sequence.gb")
ax, _ = graphic_record.plot(figure_width=10, strand_in_label_threshold=7)
Advanced Customizations
Users can define custom "themes" using personalized record translators, instead of the default BiopythonTranslator. This offers flexibility in how features are represented, enabling the customization of colors, labels, and displayed feature types.
Multi-line and Multi-page Plots
Since version 3.0, a sequence can be plotted across multiple lines or multiple pages, which makes long sequences much easier to read.
Applications with Other Packages
Within the broader synthetic biology landscape, DNA Features Viewer is part of the EGF Codons software suite. Other tools in the suite rely on it for plotting, including DNA Chisel, which optimizes DNA sequences, and GeneBlocks, which analyzes sequence differences.
Conclusion
DNA Features Viewer, with its extensive features and ease of integration, stands out as a powerful tool for visualizing and interpreting complex DNA data. Whether for academic, research, or industrial purposes, its capabilities enhance the user’s ability to convey genetic information clearly and effectively. With ongoing support and updates, it continues to facilitate innovation within the biological sciences.