With the advent of next-generation sequencing (NGS) technologies, there has been an explosion in the volume of genomic data available to researchers. NGS provides a platform for rapidly sequencing genomes and offers new ways to unlock the genomes of species that are difficult to maintain. Creative federal incentives in the US (National Human Genome Research Institute, http://www.genome.gov/10000368) have contributed to an unprecedented drop in the cost of sequencing a genome, from ~ $100,000 in 2002 to ~ $5000 in 2013 [1], which in effect has converted a field previously dominated by consortia into an open playing field where small individual labs can participate. However, to prevent individual researchers from becoming caught in the maelstrom of this new genomic era, it is imperative to develop open-source, user-friendly tools that help investigators study this volume of data. Designing primers to validate candidate genes from RNA-seq projects and developing diagnostic tools for the genomes of recently sequenced species are examples of routine tasks that researchers face when handling NGS-related data. Primer design, and in particular primer design en masse, is also essential for researchers working with multi-gene families or metagenomic samples [2], as well as for many other PCR-based applications, including primer walking, assembly PCR, digital PCR, ligation PCR, nested PCR, and quantitative PCR. Therefore, as the volume of genomic data continues to increase, so does the scale of the experiments needed to analyze it, and this is particularly true for primer design.
Here I describe PrimerView, a Perl module that is straightforward to use on its own or to plug into larger pipelines, and that enables the user to automate primer design for DNA datasets of any size. A visual readout is often the fastest and most helpful way to validate the position and distribution of primers on the target sequence, and to this end PrimerView includes graphical outputs for each primer mapped to its target sequence. Each primer/target sequence pair is aligned and converted into a JPEG file for easy visualization (other formats are also available). PrimerView also generates a PNG file depicting the distribution of all designed primers across each input sequence. PrimerView uses the popular BioPerl [3] modules to align primers to the target sequence with the alignment software MUSCLE [4], and to convert the resulting alignment from CLUSTAL [5] format into graphical files. PrimerView may be particularly helpful for researchers working with large datasets where primers must be designed efficiently for many genes, as well as for PCR applications such as primer walking and assembly PCR, where a graphical output can quickly help users determine primer coverage and distribution.
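The idea of mapping primers onto a target and viewing their distribution can be illustrated with a minimal sketch. This is not PrimerView's implementation, which is written in Perl and relies on BioPerl and MUSCLE. The sketch below is in Python, replaces the alignment step with exact string matching, and prints a plain-text track instead of an image. The function names (`reverse_complement`, `locate_primer`, `coverage_track`) and the example sequences are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: exact-match primer mapping with a text "coverage track".
# Assumption: primers match the target perfectly; a real tool would use alignment.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def locate_primer(target: str, primer: str, strand: str = "+") -> int:
    """Return the 0-based start of the primer's binding site on the target's
    forward strand, or -1 if there is no exact match.

    A reverse ("-") primer binds the forward strand at the position of its
    reverse complement.
    """
    query = primer.upper() if strand == "+" else reverse_complement(primer)
    return target.upper().find(query)


def coverage_track(target: str, primers: list[tuple[str, str]]) -> str:
    """Build a one-line text track the same length as the target:
    '>' marks forward primer sites, '<' reverse primer sites, '.' elsewhere."""
    track = ["."] * len(target)
    for seq, strand in primers:
        start = locate_primer(target, seq, strand)
        if start < 0:
            continue  # primer not found by exact matching; skip it
        mark = ">" if strand == "+" else "<"
        for i in range(start, start + len(seq)):
            track[i] = mark
    return "".join(track)


if __name__ == "__main__":
    # Hypothetical 27-bp target with one forward and one reverse primer.
    target = "ACGTTGCAAGGCTTACCGGATATCGGA"
    primers = [("ACGTTG", "+"), ("TCCGAT", "-")]
    print(target)
    print(coverage_track(target, primers))
```

Even in this reduced form, the track shows at a glance which regions of the target are covered by primers and in which orientation, which is the kind of readout the graphical outputs described above are meant to provide.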