PrimerView: high-throughput primer design and visualization
© O’Halloran. 2015
Received: 7 April 2015
Accepted: 26 May 2015
Published: 4 June 2015
High-throughput primer design is routinely performed in a wide range of molecular applications, including genotyping specimens using traditional PCR techniques as well as assembly PCR, nested PCR, and primer walking experiments. Batch primer design is also required in validation experiments from RNA-seq transcriptome sequencing projects, as well as in generating probes for microarray experiments. The growing popularity of next generation sequencing and microarray technology has created a greater need for primer design tools that can validate large numbers of candidate genes and markers.
To meet these demands I here present a tool called PrimerView that designs forward and reverse primers from multi-sequence datasets, and generates graphical outputs that map the position and distribution of primers to the target sequence. This module operates from the command-line and can collect user-defined input for the design phase of each primer.
PrimerView is a straightforward-to-use module that implements a primer design algorithm to return forward and reverse primers from any number of FASTA-formatted sequences. It generates text-based output describing the features of each primer, together with graphical outputs that map the designed primers to the target sequence. PrimerView is freely available without restrictions.
Keywords: PCR, Primer design, Genotyping, Perl, NGS, Sequencing
With the advent of next generation sequencing (NGS) technologies, there has been an explosion in the volume of genomic data available to researchers. NGS provides a platform to rapidly sequence genomes, and offers new ways to unlock the genomes of species that are difficult to maintain. Creative federal incentives in the US (National Human Genome Research Institute - http://www.genome.gov/10000368) have contributed to an unprecedented drop in the costs involved in sequencing a genome from ~ $100,000 in 2002 to ~ $5000 in 2013 [1], which in effect has converted a field that was previously dominated by consortiums into an open playing field where small individual labs can participate. However, to prevent individual researchers from becoming caught in the maelstrom of this new genomic era, it is imperative to develop open source and user-friendly tools to help investigators study this volume of data. Designing primers to validate candidate genes from RNA-seq projects and developing diagnostic tools for the genomes of recently sequenced species are examples of routine tasks faced by researchers tackling NGS-related data. Primer design, and in particular primer design en masse, also becomes essential for researchers working with multi-gene families or metagenomic samples [2], as well as for many other PCR-based applications including primer walking, assembly PCR, digital PCR, ligation PCR, nested PCR, and quantitative PCR. Therefore, as the volume of genomic data continues to increase, so does the scale of experiments related to its analysis, and this is particularly true for primer design.
Here I describe a Perl module called PrimerView that is straightforward to implement or plug into larger pipelines, and enables the user to automate the process of primer design for DNA datasets of any size. Often, a visual readout of primer position on the target sequence is the fastest and most helpful way to validate the distribution and position of primers, and to this end PrimerView includes graphical outputs for each primer mapped to its target sequence. Each primer/target sequence pair is aligned and converted into a JPEG formatted file for easy visualization (other formats are also available). A PNG format file is also generated by PrimerView to depict the distribution of all designed primers across each input sequence. PrimerView uses the popular Bioperl [3] modules to align primers to the target sequence using the alignment software MUSCLE [4], and also to convert the alignment from CLUSTAL [5] format into graphical files. PrimerView may be particularly helpful for researchers working with large datasets where primers must be efficiently designed for many genes, as well as for various PCR applications including primer walking and assembly PCR reactions, where a graphical output can quickly help users determine primer coverage and distribution.
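The idea of mapping a primer pair onto its target can be illustrated with a small text-based sketch. It is written in Python rather than Perl, it is not taken from PrimerView, and it uses exact string matching as a stand-in for the MUSCLE alignment step; the function names `reverse_complement` and `primer_map` are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def primer_map(target: str, forward: str, reverse: str, width: int = 60) -> str:
    """Draw forward (>) and reverse (<) primer sites beneath the target.

    Assumption: exact matching replaces the alignment PrimerView performs.
    """
    target = target.upper()
    track = [" "] * len(target)

    # The forward primer is identical to a stretch of the target strand.
    start = target.find(forward.upper())
    if start >= 0:
        track[start:start + len(forward)] = ">" * len(forward)

    # The reverse primer anneals to the opposite strand, so its binding
    # site on the target strand is its reverse complement.
    site = reverse_complement(reverse)
    start = target.find(site)
    if start >= 0:
        track[start:start + len(site)] = "<" * len(site)

    # Wrap the sequence and its annotation track into fixed-width rows.
    rows = []
    for offset in range(0, len(target), width):
        rows.append(target[offset:offset + width])
        rows.append("".join(track[offset:offset + width]).rstrip())
    return "\n".join(rows)
```

Calling `print(primer_map(target, forward, reverse))` gives a quick visual check of where a pair sits on a sequence, which is the same question the JPEG and PNG outputs answer at larger scale.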
PrimerView is written in Perl and has been tested successfully on both the Windows command prompt and UNIX. PrimerView uses the Bioperl [3] dependencies Bio::Align::Graphics, Bio::Graphics, and Bio::SeqFeature::Generic to generate graphical output, all of which are freely available from CPAN (http://www.cpan.org/). The alignment software MUSCLE [4] is used for a single iteration to map each designed primer to the input sequence, and this alignment is then converted into JPEG and PNG images depicting the position and distribution of all primers across each input sequence. PrimerView is a package with a constructor subroutine called “new” that allows the user to run the module by instantiating a PRIMERVIEW object. Separate subroutines for primer design, alignment, and conversion to graphical output are called from a script called ‘primerview_driver.pl’. The input for PrimerView is any number of sequences in FASTA format [6]. A sample sequence file called ‘test_seqs.fasta’ is included in the download.
The main primer design subroutine of PrimerView requires various parameters, which can be collected from the command-line. Default settings for each parameter will be invoked in the absence of command-line arguments, with the exception of the input filename, which must be provided. These options (a through k) are as follows:
- -a: filename, e.g. test_seqs.fasta
- -b: 5′ search area (integer)
- -c: 3′ search area (integer)
- -d: primer length max (integer)
- -e: primer length min (integer)
- -f: GC clamp (Y or N)
- -g: upper GC% (integer)
- -h: lower GC% (integer)
- -i: upper Tm (integer)
- -j: lower Tm (integer)
- -k: specificity to the entire input file (Y) or just the specific sequence (N)
The ‘-b’ and ‘-c’ flags refer to the five prime or three prime search areas across which PrimerView will scan for appropriate primers within each sequence; if the user wants to scan the entire length of the sequence, these flags can be set to the total sequence length in nucleotides.
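How these options might constrain a scan for forward primers can be sketched as follows. This is an illustrative Python sketch, not PrimerView's Perl code: the class and function names are hypothetical, every default value is a placeholder (the article does not state the real defaults), and the GC clamp is assumed here to mean that the 3′-terminal base is G or C. The Tm bounds (-i, -j) are left out of this sketch.

```python
from dataclasses import dataclass


@dataclass
class DesignOptions:
    """Mirrors flags -b to -h; all default values are placeholders."""
    five_prime_search: int = 150  # -b, width of the 5' search area
    max_len: int = 24             # -d
    min_len: int = 20             # -e
    gc_clamp: bool = True         # -f
    max_gc: int = 60              # -g
    min_gc: int = 40              # -h


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a non-empty sequence."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def passes_filters(primer: str, opts: DesignOptions) -> bool:
    """Apply the length, GC% and (assumed) GC clamp rules to one candidate."""
    if not opts.min_len <= len(primer) <= opts.max_len:
        return False
    if not opts.min_gc <= gc_percent(primer) <= opts.max_gc:
        return False
    if opts.gc_clamp and primer[-1] not in "GC":
        return False
    return True


def forward_candidates(target: str, opts: DesignOptions) -> list:
    """List (start, primer) pairs found inside the 5' search area."""
    region = target.upper()[:opts.five_prime_search]
    found = []
    for length in range(opts.min_len, opts.max_len + 1):
        for start in range(len(region) - length + 1):
            primer = region[start:start + length]
            if passes_filters(primer, opts):
                found.append((start, primer))
    return found


def is_specific(primer: str, sequences: list) -> bool:
    """True if the primer has exactly one exact, non-overlapping match."""
    hits = sum(seq.upper().count(primer.upper()) for seq in sequences)
    return hits == 1
```

In this sketch, the -k choice corresponds to what is passed as `sequences`: every sequence in the input file, or only the sequence the primer was designed from. Reverse primers could be found with the same scan applied to the reverse complement of the 3′ search area.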
Features of the basic algorithm for PrimerView have been described previously and use nearest neighbor thermodynamic calculations to determine primer Tm values [7–9]. An example command to execute PrimerView is: “perl primerview_driver.pl -a test_seqs.fasta”.
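A nearest neighbor Tm estimate sums enthalpy and entropy contributions over adjacent base pairs and converts the totals into a melting temperature. The Python sketch below shows the general technique only. It assumes the unified parameter set of SantaLucia (1998) at 1 M NaCl, a non-self-complementary primer, and a hypothetical default primer concentration of 0.5 µM; PrimerView's own parameter table and corrections may differ, and the function name is made up for illustration.

```python
import math

# Nearest neighbor parameters as (dH in kcal/mol, dS in cal/(mol*K)).
# Assumption: SantaLucia (1998) unified values at 1 M NaCl.
NN_PARAMS = {
    "AA": (-7.9, -22.2), "TT": (-7.9, -22.2),
    "AT": (-7.2, -20.4),
    "TA": (-7.2, -21.3),
    "CA": (-8.5, -22.7), "TG": (-8.5, -22.7),
    "GT": (-8.4, -22.4), "AC": (-8.4, -22.4),
    "CT": (-7.8, -21.0), "AG": (-7.8, -21.0),
    "GA": (-8.2, -22.2), "TC": (-8.2, -22.2),
    "CG": (-10.6, -27.2),
    "GC": (-9.8, -24.4),
    "GG": (-8.0, -19.9), "CC": (-8.0, -19.9),
}

# Initiation terms depend on whether each terminal pair is G.C or A.T.
INIT_PARAMS = {"G": (0.1, -2.8), "C": (0.1, -2.8),
               "A": (2.3, 4.1), "T": (2.3, 4.1)}

GAS_CONSTANT = 1.987  # cal/(mol*K)


def nearest_neighbor_tm(primer: str, primer_conc_molar: float = 5e-7) -> float:
    """Estimate Tm in degrees Celsius for a non-self-complementary primer."""
    seq = primer.upper()
    delta_h = 0.0  # kcal/mol
    delta_s = 0.0  # cal/(mol*K)

    # Add one initiation term for each end of the duplex.
    for terminal in (seq[0], seq[-1]):
        h, s = INIT_PARAMS[terminal]
        delta_h += h
        delta_s += s

    # Add one stacking term for each overlapping dinucleotide step.
    for i in range(len(seq) - 1):
        h, s = NN_PARAMS[seq[i:i + 2]]
        delta_h += h
        delta_s += s

    # Tm = dH / (dS + R ln(Ct / 4)), converted from kelvin to Celsius.
    kelvin = (1000.0 * delta_h) / (
        delta_s + GAS_CONSTANT * math.log(primer_conc_molar / 4.0)
    )
    return kelvin - 273.15
```

Because no salt correction is applied, the values refer to 1 M NaCl and will sit above what is expected in a typical PCR buffer. A design loop could keep a candidate only when its estimate falls between the lower and upper Tm bounds supplied with -j and -i.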
Results and discussion
By handling both single-sequence and multi-sequence input, PrimerView facilitates automated primer design for specific targets as well as large gene datasets. Although many other primer design tools exist, such as Primer3, BatchPrimer3, and PerlPrimer [10, 13, 14], the utility of PrimerView lies in its graphical outputs, which quickly and easily depict the distribution of all primers across a target sequence from multi-sequence input. Generating graphical outputs that map each designed primer to the target sequence is an efficient means of validating the spread of primers across a target.
Availability and requirements
Project name: PrimerView
Project home page: https://github.com/dohalloran/PrimerView
Operating system(s): Platform independent
Other requirements: Bioperl
Any restrictions to use by non-academics: None
I would like to thank The George Washington University Columbian College of Arts and Sciences, GW Office of the Vice-President for Research, and the Department of Biological Sciences for funding.
1. Hayden EC. The $1,000 genome. Nature. 2014;507(7492):294–5.
2. Contreras-Moreira B, Sachman-Ruiz B, Figueroa-Palacios I, Vinuesa P. primers4clades: a web server that uses phylogenetic trees to design lineage-specific PCR primers for metagenomic and diversity studies. Nucleic Acids Res. 2009;37(Web Server issue):W95–100.
3. Stajich JE, Block D, Boulez K, Brenner SE, Chervitz SA, Dagdigian C, et al. The Bioperl toolkit: Perl modules for the life sciences. Genome Res. 2002;12(10):1611–8.
4. Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792–7.
5. Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22(22):4673–80.
6. Lipman DJ, Pearson WR. Rapid and sensitive protein similarity searches. Science. 1985;227(4693):1435–41.
7. Li K, Brownley A, Stockwell TB, Beeson K, McIntosh TC, Busam D, et al. Novel computational methods for increasing PCR primer design effectiveness in directed sequencing. BMC Bioinformatics. 2008;9:191.
8. Rychlik W, Spencer WJ, Rhoads RE. Optimization of the annealing temperature for DNA amplification in vitro. Nucleic Acids Res. 1990;18(21):6409–12.
9. O’Halloran DM. STITCHER: a web resource for high-throughput design of primers for overlapping PCR applications. BioTechniques. 2015;58(6):325–8.
10. Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M, et al. Primer3–new capabilities and interfaces. Nucleic Acids Res. 2012;40(15):e115.
11. Qu W, Shen Z, Zhao D, Yang Y, Zhang C. MFEprimer: multiple factor evaluation of the specificity of PCR primers. Bioinformatics. 2009;25(2):276–8.
12. Qu W, Zhou Y, Zhang Y, Lu Y, Wang X, Zhao D, et al. MFEprimer-2.0: a fast thermodynamics-based program for checking PCR primer specificity. Nucleic Acids Res. 2012;40(Web Server issue):W205–8.
13. You FM, Huo N, Gu YQ, Luo MC, Ma Y, Hane D, et al. BatchPrimer3: a high throughput web application for PCR and sequencing primer design. BMC Bioinformatics. 2008;9:253.
14. Marshall O. Graphical design of primers with PerlPrimer. Methods Mol Biol. 2007;402:403–14.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.