Stacks

Stacks is a software pipeline for building loci from short-read sequences, such as those generated on the Illumina platform. Stacks was developed to work with restriction enzyme-based data, such as RAD-seq, for the purpose of building genetic maps and conducting population genomics and phylogeography.

Stacks Pipeline

Genetic Maps

Stacks can be used to generate mappable markers from RAD-seq data. Thousands of markers can be generated for a single-generation F1 map, as well as for traditional F2 and backcross designs. Stacks can export data to JoinMap, OneMap, or R/qtl. These data can be used to examine genomic structure and to help assemble genomes.

Population Genomics

Stacks can be used to identify SNPs within or among populations. Stacks provides tools to generate summary statistics and to compute population genetic measures such as F_IS and π within populations and F_ST between populations, allowing for genome scans. Data can be exported in VCF format, or in formats used by programs such as STRUCTURE or GenePop. Data can also be exported for cline analysis in HZAR format.
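
To make these measures concrete, here is a minimal Python sketch of textbook single-site versions of π, F_IS, and F_ST. The function names are hypothetical, the formulas are standard population-genetics definitions for a biallelic site, and nothing here is taken from the Stacks source code, whose estimators may differ (for example in how they correct for sample size or smooth along the genome).

```python
from math import comb


def site_pi(allele_counts):
    """Nucleotide diversity at one site: the chance that two allele
    copies drawn without replacement from the sample differ."""
    n = sum(allele_counts)
    if n < 2:
        raise ValueError("need at least two allele copies")
    same = sum(comb(c, 2) for c in allele_counts)
    return 1.0 - same / comb(n, 2)


def f_is(n_hom_ref, n_het, n_hom_alt):
    """Inbreeding coefficient 1 - Ho/He from genotype counts in one population.
    Returns None when the site is monomorphic (He == 0)."""
    n = n_hom_ref + n_het + n_hom_alt
    obs_het = n_het / n
    p = (2 * n_hom_ref + n_het) / (2 * n)
    exp_het = 2 * p * (1 - p)
    if exp_het == 0:
        return None
    return 1.0 - obs_het / exp_het


def f_st_two_pops(p1, p2):
    """Simple F_ST = (H_T - H_S) / H_T from the reference-allele frequency
    in two equally weighted populations. Returns None if H_T == 0."""
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return None
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    print(site_pi([2, 2]))          # two copies of each allele
    print(f_is(25, 50, 25))         # Hardy-Weinberg proportions
    print(f_st_two_pops(1.0, 0.0))  # populations fixed for different alleles
```

Computing values like these at every SNP along a chromosome is what makes a genome scan possible: regions where F_ST is unusually high or π unusually low stand out against the genome-wide background.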

Any SNP dataset in VCF format can also be imported into the Stacks populations module. SNPs generated from re-sequencing or RNA-seq, among other methods, can now be filtered/smoothed in the same way RAD data can.
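
As an illustration of the kind of filtering that can be applied to any VCF dataset, the following Python sketch drops sites whose minor allele frequency falls below a threshold. It is an assumption-laden toy, not the populations module: the function names and the 0.05 default are hypothetical, it only handles biallelic sites, and it relies on the VCF convention that sample columns start at the tenth field with GT as the first sub-field.

```python
def minor_allele_freq(vcf_line):
    """Return the minor allele frequency of a biallelic VCF data line,
    or None if no genotypes were called."""
    fields = vcf_line.rstrip("\n").split("\t")
    ref_count = alt_count = 0
    for sample in fields[9:]:                       # sample columns
        gt = sample.split(":")[0].replace("|", "/")  # GT is the first sub-field
        for allele in gt.split("/"):
            if allele == "0":
                ref_count += 1
            elif allele == "1":
                alt_count += 1
            # "." (missing) and other alleles are ignored in this sketch
    total = ref_count + alt_count
    if total == 0:
        return None
    return min(ref_count, alt_count) / total


def filter_by_maf(vcf_lines, min_maf=0.05):
    """Keep header lines and data lines whose MAF is at least min_maf."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        maf = minor_allele_freq(line)
        if maf is not None and maf >= min_maf:
            kept.append(line)
    return kept


if __name__ == "__main__":
    header = "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL",
                        "FILTER", "INFO", "FORMAT", "s1", "s2", "s3"])
    variable = "\t".join(["chr1", "100", ".", "A", "G", ".", "PASS", ".",
                          "GT", "0/0", "0/1", "1/1"])
    fixed = "\t".join(["chr1", "200", ".", "C", "T", ".", "PASS", ".",
                       "GT", "0/0", "0/0", "0/0"])
    for line in filter_by_maf([header, variable, fixed]):
        print(line)
```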

Phylogenetics

Stacks can export GBS/RAD data for phylogenetic analysis. Identified SNPs can be concatenated and exported in Phylip format; these SNPs can be specified as fixed within and variable among populations, or simply all variable sites (encoded in IUPAC notation). Stacks can also export SNPs with their full flanking sequence -- the RAD locus. These data can be exported in Phylip format (either as concatenated or partitioned data) which can be fed into any standard phylogenetics package such as PhyML or RAxML.
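
The IUPAC encoding mentioned above can be sketched in a few lines of Python. This is a hypothetical illustration rather than the Stacks exporter: the function names are invented, the input is a made-up mapping from taxon name to a list of two-allele genotypes, and the output uses a relaxed Phylip layout (a header with taxon and site counts, then one "name sequence" line per taxon).

```python
# IUPAC codes for one or two distinct nucleotides at a site.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
}


def encode_genotype(allele1, allele2):
    """Collapse a diploid genotype into one IUPAC character ("N" if missing)."""
    if allele1 is None or allele2 is None:
        return "N"
    return IUPAC[frozenset((allele1, allele2))]


def to_phylip(genotypes):
    """Concatenate per-site genotypes into relaxed Phylip text.

    genotypes: dict mapping taxon name -> list of (allele1, allele2) tuples,
    with the same number of sites for every taxon.
    """
    seqs = {name: "".join(encode_genotype(a, b) for a, b in sites)
            for name, sites in genotypes.items()}
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) > 1:
        raise ValueError("all taxa must have the same number of sites")
    n_sites = lengths.pop() if lengths else 0
    lines = [f"{len(seqs)} {n_sites}"]
    lines += [f"{name} {seq}" for name, seq in seqs.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = {
        "pop1": [("A", "A"), ("C", "T")],
        "pop2": [("A", "G"), ("T", "T")],
    }
    print(to_phylip(example), end="")
```

A heterozygous or polymorphic site is thus represented by a single ambiguity character, which is what lets variable sites be concatenated into one sequence per taxon.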

Getting started with Stacks

Frequently Asked Questions

Tutorials

How do the major Stacks parameters control the de novo formation of stacks and loci?

Pipeline components

The Stacks pipeline is designed modularly to perform several different types of analyses. Programs listed under Raw reads are used to clean and filter raw sequence data. Programs under Core represent the main Stacks pipeline: building loci (ustacks), creating a catalog of loci (cstacks), matching samples back against the catalog (sstacks), transposing the data (tsv2bam), adding paired-end reads to the analysis and calling genotypes (gstacks), and population genomics analysis (populations). Programs under Execution control will run the whole pipeline.
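
The first three core steps can be pictured with a deliberately simplified Python toy. It is only a conceptual sketch: the function names and the min_depth parameter are hypothetical, reads are grouped by exact identity, and none of the mismatch handling, statistical modeling, or file formats of the real programs is represented.

```python
from collections import Counter


def build_stacks(reads, min_depth=2):
    """Toy locus building: group identical reads and keep groups
    supported by at least min_depth reads."""
    counts = Counter(reads)
    return {seq for seq, depth in counts.items() if depth >= min_depth}


def build_catalog(sample_loci):
    """Toy catalog: give an integer ID to every distinct locus
    seen in any sample."""
    all_loci = sorted(set().union(*sample_loci.values()))
    return {seq: locus_id for locus_id, seq in enumerate(all_loci, start=1)}


def match_to_catalog(loci, catalog):
    """Toy matching: report the catalog IDs present in one sample."""
    return sorted(catalog[seq] for seq in loci if seq in catalog)


if __name__ == "__main__":
    raw = {
        "sample1": ["ACGT", "ACGT", "TTGA", "TTGA", "GGCC"],
        "sample2": ["ACGT", "ACGT", "CCAA", "CCAA"],
    }
    loci = {name: build_stacks(reads) for name, reads in raw.items()}
    catalog = build_catalog(loci)
    for name in raw:
        print(name, match_to_catalog(loci[name], catalog))
```

The point of the catalog is that every sample ends up described in terms of the same set of locus IDs, which is what allows genotypes to be compared across individuals in the later steps.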

Raw reads

Core

Execution control

Utility programs

Implementation

Stacks is implemented in C++, with some helper programs in Perl, and is parallelized using the OpenMP libraries. It will compile on GNU-based Linux systems or BSD-based Apple OS X systems. Stacks is released under the GNU GPL license.

Stacks was developed by Julian Catchen and Nicolas Rochette, with contributions from Angel Amores, Paul Hohenlohe, and Bill Cresko.

Mailing List

Subscribe to the stacks-user mailing list for technical help, and to discuss the use and development of Stacks.

Publications

Here are a few publications that have used the Stacks pipeline for data analysis. These papers show a variety of uses for the Stacks pipeline.

N. Rochette, A. Rivera-Colón, and J. Catchen. Stacks 2: Analytical methods for paired-end sequencing improve RADseq-based population genomics. Molecular Ecology, 28(21):4737-4754, 2019.
N. Rochette and J. Catchen. Deriving genotypes from RAD-seq short-read data using Stacks. Nature Protocols, 12:2640-2659, 2017.
J. Paris, J. Stevens, and J. Catchen. Lost in parameter space: a road map for Stacks. Methods in Ecology and Evolution, 8(10):1360-1373, 2017.
S. Bassham, J. Catchen, E. Lescak, F. von Hippel, and W. Cresko. Repeated Selection of Alternatively Adapted Haplotypes Creates Sweeping Genomic Remodeling in Stickleback. Genetics, 209:921-939, 2018.
J. Catchen, P. Hohenlohe, S. Bassham, A. Amores, and W. Cresko. Stacks: an analysis tool set for population genomics. Molecular Ecology, 22(11):3124-3140, 2013.
J. Catchen, A. Amores, P. Hohenlohe, W. Cresko, and J. Postlethwait. Stacks: building and genotyping loci de novo from short-read sequences. G3: Genes, Genomes, Genetics, 1:171-182, 2011.
A. Amores, J. Catchen, A. Ferrara, Q. Fontenot, and J. Postlethwait. Genome evolution and meiotic maps by massively parallel DNA sequencing: Spotted gar, an outgroup for the teleost genome duplication. Genetics, 188:799-808, 2011.
P. Hohenlohe, S. Amish, J. Catchen, F. Allendorf, and G. Luikart. RAD sequencing identifies thousands of SNPs for assessing hybridization between rainbow trout and westslope cutthroat trout. Molecular Ecology Resources, 11(s1):117-122, 2011.
K. Emerson, C. Merz, J. Catchen, P. Hohenlohe, W. Cresko, W. Bradshaw, and C. Holzapfel. Resolving postglacial phylogeography using high-throughput sequencing. Proceedings of the National Academy of Sciences, 107(37):16196-16200, 2010.

Download Stacks

Recent Changes [updated August 23, 2024]