Stacks

Stacks is a software pipeline for building loci from short-read sequences, such as those generated on the Illumina platform. Stacks was developed to work with restriction enzyme-based data, such as RAD-seq, for the purpose of building genetic maps and conducting population genomics and phylogeography.

Download Stacks

Version 2.68
Recent Changes [updated August 23, 2024]

Stacks Pipeline

Genetic Maps

Stacks can be used to generate mappable markers from RAD-seq data. Thousands of markers can be generated for a single-generation F1 map, as well as for traditional F2 and backcross designs. Stacks can export data to JoinMap, OneMap, or R/qtl. These data can be used to examine genomic structure and to help build genome assemblies.

Population Genomics

Stacks can be used to identify SNPs within or among populations. Stacks provides tools to generate summary statistics and to compute population genetic measures such as FIS and π within populations and FST between populations, allowing for genome scans. Data can be exported in VCF format or in formats used by programs such as STRUCTURE or GenePop. Data can also be exported in HZAR format for cline analysis.
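
To make the measures above concrete, here is a minimal Python sketch of π, FIS, and FST at a single site, computed from allele counts. The function names are hypothetical, and the π-based FST shown is one simple formulation; whether it matches what Stacks computes internally is an assumption that should be checked against the Stacks manual.

```python
from math import comb

def nucleotide_diversity(allele_counts):
    """pi at one site: probability that two sampled alleles differ."""
    n = sum(allele_counts)
    if n < 2:
        raise ValueError("need at least two sampled alleles")
    # Pairs of identical alleles, divided by all possible pairs.
    identical_pairs = sum(comb(c, 2) for c in allele_counts)
    return 1.0 - identical_pairs / comb(n, 2)

def f_is(h_obs, h_exp):
    """Inbreeding coefficient: heterozygote deficit relative to expectation."""
    if h_exp == 0:
        return 0.0  # monomorphic site: no deficit can be measured
    return 1.0 - h_obs / h_exp

def f_st(pop_allele_counts):
    """pi-based FST (assumed formulation): within- vs. pooled diversity."""
    pooled = [sum(column) for column in zip(*pop_allele_counts)]
    pi_all = nucleotide_diversity(pooled)
    if pi_all == 0:
        return 0.0  # no variation anywhere, so no differentiation
    # Weight each population by its number of allele pairs.
    weights = [comb(sum(counts), 2) for counts in pop_allele_counts]
    within = sum(w * nucleotide_diversity(c)
                 for w, c in zip(weights, pop_allele_counts))
    return 1.0 - within / (pi_all * sum(weights))

if __name__ == "__main__":
    # Two populations fixed for different alleles at a biallelic site.
    print(f_st([[10, 0], [0, 10]]))
    # One population with 6 copies of one allele and 4 of the other.
    print(nucleotide_diversity([6, 4]))
```

A genome scan would apply such per-site values along the genome; any smoothing of those values is a separate step not shown here.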

Any SNP dataset in VCF format can also be imported into the Stacks populations module. SNPs generated from re-sequencing or RNA-seq, among other methods, can now be filtered/smoothed in the same way RAD data can.

Phylogenetics

Stacks can export GBS/RAD data for phylogenetic analysis. Identified SNPs can be concatenated and exported in Phylip format; these SNPs can be specified as fixed within and variable among populations, or simply all variable sites (encoded in IUPAC notation). Stacks can also export SNPs with their full flanking sequence -- the RAD locus. These data can be exported in Phylip format (either as concatenated or partitioned data) which can be fed into any standard phylogenetics package such as PhyML or RAxML.
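
As an illustration of the IUPAC encoding and concatenated export described above, the following Python sketch encodes the alleles seen at each site and writes a relaxed Phylip-style alignment. The names `IUPAC_CODES`, `encode_site`, and `to_phylip` are hypothetical, and the layout is not guaranteed to be identical to the files Stacks itself writes.

```python
# Standard IUPAC nucleotide codes for one or two alleles at a site.
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
}

def encode_site(alleles):
    """Return the IUPAC code for the alleles observed at one site.

    Anything other than one or two standard nucleotides becomes 'N'
    (a simplification chosen for this sketch).
    """
    key = frozenset(a.upper() for a in alleles)
    return IUPAC_CODES.get(key, "N")

def to_phylip(sequences):
    """Format {taxon name: sequence} as relaxed Phylip-style text."""
    lengths = {len(seq) for seq in sequences.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    n_sites = lengths.pop()
    lines = [f"{len(sequences)} {n_sites}"]
    for name, seq in sequences.items():
        lines.append(f"{name:<10} {seq}")
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    # Alleles observed per site in two hypothetical populations.
    observed = {
        "pop_a": ["A", "AG", "C"],
        "pop_b": ["G", "G", "CT"],
    }
    concatenated = {
        name: "".join(encode_site(site) for site in sites)
        for name, sites in observed.items()
    }
    print(to_phylip(concatenated), end="")
```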

Getting started with Stacks

Frequently Asked Questions

Pipeline components

The Stacks pipeline is designed modularly to perform several different types of analyses. Programs listed under Raw reads are used to clean and filter raw sequence data. Programs under Core represent the main Stacks pipeline: building loci (ustacks), creating a catalog of loci (cstacks), matching samples back against the catalog (sstacks), transposing the data (tsv2bam), adding paired-end reads to the analysis and calling genotypes (gstacks), and population genomics analysis (populations). Programs under Execution control will run the whole pipeline.
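
The order of the core programs can be sketched as a list of command lines. The Python below only builds and prints the commands; it does not run them. The sample names, directories, and population map path are hypothetical, and the flags shown are assumptions based on common Stacks 2 usage that should be verified against each program's help output before use.

```python
def core_pipeline_commands(samples, reads_dir, stacks_dir, popmap, threads=4):
    """Build the core Stacks commands in the order they are run (a sketch)."""
    t = str(threads)
    commands = []
    # 1. Build loci separately in each sample.
    for sample_id, sample in enumerate(samples, start=1):
        commands.append([
            "ustacks", "-f", f"{reads_dir}/{sample}.1.fq.gz",
            "-o", stacks_dir, "-i", str(sample_id),
            "--name", sample, "-p", t,
        ])
    # 2. Create a catalog of loci across samples.
    commands.append(["cstacks", "-P", stacks_dir, "-M", popmap, "-p", t])
    # 3. Match each sample back against the catalog.
    commands.append(["sstacks", "-P", stacks_dir, "-M", popmap, "-p", t])
    # 4. Transpose the data so it is organized by locus.
    commands.append(["tsv2bam", "-P", stacks_dir, "-M", popmap,
                     "--pe-reads-dir", reads_dir, "-t", t])
    # 5. Add paired-end reads and call genotypes.
    commands.append(["gstacks", "-P", stacks_dir, "-M", popmap, "-t", t])
    # 6. Population genomics analysis and export.
    commands.append(["populations", "-P", stacks_dir, "-M", popmap, "-t", t])
    return commands

if __name__ == "__main__":
    for cmd in core_pipeline_commands(
        samples=["sample_01", "sample_02"],   # hypothetical sample names
        reads_dir="./cleaned",                # hypothetical input directory
        stacks_dir="./stacks",                # hypothetical output directory
        popmap="./popmap.tsv",                # hypothetical population map
    ):
        print(" ".join(cmd))
```

In practice, the programs under Execution control wrap this sequence, so assembling the commands by hand is mainly useful for understanding or customizing individual steps.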

Raw reads

Core

Execution control

Utility programs

Implementation

Stacks is implemented in C++, with some helper programs in Perl, and is parallelized using OpenMP. It will compile on GNU-based Linux systems or BSD-based Apple OS X systems. Stacks is released under the GNU GPL.

Stacks was developed by Julian Catchen and Nicolas Rochette, with contributions from Angel Amores, Paul Hohenlohe, and Bill Cresko.

Mailing List

Subscribe to the stacks-user mailing list for technical help, and to discuss the use and development of Stacks.

Publications

Here are a few publications that have used the Stacks pipeline for data analysis. These papers show a variety of uses for the Stacks pipeline.