Stacks: tsv2bam

Stacks

tsv2bam

The tsv2bam program will transpose data so that it is oriented by locus, instead of by sample. It is executed after the core pipeline (ustacks→cstacks→sstacks) in de novo analyses. If paired-end reads are available, the tsv2bam program will pull in the set of paired-end reads that are associated with each single-end locus that was assembled de novo.

Program Options

tsv2bam -P stacks_dir -M popmap [-R paired_reads_dir]
tsv2bam -P stacks_dir -s sample [-s sample ...] [-R paired_reads_dir]

-P,--in-dir — input directory.
-M,--popmap — population map.
-s,--sample — name of one sample.
-R,--pe-reads-dir — directory containing the paired-end reads files (in fastq/fasta/bam (gz) format).
-t — number of threads to use (default: 1).
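
The second usage form repeats -s once per sample. As a rough sketch of how such a command line could be assembled from a list of sample names, the helper below prints the command instead of running it. The name build_tsv2bam_cmd is hypothetical and not part of Stacks; only the -P and -s options come from the list above.

```shell
# Hypothetical helper (not part of Stacks): print a tsv2bam command line
# with one -s flag per sample name. Nothing is executed.
build_tsv2bam_cmd() {
    stacks_dir=$1
    shift
    cmd="tsv2bam -P $stacks_dir"
    # Append "-s <sample>" for every remaining argument.
    for sample in "$@"; do
        cmd="$cmd -s $sample"
    done
    printf '%s\n' "$cmd"
}

build_tsv2bam_cmd ./stacks sample_1020 sample_1069
```

Printing the command first makes it easy to inspect before it is pasted into a shell or a job script.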

Example Usage

Processing single-end data, de novo.
Your Stacks directory should look similar to this, where the tags/snps/alleles/matches files were produced by the core pipeline (ustacks/cstacks/sstacks):

% ls stacks/
sample_1020.alleles.tsv.gz  sample_1069.alleles.tsv.gz  sample_1086.alleles.tsv.gz  sample_1095.alleles.tsv.gz
sample_1020.matches.tsv.gz  sample_1069.matches.tsv.gz  sample_1086.matches.tsv.gz  sample_1095.matches.tsv.gz
sample_1020.snps.tsv.gz     sample_1069.snps.tsv.gz     sample_1086.snps.tsv.gz     sample_1095.snps.tsv.gz
sample_1020.tags.tsv.gz     sample_1069.tags.tsv.gz     sample_1086.tags.tsv.gz     sample_1095.tags.tsv.gz
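
Before running tsv2bam, it may be worth confirming that every sample in the population map has all four core-pipeline files. The sketch below assumes a tab-separated population map with the sample name in the first column and the file naming shown in the listing above. The function name missing_stacks_files is hypothetical.

```shell
# Hypothetical pre-flight check (not part of Stacks): print the path of every
# tags/snps/alleles/matches file that is missing for a sample in the popmap.
# Assumes a tab-separated popmap with the sample name in column 1.
missing_stacks_files() {
    stacks_dir=$1
    popmap=$2
    cut -f1 "$popmap" | while IFS= read -r sample; do
        # Skip empty lines and comment lines.
        case $sample in ''|'#'*) continue ;; esac
        for kind in tags snps alleles matches; do
            f="$stacks_dir/${sample}.${kind}.tsv.gz"
            [ -e "$f" ] || printf '%s\n' "$f"
        done
    done
}
```

Calling `missing_stacks_files ./stacks ./popmap` prints nothing when the directory is complete, so an empty result suggests the inputs are in place.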

% tsv2bam -P ./stacks/ -M ./popmap -t 8

Processing paired-end data, de novo.
Your Stacks directory should look the same as above, but we expect to find the paired-end reads files in the samples directory:

% ls samples/
sample_1020.1.fq.gz      sample_1069.1.fq.gz      sample_1086.1.fq.gz      sample_1095.1.fq.gz
sample_1020.1.rem.fq.gz  sample_1069.1.rem.fq.gz  sample_1086.1.rem.fq.gz  sample_1095.1.rem.fq.gz
sample_1020.2.fq.gz      sample_1069.2.fq.gz      sample_1086.2.fq.gz      sample_1095.2.fq.gz
sample_1020.2.rem.fq.gz  sample_1069.2.rem.fq.gz  sample_1086.2.rem.fq.gz  sample_1095.2.rem.fq.gz

In this case, the sample_XXXX.1.fq.gz files were used by the core pipeline, and tsv2bam will now use the single-end read IDs from the assembled loci to find the corresponding paired-end reads in the sample_XXXX.2.fq.gz files.
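
Since the paired-end reads are looked up per sample, a quick check that each sample has a matching file can catch naming problems early. This is only a sketch: it assumes the sample_XXXX.2.fq.gz naming from the listing above (other supported formats would need a different suffix) and a tab-separated population map with the sample name in the first column. The function name missing_paired_reads is hypothetical.

```shell
# Hypothetical check (not part of Stacks): print the name of every sample in
# the popmap that has no <sample>.2.fq.gz file in the reads directory.
# Assumes gzipped FASTQ naming and a tab-separated popmap (sample in column 1).
missing_paired_reads() {
    reads_dir=$1
    popmap=$2
    cut -f1 "$popmap" | while IFS= read -r sample; do
        # Skip empty lines and comment lines.
        case $sample in ''|'#'*) continue ;; esac
        [ -e "$reads_dir/${sample}.2.fq.gz" ] || printf '%s\n' "$sample"
    done
}
```

For the example layout, `missing_paired_reads ./samples ./popmap` should print nothing if all four samples have their paired-end file.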

% tsv2bam -P ./stacks/ -M ./popmap -R ./samples -t 8
