The stacks-dist-extract script exports a particular section of a Stacks log or distribs file, either for easy viewing (e.g., using the --pretty option) or for plotting. If you supply a log path alone, stacks-dist-extract prints the names of the available sections. The log file can also be supplied via stdin, in which case the --section option is required.
The Stacks component programs tend to output two types of files, *.log files and *.distribs files. Although these are plain text files and can be viewed with standard UNIX tools (e.g., less, more, or cat), they can be large and can contain a number of different data sets of interest. stacks-dist-extract makes it easy to pull out a particular data set.
stacks-dist-extract logfile [section]
stacks-dist-extract [--pretty] [--out-path path] logfile [section]
cat logfile | stacks-dist-extract [--pretty] --section section
% stacks-dist-extract ./stacks/population_r80/populations.log.distribs
batch_progress
samples_per_loc_prefilters
missing_samples_per_loc_prefilters
snps_per_loc_prefilters
samples_per_loc_postfilters
missing_samples_per_loc_postfilters
snps_per_loc_postfilters
...
% stacks-dist-extract ./stacks/population_r80/populations.log.distribs samples_per_loc_prefilters
# Distribution of valid samples matched to a catalog locus prior to filtering.
n_samples n_loci
1 810
2 362
3 224
4 213
5 202
6 175
7 224
8 542
9 46792
10 49961
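A distribution like the one above is often summarized as "how many loci were found in at least N samples". The following is a minimal sketch of that summary, assuming the extracted section is tab-separated with a `#` comment line and a header row, as in the example output. The function name `loci_with_min_samples` is hypothetical and not part of Stacks; the numbers fed to it are the ones shown above.

```shell
# Hypothetical helper (not part of Stacks): given a two-column
# n_samples / n_loci distribution on stdin, print the number of loci
# seen in at least $1 samples, then the total number of loci.
loci_with_min_samples() {
    awk -F '\t' -v min="$1" '
        /^#/ || !/^[0-9]/ { next }     # skip the comment line and the header row
        { total += $2; if ($1 >= min) kept += $2 }
        END { printf "%d\t%d\n", kept, total }
    '
}

# Demonstration using the distribution shown above, rebuilt with printf
# so that the example does not depend on a Stacks run being available.
printf '%s\t%s\n' n_samples n_loci \
    1 810 2 362 3 224 4 213 5 202 \
    6 175 7 224 8 542 9 46792 10 49961 |
    loci_with_min_samples 8
```

With real data, the same function could be fed from a pipe, e.g. `stacks-dist-extract ./stacks/population_r80/populations.log.distribs samples_per_loc_prefilters | loci_with_min_samples 8`.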
% stacks-dist-extract ./stacks/gstacks.log.distribs
bam_stats_per_sample
effective_coverages_per_sample
phasing_rates_per_sample
% stacks-dist-extract ./stacks/gstacks.log.distribs bam_stats
sample records primary_kept kept_frac primary_kept_read2 primary_disc_mapq primary_disc_sclip unmapped secondary supplementary
S1_2023.01 2780637 2515438 0.905 1195103 26801 98337 80108 0 59953
S1_2023.07 3156646 2860191 0.906 1359700 27987 110763 89513 0 68192
S2_1999.13 2835542 2574684 0.908 1225169 25379 96962 81343 0 57174
...
% stacks-dist-extract ./stacks/gstacks.log.distribs --pretty bam_stats
sample      records  primary_kept  kept_frac  primary_kept_read2  primary_disc_mapq  primary_disc_sclip  unmapped  secondary  supplementary
S1_2023.01  2780637  2515438       0.905      1195103             26801              98337               80108     0          59953
S1_2023.07  3156646  2860191       0.906      1359700             27987              110763              89513     0          68192
S2_1999.13  2835542  2574684       0.908      1225169             25379              96962               81343     0          57174
...
% cat ./stacks/gstacks.log.distribs | stacks-dist-extract --section bam_stats_per_sample
% cat ./stacks/gstacks.log.distribs | stacks-dist-extract --section bam_stats_per_sample | tail -n +2 | cut -f 2 | awk '{s+=$1} END {print s/NR}'
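The pipeline above averages the second column by position (`cut -f 2`). As a sketch of a variant that selects the column by its header name instead, the hypothetical `mean_of_column` function below could be used; it is not part of Stacks, and it assumes tab-separated input whose first non-comment line is the header, which is what the use of `cut -f` above implies.

```shell
# Hypothetical helper (not part of Stacks): print the mean of the column
# named $1, reading a tab-separated table with a header row from stdin.
mean_of_column() {
    awk -F '\t' -v col="$1" '
        /^#/ { next }                  # skip comment lines
        !idx {                         # first remaining line is the header
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "no such column: " col > "/dev/stderr"; exit 1 }
            next
        }
        { sum += $idx; n++ }           # accumulate the selected column
        END { if (n) print sum / n }
    '
}

# Demonstration on the first three columns of the three sample rows
# shown earlier in this section, rebuilt with printf.
printf '%s\t%s\t%s\n' \
    sample records primary_kept \
    S1_2023.01 2780637 2515438 \
    S1_2023.07 3156646 2860191 \
    S2_1999.13 2835542 2574684 |
    mean_of_column records
```

Selecting by name avoids silently averaging the wrong column if the column order differs between files; with real data the function would simply replace the `tail | cut | awk` stages of the pipeline above.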