Synolog

synolog_fasta.py

The synolog_fasta.py script writes orthogroup-specific fasta files for all, or a subset, of the orthogroups inferred by a Synolog analysis.

Program Options

PLACEHOLDER

Example Usage

synolog_fasta.py requires several of the Synolog output files to be present in the directory provided to --path, specifically orthologs.tsv and orthogroups.tsv.

% ls ./path/to/synolog/output_files/ortho*
orthologs.tsv orthogroups.tsv orthogroups.gene.counts.tsv

If the --segmental flag is used, Synolog also expects the *_segmentalduplication_clusters.tsv files to be located in the specified --path.

% ls -1 ./path/to/synolog/output_files/*_segmentalduplication_clusters.tsv
org_a-org_b_segmentalduplication_clusters.tsv
org_a-org_c_segmentalduplication_clusters.tsv
org_b-org_c_segmentalduplication_clusters.tsv
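The input requirements above can be checked before launching a run. The following is a minimal sketch, not part of Synolog: the function name missing_inputs is hypothetical, and it assumes that at least one *_segmentalduplication_clusters.tsv file is enough to satisfy --segmental.

```python
from pathlib import Path

# File names taken from the requirements described above.
REQUIRED = ("orthologs.tsv", "orthogroups.tsv")
SEGMENTAL_PATTERN = "*_segmentalduplication_clusters.tsv"


def missing_inputs(path, segmental=False):
    """Return the required inputs that are not found in `path`.

    Hypothetical helper: it only checks that the files exist,
    not that their contents are valid.
    """
    directory = Path(path)
    missing = [name for name in REQUIRED if not (directory / name).is_file()]
    # Assumption: one matching cluster file is sufficient for --segmental.
    if segmental and not any(directory.glob(SEGMENTAL_PATTERN)):
        missing.append(SEGMENTAL_PATTERN)
    return missing
```

Calling missing_inputs("./path/to/synolog/output_files/", segmental=True) returns an empty list when everything needed is in place, and the names or pattern of whatever is absent otherwise.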

Below are a few examples showing the different modes and the output each one produces:

# generate fasta files for all orthogroups
% synolog_fasta.py --path ./path/to/synolog/output_files/ \
    --cache ./path/to/species/cache/ \
    --out-path ./path/to/output/directory/ \
    --all

# generate fasta files containing the longest transcripts for single-copy orthogroups
% synolog_fasta.py --path ./path/to/synolog/output_files/ \
    --cache ./path/to/species/cache/ \
    --out-path ./path/to/output/directory/ \
    --single-copy \
    --longest

# generate fasta files containing the longest transcripts for orthogroups of interest
% synolog_fasta.py --path ./path/to/synolog/output_files/ \
    --cache ./path/to/species/cache/ \
    --out-path ./path/to/output/directory/ \
    --longest \
    --orthogroups orthogroup_ids.txt # single column list of orthogroup IDs
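Because --orthogroups takes a single-column list of IDs, it can help to sanity-check that file before a long run. The sketch below is one possible check, not Synolog's own parser: the function name read_id_list is hypothetical, and skipping blank lines and duplicates is an assumption about what a clean list should look like.

```python
from pathlib import Path


def read_id_list(path):
    """Read a single-column ID file, skipping blank lines and duplicates.

    Hypothetical helper: raises ValueError if a line holds more than
    one whitespace-separated field, since one ID per line is expected.
    """
    ids = []
    seen = set()
    lines = Path(path).read_text().splitlines()
    for line_no, raw in enumerate(lines, start=1):
        entry = raw.strip()
        if not entry:
            continue  # ignore empty lines
        if len(entry.split()) != 1:
            raise ValueError(f"line {line_no}: expected one ID, got {entry!r}")
        if entry not in seen:
            seen.add(entry)
            ids.append(entry)  # keep first-seen order
    return ids
```

Running read_id_list("orthogroup_ids.txt") returns the IDs in file order, which also gives a quick count of how many orthogroup fasta files to expect.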

NOTE: When using multiple threads, Synolog writes temporary fasta files with the prefix TMP*. Do not remove these files. Once all records for an orthogroup have been collected, its file is renamed with the prefix Orthogroup*.
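The naming scheme in the note gives a simple way to see how far a run has progressed. The following is a rough sketch, assuming the files are written directly into the --out-path directory; the function name summarize_output is hypothetical, and only the TMP and Orthogroup prefixes come from the note above.

```python
from pathlib import Path


def summarize_output(out_path):
    """Count finished and still-temporary fasta files in `out_path`.

    Hypothetical helper: it only inspects file name prefixes and
    never deletes or renames anything.
    """
    names = [p.name for p in Path(out_path).iterdir() if p.is_file()]
    return {
        "finished": sum(name.startswith("Orthogroup") for name in names),
        "temporary": sum(name.startswith("TMP") for name in names),
    }
```

A nonzero "temporary" count means some orthogroups have not yet been renamed, so their files should be left in place.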

Other Pipeline Programs

Core

Species Cache

Execution control

Utility programs