csv_to_gene_db(1)

generate the appropriate headers for srst2


NAME

csv_to_gene_db - generate the appropriate headers for srst2

SYNOPSIS

csv_to_gene_db [options]

DESCRIPTION

This tool is part of the SRST2 suite.

Takes a CSV table detailing clustering and other attributes, together with the sequences for a gene database, and writes the result as FASTA. Expected CSV file format:

seqID,clusterid,gene,allele,(DNAseq),other....

Headers in the output will be srst2-compatible, i.e.

[clusterID]__[gene]__[allele]__[seqID] [other stuff]

Sequences can be read either from a specified column of the table or from a FASTA file (in that case, specify which column of the table contains the FASTA header to match in the sequence file).
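The header layout described above can be illustrated with a short Python sketch. This is not the tool's own code: the function names, the sample row, and the use of a zero-based column index are assumptions made for illustration, and the optional "[other stuff]" part of the header is left out.

```python
import csv
import io


def srst2_header(row):
    """Build a header from a row laid out as seqID,clusterid,gene,allele,..."""
    seq_id, cluster_id, gene, allele = row[:4]
    # Fields are joined with a double underscore, in the order
    # clusterID, gene, allele, seqID.
    return "__".join([cluster_id, gene, allele, seq_id])


def table_to_fasta(csv_text, seq_col):
    """Convert CSV text to FASTA text.

    seq_col is a zero-based Python index in this sketch (an assumption;
    it is not necessarily how the real --seq_col option counts columns).
    """
    records = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip empty lines
        records.append(">%s\n%s\n" % (srst2_header(row), row[seq_col]))
    return "".join(records)


# Hypothetical one-row table: seqID,clusterid,gene,allele,DNAseq
example = "1,7,blaTEM,blaTEM-1,ATGAGTATTCAACATTTC\n"
print(table_to_fasta(example, 4), end="")
```

For the sample row, the sketch emits the header `>7__blaTEM__blaTEM-1__1` followed by the sequence on the next line.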

OPTIONS

-h, --help

show this help message and exit

-t TABLE_FILE, --table=TABLE_FILE

table to read (csv)

-o OUTPUT_FILE, --out=OUTPUT_FILE

output file (fasta)

-s SEQ_COL, --seq_col=SEQ_COL

column number containing sequences

-f FASTA_FILE, --fasta=FASTA_FILE

fasta file to read sequences from (must specify which column in the table contains the sequence names that match the fasta file headers)

-c HEADERS_COL, --headers_col=HEADERS_COL

column number that contains the sequence names that match the fasta file headers
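When sequences come from a FASTA file rather than a table column, each table row has to be paired with the FASTA record whose header matches the name in the headers column. The Python sketch below shows one way such a lookup could work. It is an illustration only: the function names and sample data are hypothetical, matching on the first whitespace-delimited word of the FASTA header is an assumption, and the column index is zero-based here, which may differ from how the real option counts columns.

```python
def read_fasta(text):
    """Parse FASTA text into {first word of header: sequence}."""
    seqs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            seqs[name] = ""
        elif name is not None:
            # Sequence lines may be wrapped, so concatenate them.
            seqs[name] += line
    return seqs


def lookup_sequences(rows, fasta_text, headers_col):
    """Pair each table row with its sequence.

    headers_col is a zero-based index in this sketch (an assumption).
    Returns (paired, missing): rows with a sequence, and names not found.
    """
    seqs = read_fasta(fasta_text)
    paired, missing = [], []
    for row in rows:
        key = row[headers_col]
        if key in seqs:
            paired.append((row, seqs[key]))
        else:
            missing.append(key)
    return paired, missing


# Hypothetical data: the fifth column holds the FASTA header name.
fasta = ">seqA some description\nATGC\nGGTA\n>seqB\nTTAA\n"
rows = [
    ["1", "7", "geneX", "geneX-1", "seqA"],
    ["2", "7", "geneX", "geneX-2", "seqC"],
]
paired, missing = lookup_sequences(rows, fasta, 4)
print(paired)
print(missing)
```

Reporting unmatched names separately, as done here, makes it easier to spot rows whose headers column does not correspond to any record in the FASTA file.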

AUTHOR

This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.