transmute(1)

transform data, particularly within NCBI Entrez Direct

Section 1 ncbi-entrez-direct bookworm source

NAME

transmute - transform data, particularly within NCBI Entrez Direct

SYNOPSIS

transmute -x2p|-j2p

transmute -align [-a codes] [-g N] [-h N] [-w N]

transmute -j2x [-set tag] [-rec tag] [-nest policy]

transmute -a2x [-set tag] [-rec tag]

transmute -t2x|-c2x|-s2x (tbl2xml / csv2xml / scn2xml) [-set tag] [-rec tag] [-skip N] [-header] [-lower|-upper] [-indent|-flush] columnName1 ...

transmute -g2x (gbf2xml)

transmute -g2r (gbf2ref)

transmute -r2p (ref2pmid) [-options confirm|verbose|fast|slow|exact ...]

transmute -revcomp

transmute -remove [-first N] [-last N]

transmute -retain -leading N|-trailing N

transmute -replace -offset N|-column N [-delete N] [-insert seq] [-lower]

transmute -extract [-1-based] [-0-based] [-lower] feat_loc

transmute -cds2prot [-code N] [-frame N] [-stop] [-trim] [-part5] [-part3] [-every]

transmute -molwt [-met]

transmute -hgvs

transmute -counts

transmute -diff

transmute -codons -nuc seq -prot seq [-frame N] [-three]

transmute -search [-protein] [-circular] [-top] pattern ...

transmute -find [-relaxed] [-sensitive] [-whole] pattern ...

transmute -encodeXML|-decodeXML|-plainXML

transmute -encodeURL|-decodeURL

transmute -encode64|-decode64

transmute -plain

transmute -upper|-lower

transmute -aa1to3|-aa3to1

transmute -relax

transmute -format [fmt] [-xml declaration] [-doctype declaration] [-comment] [-cdata] [-combine] [-self] [-unicode style] [-script style] [-mathml terse]

transmute -filter element action target

transmute -normalize database

DESCRIPTION

transmute reads data from standard input, transforms it according to the specified mode, and writes the transformed data to standard output.

OPTIONS

Pretty-Printing

	-x2p		Reformat XML.
	-j2p		Reformat JSON.

-align

Table column alignment.

-a codes

Column alignment codes:

	l		Left.
	c		Center.
	r		Right.
	n		Numeric align on decimal point.
	N		Trailing zero-pad decimals.
	z		Leading zero-pad integers.
	m		Commas to group by 3 digits.
	M		Commas plus zero-pad decimals.

	-g N		Spacing between columns.
	-h N		Indentation before columns.
	-w N		Minimum column width.

Data Conversion

-j2x

Convert JSON stream to XML suitable for -path navigation.

	-set tag		Replace set wrapper tag.
	-rec tag		Replace record wrapper tag.
	-nest policy		Nested array naming policy.

-a2x

Convert text ASN.1 stream to XML suitable for -path navigation.

	-set tag		Replace set wrapper tag.
	-rec tag		Replace record wrapper tag.

-t2x, -c2x, -s2x

Convert tab-delimited table, comma-separated values file, or semicolon-delimited table, respectively, to XML.

	-set tag		Replace set wrapper tag.
	-rec tag		Replace record wrapper tag.
	-skip N		Skip the first N lines.
	-header		Use fields from first row for column names.
	-lower		Convert text to lowercase.
	-upper		Convert text to uppercase.
	-indent		Indent XML output.
	-flush		Do not indent XML output.
	columnName1 ...		XML object names per column.
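
The table-to-XML conversion described above can be pictured with a minimal Python sketch. This is not transmute's implementation: the helper name table_to_xml, the default wrapper tags Set and Rec, and the two-space indentation are assumptions made for illustration only.

```python
from xml.sax.saxutils import escape

def table_to_xml(text, columns, set_tag="Set", rec_tag="Rec", skip=0):
    """Hypothetical helper: wrap each tab-delimited row in rec_tag and all rows in set_tag."""
    out = [f"<{set_tag}>"]
    for line in text.splitlines()[skip:]:
        if not line:
            continue  # ignore empty lines
        out.append(f"  <{rec_tag}>")
        # Pair each column name with the field in the same position.
        for name, value in zip(columns, line.split("\t")):
            out.append(f"    <{name}>{escape(value)}</{name}>")
        out.append(f"  </{rec_tag}>")
    out.append(f"</{set_tag}>")
    return "\n".join(out)

# Skip one heading line, then name the two columns explicitly.
print(table_to_xml("id\tgene\n1\tBRCA1\n", ["Id", "Gene"], skip=1))
```

The skip parameter mirrors the role of -skip N, and the columns list mirrors the columnName arguments.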

-g2x

Convert GenBank/GenPept flatfile format to INSDSeq XML.

-g2r

Convert GenBank/GenPept flatfile format to Reference XML.

-r2p [-options option ...]

Reference Index XML lookup to find PMIDs. Supported option values:

	confirm		Recheck existing PMID claims.
	verbose		Add NOTE nodes explaining reasoning.
	fast		Prefilter candidates relatively heavily (default).
	slow		Prefilter candidates less heavily.
	exact		Require exact, unique title matches.

Sequence Editing

	-revcomp		Reverse complement nucleotide sequence.

-remove

Trim at ends of sequence.

	-first N		Delete first N bases or residues.
	-last N		Delete last N bases or residues.

-retain

Save either end of sequence.

	-leading N		Keep first N bases or residues.
	-trailing N		Keep last N bases or residues.
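
As a rough illustration of what -revcomp, -remove, and -retain describe, the following Python sketch performs the same kinds of edits on a plain string. The function names are hypothetical, only unambiguous bases (A, C, G, T) are complemented, and transmute's handling of edge cases may differ.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def revcomp(seq):
    """Complement each base, then reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]

def remove(seq, first=0, last=0):
    """Drop `first` characters from the start and `last` from the end."""
    end = max(first, len(seq) - last)  # never let the end cross the start
    return seq[first:end]

def retain_leading(seq, n):
    """Keep only the first n characters."""
    return seq[:n]

def retain_trailing(seq, n):
    """Keep only the last n characters."""
    return seq[-n:] if n > 0 else ""

print(revcomp("ATGC"))                       # GCAT
print(remove("ACGTACGT", first=2, last=3))   # GTA
print(retain_trailing("ACGTACGT", 3))        # CGT
```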

-replace

Apply base or residue substitution.

	-offset N		Skip ahead by 0-based count (SPDI), or
	-column N		Move just before 1-based position (HGVS).
	-delete N		Delete N bases or residues.
	-insert seq		Insert given sequence.
	-lower		Lower-case original sequence.
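
The difference between -offset (0-based, SPDI style) and -column (1-based, HGVS style) is easiest to see side by side. The sketch below is an assumption-laden illustration, not transmute's code: the function name replace and its keyword arguments are hypothetical, and it assumes that -lower lower-cases the original sequence so the inserted text stands out.

```python
def replace(seq, offset=None, column=None, delete=0, insert="", lower=False):
    """Delete `delete` characters at a position, then insert `insert` there."""
    if (offset is None) == (column is None):
        raise ValueError("give exactly one of offset or column")
    # A 1-based column N starts at the same place as a 0-based offset N - 1.
    start = offset if offset is not None else column - 1
    original = seq.lower() if lower else seq
    return original[:start] + insert + original[start + delete:]

# Both calls replace the fourth base (T) with G.
print(replace("ACGTACGT", offset=3, delete=1, insert="G"))              # ACGGACGT
print(replace("ACGTACGT", column=4, delete=1, insert="G", lower=True))  # acgGacgt
```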

-extract [-lower] feat_loc

Use xtract -insd ... feat_location instructions.

	-1-based		GenBank feat_location convention.
	-0-based		Alignment, or -insd feat_intervals.
	-lower		Lower-case extracted sequence.

Sequence Processing

-cds2prot

Translate coding region into protein.

	-code N		Use genetic code N (1 by default).
	-frame N		Offset in sequence.
	-stop		Include stop residue.
	-trim		Remove trailing Xs and *s.
	-part5		CDS partial at 5’ end.
	-part3		CDS extends past 3’ end.
	-every		Translate all codons.
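
For readers unfamiliar with coding-region translation, here is a minimal Python sketch using the standard genetic code (NCBI code 1). It only approximates the options above: the function name translate is hypothetical, frame is assumed to be a 0-based offset, unknown codons become X, and the -trim, -part5, -part3, and -every behaviors are not modeled.

```python
from itertools import product

# NCBI genetic code 1, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds, frame=0, stop=False):
    """Translate codon by codon, ending at the first stop codon."""
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(frame, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            if stop:
                protein.append(aa)  # keep the stop residue on request
            break
        protein.append(aa)
    return "".join(protein)

print(translate("ATGGCCTGGTAA"))             # MAW
print(translate("ATGGCCTGGTAA", stop=True))  # MAW*
```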

-molwt

Calculate molecular weight of peptide.

	-met		Do not cleave leading methionine.

Variation Processing

-hgvs

Convert Human Genome Variation Society variation format to XML.

Sequence Comparison

	-counts		Print summary of base or residue counts.
	-diff		Compare two aligned files for point differences.
	-codons		Display nucleotide codons above amino acid residues.

	-nuc seq		Nucleotide sequence.
	-prot seq		Protein sequence.
	[-frame N]		Offset in nucleotide sequence.
	[-three]		Use three-letter residue abbreviations.

Sequence Searching

-search

Search for one or more patterns in a sequence, skipping any FASTA definition line (with a leading >). Each pattern can have an optional alias, e.g., GGATCC:BamHI.

	-protein		Do not expand nucleotide ambiguity characters.
	-circular		Match patterns spanning the origin of a circular molecule.
	-top		Do not search reverse complements of non-palindromic patterns.
	pattern		Pattern to search for.
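
The search behavior described above (ambiguity expansion, reverse-complement matching, and wrap-around on circular molecules) can be sketched in Python as follows. This is an illustrative approximation only: the function name search, its keyword arguments, and the choice to report 1-based start positions on the given strand are assumptions, and pattern aliases such as GGATCC:BamHI are not handled.

```python
import re

# IUPAC nucleotide ambiguity codes expanded to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

def search(seq, pattern, circular=False, top=False):
    """Return sorted 1-based start positions of pattern on either strand."""
    seq, pattern = seq.upper(), pattern.upper()
    patterns = {pattern}
    if not top:
        # A palindromic pattern equals its reverse complement, so the set stays at one entry.
        patterns.add(pattern.translate(COMPLEMENT)[::-1])
    # For a circular molecule, append enough of the start to cover the origin.
    text = seq + seq[:len(pattern) - 1] if circular else seq
    hits = set()
    for pat in patterns:
        # A lookahead lets overlapping matches be reported.
        regex = re.compile("(?=" + "".join(IUPAC[c] for c in pat) + ")")
        hits.update(m.start() + 1 for m in regex.finditer(text))
    return sorted(hits)

print(search("AAGGATCCTT", "GGATCC"))               # [3]
print(search("TCCAAGGA", "GGATCC", circular=True))  # [6]
```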

Text Searching

-find

Find one or more patterns in text, allowing digits, spaces, punctuation, and phrases, e.g., "double, double toil and trouble".

	-relaxed		Match on words with letters and digits, ignoring spacing and punctuation.
	-sensitive		Case-sensitive match, distinguishing upper-case and lower-case letters.
	-whole		Match on whole words or multi-word phrases; implies -relaxed.
	pattern		Pattern to search for.

String Transformations

XML

	-encodeXML		XML-encode <, >, &, ", and ' characters.
	-decodeXML		Decode XML entity references.
	-plainXML		Remove embedded mixed-content tags and compress runs of spaces.
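
A rough Python equivalent of the XML encoding and decoding described above, using only the standard library. Whether transmute writes the apostrophe as &apos; or as a numeric reference is not stated here, so the entity choice below is an assumption, as are the function names.

```python
from xml.sax.saxutils import escape, unescape

QUOTES = {'"': "&quot;", "'": "&apos;"}
UNQUOTES = {entity: char for char, entity in QUOTES.items()}

def encode_xml(text):
    """Escape <, >, and & plus both quote characters."""
    return escape(text, QUOTES)

def decode_xml(text):
    """Reverse encode_xml for the five predefined XML entities."""
    return unescape(text, UNQUOTES)

print(encode_xml('a<b & "c"'))  # a&lt;b &amp; &quot;c&quot;
```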

URL

	-encodeURL		Compress runs of spaces, and URI-escape the result.
	-decodeURL		URI-unescape the input.
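
The two-step behavior of -encodeURL (compress spaces, then escape) can be sketched in Python. The function names are hypothetical, and the sketch assumes percent-encoding of every reserved character, including spaces as %20; transmute's exact set of escaped characters may differ.

```python
import re
from urllib.parse import quote, unquote

def encode_url(text):
    """Collapse runs of spaces to one space, then percent-encode."""
    compressed = re.sub(r" +", " ", text)
    return quote(compressed, safe="")

def decode_url(text):
    """Undo percent-encoding."""
    return unquote(text)

print(encode_url("a   b&c=d"))  # a%20b%26c%3Dd
```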

Base64

	-encode64		Base64-encode the input.
	-decode64		Base64-decode the input.

Accent

-plain

Strip accents from the input.
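
One common way to strip accents, shown here as a Python sketch, is to decompose each character and drop the combining marks. This is an assumption about the general technique, not a description of transmute's internals; letters that have no decomposition (for example ø or ß) pass through unchanged in this sketch.

```python
import unicodedata

def strip_accents(text):
    """Decompose characters (NFD) and discard combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(strip_accents("Señor Müller café"))  # Senor Muller cafe
```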

Case

	-upper		Convert the input to uppercase.
	-lower		Convert the input to lowercase.

Protein

	-aa1to3		Convert amino acids from 1-character to 3-character format.
	-aa3to1		Convert amino acids from 3-character to 1-character format.
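
The residue conversions amount to a table lookup. The Python sketch below covers only the 20 standard amino acids and assumes three-letter codes are concatenated without separators; the function names, the Xaa and X fallbacks, and the output layout are illustrative assumptions.

```python
THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}
ONE = {three.upper(): one for one, three in THREE.items()}

def aa1to3(protein):
    """Map each one-letter residue to its three-letter abbreviation."""
    return "".join(THREE.get(c, "Xaa") for c in protein.upper())

def aa3to1(protein):
    """Read the input three characters at a time and map back to one letter."""
    chunks = (protein[i:i + 3].upper() for i in range(0, len(protein), 3))
    return "".join(ONE.get(chunk, "X") for chunk in chunks)

print(aa1to3("MAW"))        # MetAlaTrp
print(aa3to1("MetAlaTrp"))  # MAW
```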

Letters plus Digits

-relax

Remove all punctuation and compress whitespace.
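
A minimal Python sketch of this kind of relaxation follows. It assumes punctuation is replaced by a space before whitespace is compressed, so that hyphenated terms split into separate words; transmute may instead delete punctuation outright, and the function name relax is hypothetical.

```python
import re

def relax(text):
    """Keep letters and digits, turn punctuation into spaces, squeeze whitespace."""
    spaced = re.sub(r"[^\w\s]|_", " ", text)
    return re.sub(r"\s+", " ", spaced).strip()

print(relax("Double,  double: toil & trouble!"))  # Double double toil trouble
```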

Customized XML Reformatting

-format [fmt]

	compact		Compress runs of spaces.
	flush		Suppress line indentation.
	indent		Indent according to nesting depth.
	expand		Place each attribute on a separate line.

-xml declaration

Use the given XML declaration.

-doctype declaration

Use the given document type declaration.

-comment

Preserve comments.

-cdata

Preserve CDATA blocks.

-combine

If the input contains multiple top-level documents, combine them.

-self

Keep empty self-closing tags.

-unicode style

How to handle Unicode superscript and subscript digits (first converted to ASCII form in all cases).

	fuse		Run them all together, with no additional markup.
	space		Add spaces between digits in different positions.
	period		Add periods between digits in different positions.
	brackets		Surround superscripts by square brackets and subscripts by parentheses.
	markdown		Surround superscripts with carets and subscripts with tildes.
	slash		Add backslashes when going up in height and forward slashes when going down.
	tag		Put superscripts in XML sup elements and subscripts in sub elements.
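
Four of the styles above (fuse, brackets, markdown, and tag) can be illustrated with a short Python sketch that converts runs of Unicode superscript and subscript digits to ASCII and wraps them. The function name convert_scripts and the exact output are assumptions based on the descriptions in the list; the space, period, and slash styles are not modeled.

```python
import re

SUPERSCRIPTS = "⁰¹²³⁴⁵⁶⁷⁸⁹"
SUBSCRIPTS = "₀₁₂₃₄₅₆₇₈₉"
TO_ASCII = str.maketrans(SUPERSCRIPTS + SUBSCRIPTS, "0123456789" * 2)

# (superscript wrapper, subscript wrapper) for each modeled style.
WRAPPERS = {
    "fuse": (("", ""), ("", "")),
    "brackets": (("[", "]"), ("(", ")")),
    "markdown": (("^", "^"), ("~", "~")),
    "tag": (("<sup>", "</sup>"), ("<sub>", "</sub>")),
}

def convert_scripts(text, style="fuse"):
    """Convert each run of raised or lowered digits to ASCII and wrap it."""
    (sup_open, sup_close), (sub_open, sub_close) = WRAPPERS[style]
    text = re.sub(f"[{SUPERSCRIPTS}]+",
                  lambda m: sup_open + m.group().translate(TO_ASCII) + sup_close,
                  text)
    return re.sub(f"[{SUBSCRIPTS}]+",
                  lambda m: sub_open + m.group().translate(TO_ASCII) + sub_close,
                  text)

print(convert_scripts("H₂O", "tag"))      # H<sub>2</sub>O
print(convert_scripts("x²", "brackets"))  # x[2]
```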

-script style

How to handle XML sup and sub elements (denoting superscripts and subscripts, respectively).

	brackets		Surround superscripts by square brackets and subscripts by parentheses.
	markdown		Surround superscripts with carets and subscripts with tildes.

-mathml terse

Flatten MathML markup tersely.

XML Modification

-filter element action target

Actions:

	retain		Keep matching elements (no-op).
	remove		Remove matching elements.
	encode		HTML-escape special characters.
	decode		Decode HTML escapes.
	shrink		Compress runs of spaces.
	expand		Place each attribute on a separate line.
	accent		Strip off Unicode accents.

Targets:

	content		Plain-text content.
	cdata		CDATA blocks.
	comment		Comments.
	object		The whole object.
	attributes		Attributes.
	container		Start and end tags.

EFetch XML Normalization

-normalize database

Adjust XML fields to conform to common conventions.