cleanasn(1)
NAME
cleanasn - clean up irregularities in NCBI ASN.1 objects
SYNOPSIS
cleanasn [-] [-A filename] [-B str] [-C str] [-D str] [-F str] [-K str] [-L filename] [-M filename] [-N str] [-O str] [-P str] [-Q str] [-R] [-S str] [-T] [-U str] [-V str] [-X str] [-Z str] [-a str] [-b] [-c] [-d str] [-f str] [-i filename] [-j filename] [-k filename] [-m str] [-n path] [-o filename] [-p path] [-q path] [-r path] [-v path] [-x ext]
DESCRIPTION
cleanasn is a utility program to clean up irregularities in NCBI ASN.1 objects.
OPTIONS
A summary of options is included below.
-
Print usage message
-A filename
Accession list file
-B str
Branch, per the flags in str:
    c   Has coding regions
    d   No coding regions
    p   Passes validation
    q   Validator errors or rejects
    r   Only pop/phy/mut/eco/WGS sets
    s   Exclude pop/phy/mut/eco/WGS sets
    t   Only nuc-prot sets
    u   Exclude nuc-prot sets
    v   Only segmented sequences
    w   Exclude segmented sequences
    x   Only segmented proteins
    y   Exclude segmented proteins
-C str
Sequence operations, per the flags in str:
    c   Compress
    d   Decompress
    l   Recalculate segmented sequence length
    v   Virtual gaps inside segmented sequence
    s   Convert segmented set to delta sequence
    t   Non-NucProt segmented set to delta sequence
    u   Improved non-NucProt segmented set to delta sequence
    g   Raw to delta by assembly gap
    m   Merge assembly gap features
-D str
Clean up descriptors, per the flags in str:
    t   Remove Title
    c   Remove Comment
    n   Remove Nuc-Prot Set title
    e   Remove Pop/Phy/Mut/Eco Set title
    m   Remove mRNA title
    p   Remove Protein title
    a   Title to name
    b   AutoDef title or name
    x   Prefix title with organism name
-F str
Clean up features, per the flags in str:
    u   Remove User-objects
    d   Remove db_xrefs
    e   Remove /evidence and /inference
    g   Fuse multi-interval genes
    i   Fuse adjacent-interval imported features
    r   Remove redundant gene xrefs
    f   Fuse duplicate features
    s   Package features on referenced Bioseq
    k   Package coding-region or parts features
    z   Delete or update EC numbers
    b   Set Best coding-region reading frame
    x   Retranslate coding regions
    a   Adjust for missing stop codon
-K str
Perform a general cleanup, per the flags in str:
    b   BasicSeqEntryCleanup
    p   C++ BasicCleanup (via an external utility)
    v   AdvancedSeqEntryCleanup
    s   SeriousSeqEntryCleanup
    x   ExtendedSeqEntryCleanup
    g   GpipeSeqEntryCleanup
    n   Normalize descriptor order
    u   Remove NcbiCleanup User Objects
    c   Synchronize genetic Codes
    f   CDS partial from translation
    e   Impose CDS partials
    d   Resynchronize CDS partials
    m   Resynchronize mRNA partials
    t   Resynchronize Peptide partials
    a   Adjust consensus splice
    i   Promote to "worst" Seq-ID
    r   Reassign local IDs
    l   Remove locus
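As an illustration of the -K flags, the following sketch builds a command that applies BasicSeqEntryCleanup (b) and normalizes descriptor order (n) on a single input file. The file names are hypothetical, and combining several flag letters in one string is an assumption drawn from the "per the flags in str" wording. The command is printed first and executed only if cleanasn and the input file are actually present.

```shell
# Hypothetical file names; substitute real paths.
in=input.asn
out=cleaned.asn

# Hold the argument vector in the positional parameters so that paths
# containing spaces survive intact.
# -K bn : BasicSeqEntryCleanup plus descriptor-order normalization
#         (combining the letters is an assumption, see the text above).
set -- cleanasn -i "$in" -o "$out" -K bn
cmdline="$*"
echo "$cmdline"

# Execute only when the tool is installed and the input is readable.
if command -v cleanasn >/dev/null 2>&1 && [ -r "$in" ]; then
    "$@"
fi
```

Printing the command before running it makes it easy to check the flag string, which matters here because each letter selects a different cleanup.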
-L filename
Log file
-M filename
Macro file
-N str
Clean up links, per the flags in str:
    o   Link CDS mRNA by Overlap
    p   Link CDS mRNA by Product
    l   Link CDS mRNA by Label and Location
    r   Reassign feature IDs
    m   Merge colliding feature IDs
    f   Fix missing reciprocal feature IDs
    c   Clear feature IDs
-O str
Missing prot-ref name
-P str
Publication options:
    a   Remove All publications
    s   Remove Serial number
    f   Remove Figure, numbering, and name
    r   Remove Remark
    u   Update PMID-only publication
    j   Lookup ISO Journal title abbreviation
    m   Merge identical publication features
    #   Replace unpublished with PMID
-Q str
Report:
    c   Record count
    r   ASN.1 BSEC report
    s   ASN.1 SSEC report
    n   NORM vs. SSEC report
    e   PopPhyMutEco AutoDef report
    o   Overlap report
    l   Latitude-longitude country diff
    d   Log SSEC differences
    g   GenBank SSEC diff
    f   asn2gb/asn2flat diff
    h   Seg-to-delta GenBank diff
    v   Validator SSEC diff
    m   Modernize Gene/RNA/PCR
    u   Unpublished Pub lookup
    p   Published Pub lookup
    j   Unindexed Journal report
    t   tRNA anticodon report
    w   Component offset report
    x   Custom scan
-R
Remote fetching from ID (NCBI sequence databases)
-S str
Selective difference filter (capital letters skip)
    s   SSEC
    b   BSEC
    A   Author
    p   Publication
    l   Location
    r   RNA
    q   Qualifier sort order
    g   GenBank block
    k   Package CdRegion or parts features
    m   Move publication
    o   Leave duplicate Bioseq publication
    d   Automatic definition line
    e   Pop/Phy/Mut/Eco Set definition line
-T
Taxonomy Lookup
-U str
Modernize, per the flags in str:
    g   Genes
    r   RNA
    p   PCR Primers
-V str
Remove features by validator severity:
    r   Reject
    e   Error
    w   Warning
    i   Info
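A minimal sketch of how -V might be used: the command below asks cleanasn to remove features flagged at Reject or Error severity while leaving Warning and Info findings alone. The file names are hypothetical, and passing two severity letters in one string is an assumption; the exact removal behavior should be confirmed against the tool itself.

```shell
# Hypothetical file names; substitute real paths.
in=submission.asn
out=submission.filtered.asn

# -V re : remove features with validator severity Reject (r) or Error (e).
#         Combining the two letters in one string is an assumption.
set -- cleanasn -i "$in" -o "$out" -V re
cmdline="$*"
echo "$cmdline"

# Execute only when the tool is installed and the input is readable.
if command -v cleanasn >/dev/null 2>&1 && [ -r "$in" ]; then
    "$@"
fi
```

Because this option deletes features, writing to a separate output file rather than overwriting the input keeps the original record available for comparison.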
-X str
Miscellaneous options, per str:
    d   Automatic definition line
    s   Automatic definition line with Source qualifiers
    e   Pop/Phy/Mut/Eco Set definition line
    n   Instantiate NC title
    m   Instantiate NM titles
    x   Special XM titles
    p   Instantiate Protein titles
    g   GPipe instantiate titles
    c   Create mRNAs for coding sequences
    f   Fix reciprocal protein_id/transcript_id
    v   Revert preRNA or ncRNA transcript_id
    t   Parse anticodon from Sequence
    b   Batch cleanup of multireader output
    z   Wrap SegSet with NucProt set
    w   GFF/WGS genome cleanup
-Z str
Remove indicated User-object
-a str
ASN.1 type
    a   Any (default)
    e   Seq-entry
    b   Bioseq
    s   Bioseq-set
    m   Seq-submit
    t   Batch Bioseq-set
    u   Batch Seq-submit
-b
Input ASN.1 is Binary
-c
Input ASN.1 is Compressed
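The -a, -b, and -c options together describe the input encoding. As a sketch, the command below reads a compressed binary file that holds a batch Bioseq-set. The file names are hypothetical, the man page does not say which compression format -c expects, and -K b is included only so that the run performs some cleanup.

```shell
# Hypothetical file names; the compression format expected by -c is not
# stated in the man page, so the .gz suffix is illustrative only.
in=batch.aso.gz
out=batch.cleaned.asn

# -a t : treat the input as a batch Bioseq-set
# -b   : input ASN.1 is binary
# -c   : input ASN.1 is compressed
# -K b : BasicSeqEntryCleanup (chosen only to give the run something to do)
set -- cleanasn -i "$in" -o "$out" -a t -b -c -K b
cmdline="$*"
echo "$cmdline"

# Execute only when the tool is installed and the input is readable.
if command -v cleanasn >/dev/null 2>&1 && [ -r "$in" ]; then
    "$@"
fi
```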
-d str
Source database
    a   Any (default)
    g   GenBank
    e   EMBL
    d   DDBJ
    b   EMBL or DDBJ
    i   INSD
    r   RefSeq
    n   NCBI
    x   Exclude EMBL/DDBJ
    y   Exclude gbcon, gbest, gbgss, gbhtg, gbpat, gbsts
-f str
Substring filter
-i filename
Single input file (defaults to stdin)
-j filename
First filename
-k filename
Last filename
-m str
Flatfile mode:
    r   Release
    e   Entrez
    s   Sequin
    d   Dump
-n path
asn2flat executable (default is /netopt/ncbi_tools/bin/asn2flat)
-o filename
Single output file (defaults to stdout)
-p path
Process all matching files in path
-q path
ffdiff executable (default is /netopt/genbank/subtool/bin/ffdiff)
-r path
Path for results
-v path
asnval executable (default is /netopt/ncbi_tools/bin/asnval)
-x ext
File selection suffix for use with -p (defaults to .ent)
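The -p, -x, and -r options allow a whole directory to be processed in one run. The sketch below selects files ending in .asn from a source directory, writes results to a separate directory, and records messages in a log file. All directory and file names are hypothetical, and creating the results directory beforehand is a precaution rather than a documented requirement.

```shell
# Hypothetical locations; substitute real paths.
src_dir=./records
out_dir=./cleaned
log=cleanasn.log

# -p : process all matching files in src_dir
# -x : select files by suffix (the default would be .ent)
# -r : path for results
# -L : log file
# -K b : BasicSeqEntryCleanup
set -- cleanasn -p "$src_dir" -x .asn -r "$out_dir" -L "$log" -K b
cmdline="$*"
echo "$cmdline"

# Execute only when the tool is installed and the source directory exists.
if command -v cleanasn >/dev/null 2>&1 && [ -d "$src_dir" ]; then
    mkdir -p "$out_dir"   # assumption: cleanasn may not create this itself
    "$@"
fi
```

The -j and -k options (first and last filename) could be added to the same command to restrict the run to a range of files within the directory.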
AUTHOR
The National Center for Biotechnology Information.
SEE ALSO
asndisc(1), asnval(1), sequin(1).