seqfmt(5)
Sequences formats" 4
Description
SEQFMT
NAME
seqfmt - Sequences formats" 4
DESCRIPTION
This document illustrates some common formats used for sequences representation.
|
EMBL |
ID MMVASPHOS
standard; RNA; EST; 140 BP.
AC X97897;
DE M.musculus mRNA for protein homologous to
DE vasodilator-stimulated phosphoprotein
SQ Sequence 140 BP; 25 A; 58 C; 39 G; 17 T; 1 other;
ttctcccaga agctgactct atggngaccc cgagagagac tgagcagaac 60
ccccgcaccc ctgcacttcc aatcaggggc gccccgggag cactccccgt 120
ccgccctccg cgcagccatg 140
//
FASTA
>MMVASPHOS
ttctcccagaagctgactctatggngaccccgagagagactgagcagaacctggagccag
ccccgcacccctgcacttccaatcaggggcgccccgggagcactccccgtggcgcgccgc
ccgccctccgcgcagccatg
|
GCG |
!!NA_SEQUENCE
1.0
(No documentation)
dna1.txt Length: 88 Nov 22, 2001 14:38 Type: N Check: 3818
..
1 TAGTCGTAGT CGGAGCGATG CTGACGATGA CGATGACGAT CGTAGCTGAT
51 CGATCGAGCT GATGCTGATC GAGCTAGCTG ATCGATCG
|
GDE |
#sample1
TTCAAGAGAAACAGCGGCCAAGGAAAAGACTCGGCATGATTGTCCATAGCTTACAAAGCG
#sample2
TTCAAGAGAAACAGCGGCTGGGGGAAAGACTCGTCCTGATTGCCTGTAGATGGTAAAGCG
GENBANK
LOCUS HUMHBV1 130 bp DNA PRI
17-JUN-1993
DEFINITION Human DNA/endogenous Hepatitis B virus (HBV) DNA,
left
host viral junction.
ACCESSION M15770
BASE COUNT 32 a 43 c 29 g 26 t
ORIGIN
1 agcgggcagt gcagctgctt ggacagcagg ggtgtttctt caacccaggc
61 ctcctgtcac aacaggccca ttcaattctg aacctgcaag ccaactccaa
121 cctcttttcc cagggggaac caaaaaccct
//
|
IG |
; comment
U03518
AACCTGCGGAAGGATCATTACCGAGTGCGGGTCCTTTGGGCCCAACCTCCCATCCGTGTC
TATTGTACCCTGTTGCTTCGGCGGGCCCGCCGCTTGTCGGCCGCCGGGGGGGCGCCTCTG
TGAGTTGATTGAATGCAATCAGTTAAAACTTTCAACAATGGATCTCTTGGTTCCGGC1
NBRF (pir)
>P1;CCHU
cytochrome c [validated] - human
MGDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTAANKNKGIIW
GEDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKATNE*
CODATA
ENTRY CCHU #type complete
TITLE cytochrome c [validated] - human
ACCESSIONS A31764; A05676; I55192; A00001
SUMMARY #length 105 #molecular-weight 11749 #checksum 3247
SEQUENCE
5 10 15 20 25 30
1 M G D V E K G K K I F I M K C S Q C H T V E K G G K H K T
G
31 P N L H G L F G R K T G Q A P G Y S Y T A A N K N K G I I
W
61 G E D T L M E Y L E N P K K Y I P G T K M I F V G I K K K
E
91 E R A D L I A Y L K K A T N E
///
|
RAW |
ttctcccagaagctgactctatggngaccccgagagagactgagcagaacctggagccag
ccccgcacccctgcacttccaatcaggggcgccccgggagcactccccgtggcgcgccgc
ccgccctccgcgcagccatg
Warning: This format cannot handle more than one sequence per file.
SWISSPROT
ID 100K_RAT STANDARD; PRT; 149
AA.
AC Q62671;
DE 100 kDa protein (EC 6.3.2.-).
SQ SEQUENCE 149 AA; 17004 MW; D06484B8BC29112E CRC64;
MMSARGDFLN YALSLMRSHN DEHSDVLPVL DVCSLKHVAY VFQALIYWIK
PQLERKRTRE LLELGIDNED SEHENDDDTS QSATLNDKDD ESLPAETGQN
SITIRPPDDQ HLPTANTCIS RLYVPLYSSK QILKQKLLLA IKTKNFGFV
//
SEE ALSO
squizz(1), alifmt(5)
AUTHOR
Nicolas Joly (njoly@pasteur.fr), Institut Pasteur.