FFE CONFIGURATION
ffe uses
the configuration file for extracting fields from the input
file and for formatting the fields for output. Every line or
binary block of the input file is considered as a record.
The default configuration file is ~/.fferc, but another
file can be given with the ’-c’ option.
Configuration
file for ffe is a text file. The file may contain
empty lines. Commands are case-sensitive. Comments begin
with the #-character and end at the end of the line. The
string and char definitions can be enclosed in
double quotation ’"’ characters.
char is a single character. string and
char can contain the following escape codes:
’\a’,’\b’,’\t’,’\n’,’\v’,’\f’,
’\r’, ’\"’ and
’\#’. Character ’\’ can be escaped
as ’\\’.
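The comment and escape rules above can be sketched in a small fragment. The output name demo is hypothetical, and the pictures are only illustrations of the quoting rules, not a recommended format:

```
# Comments start with '#' and run to the end of the line.
output demo {
    # Tab between name and value; the value is wrapped in literal quotes.
    data "%n\t\"%t\"\n"
    # A literal '#' inside a string is written as \#.
    record_header "\# record %o\n"
}
```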
Command
Substitution allows the output of a command to replace parts
of the configuration file. Syntax for command substitution
is:
`command`
The command is executed and the
`command` is substituted with the
standard output of the command, with any trailing newlines
deleted. Command substitutions may not be nested.
Before executing
the command, ffe sets a few environment variables:
FFE_STRUCTURE
The name of the structure given
using -s,--structure.
FFE_OUTPUT
The name of the output file
given using -o,--output.
FFE_FORMAT
The name of the output format
given using -p,--print.
FFE_FIRST_FILE
The name of the first input
file.
FFE_FILES
A list of all input files.
If a variable is
already set, it will not be replaced.
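As a minimal sketch of command substitution, the fragment below defines two constants from command output. The constant names host and structure_name are hypothetical, and the second line assumes the command is run through a shell so that $FFE_STRUCTURE is expanded:

```
# Hypothetical constant names; the commands are ordinary shell commands.
const host `hostname`
# Assumes shell expansion of the variable that ffe sets before execution.
const structure_name `echo $FFE_STRUCTURE`
```

Both commands print a single word, so the substituted text needs no quoting.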
Input file structure
Input file
structures are specified with keyword structure:
structure
name {options...}
Each option must
end with a newline. The options are:
type fixed|binary|separated [char]
[*]
Fields in the input are fixed
length text fields, fixed length binary fields or text
fields separated by char. If * is given, multiple
sequential separators are considered as one. Default
separator is comma.
quoted [char]
Fields may be quoted with
char, default quotation mark is double quotation mark
’"’. A quotation mark is assumed to be
escaped as \char or doubling the mark as
charchar in input. Non escaped quotation marks are
not preserved in output.
header
first|all|no
Controls the occurrence of the
header line. Default is no. If set as first or all, the
first line of the first input file is considered as header
line containing the names of the fields. First means that
only the first file has a header, all means that all files
have a header, although the names are still taken from the
header of the first file. The header line is handled according
to the record definition, meaning that the name positions,
separators etc. are the same as for the fields.
output name
All records belonging to this
structure are printed according to output format name.
The default is to use the output named ’default’.
record name
{options...}
Defines one record for a
structure. A structure can contain several record types.
Record options:
id
position string
rid position regexp
Identifies a record in the
input file. Records are identified by the string or
by the regular expression in regexp in input record
position position. For fixed length and binary input
the position is the byte position of the input record
and for separated input the position means the
position’th field of the input record.
Positions start from one.
Id’s
are required only if input structure contains several record
types with equal lengths or field counts. Non printable
characters can be escaped as \xnn where nn is
the hexadecimal value of the character.
A record
definition can contain several id’s; then all
id’s must match the input line
(id’s are combined with logical and).
In a
multi-record binary structure every record must have at
least one id.
field
name|FILLER|* [length]|*
[lookup]|* [output]
Specifies one field in a text
input structure. length is mandatory for fixed length
input structure except for the last field. If the last field
of a fixed length input structure has a * in place of
length then the last field can have arbitrary
length.
Length is also
used for printing fields in fixed length format using the
%D or %C directive. The order of fields in the
configuration file is essential: it specifies the field
order in a record.
If
’*’ is given instead of the name, then the
’name’ will be the ordinal number of the field,
or if the ’header’ option has value
’first’ or ’all’, then the name of
the field will be taken from the header line (first line of the
input).
If
lookup is given, then the field’s contents are used to
make a lookup in lookup table lookup. If length is
not needed (separated format) but lookup is needed, use an
asterisk (*) in place of the length definition.
If
output is given, the field is printed using output
output. Use an asterisk in place of lookup if lookup is
not needed.
Naming the
field as FILLER causes the field not to be printed in
output.
field
name|FILLER|*
[length]|type [lookup]|*
[output]
Specifies one field in a binary
input structure. All other features are the same as for the text
structure except the type parameter. type
specifies the field data type and length and can have the
following values:
char
Printable character.
short
Short integer having current system length and byte
order.
int
Integer having current system length and byte order.
long
Long integer having current system length and byte
order.
llong
Long long integer having current system length and byte
order.
ushort
Unsigned short integer having current system length and byte
order.
uint
Unsigned integer having current system length and byte
order.
ulong
Unsigned long integer having current system length and byte
order.
ullong
Unsigned long long integer having current system length and
byte order.
int8
8 bit integer.
int16_be
Big endian 16 bit integer.
int32_be
Big endian 32 bit integer.
int64_be
Big endian 64 bit integer.
int16_le
Little endian 16 bit integer.
int32_le
Little endian 32 bit integer.
int64_le
Little endian 64 bit integer.
uint8
Unsigned 8 bit integer.
uint16_be
Unsigned big endian 16 bit integer.
uint32_be
Unsigned big endian 32 bit integer.
uint64_be
Unsigned big endian 64 bit integer.
uint16_le
Unsigned little endian 16 bit integer.
uint32_le
Unsigned little endian 32 bit integer.
uint64_le
Unsigned little endian 64 bit integer.
float
Float having current system length and byte order.
float_be
Float having current system length and big endian byte
order.
float_le
Float having current system length and little endian byte
order.
double
Double having current system length and byte order.
double_be
Double having current system length and big endian byte
order.
double_le
Double having current system length and little endian byte
order.
bcd_be_len
Bcd number having length len and nybbles in big
endian order.
bcd_le_len
Bcd number having length len and nybbles in little
endian order.
hex_be_len
Hexadecimal data in big endian order having length
len.
hex_le_len
Hexadecimal data in little endian order having length
len.
If
length is given instead of the type, then the
field is assumed to be a printable string having length
length. String is printed until length
characters are printed or NULL character is found.
Bcd number
(bcd_be_len and bcd_le_len) is
printed until len bytes are read or a nybble having
hexadecimal value f is found. Bcd number having big
endian order is printed in order: most significant nybble
first and least significant nybble second and bcd number
having little endian order is printed in order: least
significant nybble first and most significant nybble second.
Bytes are always read in big endian order.
Hexadecimal
data (hex_be_len and hex_le_len)
is printed as hexadecimal values. Big endian data is printed
starting from the lower address and little endian data
starting from the upper address.
field-count
number
Same effect as having field
* number times. Because length is not specified,
this works only with separated structure.
fields-from
record
Fields for this record are the
same as for record record.
output name
This record is printed
according to output format name. The default is to use the
output format specified in the structure.
level number
[element_name|*] [group_name]
Level can be used if the
contents of a file should be printed as a hierarchical,
multi-level nested document. Use * instead of the
element name if it is not needed. number is the level of the
record, starting from number one (highest level),
element_name is the name for the record,
group_name is used to group records in the same and
lower levels. Only number is a mandatory parameter.
record-length
strict|minimum
strict
Input record length and field count must match the record
definition for the record to be processed. This is the default
value.
minimum
Input record length and field count can be the same as or
longer than defined for the record. The rest of the input line
is ignored.
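The structure options above can be combined as in the following sketch. All structure, record and field names are hypothetical, and the field lengths are chosen only for illustration. The first structure distinguishes two fixed length record types by an id in byte position 1, the second reads semicolon-separated input whose field names come from the header line, and the third reads binary records:

```
# Fixed length input with two record types, told apart by the first byte.
structure bank {
    type fixed
    record header {
        id 1 H
        field type 1
        field date 8
    }
    record payment {
        id 1 P
        field type 1
        field account 10
        field amount 8
    }
}

# Separated input; field names are taken from the first line of the first file.
structure people {
    type separated ;
    quoted
    header first
    record person {
        field-count 3
    }
}

# Binary input; a 4 byte printable string followed by two typed fields.
structure sensor {
    type binary
    record sample {
        field tag 4
        field id uint16_be
        field value float_le
    }
}
```

Because the two bank records have different lengths, the id lines are not strictly required there; they are shown to illustrate the syntax.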
Output definitions
There can be
several output definitions in the configuration file. The format
can be selected with the ’-p’ option. The default format
is named ’default’.
output name|default {options...}
Defines one output format.
Output named as ’default’ will be used if none
is given for structure or record, or none is given with
option ’-p’.
There are two
predefined output formats, no and raw.
no suppresses all printing and raw prints the
original input data.
Output options
Pictures in
output definition can contain printf-style %-directives:
%f
Name of the input file.
%s
Name of the current structure.
%r
Name of the current record.
%o
Input record number in current file.
%O
Input record number starting from the first file.
%i
Byte offset of the current record in the current file. Starts from zero.
%I
Byte offset of the current record starting from the first file. Starts from zero.
%n
Field name.
%t
Field contents, without leading and trailing whitespace.
%d
Field contents. Binary integer is printed as a decimal value. Floating
point number is printed in the style [-]ddd.ddd, where the number of
digits after the decimal-point character is 6. Bcd number is printed as
a decimal number and hexadecimal data as consecutive hexadecimal values.
%D
Field contents, right padded to the field length (requires length
definition for the field).
%C
Field contents, right padded to the field length (requires length
definition for the field). Output field is cut if the input field is
longer than the field length.
%x
Unsigned hexadecimal value of a binary integer. Other fields are printed
using directive %d.
%l
Value from lookup.
%L
Value from lookup, right padded to the field length (requires length
definition for the field).
%e
Does not print anything, but still causes the "field empty" check to be
performed. Can be used when only the names of non-empty fields should be
printed.
%p
Field’s start position in a record. For fixed structure this is the
field’s byte position in the input line and for separated structure this
is the ordinal number of the field. Starts from one.
%h
Hexadecimal dump of a field. Byte values are printed as consecutive
xnn values, where the nn is the hexadecimal value of a byte. Data
is printed before any endian conversion.
%g
Group name given by the keyword group_name in record definition.
%m
Element name given by the keyword element_name in record definition.
%%
Percent sign.
file_header
picture
Picture is printed once before
file contents.
file_trailer
picture
Picture is printed once after
file contents.
header picture
If specified, then the header
line describing the field names is printed before records.
Every field name is printed according to the picture
using the same separator and field lengths as defined for
the fields. Picture can contain only the %n
directive.
data picture
Field contents are printed
according to picture.
lookup picture
If field is mapped to lookup
table, this picture will be used instead of picture from
data option. If not given, then picture from
data will be used.
separator
string
All fields are terminated by
string, except the last field of the record. Default
is not to print separator.
record_header
picture
picture is printed
before the record content. Default is not to print
header.
record_trailer
picture
picture is printed after
the record content. Default is newline.
justify
left|right|char
Fields are left or right
justified. char justifies output according to the first
occurrence of char in the data picture. Default is
left.
indent string
Record contents are indented by
string. Field contents are indented by two times the
string. Default is not to indent.
field-list
name1,name2,...
Only fields or constants named
as name1,name2,... are printed; this has the same effect as
the ’-f’ option. Default is to print all the
fields. Fields are also printed in the same order as they
are listed.
no-data-print
yes|no
When set as no and
field-list is given, suppresses printing of
record_header and record_trailer in case where
current record contains none of the fields specified in
field-list.
field-empty-print
yes|no
When set as no, nothing is
printed for fields which consist entirely of characters from
empty-chars. If none of the fields of a record are
printed then the printing of record_trailer is also
suppressed. Default is yes.
empty-chars
string
string specifies a set
of characters which define an "empty" field.
Default is " \f\n\r\t\v" (space, form-feed,
newline, carriage return, horizontal tab and vertical
tab).
output-file
file
Output is written to
file instead of the default output. If - is given the
standard output is used.
group_header
string
If a record has a level and
group name defined, string is printed before the
first record in the same group or if the group name has
changed in the same level.
group_trailer
string
If a record has a level and
group name defined, string is printed after the
records in lower levels or if the group name has changed in
the same level or if a higher level record is found.
element_header
string
If a record has a level and
element name defined, string is printed before the
record’s contents.
element_trailer
string
If a record has a level and
element name defined, string is printed after the
record’s contents.
hex-caps
yes|no
Print hexadecimal numbers in
capital letters. Default is no.
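The output options above can be put together as in the following sketch. The output names csv and report are hypothetical, and the pictures are only examples of the directives listed earlier:

```
# Comma-separated output with a line of field names before the records.
output csv {
    header "%n"
    data "%t"
    separator ","
    record_trailer "\n"
}

# A labelled listing that skips fields consisting only of empty characters.
output report {
    file_header "Report start\n"
    record_header "Record %O (%r)\n"
    data "%n: %t\n"
    indent "  "
    field-empty-print no
    record_trailer "\n"
    file_trailer "Report end\n"
}
```

Either format could then be selected with the ’-p’ option or named in the output option of a structure or record.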
Lookup definitions
lookup
name {options...}
Defines one lookup table.
Lookup options:
search
exact|longest
The search type for lookup
table.
default-value
value
value is printed if the
lookup is not successful.
pair key value
One key/value pair for the
lookup table.
file name
[separator]
Key/value pairs are read from
file name. Every line is considered as a key/value
pair separated by separator. Default separator is
semicolon.
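A lookup table and its use from a field definition might look like the sketch below. The table name town, the keys and values, and the structure and field names are all hypothetical:

```
lookup town {
    search exact
    default-value "Unknown"
    pair 010 Helsinki
    pair 020 Turku
}

structure addresses {
    type fixed
    record address {
        field Name 20
        # The third parameter maps this field to the lookup table above.
        field TownCode 3 town
    }
}

output with_towns {
    data "%n: %t\n"
    # Used instead of the data picture for fields mapped to a lookup table.
    lookup "%n: %l\n"
}
```

In a separated structure, where no length is needed, the same mapping would be written as field TownCode * town.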
Constants
In addition to
input fields, constant values can be printed using the option
-f,--field-list or the output option
field-list. A constant is printed using the
data output option.
Constants are
specified as
const name value
When the name appears in
a field list, value is printed for every record
as if the name were one of the input fields.
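As a minimal sketch, a constant can be listed next to an input field. The names source and tagged are hypothetical, FirstName is assumed to be a field of the input structure, and placing const at the top level of the configuration file is an assumption:

```
# Hypothetical constant; its value is printed with the data picture.
const source "legacy"

output tagged {
    field-list FirstName,source
    data "%n=%d\n"
}
```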
Input Preprocessor
It is possible
to define an input preprocessor for ffe. An input
preprocessor is simply an executable program which writes
the contents of the input file to standard output, which will
be read by ffe. If the input preprocessor does not
write any characters on its standard output, then ffe
uses the original file.
To set up an
input preprocessor, set the FFEOPEN environment
variable to a command line which will invoke your input
preprocessor. This command line should include one
occurrence of the string %s, which will be replaced
by the input filename when the input preprocessor command is
invoked.
The input
preprocessor is not used if ffe is reading standard
input.
EXAMPLES
Example of a fixed
length flat file containing fields
’FirstName’,’LastName’ and
’Age’:
John     Ripper       23
Scott    Tiger        45
Mary     Moore        41
This file can be
printed in XML with the following configuration:
structure personnel {
    type fixed
    output XML
    record person {
        field FirstName 9
        field LastName 13
        field Age 2
    }
}
output XML {
    file_header "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n"
    data "<%n>%d</%n>\n"
    record_header "<%r>\n"
    record_trailer "</%r>\n"
    indent " "
}
SEE ALSO
More examples are in the
Texinfo manual. If info and ffe are
properly installed, the command
info ffe
should give more
information.
AUTHOR
Timo Savinen
<tjsa@iki.fi>