setop(1)

make set of strings from input


NAME

setop - make set of strings from input

SYNOPSIS

setop [-h] [--quiet | --verbose] [-C] [--include-empty] [-n insepar | -l elregex] [-o outsepar] [-t trimchars] [-u|i|s] [inputfilename]* [-d filename]* [-# | --is-empty | -c element | -e filename | -b filename | -p filename]

DESCRIPTION

Apply set operations like union, intersection, or set difference to input files and print the resulting set (sorted, with unique string elements) to standard output, or answer special queries such as the number of elements.

OPTIONS

--help

produce this help message and exit

--version

output name and version

--quiet

suppress all output messages in case of special queries (e.g. when checking whether an element is contained in the set)

--verbose

always use output messages in case of special queries (i. e. also output message on success)

-C [ --ignore-case ]

handle input elements case-insensitively

--include-empty

don't ignore empty elements (these can come from empty lines, trimming, etc.)

-n [ --input-separator ] arg

describe the form of an input separator as a regular expression in ECMAScript syntax; the default is a newline (if --input-element is not given); when you set the input separator manually, remember to include the newline character \n if lines should still be split

-l [ --input-element ] arg

describe the form of input elements as a regular expression in ECMAScript syntax

-o [ --output-separator ] arg (=\n)

string for separating output elements; escape sequences are allowed

-t [ --trim ] arg

trim all given characters at beginning and end of elements (escape sequences allowed)

-u [ --union ]

unite all given input sets (default)

-i [ --intersection ]

intersect all given input sets

-s [ --symmetric-difference ]

build symmetric difference for all given input sets

-d [ --difference ] arg

subtract all elements in given file from output set

-# [ --count ]

just output the number of (distinct) elements, don't list them

--is-empty

check if resulting set is empty

-c [ --contains ] arg

check if given element is contained in set

-e [ --equal ] arg

check set equality, i. e. check if output corresponds with content of file

-b [ --subset ] arg

check if content of file is subset of output set

-p [ --superset ] arg

check if content of file is superset of output set

No input filename or "-" is equal to reading from standard input.

The sequence of events in setop is as follows: First, all input files are parsed and combined according to one of the options -u, -i, or -s. After that, all inputs given with option -d are parsed and removed from the result of the first step. Finally, the desired output is printed to standard output: the set itself, or its number of elements, or a comparison to another set (option -e), etc.
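The three steps above can be modelled in a few lines of Python. This is only a sketch of the described order of operations, not setop's implementation: the names combine and run are hypothetical, the sort order is Python's code-point order, and folding the symmetric difference pairwise over more than two inputs is an assumption.

```python
from functools import reduce

# Hypothetical helper: step 1, merge all input sets with one operation
# (modelling -u, -i or -s).
def combine(sets, mode="union"):
    ops = {
        "union": set.union,
        "intersection": set.intersection,
        # Assumption: more than two inputs are folded pairwise.
        "symmetric_difference": set.symmetric_difference,
    }
    if not sets:
        return set()
    return reduce(ops[mode], sets)

# Hypothetical helper: steps 1 to 3 in the documented order.
def run(inputs, mode="union", subtract=()):
    # Step 1: combine the inputs.
    result = combine([set(s) for s in inputs], mode)
    # Step 2: remove everything given via -d.
    for s in subtract:
        result -= set(s)
    # Step 3: here the output is the sorted set itself; a query such as
    # -# would report len(result) instead.
    return sorted(result)

A = ["b", "a", "c"]
B = ["c", "d"]
print(run([A, B]))                          # union
print(run([A, B], "intersection"))          # intersection
print(run([A, B], subtract=[["a", "d"]]))   # union minus a third set
```

Note that the difference is always applied after the combination, regardless of where -d appears on the command line.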

By default, each line of an input stream is considered to be an element. You can change this by defining regular expressions with the options --input-separator or --input-element. When both are used, the input stream is first split according to the separator and then filtered by the desired input element form. After the elements have been found, they are trimmed according to the argument given with --trim. The option -C lets you treat Word and WORD as equal; only the first occurrence across all input streams is kept. Note that -C does not affect the regular expressions used in --input-separator and --input-element.
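The parsing order described above (split, filter, trim, case folding) can be sketched in Python as follows. This is a model under stated assumptions, not setop's code: parse_elements is a hypothetical name, Python's re syntax differs from ECMAScript (for example, it has no [[:space:]] class, so \s is used instead), and "filtering by element form" is assumed to mean that every match of the element expression inside a piece becomes an element.

```python
import re

# Hypothetical model of the documented parsing order.
def parse_elements(text, separator=None, element=None, trim="",
                   ignore_case=False, include_empty=False):
    # 1. Split: by the separator if given; by lines if neither option is set.
    if separator is not None:
        pieces = re.split(separator, text)
    elif element is None:
        pieces = text.split("\n")
    else:
        pieces = [text]
    # 2. Filter: assumption, every regex match inside a piece is an element.
    if element is not None:
        pieces = [m.group(0) for p in pieces for m in re.finditer(element, p)]
    seen = set()
    result = []
    for piece in pieces:
        # 3. Trim the given characters at both ends.
        if trim:
            piece = piece.strip(trim)
        # Empty elements are dropped unless --include-empty is modelled.
        if piece == "" and not include_empty:
            continue
        # 4. With -C, only the first spelling of an element is kept.
        key = piece.lower() if ignore_case else piece
        if key not in seen:
            seen.add(key)
            result.append(piece)
    return result

print(parse_elements("Word\nWORD\n\n  x \n", trim=" ", ignore_case=True))
print(parse_elements("a1 b22-c333", element=r"\d+"))
print(parse_elements("a b-c", separator=r"[\s-]"))
```

The function returns elements in order of first occurrence; the final set printed by setop is sorted.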

When describing strings and characters for the output separator or for the option --trim, you can use escape sequences like \t, \n, \" and \'. Be aware that some of these sequences (especially \\ and \") might be interpreted by your shell before the string is passed to setop. In that case you have to write \\\\ or \\\" just to describe a \ or a ", respectively. You can check your shell's behavior with: echo "\\ and \""
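To see what a POSIX-style shell would hand over to setop, Python's shlex module can serve as a rough stand-in for the shell's quoting rules. This is only an approximation (real shells differ in details), and shell_words is a hypothetical helper name.

```python
import shlex

# Hypothetical helper: split a command line the way a POSIX shell would,
# returning the argument strings a program such as setop receives.
def shell_words(command_line):
    return shlex.split(command_line)

# Inside double quotes the shell turns \\ into \ and \" into ".
print(shell_words(r'echo "\\ and \""')[1])

# Four backslashes in double quotes arrive as two, which setop can then
# read as the escape sequence for a single backslash.
print(shell_words(r'setop -o "\\\\"')[2])
```

Single quotes avoid this first layer of interpretation in POSIX shells, since backslashes inside them are passed through unchanged.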

Special boolean queries (e.g. checking whether an element is contained in the set) don't print anything on success; they only return the exit code EXIT_SUCCESS (0). If the query is unsuccessful (e.g. the element is not contained in the set), the exit code is guaranteed to differ from both EXIT_SUCCESS and EXIT_FAILURE (1); here it is 3. This way, setop can be used in the shell.
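A caller can use these exit codes to tell "no" apart from a real error. The following Python sketch shows one way to do that; interpret_exit_code and set_contains are hypothetical names, and treating every code other than 0 and 3 as an error is an assumption, since only those two values are described above.

```python
import subprocess

QUERY_TRUE = 0    # EXIT_SUCCESS: the query holds
QUERY_FALSE = 3   # documented value for an unsuccessful query

# Hypothetical helper: map a setop exit code to a boolean answer.
def interpret_exit_code(code):
    if code == QUERY_TRUE:
        return True
    if code == QUERY_FALSE:
        return False
    # Assumption: anything else (e.g. EXIT_FAILURE) is a real error.
    raise RuntimeError(f"setop failed with exit code {code}")

# Hypothetical wrapper: ask whether an element is in the union of the files.
# Requires setop to be installed; it is defined here but not called.
def set_contains(element, *filenames):
    proc = subprocess.run(["setop", "--quiet", "-c", element, *filenames])
    return interpret_exit_code(proc.returncode)

print(interpret_exit_code(0))
print(interpret_exit_code(3))
```

Raising on unexpected codes keeps a missing file or a bad option from being silently read as "element not found".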

EXAMPLES

setop -c ":fooBAR-:" --trim ":-\t" -C -d B.txt A.txt

case-insensitive check if element "foobar" is contained in A minus B

setop A.txt - -i B.txt --input-element "\d+"

output the intersection of standard input, A, and B, where elements are recognized as strings of digits with at least one character; i.e. elements are non-negative integers

setop -s A.txt B.txt --input-separator "[[:space:]-]"

find all elements contained in A *or* B, not both, where a whitespace (i. e. \v \t \n \r \f or space) or a minus is interpreted as a separator between elements