csvjoin(1)

manual page for csvjoin 1.0.7

Section 1 csvkit bookworm source

Description

CSVJOIN

NAME

csvjoin - manual page for csvjoin 1.0.7

DESCRIPTION

usage: csvjoin [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]

[-p ESCAPECHAR] [-z FIELD_SIZE_LIMIT] [-e ENCODING] [-L LOCALE] [-S] [--blanks] [--date-format DATE_FORMAT] [--datetime-format DATETIME_FORMAT] [-H] [-K SKIP_LINES] [-v] [-l] [--zero] [-V] [-c COLUMNS] [--outer] [--left] [--right] [-y SNIFF_LIMIT] [-I] [FILE ...]

Execute a SQL-like join to merge CSV files on a specified column or columns.

positional arguments:

FILE

The CSV files to operate on. If only one is specified, it will be copied to STDOUT.

optional arguments:

-h, --help

show this help message and exit

-d DELIMITER, --delimiter DELIMITER

Delimiting character of the input CSV file.

-t, --tabs

Specify that the input CSV file is delimited with tabs. Overrides "-d".

-q QUOTECHAR, --quotechar QUOTECHAR

Character used to quote strings in the input CSV file.

-u {0,1,2,3}, --quoting {0,1,2,3}

Quoting style used in the input CSV file. 0 = Quote Minimal, 1 = Quote All, 2 = Quote Non-numeric, 3 = Quote None.

-b, --no-doublequote

Whether or not double quotes are doubled in the input CSV file.

-p ESCAPECHAR, --escapechar ESCAPECHAR

Character used to escape the delimiter if --quoting 3 ("Quote None") is specified and to escape the QUOTECHAR if --no-doublequote is specified.

-z FIELD_SIZE_LIMIT, --maxfieldsize FIELD_SIZE_LIMIT

Maximum length of a single field in the input CSV file.

-e ENCODING, --encoding ENCODING

Specify the encoding of the input CSV file.

-L LOCALE, --locale LOCALE

Specify the locale (en_US) of any formatted numbers.

-S, --skipinitialspace

Ignore whitespace immediately following the delimiter.

--blanks

Do not convert "", "na", "n/a", "none", "null", "." to NULL.

--date-format DATE_FORMAT

Specify a strptime date format string like "%m/%d/%Y".

--datetime-format DATETIME_FORMAT

Specify a strptime datetime format string like "%m/%d/%Y %I:%M %p".

-H, --no-header-row

Specify that the input CSV file has no header row. Will create default headers (a,b,c,...).

-K SKIP_LINES, --skip-lines SKIP_LINES

Specify the number of initial lines to skip before the header row (e.g. comments, copyright notices, empty rows).

-v, --verbose

Print detailed tracebacks when errors occur.

-l, --linenumbers

Insert a column of line numbers at the front of the output. Useful when piping to grep or as a simple primary key.

--zero

When interpreting or displaying column numbers, use zero-based numbering instead of the default 1-based numbering.

-V, --version

Display version information and exit.

-c COLUMNS, --columns COLUMNS

The column name(s) on which to join. Should be either one name (or index) or a comma-separated list with one name (or index) per file, in the same order in which the files were specified. If not specified, the two files will be joined sequentially without matching.

--outer

Perform a full outer join, rather than the default inner join.

--left

Perform a left outer join, rather than the default inner join. If more than two files are provided this will be executed as a sequence of left outer joins, starting at the left.

--right

Perform a right outer join, rather than the default inner join. If more than two files are provided this will be executed as a sequence of right outer joins, starting at the right.

-y SNIFF_LIMIT, --snifflimit SNIFF_LIMIT

Limit CSV dialect sniffing to the specified number of bytes. Specify "0" to disable sniffing entirely, or "-1" to sniff the entire file.

-I, --no-inference

Disable type inference when parsing CSV input.

Note that the join operation requires reading all files into memory. Don’t try this on very large files.

SEE ALSO

The full documentation for csvjoin is maintained as a Texinfo manual. If the info and csvjoin programs are properly installed at your site, the command

info csvjoin

should give you access to the complete manual.