pkstatascii(1)

program to calculate basic statistics from text file

Section 1 pktools bookworm source

NAME

pkstatascii - program to calculate basic statistics from text file

SYNOPSIS

pkstatascii -i input [-c column] [options] [advanced options]

DESCRIPTION

pkstatascii calculates basic statistics of a data series in a text file.
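The following Python sketch illustrates the kind of summary that these statistics represent for one column of a text file. It is not the pkstatascii implementation. The inline sample data, the helper name read_column, the zero-based column index and the use of the sample variance (dividing by n - 1) are all assumptions made for illustration.

```python
import statistics

# Hypothetical input: two whitespace-separated columns, as they might
# appear in a text file.
text = """\
1 10.0
2 12.0
3 11.0
4 15.0
5 12.0
"""


def read_column(text, column=0):
    """Return one column (zero-based index, an assumption) as floats."""
    return [float(line.split()[column])
            for line in text.splitlines() if line.strip()]


values = read_column(text, column=1)

summary = {
    "size": len(values),
    "sum": sum(values),
    "min": min(values),
    "max": max(values),
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    # Sample variance (n - 1 in the denominator); whether the tool uses
    # n or n - 1 is not stated in this page.
    "var": statistics.variance(values),
    "stdev": statistics.stdev(values),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```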

OPTIONS

-i filename, --input filename

name of the input text file

-size, --size

report the sample size (number of values)

-rnd number, --rnd number

generate random numbers

-dist function, --dist function

distribution for generating random numbers, see http://www.gnu.org/software/gsl/manual/gsl-ref_toc.html#TOC320 (only uniform and Gaussian are currently supported)

-rnda value, --rnda value

first parameter for random distribution (mean value in case of Gaussian)

-rndb value, --rndb value

second parameter for random distribution (standard deviation in case of Gaussian)
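As a rough illustration of how the two distribution parameters might be interpreted, the Python sketch below draws samples from a Gaussian (mean a, standard deviation b) or a uniform distribution. It does not call pkstatascii or GSL. The function name generate, the reading of the -rnd number as a sample count, and the treatment of a and b as lower and upper bounds in the uniform case are assumptions.

```python
import random


def generate(count, dist="gaussian", a=0.0, b=1.0, seed=None):
    """Draw `count` random numbers from the named distribution.

    gaussian: a is the mean, b is the standard deviation.
    uniform:  a and b are taken as lower and upper bounds (an assumption).
    """
    rng = random.Random(seed)
    if dist == "gaussian":
        return [rng.gauss(a, b) for _ in range(count)]
    if dist == "uniform":
        return [rng.uniform(a, b) for _ in range(count)]
    raise ValueError(f"unsupported distribution: {dist}")


samples = generate(1000, "gaussian", a=10.0, b=2.0, seed=42)
print(sum(samples) / len(samples))  # close to the requested mean of 10
```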

-mean, --mean

calculate mean

-median, --median

calculate median

-var, --var

calculate variance

-stdev, --stdev

calculate standard deviation

-skew, --skewness

calculate skewness

-kurt, --kurtosis

calculate kurtosis

-sum, --sum

calculate sum of column

-mm, --minmax

calculate minimum and maximum value

-min, --min

calculate minimum value

-max, --max

calculate maximum value

-hist, --hist

calculate histogram

-hist2d, --hist2d

calculate 2-dimensional histogram based on two columns

-nbin value, --nbin value

number of bins to calculate histogram

-rel, --relative

report histogram values as percentages of the total rather than as absolute counts
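The histogram options can be pictured with the Python sketch below, which counts values in a fixed number of equal-width bins and optionally converts the counts to percentages. This is an illustration only. The function name histogram, the choice of bin limits at the data minimum and maximum, and the reporting of bin centers are assumptions and may differ from what pkstatascii does.

```python
def histogram(values, nbin, relative=False):
    """Return (bin center, count) pairs for nbin equal-width bins.

    Bins span min(values) to max(values); with relative=True the counts
    are expressed as percentages of the number of values.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / nbin or 1.0  # avoid zero width if all values are equal
    counts = [0] * nbin
    for v in values:
        # The maximum value falls into the last bin.
        index = min(int((v - lo) / width), nbin - 1)
        counts[index] += 1
    if relative:
        counts = [100.0 * c / len(values) for c in counts]
    centers = [lo + (i + 0.5) * width for i in range(nbin)]
    return list(zip(centers, counts))


values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
print(histogram(values, 4))
# [(2.0, 3), (4.0, 5), (6.0, 1), (8.0, 1)]
print(histogram(values, 4, relative=True))
# [(2.0, 30.0), (4.0, 50.0), (6.0, 10.0), (8.0, 10.0)]
```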

-kde, --kde

use kernel density estimation when producing the histogram. The kernel standard deviation (bandwidth) is estimated with Silverman’s rule of thumb
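A minimal Python sketch of kernel density estimation with a Gaussian kernel is given below to make the idea concrete. It assumes the common form of Silverman's rule of thumb, h = 1.06 * s * n^(-1/5), where s is the sample standard deviation and n the number of values. Whether pkstatascii uses this exact variant is not stated here, and the function names are hypothetical.

```python
import math
import statistics


def silverman_bandwidth(values):
    """Bandwidth from Silverman's rule of thumb (1.06 * s * n^(-1/5))."""
    return 1.06 * statistics.stdev(values) * len(values) ** (-0.2)


def kde(values, points, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate at the given points."""
    h = bandwidth if bandwidth is not None else silverman_bandwidth(values)
    norm = 1.0 / (len(values) * h * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - v) / h) ** 2) for v in values)
        for x in points
    ]


values = [1.0, 2.0, 2.5, 3.0, 4.5]
centers = [0.5 + i for i in range(5)]  # hypothetical bin centers
for x, density in zip(centers, kde(values, centers)):
    print(f"{x:.1f} {density:.4f}")
```

Compared with plain bin counts, the estimate is smooth and does not depend on where the bin edges fall, at the cost of depending on the bandwidth.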

-cor, --correlation

calculate Pearson product-moment correlation coefficient between two columns (defined by -c <col1> -c <col2>)

-rmse, --rmse

calculate root mean square error between two columns (defined by -c <col1> -c <col2>)

-reg, --regression

calculate linear regression between two columns and get correlation coefficient (defined by -c <col1> -c <col2>)

-regerr, --regerr

calculate linear regression between two columns and get root mean square error (defined by -c <col1> -c <col2>)
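The two-column statistics above can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's code: the function names are hypothetical, the regression is an ordinary least-squares fit of the second column on the first, and the regression error is taken to be the root mean square of the fit residuals.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation coefficient of two series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(x, y):
    """Root mean square error between two series of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def linear_fit(x, y):
    """Least-squares fit y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = linear_fit(x, y)
fitted = [intercept + slope * a for a in x]
print("correlation:", pearson(x, y))
print("rmse between columns:", rmse(x, y))
print("fit:", intercept, slope)
print("rmse of fit residuals:", rmse(fitted, y))
```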

-v level, --verbose level

verbose mode when positive

Advanced options

-src_min value, --src_min value

start reading source from this minimum value

-src_max value, --src_max value

stop reading source at this maximum value

-fs separator, --fs separator

field separator.

-r startrow [-r endrow], --range startrow [--range endrow]

first and last row to read. Use -r 1 -r 10 to read the first 10 rows when the first row is a header. Use 0 to read all rows when there is no header.

-o, --output

output the selected columns

-t, --transpose

transpose input ASCII vector (use in combination with --output)

-comment character, --comment character

comment character