pynlpl-sampler(1)
manual page for pynlpl-sampler 0.7.7
PYNLPL-SAMPLER
NAME
sampler - manual page for pynlpl-sampler 0.7.7
DESCRIPTION
usage: pynlpl-sampler [-h] [-t TESTSETSIZE] [-d DEVSETSITE] [-T TRAINSETSITE] [-S SEED] files [files ...]
Extracts random samples from datasets. Multiple parallel datasets (such as parallel corpora) are supported, provided that corresponding data is on the same line in each file.
positional arguments:
files
The data sets to sample from, must be of equal size (i.e., same number of lines)
optional arguments:
-h, --help
show this help message and exit
-t TESTSETSIZE, --testsetsize TESTSETSIZE
Test set size (lines) (default: 0)
-d DEVSETSITE, --devsetsite DEVSETSITE
Development set size (lines) (default: 0)
-T TRAINSETSITE, --trainsetsite TRAINSETSITE
Training set size (lines), leave unassigned (0) to automatically use all of the remaining data (default: 0)
-S SEED, --seed SEED
Seed for random number generator (default: 0)
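EXAMPLES
The sampling described above can be illustrated with a short Python sketch. This is not the pynlpl-sampler source; it only shows one plausible way to draw the same random line numbers from several parallel data sets. The function names split_indices and sample_parallel are hypothetical. Keeping each sample in original line order is an assumption made here, and the way the real tool names or writes its output files is not covered.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_indices(n_lines: int, test_size: int = 0, dev_size: int = 0,
                  train_size: int = 0, seed: int = 0
                  ) -> Tuple[List[int], List[int], List[int]]:
    """Partition line numbers 0..n_lines-1 into disjoint test, dev and train samples."""
    if test_size + dev_size + train_size > n_lines:
        raise ValueError("requested more lines than the data sets contain")
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    indices = list(range(n_lines))
    rng.shuffle(indices)
    test = sorted(indices[:test_size])
    dev = sorted(indices[test_size:test_size + dev_size])
    rest = indices[test_size + dev_size:]
    # A training set size of 0 means: use all of the remaining lines.
    train = sorted(rest if train_size == 0 else rest[:train_size])
    return test, dev, train


def sample_parallel(datasets: Sequence[Sequence[str]], test_size: int = 0,
                    dev_size: int = 0, train_size: int = 0, seed: int = 0
                    ) -> List[Dict[str, List[str]]]:
    """Apply one shared split to every data set so that parallel lines stay aligned."""
    sizes = {len(d) for d in datasets}
    if len(sizes) != 1:
        raise ValueError("data sets must have the same number of lines")
    test, dev, train = split_indices(sizes.pop(), test_size, dev_size,
                                     train_size, seed)
    return [
        {
            "test": [d[i] for i in test],
            "dev": [d[i] for i in dev],
            "train": [d[i] for i in train],
        }
        for d in datasets
    ]
```

Because the line numbers are chosen once and reused for every data set, line k of one sample always corresponds to line k of the matching sample from the other data sets. That alignment is the property the equal-size requirement on files exists to protect.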