datadings.commands.shuffle module
- usage: shuffle.py [-h] [-y] [--true-shuffle] [--buf-size BUF_SIZE]
[--chunk-size CHUNK_SIZE] infile outfile
Shuffle an existing dataset file.
Positional arguments
- infile
Input file.
- outfile
Output file.
- options:
- -h, --help
show this help message and exit
- -y, --no-confirm
Don’t require user interaction.
- --true-shuffle
Use a slower but more random shuffling algorithm.
- --buf-size BUF_SIZE
Size of the shuffling buffer for fast shuffling. Values less than 1 are interpreted as fractions of the dataset length. Bigger values improve randomness, but use more memory.
- --chunk-size CHUNK_SIZE
Size of chunks read by the fast shuffling algorithm. Bigger values improve performance, but reduce randomness.
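
The interplay of --buf-size and --chunk-size can be illustrated with a minimal Python sketch. It is not taken from the datadings source: the function names resolve_buf_size and chunked_buffer_shuffle are hypothetical, and the algorithm is a generic chunked buffer shuffle chosen only to show why a larger buffer improves randomness at the cost of memory, and why larger chunks mean fewer, longer sequential reads but less mixing. The only behavior taken from the help text above is that a buffer size less than 1 is treated as a fraction of the dataset length.

```python
import random
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def resolve_buf_size(buf_size: float, dataset_len: int) -> int:
    """Turn a --buf-size value into an absolute number of samples.

    Values less than 1 are treated as a fraction of the dataset length
    (as stated in the help text); the floor of 1 is an assumption made
    here so that the buffer is never empty.
    """
    if buf_size < 1:
        return max(1, int(buf_size * dataset_len))
    return int(buf_size)


def chunked_buffer_shuffle(
    samples: Sequence[T],
    buf_size: int,
    chunk_size: int,
    rng: random.Random,
) -> Iterator[T]:
    """Yield every sample exactly once in an approximately shuffled order.

    Hypothetical illustration, not the datadings implementation.
    """
    # Split the dataset into contiguous chunks and visit them in random
    # order. Each chunk is still read sequentially, which is why bigger
    # chunks are faster to read but leave neighbors closer together.
    starts = list(range(0, len(samples), chunk_size))
    rng.shuffle(starts)

    buffer: List[T] = []
    for start in starts:
        for sample in samples[start:start + chunk_size]:
            if len(buffer) < buf_size:
                # Fill the buffer before emitting anything.
                buffer.append(sample)
            else:
                # Emit a random buffered sample and put the new one in
                # its slot. A bigger buffer mixes samples over a wider
                # range but holds more of them in memory.
                i = rng.randrange(buf_size)
                yield buffer[i]
                buffer[i] = sample

    # Drain whatever is left in the buffer in random order.
    rng.shuffle(buffer)
    yield from buffer
```

With a dataset of 100 samples, resolve_buf_size(0.25, 100) gives a buffer of 25 samples, while resolve_buf_size(32, 100) keeps the absolute value 32. Whatever the settings, the sketch emits each input sample exactly once; only the degree of mixing changes.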