A number of useful tools are installed with datadings.
They are accessible on the command line as datadings-*,
where * is replaced with one of the submodule names.
The main tool to interact with datadings in this way is datadings-write.
It finds available datasets and runs their writing scripts.
These are the available tools:
- datadings-write creates new dataset files.
- datadings-cat prints the (abbreviated) contents of a dataset file.
- datadings-shuffle shuffles an existing dataset file.
- datadings-merge merges two or more dataset files.
- datadings-split splits a dataset file into two or more subsets.
- datadings-bench runs some basic read performance benchmarks.
You can either call them directly or run them as modules with
python -m datadings.commands.*, again with the star
replaced by the name of the command, e.g.,
python -m datadings.commands.write.
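
The two ways of invoking a tool can be sketched in Python as follows. This is only an illustration of the naming pattern described above: the helper names direct_command and module_command are hypothetical and not part of datadings, and the TOOLS tuple covers only the six tools listed in this section.

```python
import sys

# Tool names taken from the list of available tools in this section.
# Other submodules are not assumed to have a datadings-* script.
TOOLS = ("write", "cat", "shuffle", "merge", "split", "bench")


def direct_command(tool, *args):
    """Argument list for calling the installed script, e.g. datadings-cat."""
    if tool not in TOOLS:
        raise ValueError(f"unknown tool: {tool!r}")
    return [f"datadings-{tool}", *args]


def module_command(tool, *args):
    """Argument list for running the same tool as a module via python -m."""
    if tool not in TOOLS:
        raise ValueError(f"unknown tool: {tool!r}")
    return [sys.executable, "-m", f"datadings.commands.{tool}", *args]


if __name__ == "__main__":
    # Print both invocation forms for every listed tool.
    for tool in TOOLS:
        print(" ".join(direct_command(tool)))
        print(" ".join(module_command(tool)))
```

Either argument list can be handed to subprocess.run in an environment where datadings is installed. Any extra arguments are passed through unchanged, so which options each tool accepts is left to the tool itself.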
- datadings.commands.bench module
- datadings.commands.cat module
- datadings.commands.convert_index module
- datadings.commands.merge module
- datadings.commands.sample module
- datadings.commands.shuffle module
- datadings.commands.split module
- datadings.commands.write module