datadings.reader.sharded module
- class datadings.reader.sharded.ShardedReader(shards: str | Path | Iterable[str | Path | Reader])[source]
Bases: Reader
A Reader that combines several shards into one. Shards can be specified either as a glob pattern such as dir/*.msgpack for msgpack files, or as an iterable of individual shards. Each shard can be a string, a Path, or a Reader.
- Parameters:
shards – glob pattern or an iterable of strings, Path objects, or Readers
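One common way to present several shards as a single indexable sequence is to keep cumulative sample counts and binary-search them. The sketch below shows that idea in plain Python. The function name `locate` and its arguments are hypothetical, and nothing here is taken from the datadings implementation, which may work differently.

```python
from bisect import bisect_right
from itertools import accumulate


def locate(shard_lengths, index):
    """Map a global sample index to (shard number, index within that shard).

    Hypothetical helper for illustration; not part of datadings.
    """
    # offsets[i] is the global index of the first sample in shard i;
    # the last entry is the total number of samples.
    offsets = [0] + list(accumulate(shard_lengths))
    if not 0 <= index < offsets[-1]:
        raise IndexError(index)
    # Find the last shard whose first global index is <= index.
    shard = bisect_right(offsets, index) - 1
    return shard, index - offsets[shard]
```

With shard lengths `[3, 2, 4]`, global index 4 falls in the second shard at local index 1. Empty shards are skipped naturally, because their start offset equals that of the following shard.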
- get(index, yield_key=False, raw=False, copy=True)[source]
Returns sample at given index.
copy=False allows the reader to use zero-copy mechanisms. Data may be returned as memoryview objects rather than bytes. This can improve performance, but also drastically increase memory consumption, since one sample can keep the whole slice in memory.
- Parameters:
index – Index of the sample
yield_key – If True, returns (key, sample)
raw – If True, returns sample as msgpacked message
copy – if False, allow the reader to return data as memoryview objects instead of bytes
- Returns:
Sample at index.
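The memory caveat for copy=False follows from how memoryview works in Python in general: a view keeps its whole source buffer alive, however small the view is. The snippet below demonstrates this with a plain bytearray standing in for a chunk of shard data. It uses no datadings objects, and the buffer size is arbitrary.

```python
# Plain-Python illustration; no datadings objects are involved.
buffer = bytearray(1_000_000)      # stands in for a large chunk read from disk
view = memoryview(buffer)[10:20]   # zero-copy: 10 bytes that refer to buffer
copied = bytes(buffer[10:20])      # independent 10-byte copy

# The view keeps a reference to the object it was created from, so the
# full 1,000,000-byte buffer stays alive for as long as the view does.
keeps_whole_buffer = view.obj is buffer
```

If only a few samples are retained long-term, converting them with `bytes(view)` releases the reference to the larger buffer once the view itself is dropped.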
- slice(start, stop=None, yield_key=False, raw=False, copy=True)[source]
Returns a generator of samples selected by the given slice.
copy=False allows the reader to use zero-copy mechanisms. Data may be returned as memoryview objects rather than bytes. This can improve performance, but also drastically increase memory consumption, since one sample can keep the whole slice in memory.
- Parameters:
start – start index of slice
stop – stop index of slice
yield_key – if True, yield (key, sample)
raw – if True, returns sample as msgpacked message
copy – if False, allow the reader to return data as memoryview objects instead of bytes
- Returns:
Iterator of selected samples
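To show what a lazily evaluated slice over several shards can look like, here is a minimal generator sketch. Plain lists stand in for shards, and `iter_slice` is a hypothetical name. This is not the datadings implementation, and it takes an explicit `stop` rather than modelling the `stop=None` default.

```python
def iter_slice(shards, start, stop):
    """Lazily yield samples whose global index lies in [start, stop).

    Hypothetical helper for illustration; `shards` is any sequence of
    indexable, sized containers.
    """
    offset = 0  # global index of the first sample in the current shard
    for shard in shards:
        # Translate the global bounds into bounds local to this shard.
        lo = max(start - offset, 0)
        hi = min(stop - offset, len(shard))
        for i in range(lo, hi):  # empty when the shard is outside the slice
            yield shard[i]
        offset += len(shard)
        if offset >= stop:
            break  # later shards cannot contain selected samples
```

Because this is a generator, no sample is read until the caller iterates, and iteration stops at the first shard that reaches `stop`.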