datadings.reader.reader module
- class datadings.reader.reader.Reader
Bases: object
Abstract base class for dataset readers.
Readers should be used as context managers:
    with Reader(...) as reader:
        for sample in reader:
            [do dataset things]
Subclasses must implement the following methods:
__exit__
__len__
__contains__
find_key
find_index
get
slice
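As a rough illustration of the interface listed above, here is a minimal in-memory sketch. `ListReader` is a hypothetical stand-in and not part of datadings; it deliberately does not subclass `Reader`, so it runs without the library. It assumes, based only on the method names, that `find_key` maps an index to a key and `find_index` maps a key to an index.

```python
class ListReader:
    """Hypothetical in-memory stand-in; not part of datadings."""

    def __init__(self, samples):
        # samples: list of dicts, each with a unique "key" entry
        self._samples = list(samples)
        self._keys = {s["key"]: i for i, s in enumerate(self._samples)}
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # A file-based reader would release its file handles here.
        self.closed = True

    def __len__(self):
        return len(self._samples)

    def __contains__(self, key):
        return key in self._keys

    def find_index(self, key):
        # Assumed semantics: key -> index
        return self._keys[key]

    def find_key(self, index):
        # Assumed semantics: index -> key
        return self._samples[index]["key"]

    def get(self, index, yield_key=False):
        sample = self._samples[index]
        return (sample["key"], sample) if yield_key else sample

    def slice(self, start, stop=None, yield_key=False):
        # Slicing a range clamps start/stop to the dataset length.
        for index in range(len(self))[start:stop]:
            yield self.get(index, yield_key=yield_key)

    def __iter__(self):
        return self.slice(0)


with ListReader([{"key": "a", "x": 1}, {"key": "b", "x": 2}]) as reader:
    keys = [sample["key"] for sample in reader]
```

Leaving the `with` block calls `__exit__`, which is why the class documentation recommends using readers as context managers.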
- abstract get(index, yield_key=False, raw=False, copy=True)
Returns the sample at the given index.
copy=False allows the reader to use zero-copy mechanisms. Data may be returned as memoryview objects rather than bytes. This can improve performance, but also drastically increase memory consumption, since one sample can keep the whole slice in memory.
- Parameters
index – Index of the sample
yield_key – If True, returns (key, sample)
raw – If True, returns sample as msgpacked message
copy – if False, allow the reader to return data as memoryview objects instead of bytes
- Returns
Sample at the given index.
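The memory trade-off behind copy=False can be seen with the standard library alone. The sketch below does not use datadings; `chunk` is a hypothetical stand-in for a large buffer that a reader might have loaded from disk.

```python
# Stand-in for a 1 MiB buffer a reader loaded in one go (hypothetical).
chunk = bytes(range(256)) * 4096

# Zero-copy: the view is only 10 bytes long, but it references the
# whole buffer, so the buffer cannot be freed while the view is alive.
view = memoryview(chunk)[10:20]

# Copying produces an independent 10-byte object; once the view and
# chunk are released, only these 10 bytes remain in memory.
copied = bytes(view)
```

Holding on to `view` keeps all of `chunk` alive, which is the increased memory consumption the description warns about. Converting to `bytes` trades one copy for the ability to release the large buffer.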
- iter(start=None, stop=None, yield_key=False, raw=False, copy=True, chunk_size=16)
Iterate over the dataset.
start and stop behave like the parameters of the range function.
copy=False allows the reader to use zero-copy mechanisms. Data may be returned as memoryview objects rather than bytes. This can improve performance, but also drastically increase memory consumption, since one sample can keep the whole slice in memory.
- Parameters
start – start of range; if None, current index is used
stop – stop of range
yield_key – if True, yields (key, sample) pairs.
raw – if True, yields samples as msgpacked messages.
copy – if False, allow the reader to return data as memoryview objects instead of bytes
chunk_size – number of samples read at once; bigger values can increase throughput, but require more memory
- Returns
Iterator
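To make the interplay of start, stop, and chunk_size concrete, the following sketch computes which index ranges a chunked iteration might read. `chunked_ranges` and its `current` argument are hypothetical names for illustration only; clamping stop to the dataset length is an assumption, and the actual reader may split the work differently.

```python
def chunked_ranges(length, start=None, stop=None, chunk_size=16, current=0):
    """Yield index ranges of at most chunk_size samples (illustrative only)."""
    # start=None falls back to the reader's current position.
    begin = current if start is None else start
    # Assumption: stop=None means "until the end"; larger values are clamped.
    end = length if stop is None else min(stop, length)
    for lo in range(begin, end, chunk_size):
        yield range(lo, min(lo + chunk_size, end))


# 40 samples, reading indices 5..37 in chunks of 16.
chunks = list(chunked_ranges(40, start=5, stop=38))
```

Larger chunks mean fewer read operations for the same number of samples, at the cost of holding more samples in memory at once.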
- next()
Returns the next sample.
This can be slow for file-based readers if a lot of samples are to be read. Consider using iter instead:
    it = iter(reader)
    while 1:
        next(it)
        ...
Or simply loop over the reader:
    for sample in reader:
        ...
- rawiter(yield_key=False)
Create an iterator that yields samples as msgpacked messages.
Included for backwards compatibility; may be deprecated and removed in the future.
- Parameters
yield_key – If True, yields (key, sample) pairs.
- Returns
Iterator
- rawnext() → bytes
Return the next sample msgpacked as raw bytes.
This can be slow for file-based readers if a lot of samples are to be read. Consider using iter instead:
    it = iter(reader)
    while 1:
        next(it)
        ...
Or simply loop over the reader:
    for sample in reader:
        ...
Included for backwards compatibility; may be deprecated and removed in the future.
- abstract slice(start, stop=None, yield_key=False, raw=False, copy=True)
Returns a generator of samples selected by the given slice.
copy=False allows the reader to use zero-copy mechanisms. Data may be returned as memoryview objects rather than bytes. This can improve performance, but also drastically increase memory consumption, since one sample can keep the whole slice in memory.
- Parameters
start – start index of slice
stop – stop index of slice
yield_key – if True, yields (key, sample) pairs
raw – if True, yields samples as msgpacked messages
copy – if False, allow the reader to return data as memoryview objects instead of bytes
- Returns
Iterator of selected samples