Requesting multiple slices
When working with simulation data, it is often useful to read several non-contiguous chunks of a dataset efficiently (e.g. the particles in some region of a SWIFT snapshot). Requesting each chunk separately can be slow because every request requires its own round trip to the server.
This module provides a mechanism to fetch multiple slices with a single
HTTP request. The hdfstream.RemoteDataset.request_slices()
method takes a sequence of slice objects as input and returns a single
array with the slices concatenated along the first axis. Slice objects
can be created by indexing NumPy's built-in np.s_ object. For
example:
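import numpy as np

# dataset is assumed to be an existing hdfstream.RemoteDataset
slices = [
    np.s_[10:20, :],  # indexes 10 to 19 in the first dimension
    np.s_[50:60, :],  # indexes 50 to 59 in the first dimension
]
# Fetch both slices with one request
data = dataset.request_slices(slices)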
This returns the dataset elements with indexes 10 to 19 and 50 to 59 in the first dimension, and all elements in the second dimension, as one array. There are some restrictions on the slices:
Slice starting indexes in the first dimension must be in ascending order
Slice indexes in dimensions other than the first must not differ between slices
Slices must not overlap
Slices can only be concatenated along the first dimension
These restrictions are imposed for efficiency: slices may only be requested in the order in which they are stored on disk, and it must be possible to represent the combined slices as a single ndarray.
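The restrictions above can be illustrated without a server. The following is a minimal sketch that uses only NumPy: check_slices() and concatenate_local() are hypothetical helper names introduced for illustration and are not part of hdfstream, and the check assumes every slice has an explicit start and stop in the first dimension. It validates a list of slices against the ordering, overlap and trailing-dimension rules, then builds the equivalent result from an in-memory array.

```python
import numpy as np


def check_slices(slices):
    """Hypothetical client-side check of the restrictions listed above.

    Not part of hdfstream. Assumes each first-dimension slice has an
    explicit start and stop. Raises ValueError on the first violation.
    """
    previous_stop = None
    trailing = None
    for s in slices:
        # np.s_[a:b] gives a bare slice, np.s_[a:b, :] gives a tuple
        key = s if isinstance(s, tuple) else (s,)
        first, rest = key[0], key[1:]
        if not isinstance(first, slice) or first.start is None or first.stop is None:
            raise ValueError("first index must be a slice with explicit start and stop")
        # Ascending order without overlap: each slice starts at or after
        # the point where the previous one stopped
        if previous_stop is not None and first.start < previous_stop:
            raise ValueError("slices must be in ascending order and must not overlap")
        # Indexes in the other dimensions must be identical for every slice
        if trailing is not None and rest != trailing:
            raise ValueError("indexes beyond the first dimension must not differ")
        previous_stop = first.stop
        trailing = rest


def concatenate_local(array, slices):
    """Local stand-in for the result layout: slices joined along axis 0."""
    return np.concatenate([array[s] for s in slices], axis=0)


# Stand-in for a remote dataset with 100 rows and 2 columns
array = np.arange(200).reshape(100, 2)
slices = [np.s_[10:20, :], np.s_[50:60, :]]

check_slices(slices)
result = concatenate_local(array, slices)
print(result.shape)  # (20, 2)
```

Running such a check before a request is only a convenience for catching mistakes early. Whether and how the library itself reports invalid slices is not covered here.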