dask.array.api.normalize_chunks

dask.array.api.normalize_chunks(chunks, shape=None, limit=None, dtype=None, previous_chunks=None)

Normalize chunks to a tuple of tuples

This accepts chunks in any of several shorthand forms, together with the array's shape and dtype where needed, and returns the full tuple-of-tuples form. The result is suitable for passing to Array, rechunk, or any other operation that creates a Dask array.

Parameters:

chunks: tuple, int, dict, or string: The chunks to be normalized. See the examples below for details.
shape: Tuple[int]: The shape of the array.
limit: int (optional): The maximum block size to target, in bytes, when chunk sizes are chosen automatically.
dtype: np.dtype (optional): The data type of the array, used together with limit to size "auto" chunks.
previous_chunks: Tuple[Tuple[int]] (optional): Chunks from a previous array, used as a guide when rechunking auto dimensions. If not provided and some dimensions are auto-chunked, those dimensions will prefer square-like chunk shapes.

Examples

Fully explicit tuple-of-tuples

>>> import numpy as np
>>> from dask.array.core import normalize_chunks
>>> normalize_chunks(((2, 2, 1), (2, 2, 2)), shape=(5, 6))
((2, 2, 1), (2, 2, 2))
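
One way to read the tuple-of-tuples form: each inner tuple lists the block lengths along one axis, so a running sum gives the block boundaries. The sketch below is plain Python and does not call dask; `block_slices` is a hypothetical helper introduced only for illustration.

```python
from itertools import accumulate


def block_slices(chunks):
    """Turn normalized chunks into per-axis lists of slices.

    Hypothetical helper for illustration; not part of dask.
    """
    result = []
    for axis_chunks in chunks:
        # Running totals mark where each block ends along this axis.
        stops = list(accumulate(axis_chunks))
        starts = [0] + stops[:-1]
        result.append([slice(a, b) for a, b in zip(starts, stops)])
    return result


slices = block_slices(((2, 2, 1), (2, 2, 2)))
print(slices[0])  # blocks along axis 0 (length 5)
print(slices[1])  # blocks along axis 1 (length 6)
```

The last block along axis 0 covers a single element because 5 is not a multiple of 2.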

Specify uniform chunk sizes

>>> normalize_chunks((2, 2), shape=(5, 6))
((2, 2, 1), (2, 2, 2))

Cleans up missing outer tuple

>>> normalize_chunks((3, 2), (5,))
((3, 2),)

Cleans up lists to tuples

>>> normalize_chunks([[2, 2], [3, 3]])
((2, 2), (3, 3))

Expands integer inputs 10 -> (10, 10)

>>> normalize_chunks(10, shape=(30, 5))
((10, 10, 10), (5,))

Expands dict inputs

>>> normalize_chunks({0: 2, 1: 3}, shape=(6, 6))
((2, 2, 2), (3, 3))

The values -1 and None get mapped to full size

>>> normalize_chunks((5, -1), shape=(10, 10))
((5, 5), (10,))
>>> normalize_chunks((5, None), shape=(10, 10))
((5, 5), (10,))
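
The integer, uniform-tuple, and -1/None shorthands above all reduce to the same per-axis arithmetic. The following is a minimal sketch of that arithmetic in plain Python, not dask's implementation; `expand_axis` and `expand_chunks` are hypothetical names, and the sketch ignores dicts, strings, and zero-length axes.

```python
def expand_axis(size, length):
    """Split one axis of `length` elements into blocks of at most `size`.

    Hypothetical helper for illustration; not dask's implementation.
    -1 and None both mean "one block spanning the whole axis".
    """
    if size is None or size == -1:
        return (length,)
    full, rest = divmod(length, size)
    # `full` blocks of `size`, plus one shorter trailing block if needed.
    return (size,) * full + ((rest,) if rest else ())


def expand_chunks(chunks, shape):
    """Expand an int or a per-axis tuple of sizes against `shape`."""
    if isinstance(chunks, int):
        # A bare integer applies to every axis.
        chunks = (chunks,) * len(shape)
    return tuple(expand_axis(c, n) for c, n in zip(chunks, shape))


print(expand_chunks((2, 2), shape=(5, 6)))
print(expand_chunks(10, shape=(30, 5)))
print(expand_chunks((5, None), shape=(10, 10)))
```

For these inputs the sketch reproduces the results shown in the examples above; a requested size larger than the axis simply yields one block covering the axis.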

Use the value “auto” to have chunk sizes determined automatically along certain dimensions. The limit= and dtype= keywords control how large the chunks are made. The term “auto” can be used anywhere an integer can be used. See the array chunking documentation for more information.

>>> normalize_chunks(("auto",), shape=(20,), limit=5, dtype='uint8')
((5, 5, 5, 5),)
>>> normalize_chunks("auto", (2, 3), dtype=np.int32)
((2,), (3,))

You can also use byte sizes (see dask.utils.parse_bytes()) in place of “auto” to ask for a particular size

>>> normalize_chunks("1kiB", shape=(2000,), dtype='float32')
((256, 256, 256, 256, 256, 256, 256, 208),)
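
For a one-dimensional array, the limit=5 result and the "1kiB" result above follow from simple arithmetic: divide the byte budget by the element size to get the elements per block, then split the axis. The sketch below covers only this 1-D case and is not dask's general auto-chunking logic; `blocks_for_budget` is a hypothetical helper, and the byte counts are written out by hand rather than parsed.

```python
def blocks_for_budget(length, limit_bytes, itemsize):
    """Split a 1-D axis so each block stays within `limit_bytes`.

    Hypothetical helper for illustration; covers only the 1-D arithmetic,
    not dask's general auto-chunking.
    """
    # Largest whole number of elements that fits in the byte budget.
    per_block = max(1, limit_bytes // itemsize)
    full, rest = divmod(length, per_block)
    return (per_block,) * full + ((rest,) if rest else ())


# limit=5 bytes, uint8 (1 byte per element), 20 elements
print(blocks_for_budget(20, 5, 1))
# "1kiB" is 1024 bytes; float32 takes 4 bytes per element, 2000 elements
print(blocks_for_budget(2000, 1024, 4))
```

With float32, 1024 bytes hold 256 elements, so 2000 elements split into seven blocks of 256 and a final block of 208.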

Respects null dimensions

>>> normalize_chunks(())
()
>>> normalize_chunks((), ())
()
>>> normalize_chunks((1,), ())
()
>>> normalize_chunks((), shape=(0, 0))
((0,), (0,))

Handles NaNs

>>> normalize_chunks((1, (np.nan,)), (1, np.nan))
((1,), (nan,))
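
NaN here stands for a block length that is not known ahead of time. Code that consumes chunks with unknown lengths has to skip those axes when comparing block totals against the shape. The following is a sketch of such a check in plain Python; `sums_match_shape` is a hypothetical helper and is not claimed to mirror dask's internal validation.

```python
import math


def sums_match_shape(chunks, shape):
    """Check that block lengths add up to `shape`, skipping unknown sizes.

    Hypothetical helper for illustration; not part of dask.
    """
    for axis_chunks, length in zip(chunks, shape):
        total = sum(axis_chunks)
        # NaN marks an unknown length, so there is nothing to compare.
        if math.isnan(total) or math.isnan(length):
            continue
        if total != length:
            return False
    return True


nan = float("nan")
print(sums_match_shape(((1,), (nan,)), (1, nan)))
print(sums_match_shape(((2, 2), (3,)), (5, 3)))
```

A direct equality test would not work here, because NaN never compares equal to itself.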
