dask.array.core.normalize_chunks

dask.array.core.normalize_chunks(chunks, shape=None, limit=None, dtype=None, previous_chunks=None)

Normalize chunks to tuple of tuples

This accepts chunk specifications in a variety of forms, together with optional information about the array, and produces a fully explicit tuple-of-tuples result for chunks, suitable for passing to Array, rechunk, or any other operation that creates a Dask array.

Parameters
chunks: tuple, int, dict, or string

The chunks to be normalized. See the examples below for the accepted forms.

shape: Tuple[int] (optional)

The shape of the array

limit: int (optional)

The maximum block size to target, in bytes, when chunk sizes are chosen automatically

dtype: np.dtype

The data type of the array. Used together with limit to size chunks along “auto” dimensions.

previous_chunks: Tuple[Tuple[int]] (optional)

Chunks from a previous array, used as a guide when rechunking auto dimensions. If not provided, auto dimensions prefer square-like chunk shapes.
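As a rough illustration of previous_chunks, the sketch below normalizes two “auto” dimensions with and without a hint. The shape, limit, and previous chunking are made-up values, and the exact chunk sizes chosen may differ between Dask versions, so the code checks only that the chunks along each dimension still sum to that dimension's length.

```python
from dask.array.core import normalize_chunks

shape = (100, 100)
# Hypothetical earlier chunking: row-oriented blocks of 10 x 100 elements
previous = ((10,) * 10, (100,))

hinted = normalize_chunks(
    ("auto", "auto"),
    shape=shape,
    limit=8000,  # bytes, i.e. 1000 float64 elements per chunk
    dtype="float64",
    previous_chunks=previous,
)
unhinted = normalize_chunks(
    ("auto", "auto"), shape=shape, limit=8000, dtype="float64"
)

for result in (hinted, unhinted):
    # Whatever sizes are picked, the chunks must tile the full array
    assert tuple(sum(dim) for dim in result) == shape

print(hinted)
print(unhinted)
```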

Examples

Specify uniform chunk sizes

>>> from dask.array.core import normalize_chunks
>>> normalize_chunks((2, 2), shape=(5, 6))
((2, 2, 1), (2, 2, 2))
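The expansion of a uniform chunk size along one dimension can be sketched in plain Python. The helper name expand_uniform is hypothetical, and this illustrates the arithmetic only, not Dask's implementation: full-size chunks are repeated, and any remainder becomes a final, smaller chunk.

```python
def expand_uniform(chunk_size, dim_length):
    """Split dim_length into chunks of chunk_size plus a remainder chunk."""
    if dim_length == 0:
        # A zero-length dimension still gets a single empty chunk
        return (0,)
    full, remainder = divmod(dim_length, chunk_size)
    return (chunk_size,) * full + ((remainder,) if remainder else ())


# Reproduces the result above for chunks=(2, 2), shape=(5, 6)
result = tuple(expand_uniform(c, s) for c, s in zip((2, 2), (5, 6)))
print(result)  # ((2, 2, 1), (2, 2, 2))
```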

Also passes through fully explicit tuple-of-tuples

>>> normalize_chunks(((2, 2, 1), (2, 2, 2)), shape=(5, 6))
((2, 2, 1), (2, 2, 2))

Cleans up lists to tuples

>>> normalize_chunks([[2, 2], [3, 3]])
((2, 2), (3, 3))

Expands integer inputs 10 -> (10, 10)

>>> normalize_chunks(10, shape=(30, 5))
((10, 10, 10), (5,))

Expands dict inputs

>>> normalize_chunks({0: 2, 1: 3}, shape=(6, 6))
((2, 2, 2), (3, 3))

The values -1 and None get mapped to full size

>>> normalize_chunks((5, -1), shape=(10, 10))
((5, 5), (10,))
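The example above uses -1. Since None is mapped to the full size in the same way, the same call with None should give an identical result. A short check:

```python
from dask.array.core import normalize_chunks

with_minus_one = normalize_chunks((5, -1), shape=(10, 10))
with_none = normalize_chunks((5, None), shape=(10, 10))

# Both spellings request a single chunk spanning the second dimension
print(with_minus_one == with_none)
```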

Use the value “auto” to automatically determine chunk sizes along certain dimensions. This uses the limit= and dtype= keywords to determine how large to make the chunks. The term “auto” can be used anywhere an integer can be used. See array chunking documentation for more information.

>>> normalize_chunks(("auto",), shape=(20,), limit=5, dtype='uint8')
((5, 5, 5, 5),)
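Because limit is expressed in bytes, the same limit allows fewer elements per chunk for wider dtypes. The sketch below compares two dtypes under an arbitrary 16-byte limit. Exact chunk sizes can vary by Dask version, so only the relative ordering is checked.

```python
from dask.array.core import normalize_chunks

# Same byte limit, different element widths (1 byte vs 8 bytes)
narrow = normalize_chunks(("auto",), shape=(20,), limit=16, dtype="uint8")
wide = normalize_chunks(("auto",), shape=(20,), limit=16, dtype="float64")

# The wider dtype should not end up with larger chunks than the narrow one
print(max(narrow[0]), max(wide[0]))
```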

You can also use byte sizes (see dask.utils.parse_bytes()) in place of “auto” to ask for a particular size

>>> normalize_chunks("1kiB", shape=(2000,), dtype='float32')
((256, 256, 256, 256, 256, 256, 256, 208),)
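The chunk sizes in this result follow from simple arithmetic: 1 kiB is 1024 bytes and each float32 element takes 4 bytes, so a chunk holds at most 256 elements. A plain-Python check of that reasoning (an illustration, not Dask's implementation):

```python
limit_bytes = 1024  # "1kiB"
itemsize = 4        # bytes per float32 element
length = 2000       # shape=(2000,)

per_chunk = limit_bytes // itemsize          # 256 elements per chunk
full, remainder = divmod(length, per_chunk)  # 7 full chunks, 208 left over
chunks = (per_chunk,) * full + ((remainder,) if remainder else ())
print(chunks)  # (256, 256, 256, 256, 256, 256, 256, 208)
```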

Respects null dimensions

>>> normalize_chunks((), shape=(0, 0))
((0,), (0,))