dask.array.api.normalize_chunks
- dask.array.api.normalize_chunks(chunks, shape=None, limit=None, dtype=None, previous_chunks=None)
Normalize chunks to tuple of tuples
This takes in a variety of input types and information and produces a full tuple-of-tuples result for chunks, suitable to be passed to Array or rechunk or any other operation that creates a Dask array.
- Parameters
- chunks: tuple, int, dict, or string
The chunks to be normalized. See the examples below for details.
- shape: Tuple[int]
The shape of the array
- limit: int (optional)
The maximum block size to target in bytes, if freedom is given to choose
- dtype: np.dtype
- previous_chunks: Tuple[Tuple[int]] (optional)
Chunks from a previous array that we should use for inspiration when rechunking auto dimensions. If not provided but auto-chunking exists then auto-dimensions will prefer square-like chunk shapes.
Examples
Fully explicit tuple-of-tuples
>>> from dask.array.core import normalize_chunks
>>> normalize_chunks(((2, 2, 1), (2, 2, 2)), shape=(5, 6))
((2, 2, 1), (2, 2, 2))
Specify uniform chunk sizes
>>> normalize_chunks((2, 2), shape=(5, 6))
((2, 2, 1), (2, 2, 2))
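The expansion of a uniform chunk size can be pictured as dividing each axis into equal blocks plus a shorter trailing block. The following is a minimal pure-Python sketch of that idea, not dask's implementation; `expand_uniform` is a hypothetical helper introduced only for illustration, and it assumes a positive chunk size.

```python
def expand_uniform(size, dim):
    """Split an axis of length `dim` into blocks of `size`.

    Hypothetical helper: the last block is shorter when `size`
    does not divide `dim` evenly. Assumes size > 0.
    """
    full, remainder = divmod(dim, size)
    return (size,) * full + ((remainder,) if remainder else ())


shape = (5, 6)
chunks = tuple(expand_uniform(2, dim) for dim in shape)
print(chunks)  # ((2, 2, 1), (2, 2, 2))
```

The printed result matches the normalized form shown above for a chunk size of 2 on a (5, 6) array.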
Cleans up missing outer tuple
>>> normalize_chunks((3, 2), (5,))
((3, 2),)
Cleans up lists to tuples
>>> normalize_chunks([[2, 2], [3, 3]])
((2, 2), (3, 3))
Expands integer inputs 10 -> (10, 10)
>>> normalize_chunks(10, shape=(30, 5))
((10, 10, 10), (5,))
Expands dict inputs
>>> normalize_chunks({0: 2, 1: 3}, shape=(6, 6))
((2, 2, 2), (3, 3))
The values -1 and None get mapped to full size
>>> normalize_chunks((5, -1), shape=(10, 10))
((5, 5), (10,))
>>> normalize_chunks((5, None), shape=(10, 10))
((5, 5), (10,))
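The treatment of -1 and None can be sketched as a per-axis rule: either value stands for a single block spanning the whole axis, while any other integer is expanded as a uniform chunk size. This is an illustrative pure-Python sketch under that assumption; `resolve_axis` and `resolve` are hypothetical names and do not exist in dask.

```python
def resolve_axis(chunk, dim):
    """Return block sizes for one axis (hypothetical helper).

    -1 or None means one block covering the full axis; any other
    positive integer is treated as a uniform chunk size.
    """
    if chunk is None or chunk == -1:
        return (dim,)
    full, remainder = divmod(dim, chunk)
    return (chunk,) * full + ((remainder,) if remainder else ())


def resolve(chunks, shape):
    """Apply resolve_axis to each (chunk, axis length) pair."""
    return tuple(resolve_axis(c, d) for c, d in zip(chunks, shape))


print(resolve((5, -1), (10, 10)))    # ((5, 5), (10,))
print(resolve((5, None), (10, 10)))  # ((5, 5), (10,))
```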
Use the value “auto” to automatically determine chunk sizes along certain dimensions. This uses the limit= and dtype= keywords to determine how large to make the chunks. The term “auto” can be used anywhere an integer can be used. See array chunking documentation for more information.
>>> import numpy as np
>>> normalize_chunks(("auto",), shape=(20,), limit=5, dtype='uint8')
((5, 5, 5, 5),)
>>> normalize_chunks("auto", (2, 3), dtype=np.int32)
((2,), (3,))
You can also use byte sizes (see dask.utils.parse_bytes()) in place of “auto” to ask for a particular size
>>> normalize_chunks("1kiB", shape=(2000,), dtype='float32')
((256, 256, 256, 256, 256, 256, 256, 208),)
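The numbers in the byte-size example follow from simple arithmetic: 1 kiB is 1024 bytes and a float32 element occupies 4 bytes, so each chunk holds 256 elements. The sketch below reproduces only this one-dimensional calculation in plain Python, assuming the whole byte budget is spent along a single axis; it does not call dask, and the variable names are illustrative.

```python
limit_bytes = 1024  # "1kiB" corresponds to 1024 bytes
itemsize = 4        # one float32 element occupies 4 bytes
length = 2000       # length of the single axis

# Number of elements that fit into one chunk of the requested size.
per_chunk = limit_bytes // itemsize

# Full chunks plus a shorter trailing chunk for the leftover elements.
full, remainder = divmod(length, per_chunk)
chunks = (per_chunk,) * full + ((remainder,) if remainder else ())

print(per_chunk)  # 256
print(chunks)     # (256, 256, 256, 256, 256, 256, 256, 208)
```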
Respects null dimensions
>>> normalize_chunks(())
()
>>> normalize_chunks((), ())
()
>>> normalize_chunks((1,), ())
()
>>> normalize_chunks((), shape=(0, 0))
((0,), (0,))
Handles NaNs
>>> normalize_chunks((1, (np.nan,)), (1, np.nan))
((1,), (nan,))