dask.bag.Bag.repartition

Bag.repartition(npartitions=None, partition_size=None)

Repartition Bag across new divisions.

Parameters
npartitions : int, optional

Number of partitions of output.

partition_size : int or string, optional

Maximum number of bytes of memory for each partition. Pass an integer number of bytes or a string such as '5MB'.

Warning

This keyword argument triggers computation to determine the memory size of each partition, which may be expensive.
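As a rough sketch of the size-based form, the following builds a small bag and repartitions it by memory size. The sample data, the variable names, and the synchronous scheduler (chosen only to keep the example in a single process) are assumptions for illustration. The resulting partition count depends on the measured size of the data, so none is shown.

```python
import dask
import dask.bag as db

# Hypothetical sample data: 1000 integers spread over 20 small partitions.
b = db.from_sequence(range(1000), npartitions=20)

# Run in-process; this scheduler choice is an assumption of the sketch.
with dask.config.set(scheduler="synchronous"):
    # partition_size makes repartition measure each partition's memory
    # usage right away, which is the computation the warning refers to.
    by_size = b.repartition(partition_size="5MB")
    contents = by_size.compute()
```

On large inputs that measurement pass reads every partition, so specifying npartitions may be cheaper when a target count is already known.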

Notes

Exactly one of npartitions or partition_size should be specified. A ValueError will be raised when that is not the case.
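A minimal sketch of that rule, assuming a small throwaway bag: the helper `raises_value_error` is hypothetical and exists only to record whether a call fails.

```python
import dask.bag as db

# Hypothetical sample bag used only to exercise argument validation.
b = db.from_sequence(range(10), npartitions=2)

def raises_value_error(**kwargs):
    """Return True if b.repartition(**kwargs) raises ValueError."""
    try:
        b.repartition(**kwargs)
    except ValueError:
        return True
    return False

# Neither argument given: rejected.
neither = raises_value_error()
# Both arguments given: rejected as well.
both = raises_value_error(npartitions=5, partition_size="5MB")
```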

Examples

>>> b.repartition(5)  # return a new Bag with 5 partitions
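The one-line example above assumes an existing bag `b`. A self-contained sketch might look like the following, where the sample data and variable names are assumptions, and the synchronous scheduler is used only to keep the computation in one process.

```python
import dask.bag as db

# Hypothetical sample data: 100 integers in 10 partitions.
b = db.from_sequence(range(100), npartitions=10)

# repartition returns a new Bag; b itself is left unchanged.
c = b.repartition(npartitions=5)

n_before, n_after = b.npartitions, c.npartitions

# The elements are the same; only their grouping into partitions differs.
same_items = sorted(c.compute(scheduler="synchronous")) == list(range(100))
```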