# API¶

Top level user functions:

 all(a[, axis, keepdims, split_every]) Test whether all array elements along a given axis evaluate to True. angle(*args, **kwargs) Return the angle of the complex argument. any(a[, axis, keepdims, split_every]) Test whether any array element along a given axis evaluates to True. arange(*args, **kwargs) Return evenly spaced values from start to stop with step size step. arccos(x[, out]) Trigonometric inverse cosine, element-wise. arccosh(x[, out]) Inverse hyperbolic cosine, element-wise. arcsin(x[, out]) Inverse sine, element-wise. arcsinh(x[, out]) Inverse hyperbolic sine element-wise. arctan(x[, out]) Trigonometric inverse tangent, element-wise. arctan2(x1, x2[, out]) Element-wise arc tangent of x1/x2 choosing the quadrant correctly. arctanh(x[, out]) Inverse hyperbolic tangent element-wise. argmax(x[, axis, split_every]) Returns the indices of the maximum values along an axis. argmin(x[, axis, split_every]) Returns the indices of the minimum values along an axis. around(x[, decimals]) Evenly round to the given number of decimals. array(object[, dtype, copy, order, subok, ndmin]) Create an array. bincount(x[, weights, minlength]) Count number of occurrences of each value in array of non-negative ints. broadcast_to(x, shape) Broadcast an array to a new shape. coarsen(reduction, x, axes[, trim_excess]) Coarsen array by applying reduction to fixed size neighborhoods ceil(x[, out]) Return the ceiling of the input, element-wise. choose(a, choices) Construct an array from an index array and a set of arrays to choose from. clip(*args, **kwargs) Clip (limit) the values in an array. compress(condition, a[, axis]) Return selected slices of an array along given axis. concatenate(seq[, axis]) Concatenate arrays along an existing axis conj(x[, out]) Return the complex conjugate, element-wise. copysign(x1, x2[, out]) Change the sign of x1 to that of x2, element-wise. corrcoef(x[, y, rowvar]) Return Pearson product-moment correlation coefficients. cos(x[, out]) Cosine element-wise. cosh(x[, out]) Hyperbolic cosine, element-wise. cov(m[, y, rowvar, bias, ddof]) Estimate a covariance matrix, given data and weights. cumprod(x, axis[, dtype]) Return the cumulative product of elements along a given axis. cumsum(x, axis[, dtype]) Return the cumulative sum of the elements along a given axis. deg2rad(x[, out]) Convert angles from degrees to radians. degrees(x[, out]) Convert angles from radians to degrees. diag(v) Extract a diagonal or construct a diagonal array. digitize(x, bins[, right]) Return the indices of the bins to which each value in input array belongs. dot(a, b[, out]) Dot product of two arrays. dstack(tup) Stack arrays in sequence depth wise (along third axis). empty Blocked variant of empty exp(x[, out]) Calculate the exponential of all elements in the input array. expm1(x[, out]) Calculate exp(x) - 1 for all elements in the array. eye(N, chunks[, M, k, dtype]) Return a 2-D Array with ones on the diagonal and zeros elsewhere. fabs(x[, out]) Compute the absolute values element-wise. fft.fft(a[, n, axis]) Wrapping of numpy.fft.fft fft.ifft(a[, n, axis]) Wrapping of numpy.fft.ifft fft.hfft(a[, n, axis]) Wrapping of numpy.fft.hfft fft.ihfft(a[, n, axis]) Wrapping of numpy.fft.ihfft fft.rfft(a[, n, axis]) Wrapping of numpy.fft.rfft fft.irfft(a[, n, axis]) Wrapping of numpy.fft.irfft fix(*args, **kwargs) Round to nearest integer towards zero. floor(x[, out]) Return the floor of the input, element-wise. fmax(x1, x2[, out]) Element-wise maximum of array elements. fmin(x1, x2[, out]) Element-wise minimum of array elements. fmod(x1, x2[, out]) Return the element-wise remainder of division. frexp(x[, out1, out2]) Decompose the elements of x into mantissa and twos exponent. fromfunction(func[, chunks, shape, dtype]) Construct an array by executing a function over each coordinate. full Blocked variant of full histogram(a[, bins, range, normed, weights, ...]) Blocked variant of numpy.histogram. hstack(tup) Stack arrays in sequence horizontally (column wise). hypot(x1, x2[, out]) Given the “legs” of a right triangle, return its hypotenuse. imag(*args, **kwargs) Return the imaginary part of the elements of the array. insert(arr, obj, values, axis) Insert values along the given axis before the given indices. isclose(arr1, arr2[, rtol, atol, equal_nan]) Returns a boolean array where two arrays are element-wise equal within a tolerance. iscomplex(*args, **kwargs) Returns a bool array, where True if input element is complex. isfinite(x[, out]) Test element-wise for finiteness (not infinity or not Not a Number). isinf(x[, out]) Test element-wise for positive or negative infinity. isnan(x[, out]) Test element-wise for NaN and return result as a boolean array. isnull(values) pandas.isnull for dask arrays isreal(*args, **kwargs) Returns a bool array, where True if input element is real. ldexp(x1, x2[, out]) Returns x1 * 2**x2, element-wise. linspace(start, stop[, num, chunks, dtype]) Return num evenly spaced values over the closed interval [start, stop]. log(x[, out]) Natural logarithm, element-wise. log10(x[, out]) Return the base 10 logarithm of the input array, element-wise. log1p(x[, out]) Return the natural logarithm of one plus the input array, element-wise. log2(x[, out]) Base-2 logarithm of x. logaddexp(x1, x2[, out]) Logarithm of the sum of exponentiations of the inputs. logaddexp2(x1, x2[, out]) Logarithm of the sum of exponentiations of the inputs in base-2. logical_and(x1, x2[, out]) Compute the truth value of x1 AND x2 element-wise. logical_not(x[, out]) Compute the truth value of NOT x element-wise. logical_or(x1, x2[, out]) Compute the truth value of x1 OR x2 element-wise. logical_xor(x1, x2[, out]) Compute the truth value of x1 XOR x2, element-wise. max(a[, axis, keepdims, split_every]) Return the maximum of an array or maximum along an axis. maximum(x1, x2[, out]) Element-wise maximum of array elements. mean(a[, axis, dtype, keepdims, split_every]) Compute the arithmetic mean along the specified axis. min(a[, axis, keepdims, split_every]) Return the minimum of an array or minimum along an axis. minimum(x1, x2[, out]) Element-wise minimum of array elements. modf(x[, out1, out2]) Return the fractional and integral parts of an array, element-wise. moment(a, order[, axis, dtype, keepdims, ...]) nanargmax(x[, axis, split_every]) nanargmin(x[, axis, split_every]) nancumprod(x, axis[, dtype]) Return the cumulative product of array elements over a given axis treating Not a Numbers (NaNs) as one. nancumsum(x, axis[, dtype]) Return the cumulative sum of array elements over a given axis treating Not a Numbers (NaNs) as zero. nanmax(a[, axis, keepdims, split_every]) Return the maximum of an array or maximum along an axis, ignoring any NaNs. nanmean(a[, axis, dtype, keepdims, split_every]) Compute the arithmetic mean along the specified axis, ignoring NaNs. nanmin(a[, axis, keepdims, split_every]) Return minimum of an array or minimum along an axis, ignoring any NaNs. nanprod(a[, axis, dtype, keepdims, split_every]) Return the product of array elements over a given axis treating Not a Numbers (NaNs) as zero. nanstd(a[, axis, dtype, keepdims, ddof, ...]) Compute the standard deviation along the specified axis, while ignoring NaNs. nansum(a[, axis, dtype, keepdims, split_every]) Return the sum of array elements over a given axis treating Not a Numbers (NaNs) as zero. nanvar(a[, axis, dtype, keepdims, ddof, ...]) Compute the variance along the specified axis, while ignoring NaNs. nextafter(x1, x2[, out]) Return the next floating-point value after x1 towards x2, element-wise. notnull(values) pandas.notnull for dask arrays ones Blocked variant of ones percentile(a, q[, interpolation]) Approximate percentile of 1-D array prod(a[, axis, dtype, keepdims, split_every]) Return the product of array elements over a given axis. rad2deg(x[, out]) Convert angles from radians to degrees. radians(x[, out]) Convert angles from degrees to radians. random ravel(array) Return a contiguous flattened array. real(*args, **kwargs) Return the real part of the elements of the array. rechunk(x, chunks[, threshold, block_size_limit]) Convert blocks in dask array x for new chunks. repeat(a, b[, out]) Dot product of two arrays. reshape(array, shape) Gives a new shape to an array without changing its data. rint(x[, out]) Round elements of the array to the nearest integer. round(a[, decimals]) Round an array to the given number of decimals. sign(x[, out]) Returns an element-wise indication of the sign of a number. signbit(x[, out]) Returns element-wise True where signbit is set (less than zero). sin(x[, out]) Trigonometric sine, element-wise. sinh(x[, out]) Hyperbolic sine, element-wise. sqrt(x[, out]) Return the positive square-root of an array, element-wise. square(x[, out]) Return the element-wise square of the input. squeeze(a[, axis]) Remove single-dimensional entries from the shape of an array. stack(seq[, axis]) Stack arrays along a new axis std(a[, axis, dtype, keepdims, ddof, ...]) Compute the standard deviation along the specified axis. sum(a[, axis, dtype, keepdims, split_every]) Sum of array elements over a given axis. take(a, indices[, axis]) Take elements from an array along an axis. tan(x[, out]) Compute tangent element-wise. tanh(x[, out]) Compute hyperbolic tangent element-wise. tensordot(lhs, rhs[, axes]) Compute tensor dot product along specified axes for arrays >= 1-D. topk(k, x) The top k elements of an array transpose(a[, axes]) Permute the dimensions of an array. tril(m[, k]) Lower triangle of an array with elements above the k-th diagonal zeroed. triu(m[, k]) Upper triangle of an array with elements above the k-th diagonal zeroed. trunc(x[, out]) Return the truncated value of the input, element-wise. unique(x) Find the unique elements of an array. var(a[, axis, dtype, keepdims, ddof, ...]) Compute the variance along the specified axis. vnorm(a[, ord, axis, dtype, keepdims, ...]) Vector norm vstack(tup) Stack arrays in sequence vertically (row wise). where(condition, [x, y]) Return elements, either from x or y, depending on condition. zeros Blocked variant of zeros

## Linear Algebra¶

 linalg.cholesky(a[, lower]) Returns the Cholesky decomposition, $$A = L L^*$$ or $$A = U^* U$$ of a Hermitian positive-definite matrix A. linalg.inv(a) Compute the inverse of a matrix with LU decomposition and forward / backward substitutions. linalg.lstsq(a, b) Return the least-squares solution to a linear matrix equation using QR decomposition. linalg.lu(a) Compute the lu decomposition of a matrix. linalg.qr(a[, name]) Compute the qr factorization of a matrix. linalg.solve(a, b[, sym_pos]) Solve the equation a x = b for x. linalg.solve_triangular(a, b[, lower]) Solve the equation a x = b for x, assuming a is a triangular matrix. linalg.svd(a[, name]) Compute the singular value decomposition of a matrix. linalg.svd_compressed(a, k[, n_power_iter, ...]) Randomly compressed rank-k thin Singular Value Decomposition. linalg.tsqr(data[, name, compute_svd]) Direct Tall-and-Skinny QR algorithm

## Random¶

 random.beta(a, b[, size]) Draw samples from a Beta distribution. random.binomial(n, p[, size]) Draw samples from a binomial distribution. random.chisquare(df[, size]) Draw samples from a chi-square distribution. random.different_seeds random.exponential([scale, size]) Draw samples from an exponential distribution. random.f(dfnum, dfden[, size]) Draw samples from an F distribution. random.gamma(shape[, scale, size]) Draw samples from a Gamma distribution. random.geometric(p[, size]) Draw samples from the geometric distribution. random.gumbel([loc, scale, size]) Draw samples from a Gumbel distribution. random.hypergeometric(ngood, nbad, nsample) Draw samples from a Hypergeometric distribution. random.laplace([loc, scale, size]) Draw samples from the Laplace or double exponential distribution with specified location (or mean) and scale (decay). random.logistic([loc, scale, size]) Draw samples from a logistic distribution. random.lognormal([mean, sigma, size]) Draw samples from a log-normal distribution. random.logseries(p[, size]) Draw samples from a logarithmic series distribution. random.negative_binomial(n, p[, size]) Draw samples from a negative binomial distribution. random.noncentral_chisquare(df, nonc[, size]) Draw samples from a noncentral chi-square distribution. random.noncentral_f(dfnum, dfden, nonc[, size]) Draw samples from the noncentral F distribution. random.normal([loc, scale, size]) Draw random samples from a normal (Gaussian) distribution. random.pareto(a[, size]) Draw samples from a Pareto II or Lomax distribution with specified shape. random.poisson([lam, size]) Draw samples from a Poisson distribution. random.power(a[, size]) Draws samples in [0, 1] from a power distribution with positive exponent a - 1. random.random([size]) Return random floats in the half-open interval [0.0, 1.0). random.random_sample([size]) Return random floats in the half-open interval [0.0, 1.0). random.rayleigh([scale, size]) Draw samples from a Rayleigh distribution. random.standard_cauchy([size]) Draw samples from a standard Cauchy distribution with mode = 0. random.standard_exponential([size]) Draw samples from the standard exponential distribution. random.standard_gamma(shape[, size]) Draw samples from a standard Gamma distribution. random.standard_normal([size]) Draw samples from a standard Normal distribution (mean=0, stdev=1). random.standard_t(df[, size]) Draw samples from a standard Student’s t distribution with df degrees of freedom. random.triangular(left, mode, right[, size]) Draw samples from the triangular distribution. random.uniform([low, high, size]) Draw samples from a uniform distribution. random.vonmises(mu, kappa[, size]) Draw samples from a von Mises distribution. random.wald(mean, scale[, size]) Draw samples from a Wald, or inverse Gaussian, distribution. random.weibull(a[, size]) Draw samples from a Weibull distribution. random.zipf(a[, size]) Standard distributions

## Slightly Overlapping Ghost Computations¶

 ghost.ghost(x, depth, boundary) Share boundaries between neighboring blocks ghost.map_overlap(x, func, depth[, ...])

## Create and Store Arrays¶

 from_array(x, chunks[, name, lock, fancy]) Create dask array from something that looks like an array from_delayed(value, shape, dtype[, name]) Create a dask array from a dask delayed value from_npy_stack(dirname[, mmap_mode]) Load dask array from stack of npy files store(sources, targets[, lock, regions, compute]) Store dask arrays in array-like objects, overwrite data in target to_hdf5(filename, *args, **kwargs) Store arrays in HDF5 file to_npy_stack(dirname, x[, axis]) Write dask array to a stack of .npy files

## Internal functions¶

 map_blocks(func, *args, **kwargs) Map a function across all blocks of a dask array. atop(func, out_ind, *args, **kwargs) Tensor operation: Generalized inner and outer products top(func, output, out_indices, ...) Tensor operation

## Other functions¶

dask.array.from_array(x, chunks, name=None, lock=False, fancy=True)

Create dask array from something that looks like an array

Input must have a .shape and support numpy-style slicing.

Parameters: x : array_like chunks : int, tuple How to chunk the array. Must be one of the following forms: - A blocksize like 1000. - A blockshape like (1000, 1000). - Explicit sizes of all blocks along all dimensions like ((1000, 1000, 500), (400, 400)). name : str, optional The key name to use for the array. Defaults to a hash of x. Use name=False to generate a random name instead of hashing (fast) lock : bool or Lock, optional If x doesn’t support concurrent reads then provide a lock here, or pass in True to have dask.array create one for you. fancy : bool, optional If x doesn’t support fancy indexing (e.g. indexing with lists or arrays) then set to False. Default is True.

Examples

>>> x = h5py.File('...')['/data/path']
>>> a = da.from_array(x, chunks=(1000, 1000))


If your underlying datastore does not support concurrent reads then include the lock=True keyword argument or lock=mylock if you want multiple arrays to coordinate around the same lock.

>>> a = da.from_array(x, chunks=(1000, 1000), lock=True)

dask.array.from_delayed(value, shape, dtype, name=None)

This routine is useful for constructing dask arrays in an ad-hoc fashion using dask delayed, particularly when combined with stack and concatenate.

The dask array will consist of a single chunk.

Examples

>>> from dask import delayed
>>> value = delayed(np.ones)(5)
>>> array = from_delayed(value, (5,), float)
>>> array
>>> array.compute()
array([ 1.,  1.,  1.,  1.,  1.])

dask.array.store(sources, targets, lock=True, regions=None, compute=True, **kwargs)

Store dask arrays in array-like objects, overwrite data in target

This stores dask arrays into object that supports numpy-style setitem indexing. It stores values chunk by chunk so that it does not have to fill up memory. For best performance you can align the block size of the storage target with the block size of your array.

If your data fits in memory then you may prefer calling np.array(myarray) instead.

Parameters: sources: Array or iterable of Arrays targets: array-like or iterable of array-likes These should support setitem syntax target[10:20] = ... lock: boolean or threading.Lock, optional Whether or not to lock the data stores while storing. Pass True (lock each file individually), False (don’t lock) or a particular threading.Lock object to be shared among all writes. regions: tuple of slices or iterable of tuple of slices Each region tuple in regions should be such that target[region].shape = source.shape for the corresponding source and target in sources and targets, respectively. compute: boolean, optional If true compute immediately, return dask.delayed.Delayed otherwise

Examples

>>> x = ...

>>> import h5py
>>> f = h5py.File('myfile.hdf5')
>>> dset = f.create_dataset('/data', shape=x.shape,
...                                  chunks=x.chunks,
...                                  dtype='f8')

>>> store(x, dset)


Alternatively store many arrays at the same time

>>> store([x, y, z], [dset1, dset2, dset3])

dask.array.topk(k, x)

The top k elements of an array

Returns the k greatest elements of the array in sorted order. Only works on arrays of a single dimension.

This assumes that k is small. All results will be returned in a single chunk.

Examples

>>> x = np.array([5, 1, 3, 6])
>>> d = from_array(x, chunks=2)
>>> d.topk(2).compute()
array([6, 5])

dask.array.coarsen(reduction, x, axes, trim_excess=False)

Coarsen array by applying reduction to fixed size neighborhoods

Parameters: reduction: function Function like np.sum, np.mean, etc... x: np.ndarray Array to be coarsened axes: dict Mapping of axis to coarsening factor

Examples

>>> x = np.array([1, 2, 3, 4, 5, 6])
>>> coarsen(np.sum, x, {0: 2})
array([ 3,  7, 11])
>>> coarsen(np.max, x, {0: 3})
array([3, 6])


Provide dictionary of scale per dimension

>>> x = np.arange(24).reshape((4, 6))
>>> x
array([[ 0,  1,  2,  3,  4,  5],
[ 6,  7,  8,  9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23]])

>>> coarsen(np.min, x, {0: 2, 1: 3})
array([[ 0,  3],
[12, 15]])


You must avoid excess elements explicitly

>>> x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
>>> coarsen(np.min, x, {0: 3}, trim_excess=True)
array([1, 4])

dask.array.stack(seq, axis=0)

Stack arrays along a new axis

Given a sequence of dask Arrays form a new dask Array by stacking them along a new dimension (axis=0 by default)

Examples

Create slices

>>> import dask.array as da
>>> import numpy as np

>>> data = [from_array(np.ones((4, 4)), chunks=(2, 2))
...          for i in range(3)]

>>> x = da.stack(data, axis=0)
>>> x.shape
(3, 4, 4)

>>> da.stack(data, axis=1).shape
(4, 3, 4)

>>> da.stack(data, axis=-1).shape
(4, 4, 3)


Result is a new dask Array

dask.array.concatenate(seq, axis=0)

Concatenate arrays along an existing axis

Given a sequence of dask Arrays form a new dask Array by stacking them along an existing dimension (axis=0 by default)

Examples

Create slices

>>> import dask.array as da
>>> import numpy as np

>>> data = [from_array(np.ones((4, 4)), chunks=(2, 2))
...          for i in range(3)]

>>> x = da.concatenate(data, axis=0)
>>> x.shape
(12, 4)

>>> da.concatenate(data, axis=1).shape
(4, 12)


Result is a new dask Array

dask.array.all(a, axis=None, keepdims=False, split_every=None)

Test whether all array elements along a given axis evaluate to True.

Parameters: a : array_like Input array or object that can be converted to an array. axis : None or int or tuple of ints, optional Axis or axes along which a logical AND reduction is performed. The default (axis = None) is to perform a logical AND over all the dimensions of the input array. axis may be negative, in which case it counts from the last to the first axis. New in version 1.7.0. If this is a tuple of ints, a reduction is performed on multiple axes, instead of a single axis or all the axes as before. out : ndarray, optional Alternate output array in which to place the result. It must have the same shape as the expected output and its type is preserved (e.g., if dtype(out) is float, the result will consist of 0.0’s and 1.0’s). See doc.ufuncs (Section “Output arguments”) for more details. keepdims : bool, optional If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr. all : ndarray, bool A new boolean or array is returned unless out is specified, in which case a reference to out is returned.

ndarray.all
equivalent method
any
Test whether any element along a given axis evaluates to True.

Notes

Not a Number (NaN), positive infinity and negative infinity evaluate to True because these are not equal to zero.

Examples

>>> np.all([[True,False],[True,True]])
False

>>> np.all([[True,False],[True,True]], axis=0)
array([ True, False], dtype=bool)

>>> np.all([-1, 4, 5])
True

>>> np.all([1.0, np.nan])
True

>>> o=np.array([False])
>>> z=np.all([-1, 4, 5], out=o)
>>> id(z), id(o), z
(28293632, 28293632, array([ True], dtype=bool))

dask.array.angle(*args, **kwargs)

Return the angle of the complex argument.

Parameters: z : array_like A complex number or sequence of complex numbers. deg : bool, optional Return angle in degrees if True, radians if False (default). angle : ndarray or scalar The counterclockwise angle from the positive real axis on the complex plane, with dtype as numpy.float64.

arctan2, absolute

Examples

>>> np.angle([1.0, 1.0j, 1+1j])               # in radians
array([ 0.        ,  1.57079633,  0.78539816])
>>> np.angle(1+1j, deg=True)                  # in degrees
45.0

dask.array.any(a, axis=None, keepdims=False, split_every=None)

Test whether any array element along a given axis evaluates to True.

Returns single boolean unless axis is not None

Parameters: a : array_like Input array or object that can be converted to an array. axis : None or int or tuple of ints, optional Axis or axes along which a logical OR reduction is performed. The default (axis = None) is to perform a logical OR over all the dimensions of the input array. axis may be negative, in which case it counts from the last to the first axis. New in version 1.7.0. If this is a tuple of ints, a reduction is performed on multiple axes, instead of a single axis or all the axes as before. out : ndarray, optional Alternate output array in which to place the result. It must have the same shape as the expected output and its type is preserved (e.g., if it is of type float, then it will remain so, returning 1.0 for True and 0.0 for False, regardless of the type of a). See doc.ufuncs (Section “Output arguments”) for details. keepdims : bool, optional If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr. any : bool or ndarray A new boolean or ndarray is returned unless out is specified, in which case a reference to out is returned.

ndarray.any
equivalent method
all
Test whether all elements along a given axis evaluate to True.

Notes

Not a Number (NaN), positive infinity and negative infinity evaluate to True because these are not equal to zero.

Examples

>>> np.any([[True, False], [True, True]])
True

>>> np.any([[True, False], [False, False]], axis=0)
array([ True, False], dtype=bool)

>>> np.any([-1, 0, 5])
True

>>> np.any(np.nan)
True

>>> o=np.array([False])
>>> z=np.any([-1, 4, 5], out=o)
>>> z, o
(array([ True], dtype=bool), array([ True], dtype=bool))
>>> # Check now that z is a reference to o
>>> z is o
True
>>> id(z), id(o) # identity of z and o
(191614240, 191614240)

dask.array.arange(*args, **kwargs)

Return evenly spaced values from start to stop with step size step.

The values are half-open [start, stop), so including start and excluding stop. This is basically the same as python’s range function but for dask arrays.

When using a non-integer step, such as 0.1, the results will often not be consistent. It is better to use linspace for these cases.

Parameters: start : int, optional The starting value of the sequence. The default is 0. stop : int The end of the interval, this value is excluded from the interval. step : int, optional The spacing between the values. The default is 1 when not specified. The last value of the sequence. chunks : int The number of samples on each block. Note that the last block will have fewer samples if len(array) % chunks != 0. samples : dask array
dask.array.arccos(x[, out])

Trigonometric inverse cosine, element-wise.

The inverse of cos so that, if y = cos(x), then x = arccos(y).

Parameters: x : array_like x-coordinate on the unit circle. For real arguments, the domain is [-1, 1]. out : ndarray, optional Array of the same shape as a, to store results in. See doc.ufuncs (Section “Output arguments”) for more details. angle : ndarray The angle of the ray intersecting the unit circle at the given x-coordinate in radians [0, pi]. If x is a scalar then a scalar is returned, otherwise an array of the same shape as x is returned.

cos, arctan, arcsin, emath.arccos

Notes

arccos is a multivalued function: for each x there are infinitely many numbers z such that cos(z) = x. The convention is to return the angle z whose real part lies in [0, pi].

For real-valued input data types, arccos always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, arccos is a complex analytic function that has branch cuts [-inf, -1] and [1, inf] and is continuous from above on the former and from below on the latter.

The inverse cos is also known as acos or cos^-1.

References

M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 79. http://www.math.sfu.ca/~cbm/aands/

Examples

We expect the arccos of 1 to be 0, and of -1 to be pi:

>>> np.arccos([1, -1])
array([ 0.        ,  3.14159265])


Plot arccos:

>>> import matplotlib.pyplot as plt
>>> x = np.linspace(-1, 1, num=100)
>>> plt.plot(x, np.arccos(x))
>>> plt.axis('tight')
>>> plt.show()

dask.array.arccosh(x[, out])

Inverse hyperbolic cosine, element-wise.

Parameters: x : array_like Input array. out : ndarray, optional Array of the same shape as x, to store results in. See doc.ufuncs (Section “Output arguments”) for details. arccosh : ndarray Array of the same shape as x.

Notes

arccosh is a multivalued function: for each x there are infinitely many numbers z such that cosh(z) = x. The convention is to return the z whose imaginary part lies in [-pi, pi] and the real part in [0, inf].

For real-valued input data types, arccosh always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, arccosh is a complex analytical function that has a branch cut [-inf, 1] and is continuous from above on it.

References

 [R85] M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 86. http://www.math.sfu.ca/~cbm/aands/
 [R86] Wikipedia, “Inverse hyperbolic function”, http://en.wikipedia.org/wiki/Arccosh

Examples

>>> np.arccosh([np.e, 10.0])
array([ 1.65745445,  2.99322285])
>>> np.arccosh(1)
0.0

dask.array.arcsin(x[, out])

Inverse sine, element-wise.

Parameters: x : array_like y-coordinate on the unit circle. out : ndarray, optional Array of the same shape as x, in which to store the results. See doc.ufuncs (Section “Output arguments”) for more details. angle : ndarray The inverse sine of each element in x, in radians and in the closed interval [-pi/2, pi/2]. If x is a scalar, a scalar is returned, otherwise an array.

Notes

arcsin is a multivalued function: for each x there are infinitely many numbers z such that $$sin(z) = x$$. The convention is to return the angle z whose real part lies in [-pi/2, pi/2].

For real-valued input data types, arcsin always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, arcsin is a complex analytic function that has, by convention, the branch cuts [-inf, -1] and [1, inf] and is continuous from above on the former and from below on the latter.

The inverse sine is also known as asin or sin^{-1}.

References

Abramowitz, M. and Stegun, I. A., Handbook of Mathematical Functions, 10th printing, New York: Dover, 1964, pp. 79ff. http://www.math.sfu.ca/~cbm/aands/

Examples

>>> np.arcsin(1)     # pi/2
1.5707963267948966
>>> np.arcsin(-1)    # -pi/2
-1.5707963267948966
>>> np.arcsin(0)
0.0

dask.array.arcsinh(x[, out])

Inverse hyperbolic sine element-wise.

Parameters: x : array_like Input array. out : ndarray, optional Array into which the output is placed. Its type is preserved and it must be of the right shape to hold the output. See doc.ufuncs. out : ndarray Array of of the same shape as x.

Notes

arcsinh is a multivalued function: for each x there are infinitely many numbers z such that sinh(z) = x. The convention is to return the z whose imaginary part lies in [-pi/2, pi/2].

For real-valued input data types, arcsinh always returns real output. For each value that cannot be expressed as a real number or infinity, it returns nan and sets the invalid floating point error flag.

For complex-valued input, arccos is a complex analytical function that has branch cuts [1j, infj] and [-1j, -infj] and is continuous from the right on the former and from the left on the latter.

The inverse hyperbolic sine is also known as asinh or sinh^-1.

References

 [R87] M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 86. http://www.math.sfu.ca/~cbm/aands/
 [R88] Wikipedia, “Inverse hyperbolic function”, http://en.wikipedia.org/wiki/Arcsinh

Examples

>>> np.arcsinh(np.array([np.e, 10.0]))
array([ 1.72538256,  2.99822295])

dask.array.arctan(x[, out])

Trigonometric inverse tangent, element-wise.

The inverse of tan, so that if y = tan(x) then x = arctan(y).

Parameters: x : array_like Input values. arctan is applied to each element of x. out : ndarray Out has the same shape as x. Its real part is in [-pi/2, pi/2] (arctan(+/-inf) returns +/-pi/2). It is a scalar if x is a scalar.

arctan2
The “four quadrant” arctan of the angle formed by (x, y) and the positive x-axis.
angle
Argument of complex values.

Notes

arctan is a multi-valued function: for each x there are infinitely many numbers z such that tan(z) = x. The convention is to return the angle z whose real part lies in [-pi/2, pi/2].

For real-valued input data types, arctan always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, arctan is a complex analytic function that has [1j, infj] and [-1j, -infj] as branch cuts, and is continuous from the left on the former and from the right on the latter.

The inverse tangent is also known as atan or tan^{-1}.

References

Abramowitz, M. and Stegun, I. A., Handbook of Mathematical Functions, 10th printing, New York: Dover, 1964, pp. 79. http://www.math.sfu.ca/~cbm/aands/

Examples

We expect the arctan of 0 to be 0, and of 1 to be pi/4:

>>> np.arctan([0, 1])
array([ 0.        ,  0.78539816])

>>> np.pi/4
0.78539816339744828


Plot arctan:

>>> import matplotlib.pyplot as plt
>>> x = np.linspace(-10, 10)
>>> plt.plot(x, np.arctan(x))
>>> plt.axis('tight')
>>> plt.show()

dask.array.arctan2(x1, x2[, out])

Element-wise arc tangent of x1/x2 choosing the quadrant correctly.

The quadrant (i.e., branch) is chosen so that arctan2(x1, x2) is the signed angle in radians between the ray ending at the origin and passing through the point (1,0), and the ray ending at the origin and passing through the point (x2, x1). (Note the role reversal: the “y-coordinate” is the first function parameter, the “x-coordinate” is the second.) By IEEE convention, this function is defined for x2 = +/-0 and for either or both of x1 and x2 = +/-inf (see Notes for specific values).

This function is not defined for complex-valued arguments; for the so-called argument of complex values, use angle.

Parameters: x1 : array_like, real-valued y-coordinates. x2 : array_like, real-valued x-coordinates. x2 must be broadcastable to match the shape of x1 or vice versa. angle : ndarray Array of angles in radians, in the range [-pi, pi].

Notes

arctan2 is identical to the atan2 function of the underlying C library. The following special values are defined in the C standard: [R89]

x1 x2 arctan2(x1,x2)
+/- 0 +0 +/- 0
+/- 0 -0 +/- pi
> 0 +/-inf +0 / +pi
< 0 +/-inf -0 / -pi
+/-inf +inf +/- (pi/4)
+/-inf -inf +/- (3*pi/4)

Note that +0 and -0 are distinct floating point numbers, as are +inf and -inf.

References

 [R89] (1, 2) ISO/IEC standard 9899:1999, “Programming language C.”

Examples

Consider four points in different quadrants:

>>> x = np.array([-1, +1, +1, -1])
>>> y = np.array([-1, -1, +1, +1])
>>> np.arctan2(y, x) * 180 / np.pi
array([-135.,  -45.,   45.,  135.])


Note the order of the parameters. arctan2 is defined also when x2 = 0 and at several other special points, obtaining values in the range [-pi, pi]:

>>> np.arctan2([1., -1.], [0., 0.])
array([ 1.57079633, -1.57079633])
>>> np.arctan2([0., 0., np.inf], [+0., -0., np.inf])
array([ 0.        ,  3.14159265,  0.78539816])

dask.array.arctanh(x[, out])

Inverse hyperbolic tangent element-wise.

Parameters: x : array_like Input array. out : ndarray Array of the same shape as x.

emath.arctanh

Notes

arctanh is a multivalued function: for each x there are infinitely many numbers z such that tanh(z) = x. The convention is to return the z whose imaginary part lies in [-pi/2, pi/2].

For real-valued input data types, arctanh always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, arctanh is a complex analytical function that has branch cuts [-1, -inf] and [1, inf] and is continuous from above on the former and from below on the latter.

The inverse hyperbolic tangent is also known as atanh or tanh^-1.

References

 [R90] M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 86. http://www.math.sfu.ca/~cbm/aands/
 [R91] Wikipedia, “Inverse hyperbolic function”, http://en.wikipedia.org/wiki/Arctanh

Examples

>>> np.arctanh([0, -0.5])
array([ 0.        , -0.54930614])

dask.array.argmax(x, axis=None, split_every=None)

Returns the indices of the maximum values along an axis.

Parameters: a : array_like Input array. axis : int, optional By default, the index is into the flattened array, otherwise along the specified axis. out : array, optional If provided, the result will be inserted into this array. It should be of the appropriate shape and dtype. index_array : ndarray of ints Array of indices into the array. It has the same shape as a.shape with the dimension along axis removed.

ndarray.argmax, argmin

amax
The maximum value along a given axis.
unravel_index
Convert a flat index into an index tuple.

Notes

In case of multiple occurrences of the maximum values, the indices corresponding to the first occurrence are returned.

Examples

>>> a = np.arange(6).reshape(2,3)
>>> a
array([[0, 1, 2],
[3, 4, 5]])
>>> np.argmax(a)
5
>>> np.argmax(a, axis=0)
array([1, 1, 1])
>>> np.argmax(a, axis=1)
array([2, 2])

>>> b = np.arange(6)
>>> b[1] = 5
>>> b
array([0, 5, 2, 3, 4, 5])
>>> np.argmax(b) # Only the first occurrence is returned.
1

dask.array.argmin(x, axis=None, split_every=None)

Returns the indices of the minimum values along an axis.

Parameters: a : array_like Input array. axis : int, optional By default, the index is into the flattened array, otherwise along the specified axis. out : array, optional If provided, the result will be inserted into this array. It should be of the appropriate shape and dtype. index_array : ndarray of ints Array of indices into the array. It has the same shape as a.shape with the dimension along axis removed.

ndarray.argmin, argmax

amin
The minimum value along a given axis.
unravel_index
Convert a flat index into an index tuple.

Notes

In case of multiple occurrences of the minimum values, the indices corresponding to the first occurrence are returned.

Examples

>>> a = np.arange(6).reshape(2,3)
>>> a
array([[0, 1, 2],
[3, 4, 5]])
>>> np.argmin(a)
0
>>> np.argmin(a, axis=0)
array([0, 0, 0])
>>> np.argmin(a, axis=1)
array([0, 0])

>>> b = np.arange(6)
>>> b[4] = 0
>>> b
array([0, 1, 2, 3, 0, 5])
>>> np.argmin(b) # Only the first occurrence is returned.
0

dask.array.around(x, decimals=0)

Evenly round to the given number of decimals.

Parameters: a : array_like Input data. decimals : int, optional Number of decimal places to round to (default: 0). If decimals is negative, it specifies the number of positions to the left of the decimal point. out : ndarray, optional Alternative output array in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary. See doc.ufuncs (Section “Output arguments”) for details. rounded_array : ndarray An array of the same type as a, containing the rounded values. Unless out was specified, a new array is created. A reference to the result is returned. The real and imaginary parts of complex numbers are rounded separately. The result of rounding a float is a float.

ndarray.round
equivalent method

Notes

For values exactly halfway between rounded decimal values, Numpy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc. Results may also be surprising due to the inexact representation of decimal fractions in the IEEE floating point standard [R92] and errors introduced when scaling by powers of ten.

References

 [R92] (1, 2) “Lecture Notes on the Status of IEEE 754”, William Kahan, http://www.cs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF
 [R93] “How Futile are Mindless Assessments of Roundoff in Floating-Point Computation?”, William Kahan, http://www.cs.berkeley.edu/~wkahan/Mindless.pdf

Examples

>>> np.around([0.37, 1.64])
array([ 0.,  2.])
>>> np.around([0.37, 1.64], decimals=1)
array([ 0.4,  1.6])
>>> np.around([.5, 1.5, 2.5, 3.5, 4.5]) # rounds to nearest even value
array([ 0.,  2.,  2.,  4.,  4.])
>>> np.around([1,2,3,11], decimals=1) # ndarray of ints is returned
array([ 1,  2,  3, 11])
>>> np.around([1,2,3,11], decimals=-1)
array([ 0,  0,  0, 10])

dask.array.array(object, dtype=None, copy=True, order=None, subok=False, ndmin=0)

Create an array.

Parameters: object : array_like An array, any object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence. dtype : data-type, optional The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence. This argument can only be used to ‘upcast’ the array. For downcasting, use the .astype(t) method. copy : bool, optional If true (default), then the object is copied. Otherwise, a copy will only be made if __array__ returns a copy, if obj is a nested sequence, or if a copy is needed to satisfy any of the other requirements (dtype, order, etc.). order : {‘C’, ‘F’, ‘A’}, optional Specify the order of the array. If order is ‘C’, then the array will be in C-contiguous order (last-index varies the fastest). If order is ‘F’, then the returned array will be in Fortran-contiguous order (first-index varies the fastest). If order is ‘A’ (default), then the returned array may be in any order (either C-, Fortran-contiguous, or even discontiguous), unless a copy is required, in which case it will be C-contiguous. subok : bool, optional If True, then sub-classes will be passed-through, otherwise the returned array will be forced to be a base-class array (default). ndmin : int, optional Specifies the minimum number of dimensions that the resulting array should have. Ones will be pre-pended to the shape as needed to meet this requirement. out : ndarray An array object satisfying the specified requirements.

empty, empty_like, zeros, zeros_like, ones, ones_like, fill

Examples

>>> np.array([1, 2, 3])
array([1, 2, 3])


Upcasting:

>>> np.array([1, 2, 3.0])
array([ 1.,  2.,  3.])


More than one dimension:

>>> np.array([[1, 2], [3, 4]])
array([[1, 2],
[3, 4]])


Minimum dimensions 2:

>>> np.array([1, 2, 3], ndmin=2)
array([[1, 2, 3]])


Type provided:

>>> np.array([1, 2, 3], dtype=complex)
array([ 1.+0.j,  2.+0.j,  3.+0.j])


Data-type consisting of more than one element:

>>> x = np.array([(1,2),(3,4)],dtype=[('a','<i4'),('b','<i4')])
>>> x['a']
array([1, 3])


Creating an array from sub-classes:

>>> np.array(np.mat('1 2; 3 4'))
array([[1, 2],
[3, 4]])

>>> np.array(np.mat('1 2; 3 4'), subok=True)
matrix([[1, 2],
[3, 4]])

dask.array.bincount(x, weights=None, minlength=None)

Count number of occurrences of each value in array of non-negative ints.

The number of bins (of size 1) is one larger than the largest value in x. If minlength is specified, there will be at least this number of bins in the output array (though it will be longer if necessary, depending on the contents of x). Each bin gives the number of occurrences of its index value in x. If weights is specified the input array is weighted by it, i.e. if a value n is found at position i, out[n] += weight[i] instead of out[n] += 1.

Parameters: x : array_like, 1 dimension, nonnegative ints Input array. weights : array_like, optional Weights, array of the same shape as x. minlength : int, optional A minimum number of bins for the output array. New in version 1.6.0. out : ndarray of ints The result of binning the input array. The length of out is equal to np.amax(x)+1. ValueError If the input is not 1-dimensional, or contains elements with negative values, or if minlength is non-positive. TypeError If the type of the input is float or complex.

Examples

>>> np.bincount(np.arange(5))
array([1, 1, 1, 1, 1])
>>> np.bincount(np.array([0, 1, 1, 3, 2, 1, 7]))
array([1, 3, 1, 1, 0, 0, 0, 1])

>>> x = np.array([0, 1, 1, 3, 2, 1, 7, 23])
>>> np.bincount(x).size == np.amax(x)+1
True


The input array needs to be of integer dtype, otherwise a TypeError is raised:

>>> np.bincount(np.arange(5, dtype=np.float))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: array cannot be safely cast to required type


A possible use of bincount is to perform sums over variable-size chunks of an array, using the weights keyword.

>>> w = np.array([0.3, 0.5, 0.2, 0.7, 1., -0.6]) # weights
>>> x = np.array([0, 1, 1, 2, 2, 2])
>>> np.bincount(x,  weights=w)
array([ 0.3,  0.7,  1.1])

dask.array.broadcast_to(x, shape)

Broadcast an array to a new shape.

Parameters: array : array_like The array to broadcast. shape : tuple The shape of the desired array. subok : bool, optional If True, then sub-classes will be passed-through, otherwise the returned array will be forced to be a base-class array (default). broadcast : array A readonly view on the original array with the given shape. It is typically not contiguous. Furthermore, more than one element of a broadcasted array may refer to a single memory location. ValueError If the array is not compatible with the new shape according to NumPy’s broadcasting rules.

Notes

New in version 1.10.0.

Examples

>>> x = np.array([1, 2, 3])
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])

dask.array.coarsen(reduction, x, axes, trim_excess=False)

Coarsen array by applying reduction to fixed size neighborhoods

Parameters: reduction: function Function like np.sum, np.mean, etc... x: np.ndarray Array to be coarsened axes: dict Mapping of axis to coarsening factor

Examples

>>> x = np.array([1, 2, 3, 4, 5, 6])
>>> coarsen(np.sum, x, {0: 2})
array([ 3,  7, 11])
>>> coarsen(np.max, x, {0: 3})
array([3, 6])


Provide dictionary of scale per dimension

>>> x = np.arange(24).reshape((4, 6))
>>> x
array([[ 0,  1,  2,  3,  4,  5],
[ 6,  7,  8,  9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23]])

>>> coarsen(np.min, x, {0: 2, 1: 3})
array([[ 0,  3],
[12, 15]])


You must avoid excess elements explicitly

>>> x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
>>> coarsen(np.min, x, {0: 3}, trim_excess=True)
array([1, 4])

dask.array.ceil(x[, out])

Return the ceiling of the input, element-wise.

The ceil of the scalar x is the smallest integer i, such that i >= x. It is often denoted as $$\lceil x \rceil$$.

Parameters: x : array_like Input data. y : ndarray or scalar The ceiling of each element in x, with float dtype.

Examples

>>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0])
>>> np.ceil(a)
array([-1., -1., -0.,  1.,  2.,  2.,  2.])

dask.array.choose(a, choices)

Construct an array from an index array and a set of arrays to choose from.

First of all, if confused or uncertain, definitely look at the Examples - in its full generality, this function is less simple than it might seem from the following code description (below ndi = numpy.lib.index_tricks):

np.choose(a,c) == np.array([c[a[I]][I] for I in ndi.ndindex(a.shape)]).

But this omits some subtleties. Here is a fully general summary:

Given an “index” array (a) of integers and a sequence of n arrays (choices), a and each choice array are first broadcast, as necessary, to arrays of a common shape; calling these Ba and Bchoices[i], i = 0,...,n-1 we have that, necessarily, Ba.shape == Bchoices[i].shape for each i. Then, a new array with shape Ba.shape is created as follows:

• if mode=raise (the default), then, first of all, each element of a (and thus Ba) must be in the range [0, n-1]; now, suppose that i (in that range) is the value at the (j0, j1, ..., jm) position in Ba - then the value at the same position in the new array is the value in Bchoices[i] at that same position;
• if mode=wrap, values in a (and thus Ba) may be any (signed) integer; modular arithmetic is used to map integers outside the range [0, n-1] back into that range; and then the new array is constructed as above;
• if mode=clip, values in a (and thus Ba) may be any (signed) integer; negative integers are mapped to 0; values greater than n-1 are mapped to n-1; and then the new array is constructed as above.
Parameters: a : int array This array must contain integers in [0, n-1], where n is the number of choices, unless mode=wrap or mode=clip, in which cases any integers are permissible. choices : sequence of arrays Choice arrays. a and all of the choices must be broadcastable to the same shape. If choices is itself an array (not recommended), then its outermost dimension (i.e., the one corresponding to choices.shape[0]) is taken as defining the “sequence”. out : array, optional If provided, the result will be inserted into this array. It should be of the appropriate shape and dtype. mode : {‘raise’ (default), ‘wrap’, ‘clip’}, optional Specifies how indices outside [0, n-1] will be treated: ‘raise’ : an exception is raised ‘wrap’ : value becomes value mod n ‘clip’ : values < 0 are mapped to 0, values > n-1 are mapped to n-1 merged_array : array The merged result. ValueError: shape mismatch If a and each choice array are not all broadcastable to the same shape.

ndarray.choose
equivalent method

Notes

To reduce the chance of misinterpretation, even though the following “abuse” is nominally supported, choices should neither be, nor be thought of as, a single array, i.e., the outermost sequence-like container should be either a list or a tuple.

Examples

>>> choices = [[0, 1, 2, 3], [10, 11, 12, 13],
...   [20, 21, 22, 23], [30, 31, 32, 33]]
>>> np.choose([2, 3, 1, 0], choices
... # the first element of the result will be the first element of the
... # third (2+1) "array" in choices, namely, 20; the second element
... # will be the second element of the fourth (3+1) choice array, i.e.,
... # 31, etc.
... )
array([20, 31, 12,  3])
>>> np.choose([2, 4, 1, 0], choices, mode='clip') # 4 goes to 3 (4-1)
array([20, 31, 12,  3])
>>> # because there are 4 choice arrays
>>> np.choose([2, 4, 1, 0], choices, mode='wrap') # 4 goes to (4 mod 4)
array([20,  1, 12,  3])
>>> # i.e., 0


A couple examples illustrating how choose broadcasts:

>>> a = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
>>> choices = [-10, 10]
>>> np.choose(a, choices)
array([[ 10, -10,  10],
[-10,  10, -10],
[ 10, -10,  10]])

>>> # With thanks to Anne Archibald
>>> a = np.array([0, 1]).reshape((2,1,1))
>>> c1 = np.array([1, 2, 3]).reshape((1,3,1))
>>> c2 = np.array([-1, -2, -3, -4, -5]).reshape((1,1,5))
>>> np.choose(a, (c1, c2)) # result is 2x3x5, res[0,:,:]=c1, res[1,:,:]=c2
array([[[ 1,  1,  1,  1,  1],
[ 2,  2,  2,  2,  2],
[ 3,  3,  3,  3,  3]],
[[-1, -2, -3, -4, -5],
[-1, -2, -3, -4, -5],
[-1, -2, -3, -4, -5]]])

dask.array.clip(*args, **kwargs)

Clip (limit) the values in an array.

Given an interval, values outside the interval are clipped to the interval edges. For example, if an interval of [0, 1] is specified, values smaller than 0 become 0, and values larger than 1 become 1.

Parameters: a : array_like Array containing elements to clip. a_min : scalar or array_like Minimum value. a_max : scalar or array_like Maximum value. If a_min or a_max are array_like, then they will be broadcasted to the shape of a. out : ndarray, optional The results will be placed in this array. It may be the input array for in-place clipping. out must be of the right shape to hold the output. Its type is preserved. clipped_array : ndarray An array with the elements of a, but where values < a_min are replaced with a_min, and those > a_max with a_max.

numpy.doc.ufuncs
Section “Output arguments”

Examples

>>> a = np.arange(10)
>>> np.clip(a, 1, 8)
array([1, 1, 2, 3, 4, 5, 6, 7, 8, 8])
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.clip(a, 3, 6, out=a)
array([3, 3, 3, 3, 4, 5, 6, 6, 6, 6])
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.clip(a, [3,4,1,1,1,4,4,4,4,4], 8)
array([3, 4, 2, 3, 4, 5, 6, 7, 8, 8])

dask.array.compress(condition, a, axis=None)

Return selected slices of an array along given axis.

When working along a given axis, a slice along that axis is returned in output for each index where condition evaluates to True. When working on a 1-D array, compress is equivalent to extract.

Parameters: condition : 1-D array of bools Array that selects which entries to return. If len(condition) is less than the size of a along the given axis, then output is truncated to the length of the condition array. a : array_like Array from which to extract a part. axis : int, optional Axis along which to take slices. If None (default), work on the flattened array. out : ndarray, optional Output array. Its type is preserved and it must be of the right shape to hold the output. compressed_array : ndarray A copy of a without the slices along axis for which condition is false.

take, choose, diag, diagonal, select

ndarray.compress
Equivalent method in ndarray
np.extract
Equivalent method when working on 1-D arrays
numpy.doc.ufuncs
Section “Output arguments”

Examples

>>> a = np.array([[1, 2], [3, 4], [5, 6]])
>>> a
array([[1, 2],
[3, 4],
[5, 6]])
>>> np.compress([0, 1], a, axis=0)
array([[3, 4]])
>>> np.compress([False, True, True], a, axis=0)
array([[3, 4],
[5, 6]])
>>> np.compress([False, True], a, axis=1)
array([[2],
[4],
[6]])


Working on the flattened array does not return slices along an axis but selects elements.

>>> np.compress([False, True], a)
array([2])

dask.array.concatenate(seq, axis=0)

Concatenate arrays along an existing axis

Given a sequence of dask Arrays form a new dask Array by stacking them along an existing dimension (axis=0 by default)

Examples

Create slices

>>> import dask.array as da
>>> import numpy as np

>>> data = [from_array(np.ones((4, 4)), chunks=(2, 2))
...          for i in range(3)]

>>> x = da.concatenate(data, axis=0)
>>> x.shape
(12, 4)

>>> da.concatenate(data, axis=1).shape
(4, 12)


Result is a new dask Array

dask.array.conj(x[, out])

Return the complex conjugate, element-wise.

The complex conjugate of a complex number is obtained by changing the sign of its imaginary part.

Parameters: x : array_like Input value. y : ndarray The complex conjugate of x, with same dtype as y.

Examples

>>> np.conjugate(1+2j)
(1-2j)

>>> x = np.eye(2) + 1j * np.eye(2)
>>> np.conjugate(x)
array([[ 1.-1.j,  0.-0.j],
[ 0.-0.j,  1.-1.j]])

dask.array.copysign(x1, x2[, out])

Change the sign of x1 to that of x2, element-wise.

If both arguments are arrays or sequences, they have to be of the same length. If x2 is a scalar, its sign will be copied to all elements of x1.

Parameters: x1 : array_like Values to change the sign of. x2 : array_like The sign of x2 is copied to x1. out : ndarray, optional Array into which the output is placed. Its type is preserved and it must be of the right shape to hold the output. See doc.ufuncs. out : array_like The values of x1 with the sign of x2.

Examples

>>> np.copysign(1.3, -1)
-1.3
>>> 1/np.copysign(0, 1)
inf
>>> 1/np.copysign(0, -1)
-inf

>>> np.copysign([-1, 0, 1], -1.1)
array([-1., -0., -1.])
>>> np.copysign([-1, 0, 1], np.arange(3)-1)
array([-1.,  0.,  1.])

dask.array.corrcoef(x, y=None, rowvar=1)

Return Pearson product-moment correlation coefficients.

Please refer to the documentation for cov for more detail. The relationship between the correlation coefficient matrix, R, and the covariance matrix, C, is

$R_{ij} = \frac{ C_{ij} } { \sqrt{ C_{ii} * C_{jj} } }$

The values of R are between -1 and 1, inclusive.

Parameters: x : array_like A 1-D or 2-D array containing multiple variables and observations. Each row of x represents a variable, and each column a single observation of all those variables. Also see rowvar below. y : array_like, optional An additional set of variables and observations. y has the same shape as x. rowvar : int, optional If rowvar is non-zero (default), then each row represents a variable, with observations in the columns. Otherwise, the relationship is transposed: each column represents a variable, while the rows contain observations. bias : _NoValue, optional Has no effect, do not use. Deprecated since version 1.10.0. ddof : _NoValue, optional Has no effect, do not use. Deprecated since version 1.10.0. R : ndarray The correlation coefficient matrix of the variables.

cov
Covariance matrix

Notes

Due to floating point rounding the resulting array may not be Hermitian, the diagonal elements may not be 1, and the elements may not satisfy the inequality abs(a) <= 1. The real and imaginary parts are clipped to the interval [-1, 1] in an attempt to improve on that situation but is not much help in the complex case.

This function accepts but discards arguments bias and ddof. This is for backwards compatibility with previous versions of this function. These arguments had no effect on the return values of the function and can be safely ignored in this and previous versions of numpy.

dask.array.cos(x[, out])

Cosine element-wise.

Parameters: x : array_like Input array in radians. out : ndarray, optional Output array of same shape as x. y : ndarray The corresponding cosine values. ValueError: invalid return array shape if out is provided and out.shape != x.shape (See Examples)

Notes

If out is provided, the function writes the result into it, and returns a reference to out. (See Examples)

References

M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York, NY: Dover, 1972.

Examples

>>> np.cos(np.array([0, np.pi/2, np.pi]))
array([  1.00000000e+00,   6.12303177e-17,  -1.00000000e+00])
>>>
>>> # Example of providing the optional output parameter
>>> out2 = np.cos([0.1], out1)
>>> out2 is out1
True
>>>
>>> # Example of ValueError due to provision of shape mis-matched out
>>> np.cos(np.zeros((3,3)),np.zeros((2,2)))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid return array shape

dask.array.cosh(x[, out])

Hyperbolic cosine, element-wise.

Equivalent to 1/2 * (np.exp(x) + np.exp(-x)) and np.cos(1j*x).

Parameters: x : array_like Input array. out : ndarray Output array of same shape as x.

Examples

>>> np.cosh(0)
1.0


The hyperbolic cosine describes the shape of a hanging cable:

>>> import matplotlib.pyplot as plt
>>> x = np.linspace(-4, 4, 1000)
>>> plt.plot(x, np.cosh(x))
>>> plt.show()

dask.array.cov(m, y=None, rowvar=1, bias=0, ddof=None)

Estimate a covariance matrix, given data and weights.

Covariance indicates the level to which two variables vary together. If we examine N-dimensional samples, $$X = [x_1, x_2, ... x_N]^T$$, then the covariance matrix element $$C_{ij}$$ is the covariance of $$x_i$$ and $$x_j$$. The element $$C_{ii}$$ is the variance of $$x_i$$.

See the notes for an outline of the algorithm.

Parameters: m : array_like A 1-D or 2-D array containing multiple variables and observations. Each row of m represents a variable, and each column a single observation of all those variables. Also see rowvar below. y : array_like, optional An additional set of variables and observations. y has the same form as that of m. rowvar : bool, optional If rowvar is True (default), then each row represents a variable, with observations in the columns. Otherwise, the relationship is transposed: each column represents a variable, while the rows contain observations. bias : bool, optional Default normalization (False) is by (N - 1), where N is the number of observations given (unbiased estimate). If bias is True, then normalization is by N. These values can be overridden by using the keyword ddof in numpy versions >= 1.5. ddof : int, optional If not None the default value implied by bias is overridden. Note that ddof=1 will return the unbiased estimate, even if both fweights and aweights are specified, and ddof=0 will return the simple average. See the notes for the details. The default value is None. New in version 1.5. fweights : array_like, int, optional 1-D array of integer freguency weights; the number of times each observation vector should be repeated. New in version 1.10. aweights : array_like, optional 1-D array of observation vector weights. These relative weights are typically large for observations considered “important” and smaller for observations considered less “important”. If ddof=0 the array of weights can be used to assign probabilities to observation vectors. New in version 1.10. out : ndarray The covariance matrix of the variables.

corrcoef
Normalized covariance matrix

Notes

Assume that the observations are in the columns of the observation array m and let f = fweights and a = aweights for brevity. The steps to compute the weighted covariance are as follows:

>>> w = f * a
>>> v1 = np.sum(w)
>>> v2 = np.sum(w * a)
>>> m -= np.sum(m * w, axis=1, keepdims=True) / v1
>>> cov = np.dot(m * w, m.T) * v1 / (v1**2 - ddof * v2)


Note that when a == 1, the normalization factor v1 / (v1**2 - ddof * v2) goes over to 1 / (np.sum(f) - ddof) as it should.

Examples

Consider two variables, $$x_0$$ and $$x_1$$, which correlate perfectly, but in opposite directions:

>>> x = np.array([[0, 2], [1, 1], [2, 0]]).T
>>> x
array([[0, 1, 2],
[2, 1, 0]])


Note how $$x_0$$ increases while $$x_1$$ decreases. The covariance matrix shows this clearly:

>>> np.cov(x)
array([[ 1., -1.],
[-1.,  1.]])


Note that element $$C_{0,1}$$, which shows the correlation between $$x_0$$ and $$x_1$$, is negative.

Further, note how x and y are combined:

>>> x = [-2.1, -1,  4.3]
>>> y = [3,  1.1,  0.12]
>>> X = np.vstack((x,y))
>>> print(np.cov(X))
[[ 11.71        -4.286     ]
[ -4.286        2.14413333]]
>>> print(np.cov(x, y))
[[ 11.71        -4.286     ]
[ -4.286        2.14413333]]
>>> print(np.cov(x))
11.71

dask.array.cumprod(x, axis, dtype=None)

Return the cumulative product of elements along a given axis.

Parameters: a : array_like Input array. axis : int, optional Axis along which the cumulative product is computed. By default the input is flattened. dtype : dtype, optional Type of the returned array, as well as of the accumulator in which the elements are multiplied. If dtype is not specified, it defaults to the dtype of a, unless a has an integer dtype with a precision less than that of the default platform integer. In that case, the default platform integer is used instead. out : ndarray, optional Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output but the type of the resulting values will be cast if necessary. cumprod : ndarray A new array holding the result is returned unless out is specified, in which case a reference to out is returned.

numpy.doc.ufuncs
Section “Output arguments”

Notes

Arithmetic is modular when using integer types, and no error is raised on overflow.

Examples

>>> a = np.array([1,2,3])
>>> np.cumprod(a) # intermediate results 1, 1*2
...               # total product 1*2*3 = 6
array([1, 2, 6])
>>> a = np.array([[1, 2, 3], [4, 5, 6]])
>>> np.cumprod(a, dtype=float) # specify type of output
array([   1.,    2.,    6.,   24.,  120.,  720.])


The cumulative product for each column (i.e., over the rows) of a:

>>> np.cumprod(a, axis=0)
array([[ 1,  2,  3],
[ 4, 10, 18]])


The cumulative product for each row (i.e. over the columns) of a:

>>> np.cumprod(a,axis=1)
array([[  1,   2,   6],
[  4,  20, 120]])

dask.array.cumsum(x, axis, dtype=None)

Return the cumulative sum of the elements along a given axis.

Parameters: a : array_like Input array. axis : int, optional Axis along which the cumulative sum is computed. The default (None) is to compute the cumsum over the flattened array. dtype : dtype, optional Type of the returned array and of the accumulator in which the elements are summed. If dtype is not specified, it defaults to the dtype of a, unless a has an integer dtype with a precision less than that of the default platform integer. In that case, the default platform integer is used. out : ndarray, optional Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output but the type will be cast if necessary. See doc.ufuncs (Section “Output arguments”) for more details. cumsum_along_axis : ndarray. A new array holding the result is returned unless out is specified, in which case a reference to out is returned. The result has the same size as a, and the same shape as a if axis is not None or a is a 1-d array.

sum
Sum array elements.
trapz
Integration of array values using the composite trapezoidal rule.
diff
Calculate the n-th discrete difference along given axis.

Notes

Arithmetic is modular when using integer types, and no error is raised on overflow.

Examples

>>> a = np.array([[1,2,3], [4,5,6]])
>>> a
array([[1, 2, 3],
[4, 5, 6]])
>>> np.cumsum(a)
array([ 1,  3,  6, 10, 15, 21])
>>> np.cumsum(a, dtype=float)     # specifies type of output value(s)
array([  1.,   3.,   6.,  10.,  15.,  21.])

>>> np.cumsum(a,axis=0)      # sum over rows for each of the 3 columns
array([[1, 2, 3],
[5, 7, 9]])
>>> np.cumsum(a,axis=1)      # sum over columns for each of the 2 rows
array([[ 1,  3,  6],
[ 4,  9, 15]])

dask.array.deg2rad(x[, out])

Convert angles from degrees to radians.

Parameters: x : array_like Angles in degrees. y : ndarray The corresponding angle in radians.

rad2deg
Convert angles from radians to degrees.
unwrap
Remove large jumps in angle by wrapping.

Notes

New in version 1.3.0.

deg2rad(x) is x * pi / 180.

Examples

>>> np.deg2rad(180)
3.1415926535897931

dask.array.degrees(x[, out])

Convert angles from radians to degrees.

Parameters: x : array_like Input array in radians. out : ndarray, optional Output array of same shape as x. y : ndarray of floats The corresponding degree values; if out was supplied this is a reference to it.

rad2deg
equivalent function

Examples

Convert a radian array to degrees

>>> rad = np.arange(12.)*np.pi/6
array([   0.,   30.,   60.,   90.,  120.,  150.,  180.,  210.,  240.,
270.,  300.,  330.])

>>> out = np.zeros((rad.shape))
>>> np.all(r == out)
True

dask.array.diag(v)

Extract a diagonal or construct a diagonal array.

See the more detailed documentation for numpy.diagonal if you use this function to extract a diagonal and wish to write to the resulting array; whether it returns a copy or a view depends on what version of numpy you are using.

Parameters: v : array_like If v is a 2-D array, return a copy of its k-th diagonal. If v is a 1-D array, return a 2-D array with v on the k-th diagonal. k : int, optional Diagonal in question. The default is 0. Use k>0 for diagonals above the main diagonal, and k<0 for diagonals below the main diagonal. out : ndarray The extracted diagonal or constructed diagonal array.

diagonal
Return specified diagonals.
diagflat
Create a 2-D array with the flattened input as a diagonal.
trace
Sum along diagonals.
triu
Upper triangle of an array.
tril
Lower triangle of an array.

Examples

>>> x = np.arange(9).reshape((3,3))
>>> x
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])

>>> np.diag(x)
array([0, 4, 8])
>>> np.diag(x, k=1)
array([1, 5])
>>> np.diag(x, k=-1)
array([3, 7])

>>> np.diag(np.diag(x))
array([[0, 0, 0],
[0, 4, 0],
[0, 0, 8]])

dask.array.digitize(x, bins, right=False)

Return the indices of the bins to which each value in input array belongs.

Each index i returned is such that bins[i-1] <= x < bins[i] if bins is monotonically increasing, or bins[i-1] > x >= bins[i] if bins is monotonically decreasing. If values in x are beyond the bounds of bins, 0 or len(bins) is returned as appropriate. If right is True, then the right bin is closed so that the index i is such that bins[i-1] < x <= bins[i] or bins[i-1] >= x > bins[i] if bins is monotonically increasing or decreasing, respectively.

Parameters: x : array_like Input array to be binned. Prior to Numpy 1.10.0, this array had to be 1-dimensional, but can now have any shape. bins : array_like Array of bins. It has to be 1-dimensional and monotonic. right : bool, optional Indicating whether the intervals include the right or the left bin edge. Default behavior is (right==False) indicating that the interval does not include the right edge. The left bin end is open in this case, i.e., bins[i-1] <= x < bins[i] is the default behavior for monotonically increasing bins. out : ndarray of ints Output array of indices, of same shape as x. ValueError If bins is not monotonic. TypeError If the type of the input is complex.

Notes

If values in x are such that they fall outside the bin range, attempting to index bins with the indices that digitize returns will result in an IndexError.

New in version 1.10.0.

np.digitize is implemented in terms of np.searchsorted. This means that a binary search is used to bin the values, which scales much better for larger number of bins than the previous linear search. It also removes the requirement for the input array to be 1-dimensional.

Examples

>>> x = np.array([0.2, 6.4, 3.0, 1.6])
>>> bins = np.array([0.0, 1.0, 2.5, 4.0, 10.0])
>>> inds = np.digitize(x, bins)
>>> inds
array([1, 4, 3, 2])
>>> for n in range(x.size):
...   print(bins[inds[n]-1], "<=", x[n], "<", bins[inds[n]])
...
0.0 <= 0.2 < 1.0
4.0 <= 6.4 < 10.0
2.5 <= 3.0 < 4.0
1.0 <= 1.6 < 2.5

>>> x = np.array([1.2, 10.0, 12.4, 15.5, 20.])
>>> bins = np.array([0, 5, 10, 15, 20])
>>> np.digitize(x,bins,right=True)
array([1, 2, 3, 4, 4])
>>> np.digitize(x,bins,right=False)
array([1, 3, 3, 4, 5])

dask.array.dot(a, b, out=None)

Dot product of two arrays.

For 2-D arrays it is equivalent to matrix multiplication, and for 1-D arrays to inner product of vectors (without complex conjugation). For N dimensions it is a sum product over the last axis of a and the second-to-last of b:

dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])

Parameters: a : array_like First argument. b : array_like Second argument. out : ndarray, optional Output argument. This must have the exact kind that would be returned if it was not used. In particular, it must have the right type, must be C-contiguous, and its dtype must be the dtype that would be returned for dot(a,b). This is a performance feature. Therefore, if these conditions are not met, an exception is raised, instead of attempting to be flexible. output : ndarray Returns the dot product of a and b. If a and b are both scalars or both 1-D arrays then a scalar is returned; otherwise an array is returned. If out is given, then it is returned. ValueError If the last dimension of a is not the same size as the second-to-last dimension of b.

vdot
Complex-conjugating dot product.
tensordot
Sum products over arbitrary axes.
einsum
Einstein summation convention.
matmul
‘@’ operator as method with out parameter.

Examples

>>> np.dot(3, 4)
12


Neither argument is complex-conjugated:

>>> np.dot([2j, 3j], [2j, 3j])
(-13+0j)


For 2-D arrays it is the matrix product:

>>> a = [[1, 0], [0, 1]]
>>> b = [[4, 1], [2, 2]]
>>> np.dot(a, b)
array([[4, 1],
[2, 2]])

>>> a = np.arange(3*4*5*6).reshape((3,4,5,6))
>>> b = np.arange(3*4*5*6)[::-1].reshape((5,4,6,3))
>>> np.dot(a, b)[2,3,2,1,2,2]
499128
>>> sum(a[2,3,2,:] * b[1,2,:,2])
499128

dask.array.dstack(tup)

Stack arrays in sequence depth wise (along third axis).

Takes a sequence of arrays and stack them along the third axis to make a single array. Rebuilds arrays divided by dsplit. This is a simple way to stack 2D arrays (images) into a single 3D array for processing.

Parameters: tup : sequence of arrays Arrays to stack. All of them must have the same shape along all but the third axis. stacked : ndarray The array formed by stacking the given arrays.

stack
Join a sequence of arrays along a new axis.
vstack
Stack along first axis.
hstack
Stack along second axis.
concatenate
Join a sequence of arrays along an existing axis.
dsplit
Split array along third axis.

Notes

Equivalent to np.concatenate(tup, axis=2).

Examples

>>> a = np.array((1,2,3))
>>> b = np.array((2,3,4))
>>> np.dstack((a,b))
array([[[1, 2],
[2, 3],
[3, 4]]])

>>> a = np.array([[1],[2],[3]])
>>> b = np.array([[2],[3],[4]])
>>> np.dstack((a,b))
array([[[1, 2]],
[[2, 3]],
[[3, 4]]])

dask.array.empty()

Blocked variant of empty

Follows the signature of empty exactly except that it also requires a keyword argument chunks=(...)

Original signature follows below. empty(shape, dtype=float, order=’C’)

Return a new array of given shape and type, without initializing entries.

Parameters: shape : int or tuple of int Shape of the empty array dtype : data-type, optional Desired output data-type. order : {‘C’, ‘F’}, optional Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory. out : ndarray Array of uninitialized (arbitrary) data of the given shape, dtype, and order. Object arrays will be initialized to None.

empty_like, zeros, ones

Notes

empty, unlike zeros, does not set the array values to zero, and may therefore be marginally faster. On the other hand, it requires the user to manually set all the values in the array, and should be used with caution.

Examples

>>> np.empty([2, 2])
array([[ -9.74499359e+001,   6.69583040e-309],
[  2.13182611e-314,   3.06959433e-309]])         #random

>>> np.empty([2, 2], dtype=int)
array([[-1073741821, -1067949133],
[  496041986,    19249760]])                     #random

dask.array.exp(x[, out])

Calculate the exponential of all elements in the input array.

Parameters: x : array_like Input values. out : ndarray Output array, element-wise exponential of x.

expm1
Calculate exp(x) - 1 for all elements in the array.
exp2
Calculate 2**x for all elements in the array.

Notes

The irrational number e is also known as Euler’s number. It is approximately 2.718281, and is the base of the natural logarithm, ln (this means that, if $$x = \ln y = \log_e y$$, then $$e^x = y$$. For real input, exp(x) is always positive.

For complex arguments, x = a + ib, we can write $$e^x = e^a e^{ib}$$. The first term, $$e^a$$, is already known (it is the real argument, described above). The second term, $$e^{ib}$$, is $$\cos b + i \sin b$$, a function with magnitude 1 and a periodic phase.

References

 [R94] Wikipedia, “Exponential function”, http://en.wikipedia.org/wiki/Exponential_function
 [R95] M. Abramovitz and I. A. Stegun, “Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables,” Dover, 1964, p. 69, http://www.math.sfu.ca/~cbm/aands/page_69.htm

Examples

Plot the magnitude and phase of exp(x) in the complex plane:

>>> import matplotlib.pyplot as plt

>>> x = np.linspace(-2*np.pi, 2*np.pi, 100)
>>> xx = x + 1j * x[:, np.newaxis] # a + ib over complex plane
>>> out = np.exp(xx)

>>> plt.subplot(121)
>>> plt.imshow(np.abs(out),
...            extent=[-2*np.pi, 2*np.pi, -2*np.pi, 2*np.pi])
>>> plt.title('Magnitude of exp(x)')

>>> plt.subplot(122)
>>> plt.imshow(np.angle(out),
...            extent=[-2*np.pi, 2*np.pi, -2*np.pi, 2*np.pi])
>>> plt.title('Phase (angle) of exp(x)')
>>> plt.show()

dask.array.expm1(x[, out])

Calculate exp(x) - 1 for all elements in the array.

Parameters: x : array_like Input values. out : ndarray Element-wise exponential minus one: out = exp(x) - 1.

log1p
log(1 + x), the inverse of expm1.

Notes

This function provides greater precision than exp(x) - 1 for small values of x.

Examples

The true value of exp(1e-10) - 1 is 1.00000000005e-10 to about 32 significant digits. This example shows the superiority of expm1 in this case.

>>> np.expm1(1e-10)
1.00000000005e-10
>>> np.exp(1e-10) - 1
1.000000082740371e-10

dask.array.eye(N, chunks, M=None, k=0, dtype=<type 'float'>)

Return a 2-D Array with ones on the diagonal and zeros elsewhere.

Parameters: N : int Number of rows in the output. chunks: int chunk size of resulting blocks M : int, optional Number of columns in the output. If None, defaults to N. k : int, optional Index of the diagonal: 0 (the default) refers to the main diagonal, a positive value refers to an upper diagonal, and a negative value to a lower diagonal. dtype : data-type, optional Data-type of the returned array. I : Array of shape (N,M) An array where all elements are equal to zero, except for the k-th diagonal, whose values are equal to one.
dask.array.fabs(x[, out])

Compute the absolute values element-wise.

This function returns the absolute values (positive magnitude) of the data in x. Complex values are not handled, use absolute to find the absolute values of complex data.

Parameters: x : array_like The array of numbers for which the absolute values are required. If x is a scalar, the result y will also be a scalar. out : ndarray, optional Array into which the output is placed. Its type is preserved and it must be of the right shape to hold the output. See doc.ufuncs. y : ndarray or scalar The absolute values of x, the returned values are always floats.

absolute
Absolute values including complex types.

Examples

>>> np.fabs(-1)
1.0
>>> np.fabs([-1.2, 1.2])
array([ 1.2,  1.2])

dask.array.fix(*args, **kwargs)

Round to nearest integer towards zero.

Round an array of floats element-wise to nearest integer towards zero. The rounded values are returned as floats.

Parameters: x : array_like An array of floats to be rounded y : ndarray, optional Output array out : ndarray of floats The array of rounded numbers

around
Round to given number of decimals

Examples

>>> np.fix(3.14)
3.0
>>> np.fix(3)
3.0
>>> np.fix([2.1, 2.9, -2.1, -2.9])
array([ 2.,  2., -2., -2.])

dask.array.floor(x[, out])

Return the floor of the input, element-wise.

The floor of the scalar x is the largest integer i, such that i <= x. It is often denoted as $$\lfloor x \rfloor$$.

Parameters: x : array_like Input data. y : ndarray or scalar The floor of each element in x.

Notes

Some spreadsheet programs calculate the “floor-towards-zero”, in other words floor(-2.5) == -2. NumPy instead uses the definition of floor where floor(-2.5) == -3.

Examples

>>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0])
>>> np.floor(a)
array([-2., -2., -1.,  0.,  1.,  1.,  2.])

dask.array.fmax(x1, x2[, out])

Element-wise maximum of array elements.

Compare two arrays and returns a new array containing the element-wise maxima. If one of the elements being compared is a NaN, then the non-nan element is returned. If both elements are NaNs then the first is returned. The latter distinction is important for complex NaNs, which are defined as at least one of the real or imaginary parts being a NaN. The net effect is that NaNs are ignored when possible.

Parameters: x1, x2 : array_like The arrays holding the elements to be compared. They must have the same shape. y : ndarray or scalar The maximum of x1 and x2, element-wise. Returns scalar if both x1 and x2 are scalars.

fmin
Element-wise minimum of two arrays, ignores NaNs.
maximum
Element-wise maximum of two arrays, propagates NaNs.
amax
The maximum value of an array along a given axis, propagates NaNs.
nanmax
The maximum value of an array along a given axis, ignores NaNs.

minimum, amin, nanmin

Notes

New in version 1.3.0.

The fmax is equivalent to np.where(x1 >= x2, x1, x2) when neither x1 nor x2 are NaNs, but it is faster and does proper broadcasting.

Examples

>>> np.fmax([2, 3, 4], [1, 5, 2])
array([ 2.,  5.,  4.])

>>> np.fmax(np.eye(2), [0.5, 2])
array([[ 1. ,  2. ],
[ 0.5,  2. ]])

>>> np.fmax([np.nan, 0, np.nan],[0, np.nan, np.nan])
array([  0.,   0.,  NaN])

dask.array.fmin(x1, x2[, out])

Element-wise minimum of array elements.

Compare two arrays and returns a new array containing the element-wise minima. If one of the elements being compared is a NaN, then the non-nan element is returned. If both elements are NaNs then the first is returned. The latter distinction is important for complex NaNs, which are defined as at least one of the real or imaginary parts being a NaN. The net effect is that NaNs are ignored when possible.

Parameters: x1, x2 : array_like The arrays holding the elements to be compared. They must have the same shape. y : ndarray or scalar The minimum of x1 and x2, element-wise. Returns scalar if both x1 and x2 are scalars.

fmax
Element-wise maximum of two arrays, ignores NaNs.
minimum
Element-wise minimum of two arrays, propagates NaNs.
amin
The minimum value of an array along a given axis, propagates NaNs.
nanmin
The minimum value of an array along a given axis, ignores NaNs.

maximum, amax, nanmax

Notes

New in version 1.3.0.

The fmin is equivalent to np.where(x1 <= x2, x1, x2) when neither x1 nor x2 are NaNs, but it is faster and does proper broadcasting.

Examples

>>> np.fmin([2, 3, 4], [1, 5, 2])
array([2, 5, 4])

>>> np.fmin(np.eye(2), [0.5, 2])
array([[ 1. ,  2. ],
[ 0.5,  2. ]])

>>> np.fmin([np.nan, 0, np.nan],[0, np.nan, np.nan])
array([  0.,   0.,  NaN])

dask.array.fmod(x1, x2[, out])

Return the element-wise remainder of division.

This is the NumPy implementation of the C library function fmod, the remainder has the same sign as the dividend x1. It is equivalent to the Matlab(TM) rem function and should not be confused with the Python modulus operator x1 % x2.

Parameters: x1 : array_like Dividend. x2 : array_like Divisor. y : array_like The remainder of the division of x1 by x2.

remainder
Equivalent to the Python % operator.

divide

Notes

The result of the modulo operation for negative dividend and divisors is bound by conventions. For fmod, the sign of result is the sign of the dividend, while for remainder the sign of the result is the sign of the divisor. The fmod function is equivalent to the Matlab(TM) rem function.

Examples

>>> np.fmod([-3, -2, -1, 1, 2, 3], 2)
array([-1,  0, -1,  1,  0,  1])
>>> np.remainder([-3, -2, -1, 1, 2, 3], 2)
array([1, 0, 1, 1, 0, 1])

>>> np.fmod([5, 3], [2, 2.])
array([ 1.,  1.])
>>> a = np.arange(-3, 3).reshape(3, 2)
>>> a
array([[-3, -2],
[-1,  0],
[ 1,  2]])
>>> np.fmod(a, [2,2])
array([[-1,  0],
[-1,  0],
[ 1,  0]])

dask.array.frexp(x[, out1, out2])

Decompose the elements of x into mantissa and twos exponent.

Returns (mantissa, exponent), where x = mantissa * 2**exponent. The mantissa is lies in the open interval(-1, 1), while the twos exponent is a signed integer.

Parameters: x : array_like Array of numbers to be decomposed. out1 : ndarray, optional Output array for the mantissa. Must have the same shape as x. out2 : ndarray, optional Output array for the exponent. Must have the same shape as x. (mantissa, exponent) : tuple of ndarrays, (float, int) mantissa is a float array with values between -1 and 1. exponent is an int array which represents the exponent of 2.

ldexp
Compute y = x1 * 2**x2, the inverse of frexp.

Notes

Complex dtypes are not supported, they will raise a TypeError.

Examples

>>> x = np.arange(9)
>>> y1, y2 = np.frexp(x)
>>> y1
array([ 0.   ,  0.5  ,  0.5  ,  0.75 ,  0.5  ,  0.625,  0.75 ,  0.875,
0.5  ])
>>> y2
array([0, 1, 2, 2, 3, 3, 3, 3, 4])
>>> y1 * 2**y2
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.])

dask.array.fromfunction(func, chunks=None, shape=None, dtype=None)

Construct an array by executing a function over each coordinate.

The resulting array therefore has a value fn(x, y, z) at coordinate (x, y, z).

Parameters: function : callable The function is called with N parameters, where N is the rank of shape. Each parameter represents the coordinates of the array varying along a specific axis. For example, if shape were (2, 2), then the parameters in turn be (0, 0), (0, 1), (1, 0), (1, 1). shape : (N,) tuple of ints Shape of the output array, which also determines the shape of the coordinate arrays passed to function. dtype : data-type, optional Data-type of the coordinate arrays passed to function. By default, dtype is float. fromfunction : any The result of the call to function is passed back directly. Therefore the shape of fromfunction is completely determined by function. If function returns a scalar value, the shape of fromfunction would match the shape parameter.

indices, meshgrid

Notes

Keywords other than dtype are passed to function.

Examples

>>> np.fromfunction(lambda i, j: i == j, (3, 3), dtype=int)
array([[ True, False, False],
[False,  True, False],
[False, False,  True]], dtype=bool)

>>> np.fromfunction(lambda i, j: i + j, (3, 3), dtype=int)
array([[0, 1, 2],
[1, 2, 3],
[2, 3, 4]])

dask.array.full(*args, **kwargs)

Blocked variant of full

Follows the signature of full exactly except that it also requires a keyword argument chunks=(...)

Original signature follows below.

Return a new array of given shape and type, filled with fill_value.

Parameters: shape : int or sequence of ints Shape of the new array, e.g., (2, 3) or 2. fill_value : scalar Fill value. dtype : data-type, optional The desired data-type for the array, e.g., np.int8. Default is float, but will change to np.array(fill_value).dtype in a future release. order : {‘C’, ‘F’}, optional Whether to store multidimensional data in C- or Fortran-contiguous (row- or column-wise) order in memory. out : ndarray Array of fill_value with the given shape, dtype, and order.

zeros_like
Return an array of zeros with shape and type of input.
ones_like
Return an array of ones with shape and type of input.
empty_like
Return an empty array with shape and type of input.
full_like
Fill an array with shape and type of input.
zeros
Return a new array setting values to zero.
ones
Return a new array setting values to one.
empty
Return a new uninitialized array.

Examples

>>> np.full((2, 2), np.inf)
array([[ inf,  inf],
[ inf,  inf]])
>>> np.full((2, 2), 10, dtype=np.int)
array([[10, 10],
[10, 10]])

dask.array.histogram(a, bins=None, range=None, normed=False, weights=None, density=None)

Blocked variant of numpy.histogram.

Follows the signature of numpy.histogram exactly with the following exceptions:

• Either an iterable specifying the bins or the number of bins and a range argument is required as computing min and max over blocked arrays is an expensive operation that must be performed explicitly.
• weights must be a dask.array.Array with the same block structure as a.

Examples

Using number of bins and range:

>>> import dask.array as da
>>> import numpy as np
>>> x = da.from_array(np.arange(10000), chunks=10)
>>> h, bins = da.histogram(x, bins=10, range=[0, 10000])
>>> bins
array([     0.,   1000.,   2000.,   3000.,   4000.,   5000.,   6000.,
7000.,   8000.,   9000.,  10000.])
>>> h.compute()
array([1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000])


Explicitly specifying the bins:

>>> h, bins = da.histogram(x, bins=np.array([0, 5000, 10000]))
>>> bins
array([    0,  5000, 10000])
>>> h.compute()
array([5000, 5000])

dask.array.hstack(tup)

Stack arrays in sequence horizontally (column wise).

Take a sequence of arrays and stack them horizontally to make a single array. Rebuild arrays divided by hsplit.

Parameters: tup : sequence of ndarrays All arrays must have the same shape along all but the second axis. stacked : ndarray The array formed by stacking the given arrays.

stack
Join a sequence of arrays along a new axis.
vstack
Stack arrays in sequence vertically (row wise).
dstack
Stack arrays in sequence depth wise (along third axis).
concatenate
Join a sequence of arrays along an existing axis.
hsplit
Split array along second axis.

Notes

Equivalent to np.concatenate(tup, axis=1)

Examples

>>> a = np.array((1,2,3))
>>> b = np.array((2,3,4))
>>> np.hstack((a,b))
array([1, 2, 3, 2, 3, 4])
>>> a = np.array([[1],[2],[3]])
>>> b = np.array([[2],[3],[4]])
>>> np.hstack((a,b))
array([[1, 2],
[2, 3],
[3, 4]])

dask.array.hypot(x1, x2[, out])

Given the “legs” of a right triangle, return its hypotenuse.

Equivalent to sqrt(x1**2 + x2**2), element-wise. If x1 or x2 is scalar_like (i.e., unambiguously cast-able to a scalar type), it is broadcast for use with each element of the other argument. (See Examples)

Parameters: x1, x2 : array_like Leg of the triangle(s). out : ndarray, optional Array into which the output is placed. Its type is preserved and it must be of the right shape to hold the output. See doc.ufuncs. z : ndarray The hypotenuse of the triangle(s).

Examples

>>> np.hypot(3*np.ones((3, 3)), 4*np.ones((3, 3)))
array([[ 5.,  5.,  5.],
[ 5.,  5.,  5.],
[ 5.,  5.,  5.]])


Example showing broadcast of scalar_like argument:

>>> np.hypot(3*np.ones((3, 3)), [4])
array([[ 5.,  5.,  5.],
[ 5.,  5.,  5.],
[ 5.,  5.,  5.]])

dask.array.imag(*args, **kwargs)

Return the imaginary part of the elements of the array.

Parameters: val : array_like Input array. out : ndarray Output array. If val is real, the type of val is used for the output. If val has complex elements, the returned type is float.

real, angle, real_if_close

Examples

>>> a = np.array([1+2j, 3+4j, 5+6j])
>>> a.imag
array([ 2.,  4.,  6.])
>>> a.imag = np.array([8, 10, 12])
>>> a
array([ 1. +8.j,  3.+10.j,  5.+12.j])

dask.array.insert(arr, obj, values, axis)

Insert values along the given axis before the given indices.

Parameters: arr : array_like Input array. obj : int, slice or sequence of ints Object that defines the index or indices before which values is inserted. New in version 1.8.0. Support for multiple insertions when obj is a single scalar or a sequence with one element (similar to calling insert multiple times). values : array_like Values to insert into arr. If the type of values is different from that of arr, values is converted to the type of arr. values should be shaped so that arr[...,obj,...] = values is legal. axis : int, optional Axis along which to insert values. If axis is None then arr is flattened first. out : ndarray A copy of arr with values inserted. Note that insert does not occur in-place: a new array is returned. If axis is None, out is a flattened array.

append
Append elements at the end of an array.
concatenate
Join a sequence of arrays along an existing axis.
delete
Delete elements from an array.

Notes

Note that for higher dimensional inserts obj=0 behaves very different from obj=[0] just like arr[:,0,:] = values is different from arr[:,[0],:] = values.

Examples

>>> a = np.array([[1, 1], [2, 2], [3, 3]])
>>> a
array([[1, 1],
[2, 2],
[3, 3]])
>>> np.insert(a, 1, 5)
array([1, 5, 1, 2, 2, 3, 3])
>>> np.insert(a, 1, 5, axis=1)
array([[1, 5, 1],
[2, 5, 2],
[3, 5, 3]])


Difference between sequence and scalars:

>>> np.insert(a, [1], [[1],[2],[3]], axis=1)
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
>>> np.array_equal(np.insert(a, 1, [1, 2, 3], axis=1),
...                np.insert(a, [1], [[1],[2],[3]], axis=1))
True

>>> b = a.flatten()
>>> b
array([1, 1, 2, 2, 3, 3])
>>> np.insert(b, [2, 2], [5, 6])
array([1, 1, 5, 6, 2, 2, 3, 3])

>>> np.insert(b, slice(2, 4), [5, 6])
array([1, 1, 5, 2, 6, 2, 3, 3])

>>> np.insert(b, [2, 2], [7.13, False]) # type casting
array([1, 1, 7, 0, 2, 2, 3, 3])

>>> x = np.arange(8).reshape(2, 4)
>>> idx = (1, 3)
>>> np.insert(x, idx, 999, axis=1)
array([[  0, 999,   1,   2, 999,   3],
[  4, 999,   5,   6, 999,   7]])

dask.array.isclose(arr1, arr2, rtol=1e-05, atol=1e-08, equal_nan=False)

Returns a boolean array where two arrays are element-wise equal within a tolerance.

The tolerance values are positive, typically very small numbers. The relative difference (rtol * abs(b)) and the absolute difference atol are added together to compare against the absolute difference between a and b.

Parameters: a, b : array_like Input arrays to compare. rtol : float The relative tolerance parameter (see Notes). atol : float The absolute tolerance parameter (see Notes). equal_nan : bool Whether to compare NaN’s as equal. If True, NaN’s in a will be considered equal to NaN’s in b in the output array. y : array_like Returns a boolean array of where a and b are equal within the given tolerance. If both a and b are scalars, returns a single boolean value.

allclose

Notes

New in version 1.7.0.

For finite values, isclose uses the following equation to test whether two floating point values are equivalent.

absolute(a - b) <= (atol + rtol * absolute(b))

The above equation is not symmetric in a and b, so that isclose(a, b) might be different from isclose(b, a) in some rare cases.

Examples

>>> np.isclose([1e10,1e-7], [1.00001e10,1e-8])
array([True, False])
>>> np.isclose([1e10,1e-8], [1.00001e10,1e-9])
array([True, True])
>>> np.isclose([1e10,1e-8], [1.0001e10,1e-9])
array([False, True])
>>> np.isclose([1.0, np.nan], [1.0, np.nan])
array([True, False])
>>> np.isclose([1.0, np.nan], [1.0, np.nan], equal_nan=True)
array([True, True])

dask.array.iscomplex(*args, **kwargs)

Returns a bool array, where True if input element is complex.

What is tested is whether the input has a non-zero imaginary part, not if the input type is complex.

Parameters: x : array_like Input array. out : ndarray of bools Output array.

isreal

iscomplexobj
Return True if x is a complex type or an array of complex numbers.

Examples

>>> np.iscomplex([1+1j, 1+0j, 4.5, 3, 2, 2j])
array([ True, False, False, False, False,  True], dtype=bool)

dask.array.isfinite(x[, out])

Test element-wise for finiteness (not infinity or not Not a Number).

The result is returned as a boolean array.

Parameters: x : array_like Input values. out : ndarray, optional Array into which the output is placed. Its type is preserved and it must be of the right shape to hold the output. See doc.ufuncs. y : ndarray, bool For scalar input, the result is a new boolean with value True if the input is finite; otherwise the value is False (input is either positive infinity, negative infinity or Not a Number). For array input, the result is a boolean array with the same dimensions as the input and the values are True if the corresponding element of the input is finite; otherwise the values are False (element is either positive infinity, negative infinity or Not a Number).

isinf, isneginf, isposinf, isnan

Notes

Not a Number, positive infinity and negative infinity are considered to be non-finite.

Numpy uses the IEEE Standard for Binary Floating-Point for Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity. Also that positive infinity is not equivalent to negative infinity. But infinity is equivalent to positive infinity. Errors result if the second argument is also supplied when x is a scalar input, or if first and second arguments have different shapes.

Examples

>>> np.isfinite(1)
True
>>> np.isfinite(0)
True
>>> np.isfinite(np.nan)
False
>>> np.isfinite(np.inf)
False
>>> np.isfinite(np.NINF)
False
>>> np.isfinite([np.log(-1.),1.,np.log(0)])
array([False,  True, False], dtype=bool)

>>> x = np.array([-np.inf, 0., np.inf])
>>> y = np.array([2, 2, 2])
>>> np.isfinite(x, y)
array([0, 1, 0])
>>> y
array([0, 1, 0])

dask.array.isinf(x[, out])

Test element-wise for positive or negative infinity.

Returns a boolean array of the same shape as x, True where x == +/-inf, otherwise False.

Parameters: x : array_like Input values out : array_like, optional An array with the same shape as x to store the result. y : bool (scalar) or boolean ndarray For scalar input, the result is a new boolean with value True if the input is positive or negative infinity; otherwise the value is False. For array input, the result is a boolean array with the same shape as the input and the values are True where the corresponding element of the input is positive or negative infinity; elsewhere the values are False. If a second argument was supplied the result is stored there. If the type of that array is a numeric type the result is represented as zeros and ones, if the type is boolean then as False and True, respectively. The return value y is then a reference to that array.

isneginf, isposinf, isnan, isfinite

Notes

Numpy uses the IEEE Standard for Binary Floating-Point for Arithmetic (IEEE 754).

Errors result if the second argument is supplied when the first argument is a scalar, or if the first and second arguments have different shapes.

Examples

>>> np.isinf(np.inf)
True
>>> np.isinf(np.nan)
False
>>> np.isinf(np.NINF)
True
>>> np.isinf([np.inf, -np.inf, 1.0, np.nan])
array([ True,  True, False, False], dtype=bool)

>>> x = np.array([-np.inf, 0., np.inf])
>>> y = np.array([2, 2, 2])
>>> np.isinf(x, y)
array([1, 0, 1])
>>> y
array([1, 0, 1])

dask.array.isnan(x[, out])

Test element-wise for NaN and return result as a boolean array.

Parameters: x : array_like Input array. y : ndarray or bool For scalar input, the result is a new boolean with value True if the input is NaN; otherwise the value is False. For array input, the result is a boolean array of the same dimensions as the input and the values are True if the corresponding element of the input is NaN; otherwise the values are False.

isinf, isneginf, isposinf, isfinite

Notes

Numpy uses the IEEE Standard for Binary Floating-Point for Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity.

Examples

>>> np.isnan(np.nan)
True
>>> np.isnan(np.inf)
False
>>> np.isnan([np.log(-1.),1.,np.log(0)])
array([ True, False, False], dtype=bool)

dask.array.isnull(values)

dask.array.isreal(*args, **kwargs)

Returns a bool array, where True if input element is real.

If element has complex type with zero complex part, the return value for that element is True.

Parameters: x : array_like Input array. out : ndarray, bool Boolean array of same shape as x.

iscomplex

isrealobj
Return True if x is not a complex type.

Examples

>>> np.isreal([1+1j, 1+0j, 4.5, 3, 2, 2j])
array([False,  True,  True,  True,  True, False], dtype=bool)

dask.array.ldexp(x1, x2[, out])

Returns x1 * 2**x2, element-wise.

The mantissas x1 and twos exponents x2 are used to construct floating point numbers x1 * 2**x2.

Parameters: x1 : array_like Array of multipliers. x2 : array_like, int Array of twos exponents. out : ndarray, optional Output array for the result. y : ndarray or scalar The result of x1 * 2**x2.

frexp
Return (y1, y2) from x = y1 * 2**y2, inverse to ldexp.

Notes

Complex dtypes are not supported, they will raise a TypeError.

ldexp is useful as the inverse of frexp, if used by itself it is more clear to simply use the expression x1 * 2**x2.

Examples

>>> np.ldexp(5, np.arange(4))
array([  5.,  10.,  20.,  40.], dtype=float32)

>>> x = np.arange(6)
>>> np.ldexp(*np.frexp(x))
array([ 0.,  1.,  2.,  3.,  4.,  5.])

dask.array.linspace(start, stop, num=50, chunks=None, dtype=None)

Return num evenly spaced values over the closed interval [start, stop].

TODO: implement the endpoint, restep, and dtype keyword args

Parameters: start : scalar The starting value of the sequence. stop : scalar The last value of the sequence. num : int, optional Number of samples to include in the returned dask array, including the endpoints. chunks : int The number of samples on each block. Note that the last block will have fewer samples if num % blocksize != 0 samples : dask array
dask.array.log(x[, out])

Natural logarithm, element-wise.

The natural logarithm log is the inverse of the exponential function, so that log(exp(x)) = x. The natural logarithm is logarithm in base e.

Parameters: x : array_like Input value. y : ndarray The natural logarithm of x, element-wise.

log10, log2, log1p, emath.log

Notes

Logarithm is a multivalued function: for each x there is an infinite number of z such that exp(z) = x. The convention is to return the z whose imaginary part lies in [-pi, pi].

For real-valued input data types, log always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, log is a complex analytical function that has a branch cut [-inf, 0] and is continuous from above on it. log handles the floating-point negative zero as an infinitesimal negative number, conforming to the C99 standard.

References

 [R96] M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 67. http://www.math.sfu.ca/~cbm/aands/
 [R97] Wikipedia, “Logarithm”. http://en.wikipedia.org/wiki/Logarithm

Examples

>>> np.log([1, np.e, np.e**2, 0])
array([  0.,   1.,   2., -Inf])

dask.array.log10(x[, out])

Return the base 10 logarithm of the input array, element-wise.

Parameters: x : array_like Input values. y : ndarray The logarithm to the base 10 of x, element-wise. NaNs are returned where x is negative.

emath.log10

Notes

Logarithm is a multivalued function: for each x there is an infinite number of z such that 10**z = x. The convention is to return the z whose imaginary part lies in [-pi, pi].

For real-valued input data types, log10 always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, log10 is a complex analytical function that has a branch cut [-inf, 0] and is continuous from above on it. log10 handles the floating-point negative zero as an infinitesimal negative number, conforming to the C99 standard.

References

 [R98] M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 67. http://www.math.sfu.ca/~cbm/aands/
 [R99] Wikipedia, “Logarithm”. http://en.wikipedia.org/wiki/Logarithm

Examples

>>> np.log10([1e-15, -3.])
array([-15.,  NaN])

dask.array.log1p(x[, out])

Return the natural logarithm of one plus the input array, element-wise.

Calculates log(1 + x).

Parameters: x : array_like Input values. y : ndarray Natural logarithm of 1 + x, element-wise.

expm1
exp(x) - 1, the inverse of log1p.

Notes

For real-valued input, log1p is accurate also for x so small that 1 + x == 1 in floating-point accuracy.

Logarithm is a multivalued function: for each x there is an infinite number of z such that exp(z) = 1 + x. The convention is to return the z whose imaginary part lies in [-pi, pi].

For real-valued input data types, log1p always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, log1p is a complex analytical function that has a branch cut [-inf, -1] and is continuous from above on it. log1p handles the floating-point negative zero as an infinitesimal negative number, conforming to the C99 standard.

References

 [R100] M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 67. http://www.math.sfu.ca/~cbm/aands/
 [R101] Wikipedia, “Logarithm”. http://en.wikipedia.org/wiki/Logarithm

Examples

>>> np.log1p(1e-99)
1e-99
>>> np.log(1 + 1e-99)
0.0

dask.array.log2(x[, out])

Base-2 logarithm of x.

Parameters: x : array_like Input values. y : ndarray Base-2 logarithm of x.

log, log10, log1p, emath.log2

Notes

New in version 1.3.0.

Logarithm is a multivalued function: for each x there is an infinite number of z such that 2**z = x. The convention is to return the z whose imaginary part lies in [-pi, pi].

For real-valued input data types, log2 always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, log2 is a complex analytical function that has a branch cut [-inf, 0] and is continuous from above on it. log2 handles the floating-point negative zero as an infinitesimal negative number, conforming to the C99 standard.

Examples

>>> x = np.array([0, 1, 2, 2**4])
>>> np.log2(x)
array([-Inf,   0.,   1.,   4.])

>>> xi = np.array([0+1.j, 1, 2+0.j, 4.j])
>>> np.log2(xi)
array([ 0.+2.26618007j,  0.+0.j        ,  1.+0.j        ,  2.+2.26618007j])

dask.array.logaddexp(x1, x2[, out])

Logarithm of the sum of exponentiations of the inputs.

Calculates log(exp(x1) + exp(x2)). This function is useful in statistics where the calculated probabilities of events may be so small as to exceed the range of normal floating point numbers. In such cases the logarithm of the calculated probability is stored. This function allows adding probabilities stored in such a fashion.

Parameters: x1, x2 : array_like Input values. result : ndarray Logarithm of exp(x1) + exp(x2).

logaddexp2
Logarithm of the sum of exponentiations of inputs in base 2.

Notes

New in version 1.3.0.

Examples

>>> prob1 = np.log(1e-50)
>>> prob2 = np.log(2.5e-50)
>>> prob12
-113.87649168120691
>>> np.exp(prob12)
3.5000000000000057e-50

dask.array.logaddexp2(x1, x2[, out])

Logarithm of the sum of exponentiations of the inputs in base-2.

Calculates log2(2**x1 + 2**x2). This function is useful in machine learning when the calculated probabilities of events may be so small as to exceed the range of normal floating point numbers. In such cases the base-2 logarithm of the calculated probability can be used instead. This function allows adding probabilities stored in such a fashion.

Parameters: x1, x2 : array_like Input values. out : ndarray, optional Array to store results in. result : ndarray Base-2 logarithm of 2**x1 + 2**x2.

logaddexp
Logarithm of the sum of exponentiations of the inputs.

Notes

New in version 1.3.0.

Examples

>>> prob1 = np.log2(1e-50)
>>> prob2 = np.log2(2.5e-50)
>>> prob1, prob2, prob12
(-166.09640474436813, -164.77447664948076, -164.28904982231052)
>>> 2**prob12
3.4999999999999914e-50

dask.array.logical_and(x1, x2[, out])

Compute the truth value of x1 AND x2 element-wise.

Parameters: x1, x2 : array_like Input arrays. x1 and x2 must be of the same shape. y : ndarray or bool Boolean result with the same shape as x1 and x2 of the logical AND operation on corresponding elements of x1 and x2.

Examples

>>> np.logical_and(True, False)
False
>>> np.logical_and([True, False], [False, False])
array([False, False], dtype=bool)

>>> x = np.arange(5)
>>> np.logical_and(x>1, x<4)
array([False, False,  True,  True, False], dtype=bool)

dask.array.logical_not(x[, out])

Compute the truth value of NOT x element-wise.

Parameters: x : array_like Logical NOT is applied to the elements of x. y : bool or ndarray of bool Boolean result with the same shape as x of the NOT operation on elements of x.

Examples

>>> np.logical_not(3)
False
>>> np.logical_not([True, False, 0, 1])
array([False,  True,  True, False], dtype=bool)

>>> x = np.arange(5)
>>> np.logical_not(x<3)
array([False, False, False,  True,  True], dtype=bool)

dask.array.logical_or(x1, x2[, out])

Compute the truth value of x1 OR x2 element-wise.

Parameters: x1, x2 : array_like Logical OR is applied to the elements of x1 and x2. They have to be of the same shape. y : ndarray or bool Boolean result with the same shape as x1 and x2 of the logical OR operation on elements of x1 and x2.

Examples

>>> np.logical_or(True, False)
True
>>> np.logical_or([True, False], [False, False])
array([ True, False], dtype=bool)

>>> x = np.arange(5)
>>> np.logical_or(x < 1, x > 3)
array([ True, False, False, False,  True], dtype=bool)

dask.array.logical_xor(x1, x2[, out])

Compute the truth value of x1 XOR x2, element-wise.

Parameters: x1, x2 : array_like Logical XOR is applied to the elements of x1 and x2. They must be broadcastable to the same shape. y : bool or ndarray of bool Boolean result of the logical XOR operation applied to the elements of x1 and x2; the shape is determined by whether or not broadcasting of one or both arrays was required.

Examples

>>> np.logical_xor(True, False)
True
>>> np.logical_xor([True, True, False, False], [True, False, True, False])
array([False,  True,  True, False], dtype=bool)

>>> x = np.arange(5)
>>> np.logical_xor(x < 1, x > 3)
array([ True, False, False, False,  True], dtype=bool)


Simple example showing support of broadcasting

>>> np.logical_xor(0, np.eye(2))
array([[ True, False],
[False,  True]], dtype=bool)

dask.array.max(a, axis=None, keepdims=False, split_every=None)

Return the maximum of an array or maximum along an axis.

Parameters: a : array_like Input data. axis : None or int or tuple of ints, optional Axis or axes along which to operate. By default, flattened input is used. If this is a tuple of ints, the maximum is selected over multiple axes, instead of a single axis or all the axes as before. out : ndarray, optional Alternative output array in which to place the result. Must be of the same shape and buffer length as the expected output. See doc.ufuncs (Section “Output arguments”) for more details. keepdims : bool, optional If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr. amax : ndarray or scalar Maximum of a. If axis is None, the result is a scalar value. If axis is given, the result is an array of dimension a.ndim - 1.

amin
The minimum value of an array along a given axis, propagating any NaNs.
nanmax
The maximum value of an array along a given axis, ignoring any NaNs.
maximum
Element-wise maximum of two arrays, propagating any NaNs.
fmax
Element-wise maximum of two arrays, ignoring any NaNs.
argmax
Return the indices of the maximum values.

Notes

NaN values are propagated, that is if at least one item is NaN, the corresponding max value will be NaN as well. To ignore NaN values (MATLAB behavior), please use nanmax.

Don’t use amax for element-wise comparison of 2 arrays; when a.shape[0] is 2, maximum(a[0], a[1]) is faster than amax(a, axis=0).

Examples

>>> a = np.arange(4).reshape((2,2))
>>> a
array([[0, 1],
[2, 3]])
>>> np.amax(a)           # Maximum of the flattened array
3
>>> np.amax(a, axis=0)   # Maxima along the first axis
array([2, 3])
>>> np.amax(a, axis=1)   # Maxima along the second axis
array([1, 3])

>>> b = np.arange(5, dtype=np.float)
>>> b[2] = np.NaN
>>> np.amax(b)
nan
>>> np.nanmax(b)
4.0

dask.array.maximum(x1, x2[, out])

Element-wise maximum of array elements.

Compare two arrays and returns a new array containing the element-wise maxima. If one of the elements being compared is a NaN, then that element is returned. If both elements are NaNs then the first is returned. The latter distinction is important for complex NaNs, which are defined as at least one of the real or imaginary parts being a NaN. The net effect is that NaNs are propagated.

Parameters: x1, x2 : array_like The arrays holding the elements to be compared. They must have the same shape, or shapes that can be broadcast to a single shape. y : ndarray or scalar The maximum of x1 and x2, element-wise. Returns scalar if both x1 and x2 are scalars.

minimum
Element-wise minimum of two arrays, propagates NaNs.
fmax
Element-wise maximum of two arrays, ignores NaNs.
amax
The maximum value of an array along a given axis, propagates NaNs.
nanmax
The maximum value of an array along a given axis, ignores NaNs.

fmin, amin, nanmin

Notes

The maximum is equivalent to np.where(x1 >= x2, x1, x2) when neither x1 nor x2 are nans, but it is faster and does proper broadcasting.

Examples

>>> np.maximum([2, 3, 4], [1, 5, 2])
array([2, 5, 4])

>>> np.maximum(np.eye(2), [0.5, 2]) # broadcasting
array([[ 1. ,  2. ],
[ 0.5,  2. ]])

>>> np.maximum([np.nan, 0, np.nan], [0, np.nan, np.nan])
array([ NaN,  NaN,  NaN])
>>> np.maximum(np.Inf, 1)
inf

dask.array.mean(a, axis=None, dtype=None, keepdims=False, split_every=None)

Compute the arithmetic mean along the specified axis.

Returns the average of the array elements. The average is taken over the flattened array by default, otherwise over the specified axis. float64 intermediate and return values are used for integer inputs.

Parameters: a : array_like Array containing numbers whose mean is desired. If a is not an array, a conversion is attempted. axis : None or int or tuple of ints, optional Axis or axes along which the means are computed. The default is to compute the mean of the flattened array. If this is a tuple of ints, a mean is performed over multiple axes, instead of a single axis or all the axes as before. dtype : data-type, optional Type to use in computing the mean. For integer inputs, the default is float64; for floating point inputs, it is the same as the input dtype. out : ndarray, optional Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See doc.ufuncs for details. keepdims : bool, optional If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr. m : ndarray, see dtype parameter above If out=None, returns a new array containing the mean values, otherwise a reference to the output array is returned.

average
Weighted average

Notes

The arithmetic mean is the sum of the elements along the axis divided by the number of elements.

Note that for floating-point input, the mean is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-precision accumulator using the dtype keyword can alleviate this issue.

Examples

>>> a = np.array([[1, 2], [3, 4]])
>>> np.mean(a)
2.5
>>> np.mean(a, axis=0)
array([ 2.,  3.])
>>> np.mean(a, axis=1)
array([ 1.5,  3.5])


In single precision, mean can be inaccurate:

>>> a = np.zeros((2, 512*512), dtype=np.float32)
>>> a[0, :] = 1.0
>>> a[1, :] = 0.1
>>> np.mean(a)
0.546875


Computing the mean in float64 is more accurate:

>>> np.mean(a, dtype=np.float64)
0.55000000074505806

dask.array.min(a, axis=None, keepdims=False, split_every=None)

Return the minimum of an array or minimum along an axis.

Parameters: a : array_like Input data. axis : None or int or tuple of ints, optional Axis or axes along which to operate. By default, flattened input is used. If this is a tuple of ints, the minimum is selected over multiple axes, instead of a single axis or all the axes as before. out : ndarray, optional Alternative output array in which to place the result. Must be of the same shape and buffer length as the expected output. See doc.ufuncs (Section “Output arguments”) for more details. keepdims : bool, optional If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr. amin : ndarray or scalar Minimum of a. If axis is None, the result is a scalar value. If axis is given, the result is an array of dimension a.ndim - 1.

amax
The maximum value of an array along a given axis, propagating any NaNs.
nanmin
The minimum value of an array along a given axis, ignoring any NaNs.
minimum
Element-wise minimum of two arrays, propagating any NaNs.
fmin
Element-wise minimum of two arrays, ignoring any NaNs.
argmin
Return the indices of the minimum values.

Notes

NaN values are propagated, that is if at least one item is NaN, the corresponding min value will be NaN as well. To ignore NaN values (MATLAB behavior), please use nanmin.

Don’t use amin for element-wise comparison of 2 arrays; when a.shape[0] is 2, minimum(a[0], a[1]) is faster than amin(a, axis=0).

Examples

>>> a = np.arange(4).reshape((2,2))
>>> a
array([[0, 1],
[2, 3]])
>>> np.amin(a)           # Minimum of the flattened array
0
>>> np.amin(a, axis=0)   # Minima along the first axis
array([0, 1])
>>> np.amin(a, axis=1)   # Minima along the second axis
array([0, 2])

>>> b = np.arange(5, dtype=np.float)
>>> b[2] = np.NaN
>>> np.amin(b)
nan
>>> np.nanmin(b)
0.0

dask.array.minimum(x1, x2[, out])

Element-wise minimum of array elements.

Compare two arrays and returns a new array containing the element-wise minima. If one of the elements being compared is a NaN, then that element is returned. If both elements are NaNs then the first is returned. The latter distinction is important for complex NaNs, which are defined as at least one of the real or imaginary parts being a NaN. The net effect is that NaNs are propagated.

Parameters: x1, x2 : array_like The arrays holding the elements to be compared. They must have the same shape, or shapes that can be broadcast to a single shape. y : ndarray or scalar The minimum of x1 and x2, element-wise. Returns scalar if both x1 and x2 are scalars.

maximum
Element-wise maximum of two arrays, propagates NaNs.
fmin
Element-wise minimum of two arrays, ignores NaNs.
amin
The minimum value of an array along a given axis, propagates NaNs.
nanmin
The minimum value of an array along a given axis, ignores NaNs.

fmax, amax, nanmax

Notes

The minimum is equivalent to np.where(x1 <= x2, x1, x2) when neither x1 nor x2 are NaNs, but it is faster and does proper broadcasting.

Examples

>>> np.minimum([2, 3, 4], [1, 5, 2])
array([1, 3, 2])

>>> np.minimum(np.eye(2), [0.5, 2]) # broadcasting
array([[ 0.5,  0. ],
[ 0. ,  1. ]])

>>> np.minimum([np.nan, 0, np.nan],[0, np.nan, np.nan])
array([ NaN,  NaN,  NaN])
>>> np.minimum(-np.Inf, 1)
-inf

dask.array.modf(x[, out1, out2])

Return the fractional and integral parts of an array, element-wise.

The fractional and integral parts are negative if the given number is negative.

Parameters: x : array_like Input array. y1 : ndarray Fractional part of x. y2 : ndarray Integral part of x.

Notes

For integer input the return values are floats.

Examples

>>> np.modf([0, 3.5])
(array([ 0. ,  0.5]), array([ 0.,  3.]))
>>> np.modf(-0.5)
(-0.5, -0)

dask.array.moment(a, order, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None)
dask.array.nanargmax(x, axis=None, split_every=None)
dask.array.nanargmin(x, axis=None, split_every=None)
dask.array.nancumprod(x, axis, dtype=None)

Return the cumulative product of array elements over a given axis treating Not a Numbers (NaNs) as one. The cumulative product does not change when NaNs are encountered and leading NaNs are replaced by ones.

Ones are returned for slices that are all-NaN or empty.

New in version 1.12.0.

Parameters: a : array_like Input array. axis : int, optional Axis along which the cumulative product is computed. By default the input is flattened. dtype : dtype, optional Type of the returned array, as well as of the accumulator in which the elements are multiplied. If dtype is not specified, it defaults to the dtype of a, unless a has an integer dtype with a precision less than that of the default platform integer. In that case, the default platform integer is used instead. out : ndarray, optional Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output but the type of the resulting values will be cast if necessary. nancumprod : ndarray A new array holding the result is returned unless out is specified, in which case it is returned.

numpy.cumprod
Cumulative product across array propagating NaNs.
isnan
Show which elements are NaN.

Examples

>>> np.nancumprod(1)
array([1])
>>> np.nancumprod([1])
array([1])
>>> np.nancumprod([1, np.nan])
array([ 1.,  1.])
>>> a = np.array([[1, 2], [3, np.nan]])
>>> np.nancumprod(a)
array([ 1.,  2.,  6.,  6.])
>>> np.nancumprod(a, axis=0)
array([[ 1.,  2.],
[ 3.,  2.]])
>>> np.nancumprod(a, axis=1)
array([[ 1.,  2.],
[ 3.,  3.]])

dask.array.nancumsum(x, axis, dtype=None)

Return the cumulative sum of array elements over a given axis treating Not a Numbers (NaNs) as zero. The cumulative sum does not change when NaNs are encountered and leading NaNs are replaced by zeros.

Zeros are returned for slices that are all-NaN or empty.

New in version 1.12.0.

Parameters: a : array_like Input array. axis : int, optional Axis along which the cumulative sum is computed. The default (None) is to compute the cumsum over the flattened array. dtype : dtype, optional Type of the returned array and of the accumulator in which the elements are summed. If dtype is not specified, it defaults to the dtype of a, unless a has an integer dtype with a precision less than that of the default platform integer. In that case, the default platform integer is used. out : ndarray, optional Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output but the type will be cast if necessary. See doc.ufuncs (Section “Output arguments”) for more details. nancumsum : ndarray. A new array holding the result is returned unless out is specified, in which it is returned. The result has the same size as a, and the same shape as a if axis is not None or a is a 1-d array.

numpy.cumsum
Cumulative sum across array propagating NaNs.
isnan
Show which elements are NaN.

Examples

>>> np.nancumsum(1)
array([1])
>>> np.nancumsum([1])
array([1])
>>> np.nancumsum([1, np.nan])
array([ 1.,  1.])
>>> a = np.array([[1, 2], [3, np.nan]])
>>> np.nancumsum(a)
array([ 1.,  3.,  6.,  6.])
>>> np.nancumsum(a, axis=0)
array([[ 1.,  2.],
[ 4.,  2.]])
>>> np.nancumsum(a, axis=1)
array([[ 1.,  3.],
[ 3.,  3.]])

dask.array.nanmax(a, axis=None, keepdims=False, split_every=None)

Return the maximum of an array or maximum along an axis, ignoring any NaNs. When all-NaN slices are encountered a RuntimeWarning is raised and NaN is returned for that slice.

Parameters: a : array_like Array containing numbers whose maximum is desired. If a is not an array, a conversion is attempted. axis : int, optional Axis along which the maximum is computed. The default is to compute the maximum of the flattened array. out : ndarray, optional Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See doc.ufuncs for details. New in version 1.8.0. keepdims : bool, optional If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a. New in version 1.8.0. nanmax : ndarray An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, an ndarray scalar is returned. The same dtype as a is returned.

nanmin
The minimum value of an array along a given axis, ignoring any NaNs.
amax
The maximum value of an array along a given axis, propagating any NaNs.
fmax
Element-wise maximum of two arrays, ignoring any NaNs.
maximum
Element-wise maximum of two arrays, propagating any NaNs.
isnan
Shows which elements are Not a Number (NaN).
isfinite
Shows which elements are neither NaN nor infinity.

amin, fmin, minimum

Notes

Numpy uses the IEEE Standard for Binary Floating-Point for Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity. Positive infinity is treated as a very large number and negative infinity is treated as a very small (i.e. negative) number.

If the input has a integer type the function is equivalent to np.max.

Examples

>>> a = np.array([[1, 2], [3, np.nan]])
>>> np.nanmax(a)
3.0
>>> np.nanmax(a, axis=0)
array([ 3.,  2.])
>>> np.nanmax(a, axis=1)
array([ 2.,  3.])


When positive infinity and negative infinity are present:

>>> np.nanmax([1, 2, np.nan, np.NINF])
2.0
>>> np.nanmax([1, 2, np.nan, np.inf])
inf

dask.array.nanmean(a, axis=None, dtype=None, keepdims=False, split_every=None)

Compute the arithmetic mean along the specified axis, ignoring NaNs.

Returns the average of the array elements. The average is taken over the flattened array by default, otherwise over the specified axis. float64 intermediate and return values are used for integer inputs.

For all-NaN slices, NaN is returned and a RuntimeWarning is raised.

New in version 1.8.0.

Parameters: a : array_like Array containing numbers whose mean is desired. If a is not an array, a conversion is attempted. axis : int, optional Axis along which the means are computed. The default is to compute the mean of the flattened array. dtype : data-type, optional Type to use in computing the mean. For integer inputs, the default is float64; for inexact inputs, it is the same as the input dtype. out : ndarray, optional Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See doc.ufuncs for details. keepdims : bool, optional If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr. m : ndarray, see dtype parameter above If out=None, returns a new array containing the mean values, otherwise a reference to the output array is returned. Nan is returned for slices that contain only NaNs.

average
Weighted average
mean
Arithmetic mean taken while not ignoring NaNs

Notes

The arithmetic mean is the sum of the non-NaN elements along the axis divided by the number of non-NaN elements.

Note that for floating-point input, the mean is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32. Specifying a higher-precision accumulator using the dtype keyword can alleviate this issue.

Examples

>>> a = np.array([[1, np.nan], [3, 4]])
>>> np.nanmean(a)
2.6666666666666665
>>> np.nanmean(a, axis=0)
array([ 2.,  4.])
>>> np.nanmean(a, axis=1)
array([ 1.,  3.5])

dask.array.nanmin(a, axis=None, keepdims=False, split_every=None)

Return minimum of an array or minimum along an axis, ignoring any NaNs. When all-NaN slices are encountered a RuntimeWarning is raised and Nan is returned for that slice.

Parameters: a : array_like Array containing numbers whose minimum is desired. If a is not an array, a conversion is attempted. axis : int, optional Axis along which the minimum is computed. The default is to compute the minimum of the flattened array. out : ndarray, optional Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See doc.ufuncs for details. New in version 1.8.0. keepdims : bool, optional If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a. New in version 1.8.0. nanmin : ndarray An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, an ndarray scalar is returned. The same dtype as a is returned.

nanmax
The maximum value of an array along a given axis, ignoring any NaNs.
amin
The minimum value of an array along a given axis, propagating any NaNs.
fmin
Element-wise minimum of two arrays, ignoring any NaNs.
minimum
Element-wise minimum of two arrays, propagating any NaNs.
isnan
Shows which elements are Not a Number (NaN).
isfinite
Shows which elements are neither NaN nor infinity.

amax, fmax, maximum

Notes

Numpy uses the IEEE Standard for Binary Floating-Point for Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity. Positive infinity is treated as a very large number and negative infinity is treated as a very small (i.e. negative) number.

If the input has a integer type the function is equivalent to np.min.

Examples

>>> a = np.array([[1, 2], [3, np.nan]])
>>> np.nanmin(a)
1.0
>>> np.nanmin(a, axis=0)
array([ 1.,  2.])
>>> np.nanmin(a, axis=1)
array([ 1.,  3.])


When positive infinity and negative infinity are present:

>>> np.nanmin([1, 2, np.nan, np.inf])
1.0
>>> np.nanmin([1, 2, np.nan, np.NINF])
-inf

dask.array.nanprod(a, axis=None, dtype=None, keepdims=False, split_every=None)

Return the product of array elements over a given axis treating Not a Numbers (NaNs) as zero.

One is returned for slices that are all-NaN or empty.

New in version 1.10.0.

Parameters: a : array_like Array containing numbers whose sum is desired. If a is not an array, a conversion is attempted. axis : int, optional Axis along which the product is computed. The default is to compute the product of the flattened array. dtype : data-type, optional The type of the returned array and of the accumulator in which the elements are summed. By default, the dtype of a is used. An exception is when a has an integer type with less precision than the platform (u)intp. In that case, the default will be either (u)int32 or (u)int64 depending on whether the platform is 32 or 64 bits. For inexact inputs, dtype must be inexact. out : ndarray, optional Alternate output array in which to place the result. The default is None. If provided, it must have the same shape as the expected output, but the type will be cast if necessary. See doc.ufuncs for details. The casting of NaN to integer can yield unexpected results. keepdims : bool, optional If True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr. y : ndarray or numpy scalar

numpy.prod
Product across array propagating NaNs.
isnan
Show which elements are NaN.

Notes

Numpy integer arithmetic is modular. If the size of a product exceeds the size of an integer accumulator, its value will wrap around and the result will be incorrect. Specifying dtype=double can alleviate that problem.

Examples

>>> np.nanprod(1)
1
>>> np.nanprod([1])
1
>>> np.nanprod([1, np.nan])
1.0
>>> a = np.array([[1, 2], [3, np.nan]])
>>> np.nanprod(a)
6.0
>>> np.nanprod(a, axis=0)
array([ 3.,  2.])

dask.array.nanstd(a, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None)

Compute the standard deviation along the specified axis, while ignoring NaNs.

Returns the standard deviation, a measure of the spread of a distribution, of the non-NaN array elements. The standard deviation is computed for the flattened array by default, otherwise over the specified axis.

For all-NaN slices or slices with zero degrees of freedom, NaN is returned and a RuntimeWarning is raised.

New in version 1.8.0.

Parameters: a : array_like Calculate the standard deviation of the non-NaN values. axis : int, optional Axis along which the standard deviation is computed. The default is to compute the standard deviation of the flattened array. dtype : dtype, optional Type to use in computing the standard deviation. For arrays of integer type the default is float64, for arrays of float types it is the same as the array type. out : ndarray, optional Alternative output array in which to place the result. It must have the same shape as the expected output but the type (of the calculated values) will be cast if necessary. ddof : int, optional Means Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of non-NaN elements. By default ddof is zero. keepdims : bool, optional If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr. standard_deviation : ndarray, see dtype parameter above. If out is None, return a new array containing the standard deviation, otherwise return a reference to the output array. If ddof is >= the number of non-NaN elements in a slice or the slice contains only NaNs, then the result for that slice is NaN.

numpy.doc.ufuncs
Section “Output arguments”

Notes

The standard deviation is the square root of the average of the squared deviations from the mean: std = sqrt(mean(abs(x - x.mean())**2)).

The average squared deviation is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of the infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables. The standard deviation computed in this function is the square root of the estimated variance, so even with ddof=1, it will not be an unbiased estimate of the standard deviation per se.

Note that, for complex numbers, std takes the absolute value before squaring, so that the result is always real and nonnegative.

For floating-point input, the std is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-accuracy accumulator using the dtype keyword can alleviate this issue.

Examples

>>> a = np.array([[1, np.nan], [3, 4]])
>>> np.nanstd(a)
1.247219128924647
>>> np.nanstd(a, axis=0)
array([ 1.,  0.])
>>> np.nanstd(a, axis=1)
array([ 0.,  0.5])

dask.array.nansum(a, axis=None, dtype=None, keepdims=False, split_every=None)

Return the sum of array elements over a given axis treating Not a Numbers (NaNs) as zero.

In Numpy versions <= 1.8 Nan is returned for slices that are all-NaN or empty. In later versions zero is returned.

Parameters: a : array_like Array containing numbers whose sum is desired. If a is not an array, a conversion is attempted. axis : int, optional Axis along which the sum is computed. The default is to compute the sum of the flattened array. dtype : data-type, optional The type of the returned array and of the accumulator in which the elements are summed. By default, the dtype of a is used. An exception is when a has an integer type with less precision than the platform (u)intp. In that case, the default will be either (u)int32 or (u)int64 depending on whether the platform is 32 or 64 bits. For inexact inputs, dtype must be inexact. New in version 1.8.0. out : ndarray, optional Alternate output array in which to place the result. The default is None. If provided, it must have the same shape as the expected output, but the type will be cast if necessary. See doc.ufuncs for details. The casting of NaN to integer can yield unexpected results. New in version 1.8.0. keepdims : bool, optional If True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr. New in version 1.8.0. y : ndarray or numpy scalar

numpy.sum
Sum across array propagating NaNs.
isnan
Show which elements are NaN.
isfinite
Show which elements are not NaN or +/-inf.

Notes

If both positive and negative infinity are present, the sum will be Not A Number (NaN).

Numpy integer arithmetic is modular. If the size of a sum exceeds the size of an integer accumulator, its value will wrap around and the result will be incorrect. Specifying dtype=double can alleviate that problem.

Examples

>>> np.nansum(1)
1
>>> np.nansum([1])
1
>>> np.nansum([1, np.nan])
1.0
>>> a = np.array([[1, 1], [1, np.nan]])
>>> np.nansum(a)
3.0
>>> np.nansum(a, axis=0)
array([ 2.,  1.])
>>> np.nansum([1, np.nan, np.inf])
inf
>>> np.nansum([1, np.nan, np.NINF])
-inf
>>> np.nansum([1, np.nan, np.inf, -np.inf]) # both +/- infinity present
nan

dask.array.nanvar(a, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None)

Compute the variance along the specified axis, while ignoring NaNs.

Returns the variance of the array elements, a measure of the spread of a distribution. The variance is computed for the flattened array by default, otherwise over the specified axis.

For all-NaN slices or slices with zero degrees of freedom, NaN is returned and a RuntimeWarning is raised.

New in version 1.8.0.

Parameters: a : array_like Array containing numbers whose variance is desired. If a is not an array, a conversion is attempted. axis : int, optional Axis along which the variance is computed. The default is to compute the variance of the flattened array. dtype : data-type, optional Type to use in computing the variance. For arrays of integer type the default is float32; for arrays of float types it is the same as the array type. out : ndarray, optional Alternate output array in which to place the result. It must have the same shape as the expected output, but the type is cast if necessary. ddof : int, optional “Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of non-NaN elements. By default ddof is zero. keepdims : bool, optional If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr. variance : ndarray, see dtype parameter above If out is None, return a new array containing the variance, otherwise return a reference to the output array. If ddof is >= the number of non-NaN elements in a slice or the slice contains only NaNs, then the result for that slice is NaN.

std
Standard deviation
mean
Average
var
Variance while not ignoring NaNs
numpy.doc.ufuncs
Section “Output arguments”

Notes

The variance is the average of the squared deviations from the mean, i.e., var = mean(abs(x - x.mean())**2).

The mean is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of a hypothetical infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables.

Note that for complex numbers, the absolute value is taken before squaring, so that the result is always real and nonnegative.

For floating-point input, the variance is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-accuracy accumulator using the dtype keyword can alleviate this issue.

Examples

>>> a = np.array([[1, np.nan], [3, 4]])
>>> np.var(a)
1.5555555555555554
>>> np.nanvar(a, axis=0)
array([ 1.,  0.])
>>> np.nanvar(a, axis=1)
array([ 0.,  0.25])

dask.array.nextafter(x1, x2[, out])

Return the next floating-point value after x1 towards x2, element-wise.

Parameters: x1 : array_like Values to find the next representable value of. x2 : array_like The direction where to look for the next representable value of x1. out : ndarray, optional Array into which the output is placed. Its type is preserved and it must be of the right shape to hold the output. See doc.ufuncs. out : array_like The next representable values of x1 in the direction of x2.

Examples

>>> eps = np.finfo(np.float64).eps
>>> np.nextafter(1, 2) == eps + 1
True
>>> np.nextafter([1, 2], [2, 1]) == [eps + 1, 2 - eps]
array([ True,  True], dtype=bool)

dask.array.notnull(values)

dask.array.ones()

Blocked variant of ones

Follows the signature of ones exactly except that it also requires a keyword argument chunks=(...)

Original signature follows below.

Return a new array of given shape and type, filled with ones.

Parameters: shape : int or sequence of ints Shape of the new array, e.g., (2, 3) or 2. dtype : data-type, optional The desired data-type for the array, e.g., numpy.int8. Default is numpy.float64. order : {‘C’, ‘F’}, optional Whether to store multidimensional data in C- or Fortran-contiguous (row- or column-wise) order in memory. out : ndarray Array of ones with the given shape, dtype, and order.

zeros, ones_like

Examples

>>> np.ones(5)
array([ 1.,  1.,  1.,  1.,  1.])

>>> np.ones((5,), dtype=np.int)
array([1, 1, 1, 1, 1])

>>> np.ones((2, 1))
array([[ 1.],
[ 1.]])

>>> s = (2,2)
>>> np.ones(s)
array([[ 1.,  1.],
[ 1.,  1.]])

dask.array.percentile(a, q, interpolation='linear')

Approximate percentile of 1-D array

dask.array.prod(a, axis=None, dtype=None, keepdims=False, split_every=None)

Return the product of array elements over a given axis.

Parameters: a : array_like Input data. axis : None or int or tuple of ints, optional Axis or axes along which a product is performed. The default, axis=None, will calculate the product of all the elements in the input array. If axis is negative it counts from the last to the first axis. New in version 1.7.0. If axis is a tuple of ints, a product is performed on all of the axes specified in the tuple instead of a single axis or all the axes as before. dtype : dtype, optional The type of the returned array, as well as of the accumulator in which the elements are multiplied. The dtype of a is used by default unless a has an integer dtype of less precision than the default platform integer. In that case, if a is signed then the platform integer is used while if a is unsigned then an unsigned integer of the same precision as the platform integer is used. out : ndarray, optional Alternative output array in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary. keepdims : bool, optional If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array. product_along_axis : ndarray, see dtype parameter above. An array shaped as a but with the specified axis removed. Returns a reference to out if specified.

ndarray.prod
equivalent method
numpy.doc.ufuncs
Section “Output arguments”

Notes

Arithmetic is modular when using integer types, and no error is raised on overflow. That means that, on a 32-bit platform:

>>> x = np.array([536870910, 536870910, 536870910, 536870910])
>>> np.prod(x) #random
16


The product of an empty array is the neutral element 1:

>>> np.prod([])
1.0


Examples

By default, calculate the product of all elements:

>>> np.prod([1.,2.])
2.0


Even when the input array is two-dimensional:

>>> np.prod([[1.,2.],[3.,4.]])
24.0


But we can also specify the axis over which to multiply:

>>> np.prod([[1.,2.],[3.,4.]], axis=1)
array([  2.,  12.])


If the type of x is unsigned, then the output type is the unsigned platform integer:

>>> x = np.array([1, 2, 3], dtype=np.uint8)
>>> np.prod(x).dtype == np.uint
True


If x is of a signed integer type, then the output type is the default platform integer:

>>> x = np.array([1, 2, 3], dtype=np.int8)
>>> np.prod(x).dtype == np.int
True

dask.array.rad2deg(x[, out])

Convert angles from radians to degrees.

Parameters: x : array_like Angle in radians. out : ndarray, optional Array into which the output is placed. Its type is preserved and it must be of the right shape to hold the output. See doc.ufuncs. y : ndarray The corresponding angle in degrees.

deg2rad
Convert angles from degrees to radians.
unwrap
Remove large jumps in angle by wrapping.

Notes

New in version 1.3.0.

rad2deg(x) is 180 * x / pi.

Examples

>>> np.rad2deg(np.pi/2)
90.0

dask.array.radians(x[, out])

Convert angles from degrees to radians.

Parameters: x : array_like Input array in degrees. out : ndarray, optional Output array of same shape as x. y : ndarray The corresponding radian values.

deg2rad
equivalent function

Examples

Convert a degree array to radians

>>> deg = np.arange(12.) * 30.
array([ 0.        ,  0.52359878,  1.04719755,  1.57079633,  2.0943951 ,
2.61799388,  3.14159265,  3.66519143,  4.1887902 ,  4.71238898,
5.23598776,  5.75958653])

>>> out = np.zeros((deg.shape))
>>> ret is out
True

dask.array.ravel(array)

Return a contiguous flattened array.

A 1-D array, containing the elements of the input, is returned. A copy is made only if needed.

As of NumPy 1.10, the returned array will have the same type as the input array. (for example, a masked array will be returned for a masked array input)

Parameters: a : array_like Input array. The elements in a are read in the order specified by order, and packed as a 1-D array. order : {‘C’,’F’, ‘A’, ‘K’}, optional The elements of a are read using this index order. ‘C’ means to index the elements in row-major, C-style order, with the last axis index changing fastest, back to the first axis index changing slowest. ‘F’ means to index the elements in column-major, Fortran-style order, with the first index changing fastest, and the last index changing slowest. Note that the ‘C’ and ‘F’ options take no account of the memory layout of the underlying array, and only refer to the order of axis indexing. ‘A’ means to read the elements in Fortran-like index order if a is Fortran contiguous in memory, C-like order otherwise. ‘K’ means to read the elements in the order they occur in memory, except for reversing the data when strides are negative. By default, ‘C’ index order is used. y : array_like If a is a matrix, y is a 1-D ndarray, otherwise y is an array of the same subtype as a. The shape of the returned array is (a.size,). Matrices are special cased for backward compatibility.

ndarray.flat
1-D iterator over an array.
ndarray.flatten
1-D array copy of the elements of an array in row-major order.
ndarray.reshape
Change the shape of an array without changing its data.

Notes

In row-major, C-style order, in two dimensions, the row index varies the slowest, and the column index the quickest. This can be generalized to multiple dimensions, where row-major order implies that the index along the first axis varies slowest, and the index along the last quickest. The opposite holds for column-major, Fortran-style index ordering.

When a view is desired in as many cases as possible, arr.reshape(-1) may be preferable.

Examples

It is equivalent to reshape(-1, order=order).

>>> x = np.array([[1, 2, 3], [4, 5, 6]])
>>> print(np.ravel(x))
[1 2 3 4 5 6]

>>> print(x.reshape(-1))
[1 2 3 4 5 6]

>>> print(np.ravel(x, order='F'))
[1 4 2 5 3 6]


When order is ‘A’, it will preserve the array’s ‘C’ or ‘F’ ordering:

>>> print(np.ravel(x.T))
[1 4 2 5 3 6]
>>> print(np.ravel(x.T, order='A'))
[1 2 3 4 5 6]


When order is ‘K’, it will preserve orderings that are neither ‘C’ nor ‘F’, but won’t reverse axes:

>>> a = np.arange(3)[::-1]; a
array([2, 1, 0])
>>> a.ravel(order='C')
array([2, 1, 0])
>>> a.ravel(order='K')
array([2, 1, 0])

>>> a = np.arange(12).reshape(2,3,2).swapaxes(1,2); a
array([[[ 0,  2,  4],
[ 1,  3,  5]],
[[ 6,  8, 10],
[ 7,  9, 11]]])
>>> a.ravel(order='C')
array([ 0,  2,  4,  1,  3,  5,  6,  8, 10,  7,  9, 11])
>>> a.ravel(order='K')
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

dask.array.real(*args, **kwargs)

Return the real part of the elements of the array.

Parameters: val : array_like Input array. out : ndarray Output array. If val is real, the type of val is used for the output. If val has complex elements, the returned type is float.

real_if_close, imag, angle

Examples

>>> a = np.array([1+2j, 3+4j, 5+6j])
>>> a.real
array([ 1.,  3.,  5.])
>>> a.real = 9
>>> a
array([ 9.+2.j,  9.+4.j,  9.+6.j])
>>> a.real = np.array([9, 8, 7])
>>> a
array([ 9.+2.j,  8.+4.j,  7.+6.j])

dask.array.rechunk(x, chunks, threshold=4, block_size_limit=100000000.0)

Convert blocks in dask array x for new chunks.

>>> import dask.array as da
>>> a = np.random.uniform(0, 1, 7**4).reshape((7,) * 4)
>>> x = da.from_array(a, chunks=((2, 3, 2),)*4)
>>> x.chunks
((2, 3, 2), (2, 3, 2), (2, 3, 2), (2, 3, 2))

>>> y = rechunk(x, chunks=((2, 4, 1), (4, 2, 1), (4, 3), (7,)))
>>> y.chunks
((2, 4, 1), (4, 2, 1), (4, 3), (7,))


chunks also accept dict arguments mapping axis to blockshape

>>> y = rechunk(x, chunks={1: 2})  # rechunk axis 1 with blockshape 2

Parameters: x: dask array chunks: tuple The new block dimensions to create threshold: int The graph growth factor under which we don’t bother introducing an intermediate step block_size_limit: int The maximum block size (in bytes) we want to produce during an intermediate step
dask.array.reshape(array, shape)

Gives a new shape to an array without changing its data.

Parameters: a : array_like Array to be reshaped. newshape : int or tuple of ints The new shape should be compatible with the original shape. If an integer, then the result will be a 1-D array of that length. One shape dimension can be -1. In this case, the value is inferred from the length of the array and remaining dimensions. order : {‘C’, ‘F’, ‘A’}, optional Read the elements of a using this index order, and place the elements into the reshaped array using this index order. ‘C’ means to read / write the elements using C-like index order, with the last axis index changing fastest, back to the first axis index changing slowest. ‘F’ means to read / write the elements using Fortran-like index order, with the first index changing fastest, and the last index changing slowest. Note that the ‘C’ and ‘F’ options take no account of the memory layout of the underlying array, and only refer to the order of indexing. ‘A’ means to read / write the elements in Fortran-like index order if a is Fortran contiguous in memory, C-like order otherwise. reshaped_array : ndarray This will be a new view object if possible; otherwise, it will be a copy. Note there is no guarantee of the memory layout (C- or Fortran- contiguous) of the returned array.

ndarray.reshape
Equivalent method.

Notes

It is not always possible to change the shape of an array without copying the data. If you want an error to be raise if the data is copied, you should assign the new shape to the shape attribute of the array:

>>> a = np.zeros((10, 2))
# A transpose make the array non-contiguous
>>> b = a.T
# Taking a view makes it possible to modify the shape without modifying
# the initial object.
>>> c = b.view()
>>> c.shape = (20)
AttributeError: incompatible shape for a non-contiguous array


The order keyword gives the index ordering both for fetching the values from a, and then placing the values into the output array. For example, let’s say you have an array:

>>> a = np.arange(6).reshape((3, 2))
>>> a
array([[0, 1],
[2, 3],
[4, 5]])


You can think of reshaping as first raveling the array (using the given index order), then inserting the elements from the raveled array into the new array using the same kind of index ordering as was used for the raveling.

>>> np.reshape(a, (2, 3)) # C-like index ordering
array([[0, 1, 2],
[3, 4, 5]])
>>> np.reshape(np.ravel(a), (2, 3)) # equivalent to C ravel then C reshape
array([[0, 1, 2],
[3, 4, 5]])
>>> np.reshape(a, (2, 3), order='F') # Fortran-like index ordering
array([[0, 4, 3],
[2, 1, 5]])
>>> np.reshape(np.ravel(a, order='F'), (2, 3), order='F')
array([[0, 4, 3],
[2, 1, 5]])


Examples

>>> a = np.array([[1,2,3], [4,5,6]])
>>> np.reshape(a, 6)
array([1, 2, 3, 4, 5, 6])
>>> np.reshape(a, 6, order='F')
array([1, 4, 2, 5, 3, 6])

>>> np.reshape(a, (3,-1))       # the unspecified value is inferred to be 2
array([[1, 2],
[3, 4],
[5, 6]])

dask.array.rint(x[, out])

Round elements of the array to the nearest integer.

Parameters: x : array_like Input array. out : ndarray or scalar Output array is same shape and type as x.

Examples

>>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0])
>>> np.rint(a)
array([-2., -2., -0.,  0.,  2.,  2.,  2.])

dask.array.sign(x[, out])

Returns an element-wise indication of the sign of a number.

The sign function returns -1 if x < 0, 0 if x==0, 1 if x > 0. nan is returned for nan inputs.

For complex inputs, the sign function returns sign(x.real) + 0j if x.real != 0 else sign(x.imag) + 0j.

complex(nan, 0) is returned for complex nan inputs.

Parameters: x : array_like Input values. y : ndarray The sign of x.

Notes

There is more than one definition of sign in common use for complex numbers. The definition used here is equivalent to $$x/\sqrt{x*x}$$ which is different from a common alternative, $$x/|x|$$.

Examples

>>> np.sign([-5., 4.5])
array([-1.,  1.])
>>> np.sign(0)
0
>>> np.sign(5-2j)
(1+0j)

dask.array.signbit(x[, out])

Returns element-wise True where signbit is set (less than zero).

Parameters: x : array_like The input value(s). out : ndarray, optional Array into which the output is placed. Its type is preserved and it must be of the right shape to hold the output. See doc.ufuncs. result : ndarray of bool Output array, or reference to out if that was supplied.

Examples

>>> np.signbit(-1.2)
True
>>> np.signbit(np.array([1, -2.3, 2.1]))
array([False,  True, False], dtype=bool)

dask.array.sin(x[, out])

Trigonometric sine, element-wise.

Parameters: x : array_like Angle, in radians ($$2 \pi$$ rad equals 360 degrees). y : array_like The sine of each element of x.

Notes

The sine is one of the fundamental functions of trigonometry (the mathematical study of triangles). Consider a circle of radius 1 centered on the origin. A ray comes in from the $$+x$$ axis, makes an angle at the origin (measured counter-clockwise from that axis), and departs from the origin. The $$y$$ coordinate of the outgoing ray’s intersection with the unit circle is the sine of that angle. It ranges from -1 for $$x=3\pi / 2$$ to +1 for $$\pi / 2.$$ The function has zeroes where the angle is a multiple of $$\pi$$. Sines of angles between $$\pi$$ and $$2\pi$$ are negative. The numerous properties of the sine and related functions are included in any standard trigonometry text.

Examples

Print sine of one angle:

>>> np.sin(np.pi/2.)
1.0


Print sines of an array of angles given in degrees:

>>> np.sin(np.array((0., 30., 45., 60., 90.)) * np.pi / 180. )
array([ 0.        ,  0.5       ,  0.70710678,  0.8660254 ,  1.        ])


Plot the sine function:

>>> import matplotlib.pylab as plt
>>> x = np.linspace(-np.pi, np.pi, 201)
>>> plt.plot(x, np.sin(x))
>>> plt.ylabel('sin(x)')
>>> plt.axis('tight')
>>> plt.show()

dask.array.sinh(x[, out])

Hyperbolic sine, element-wise.

Equivalent to 1/2 * (np.exp(x) - np.exp(-x)) or -1j * np.sin(1j*x).

Parameters: x : array_like Input array. out : ndarray, optional Output array of same shape as x. y : ndarray The corresponding hyperbolic sine values. ValueError: invalid return array shape if out is provided and out.shape != x.shape (See Examples)

Notes

If out is provided, the function writes the result into it, and returns a reference to out. (See Examples)

References

M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York, NY: Dover, 1972, pg. 83.

Examples

>>> np.sinh(0)
0.0
>>> np.sinh(np.pi*1j/2)
1j
>>> np.sinh(np.pi*1j) # (exact value is 0)
1.2246063538223773e-016j
>>> # Discrepancy due to vagaries of floating point arithmetic.

>>> # Example of providing the optional output parameter
>>> out2 = np.sinh([0.1], out1)
>>> out2 is out1
True

>>> # Example of ValueError due to provision of shape mis-matched out
>>> np.sinh(np.zeros((3,3)),np.zeros((2,2)))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid return array shape

dask.array.sqrt(x[, out])

Return the positive square-root of an array, element-wise.

Parameters: x : array_like The values whose square-roots are required. out : ndarray, optional Alternate array object in which to put the result; if provided, it must have the same shape as x y : ndarray An array of the same shape as x, containing the positive square-root of each element in x. If any element in x is complex, a complex array is returned (and the square-roots of negative reals are calculated). If all of the elements in x are real, so is y, with negative elements returning nan. If out was provided, y is a reference to it.

lib.scimath.sqrt
A version which returns complex numbers when given negative reals.

Notes

sqrt has–consistent with common convention–as its branch cut the real “interval” [-inf, 0), and is continuous from above on it. A branch cut is a curve in the complex plane across which a given complex function fails to be continuous.

Examples

>>> np.sqrt([1,4,9])
array([ 1.,  2.,  3.])

>>> np.sqrt([4, -1, -3+4J])
array([ 2.+0.j,  0.+1.j,  1.+2.j])

>>> np.sqrt([4, -1, numpy.inf])
array([  2.,  NaN,  Inf])

dask.array.square(x[, out])

Return the element-wise square of the input.

Parameters: x : array_like Input data. out : ndarray Element-wise x*x, of the same shape and dtype as x. Returns scalar if x is a scalar.

numpy.linalg.matrix_power, sqrt, power

Examples

>>> np.square([-1j, 1])
array([-1.-0.j,  1.+0.j])

dask.array.squeeze(a, axis=None)

Remove single-dimensional entries from the shape of an array.

Parameters: a : array_like Input data. axis : None or int or tuple of ints, optional New in version 1.7.0. Selects a subset of the single-dimensional entries in the shape. If an axis is selected with shape entry greater than one, an error is raised. squeezed : ndarray The input array, but with all or a subset of the dimensions of length 1 removed. This is always a itself or a view into a.

Examples

>>> x = np.array([[[0], [1], [2]]])
>>> x.shape
(1, 3, 1)
>>> np.squeeze(x).shape
(3,)
>>> np.squeeze(x, axis=(2,)).shape
(1, 3)

dask.array.stack(seq, axis=0)

Stack arrays along a new axis

Given a sequence of dask Arrays form a new dask Array by stacking them along a new dimension (axis=0 by default)

Examples

Create slices

>>> import dask.array as da
>>> import numpy as np

>>> data = [from_array(np.ones((4, 4)), chunks=(2, 2))
...          for i in range(3)]

>>> x = da.stack(data, axis=0)
>>> x.shape
(3, 4, 4)

>>> da.stack(data, axis=1).shape
(4, 3, 4)

>>> da.stack(data, axis=-1).shape
(4, 4, 3)


Result is a new dask Array

dask.array.std(a, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None)

Compute the standard deviation along the specified axis.

Returns the standard deviation, a measure of the spread of a distribution, of the array elements. The standard deviation is computed for the flattened array by default, otherwise over the specified axis.

Parameters: a : array_like Calculate the standard deviation of these values. axis : None or int or tuple of ints, optional Axis or axes along which the standard deviation is computed. The default is to compute the standard deviation of the flattened array. If this is a tuple of ints, a standard deviation is performed over multiple axes, instead of a single axis or all the axes as before. dtype : dtype, optional Type to use in computing the standard deviation. For arrays of integer type the default is float64, for arrays of float types it is the same as the array type. out : ndarray, optional Alternative output array in which to place the result. It must have the same shape as the expected output but the type (of the calculated values) will be cast if necessary. ddof : int, optional Means Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements. By default ddof is zero. keepdims : bool, optional If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr. standard_deviation : ndarray, see dtype parameter above. If out is None, return a new array containing the standard deviation, otherwise return a reference to the output array.

numpy.doc.ufuncs
Section “Output arguments”

Notes

The standard deviation is the square root of the average of the squared deviations from the mean, i.e., std = sqrt(mean(abs(x - x.mean())**2)).

The average squared deviation is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of the infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables. The standard deviation computed in this function is the square root of the estimated variance, so even with ddof=1, it will not be an unbiased estimate of the standard deviation per se.

Note that, for complex numbers, std takes the absolute value before squaring, so that the result is always real and nonnegative.

For floating-point input, the std is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-accuracy accumulator using the dtype keyword can alleviate this issue.

Examples

>>> a = np.array([[1, 2], [3, 4]])
>>> np.std(a)
1.1180339887498949
>>> np.std(a, axis=0)
array([ 1.,  1.])
>>> np.std(a, axis=1)
array([ 0.5,  0.5])


In single precision, std() can be inaccurate:

>>> a = np.zeros((2, 512*512), dtype=np.float32)
>>> a[0, :] = 1.0
>>> a[1, :] = 0.1
>>> np.std(a)
0.45000005


Computing the standard deviation in float64 is more accurate:

>>> np.std(a, dtype=np.float64)
0.44999999925494177

dask.array.sum(a, axis=None, dtype=None, keepdims=False, split_every=None)

Sum of array elements over a given axis.

Parameters: a : array_like Elements to sum. axis : None or int or tuple of ints, optional Axis or axes along which a sum is performed. The default, axis=None, will sum all of the elements of the input array. If axis is negative it counts from the last to the first axis. New in version 1.7.0. If axis is a tuple of ints, a sum is performed on all of the axes specified in the tuple instead of a single axis or all the axes as before. dtype : dtype, optional The type of the returned array and of the accumulator in which the elements are summed. The dtype of a is used by default unless a has an integer dtype of less precision than the default platform integer. In that case, if a is signed then the platform integer is used while if a is unsigned then an unsigned integer of the same precision as the platform integer is used. out : ndarray, optional Alternative output array in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary. keepdims : bool, optional If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array. sum_along_axis : ndarray An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, a scalar is returned. If an output array is specified, a reference to out is returned.

ndarray.sum
Equivalent method.
cumsum
Cumulative sum of array elements.
trapz
Integration of array values using the composite trapezoidal rule.

mean, average

Notes

Arithmetic is modular when using integer types, and no error is raised on overflow.

The sum of an empty array is the neutral element 0:

>>> np.sum([])
0.0


Examples

>>> np.sum([0.5, 1.5])
2.0
>>> np.sum([0.5, 0.7, 0.2, 1.5], dtype=np.int32)
1
>>> np.sum([[0, 1], [0, 5]])
6
>>> np.sum([[0, 1], [0, 5]], axis=0)
array([0, 6])
>>> np.sum([[0, 1], [0, 5]], axis=1)
array([1, 5])


If the accumulator is too small, overflow occurs:

>>> np.ones(128, dtype=np.int8).sum(dtype=np.int8)
-128

dask.array.take(a, indices, axis=0)

Take elements from an array along an axis.

This function does the same thing as “fancy” indexing (indexing arrays using arrays); however, it can be easier to use if you need elements along a given axis.

Parameters: a : array_like The source array. indices : array_like The indices of the values to extract. New in version 1.8.0. Also allow scalars for indices. axis : int, optional The axis over which to select values. By default, the flattened input array is used. out : ndarray, optional If provided, the result will be placed in this array. It should be of the appropriate shape and dtype. mode : {‘raise’, ‘wrap’, ‘clip’}, optional Specifies how out-of-bounds indices will behave. ‘raise’ – raise an error (default) ‘wrap’ – wrap around ‘clip’ – clip to the range ‘clip’ mode means that all indices that are too large are replaced by the index that addresses the last element along that axis. Note that this disables indexing with negative numbers. subarray : ndarray The returned array has the same type as a.

compress
Take elements using a boolean mask
ndarray.take
equivalent method

Examples

>>> a = [4, 3, 5, 7, 6, 8]
>>> indices = [0, 1, 4]
>>> np.take(a, indices)
array([4, 3, 6])


In this example if a is an ndarray, “fancy” indexing can be used.

>>> a = np.array(a)
>>> a[indices]
array([4, 3, 6])


If indices is not one dimensional, the output also has these dimensions.

>>> np.take(a, [[0, 1], [2, 3]])
array([[4, 3],
[5, 7]])

dask.array.tan(x[, out])

Compute tangent element-wise.

Equivalent to np.sin(x)/np.cos(x) element-wise.

Parameters: x : array_like Input array. out : ndarray, optional Output array of same shape as x. y : ndarray The corresponding tangent values. ValueError: invalid return array shape if out is provided and out.shape != x.shape (See Examples)

Notes

If out is provided, the function writes the result into it, and returns a reference to out. (See Examples)

References

M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York, NY: Dover, 1972.

Examples

>>> from math import pi
>>> np.tan(np.array([-pi,pi/2,pi]))
array([  1.22460635e-16,   1.63317787e+16,  -1.22460635e-16])
>>>
>>> # Example of providing the optional output parameter illustrating
>>> # that what is returned is a reference to said parameter
>>> out2 = np.cos([0.1], out1)
>>> out2 is out1
True
>>>
>>> # Example of ValueError due to provision of shape mis-matched out
>>> np.cos(np.zeros((3,3)),np.zeros((2,2)))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid return array shape

dask.array.tanh(x[, out])

Compute hyperbolic tangent element-wise.

Equivalent to np.sinh(x)/np.cosh(x) or -1j * np.tan(1j*x).

Parameters: x : array_like Input array. out : ndarray, optional Output array of same shape as x. y : ndarray The corresponding hyperbolic tangent values. ValueError: invalid return array shape if out is provided and out.shape != x.shape (See Examples)

Notes

If out is provided, the function writes the result into it, and returns a reference to out. (See Examples)

References

 [R102] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York, NY: Dover, 1972, pg. 83. http://www.math.sfu.ca/~cbm/aands/
 [R103] Wikipedia, “Hyperbolic function”, http://en.wikipedia.org/wiki/Hyperbolic_function

Examples

>>> np.tanh((0, np.pi*1j, np.pi*1j/2))
array([ 0. +0.00000000e+00j,  0. -1.22460635e-16j,  0. +1.63317787e+16j])

>>> # Example of providing the optional output parameter illustrating
>>> # that what is returned is a reference to said parameter
>>> out2 = np.tanh([0.1], out1)
>>> out2 is out1
True

>>> # Example of ValueError due to provision of shape mis-matched out
>>> np.tanh(np.zeros((3,3)),np.zeros((2,2)))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid return array shape

dask.array.tensordot(lhs, rhs, axes=2)

Compute tensor dot product along specified axes for arrays >= 1-D.

Given two tensors (arrays of dimension greater than or equal to one), a and b, and an array_like object containing two array_like objects, (a_axes, b_axes), sum the products of a‘s and b‘s elements (components) over the axes specified by a_axes and b_axes. The third argument can be a single non-negative integer_like scalar, N; if it is such, then the last N dimensions of a and the first N dimensions of b are summed over.

Parameters: a, b : array_like, len(shape) >= 1 Tensors to “dot”. axes : int or (2,) array_like integer_like If an int N, sum over the last N axes of a and the first N axes of b in order. The sizes of the corresponding axes must match. (2,) array_like Or, a list of axes to be summed over, first sequence applying to a, second to b. Both elements array_like must be of the same length.

dot, einsum

Notes

Three common use cases are:
axes = 0 : tensor product $aotimes b$ axes = 1 : tensor dot product $acdot b$ axes = 2 : (default) tensor double contraction $a:b$

When axes is integer_like, the sequence for evaluation will be: first the -Nth axis in a and 0th axis in b, and the -1th axis in a and Nth axis in b last.

When there is more than one axis to sum over - and they are not the last (first) axes of a (b) - the argument axes should consist of two sequences of the same length, with the first axis to sum over given first in both sequences, the second axis second, and so forth.

Examples

>>> a = np.arange(60.).reshape(3,4,5)
>>> b = np.arange(24.).reshape(4,3,2)
>>> c = np.tensordot(a,b, axes=([1,0],[0,1]))
>>> c.shape
(5, 2)
>>> c
array([[ 4400.,  4730.],
[ 4532.,  4874.],
[ 4664.,  5018.],
[ 4796.,  5162.],
[ 4928.,  5306.]])
>>> # A slower but equivalent way of computing the same...
>>> d = np.zeros((5,2))
>>> for i in range(5):
...   for j in range(2):
...     for k in range(3):
...       for n in range(4):
...         d[i,j] += a[k,n,i] * b[n,k,j]
>>> c == d
array([[ True,  True],
[ True,  True],
[ True,  True],
[ True,  True],
[ True,  True]], dtype=bool)


>>> a = np.array(range(1, 9))
>>> a.shape = (2, 2, 2)
>>> A = np.array(('a', 'b', 'c', 'd'), dtype=object)
>>> A.shape = (2, 2)
>>> a; A
array([[[1, 2],
[3, 4]],
[[5, 6],
[7, 8]]])
array([[a, b],
[c, d]], dtype=object)

>>> np.tensordot(a, A) # third argument default is 2 for double-contraction
array([abbcccdddd, aaaaabbbbbbcccccccdddddddd], dtype=object)

>>> np.tensordot(a, A, 1)
array([[[acc, bdd],
[aaacccc, bbbdddd]],
[[aaaaacccccc, bbbbbdddddd],
[aaaaaaacccccccc, bbbbbbbdddddddd]]], dtype=object)

>>> np.tensordot(a, A, 0) # tensor product (result too long to incl.)
array([[[[[a, b],
[c, d]],
...

>>> np.tensordot(a, A, (0, 1))
array([[[abbbbb, cddddd],
[aabbbbbb, ccdddddd]],
[[aaabbbbbbb, cccddddddd],
[aaaabbbbbbbb, ccccdddddddd]]], dtype=object)

>>> np.tensordot(a, A, (2, 1))
array([[[abb, cdd],
[aaabbbb, cccdddd]],
[[aaaaabbbbbb, cccccdddddd],
[aaaaaaabbbbbbbb, cccccccdddddddd]]], dtype=object)

>>> np.tensordot(a, A, ((0, 1), (0, 1)))
array([abbbcccccddddddd, aabbbbccccccdddddddd], dtype=object)

>>> np.tensordot(a, A, ((2, 1), (1, 0)))
array([acccbbdddd, aaaaacccccccbbbbbbdddddddd], dtype=object)

dask.array.topk(k, x)

The top k elements of an array

Returns the k greatest elements of the array in sorted order. Only works on arrays of a single dimension.

This assumes that k is small. All results will be returned in a single chunk.

Examples

>>> x = np.array([5, 1, 3, 6])
>>> d = from_array(x, chunks=2)
>>> d.topk(2).compute()
array([6, 5])

dask.array.transpose(a, axes=None)

Permute the dimensions of an array.

Parameters: a : array_like Input array. axes : list of ints, optional By default, reverse the dimensions, otherwise permute the axes according to the values given. p : ndarray a with its axes permuted. A view is returned whenever possible.

moveaxis, argsort

Notes

Use transpose(a, argsort(axes)) to invert the transposition of tensors when using the axes keyword argument.

Transposing a 1-D array returns an unchanged view of the original array.

Examples

>>> x = np.arange(4).reshape((2,2))
>>> x
array([[0, 1],
[2, 3]])

>>> np.transpose(x)
array([[0, 2],
[1, 3]])

>>> x = np.ones((1, 2, 3))
>>> np.transpose(x, (1, 0, 2)).shape
(2, 1, 3)

dask.array.tril(m, k=0)

Lower triangle of an array with elements above the k-th diagonal zeroed.

Parameters: m : array_like, shape (M, M) Input array. k : int, optional Diagonal above which to zero elements. k = 0 (the default) is the main diagonal, k < 0 is below it and k > 0 is above. tril : ndarray, shape (M, M) Lower triangle of m, of same shape and data-type as m.

triu
upper triangle of an array
dask.array.triu(m, k=0)

Upper triangle of an array with elements above the k-th diagonal zeroed.

Parameters: m : array_like, shape (M, N) Input array. k : int, optional Diagonal above which to zero elements. k = 0 (the default) is the main diagonal, k < 0 is below it and k > 0 is above. triu : ndarray, shape (M, N) Upper triangle of m, of same shape and data-type as m.

tril
lower triangle of an array
dask.array.trunc(x[, out])

Return the truncated value of the input, element-wise.

The truncated value of the scalar x is the nearest integer i which is closer to zero than x is. In short, the fractional part of the signed number x is discarded.

Parameters: x : array_like Input data. y : ndarray or scalar The truncated value of each element in x.

Notes

New in version 1.3.0.

Examples

>>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0])
>>> np.trunc(a)
array([-1., -1., -0.,  0.,  1.,  1.,  2.])

dask.array.unique(x)

Find the unique elements of an array.

Returns the sorted unique elements of an array. There are three optional outputs in addition to the unique elements: the indices of the input array that give the unique values, the indices of the unique array that reconstruct the input array, and the number of times each unique value comes up in the input array.

Parameters: ar : array_like Input array. This will be flattened if it is not already 1-D. return_index : bool, optional If True, also return the indices of ar that result in the unique array. return_inverse : bool, optional If True, also return the indices of the unique array that can be used to reconstruct ar. return_counts : bool, optional If True, also return the number of times each unique value comes up in ar. New in version 1.9.0. unique : ndarray The sorted unique values. unique_indices : ndarray, optional The indices of the first occurrences of the unique values in the (flattened) original array. Only provided if return_index is True. unique_inverse : ndarray, optional The indices to reconstruct the (flattened) original array from the unique array. Only provided if return_inverse is True. unique_counts : ndarray, optional The number of times each of the unique values comes up in the original array. Only provided if return_counts is True. New in version 1.9.0.

numpy.lib.arraysetops
Module with a number of other functions for performing set operations on arrays.

Examples

>>> np.unique([1, 1, 2, 2, 3, 3])
array([1, 2, 3])
>>> a = np.array([[1, 1], [2, 3]])
>>> np.unique(a)
array([1, 2, 3])


Return the indices of the original array that give the unique values:

>>> a = np.array(['a', 'b', 'b', 'c', 'a'])
>>> u, indices = np.unique(a, return_index=True)
>>> u
array(['a', 'b', 'c'],
dtype='|S1')
>>> indices
array([0, 1, 3])
>>> a[indices]
array(['a', 'b', 'c'],
dtype='|S1')


Reconstruct the input array from the unique values:

>>> a = np.array([1, 2, 6, 4, 2, 3, 2])
>>> u, indices = np.unique(a, return_inverse=True)
>>> u
array([1, 2, 3, 4, 6])
>>> indices
array([0, 1, 4, 3, 1, 2, 1])
>>> u[indices]
array([1, 2, 6, 4, 2, 3, 2])

dask.array.var(a, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None)

Compute the variance along the specified axis.

Returns the variance of the array elements, a measure of the spread of a distribution. The variance is computed for the flattened array by default, otherwise over the specified axis.

Parameters: a : array_like Array containing numbers whose variance is desired. If a is not an array, a conversion is attempted. axis : None or int or tuple of ints, optional Axis or axes along which the variance is computed. The default is to compute the variance of the flattened array. If this is a tuple of ints, a variance is performed over multiple axes, instead of a single axis or all the axes as before. dtype : data-type, optional Type to use in computing the variance. For arrays of integer type the default is float32; for arrays of float types it is the same as the array type. out : ndarray, optional Alternate output array in which to place the result. It must have the same shape as the expected output, but the type is cast if necessary. ddof : int, optional “Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is zero. keepdims : bool, optional If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr. variance : ndarray, see dtype parameter above If out=None, returns a new array containing the variance; otherwise, a reference to the output array is returned.

numpy.doc.ufuncs
Section “Output arguments”

Notes

The variance is the average of the squared deviations from the mean, i.e., var = mean(abs(x - x.mean())**2).

The mean is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of a hypothetical infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables.

Note that for complex numbers, the absolute value is taken before squaring, so that the result is always real and nonnegative.

For floating-point input, the variance is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-accuracy accumulator using the dtype keyword can alleviate this issue.

Examples

>>> a = np.array([[1, 2], [3, 4]])
>>> np.var(a)
1.25
>>> np.var(a, axis=0)
array([ 1.,  1.])
>>> np.var(a, axis=1)
array([ 0.25,  0.25])


In single precision, var() can be inaccurate:

>>> a = np.zeros((2, 512*512), dtype=np.float32)
>>> a[0, :] = 1.0
>>> a[1, :] = 0.1
>>> np.var(a)
0.20250003


Computing the variance in float64 is more accurate:

>>> np.var(a, dtype=np.float64)
0.20249999932944759
>>> ((1-0.55)**2 + (0.1-0.55)**2)/2
0.2025

dask.array.vnorm(a, ord=None, axis=None, dtype=None, keepdims=False, split_every=None)

Vector norm

See np.linalg.norm

dask.array.vstack(tup)

Stack arrays in sequence vertically (row wise).

Take a sequence of arrays and stack them vertically to make a single array. Rebuild arrays divided by vsplit.

Parameters: tup : sequence of ndarrays Tuple containing arrays to be stacked. The arrays must have the same shape along all but the first axis. stacked : ndarray The array formed by stacking the given arrays.

stack
Join a sequence of arrays along a new axis.
hstack
Stack arrays in sequence horizontally (column wise).
dstack
Stack arrays in sequence depth wise (along third dimension).
concatenate
Join a sequence of arrays along an existing axis.
vsplit
Split array into a list of multiple sub-arrays vertically.

Notes

Equivalent to np.concatenate(tup, axis=0) if tup contains arrays that are at least 2-dimensional.

Examples

>>> a = np.array([1, 2, 3])
>>> b = np.array([2, 3, 4])
>>> np.vstack((a,b))
array([[1, 2, 3],
[2, 3, 4]])

>>> a = np.array([[1], [2], [3]])
>>> b = np.array([[2], [3], [4]])
>>> np.vstack((a,b))
array([[1],
[2],
[3],
[2],
[3],
[4]])

dask.array.where(condition[, x, y])

Return elements, either from x or y, depending on condition.

If only condition is given, return condition.nonzero().

Parameters: condition : array_like, bool When True, yield x, otherwise yield y. x, y : array_like, optional Values from which to choose. x and y need to have the same shape as condition. out : ndarray or tuple of ndarrays If both x and y are specified, the output array contains elements of x where condition is True, and elements from y elsewhere. If only condition is given, return the tuple condition.nonzero(), the indices where condition is True.

nonzero, choose

Notes

If x and y are given and input arrays are 1-D, where is equivalent to:

[xv if c else yv for (c,xv,yv) in zip(condition,x,y)]


Examples

>>> np.where([[True, False], [True, True]],
...          [[1, 2], [3, 4]],
...          [[9, 8], [7, 6]])
array([[1, 8],
[3, 4]])

>>> np.where([[0, 1], [1, 0]])
(array([0, 1]), array([1, 0]))

>>> x = np.arange(9.).reshape(3, 3)
>>> np.where( x > 5 )
(array([2, 2, 2]), array([0, 1, 2]))
>>> x[np.where( x > 3.0 )]               # Note: result is 1D.
array([ 4.,  5.,  6.,  7.,  8.])
>>> np.where(x < 5, x, -1)               # Note: broadcasting.
array([[ 0.,  1.,  2.],
[ 3.,  4., -1.],
[-1., -1., -1.]])


Find the indices of elements of x that are in goodvalues.

>>> goodvalues = [3, 4, 7]
>>> ix = np.in1d(x.ravel(), goodvalues).reshape(x.shape)
>>> ix
array([[False, False, False],
[ True,  True, False],
[False,  True, False]], dtype=bool)
>>> np.where(ix)
(array([1, 1, 2]), array([0, 1, 1]))

dask.array.zeros()

Blocked variant of zeros

Follows the signature of zeros exactly except that it also requires a keyword argument chunks=(...)

Original signature follows below. zeros(shape, dtype=float, order=’C’)

Return a new array of given shape and type, filled with zeros.

Parameters: shape : int or sequence of ints Shape of the new array, e.g., (2, 3) or 2. dtype : data-type, optional The desired data-type for the array, e.g., numpy.int8. Default is numpy.float64. order : {‘C’, ‘F’}, optional Whether to store multidimensional data in C- or Fortran-contiguous (row- or column-wise) order in memory. out : ndarray Array of zeros with the given shape, dtype, and order.

zeros_like
Return an array of zeros with shape and type of input.
ones_like
Return an array of ones with shape and type of input.
empty_like
Return an empty array with shape and type of input.
ones
Return a new array setting values to one.
empty
Return a new uninitialized array.

Examples

>>> np.zeros(5)
array([ 0.,  0.,  0.,  0.,  0.])

>>> np.zeros((5,), dtype=np.int)
array([0, 0, 0, 0, 0])

>>> np.zeros((2, 1))
array([[ 0.],
[ 0.]])

>>> s = (2,2)
>>> np.zeros(s)
array([[ 0.,  0.],
[ 0.,  0.]])

>>> np.zeros((2,), dtype=[('x', 'i4'), ('y', 'i4')]) # custom dtype
array([(0, 0), (0, 0)],
dtype=[('x', '<i4'), ('y', '<i4')])

dask.array.linalg.cholesky(a, lower=False)

Returns the Cholesky decomposition, $$A = L L^*$$ or $$A = U^* U$$ of a Hermitian positive-definite matrix A.

Parameters: a : (M, M) array_like Matrix to be decomposed lower : bool, optional Whether to compute the upper or lower triangular Cholesky factorization. Default is upper-triangular. c : (M, M) Array Upper- or lower-triangular Cholesky factor of a.
dask.array.linalg.inv(a)

Compute the inverse of a matrix with LU decomposition and forward / backward substitutions.

Parameters: a : array_like Square matrix to be inverted. ainv : Array Inverse of the matrix a.
dask.array.linalg.lstsq(a, b)

Return the least-squares solution to a linear matrix equation using QR decomposition.

Solves the equation a x = b by computing a vector x that minimizes the Euclidean 2-norm || b - a x ||^2. The equation may be under-, well-, or over- determined (i.e., the number of linearly independent rows of a can be less than, equal to, or greater than its number of linearly independent columns). If a is square and of full rank, then x (but for round-off error) is the “exact” solution of the equation.

Parameters: a : (M, N) array_like “Coefficient” matrix. b : (M,) array_like Ordinate or “dependent variable” values. x : (N,) Array Least-squares solution. If b is two-dimensional, the solutions are in the K columns of x. residuals : (1,) Array Sums of residuals; squared Euclidean 2-norm for each column in b - a*x. rank : Array Rank of matrix a. s : (min(M, N),) Array Singular values of a.
dask.array.linalg.lu(a)

Compute the lu decomposition of a matrix.

Returns: p: Array, permutation matrix l: Array, lower triangular matrix with unit diagonal. u: Array, upper triangular matrix

Examples

>>> p, l, u = da.linalg.lu(x)

dask.array.linalg.qr(a, name=None)

Compute the qr factorization of a matrix.

Returns: q: Array, orthonormal r: Array, upper-triangular

np.linalg.qr
Equivalent NumPy Operation
dask.array.linalg.tsqr
Actual implementation with citation

Examples

>>> q, r = da.linalg.qr(x)

dask.array.linalg.solve(a, b, sym_pos=False)

Solve the equation a x = b for x. By default, use LU decomposition and forward / backward substitutions. When sym_pos is True, use Cholesky decomposition.

Parameters: a : (M, M) array_like A square matrix. b : (M,) or (M, N) array_like Right-hand side matrix in a x = b. sym_pos : bool Assume a is symmetric and positive definite. If True, use Cholesky decomposition. x : (M,) or (M, N) Array Solution to the system a x = b. Shape of the return matches the shape of b.
dask.array.linalg.solve_triangular(a, b, lower=False)

Solve the equation a x = b for x, assuming a is a triangular matrix.

Parameters: a : (M, M) array_like A triangular matrix b : (M,) or (M, N) array_like Right-hand side matrix in a x = b lower : bool, optional Use only data contained in the lower triangle of a. Default is to use upper triangle. x : (M,) or (M, N) array Solution to the system a x = b. Shape of return matches b.
dask.array.linalg.svd(a, name=None)

Compute the singular value decomposition of a matrix.

Returns: u: Array, unitary / orthogonal s: Array, singular values in decreasing order (largest first) v: Array, unitary / orthogonal

np.linalg.svd
Equivalent NumPy Operation
dask.array.linalg.tsqr
Actual implementation with citation

Examples

>>> u, s, v = da.linalg.svd(x)

dask.array.linalg.svd_compressed(a, k, n_power_iter=0, seed=None, name=None)

Randomly compressed rank-k thin Singular Value Decomposition.

This computes the approximate singular value decomposition of a large array. This algorithm is generally faster than the normal algorithm but does not provide exact results. One can balance between performance and accuracy with input parameters (see below).

Parameters: a: Array Input array k: int Rank of the desired thin SVD decomposition. n_power_iter: int Number of power iterations, useful when the singular values decay slowly. Error decreases exponentially as n_power_iter increases. In practice, set n_power_iter <= 4. u: Array, unitary / orthogonal s: Array, singular values in decreasing order (largest first) v: Array, unitary / orthogonal

References

N. Halko, P. G. Martinsson, and J. A. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev., Survey and Review section, Vol. 53, num. 2, pp. 217-288, June 2011 http://arxiv.org/abs/0909.4061

Examples

>>> u, s, vt = svd_compressed(x, 20)

dask.array.linalg.tsqr(data, name=None, compute_svd=False)

Direct Tall-and-Skinny QR algorithm

As presented in:

A. Benson, D. Gleich, and J. Demmel. Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures. IEEE International Conference on Big Data, 2013. http://arxiv.org/abs/1301.1071

This algorithm is used to compute both the QR decomposition and the Singular Value Decomposition. It requires that the input array have a single column of blocks, each of which fit in memory.

If blocks are of size (n, k) then this algorithm has memory use that scales as n**2 * k * nthreads.

Parameters: data: Array compute_svd: bool Whether to compute the SVD rather than the QR decomposition
dask.array.ghost.ghost(x, depth, boundary)

Share boundaries between neighboring blocks

Parameters: x: da.Array A dask array depth: dict The size of the shared boundary per axis boundary: dict The boundary condition on each axis. Options are ‘reflect’, ‘periodic’, ‘nearest’, ‘none’, or an array value. Such a value will fill the boundary with that value. The depth input informs how many cells to overlap between neighboring blocks {0: 2, 2: 5} means share two cells in 0 axis, 5 cells in 2 axis. Axes missing from this input will not be overlapped.

Examples

>>> import numpy as np

>>> x = np.arange(64).reshape((8, 8))
>>> d = da.from_array(x, chunks=(4, 4))
>>> d.chunks
((4, 4), (4, 4))

>>> g = da.ghost.ghost(d, depth={0: 2, 1: 1},
...                       boundary={0: 100, 1: 'reflect'})
>>> g.chunks
((8, 8), (6, 6))

>>> np.array(g)
array([[100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100],
[100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100],
[  0,   0,   1,   2,   3,   4,   3,   4,   5,   6,   7,   7],
[  8,   8,   9,  10,  11,  12,  11,  12,  13,  14,  15,  15],
[ 16,  16,  17,  18,  19,  20,  19,  20,  21,  22,  23,  23],
[ 24,  24,  25,  26,  27,  28,  27,  28,  29,  30,  31,  31],
[ 32,  32,  33,  34,  35,  36,  35,  36,  37,  38,  39,  39],
[ 40,  40,  41,  42,  43,  44,  43,  44,  45,  46,  47,  47],
[ 16,  16,  17,  18,  19,  20,  19,  20,  21,  22,  23,  23],
[ 24,  24,  25,  26,  27,  28,  27,  28,  29,  30,  31,  31],
[ 32,  32,  33,  34,  35,  36,  35,  36,  37,  38,  39,  39],
[ 40,  40,  41,  42,  43,  44,  43,  44,  45,  46,  47,  47],
[ 48,  48,  49,  50,  51,  52,  51,  52,  53,  54,  55,  55],
[ 56,  56,  57,  58,  59,  60,  59,  60,  61,  62,  63,  63],
[100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100],
[100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100]])

dask.array.ghost.map_overlap(x, func, depth, boundary=None, trim=True, **kwargs)
dask.array.from_array(x, chunks, name=None, lock=False, fancy=True)

Create dask array from something that looks like an array

Input must have a .shape and support numpy-style slicing.

Parameters: x : array_like chunks : int, tuple How to chunk the array. Must be one of the following forms: - A blocksize like 1000. - A blockshape like (1000, 1000). - Explicit sizes of all blocks along all dimensions like ((1000, 1000, 500), (400, 400)). name : str, optional The key name to use for the array. Defaults to a hash of x. Use name=False to generate a random name instead of hashing (fast) lock : bool or Lock, optional If x doesn’t support concurrent reads then provide a lock here, or pass in True to have dask.array create one for you. fancy : bool, optional If x doesn’t support fancy indexing (e.g. indexing with lists or arrays) then set to False. Default is True.

Examples

>>> x = h5py.File('...')['/data/path']
>>> a = da.from_array(x, chunks=(1000, 1000))


If your underlying datastore does not support concurrent reads then include the lock=True keyword argument or lock=mylock if you want multiple arrays to coordinate around the same lock.

>>> a = da.from_array(x, chunks=(1000, 1000), lock=True)

dask.array.from_delayed(value, shape, dtype, name=None)

This routine is useful for constructing dask arrays in an ad-hoc fashion using dask delayed, particularly when combined with stack and concatenate.

The dask array will consist of a single chunk.

Examples

>>> from dask import delayed
>>> value = delayed(np.ones)(5)
>>> array = from_delayed(value, (5,), float)
>>> array
>>> array.compute()
array([ 1.,  1.,  1.,  1.,  1.])

dask.array.from_npy_stack(dirname, mmap_mode='r')

See da.to_npy_stack for docstring

Parameters: dirname: string Directory of .npy files mmap_mode: (None or ‘r’) Read data in memory map mode
dask.array.store(sources, targets, lock=True, regions=None, compute=True, **kwargs)

Store dask arrays in array-like objects, overwrite data in target

This stores dask arrays into object that supports numpy-style setitem indexing. It stores values chunk by chunk so that it does not have to fill up memory. For best performance you can align the block size of the storage target with the block size of your array.

If your data fits in memory then you may prefer calling np.array(myarray) instead.

Parameters: sources: Array or iterable of Arrays targets: array-like or iterable of array-likes These should support setitem syntax target[10:20] = ... lock: boolean or threading.Lock, optional Whether or not to lock the data stores while storing. Pass True (lock each file individually), False (don’t lock) or a particular threading.Lock object to be shared among all writes. regions: tuple of slices or iterable of tuple of slices Each region tuple in regions should be such that target[region].shape = source.shape for the corresponding source and target in sources and targets, respectively. compute: boolean, optional If true compute immediately, return dask.delayed.Delayed otherwise

Examples

>>> x = ...

>>> import h5py
>>> f = h5py.File('myfile.hdf5')
>>> dset = f.create_dataset('/data', shape=x.shape,
...                                  chunks=x.chunks,
...                                  dtype='f8')

>>> store(x, dset)


Alternatively store many arrays at the same time

>>> store([x, y, z], [dset1, dset2, dset3])

dask.array.to_hdf5(filename, *args, **kwargs)

Store arrays in HDF5 file

This saves several dask arrays into several datapaths in an HDF5 file. It creates the necessary datasets and handles clean file opening/closing.

>>> da.to_hdf5('myfile.hdf5', '/x', x)


or

>>> da.to_hdf5('myfile.hdf5', {'/x': x, '/y': y})


Optionally provide arguments as though to h5py.File.create_dataset

>>> da.to_hdf5('myfile.hdf5', '/x', x, compression='lzf', shuffle=True)


This can also be used as a method on a single Array

>>> x.to_hdf5('myfile.hdf5', '/x')


da.store, h5py.File.create_dataset

dask.array.to_npy_stack(dirname, x, axis=0)

Write dask array to a stack of .npy files

This partitions the dask.array along one axis and stores each block along that axis as a single .npy file in the specified directory

Examples

>>> x = da.ones((5, 10, 10), chunks=(2, 4, 4))
>>> da.to_npy_stack('data/', x, axis=0)

\$ tree data/ data/ |– 0.npy |– 1.npy |– 2.npy |– info

The .npy files store numpy arrays for x[0:2], x[2:4], and x[4:5] respectively, as is specified by the chunk size along the zeroth axis. The info file stores the dtype, chunks, and axis information of the array.

You can load these stacks with the da.from_npy_stack function.

>>> y = da.from_npy_stack('data/')

dask.array.fft.fft(a, n=None, axis=-1)

Wrapping of numpy.fft.fft

The axis along which the FFT is applied must have a one chunk. To change the array’s chunking use dask.Array.rechunk.

The numpy.fft.fft docstring follows below:

Compute the one-dimensional discrete Fourier Transform.

This function computes the one-dimensional n-point discrete Fourier Transform (DFT) with the efficient Fast Fourier Transform (FFT) algorithm [CT].

Parameters: a : array_like Input array, can be complex. n : int, optional Length of the transformed axis of the output. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used. axis : int, optional Axis over which to compute the FFT. If not given, the last axis is used. norm : {None, “ortho”}, optional New in version 1.10.0. Normalization mode (see numpy.fft). Default is None. out : complex ndarray The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. IndexError if axes is larger than the last axis of a.

numpy.fft
for definition of the DFT and conventions used.
ifft
The inverse of fft.
fft2
The two-dimensional FFT.
fftn
The n-dimensional FFT.
rfftn
The n-dimensional FFT of real input.
fftfreq
Frequency bins for given FFT parameters.

Notes

FFT (Fast Fourier Transform) refers to a way the discrete Fourier Transform (DFT) can be calculated efficiently, by using symmetries in the calculated terms. The symmetry is highest when n is a power of 2, and the transform is therefore most efficient for these sizes.

The DFT is defined, with the conventions used in this implementation, in the documentation for the numpy.fft module.

References

 [CT] Cooley, James W., and John W. Tukey, 1965, “An algorithm for the machine calculation of complex Fourier series,” Math. Comput. 19: 297-301.

Examples

>>> np.fft.fft(np.exp(2j * np.pi * np.arange(8) / 8))
array([ -3.44505240e-16 +1.14383329e-17j,
8.00000000e+00 -5.71092652e-15j,
2.33482938e-16 +1.22460635e-16j,
1.64863782e-15 +1.77635684e-15j,
9.95839695e-17 +2.33482938e-16j,
0.00000000e+00 +1.66837030e-15j,
1.14383329e-17 +1.22460635e-16j,
-1.64863782e-15 +1.77635684e-15j])

>>> import matplotlib.pyplot as plt
>>> t = np.arange(256)
>>> sp = np.fft.fft(np.sin(t))
>>> freq = np.fft.fftfreq(t.shape[-1])
>>> plt.plot(freq, sp.real, freq, sp.imag)
[<matplotlib.lines.Line2D object at 0x...>, <matplotlib.lines.Line2D object at 0x...>]
>>> plt.show()


In this example, real input has an FFT which is Hermitian, i.e., symmetric in the real part and anti-symmetric in the imaginary part, as described in the numpy.fft documentation.

dask.array.fft.ifft(a, n=None, axis=-1)

Wrapping of numpy.fft.ifft

The axis along which the FFT is applied must have a one chunk. To change the array’s chunking use dask.Array.rechunk.

The numpy.fft.ifft docstring follows below:

Compute the one-dimensional inverse discrete Fourier Transform.

This function computes the inverse of the one-dimensional n-point discrete Fourier transform computed by fft. In other words, ifft(fft(a)) == a to within numerical accuracy. For a general description of the algorithm and definitions, see numpy.fft.

The input should be ordered in the same way as is returned by fft, i.e.,

• a[0] should contain the zero frequency term,
• a[1:n//2] should contain the positive-frequency terms,
• a[n//2 + 1:] should contain the negative-frequency terms, in increasing order starting from the most negative frequency.

For an even number of input points, A[n//2] represents the sum of the values at the positive and negative Nyquist frequencies, as the two are aliased together. See numpy.fft for details.

Parameters: a : array_like Input array, can be complex. n : int, optional Length of the transformed axis of the output. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used. See notes about padding issues. axis : int, optional Axis over which to compute the inverse DFT. If not given, the last axis is used. norm : {None, “ortho”}, optional New in version 1.10.0. Normalization mode (see numpy.fft). Default is None. out : complex ndarray The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. IndexError If axes is larger than the last axis of a.

numpy.fft
An introduction, with definitions and general explanations.
fft
The one-dimensional (forward) FFT, of which ifft is the inverse
ifft2
The two-dimensional inverse FFT.
ifftn
The n-dimensional inverse FFT.

Notes

If the input parameter n is larger than the size of the input, the input is padded by appending zeros at the end. Even though this is the common approach, it might lead to surprising results. If a different padding is desired, it must be performed before calling ifft.

Examples

>>> np.fft.ifft([0, 4, 0, 0])
array([ 1.+0.j,  0.+1.j, -1.+0.j,  0.-1.j])


Create and plot a band-limited signal with random phases:

>>> import matplotlib.pyplot as plt
>>> t = np.arange(400)
>>> n = np.zeros((400,), dtype=complex)
>>> n[40:60] = np.exp(1j*np.random.uniform(0, 2*np.pi, (20,)))
>>> s = np.fft.ifft(n)
>>> plt.plot(t, s.real, 'b-', t, s.imag, 'r--')
...
>>> plt.legend(('real', 'imaginary'))
...
>>> plt.show()

dask.array.fft.hfft(a, n=None, axis=-1)

Wrapping of numpy.fft.hfft

The axis along which the FFT is applied must have a one chunk. To change the array’s chunking use dask.Array.rechunk.

The numpy.fft.hfft docstring follows below:

Compute the FFT of a signal which has Hermitian symmetry (real spectrum).

Parameters: a : array_like The input array. n : int, optional Length of the transformed axis of the output. For n output points, n//2+1 input points are necessary. If the input is longer than this, it is cropped. If it is shorter than this, it is padded with zeros. If n is not given, it is determined from the length of the input along the axis specified by axis. axis : int, optional Axis over which to compute the FFT. If not given, the last axis is used. norm : {None, “ortho”}, optional New in version 1.10.0. Normalization mode (see numpy.fft). Default is None. out : ndarray The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. The length of the transformed axis is n, or, if n is not given, 2*(m-1) where m is the length of the transformed axis of the input. To get an odd number of output points, n must be specified. IndexError If axis is larger than the last axis of a.

rfft
Compute the one-dimensional FFT for real input.
ihfft
The inverse of hfft.

Notes

hfft/ihfft are a pair analogous to rfft/irfft, but for the opposite case: here the signal has Hermitian symmetry in the time domain and is real in the frequency domain. So here it’s hfft for which you must supply the length of the result if it is to be odd: ihfft(hfft(a), len(a)) == a, within numerical accuracy.

Examples

>>> signal = np.array([1, 2, 3, 4, 3, 2])
>>> np.fft.fft(signal)
array([ 15.+0.j,  -4.+0.j,   0.+0.j,  -1.-0.j,   0.+0.j,  -4.+0.j])
>>> np.fft.hfft(signal[:4]) # Input first half of signal
array([ 15.,  -4.,   0.,  -1.,   0.,  -4.])
>>> np.fft.hfft(signal, 6)  # Input entire signal and truncate
array([ 15.,  -4.,   0.,  -1.,   0.,  -4.])

>>> signal = np.array([[1, 1.j], [-1.j, 2]])
>>> np.conj(signal.T) - signal   # check Hermitian symmetry
array([[ 0.-0.j,  0.+0.j],
[ 0.+0.j,  0.-0.j]])
>>> freq_spectrum = np.fft.hfft(signal)
>>> freq_spectrum
array([[ 1.,  1.],
[ 2., -2.]])

dask.array.fft.ihfft(a, n=None, axis=-1)

Wrapping of numpy.fft.ihfft

The axis along which the FFT is applied must have a one chunk. To change the array’s chunking use dask.Array.rechunk.

The numpy.fft.ihfft docstring follows below:

Compute the inverse FFT of a signal which has Hermitian symmetry.

Parameters: a : array_like Input array. n : int, optional Length of the inverse FFT. Number of points along transformation axis in the input to use. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used. axis : int, optional Axis over which to compute the inverse FFT. If not given, the last axis is used. norm : {None, “ortho”}, optional New in version 1.10.0. Normalization mode (see numpy.fft). Default is None. out : complex ndarray The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. If n is even, the length of the transformed axis is (n/2)+1. If n is odd, the length is (n+1)/2.

Notes

hfft/ihfft are a pair analogous to rfft/irfft, but for the opposite case: here the signal has Hermitian symmetry in the time domain and is real in the frequency domain. So here it’s hfft for which you must supply the length of the result if it is to be odd: ihfft(hfft(a), len(a)) == a, within numerical accuracy.

Examples

>>> spectrum = np.array([ 15, -4, 0, -1, 0, -4])
>>> np.fft.ifft(spectrum)
array([ 1.+0.j,  2.-0.j,  3.+0.j,  4.+0.j,  3.+0.j,  2.-0.j])
>>> np.fft.ihfft(spectrum)
array([ 1.-0.j,  2.-0.j,  3.-0.j,  4.-0.j])

dask.array.fft.rfft(a, n=None, axis=-1)

Wrapping of numpy.fft.rfft

The axis along which the FFT is applied must have a one chunk. To change the array’s chunking use dask.Array.rechunk.

The numpy.fft.rfft docstring follows below:

Compute the one-dimensional discrete Fourier Transform for real input.

This function computes the one-dimensional n-point discrete Fourier Transform (DFT) of a real-valued array by means of an efficient algorithm called the Fast Fourier Transform (FFT).

Parameters: a : array_like Input array n : int, optional Number of points along transformation axis in the input to use. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used. axis : int, optional Axis over which to compute the FFT. If not given, the last axis is used. norm : {None, “ortho”}, optional New in version 1.10.0. Normalization mode (see numpy.fft). Default is None. out : complex ndarray The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. If n is even, the length of the transformed axis is (n/2)+1. If n is odd, the length is (n+1)/2. IndexError If axis is larger than the last axis of a.

numpy.fft
For definition of the DFT and conventions used.
irfft
The inverse of rfft.
fft
The one-dimensional FFT of general (complex) input.
fftn
The n-dimensional FFT.
rfftn
The n-dimensional FFT of real input.

Notes

When the DFT is computed for purely real input, the output is Hermitian-symmetric, i.e. the negative frequency terms are just the complex conjugates of the corresponding positive-frequency terms, and the negative-frequency terms are therefore redundant. This function does not compute the negative frequency terms, and the length of the transformed axis of the output is therefore n//2 + 1.

When A = rfft(a) and fs is the sampling frequency, A[0] contains the zero-frequency term 0*fs, which is real due to Hermitian symmetry.

If n is even, A[-1] contains the term representing both positive and negative Nyquist frequency (+fs/2 and -fs/2), and must also be purely real. If n is odd, there is no term at fs/2; A[-1] contains the largest positive frequency (fs/2*(n-1)/n), and is complex in the general case.

If the input a contains an imaginary part, it is silently discarded.

Examples

>>> np.fft.fft([0, 1, 0, 0])
array([ 1.+0.j,  0.-1.j, -1.+0.j,  0.+1.j])
>>> np.fft.rfft([0, 1, 0, 0])
array([ 1.+0.j,  0.-1.j, -1.+0.j])


Notice how the final element of the fft output is the complex conjugate of the second element, for real input. For rfft, this symmetry is exploited to compute only the non-negative frequency terms.

dask.array.fft.irfft(a, n=None, axis=-1)

Wrapping of numpy.fft.irfft

The axis along which the FFT is applied must have a one chunk. To change the array’s chunking use dask.Array.rechunk.

The numpy.fft.irfft docstring follows below:

Compute the inverse of the n-point DFT for real input.

This function computes the inverse of the one-dimensional n-point discrete Fourier Transform of real input computed by rfft. In other words, irfft(rfft(a), len(a)) == a to within numerical accuracy. (See Notes below for why len(a) is necessary here.)

The input is expected to be in the form returned by rfft, i.e. the real zero-frequency term followed by the complex positive frequency terms in order of increasing frequency. Since the discrete Fourier Transform of real input is Hermitian-symmetric, the negative frequency terms are taken to be the complex conjugates of the corresponding positive frequency terms.

Parameters: a : array_like The input array. n : int, optional Length of the transformed axis of the output. For n output points, n//2+1 input points are necessary. If the input is longer than this, it is cropped. If it is shorter than this, it is padded with zeros. If n is not given, it is determined from the length of the input along the axis specified by axis. axis : int, optional Axis over which to compute the inverse FFT. If not given, the last axis is used. norm : {None, “ortho”}, optional New in version 1.10.0. Normalization mode (see numpy.fft). Default is None. out : ndarray The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. The length of the transformed axis is n, or, if n is not given, 2*(m-1) where m is the length of the transformed axis of the input. To get an odd number of output points, n must be specified. IndexError If axis is larger than the last axis of a.

numpy.fft
For definition of the DFT and conventions used.
rfft
The one-dimensional FFT of real input, of which irfft is inverse.
fft
The one-dimensional FFT.
irfft2
The inverse of the two-dimensional FFT of real input.
irfftn
The inverse of the n-dimensional FFT of real input.

Notes

Returns the real valued n-point inverse discrete Fourier transform of a, where a contains the non-negative frequency terms of a Hermitian-symmetric sequence. n is the length of the result, not the input.

If you specify an n such that a must be zero-padded or truncated, the extra/removed values will be added/removed at high frequencies. One can thus resample a series to m points via Fourier interpolation by: a_resamp = irfft(rfft(a), m).

Examples

>>> np.fft.ifft([1, -1j, -1, 1j])
array([ 0.+0.j,  1.+0.j,  0.+0.j,  0.+0.j])
>>> np.fft.irfft([1, -1j, -1])
array([ 0.,  1.,  0.,  0.])


Notice how the last term in the input to the ordinary ifft is the complex conjugate of the second term, and the output has zero imaginary part everywhere. When calling irfft, the negative frequencies are not specified, and the output array is purely real.

dask.array.random.beta(a, b, size=None)

Draw samples from a Beta distribution.

The Beta distribution is a special case of the Dirichlet distribution, and is related to the Gamma distribution. It has the probability distribution function

$f(x; a,b) = \frac{1}{B(\alpha, \beta)} x^{\alpha - 1} (1 - x)^{\beta - 1},$

where the normalisation, B, is the beta function,

$B(\alpha, \beta) = \int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1} dt.$

It is often seen in Bayesian inference and order statistics.

Parameters: a : float Alpha, non-negative. b : float Beta, non-negative. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. out : ndarray Array of the given shape, containing values drawn from a Beta distribution.
dask.array.random.binomial(n, p, size=None)

Draw samples from a binomial distribution.

Samples are drawn from a binomial distribution with specified parameters, n trials and p probability of success where n an integer >= 0 and p is in the interval [0,1]. (n may be input as a float, but it is truncated to an integer in use)

Parameters: n : float (but truncated to an integer) parameter, >= 0. p : float parameter, >= 0 and <=1. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : ndarray or scalar where the values are all integers in [0, n].

scipy.stats.distributions.binom
probability density function, distribution or cumulative density function, etc.

Notes

The probability density for the binomial distribution is

$P(N) = \binom{n}{N}p^N(1-p)^{n-N},$

where $$n$$ is the number of trials, $$p$$ is the probability of success, and $$N$$ is the number of successes.

When estimating the standard error of a proportion in a population by using a random sample, the normal distribution works well unless the product p*n <=5, where p = population proportion estimate, and n = number of samples, in which case the binomial distribution is used instead. For example, a sample of 15 people shows 4 who are left handed, and 11 who are right handed. Then p = 4/15 = 27%. 0.27*15 = 4, so the binomial distribution should be used in this case.

References

 [R104] Dalgaard, Peter, “Introductory Statistics with R”, Springer-Verlag, 2002.
 [R105] Glantz, Stanton A. “Primer of Biostatistics.”, McGraw-Hill, Fifth Edition, 2002.
 [R106] Lentner, Marvin, “Elementary Applied Statistics”, Bogden and Quigley, 1972.
 [R107] Weisstein, Eric W. “Binomial Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/BinomialDistribution.html
 [R108] Wikipedia, “Binomial-distribution”, http://en.wikipedia.org/wiki/Binomial_distribution

Examples

Draw samples from the distribution:

>> n, p = 10, .5 # number of trials, probability of each trial >> s = np.random.binomial(n, p, 1000) # result of flipping a coin 10 times, tested 1000 times.

A real world example. A company drills 9 wild-cat oil exploration wells, each with an estimated probability of success of 0.1. All nine wells fail. What is the probability of that happening?

Let’s do 20,000 trials of the model, and count the number that generate zero positive results.

>> sum(np.random.binomial(9, 0.1, 20000) == 0)/20000. # answer = 0.38885, or 38%.

dask.array.random.chisquare(df, size=None)

Draw samples from a chi-square distribution.

When df independent random variables, each with standard normal distributions (mean 0, variance 1), are squared and summed, the resulting distribution is chi-square (see Notes). This distribution is often used in hypothesis testing.

Parameters: df : int Number of degrees of freedom. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. output : ndarray Samples drawn from the distribution, packed in a size-shaped array. ValueError When df <= 0 or when an inappropriate size (e.g. size=-1) is given.

Notes

The variable obtained by summing the squares of df independent, standard normally distributed random variables:

$Q = \sum_{i=0}^{\mathtt{df}} X^2_i$

is chi-square distributed, denoted

$Q \sim \chi^2_k.$

The probability density function of the chi-squared distribution is

$p(x) = \frac{(1/2)^{k/2}}{\Gamma(k/2)} x^{k/2 - 1} e^{-x/2},$

where $$\Gamma$$ is the gamma function,

$\Gamma(x) = \int_0^{-\infty} t^{x - 1} e^{-t} dt.$

References

 [R109] NIST “Engineering Statistics Handbook” http://www.itl.nist.gov/div898/handbook/eda/section3/eda3666.htm

Examples

>> np.random.chisquare(2,4) array([ 1.89920014, 9.00867716, 3.13710533, 5.62318272])

dask.array.random.exponential(scale=1.0, size=None)

Draw samples from an exponential distribution.

Its probability density function is

$f(x; \frac{1}{\beta}) = \frac{1}{\beta} \exp(-\frac{x}{\beta}),$

for x > 0 and 0 elsewhere. $$\beta$$ is the scale parameter, which is the inverse of the rate parameter $$\lambda = 1/\beta$$. The rate parameter is an alternative, widely used parameterization of the exponential distribution [R112].

The exponential distribution is a continuous analogue of the geometric distribution. It describes many common situations, such as the size of raindrops measured over many rainstorms [R110], or the time between page requests to Wikipedia [R111].

Parameters: scale : float The scale parameter, $$\beta = 1/\lambda$$. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.

References

 [R110] (1, 2) Peyton Z. Peebles Jr., “Probability, Random Variables and Random Signal Principles”, 4th ed, 2001, p. 57.
 [R111] (1, 2) “Poisson Process”, Wikipedia, http://en.wikipedia.org/wiki/Poisson_process
 [R112] (1, 2) “Exponential Distribution, Wikipedia, http://en.wikipedia.org/wiki/Exponential_distribution
dask.array.random.f(dfnum, dfden, size=None)

Draw samples from an F distribution.

Samples are drawn from an F distribution with specified parameters, dfnum (degrees of freedom in numerator) and dfden (degrees of freedom in denominator), where both parameters should be greater than zero.

The random variate of the F distribution (also known as the Fisher distribution) is a continuous probability distribution that arises in ANOVA tests, and is the ratio of two chi-square variates.

Parameters: dfnum : float Degrees of freedom in numerator. Should be greater than zero. dfden : float Degrees of freedom in denominator. Should be greater than zero. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : ndarray or scalar Samples from the Fisher distribution.

scipy.stats.distributions.f
probability density function, distribution or cumulative density function, etc.

Notes

The F statistic is used to compare in-group variances to between-group variances. Calculating the distribution depends on the sampling, and so it is a function of the respective degrees of freedom in the problem. The variable dfnum is the number of samples minus one, the between-groups degrees of freedom, while dfden is the within-groups degrees of freedom, the sum of the number of samples in each group minus the number of groups.

References

 [R113] Glantz, Stanton A. “Primer of Biostatistics.”, McGraw-Hill, Fifth Edition, 2002.
 [R114] Wikipedia, “F-distribution”, http://en.wikipedia.org/wiki/F-distribution

Examples

An example from Glantz[1], pp 47-40:

Two groups, children of diabetics (25 people) and children from people without diabetes (25 controls). Fasting blood glucose was measured, case group had a mean value of 86.1, controls had a mean value of 82.2. Standard deviations were 2.09 and 2.49 respectively. Are these data consistent with the null hypothesis that the parents diabetic status does not affect their children’s blood glucose levels? Calculating the F statistic from the data gives a value of 36.01.

Draw samples from the distribution:

>> dfnum = 1. # between group degrees of freedom >> dfden = 48. # within groups degrees of freedom >> s = np.random.f(dfnum, dfden, 1000)

The lower bound for the top 1% of the samples is :

>> sort(s)[-10] 7.61988120985

So there is about a 1% chance that the F statistic will exceed 7.62, the measured value is 36, so the null hypothesis is rejected at the 1% level.

dask.array.random.gamma(shape, scale=1.0, size=None)

Draw samples from a Gamma distribution.

Samples are drawn from a Gamma distribution with specified parameters, shape (sometimes designated “k”) and scale (sometimes designated “theta”), where both parameters are > 0.

Parameters: shape : scalar > 0 The shape of the gamma distribution. scale : scalar > 0, optional The scale of the gamma distribution. Default is equal to 1. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. out : ndarray, float Returns one sample unless size parameter is specified.

scipy.stats.distributions.gamma
probability density function, distribution or cumulative density function, etc.

Notes

The probability density for the Gamma distribution is

$p(x) = x^{k-1}\frac{e^{-x/\theta}}{\theta^k\Gamma(k)},$

where $$k$$ is the shape and $$\theta$$ the scale, and $$\Gamma$$ is the Gamma function.

The Gamma distribution is often used to model the times to failure of electronic components, and arises naturally in processes for which the waiting times between Poisson distributed events are relevant.

References

 [R115] Weisstein, Eric W. “Gamma Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/GammaDistribution.html
 [R116] Wikipedia, “Gamma-distribution”, http://en.wikipedia.org/wiki/Gamma-distribution

Examples

Draw samples from the distribution:

>> shape, scale = 2., 2. # mean and dispersion >> s = np.random.gamma(shape, scale, 1000)

Display the histogram of the samples, along with the probability density function:

>> import matplotlib.pyplot as plt >> import scipy.special as sps >> count, bins, ignored = plt.hist(s, 50, normed=True) >> y = bins**(shape-1)*(np.exp(-bins/scale) / .. (sps.gamma(shape)*scale**shape)) >> plt.plot(bins, y, linewidth=2, color=’r’) >> plt.show()

dask.array.random.geometric(p, size=None)

Draw samples from the geometric distribution.

Bernoulli trials are experiments with one of two outcomes: success or failure (an example of such an experiment is flipping a coin). The geometric distribution models the number of trials that must be run in order to achieve success. It is therefore supported on the positive integers, k = 1, 2, ...

The probability mass function of the geometric distribution is

$f(k) = (1 - p)^{k - 1} p$

where p is the probability of success of an individual trial.

Parameters: p : float The probability of success of an individual trial. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. out : ndarray Samples from the geometric distribution, shaped according to size.

Examples

Draw ten thousand values from the geometric distribution, with the probability of an individual success equal to 0.35:

>> z = np.random.geometric(p=0.35, size=10000)

How many trials succeeded after a single run?

>> (z == 1).sum() / 10000. 0.34889999999999999 #random

dask.array.random.gumbel(loc=0.0, scale=1.0, size=None)

Draw samples from a Gumbel distribution.

Draw samples from a Gumbel distribution with specified location and scale. For more information on the Gumbel distribution, see Notes and References below.

Parameters: loc : float The location of the mode of the distribution. scale : float The scale parameter of the distribution. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : ndarray or scalar

scipy.stats.gumbel_l, scipy.stats.gumbel_r, scipy.stats.genextreme, weibull

Notes

The Gumbel (or Smallest Extreme Value (SEV) or the Smallest Extreme Value Type I) distribution is one of a class of Generalized Extreme Value (GEV) distributions used in modeling extreme value problems. The Gumbel is a special case of the Extreme Value Type I distribution for maximums from distributions with “exponential-like” tails.

The probability density for the Gumbel distribution is

$p(x) = \frac{e^{-(x - \mu)/ \beta}}{\beta} e^{ -e^{-(x - \mu)/ \beta}},$

where $$\mu$$ is the mode, a location parameter, and $$\beta$$ is the scale parameter.

The Gumbel (named for German mathematician Emil Julius Gumbel) was used very early in the hydrology literature, for modeling the occurrence of flood events. It is also used for modeling maximum wind speed and rainfall rates. It is a “fat-tailed” distribution - the probability of an event in the tail of the distribution is larger than if one used a Gaussian, hence the surprisingly frequent occurrence of 100-year floods. Floods were initially modeled as a Gaussian process, which underestimated the frequency of extreme events.

It is one of a class of extreme value distributions, the Generalized Extreme Value (GEV) distributions, which also includes the Weibull and Frechet.

The function has a mean of $$\mu + 0.57721\beta$$ and a variance of $$\frac{\pi^2}{6}\beta^2$$.

References

 [R117] Gumbel, E. J., “Statistics of Extremes,” New York: Columbia University Press, 1958.
 [R118] Reiss, R.-D. and Thomas, M., “Statistical Analysis of Extreme Values from Insurance, Finance, Hydrology and Other Fields,” Basel: Birkhauser Verlag, 2001.

Examples

Draw samples from the distribution:

>> mu, beta = 0, 0.1 # location and scale >> s = np.random.gumbel(mu, beta, 1000)

Display the histogram of the samples, along with the probability density function:

>> import matplotlib.pyplot as plt >> count, bins, ignored = plt.hist(s, 30, normed=True) >> plt.plot(bins, (1/beta)*np.exp(-(bins - mu)/beta) .. * np.exp( -np.exp( -(bins - mu) /beta) ), .. linewidth=2, color=’r’) >> plt.show()

Show how an extreme value distribution can arise from a Gaussian process and compare to a Gaussian:

>> means = [] >> maxima = [] >> for i in range(0,1000) : .. a = np.random.normal(mu, beta, 1000) .. means.append(a.mean()) .. maxima.append(a.max()) >> count, bins, ignored = plt.hist(maxima, 30, normed=True) >> beta = np.std(maxima) * np.sqrt(6) / np.pi >> mu = np.mean(maxima) - 0.57721*beta >> plt.plot(bins, (1/beta)*np.exp(-(bins - mu)/beta) .. * np.exp(-np.exp(-(bins - mu)/beta)), .. linewidth=2, color=’r’) >> plt.plot(bins, 1/(beta * np.sqrt(2 * np.pi)) .. * np.exp(-(bins - mu)**2 / (2 * beta**2)), .. linewidth=2, color=’g’) >> plt.show()

dask.array.random.hypergeometric(ngood, nbad, nsample, size=None)

Draw samples from a Hypergeometric distribution.

Samples are drawn from a hypergeometric distribution with specified parameters, ngood (ways to make a good selection), nbad (ways to make a bad selection), and nsample = number of items sampled, which is less than or equal to the sum ngood + nbad.

Parameters: ngood : int or array_like Number of ways to make a good selection. Must be nonnegative. nbad : int or array_like Number of ways to make a bad selection. Must be nonnegative. nsample : int or array_like Number of items sampled. Must be at least 1 and at most ngood + nbad. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : ndarray or scalar The values are all integers in [0, n].

scipy.stats.distributions.hypergeom
probability density function, distribution or cumulative density function, etc.

Notes

The probability density for the Hypergeometric distribution is

$P(x) = \frac{\binom{m}{n}\binom{N-m}{n-x}}{\binom{N}{n}},$

where $$0 \le x \le m$$ and $$n+m-N \le x \le n$$

for P(x) the probability of x successes, n = ngood, m = nbad, and N = number of samples.

Consider an urn with black and white marbles in it, ngood of them black and nbad are white. If you draw nsample balls without replacement, then the hypergeometric distribution describes the distribution of black balls in the drawn sample.

Note that this distribution is very similar to the binomial distribution, except that in this case, samples are drawn without replacement, whereas in the Binomial case samples are drawn with replacement (or the sample space is infinite). As the sample space becomes large, this distribution approaches the binomial.

References

 [R119] Lentner, Marvin, “Elementary Applied Statistics”, Bogden and Quigley, 1972.
 [R120] Weisstein, Eric W. “Hypergeometric Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/HypergeometricDistribution.html
 [R121] Wikipedia, “Hypergeometric-distribution”, http://en.wikipedia.org/wiki/Hypergeometric_distribution

Examples

Draw samples from the distribution:

>> ngood, nbad, nsamp = 100, 2, 10 # number of good, number of bad, and number of samples >> s = np.random.hypergeometric(ngood, nbad, nsamp, 1000) >> hist(s) # note that it is very unlikely to grab both bad items

Suppose you have an urn with 15 white and 15 black marbles. If you pull 15 marbles at random, how likely is it that 12 or more of them are one color?

>> s = np.random.hypergeometric(15, 15, 15, 100000) >> sum(s>=12)/100000. + sum(s<=3)/100000. # answer = 0.003 .. pretty unlikely!

dask.array.random.laplace(loc=0.0, scale=1.0, size=None)

Draw samples from the Laplace or double exponential distribution with specified location (or mean) and scale (decay).

The Laplace distribution is similar to the Gaussian/normal distribution, but is sharper at the peak and has fatter tails. It represents the difference between two independent, identically distributed exponential random variables.

Parameters: loc : float, optional The position, $$\mu$$, of the distribution peak. scale : float, optional $$\lambda$$, the exponential decay. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : ndarray or float

Notes

It has the probability density function

$f(x; \mu, \lambda) = \frac{1}{2\lambda} \exp\left(-\frac{|x - \mu|}{\lambda}\right).$

The first law of Laplace, from 1774, states that the frequency of an error can be expressed as an exponential function of the absolute magnitude of the error, which leads to the Laplace distribution. For many problems in economics and health sciences, this distribution seems to model the data better than the standard Gaussian distribution.

References

 [R122] Abramowitz, M. and Stegun, I. A. (Eds.). “Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th printing,” New York: Dover, 1972.
 [R123] Kotz, Samuel, et. al. “The Laplace Distribution and Generalizations, ” Birkhauser, 2001.
 [R124] Weisstein, Eric W. “Laplace Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/LaplaceDistribution.html
 [R125] Wikipedia, “Laplace Distribution”, http://en.wikipedia.org/wiki/Laplace_distribution

Examples

Draw samples from the distribution

>> loc, scale = 0., 1. >> s = np.random.laplace(loc, scale, 1000)

Display the histogram of the samples, along with the probability density function:

>> import matplotlib.pyplot as plt >> count, bins, ignored = plt.hist(s, 30, normed=True) >> x = np.arange(-8., 8., .01) >> pdf = np.exp(-abs(x-loc)/scale)/(2.*scale) >> plt.plot(x, pdf)

Plot Gaussian for comparison:

>> g = (1/(scale * np.sqrt(2 * np.pi)) * .. np.exp(-(x - loc)**2 / (2 * scale**2))) >> plt.plot(x,g)

dask.array.random.logistic(loc=0.0, scale=1.0, size=None)

Draw samples from a logistic distribution.

Samples are drawn from a logistic distribution with specified parameters, loc (location or mean, also median), and scale (>0).

Parameters: loc : float scale : float > 0. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : ndarray or scalar where the values are all integers in [0, n].

scipy.stats.distributions.logistic
probability density function, distribution or cumulative density function, etc.

Notes

The probability density for the Logistic distribution is

$P(x) = P(x) = \frac{e^{-(x-\mu)/s}}{s(1+e^{-(x-\mu)/s})^2},$

where $$\mu$$ = location and $$s$$ = scale.

The Logistic distribution is used in Extreme Value problems where it can act as a mixture of Gumbel distributions, in Epidemiology, and by the World Chess Federation (FIDE) where it is used in the Elo ranking system, assuming the performance of each player is a logistically distributed random variable.

References

 [R126] Reiss, R.-D. and Thomas M. (2001), “Statistical Analysis of Extreme Values, from Insurance, Finance, Hydrology and Other Fields,” Birkhauser Verlag, Basel, pp 132-133.
 [R127] Weisstein, Eric W. “Logistic Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/LogisticDistribution.html
 [R128] Wikipedia, “Logistic-distribution”, http://en.wikipedia.org/wiki/Logistic_distribution

Examples

Draw samples from the distribution:

>> loc, scale = 10, 1 >> s = np.random.logistic(loc, scale, 10000) >> count, bins, ignored = plt.hist(s, bins=50)

# plot against distribution

>> def logist(x, loc, scale): .. return exp((loc-x)/scale)/(scale*(1+exp((loc-x)/scale))**2) >> plt.plot(bins, logist(bins, loc, scale)*count.max()/.. logist(bins, loc, scale).max()) >> plt.show()

dask.array.random.lognormal(mean=0.0, sigma=1.0, size=None)

Draw samples from a log-normal distribution.

Draw samples from a log-normal distribution with specified mean, standard deviation, and array shape. Note that the mean and standard deviation are not the values for the distribution itself, but of the underlying normal distribution it is derived from.

Parameters: mean : float Mean value of the underlying normal distribution sigma : float, > 0. Standard deviation of the underlying normal distribution size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : ndarray or float The desired samples. An array of the same shape as size if given, if size is None a float is returned.

scipy.stats.lognorm
probability density function, distribution, cumulative density function, etc.

Notes

A variable x has a log-normal distribution if log(x) is normally distributed. The probability density function for the log-normal distribution is:

$p(x) = \frac{1}{\sigma x \sqrt{2\pi}} e^{(-\frac{(ln(x)-\mu)^2}{2\sigma^2})}$

where $$\mu$$ is the mean and $$\sigma$$ is the standard deviation of the normally distributed logarithm of the variable. A log-normal distribution results if a random variable is the product of a large number of independent, identically-distributed variables in the same way that a normal distribution results if the variable is the sum of a large number of independent, identically-distributed variables.

References

 [R129] Limpert, E., Stahel, W. A., and Abbt, M., “Log-normal Distributions across the Sciences: Keys and Clues,” BioScience, Vol. 51, No. 5, May, 2001. http://stat.ethz.ch/~stahel/lognormal/bioscience.pdf
 [R130] Reiss, R.D. and Thomas, M., “Statistical Analysis of Extreme Values,” Basel: Birkhauser Verlag, 2001, pp. 31-32.

Examples

Draw samples from the distribution:

>> mu, sigma = 3., 1. # mean and standard deviation >> s = np.random.lognormal(mu, sigma, 1000)

Display the histogram of the samples, along with the probability density function:

>> import matplotlib.pyplot as plt >> count, bins, ignored = plt.hist(s, 100, normed=True, align=’mid’)

>> x = np.linspace(min(bins), max(bins), 10000) >> pdf = (np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2)) .. / (x * sigma * np.sqrt(2 * np.pi)))

>> plt.plot(x, pdf, linewidth=2, color=’r’) >> plt.axis(‘tight’) >> plt.show()

Demonstrate that taking the products of random samples from a uniform distribution can be fit well by a log-normal probability density function.

>> # Generate a thousand samples: each is the product of 100 random >> # values, drawn from a normal distribution. >> b = [] >> for i in range(1000): .. a = 10. + np.random.random(100) .. b.append(np.product(a))

>> b = np.array(b) / np.min(b) # scale values to be positive >> count, bins, ignored = plt.hist(b, 100, normed=True, align=’mid’) >> sigma = np.std(np.log(b)) >> mu = np.mean(np.log(b))

>> x = np.linspace(min(bins), max(bins), 10000) >> pdf = (np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2)) .. / (x * sigma * np.sqrt(2 * np.pi)))

>> plt.plot(x, pdf, color=’r’, linewidth=2) >> plt.show()

dask.array.random.logseries(p, size=None)

Draw samples from a logarithmic series distribution.

Samples are drawn from a log series distribution with specified shape parameter, 0 < p < 1.

Parameters: loc : float scale : float > 0. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : ndarray or scalar where the values are all integers in [0, n].

scipy.stats.distributions.logser
probability density function, distribution or cumulative density function, etc.

Notes

The probability density for the Log Series distribution is

$P(k) = \frac{-p^k}{k \ln(1-p)},$

where p = probability.

The log series distribution is frequently used to represent species richness and occurrence, first proposed by Fisher, Corbet, and Williams in 1943 [2]. It may also be used to model the numbers of occupants seen in cars [3].

References

 [R131] Buzas, Martin A.; Culver, Stephen J., Understanding regional species diversity through the log series distribution of occurrences: BIODIVERSITY RESEARCH Diversity & Distributions, Volume 5, Number 5, September 1999 , pp. 187-195(9).
 [R132] Fisher, R.A,, A.S. Corbet, and C.B. Williams. 1943. The relation between the number of species and the number of individuals in a random sample of an animal population. Journal of Animal Ecology, 12:42-58.
 [R133] D. J. Hand, F. Daly, D. Lunn, E. Ostrowski, A Handbook of Small Data Sets, CRC Press, 1994.
 [R134] Wikipedia, “Logarithmic-distribution”, http://en.wikipedia.org/wiki/Logarithmic-distribution

Examples

Draw samples from the distribution:

>> a = .6 >> s = np.random.logseries(a, 10000) >> count, bins, ignored = plt.hist(s)

# plot against distribution

>> def logseries(k, p): .. return -p**k/(k*log(1-p)) >> plt.plot(bins, logseries(bins, a)*count.max()/

logseries(bins, a).max(), ‘r’)

>> plt.show()

dask.array.random.negative_binomial(n, p, size=None)

Draw samples from a negative binomial distribution.

Samples are drawn from a negative binomial distribution with specified parameters, n trials and p probability of success where n is an integer > 0 and p is in the interval [0, 1].

Parameters: n : int Parameter, > 0. p : float Parameter, >= 0 and <=1. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : int or ndarray of ints Drawn samples.

Notes

The probability density for the negative binomial distribution is

$P(N;n,p) = \binom{N+n-1}{n-1}p^{n}(1-p)^{N},$

where $$n-1$$ is the number of successes, $$p$$ is the probability of success, and $$N+n-1$$ is the number of trials. The negative binomial distribution gives the probability of n-1 successes and N failures in N+n-1 trials, and success on the (N+n)th trial.

If one throws a die repeatedly until the third time a “1” appears, then the probability distribution of the number of non-“1”s that appear before the third “1” is a negative binomial distribution.

References

 [R135] Weisstein, Eric W. “Negative Binomial Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/NegativeBinomialDistribution.html
 [R136] Wikipedia, “Negative binomial distribution”, http://en.wikipedia.org/wiki/Negative_binomial_distribution

Examples

Draw samples from the distribution:

A real world example. A company drills wild-cat oil exploration wells, each with an estimated probability of success of 0.1. What is the probability of having one success for each successive well, that is what is the probability of a single success after drilling 5 wells, after 6 wells, etc.?

>> s = np.random.negative_binomial(1, 0.1, 100000) >> for i in range(1, 11): .. probability = sum(s<i) / 100000. .. print i, “wells drilled, probability of one success =”, probability

dask.array.random.noncentral_chisquare(df, nonc, size=None)

Draw samples from a noncentral chi-square distribution.

The noncentral $$\chi^2$$ distribution is a generalisation of the $$\chi^2$$ distribution.

Parameters: df : int Degrees of freedom, should be > 0 as of Numpy 1.10, should be > 1 for earlier versions. nonc : float Non-centrality, should be non-negative. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.

Notes

The probability density function for the noncentral Chi-square distribution is

$P(x;df,nonc) = \sum^{\infty}_{i=0} \frac{e^{-nonc/2}(nonc/2)^{i}}{i!} \P_{Y_{df+2i}}(x),$

where $$Y_{q}$$ is the Chi-square with q degrees of freedom.

In Delhi (2007), it is noted that the noncentral chi-square is useful in bombing and coverage problems, the probability of killing the point target given by the noncentral chi-squared distribution.

References

 [R137] Delhi, M.S. Holla, “On a noncentral chi-square distribution in the analysis of weapon systems effectiveness”, Metrika, Volume 15, Number 1 / December, 1970.
 [R138] Wikipedia, “Noncentral chi-square distribution” http://en.wikipedia.org/wiki/Noncentral_chi-square_distribution

Examples

Draw values from the distribution and plot the histogram

>> import matplotlib.pyplot as plt >> values = plt.hist(np.random.noncentral_chisquare(3, 20, 100000), .. bins=200, normed=True) >> plt.show()

Draw values from a noncentral chisquare with very small noncentrality, and compare to a chisquare.

>> plt.figure() >> values = plt.hist(np.random.noncentral_chisquare(3, .0000001, 100000), .. bins=np.arange(0., 25, .1), normed=True) >> values2 = plt.hist(np.random.chisquare(3, 100000), .. bins=np.arange(0., 25, .1), normed=True) >> plt.plot(values[1][0:-1], values[0]-values2[0], ‘ob’) >> plt.show()

Demonstrate how large values of non-centrality lead to a more symmetric distribution.

>> plt.figure() >> values = plt.hist(np.random.noncentral_chisquare(3, 20, 100000), .. bins=200, normed=True) >> plt.show()

dask.array.random.noncentral_f(dfnum, dfden, nonc, size=None)

Draw samples from the noncentral F distribution.

Samples are drawn from an F distribution with specified parameters, dfnum (degrees of freedom in numerator) and dfden (degrees of freedom in denominator), where both parameters > 1. nonc is the non-centrality parameter.

Parameters: dfnum : int Parameter, should be > 1. dfden : int Parameter, should be > 1. nonc : float Parameter, should be >= 0. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : scalar or ndarray Drawn samples.

Notes

When calculating the power of an experiment (power = probability of rejecting the null hypothesis when a specific alternative is true) the non-central F statistic becomes important. When the null hypothesis is true, the F statistic follows a central F distribution. When the null hypothesis is not true, then it follows a non-central F statistic.

References

 [R139] Weisstein, Eric W. “Noncentral F-Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/NoncentralF-Distribution.html
 [R140] Wikipedia, “Noncentral F distribution”, http://en.wikipedia.org/wiki/Noncentral_F-distribution

Examples

In a study, testing for a specific alternative to the null hypothesis requires use of the Noncentral F distribution. We need to calculate the area in the tail of the distribution that exceeds the value of the F distribution for the null hypothesis. We’ll plot the two probability distributions for comparison.

>> dfnum = 3 # between group deg of freedom >> dfden = 20 # within groups degrees of freedom >> nonc = 3.0 >> nc_vals = np.random.noncentral_f(dfnum, dfden, nonc, 1000000) >> NF = np.histogram(nc_vals, bins=50, normed=True) >> c_vals = np.random.f(dfnum, dfden, 1000000) >> F = np.histogram(c_vals, bins=50, normed=True) >> plt.plot(F[1][1:], F[0]) >> plt.plot(NF[1][1:], NF[0]) >> plt.show()

dask.array.random.normal(loc=0.0, scale=1.0, size=None)

Draw random samples from a normal (Gaussian) distribution.

The probability density function of the normal distribution, first derived by De Moivre and 200 years later by both Gauss and Laplace independently [R142], is often called the bell curve because of its characteristic shape (see the example below).

The normal distributions occurs often in nature. For example, it describes the commonly occurring distribution of samples influenced by a large number of tiny, random disturbances, each with its own unique distribution [R142].

Parameters: loc : float Mean (“centre”) of the distribution. scale : float Standard deviation (spread or “width”) of the distribution. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.

scipy.stats.distributions.norm
probability density function, distribution or cumulative density function, etc.

Notes

The probability density for the Gaussian distribution is

$p(x) = \frac{1}{\sqrt{ 2 \pi \sigma^2 }} e^{ - \frac{ (x - \mu)^2 } {2 \sigma^2} },$

where $$\mu$$ is the mean and $$\sigma$$ the standard deviation. The square of the standard deviation, $$\sigma^2$$, is called the variance.

The function has its peak at the mean, and its “spread” increases with the standard deviation (the function reaches 0.607 times its maximum at $$x + \sigma$$ and $$x - \sigma$$ [R142]). This implies that numpy.random.normal is more likely to return samples lying close to the mean, rather than those far away.

References

 [R141] Wikipedia, “Normal distribution”, http://en.wikipedia.org/wiki/Normal_distribution
 [R142] (1, 2, 3, 4) P. R. Peebles Jr., “Central Limit Theorem” in “Probability, Random Variables and Random Signal Principles”, 4th ed., 2001, pp. 51, 51, 125.

Examples

Draw samples from the distribution:

>> mu, sigma = 0, 0.1 # mean and standard deviation >> s = np.random.normal(mu, sigma, 1000)

Verify the mean and the variance:

>> abs(mu - np.mean(s)) < 0.01 True

>> abs(sigma - np.std(s, ddof=1)) < 0.01 True

Display the histogram of the samples, along with the probability density function:

>> import matplotlib.pyplot as plt >> count, bins, ignored = plt.hist(s, 30, normed=True) >> plt.plot(bins, 1/(sigma * np.sqrt(2 * np.pi)) * .. np.exp( - (bins - mu)**2 / (2 * sigma**2) ), .. linewidth=2, color=’r’) >> plt.show()

dask.array.random.pareto(a, size=None)

Draw samples from a Pareto II or Lomax distribution with specified shape.

The Lomax or Pareto II distribution is a shifted Pareto distribution. The classical Pareto distribution can be obtained from the Lomax distribution by adding 1 and multiplying by the scale parameter m (see Notes). The smallest value of the Lomax distribution is zero while for the classical Pareto distribution it is mu, where the standard Pareto distribution has location mu = 1. Lomax can also be considered as a simplified version of the Generalized Pareto distribution (available in SciPy), with the scale set to one and the location set to zero.

The Pareto distribution must be greater than zero, and is unbounded above. It is also known as the “80-20 rule”. In this distribution, 80 percent of the weights are in the lowest 20 percent of the range, while the other 20 percent fill the remaining 80 percent of the range.

Parameters: shape : float, > 0. Shape of the distribution. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.

scipy.stats.distributions.lomax.pdf
probability density function, distribution or cumulative density function, etc.
scipy.stats.distributions.genpareto.pdf
probability density function, distribution or cumulative density function, etc.

Notes

The probability density for the Pareto distribution is

$p(x) = \frac{am^a}{x^{a+1}}$

where $$a$$ is the shape and $$m$$ the scale.

The Pareto distribution, named after the Italian economist Vilfredo Pareto, is a power law probability distribution useful in many real world problems. Outside the field of economics it is generally referred to as the Bradford distribution. Pareto developed the distribution to describe the distribution of wealth in an economy. It has also found use in insurance, web page access statistics, oil field sizes, and many other problems, including the download frequency for projects in Sourceforge [R143]. It is one of the so-called “fat-tailed” distributions.

References

 [R143] (1, 2) Francis Hunt and Paul Johnson, On the Pareto Distribution of Sourceforge projects.
 [R144] Pareto, V. (1896). Course of Political Economy. Lausanne.
 [R145] Reiss, R.D., Thomas, M.(2001), Statistical Analysis of Extreme Values, Birkhauser Verlag, Basel, pp 23-30.
 [R146] Wikipedia, “Pareto distribution”, http://en.wikipedia.org/wiki/Pareto_distribution

Examples

Draw samples from the distribution:

>> a, m = 3., 2. # shape and mode >> s = (np.random.pareto(a, 1000) + 1) * m

Display the histogram of the samples, along with the probability density function:

>> import matplotlib.pyplot as plt >> count, bins, _ = plt.hist(s, 100, normed=True) >> fit = a*m**a / bins**(a+1) >> plt.plot(bins, max(count)*fit/max(fit), linewidth=2, color=’r’) >> plt.show()

dask.array.random.poisson(lam=1.0, size=None)

Draw samples from a Poisson distribution.

The Poisson distribution is the limit of the binomial distribution for large N.

Parameters: lam : float or sequence of float Expectation of interval, should be >= 0. A sequence of expectation intervals must be broadcastable over the requested size. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : ndarray or scalar The drawn samples, of shape size, if it was provided.

Notes

The Poisson distribution

$f(k; \lambda)=\frac{\lambda^k e^{-\lambda}}{k!}$

For events with an expected separation $$\lambda$$ the Poisson distribution $$f(k; \lambda)$$ describes the probability of $$k$$ events occurring within the observed interval $$\lambda$$.

Because the output is limited to the range of the C long type, a ValueError is raised when lam is within 10 sigma of the maximum representable value.

References

 [R147] Weisstein, Eric W. “Poisson Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/PoissonDistribution.html
 [R148] Wikipedia, “Poisson distribution”, http://en.wikipedia.org/wiki/Poisson_distribution

Examples

Draw samples from the distribution:

>> import numpy as np >> s = np.random.poisson(5, 10000)

Display histogram of the sample:

>> import matplotlib.pyplot as plt >> count, bins, ignored = plt.hist(s, 14, normed=True) >> plt.show()

Draw each 100 values for lambda 100 and 500:

>> s = np.random.poisson(lam=(100., 500.), size=(100, 2))

dask.array.random.power(a, size=None)

Draws samples in [0, 1] from a power distribution with positive exponent a - 1.

Also known as the power function distribution.

Parameters: a : float parameter, > 0 size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : ndarray or scalar The returned samples lie in [0, 1]. ValueError If a < 1.

Notes

The probability density function is

$P(x; a) = ax^{a-1}, 0 \le x \le 1, a>0.$

The power function distribution is just the inverse of the Pareto distribution. It may also be seen as a special case of the Beta distribution.

It is used, for example, in modeling the over-reporting of insurance claims.

References

 [R149] Christian Kleiber, Samuel Kotz, “Statistical size distributions in economics and actuarial sciences”, Wiley, 2003.
 [R150] Heckert, N. A. and Filliben, James J. “NIST Handbook 148: Dataplot Reference Manual, Volume 2: Let Subcommands and Library Functions”, National Institute of Standards and Technology Handbook Series, June 2003. http://www.itl.nist.gov/div898/software/dataplot/refman2/auxillar/powpdf.pdf

Examples

Draw samples from the distribution:

>> a = 5. # shape >> samples = 1000 >> s = np.random.power(a, samples)

Display the histogram of the samples, along with the probability density function:

>> import matplotlib.pyplot as plt >> count, bins, ignored = plt.hist(s, bins=30) >> x = np.linspace(0, 1, 100) >> y = a*x**(a-1.) >> normed_y = samples*np.diff(bins)[0]*y >> plt.plot(x, normed_y) >> plt.show()

Compare the power function distribution to the inverse of the Pareto.

>> from scipy import stats >> rvs = np.random.power(5, 1000000) >> rvsp = np.random.pareto(5, 1000000) >> xx = np.linspace(0,1,100) >> powpdf = stats.powerlaw.pdf(xx,5)

>> plt.figure() >> plt.hist(rvs, bins=50, normed=True) >> plt.plot(xx,powpdf,’r-‘) >> plt.title(‘np.random.power(5)’)

>> plt.figure() >> plt.hist(1./(1.+rvsp), bins=50, normed=True) >> plt.plot(xx,powpdf,’r-‘) >> plt.title(‘inverse of 1 + np.random.pareto(5)’)

>> plt.figure() >> plt.hist(1./(1.+rvsp), bins=50, normed=True) >> plt.plot(xx,powpdf,’r-‘) >> plt.title(‘inverse of stats.pareto(5)’)

dask.array.random.random(size=None)

Return random floats in the half-open interval [0.0, 1.0).

Results are from the “continuous uniform” distribution over the stated interval. To sample $$Unif[a, b), b > a$$ multiply the output of random_sample by (b-a) and add a:

(b - a) * random_sample() + a

Parameters: size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. out : float or ndarray of floats Array of random floats of shape size (unless size=None, in which case a single float is returned).

Examples

>> np.random.random_sample() 0.47108547995356098 >> type(np.random.random_sample()) <type ‘float’> >> np.random.random_sample((5,)) array([ 0.30220482, 0.86820401, 0.1654503 , 0.11659149, 0.54323428])

Three-by-two array of random numbers from [-5, 0):

>> 5 * np.random.random_sample((3, 2)) - 5 array([[-3.99149989, -0.52338984],

[-2.99091858, -0.79479508], [-1.23204345, -1.75224494]])
dask.array.random.random_sample(size=None)

Return random floats in the half-open interval [0.0, 1.0).

Results are from the “continuous uniform” distribution over the stated interval. To sample $$Unif[a, b), b > a$$ multiply the output of random_sample by (b-a) and add a:

(b - a) * random_sample() + a

Parameters: size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. out : float or ndarray of floats Array of random floats of shape size (unless size=None, in which case a single float is returned).

Examples

>> np.random.random_sample() 0.47108547995356098 >> type(np.random.random_sample()) <type ‘float’> >> np.random.random_sample((5,)) array([ 0.30220482, 0.86820401, 0.1654503 , 0.11659149, 0.54323428])

Three-by-two array of random numbers from [-5, 0):

>> 5 * np.random.random_sample((3, 2)) - 5 array([[-3.99149989, -0.52338984],

[-2.99091858, -0.79479508], [-1.23204345, -1.75224494]])
dask.array.random.rayleigh(scale=1.0, size=None)

Draw samples from a Rayleigh distribution.

The $$\chi$$ and Weibull distributions are generalizations of the Rayleigh.

Parameters: scale : scalar Scale, also equals the mode. Should be >= 0. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.

Notes

The probability density function for the Rayleigh distribution is

$P(x;scale) = \frac{x}{scale^2}e^{\frac{-x^2}{2 \cdotp scale^2}}$

The Rayleigh distribution would arise, for example, if the East and North components of the wind velocity had identical zero-mean Gaussian distributions. Then the wind speed would have a Rayleigh distribution.

References

 [R151] Brighton Webs Ltd., “Rayleigh Distribution,” http://www.brighton-webs.co.uk/distributions/rayleigh.asp
 [R152] Wikipedia, “Rayleigh distribution” http://en.wikipedia.org/wiki/Rayleigh_distribution

Examples

Draw values from the distribution and plot the histogram

>> values = hist(np.random.rayleigh(3, 100000), bins=200, normed=True)

Wave heights tend to follow a Rayleigh distribution. If the mean wave height is 1 meter, what fraction of waves are likely to be larger than 3 meters?

>> meanvalue = 1 >> modevalue = np.sqrt(2 / np.pi) * meanvalue >> s = np.random.rayleigh(modevalue, 1000000)

The percentage of waves larger than 3 meters is:

>> 100.*sum(s>3)/1000000. 0.087300000000000003

dask.array.random.standard_cauchy(size=None)

Draw samples from a standard Cauchy distribution with mode = 0.

Also known as the Lorentz distribution.

Parameters: size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : ndarray or scalar The drawn samples.

Notes

The probability density function for the full Cauchy distribution is

$P(x; x_0, \gamma) = \frac{1}{\pi \gamma \bigl[ 1+ (\frac{x-x_0}{\gamma})^2 \bigr] }$

and the Standard Cauchy distribution just sets $$x_0=0$$ and $$\gamma=1$$

The Cauchy distribution arises in the solution to the driven harmonic oscillator problem, and also describes spectral line broadening. It also describes the distribution of values at which a line tilted at a random angle will cut the x axis.

When studying hypothesis tests that assume normality, seeing how the tests perform on data from a Cauchy distribution is a good indicator of their sensitivity to a heavy-tailed distribution, since the Cauchy looks very much like a Gaussian distribution, but with heavier tails.

References

 [R153] NIST/SEMATECH e-Handbook of Statistical Methods, “Cauchy Distribution”, http://www.itl.nist.gov/div898/handbook/eda/section3/eda3663.htm
 [R154] Weisstein, Eric W. “Cauchy Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/CauchyDistribution.html
 [R155] Wikipedia, “Cauchy distribution” http://en.wikipedia.org/wiki/Cauchy_distribution

Examples

Draw samples and plot the distribution:

>> s = np.random.standard_cauchy(1000000) >> s = s[(s>-25) & (s<25)] # truncate distribution so it plots well >> plt.hist(s, bins=100) >> plt.show()

dask.array.random.standard_exponential(size=None)

Draw samples from the standard exponential distribution.

standard_exponential is identical to the exponential distribution with a scale parameter of 1.

Parameters: size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. out : float or ndarray Drawn samples.

Examples

Output a 3x8000 array:

>> n = np.random.standard_exponential((3, 8000))

dask.array.random.standard_gamma(shape, size=None)

Draw samples from a standard Gamma distribution.

Samples are drawn from a Gamma distribution with specified parameters, shape (sometimes designated “k”) and scale=1.

Parameters: shape : float Parameter, should be > 0. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : ndarray or scalar The drawn samples.

scipy.stats.distributions.gamma
probability density function, distribution or cumulative density function, etc.

Notes

The probability density for the Gamma distribution is

$p(x) = x^{k-1}\frac{e^{-x/\theta}}{\theta^k\Gamma(k)},$

where $$k$$ is the shape and $$\theta$$ the scale, and $$\Gamma$$ is the Gamma function.

The Gamma distribution is often used to model the times to failure of electronic components, and arises naturally in processes for which the waiting times between Poisson distributed events are relevant.

References

 [R156] Weisstein, Eric W. “Gamma Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/GammaDistribution.html
 [R157] Wikipedia, “Gamma-distribution”, http://en.wikipedia.org/wiki/Gamma-distribution

Examples

Draw samples from the distribution:

>> shape, scale = 2., 1. # mean and width >> s = np.random.standard_gamma(shape, 1000000)

Display the histogram of the samples, along with the probability density function:

>> import matplotlib.pyplot as plt >> import scipy.special as sps >> count, bins, ignored = plt.hist(s, 50, normed=True) >> y = bins**(shape-1) * ((np.exp(-bins/scale))/ .. (sps.gamma(shape) * scale**shape)) >> plt.plot(bins, y, linewidth=2, color=’r’) >> plt.show()

dask.array.random.standard_normal(size=None)

Draw samples from a standard Normal distribution (mean=0, stdev=1).

Parameters: size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. out : float or ndarray Drawn samples.

Examples

>> s = np.random.standard_normal(8000) >> s array([ 0.6888893 , 0.78096262, -0.89086505, .., 0.49876311, #random

-0.38672696, -0.4685006 ]) #random

>> s.shape (8000,) >> s = np.random.standard_normal(size=(3, 4, 2)) >> s.shape (3, 4, 2)

dask.array.random.standard_t(df, size=None)

Draw samples from a standard Student’s t distribution with df degrees of freedom.

A special case of the hyperbolic distribution. As df gets large, the result resembles that of the standard normal distribution (standard_normal).

Parameters: df : int Degrees of freedom, should be > 0. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : ndarray or scalar Drawn samples.

Notes

The probability density function for the t distribution is

$P(x, df) = \frac{\Gamma(\frac{df+1}{2})}{\sqrt{\pi df} \Gamma(\frac{df}{2})}\Bigl( 1+\frac{x^2}{df} \Bigr)^{-(df+1)/2}$

The t test is based on an assumption that the data come from a Normal distribution. The t test provides a way to test whether the sample mean (that is the mean calculated from the data) is a good estimate of the true mean.

The derivation of the t-distribution was first published in 1908 by William Gisset while working for the Guinness Brewery in Dublin. Due to proprietary issues, he had to publish under a pseudonym, and so he used the name Student.

References

 [R158] (1, 2) Dalgaard, Peter, “Introductory Statistics With R”, Springer, 2002.
 [R159] Wikipedia, “Student’s t-distribution” http://en.wikipedia.org/wiki/Student’s_t-distribution

Examples

From Dalgaard page 83 [R158], suppose the daily energy intake for 11 women in Kj is:

>> intake = np.array([5260., 5470, 5640, 6180, 6390, 6515, 6805, 7515, .. 7515, 8230, 8770])

Does their energy intake deviate systematically from the recommended value of 7725 kJ?

We have 10 degrees of freedom, so is the sample mean within 95% of the recommended value?

>> s = np.random.standard_t(10, size=100000) >> np.mean(intake) 6753.636363636364 >> intake.std(ddof=1) 1142.1232221373727

Calculate the t statistic, setting the ddof parameter to the unbiased value so the divisor in the standard deviation will be degrees of freedom, N-1.

>> t = (np.mean(intake)-7725)/(intake.std(ddof=1)/np.sqrt(len(intake))) >> import matplotlib.pyplot as plt >> h = plt.hist(s, bins=100, normed=True)

For a one-sided t-test, how far out in the distribution does the t statistic appear?

>> np.sum(s<t) / float(len(s)) 0.0090699999999999999 #random

So the p-value is about 0.009, which says the null hypothesis has a probability of about 99% of being true.

dask.array.random.triangular(left, mode, right, size=None)

Draw samples from the triangular distribution.

The triangular distribution is a continuous probability distribution with lower limit left, peak at mode, and upper limit right. Unlike the other distributions, these parameters directly define the shape of the pdf.

Parameters: left : scalar Lower limit. mode : scalar The value where the peak of the distribution occurs. The value should fulfill the condition left <= mode <= right. right : scalar Upper limit, should be larger than left. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : ndarray or scalar The returned samples all lie in the interval [left, right].

Notes

The probability density function for the triangular distribution is

$\begin{split}P(x;l, m, r) = \begin{cases} \frac{2(x-l)}{(r-l)(m-l)}& \text{for l \leq x \leq m},\\ \frac{2(r-x)}{(r-l)(r-m)}& \text{for m \leq x \leq r},\\ 0& \text{otherwise}. \end{cases}\end{split}$

The triangular distribution is often used in ill-defined problems where the underlying distribution is not known, but some knowledge of the limits and mode exists. Often it is used in simulations.

References

 [R160] Wikipedia, “Triangular distribution” http://en.wikipedia.org/wiki/Triangular_distribution

Examples

Draw values from the distribution and plot the histogram:

>> import matplotlib.pyplot as plt >> h = plt.hist(np.random.triangular(-3, 0, 8, 100000), bins=200, .. normed=True) >> plt.show()

dask.array.random.uniform(low=0.0, high=1.0, size=None)

Draw samples from a uniform distribution.

Samples are uniformly distributed over the half-open interval [low, high) (includes low, but excludes high). In other words, any value within the given interval is equally likely to be drawn by uniform.

Parameters: low : float, optional Lower boundary of the output interval. All values generated will be greater than or equal to low. The default value is 0. high : float Upper boundary of the output interval. All values generated will be less than high. The default value is 1.0. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. out : ndarray Drawn samples, with shape size.

randint
Discrete uniform distribution, yielding integers.
random_integers
Discrete uniform distribution over the closed interval [low, high].
random_sample
Floats uniformly distributed over [0, 1).
random
Alias for random_sample.
rand
Convenience function that accepts dimensions as input, e.g., rand(2,2) would generate a 2-by-2 array of floats, uniformly distributed over [0, 1).

Notes

The probability density function of the uniform distribution is

$p(x) = \frac{1}{b - a}$

anywhere within the interval [a, b), and zero elsewhere.

Examples

Draw samples from the distribution:

>> s = np.random.uniform(-1,0,1000)

All values are within the given interval:

>> np.all(s >= -1) True >> np.all(s < 0) True

Display the histogram of the samples, along with the probability density function:

>> import matplotlib.pyplot as plt >> count, bins, ignored = plt.hist(s, 15, normed=True) >> plt.plot(bins, np.ones_like(bins), linewidth=2, color=’r’) >> plt.show()

dask.array.random.vonmises(mu, kappa, size=None)

Draw samples from a von Mises distribution.

Samples are drawn from a von Mises distribution with specified mode (mu) and dispersion (kappa), on the interval [-pi, pi].

The von Mises distribution (also known as the circular normal distribution) is a continuous probability distribution on the unit circle. It may be thought of as the circular analogue of the normal distribution.

Parameters: mu : float Mode (“center”) of the distribution. kappa : float Dispersion of the distribution, has to be >=0. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : scalar or ndarray The returned samples, which are in the interval [-pi, pi].

scipy.stats.distributions.vonmises
probability density function, distribution, or cumulative density function, etc.

Notes

The probability density for the von Mises distribution is

$p(x) = \frac{e^{\kappa cos(x-\mu)}}{2\pi I_0(\kappa)},$

where $$\mu$$ is the mode and $$\kappa$$ the dispersion, and $$I_0(\kappa)$$ is the modified Bessel function of order 0.

The von Mises is named for Richard Edler von Mises, who was born in Austria-Hungary, in what is now the Ukraine. He fled to the United States in 1939 and became a professor at Harvard. He worked in probability theory, aerodynamics, fluid mechanics, and philosophy of science.

References

 [R161] Abramowitz, M. and Stegun, I. A. (Eds.). “Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th printing,” New York: Dover, 1972.
 [R162] von Mises, R., “Mathematical Theory of Probability and Statistics”, New York: Academic Press, 1964.

Examples

Draw samples from the distribution:

>> mu, kappa = 0.0, 4.0 # mean and dispersion >> s = np.random.vonmises(mu, kappa, 1000)

Display the histogram of the samples, along with the probability density function:

>> import matplotlib.pyplot as plt >> from scipy.special import i0 >> plt.hist(s, 50, normed=True) >> x = np.linspace(-np.pi, np.pi, num=51) >> y = np.exp(kappa*np.cos(x-mu))/(2*np.pi*i0(kappa)) >> plt.plot(x, y, linewidth=2, color=’r’) >> plt.show()

dask.array.random.wald(mean, scale, size=None)

Draw samples from a Wald, or inverse Gaussian, distribution.

As the scale approaches infinity, the distribution becomes more like a Gaussian. Some references claim that the Wald is an inverse Gaussian with mean equal to 1, but this is by no means universal.

The inverse Gaussian distribution was first studied in relationship to Brownian motion. In 1956 M.C.K. Tweedie used the name inverse Gaussian because there is an inverse relationship between the time to cover a unit distance and distance covered in unit time.

Parameters: mean : scalar Distribution mean, should be > 0. scale : scalar Scale parameter, should be >= 0. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : ndarray or scalar Drawn sample, all greater than zero.

Notes

The probability density function for the Wald distribution is

$P(x;mean,scale) = \sqrt{\frac{scale}{2\pi x^3}}e^ \frac{-scale(x-mean)^2}{2\cdotp mean^2x}$

As noted above the inverse Gaussian distribution first arise from attempts to model Brownian motion. It is also a competitor to the Weibull for use in reliability modeling and modeling stock returns and interest rate processes.

References

 [R163] Brighton Webs Ltd., Wald Distribution, http://www.brighton-webs.co.uk/distributions/wald.asp
 [R164] Chhikara, Raj S., and Folks, J. Leroy, “The Inverse Gaussian Distribution: Theory : Methodology, and Applications”, CRC Press, 1988.
 [R165] Wikipedia, “Wald distribution” http://en.wikipedia.org/wiki/Wald_distribution

Examples

Draw values from the distribution and plot the histogram:

>> import matplotlib.pyplot as plt >> h = plt.hist(np.random.wald(3, 2, 100000), bins=200, normed=True) >> plt.show()

dask.array.random.weibull(a, size=None)

Draw samples from a Weibull distribution.

Draw samples from a 1-parameter Weibull distribution with the given shape parameter a.

$X = (-ln(U))^{1/a}$

Here, U is drawn from the uniform distribution over (0,1].

The more common 2-parameter Weibull, including a scale parameter $$\lambda$$ is just $$X = \lambda(-ln(U))^{1/a}$$.

Parameters: a : float Shape of the distribution. size : int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned. samples : ndarray

scipy.stats.distributions.weibull_max, scipy.stats.distributions.weibull_min, scipy.stats.distributions.genextreme, gumbel

Notes

The Weibull (or Type III asymptotic extreme value distribution for smallest values, SEV Type III, or Rosin-Rammler distribution) is one of a class of Generalized Extreme Value (GEV) distributions used in modeling extreme value problems. This class includes the Gumbel and Frechet distributions.

The probability density for the Weibull distribution is

$p(x) = \frac{a} {\lambda}(\frac{x}{\lambda})^{a-1}e^{-(x/\lambda)^a},$

where $$a$$ is the shape and $$\lambda$$ the scale.

The function has its peak (the mode) at $$\lambda(\frac{a-1}{a})^{1/a}$$.

When a = 1, the Weibull distribution reduces to the exponential distribution.

References

 [R166] Waloddi Weibull, Royal Technical University, Stockholm, 1939 “A Statistical Theory Of The Strength Of Materials”, Ingeniorsvetenskapsakademiens Handlingar Nr 151, 1939, Generalstabens Litografiska Anstalts Forlag, Stockholm.
 [R167] Waloddi Weibull, “A Statistical Distribution Function of Wide Applicability”, Journal Of Applied Mechanics ASME Paper 1951.
 [R168] Wikipedia, “Weibull distribution”, http://en.wikipedia.org/wiki/Weibull_distribution

Examples

Draw samples from the distribution:

>> a = 5. # shape >> s = np.random.weibull(a, 1000)

Display the histogram of the samples, along with the probability density function:

>> import matplotlib.pyplot as plt >> x = np.arange(1,100.)/50. >> def weib(x,n,a): .. return (a / n) * (x / n)**(a - 1) * np.exp(-(x / n)**a)

>> count, bins, ignored = plt.hist(np.random.weibull(5.,1000)) >> x = np.arange(1,100.)/50. >> scale = count.max()/weib(x, 1., 5.).max() >> plt.plot(x, weib(x, 1., 5.)*scale) >> plt.show()

dask.array.random.zipf(a, size=None)

Standard distributions

dask.array.core.map_blocks(func, *args, **kwargs)

Map a function across all blocks of a dask array.

Parameters: func : callable Function to apply to every block in the array. args : dask arrays or constants dtype : np.dtype, optional The dtype of the output array. It is recommended to provide this. If not provided, will be inferred by applying the function to a small set of fake data. chunks : tuple, optional Chunk shape of resulting blocks if the function does not preserve shape. If not provided, the resulting array is assumed to have the same block structure as the first input array. drop_axis : number or iterable, optional Dimensions lost by the function. new_axis : number or iterable, optional New dimensions created by the function. name : string, optional The key name to use for the array. If not provided, will be determined by a hash of the arguments. **kwargs : Other keyword arguments to pass to function. Values must be constants (not dask.arrays)

Examples

>>> import dask.array as da
>>> x = da.arange(6, chunks=3)

>>> x.map_blocks(lambda x: x * 2).compute()
array([ 0,  2,  4,  6,  8, 10])


The da.map_blocks function can also accept multiple arrays.

>>> d = da.arange(5, chunks=2)
>>> e = da.arange(5, chunks=2)

>>> f = map_blocks(lambda a, b: a + b**2, d, e)
>>> f.compute()
array([ 0,  2,  6, 12, 20])


If the function changes shape of the blocks then you must provide chunks explicitly.

>>> y = x.map_blocks(lambda x: x[::2], chunks=((2, 2),))


You have a bit of freedom in specifying chunks. If all of the output chunk sizes are the same, you can provide just that chunk size as a single tuple.

>>> a = da.arange(18, chunks=(6,))
>>> b = a.map_blocks(lambda x: x[:3], chunks=(3,))


If the function changes the dimension of the blocks you must specify the created or destroyed dimensions.

>>> b = a.map_blocks(lambda x: x[None, :, None], chunks=(1, 6, 1),
...                  new_axis=[0, 2])


Map_blocks aligns blocks by block positions without regard to shape. In the following example we have two arrays with the same number of blocks but with different shape and chunk sizes.

>>> x = da.arange(1000, chunks=(100,))
>>> y = da.arange(100, chunks=(10,))


The relevant attribute to match is numblocks.

>>> x.numblocks
(10,)
>>> y.numblocks
(10,)


If these match (up to broadcasting rules) then we can map arbitrary functions across blocks

>>> def func(a, b):
...     return np.array([a.max(), b.max()])

>>> da.map_blocks(func, x, y, chunks=(2,), dtype='i8')

>>> _.compute()
array([ 99,   9, 199,  19, 299,  29, 399,  39, 499,  49, 599,  59, 699,
69, 799,  79, 899,  89, 999,  99])


Your block function can learn where in the array it is if it supports a block_id keyword argument. This will receive entries like (2, 0, 1), the position of the block in the dask array.

>>> def func(block, block_id=None):
...     pass


You may specify the name of the resulting task in the graph with the optional name keyword argument.

>>> y = x.map_blocks(lambda x: x + 1, name='increment')

dask.array.core.atop(func, out_ind, *args, **kwargs)

Tensor operation: Generalized inner and outer products

A broad class of blocked algorithms and patterns can be specified with a concise multi-index notation. The atop function applies an in-memory function across multiple blocks of multiple inputs in a variety of ways. Many dask.array operations are special cases of atop including elementwise, broadcasting, reductions, tensordot, and transpose.

Parameters: func : callable Function to apply to individual tuples of blocks out_ind : iterable Block pattern of the output, something like ‘ijk’ or (1, 2, 3) *args : sequence of Array, index pairs Sequence like (x, ‘ij’, y, ‘jk’, z, ‘i’) **kwargs : dict Extra keyword arguments to pass to function dtype : np.dtype Datatype of resulting array. concatenate : bool, keyword only If true concatenate arrays along dummy indices, else provide lists adjust_chunks : dict Dictionary mapping index to function to be applied to chunk sizes new_axes : dict, keyword only New indexes and their dimension lengths

top, contains

Examples

2D embarrassingly parallel operation from two arrays, x, and y.

>>> z = atop(operator.add, 'ij', x, 'ij', y, 'ij', dtype='f8')  # z = x + y


Outer product multiplying x by y, two 1-d vectors

>>> z = atop(operator.mul, 'ij', x, 'i', y, 'j', dtype='f8')


z = x.T

>>> z = atop(np.transpose, 'ji', x, 'ij', dtype=x.dtype)


The transpose case above is illustrative because it does same transposition both on each in-memory block by calling np.transpose and on the order of the blocks themselves, by switching the order of the index ij -> ji.

We can compose these same patterns with more variables and more complex in-memory functions

z = X + Y.T

>>> z = atop(lambda x, y: x + y.T, 'ij', x, 'ij', y, 'ji', dtype='f8')


Any index, like i missing from the output index is interpreted as a contraction (note that this differs from Einstein convention; repeated indices do not imply contraction.) In the case of a contraction the passed function should expect an iterable of blocks on any array that holds that index. To receive arrays concatenated along contracted dimensions instead pass concatenate=True.

Inner product multiplying x by y, two 1-d vectors

>>> def sequence_dot(x_blocks, y_blocks):
...     result = 0
...     for x, y in zip(x_blocks, y_blocks):
...         result += x.dot(y)
...     return result

>>> z = atop(sequence_dot, '', x, 'i', y, 'i', dtype='f8')


Add new single-chunk dimensions with the new_axes= keyword, including the length of the new dimension. New dimensions will always be in a single chunk.

>>> def f(x):
...     return x[:, None] * np.ones((1, 5))

>>> z = atop(f, 'az', x, 'a', new_axes={'z': 5}, dtype=x.dtype)


If the applied function changes the size of each chunk you can specify this with a adjust_chunks={...} dictionary holding a function for each index that modifies the dimension size in that index.

>>> def double(x):
...     return np.concatenate([x, x])

>>> y = atop(double, 'ij', x, 'ij',
...          adjust_chunks={'i': lambda n: 2 * n}, dtype=x.dtype)

dask.array.core.top(func, output, out_indices, *arrind_pairs, **kwargs)

Tensor operation

Applies a function, func, across blocks from many different input dasks. We arrange the pattern with which those blocks interact with sets of matching indices. E.g.:

top(func, 'z', 'i', 'x', 'i', 'y', 'i')


yield an embarrassingly parallel communication pattern and is read as

$$z_i = func(x_i, y_i)$$

More complex patterns may emerge, including multiple indices:

top(func, 'z', 'ij', 'x', 'ij', 'y', 'ji')

$$z_{ij} = func(x_{ij}, y_{ji})$$


Indices missing in the output but present in the inputs results in many inputs being sent to one function (see examples).

Examples

Simple embarrassing map operation

>>> inc = lambda x: x + 1
>>> top(inc, 'z', 'ij', 'x', 'ij', numblocks={'x': (2, 2)})
{('z', 0, 0): (inc, ('x', 0, 0)),
('z', 0, 1): (inc, ('x', 0, 1)),
('z', 1, 0): (inc, ('x', 1, 0)),
('z', 1, 1): (inc, ('x', 1, 1))}


Simple operation on two datasets

>>> add = lambda x, y: x + y
>>> top(add, 'z', 'ij', 'x', 'ij', 'y', 'ij', numblocks={'x': (2, 2),
...                                                      'y': (2, 2)})
{('z', 0, 0): (add, ('x', 0, 0), ('y', 0, 0)),
('z', 0, 1): (add, ('x', 0, 1), ('y', 0, 1)),
('z', 1, 0): (add, ('x', 1, 0), ('y', 1, 0)),
('z', 1, 1): (add, ('x', 1, 1), ('y', 1, 1))}


Operation that flips one of the datasets

>>> addT = lambda x, y: x + y.T  # Transpose each chunk
>>> #                                        z_ij ~ x_ij y_ji
>>> #               ..         ..         .. notice swap
>>> top(addT, 'z', 'ij', 'x', 'ij', 'y', 'ji', numblocks={'x': (2, 2),
...                                                       'y': (2, 2)})
{('z', 0, 0): (add, ('x', 0, 0), ('y', 0, 0)),
('z', 0, 1): (add, ('x', 0, 1), ('y', 1, 0)),
('z', 1, 0): (add, ('x', 1, 0), ('y', 0, 1)),
('z', 1, 1): (add, ('x', 1, 1), ('y', 1, 1))}


Dot product with contraction over j index. Yields list arguments

>>> top(dotmany, 'z', 'ik', 'x', 'ij', 'y', 'jk', numblocks={'x': (2, 2),
...                                                          'y': (2, 2)})
{('z', 0, 0): (dotmany, [('x', 0, 0), ('x', 0, 1)],
[('y', 0, 0), ('y', 1, 0)]),
('z', 0, 1): (dotmany, [('x', 0, 0), ('x', 0, 1)],
[('y', 0, 1), ('y', 1, 1)]),
('z', 1, 0): (dotmany, [('x', 1, 0), ('x', 1, 1)],
[('y', 0, 0), ('y', 1, 0)]),
('z', 1, 1): (dotmany, [('x', 1, 0), ('x', 1, 1)],
[('y', 0, 1), ('y', 1, 1)])}


Pass concatenate=True to concatenate arrays ahead of time

>>> top(f, 'z', 'i', 'x', 'ij', 'y', 'ij', concatenate=True,
...     numblocks={'x': (2, 2), 'y': (2, 2,)})
{('z', 0): (f, (concatenate_axes, [('x', 0, 0), ('x', 0, 1)], (1,)),
(concatenate_axes, [('y', 0, 0), ('y', 0, 1)], (1,)))
('z', 1): (f, (concatenate_axes, [('x', 1, 0), ('x', 1, 1)], (1,)),
(concatenate_axes, [('y', 1, 0), ('y', 1, 1)], (1,)))}


>>> top(add, 'z', 'ij', 'x', 'ij', 'y', 'ij', numblocks={'x': (1, 2),
...                                                      'y': (2, 2)})
{('z', 0, 0): (add, ('x', 0, 0), ('y', 0, 0)),
('z', 0, 1): (add, ('x', 0, 1), ('y', 0, 1)),
('z', 1, 0): (add, ('x', 0, 0), ('y', 1, 0)),
('z', 1, 1): (add, ('x', 0, 1), ('y', 1, 1))}


Support keyword arguments with apply

>>> def f(a, b=0): return a + b
>>> top(f, 'z', 'i', 'x', 'i', numblocks={'x': (2,)}, b=10)
{('z', 0): (apply, f, [('x', 0)], {'b': 10}),
('z', 1): (apply, f, [('x', 1)], {'b': 10})}


## Array Methods¶

class dask.array.Array(dask, name, chunks, dtype, shape=None)

A parallel nd-array comprised of many numpy arrays arranged in a grid.

This constructor is for advanced uses only. For normal use see the da.from_array function.

Parameters: dask : dict Task dependency graph name : string Name of array in dask shape : tuple of ints Shape of the entire array chunks: iterable of tuples block sizes along each dimension
all(axis=None, keepdims=False, split_every=None)

Test whether all array elements along a given axis evaluate to True.

Parameters: a : array_like Input array or object that can be converted to an array. axis : None or int or tuple of ints, optional Axis or axes along which a logical AND reduction is performed. The default (axis = None) is to perform a logical AND over all the dimensions of the input array. axis may be negative, in which case it counts from the last to the first axis. New in version 1.7.0. If this is a tuple of ints, a reduction is performed on multiple axes, instead of a single axis or all the axes as before. out : ndarray, optional Alternate output array in which to place the result. It must have the same shape as the expected output and its type is preserved (e.g., if dtype(out) is float, the result will consist of 0.0’s and 1.0’s). See doc.ufuncs (Section “Output arguments”) for more details. keepdims : bool, optional If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr. all : ndarray, bool A new boolean or array is returned unless out is specified, in which case a reference to out is returned.

ndarray.all
equivalent method
any
Test whether any element along a given axis evaluates to True.

Notes

Not a Number (NaN), positive infinity and negative infinity evaluate to True because these are not equal to zero.

Examples

>>> np.all([[True,False],[True,True]])
False

>>> np.all([[True,False],[True,True]], axis=0)
array([ True, False], dtype=bool)

>>> np.all([-1, 4, 5])
True

>>> np.all([1.0, np.nan])
True

>>> o=np.array([False])
>>> z=np.all([-1, 4, 5], out=o)
>>> id(z), id(o), z
(28293632, 28293632, array([ True], dtype=bool))

any(axis=None, keepdims=False, split_every=None)

Test whether any array element along a given axis evaluates to True.

Returns single boolean unless axis is not None

Parameters: a : array_like Input array or object that can be converted to an array. axis : None or int or tuple of ints, optional Axis or axes along which a logical OR reduction is performed. The default (axis = None) is to perform a logical OR over all the dimensions of the input array. axis may be negative, in which case it counts from the last to the first axis. New in version 1.7.0. If this is a tuple of ints, a reduction is performed on multiple axes, instead of a single axis or all the axes as before. out : ndarray, optional Alternate output array in which to place the result. It must have the same shape as the expected output and its type is preserved (e.g., if it is of type float, then it will remain so, returning 1.0 for True and 0.0 for False, regardless of the type of a). See doc.ufuncs (Section “Output arguments”) for details. keepdims : bool, optional If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original arr. any : bool or ndarray A new boolean or ndarray is returned unless out is specified, in which case a reference to out is returned.

ndarray.any
equivalent method
all
Test whether all elements along a given axis evaluate to True.

Notes

Not a Number (NaN), positive infinity and negative infinity evaluate to True because these are not equal to zero.

Examples

>>> np.any([[True, False], [True, True]])
True

>>> np.any([[True, False], [False, False]], axis=0)
array([ True, False], dtype=bool)

>>> np.any([-1, 0, 5])
True

>>> np.any(np.nan)
True

>>> o=np.array([False])
>>> z=np.any([-1, 4, 5], out=o)
>>> z, o
(array([ True], dtype=bool), array([ True], dtype=bool))
>>> # Check now that z is a reference to o
>>> z is o
True
>>> id(z), id(o) # identity of z and o
(191614240, 191614240)

argmax(axis=None, split_every=None)

Returns the indices of the maximum values along an axis.

Parameters: a : array_like Input array. axis : int, optional By default, the index is into the flattened array, otherwise along the specified axis. out : array, optional If provided, the result will be inserted into this array. It should be of the appropriate shape and dtype. index_array : ndarray of ints Array of indices into the array. It has the same shape as a.shape with the dimension along axis removed.

ndarray.argmax, argmin

amax
The maximum value along a given axis.
unravel_index
Convert a flat index into an index tuple.

Notes

In case of multiple occurrences of the maximum values, the indices corresponding to the first occurrence are returned.

Examples

>>> a = np.arange(6).reshape(2,3)
>>> a
array([[0, 1, 2],
[3, 4, 5]])
>>> np.argmax(a)
5
>>> np.argmax(a, axis=0)
array([1, 1, 1])
>>> np.argmax(a, axis=1)
array([2, 2])

>>> b = np.arange(6)
>>> b[1] = 5
>>> b
array([0, 5, 2, 3, 4, 5])
>>> np.argmax(b) # Only the first occurrence is returned.
1

argmin(axis=None, split_every=None)

Returns the indices of the minimum values along an axis.

Parameters: a : array_like Input array. axis : int, optional By default, the index is into the flattened array, otherwise along the specified axis. out : array, optional If provided, the result will be inserted into this array. It should be of the appropriate shape and dtype. index_array : ndarray of ints Array of indices into the array. It has the same shape as a.shape with the dimension along axis removed.

ndarray.argmin, argmax

amin
The minimum value along a given axis.
unravel_index
Convert a flat index into an index tuple.

Notes

In case of multiple occurrences of the minimum values, the indices corresponding to the first occurrence are returned.

Examples

>>> a = np.arange(6).reshape(2,3)
>>> a
array([[0, 1, 2],
[3, 4, 5]])
>>> np.argmin(a)
0
>>> np.argmin(a, axis=0)
array([0, 0, 0])
>>> np.argmin(a, axis=1)
array([0, 0])

>>> b = np.arange(6)
>>> b[4] = 0
>>> b
array([0, 1, 2, 3, 0, 5])
>>> np.argmin(b) # Only the first occurrence is returned.
0

astype(dtype, **kwargs)

Copy of the array, cast to a specified type.

Parameters: dtype : str or dtype Typecode or data-type to which the array is cast. casting : {‘no’, ‘equiv’, ‘safe’, ‘same_kind’, ‘unsafe’}, optional Controls what kind of data casting may occur. Defaults to ‘unsafe’ for backwards compatibility. ‘no’ means the data types should not be cast at all. ‘equiv’ means only byte-order changes are allowed. ‘safe’ means only casts which can preserve values are allowed. ‘same_kind’ means only safe casts or casts within a kind, like float64 to float32, are allowed. ‘unsafe’ means any data conversions may be done. copy : bool, optional By default, astype always returns a newly allocated array. If this is set to False and the dtype requirement is satisfied, the input array is returned instead of a copy.
cache(store=None, **kwargs)

Evaluate and cache array

Parameters: store: MutableMapping or ndarray-like Place to put computed and cached chunks kwargs: Keyword arguments to pass on to get function for scheduling

Examples

This triggers evaluation and store the result in either

1. An ndarray object supporting setitem (see da.store)
2. A MutableMapping like a dict or chest

It then returns a new dask array that points to this store. This returns a semantically equivalent dask array.

>>> import dask.array as da
>>> x = da.arange(5, chunks=2)
>>> y = 2*x + 1
>>> z = y.cache()  # triggers computation

>>> y.compute()  # Does entire computation
array([1, 3, 5, 7, 9])

>>> z.compute()  # Just pulls from store
array([1, 3, 5, 7, 9])


You might base a cache off of an array like a numpy array or h5py.Dataset.

>>> cache = np.empty(5, dtype=x.dtype)
>>> z = y.cache(store=cache)
>>> cache
array([1, 3, 5, 7, 9])


Or one might use a MutableMapping like a dict or chest

>>> cache = dict()
>>> z = y.cache(store=cache)
>>> cache
{('x', 0): array([1, 3]),
('x', 1): array([5, 7]),
('x', 2): array([9])}

clip(min=None, max=None)

Clip (limit) the values in an array.

Given an interval, values outside the interval are clipped to the interval edges. For example, if an interval of [0, 1] is specified, values smaller than 0 become 0, and values larger than 1 become 1.

Parameters: a : array_like Array containing elements to clip. a_min : scalar or array_like Minimum value. a_max : scalar or array_like Maximum value. If a_min or a_max are array_like, then they will be broadcasted to the shape of a. out : ndarray, optional The results will be placed in this array. It may be the input array for in-place clipping. out must be of the right shape to hold the output. Its type is preserved. clipped_array : ndarray An array with the elements of a, but where values < a_min are replaced with a_min, and those > a_max with a_max.

numpy.doc.ufuncs
Section “Output arguments”

Examples

>>> a = np.arange(10)
>>> np.clip(a, 1, 8)
array([1, 1, 2, 3, 4, 5, 6, 7, 8, 8])
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.clip(a, 3, 6, out=a)
array([3, 3, 3, 3, 4, 5, 6, 6, 6, 6])
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.clip(a, [3,4,1,1,1,4,4,4,4,4], 8)
array([3, 4, 2, 3, 4, 5, 6, 7, 8, 8])

copy()

Copy array. This is a no-op for dask.arrays, which are immutable

cumprod(axis, dtype=None)

See da.cumprod for docstring

cumsum(axis, dtype=None)

See da.cumsum for docstring

dot(a, b, out=None)

Dot product of two arrays.

For 2-D arrays it is equivalent to matrix multiplication, and for 1-D arrays to inner product of vectors (without complex conjugation). For N dimensions it is a sum product over the last axis of a and the second-to-last of b:

dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])

Parameters: a : array_like First argument. b : array_like Second argument. out : ndarray, optional Output argument. This must have the exact kind that would be returned if it was not used. In particular, it must have the right type, must be C-contiguous, and its dtype must be the dtype that would be returned for dot(a,b). This is a performance feature. Therefore, if these conditions are not met, an exception is raised, instead of attempting to be flexible. output : ndarray Returns the dot product of a and b. If a and b are both scalars or both 1-D arrays then a scalar is returned; otherwise an array is returned. If out is given, then it is returned. ValueError If the last dimension of a is not the same size as the second-to-last dimension of b.

vdot
Complex-conjugating dot product.
tensordot
Sum products over arbitrary axes.
einsum
Einstein summation convention.
matmul
‘@’ operator as method with out parameter.

Examples

>>> np.dot(3, 4)
12


Neither argument is complex-conjugated:

>>> np.dot([2j, 3j], [2j, 3j])
(-13+0j)


For 2-D arrays it is the matrix product:

>>> a = [[1, 0], [0, 1]]
>>> b = [[4, 1], [2, 2]]
>>> np.dot(a, b)
array([[4, 1],
[2, 2]])

>>> a = np.arange(3*4*5*6).reshape((3,4,5,6))
>>> b = np.arange(3*4*5*6)[::-1].reshape((5,4,6,3))
>>> np.dot(a, b)[2,3,2,1,2,2]
499128
>>> sum(a[2,3,2,:] * b[1,2,:,2])
499128

flatten()

Return a contiguous flattened array.

A 1-D array, containing the elements of the input, is returned. A copy is made only if needed.

As of NumPy 1.10, the returned array will have the same type as the input array. (for example, a masked array will be returned for a masked array input)

Parameters: a : array_like Input array. The elements in a are read in the order specified by order, and packed as a 1-D array. order : {‘C’,’F’, ‘A’, ‘K’}, optional The elements of a are read using this index order. ‘C’ means to index the elements in row-major, C-style order, with the last axis index changing fastest, back to the first axis index changing slowest. ‘F’ means to index the elements in column-major, Fortran-style order, with the first index changing fastest, and the last index changing slowest. Note that the ‘C’ and ‘F’ options take no account of the memory layout of the underlying array, and only refer to the order of axis indexing. ‘A’ means to read the elements in Fortran-like index order if a is Fortran contiguous in memory, C-like order otherwise. ‘K’ means to read the elements in the order they occur in memory, except for reversing the data when strides are negative. By default, ‘C’ index order is used. y : array_like If a is a matrix, y is a 1-D ndarray, otherwise y is an array of the same subtype as a. The shape of the returned array is (a.size,). Matrices are special cased for backward compatibility.

ndarray.flat
1-D iterator over an array.
ndarray.flatten
1-D array copy of the elements of an array in row-major order.
ndarray.reshape
Change the shape of an array without changing its data.

Notes

In row-major, C-style order, in two dimensions, the row index varies the slowest, and the column index the quickest. This can be generalized to multiple dimensions, where row-major order implies that the index along the first axis varies slowest, and the index along the last quickest. The opposite holds for column-major, Fortran-style index ordering.

When a view is desired in as many cases as possible, arr.reshape(-1) may be preferable.

Examples

It is equivalent to reshape(-1, order=order).

>>> x = np.array([[1, 2, 3], [4, 5, 6]])
>>> print(np.ravel(x))
[1 2 3 4 5 6]

>>> print(x.reshape(-1))
[1 2 3 4 5 6]

>>> print(np.ravel(x, order='F'))
[1 4 2 5 3 6]


When order is ‘A’, it will preserve the array’s ‘C’ or ‘F’ ordering:

>>> print(np.ravel(x.T))
[1 4 2 5 3 6]
>>> print(np.ravel(x.T, order='A'))
[1 2 3 4 5 6]


When order is ‘K’, it will preserve orderings that are neither ‘C’ nor ‘F’, but won’t reverse axes:

>>> a = np.arange(3)[::-1]; a
array([2, 1, 0])
>>> a.ravel(order='C')
array([2, 1, 0])
>>> a.ravel(order='K')
array([2, 1, 0])

>>> a = np.arange(12).reshape(2,3,2).swapaxes(1,2); a
array([[[ 0,  2,  4],
[ 1,  3,  5]],
[[ 6,  8, 10],
[ 7,  9, 11]]])
>>> a.ravel(order='C')
array([ 0,  2,  4,  1,  3,  5,  6,  8, 10,  7,  9, 11])
>>> a.ravel(order='K')
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

itemsize

Length of one array element in bytes

map_blocks(func, *args, **kwargs)

Map a function across all blocks of a dask array.

Parameters: func : callable Function to apply to every block in the array. args : dask arrays or constants dtype : np.dtype, optional The dtype of the output array. It is recommended to provide this. If not provided, will be inferred by applying the function to a small set of fake data. chunks : tuple, optional Chunk shape of resulting blocks if the function does not preserve shape. If not provided, the resulting array is assumed to have the same block structure as the first input array. drop_axis : number or iterable, optional Dimensions lost by the function. new_axis : number or iterable, optional New dimensions created by the function. name : string, optional The key name to use for the array. If not provided, will be determined by a hash of the arguments. **kwargs : Other keyword arguments to pass to function. Values must be constants (not dask.arrays)

Examples

>>> import dask.array as da
>>> x = da.arange(6, chunks=3)

>>> x.map_blocks(lambda x: x * 2).compute()
array([ 0,  2,  4,  6,  8, 10])


The da.map_blocks function can also accept multiple arrays.

>>> d = da.arange(5, chunks=2)
>>> e = da.arange(5, chunks=2)

>>> f = map_blocks(lambda a, b: a + b**2, d, e)
>>> f.compute()
array([ 0,  2,  6, 12, 20])


If the function changes shape of the blocks then you must provide chunks explicitly.

>>> y = x.map_blocks(lambda x: x[::2], chunks=((2, 2),))


You have a bit of freedom in specifying chunks. If all of the output chunk sizes are the same, you can provide just that chunk size as a single tuple.

>>> a = da.arange(18, chunks=(6,))
>>> b = a.map_blocks(lambda x: x[:3], chunks=(3,))


If the function changes the dimension of the blocks you must specify the created or destroyed dimensions.

>>> b = a.map_blocks(lambda x: x[None, :, None], chunks=(1, 6, 1),
...                  new_axis=[0, 2])


Map_blocks aligns blocks by block positions without regard to shape. In the following example we have two arrays with the same number of blocks but with different shape and chunk sizes.

>>> x = da.arange(1000, chunks=(100,))
>>> y = da`