dask.array.stats.skew

dask.array.stats.skew

dask.array.stats.skew(a, axis=0, bias=True, nan_policy='propagate')[source]

This docstring was copied from scipy.stats.skew.

Some inconsistencies with the Dask version may exist.

Compute the sample skewness of a data set.

For normally distributed data, the skewness should be about zero. For unimodal continuous distributions, a skewness value greater than zero means that there is more weight in the right tail of the distribution. The function skewtest can be used to determine if the skewness value is close enough to zero, statistically speaking.

Parameters
andarray

Input array.

axisint or None, default: 0

If an int, the axis of the input along which to compute the statistic. The statistic of each axis-slice (e.g. row) of the input will appear in a corresponding element of the output. If None, the input will be raveled before computing the statistic.

biasbool, optional

If False, then the calculations are corrected for statistical bias.

nan_policy{‘propagate’, ‘omit’, ‘raise’}

Defines how to handle input NaNs.

  • propagate: if a NaN is present in the axis slice (e.g. row) along which the statistic is computed, the corresponding entry of the output will be NaN.

  • omit: NaNs will be omitted when performing the calculation. If insufficient data remains in the axis slice along which the statistic is computed, the corresponding entry of the output will be NaN.

  • raise: if a NaN is present, a ValueError will be raised.

keepdimsbool, default: False (Not supported in Dask)

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.

Returns
skewnessndarray

The skewness of values along an axis, returning NaN where all values are equal.

Notes

The sample skewness is computed as the Fisher-Pearson coefficient of skewness, i.e.

\[g_1=\frac{m_3}{m_2^{3/2}}\]

where

\[m_i=\frac{1}{N}\sum_{n=1}^N(x[n]-\bar{x})^i\]

is the biased sample \(i\texttt{th}\) central moment, and \(\bar{x}\) is the sample mean. If bias is False, the calculations are corrected for bias and the value computed is the adjusted Fisher-Pearson standardized moment coefficient, i.e.

\[G_1=\frac{k_3}{k_2^{3/2}}= \frac{\sqrt{N(N-1)}}{N-2}\frac{m_3}{m_2^{3/2}}.\]

Beginning in SciPy 1.9, np.matrix inputs (not recommended for new code) are converted to np.ndarray before the calculation is performed. In this case, the output will be a scalar or np.ndarray of appropriate shape rather than a 2D np.matrix. Similarly, while masked elements of masked arrays are ignored, the output will be a scalar or np.ndarray rather than a masked array with mask=False.

References

1

Zwillinger, D. and Kokoska, S. (2000). CRC Standard Probability and Statistics Tables and Formulae. Chapman & Hall: New York. 2000. Section 2.2.24.1

Examples

>>> from scipy.stats import skew  
>>> skew([1, 2, 3, 4, 5])  
0.0
>>> skew([2, 8, 0, 4, 1, 9, 9, 0])  
0.2650554122698573