dask_expr._collection.DataFrame.memory_usage
dask_expr._collection.DataFrame.memory_usage¶
- DataFrame.memory_usage(deep=False, index=True)[source]¶
Return the memory usage of each column in bytes.
This docstring was copied from pandas.core.frame.DataFrame.memory_usage.
Some inconsistencies with the Dask version may exist.
The memory usage can optionally include the contribution of the index and elements of object dtype.
This value is displayed in DataFrame.info by default. This can be suppressed by setting
pandas.options.display.memory_usage
to False.- Parameters
- indexbool, default True
Specifies whether to include the memory usage of the DataFrame’s index in returned Series. If
index=True
, the memory usage of the index is the first item in the output.- deepbool, default False
If True, introspect the data deeply by interrogating object dtypes for system-level memory consumption, and include it in the returned values.
- Returns
- Series
A Series whose index is the original column names and whose values is the memory usage of each column in bytes.
See also
numpy.ndarray.nbytes
Total bytes consumed by the elements of an ndarray.
Series.memory_usage
Bytes consumed by a Series.
Categorical
Memory-efficient array for string values with many repeated values.
DataFrame.info
Concise summary of a DataFrame.
Notes
See the Frequently Asked Questions for more details.
Examples
>>> dtypes = ['int64', 'float64', 'complex128', 'object', 'bool'] >>> data = dict([(t, np.ones(shape=5000, dtype=int).astype(t)) ... for t in dtypes]) >>> df = pd.DataFrame(data) >>> df.head() int64 float64 complex128 object bool 0 1 1.0 1.0+0.0j 1 True 1 1 1.0 1.0+0.0j 1 True 2 1 1.0 1.0+0.0j 1 True 3 1 1.0 1.0+0.0j 1 True 4 1 1.0 1.0+0.0j 1 True
>>> df.memory_usage() Index 128 int64 40000 float64 40000 complex128 80000 object 40000 bool 5000 dtype: int64
>>> df.memory_usage(index=False) int64 40000 float64 40000 complex128 80000 object 40000 bool 5000 dtype: int64
The memory footprint of object dtype columns is ignored by default:
>>> df.memory_usage(deep=True) Index 128 int64 40000 float64 40000 complex128 80000 object 180000 bool 5000 dtype: int64
Use a Categorical for efficient storage of an object-dtype column with many repeated values.
>>> df['object'].astype('category').memory_usage(deep=True) 5244