dask.dataframe.DataFrame.query
- DataFrame.query(expr, **kwargs)
Filter the dataframe with a query expression.
Blocked version of pd.DataFrame.query: the expression is evaluated on each partition.
- Parameters:
- expr: str
The query string to evaluate. You can refer to column names that are not valid Python variable names by surrounding them in backticks. Dask does not fully support referring to variables using the ‘@’ character; use f-strings or the local_dict keyword argument instead.
See also
Examples
>>> import pandas as pd
>>> import dask.dataframe as dd
>>> df = pd.DataFrame({'x': [1, 2, 1, 2],
...                    'y': [1, 2, 3, 4],
...                    'z z': [4, 3, 2, 1]})
>>> ddf = dd.from_pandas(df, npartitions=2)
Refer to column names directly:
>>> ddf.query('y > x').compute()
   x  y  z z
2  1  3    2
3  2  4    1
Refer to column name using backticks:
>>> ddf.query('`z z` > x').compute()
   x  y  z z
0  1  1    4
1  2  2    3
2  1  3    2
Refer to variable name using f-strings:
>>> value = 1
>>> ddf.query(f'x == {value}').compute()
   x  y  z z
0  1  1    4
2  1  3    2
Refer to variable name using local_dict:
>>> ddf.query('x == @value', local_dict={"value": value}).compute()
   x  y  z z
0  1  1    4
2  1  3    2