dask.dataframe.to_orc

dask.dataframe.to_orc(df, path, engine='pyarrow', write_index=True, storage_options=None, compute=True, compute_kwargs=None)

Store a Dask DataFrame to ORC files

Parameters
df : dask.dataframe.DataFrame
path : string or pathlib.Path

Destination directory for data. Prepend with protocol like s3:// or hdfs:// for remote data.

engine : 'pyarrow' or ORCEngine

Backend ORC engine to use for I/O. Default is “pyarrow”.

write_index : boolean, default True

Whether to write the index.

storage_options : dict, default None

Key/value pairs to be passed on to the file-system backend, if any.

compute : bool, default True

If True (the default), the result is computed immediately. If False, a dask.delayed object is returned for later computation; see the sketch after this parameter list.

compute_kwargs : dict, default None

Options to be passed on to the compute method.
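
For illustration, a minimal sketch of a deferred write using compute=False; the toy frame and the local output directory ./orc-out are hypothetical.

>>> import pandas as pd
>>> import dask.dataframe as dd
>>> pdf = pd.DataFrame({"x": range(10)})
>>> df = dd.from_pandas(pdf, npartitions=2)
>>> delayed_write = df.to_orc('./orc-out', compute=False)  # no files written yet
>>> delayed_write.compute()  # triggers the actual write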

See also

read_orc

Read ORC data into a Dask DataFrame

Notes

Each partition will be written to a separate file.
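
Because the number of output files tracks the number of partitions, repartitioning before the write is a simple way to control the file count. A minimal sketch, assuming a hypothetical target of four files:

>>> df = df.repartition(npartitions=4)  # four partitions -> four ORC files
>>> df.to_orc('./orc-out')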

Examples

>>> df = dd.read_csv(...)  
>>> df.to_orc('/path/to/output/', ...)
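
A fuller, self-contained round trip, assuming hypothetical paths data/*.csv and ./orc-out:

>>> import dask.dataframe as dd
>>> df = dd.read_csv('data/*.csv')             # load the hypothetical input
>>> df.to_orc('./orc-out', write_index=False)  # one ORC file per partition
>>> df2 = dd.read_orc('./orc-out')             # read the ORC files back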