tidypolars 0.10.0

@etiennebacher released this 31 Aug 13:06

tidypolars requires polars >= 0.19.1.

Breaking changes and deprecations

  • describe() is deprecated as of tidypolars 0.10.0 and will be removed in a
    future update. Use summary() with the same arguments instead (#127).

  • describe_plan() and describe_optimized_plan() are deprecated as of
    tidypolars 0.10.0 and will be removed in a future update. Use explain() with
    optimized = TRUE/FALSE instead (#128).

  • In sink_parquet() and sink_csv(), all arguments except for .data and
    path must be named (#136).
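As a rough migration sketch for the two deprecations above: the replacement calls (summary() and explain() with optimized) come from these notes, while the mtcars data and the small lazy query are only illustrative assumptions.

```r
library(dplyr)
library(tidypolars)

# Illustrative data only: mtcars ships with R.
df <- polars::pl$DataFrame(mtcars)

# Previously: describe(df)
stats <- summary(df)

# A small lazy query, used only to have a plan to inspect.
lf <- df$lazy() |>
  filter(cyl == 4) |>
  select(mpg, cyl)

# Previously: describe_plan(lf)
explain(lf, optimized = FALSE)

# Previously: describe_optimized_plan(lf)
explain(lf, optimized = TRUE)
```

Existing code that passes extra arguments to describe() should be able to pass the same arguments to summary().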

New features

  • Add support for more functions:

    • from package base: substr().

  • Better error message when a function can come from several packages but only
    one version is translated (#130).

  • row_number() now works without argument (#131).

  • New functions to import data as Polars DataFrames and LazyFrames (#136):

    • read_<format>_polars() to import data as a Polars DataFrame;
    • scan_<format>_polars() to import data as a Polars LazyFrame;
    • <format> can be "csv", "ipc", "json", "parquet".

    Those can replace functions from polars. For example,
    polars::pl$read_parquet(...) can be replaced by
    read_parquet_polars(...).

  • New functions to write Polars DataFrames to external files:
    write_<format>_polars() where <format> can be "csv", "ipc", "json",
    "ndjson", "parquet" (#136).

  • New function sink_ipc() that is similar to sink_parquet() and sink_csv()
    but for IPC files (#136).

  • across() now throws a better error message when the user passes an external
    list to .fns. This works with dplyr but cannot work with tidypolars
    (#135).

  • Added support for argument .add in group_by().
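The new import and export helpers listed above can be combined into a round trip. This is a minimal sketch that assumes the Parquet variants and a temporary file; the data and the path are illustrative, and only the positional data and file arguments are used.

```r
library(tidypolars)

# Illustrative data and a throwaway file path.
df <- polars::pl$DataFrame(mtcars)
path <- tempfile(fileext = ".parquet")

# Write a Polars DataFrame to disk.
write_parquet_polars(df, path)

# Read it back eagerly as a Polars DataFrame
# (in place of polars::pl$read_parquet(path)).
eager <- read_parquet_polars(path)

# Or scan it lazily as a Polars LazyFrame; nothing is loaded
# until the query is collected.
lazy <- scan_parquet_polars(path)
collected <- lazy$collect()
```

The other formats listed above should follow the same naming pattern, for example read_csv_polars() and scan_csv_polars().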

Bug fixes

  • stringr::str_sub() now works when both start and end are negative.

  • Fixed a bug in str_sub() when start was greater than 1.

  • stringr::str_starts() and stringr::str_ends() now work with a regex.

  • fill() doesn't error anymore when ... is empty. Instead, it returns the
    input data.

  • unite() now provides a proper error message when col is missing.

  • unite() doesn't error anymore when ... is empty. Instead, it uses all
    variables in the dataset.

  • filter(), mutate() and summarize() now work when using a column from
    another data.frame, e.g.

    my_polars_df |> 
      filter(x %in% some_data_frame$y)

  • replace_na() no longer converts the column to the data type of the
    replacement. For example, data |> replace_na("a") now errors if the input
    data is numeric.

  • n_distinct() now correctly applies the na.rm argument when several columns
    are passed as input (#137).
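A few of the fixes above can be checked with a short sketch. The column names, values and lookup table here are hypothetical and only serve as an illustration.

```r
library(dplyr)
library(stringr)
library(tidypolars)

# Hypothetical data for illustration.
df <- polars::pl$DataFrame(x = c("apple", "banana"))

res <- df |>
  mutate(
    # Both start and end are negative: keep the last three characters.
    last3 = str_sub(x, -3, -1),
    # The pattern is a regex rather than a fixed string.
    starts_vowel = str_starts(x, "[aeiou]")
  )

# Filtering on a column taken from a regular data.frame.
lookup <- data.frame(y = c("banana", "cherry"))
kept <- df |>
  filter(x %in% lookup$y)
```

Converting the results with as.data.frame() makes it easy to compare them against the same pipeline run with dplyr on a regular data.frame.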