Python Polars 0.20.20

@github-actions github-actions released this 13 Apr 17:56
8547d86

🚀 Performance improvements

  • Fix cross join batch size when one of the DataFrames is tiny (#14347)
  • Fix binview growable complexity O(n*m) -> O(n) (#15628)
  • Remove extra thread spawn from row group fetcher (#15626)
  • Use vertical parallelism if input is chunked for Filter, Select, WithColumns (#15608)
  • Refactor CSV serialization to not go through AnyValue (#15576)
  • Don't use dynamic dispatch in visitors (#15607)
  • Improve Bitmap construction performance (#15570)
  • Join by row-encoding (#15559)

✨ Enhancements

  • Add Expr.dt.add_business_days and Series.dt.add_business_days (#15595)
  • Add str.head and str.tail (#14425)
  • Add union/or operator for pl.Enum (#14965)
  • Extended BytecodeParser to handle additional math functions, and imports from the global namespace (#15627)
  • Push down is_between expressions to Arrow (#15180)
  • Add holidays argument to business_day_count (#15580)
  • Change default to write parquet statistics (#15597)
  • Expressify to_integer (#15604)
  • Optimizer; remove double SORT and redundant projections (#15573)
  • Add null_on_oob parameter to expr.array.get (#15426)
  • Support weekend argument in business_day_count (#15544)
  • Enable is_first/last_distinct for non-nested, non-numeric lists (#15552)
  • Turn off cse if cache node found (#15554)
  • Tag concat list as elementwise (#15545)

🐞 Bug fixes

  • Return appropriate data type for time mean and median (#14471)
  • Fix issue in write_excel that could lead to incorrect spanning range determination (#15631)
  • Output correct dtype for mean_horizontal on a single column (#15118)
  • Recompute RowIndex schema after projection pushdown (#15625)
  • Mean of boolean in streaming group_by incorrectly always gave NULL (#15616)
  • Include cloud creds in cache key (#15609)
  • Fix elementwise-apply if any input is AggregatedScalar (#15606)
  • Explode list should take validity into account (#15572)
  • Use larger recursive stack in debug mode (#15593)
  • SQL interface "off-by-one" indexing error with GROUP BY clauses that use position ordinals (#15584)
  • Enable missing features in polars-time (#15558)
  • Handle quoted identifiers when registering CTEs in the SQL engine (#15564)
  • Decompress moved out of schema initialization (#15550)
  • Turn off cse if cache node found (#15554)

📖 Documentation

  • Add legacy CPU install instructions in user guide (#13676)
  • Examples for errors (#13724)
  • Add docstring examples for reading json (#14481)
  • Add security warning in LazyFrame.deserialize() docstring (#15282)
  • Various minor updates to User Guide's SQL intro section (#15557)

🛠️ Other improvements

  • Replace most deprecated calls with bounded version (#15632)
  • Use bound API (#15630)
  • Initial PyO3 0.21 support (#15622)
  • Don't run streaming group-by in partitionable gb (#15611)
  • Unify sort with SortOptions and SortMultipleOptions (#15590)
  • Set up CodSpeed (#15537)

Thank you to all our contributors for making this release possible!
@CanglongCl, @ChayimFriedman2, @Fokko, @JamesCE2001, @MarcoGorelli, @NedJWestern, @TrevorWinstral, @alexander-beedie, @deanm0000, @douglas-raillard-arm, @eitsupi, @filabrazilska, @i-aki-y, @itamarst, @leoforney, @mcrumiller, @nameexhaustion, @orlp, @ozgrakkurt, @reswqa, @ritchie46 and @stinodego