Releases: pola-rs/polars

Python Polars 0.15.13

06 Jan 07:33
640fbe2

✨ Enhancements

  • Improve iterating over GroupBy (#6051)
  • much faster lazy type-checks (#6064)
  • support array-expansion of scalars on frame init from dict (#6034)
  • improve error message when writing nested data to… (#6040)

🐞 Bug fixes

  • bound complex type from 3.8 to 3.11 (#6071)
  • deal with unnest schema expansion in projection pd (#6063)
  • correct output dtype for cummin/cumsum/cummax (#6062)
  • block streaming on literal series/range (#6058)
  • improve handling of dict-type "columns" param on frame-init (#6045)
  • Fix typing for DataFrame.select (#6047)
  • ndjson struct inference (#6049)
  • fix stringcache. latest refactor introduced a hashing error (#6056)
  • allow mixed field order and availability in apply that r… (#6041)
  • deal with empty structs (#6039)
  • fix aggregation that filters out all data (#6036)
  • fix diff overflow (#6033)
  • keep column names in is_null/is_not_null (#6032)
  • keep name when sorting categorical in lexical order (#6029)
  • tweaked property/accessor behaviour (#6021)
  • properly set null anyvalue if categorical is neste… (#6025)
  • Fix from_epoch function signature (#6024)
  • Validate estimated_size parameter (#6018)

🛠️ Other improvements

  • suggest forward fill in cumsum/cummax (#6061)
  • Fix SIM105 issues. (#6042)
  • Remove trailing spaces in glimpse output (#6037)
  • Remove unnecessary noqa's (#6035)
  • Fix flake8-pytest-style errors in tests. (#6031)
  • update read_sql and row docstrings (#6028)
  • Enable the isort-style import autofix via ruff (#6020)
  • Update py-polars/Cargo.lock (#6013)
  • Refactor pivot tests (#6012)
  • Use ruff instead of isort, flake8 and pyupgrade (#5916)
  • Properly deprecate groupby.pivot (#6000)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @ghuls, @ritchie46, @stinodego and @universalmind303

Python Polars 0.15.11

03 Jan 07:59
35cf1c1

🚀 Performance improvements

  • ensure set_at_idx is O(1) (#5977)

✨ Enhancements

  • allow eq,ne,lt etc (#5995)
  • Improve Expr.is_between API (#5981)
  • large speedup for df.iterrows (~200-400%) (#5979)
  • updated default table format from "UTF8_FULL" to "UTF8_FULL_CONDENSED" (#5967)
  • Access rows as namedtuples (#5966)
  • Improve assert_frame_equal messages (#5962)

🐞 Bug fixes

  • make weekday tz-aware (#5989)
  • fix categorical in struct anyvalue issue (#5987)
  • fix invalid boolean simplification (#5976)
  • allow empty sort on any dtype (#5975)
  • properly deal with categoricals in streaming queries (#5974)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @ritchie46 and @stinodego

Python Polars 0.15.9

31 Dec 09:31
5895424

🚀 Performance improvements

  • improve reducing window function performance ~33% (#5878)

✨ Enhancements

  • str.strip with multiple chars (#5929)
  • add iterrows (#5945)
  • read decimal as f64 (#5938)
  • improve query plan scan formatting (#5937)
  • allow all null cast (#5933)
  • allow objects in struct types (#5925)
  • handle Series init from python sequence of numpy arrays (#5918)
  • merge sorted dataframes (#5817)
  • impl hex and base64 for binary (#5892)
  • Add datatype hierarchy (#5901)
  • Add .item() on DataFrame and Series (#5893)
  • make get_any_value fallible (#5877)
  • Add string representation for data types (#5861)
  • directly push all operator result into sink, prev… (#5856)

🐞 Bug fixes

  • don't panic on ignored context (#5958)
  • don't allow named expression in arr.eval (#5957)
  • error on invalid dtype (#5956)
  • fix panic in join expressions (#5954)
  • block ordered predicates before explode (#5951)
  • adhere to schema in arr.eval of empty list (#5947)
  • fix from_dict schema_inference=0 (#5948)
  • fix arrow nested null conversion (#5946)
  • allow None in arr.slice length (#5934)
  • fix time to duration cast (#5932)
  • error on addition with datetime/time (#5931)
  • don't create categoricals in streaming (#5926)
  • object filter should keep single chunk (#5913)
  • csv, read escaped "" as missing (#5912)
  • fix pivot of signed integers (#5909)
  • don't allow duplicate columns in read_csv arg (#5908)
  • fix latest oob in streaming conversion (#5902)
  • adapt k to len in topk (#5888)
  • fix lazy swapping rename (#5884)
  • fix window function with nullable values; regression due… (#5874)
  • improve equality consistency between types (#5873)
  • evaluate whole branch expression to determine if r… (#5864)
  • fix top_k on empty (#5865)
  • fix slice in streaming (#5854)
  • Fix type hint for IO *_options arguments (#5852)

🛠️ Other improvements

  • Fix docs for sink_parquet (#5952)
  • Fix misspelling in LazyFrame docstring (#5917)
  • add bin, series.is_sorted and merge_sorted (#5914)

Thank you to all our contributors for making this release possible!
@AnatolyBuga, @alexander-beedie, @cannero, @chitralverma, @dannyvankooten, @johngunerli, @ozgrakkurt, @ritchie46, @stinodego, @winding-lines and @zundertj

Rust Polars 0.26.0

22 Dec 11:55
295fd7a

⚠️ Breaking changes

  • remove Series::append_array (#5681)
  • iso weekday (#5598)

🚀 Performance improvements

  • improve reducing window function performance ~33% (#5878)
  • improve performance reducing window functions with numeric output ~-14% (#5841)
  • set_sorted flag when creating from literal (#5728)
  • use sorted fast path in streaming groupby (#5727)
  • ensure fast_explode propagates (#5676)
  • fix quadratic time complexity of groupby in stream… (#5614)
  • Aggregate projection pushdown (#5556)
  • improve streaming primitive groupby (#5575)
  • vectorize integer vec-hash by using very simple, … (#5572)
  • specialized utf8 groupby in streaming (#5535)

✨ Enhancements

  • make get_any_value fallible (#5877)
  • directly push all operator result into sink, prev… (#5856)
  • add sink_parquet (#5480)
  • Support parsing more float string representations. (#5824)
  • implement mean aggregation for duration (#5807)
  • implement sensible boolean aggregates (#5806)
  • allow expression as quantile input (#5751)
  • accept expression in str.extract_all (#5742)
  • tz-aware strptime (#5736)
  • Add "fmt_no_tty" feature for formatting support without r… (#5725)
  • lazy diagonal concat. (#5647)
  • to_struct add upper_bound (#5714)
  • inversely scale chunk_size with thread count in s… (#5699)
  • add streaming minmax (#5693)
  • improve dynamic inference of anyvalues and structs (#5690)
  • support is_in for boolean dtype (#5682)
  • add a cache to strptime (#5628)
  • add nearest interpolation strategy (#5626)
  • make cast recursive (#5596)
  • add arg_min/arg_max for series of dtype boolean (#5592)
  • prefer streaming groupby if partitionable (#5580)
  • make map_alias fallible (#5532)
  • pl.min & pl.max accept wildcard similar to pl.sum (#5511)
  • add predicate pushdown to anonymous_scan (#5467)
  • make streaming work with multiple sinks in a sing… (#5474)
  • add streaming slice operation (#5466)
  • run partial streaming queries (#5464)
  • streaming left joins (#5456)
  • file statistics so we only (try to) keep smallest table in memory (#5454)
  • streaming inner joins. (#5400)
  • build_info() provides detailed information on how polars was built (#5423)
  • add missing width property to LazyFrame (#5431)
  • allow regex and wildcard in groupby (#5425)
  • Streaming joins architecture and Cross join implementation. (#5339)
  • add support for am/pm notation in parse_dates read_csv (#5373)
  • add reduce/cumreduce expression as an easier fold (#5364)

🐞 Bug fixes

  • fix lazy swapping rename (#5884)
  • improve equality consistency between types (#5873)
  • evaluate whole branch expression to determine if r… (#5864)
  • fix top_k on empty (#5865)
  • fix slice in streaming (#5854)
  • correct invalid type in struct anyvalue access (#5844)
  • don't set fast_explode if null values in list (#5838)
  • duration formatting (#5837)
  • respect fetch in union (#5836)
  • keep f32 dtype in fill_null by int (#5834)
  • err on epoch on time dtype (#5831)
  • fix panic in hmean (#5808)
  • asof join by logical groups (#5805)
  • fix parquet regression upstream in arrow2 (#5797)
  • Fix lazy cumsum and cumprod result types (#5792)
  • fix nested writer (#5777)
  • fix(rust, python) Summation on empty series evaluates to Some(0) (#5773)
  • empty concat utf8 (#5768)
  • projection pushdown with union and asof join (#5763)
  • check null values in asof_join + groupby (#5756)
  • fix generic streaming groupby on logical types (#5752)
  • fix date_range on expressions (#5750)
  • fix dtypes in join_asof_by (#5746)
  • fix group order in binary aggregation (#5744)
  • implement min/max aggregation for utf8 in groupby (#5737)
  • fix all_null/sorted into_groups panic (#5733)
  • asof join 'by', 'forward' combination (#5720)
  • fix pivot on floating point indexes (#5704)
  • fix arange with column/literal input (#5703)
  • fix double projection that leads to uneven union d… (#5700)
  • Fix a bug in floating regex handling used in CSV type inference (#5695)
  • fix asof join schema (#5686)
  • fix owned arithmetic schema (#5685)
  • take glob into account in scan_csv 'with_schema_mo… (#5683)
  • fix boolean schema in agg_max/min (#5678)
  • fix boolean arg-max if all equal (#5680)
  • early error on duplicate names in streaming groupby (#5638)
  • fix streaming groupby aggregate types (#5636)
  • convert panic to err in concat_list (#5637)
  • fix dot diagram of single nodes (#5624)
  • fix dynamic struct inference (#5619)
  • keep dtype when eval on empty list (#5597)
  • fix ternary with list output on empty frame (#5595)
  • fix tz-awareness of truncate (#5591)
  • check chunks before doing chunked_id join optimiza… (#5589)
  • invert cast_time_zone conversion (#5587)
  • asof join ensure join column is not dropped when '… (#5585)
  • fix ub due to invalid dtype on splitting dfs (#5579)
  • fix(rust, python); fix projection pushdown in asof joins (#5542)
  • streaming hstack allow duplicates (#5538)
  • fix streaming empty join panic (#5534)
  • fix duplicate caches in cse and prevent quadratic … (#5528)
  • allow appending categoricals that are all null (#5526)
  • tz-aware strftime (#5525)
  • make 'truncate' tz-aware (#5522)
  • fix coalesce expression expansion (#5521)
  • fix nested aggregation in when then and window expr… (#5520)
  • fix sort_by expression if groups already aggregated (#5518)
  • fix bug in batched parquet reader that dropped dfs… (#5506)
  • fix bugs in skew and kurtosis (#5484)
  • compute correct offset for streaming join on multi… (#5479)
  • return error on invalid sortby expression (#5478)
  • add missing AnyValueBuffer specialisation for Duration dtype (#5436)
  • fix freeze/stall when writing more than 2^31 string values to parquet (#5366)
  • properly handle json with unclosed strings (#5427)
  • fix null poisoning in rank operation (#5417)
  • correct expr::diff dtype for temporal columns (#5416)
  • fix cse for nested caches (#5412)
  • don't set sorted flag in argsort (#5410)
  • explicit nan comparison in min/max agg (#5403)
  • Correct CSV row indexing (#5385)

🛠️ Other improvements

  • Update rustc and fix clippy (#5880)
  • update arrow (#5862)
  • move join dispatch to polars-ops (#5809)
  • Remove dbg statement from union (#5791)
  • Continue removing compilation warnings (#5778)
  • shrink anyvalue size (#5770)
  • update arrow (#5766)
  • chore(rust,python) Change allow_streaming to streaming (#5747)
  • remove rev-map from ChunkedArray (#5721)
  • simplify fast projection by schema (#5716)
  • Reindent df! docs code (#5698)
  • remove Series::append_array (#5681)
  • Remove unused symbols and unneeded mut qualifier (#5672)
  • Include license files in Rust crates (#5675)
  • Use NaiveTime::from_hms_opt instead of NaiveTime::from_hms (#5664)
  • use xxhash3 for string types (#5617)
  • iso weekday (#5598)
  • Improve contributing guide (#5558)
  • streaming improvements (#5541)
  • Refer to DataFrame::unique instead of distinct (#5482)
  • don't panic if part of query cannot run strea… (#5458)
  • make generic join builder more dry (#5439)
  • use IdHash for streaming groupby generic (#5435)
  • fix freeze/stall when writing more than 2^31 string values to parquet (#5366)

Thank you to all our contributors for making this release possible!
@AnatolyBuga, @CalOmnie, @Kuhlwein, @MarcoGorelli, @OneRaynyDay, @YuRiTan, @alexander-beedie, @andrewpollack, @ankane, @braaannigan, @chitralverma, @dannyvankooten, @ghais, @ghuls, @jjerphan, @matteosantama, @messense, @owrior, @pickfire, @ritchie46, @s1ck, @sa-, @slonik-az, @sorhawell, @stinodego, @universalmind303 and @zundertj

Python Polars 0.15.7

19 Dec 08:44
652623d

🚀 Performance improvements

  • improve performance reducing window functions with numeric output ~-14% (#5841)

✨ Enhancements

  • allow more pyarrow literals (#5842)
  • add sink_parquet (#5480)
  • release GIL when writing (#5830)
  • Support parsing more float string representations. (#5824)
  • implement mean aggregation for duration (#5807)
  • implement sensible boolean aggregates (#5806)

🐞 Bug fixes

  • correct invalid type in struct anyvalue access (#5844)
  • don't set fast_explode if null values in list (#5838)
  • duration formatting (#5837)
  • respect fetch in union (#5836)
  • keep f32 dtype in fill_null by int (#5834)
  • fix(python): fix delta issues (#5802)
  • err on epoch on time dtype (#5831)
  • fix panic in hmean (#5808)
  • asof join by logical groups (#5805)

🛠️ Other improvements

  • lazily import connectorx (#5835)

Thank you to all our contributors for making this release possible!
@chitralverma, @ghuls and @ritchie46

Python Polars 0.15.6

14 Dec 07:26
a60a59e

🐞 Bug fixes

  • fix struct dataset (#5798)
  • fix parquet regression upstream in arrow2 (#5797)

🛠️ Other improvements

  • remove unused cmake-rs patch (#5794)

Thank you to all our contributors for making this release possible!
@OneRaynyDay, @messense, @ritchie46 and @universalmind303

Python Polars 0.15.3

12 Dec 13:06
1f500a6

🚀 Performance improvements

  • set_sorted flag when creating from literal (#5728)
  • use sorted fast path in streaming groupby (#5727)

✨ Enhancements

  • push down predicates to pyarrow datasets (#5780)
  • Support for reading delta lake tables (#5761)
  • Add DataFrame.glimpse() (#5622)
  • allow expression as quantile input (#5751)
  • accept expression in str.extract_all (#5742)
  • tz-aware strptime (#5736)
  • lazy diagonal concat. (#5647)
  • to_struct add upper_bound (#5714)

🐞 Bug fixes

  • fix(rust, python) Summation on empty series evaluates to Some(0) (#5773)
  • empty concat utf8 (#5768)
  • projection pushdown with union and asof join (#5763)
  • check null values in asof_join + groupby (#5756)
  • fix generic streaming groupby on logical types (#5752)
  • fix date_range on expressions (#5750)
  • fix dtypes in join_asof_by (#5746)
  • fix group order in binary aggregation (#5744)
  • implement min/max aggregation for utf8 in groupby (#5737)
  • fix all_null/sorted into_groups panic (#5733)
  • address several edge-cases found when asserting NaN equality (#5732)
  • asof join 'by', 'forward' combination (#5720)

🛠️ Other improvements

  • add DataFrame.pearson_corr to reference (#5772)
  • Parse fixed timezone offsets without pytz (#5769)
  • chore(rust,python) Change allow_streaming to streaming (#5747)
  • Remove pyarrow nightlies requirement. (#5719)
  • fix incorrect accepted type in df.write_csv (#5715)

Thank you to all our contributors for making this release possible!
@AnatolyBuga, @MarcoGorelli, @alexander-beedie, @andrewpollack, @braaannigan, @chitralverma, @ghuls, @ritchie46, @sa- and @zundertj

Python Polars 0.15.2

02 Dec 19:03
3ff9c56

🚀 Performance improvements

  • ensure fast_explode propagates (#5676)

✨ Enhancements

  • Series.get_chunks (#5701)
  • inversely scale chunk_size with thread count in s… (#5699)
  • add streaming minmax (#5693)
  • Support large page sizes on aarch64 linux builds (#5694)
  • improve dynamic inference of anyvalues and structs (#5690)
  • support is_in for boolean dtype (#5682)
  • add notebook html repr for Series (#5653)

🐞 Bug fixes

  • fix pivot on floating point indexes (#5704)
  • fix arange with column/literal input (#5703)
  • fix double projection that leads to uneven union d… (#5700)
  • Fix Series -> Expr dispatch for @property methods (#5689)
  • fix asof join schema (#5686)
  • fix owned arithmetic schema (#5685)
  • take glob into account in scan_csv 'with_schema_mo… (#5683)
  • fix boolean schema in agg_max/min (#5678)
  • fix boolean arg-max if all equal (#5680)
  • respect python objects read method even if filename is f… (#5677)
  • Fix DataFrame.n_chunks return type (#5650)

🛠️ Other improvements

  • Parametrize test_parquet_datetime (#5696)
  • Function and lazy function docstrings (#5657)
  • Fix formatting (#5658)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @ankane, @braaannigan, @ghais, @ghuls, @jjerphan, @pickfire, @ritchie46, @stinodego and @zundertj

Python Polars 0.15.1

26 Nov 13:39
9eaf247

⚠️ Breaking changes

  • Update Expr.sample signature and change random seeding (#4648)
  • rollup breaking changes (#5602)
  • iso weekday (#5598)
  • Change null_equal default to True for Series.series_equal (#5051)

🚀 Performance improvements

  • fix quadratic time complexity of groupby in stream… (#5614)
  • Improve performance of indexing operations on Series. (#5610)
  • Aggregate projection pushdown (#5556)

✨ Enhancements

  • add a cache to strptime (#5628)
  • add nearest interpolation strategy (#5626)
  • Update Expr.sample signature and change random seeding (#4648)
  • Change null_equal default to True for Series.series_equal (#5051)
  • make cast recursive (#5596)
  • add arg_min/arg_max for series of dtype boolean (#5592)

🐞 Bug fixes

  • early error on duplicate names in streaming groupby (#5638)
  • fix streaming groupby aggregate types (#5636)
  • convert panic to err in concat_list (#5637)
  • fix dot diagram of single nodes (#5624)
  • fix dynamic struct inference (#5619)
  • tz-aware filtering (#5603)
  • keep dtype when eval on empty list (#5597)
  • fix ternary with list output on empty frame (#5595)
  • fix tz-awareness of truncate (#5591)
  • check chunks before doing chunked_id join optimiza… (#5589)
  • invert cast_time_zone conversion (#5587)
  • asof join ensure join column is not dropped when '… (#5585)

🛠️ Other improvements

  • Remaining docstring examples for frame and lazyframe (#5630)
  • use xxhash3 for string types (#5617)
  • only trigger build.rs file if that file itself has cha… (#5618)
  • iso weekday (#5598)
  • Merge release workflows (#5564)
  • Fix broken lint workflow (#5584)

Thank you to all our contributors for making this release possible!
@Kuhlwein, @braaannigan, @ghuls, @matteosantama, @ritchie46 and @stinodego

Python Polars 0.14.31

22 Nov 12:37
12c21e5

🚀 Performance improvements

  • improve streaming primitive groupby (#5575)
  • vectorize integer vec-hash by using very simple, … (#5572)

✨ Enhancements

  • prefer streaming groupby if partitionable (#5580)

🐞 Bug fixes

  • fix ub due to invalid dtype on splitting dfs (#5579)

🛠️ Other improvements

  • Remove old Python changelog file (#5577)
  • namespace registration docs update (#5565)
  • Improve contributing guide (#5558)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @ghuls, @ritchie46 and @stinodego