8.0.0

ibis-project-bot released this 05 Feb 19:31

8.0.0 (2024-02-05)

⚠ BREAKING CHANGES

  • backends: Columns with Ibis date types are now returned as object dtype containing datetime.date objects when executing with the pandas backend.
  • impala: Direct HDFS integration is removed, as is support for ingesting pandas DataFrames directly. The Impala backend still works with HDFS, but data in HDFS must be managed outside of ibis.
  • api: ibis.show_sql is removed; replace ibis.show_sql(expr) calls with print(ibis.to_sql(expr)) or, in Jupyter or IPython, with ibis.to_sql(expr)
  • bigquery: nullifzero is removed; use nullif(0) instead
  • bigquery: zeroifnull is removed; use fillna(0) instead
  • bigquery: list_databases is removed; use list_schemas instead
  • bigquery: the BigQuery current_database method now returns the data_project instead of the dataset_id. Use current_schema to retrieve the dataset_id. To explicitly list tables in a given project and dataset, you can use f"{con.current_database}.{con.current_schema}"
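
The BigQuery replacements above can be illustrated without a live connection. The following is a minimal sketch in plain Python that models the semantics only, treating None as SQL NULL. The helper functions nullif and fillna, the stand-in connection object, and its project and dataset values are hypothetical and are not the ibis API itself.

```python
from types import SimpleNamespace


def nullif(value, null_if_value):
    """Model of expr.nullif(x): NULL (None) when value equals x, else value."""
    return None if value == null_if_value else value


def fillna(value, fill_value):
    """Model of expr.fillna(x): x when value is NULL (None), else value."""
    return fill_value if value is None else value


column = [3, 0, None, 7]

# Former nullifzero behavior, now spelled nullif(0): zeros become NULL.
zeros_to_null = [nullif(v, 0) for v in column]

# Former zeroifnull behavior, now spelled fillna(0): NULLs become zero.
nulls_to_zero = [fillna(v, 0) for v in column]

# Hypothetical stand-in for a BigQuery connection after this release:
# current_database holds the data project, current_schema the dataset.
con = SimpleNamespace(current_database="my-project", current_schema="my_dataset")
qualified_prefix = f"{con.current_database}.{con.current_schema}"

print(zeros_to_null)
print(nulls_to_zero)
print(qualified_prefix)
```

With a real connection, only the f-string line carries over unchanged; the two helpers stand in for the nullif(0) and fillna(0) methods called on an ibis expression.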

Features

  • api: define RegexSplit operation and re_split API (07beaed)
  • api: support median and quantile on more types (#7810) (49c75a8)
  • clickhouse: implement RegexSplit (e3c507e)
  • datafusion: implement ops.RegexSplit using pyarrow UDF (37b6b7f)
  • datafusion: set ops (37abea9)
  • datatypes: add decimal and basic geospatial support to the sqlglot type parser/generator (59783b9)
  • datatypes: make intervals round trip through sqlglot type mapper (d22f97a)
  • duckdb-geospatial: add support for flipping coordinates (d47088b)
  • duckdb-geospatial: enable use of literals (23ad256)
  • duckdb: implement RegexSplit (229a1f4)
  • examples: add zones geojson example (#8040) (2d562b7), closes #7958
  • flink: add new temporal operators (dfef418)
  • flink: add primary key support (da04679)
  • flink: export result to pyarrow (9566263)
  • flink: implement array operators (#7951) (80e13b4)
  • flink: implement struct field, clean up literal, and adjust timecontext test markers (#7997) (2d5e108)
  • impala: rudimentary date support (d4bcf7b)
  • mssql: add hashbytes and test for binary output hash fns (#8107) (91f60cd), closes #8082
  • mssql: use odbc (f03ad0c)
  • polars: implement ops.RegexSplit using pyarrow UDF (a3bed10)
  • postgres: implement RegexSplit (c955b6a)
  • pyspark: implement RegexSplit (cfe0329)
  • risingwave: init impl for Risingwave (#7954) (351747a), closes #8038
  • snowflake: implement RegexSplit (2c1a726)
  • snowflake: implement insert method (2162e3f)
  • trino: implement RegexSplit (9d1295f)

Bug Fixes

  • api: deferred values are not truthy (00b3ece)
  • backends: ensure that returned date results are actually proper date values (0626fb2)
  • backends: preserve order_by position in window function when subsequent expressions are duplicated (#7943) (89056b9), closes #7940
  • common: do not convert callables to resolvable objects (9963705)
  • datafusion: work around lack of support for uppercase units in intervals (ebb6cde)
  • datatypes: ensure that array construction supports literals and infers their shape from its inputs (#8049) (899dce1), closes #8022
  • datatypes: fix bad references in to_numpy() (6fd4550)
  • deps: remove filelock from required dependencies (76dded5)
  • deps: update dependency black to v24 (425f7b1)
  • deps: update dependency datafusion to v34 (601f889)
  • deps: update dependency datafusion to v35 (#8224) (a34af25)
  • deps: update dependency oracledb to v2 (e7419ca)
  • deps: update dependency pyarrow to v15 (ef6a9bd)
  • deps: update dependency pyodbc to v5 (32044ea)
  • docs: surround executable code blocks with interactive mode on/off (4c660e0)
  • duckdb: allow table creation from expr with geospatial datatypes (#7818) (ecac322)
  • duckdb: ensure that casting to floating point values produces valid types in generated sql (424b206)
  • examples: use anonymous access when reading example data from GCS (8e5c0af)
  • impala: generate memtables using UNION ALL to work around sqlglot bug (399a5ef)
  • mutate/select: ensure that unsplatted dictionaries work in mutate and select APIs (#8014) (8ed19ea), closes #8013
  • mysql: catch PyMySQL OperationalError exception (#7919) (f2c2664), closes #6010 #7918
  • pandas: support non-string categorical columns (5de08c7)
  • polars: avoid using unnecessary subquery for schema inference (0f43667)
  • polars: handle integers coming out of high precision numpy datetime64 values (bcf36cb)
  • postgres: ensure that no timezone conversion takes place on timestamptz columns when selecting them out (7b79ec8)
  • repr: default to pa.binary for all geospatial dtypes (#7817) (066d3fc)
  • repr: force exception message to console in IPython in interactive mode (414c49a)
  • snowflake: insert into the correct object (5e1efe3)
  • sqlalchemy: properly handle aliases of extracted subqueries (38aaf8f)
  • sqlglot: stop using removed singletons for true, false, null (4fb0aad)

Documentation

  • add composable data ecosystem concept (#7898) (d78a887), closes #6618
  • add exasol to list of supported backends (4fae620)
  • add ibis.join() to docs (#7913) (de2e282), closes #7895
  • add image preview for index page (#7920) (ac2375a)
  • add post about move to Zulip chat (#7889) (88f1ee8), closes #7888
  • add quotes around install in 1brc post (#8065) (5998143)
  • add user testimonials page (#7897) (c0714f8), closes #7341
  • blog for the 1 billion row challenge (#8004) (141edea)
  • blog-post: replicate spatial dev guru blog (4b73c3b)
  • blog: redux array blog with equivalent duckdb and bq expressions (5bde8da)
  • blog: show how to install geospatial dependencies (951a169)
  • blog: update geospatial - no need to_array() (78434a0)
  • contrib: add pull request template (effd461)
  • deps: bump quarto version to pick up dashboard feature (79657db)
  • dev: update maintainers guide (d67409c)
  • document possible range of seed values to Table.sample (6a652ec)
  • duckdb: correct wording for empty path logic (72b2cde)
  • fix formatting for note on _name, _dtype (#7911) (e58be2e)
  • fix rolling date on bigquery/duckdb array blog (#8059) (fb09b78)
  • flink: add to the set of documented backends (83eab61)
  • flink: override default install instructions (4fc8e75)
  • geospatial: add examples for duckdb supported methods (#8128) (2a92306), closes #7959
  • geospatial: fix flaky ci geo-literals doctests (417e81d)
  • hyphenate "properly formatted" and add colon (5ab1c27)
  • ibis-analytics blog post (#7990) (17a1ef2)
  • improve UDF signature docs (#8194) (3cdc6ce)
  • include American spelling usage in style guide (#8163) (ac72157), closes #8162
  • kedro blog post link (#8150) (1ffe435)
  • meta: add goatcounter to header of all quarto pages (fd2e6c9)
  • minor edit to the who supports ibis doc (#7896) (d5a0779)
  • minor update to composable data ecosystem concept (a46bd4a)
  • pandas: fix format for kwarg warning callout (0f6d45d)
  • pyspark: document ibis.connect using a URL (d6049f8)
  • pyspark: mention using ibis.connect (33c855a)
  • random: document behavior of repeated use of ibis.random() instance (f4b67e5)
  • row_number always starts at 0 (#8209) (5a26c05)
  • security: add a security policy (33e9f26)
  • sql-tutorial: fix minor typo in union section of SQL user tutorial (ca6c2a5)
  • style: add style guide to contributing (#8092) (b807555), closes #7094
  • support-matrix: replace the backend info streamlit app with a static quarto dashboard (f9da637)
  • update quickstart to use rename (#8196) (9ed4e92)
  • update release date on Ibis geospatial dev guru post (175f141)
  • who supports Ibis (#7892) (1a5a420), closes #7743

Refactors

  • api: remove show_sql in favor of print(to_sql) (36da8c1)
  • bigquery: remove list_databases (22e5ada)
  • bigquery: remove nullifzero (8447b9a)
  • bigquery: remove zeroifnull (8be3c25)
  • bigquery: return data_project as database, not dataset_id (05608eb)
  • deps: make pins an optional dependency through an examples extra (#7878) (3d6c3f1), closes #7844
  • flink: expose raw_sql over _exec_sql (0b66b94)
  • impala: modernize the impala backend (252833d)

Deprecations

  • deprecate Value.least() and Value.greatest() (f711337)