
Releases: sparklyr/sparklyr

sparklyr 1.8.6

01 May 17:19
286a7d7


  • Addresses issues with R 4.4.0. The root cause was that version-checking
    functions changed how they work.

    • package_version() no longer accepts numeric_version() output. sparklyr now
      wraps package_version() to coerce the argument when it is of class
      numeric_version
    • Comparison operators (<, >=, etc.) for packageVersion() no longer accept
      numeric values. The fix was to pass the version as a character string
  • Adds support for Databricks "autoloader" (format: cloudFiles) for streaming
    ingestion of files via stream_read_cloudfiles() (@zacdav-db #3432). Also adds:

    • stream_write_table()
    • stream_read_table()
  • Made changes to stream_write_generic (@zacdav-db #3432):

    • The toTable method does not allow calling start; added a to_table parameter
      that adjusts the logic accordingly
    • The path option is not propagated when to_table is TRUE
  • Upgrades to Roxygen version 7.3.1
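
For context, a minimal sketch of the R 4.4.0 behavior the fixes above address (the version strings are illustrative):

```r
# packageVersion() comparisons must now use character versions;
# comparing against a bare numeric errors in R 4.4.0:
# packageVersion("sparklyr") >= 1.8   # no longer accepted
packageVersion("sparklyr") >= "1.8"   # pass a character string instead

# package_version() no longer accepts numeric_version() output,
# so the argument is coerced first:
v <- numeric_version("1.8.6")
package_version(as.character(v))
```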

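A hypothetical sketch of the new Auto Loader reader, assuming it follows the argument pattern of sparklyr's other stream_read_*() functions; the path, table name, and exact signature are illustrative and may differ:

```r
library(sparklyr)

sc <- spark_connect(method = "databricks")

# Incrementally ingest files from cloud storage using
# Databricks Auto Loader (format: cloudFiles)
events <- stream_read_cloudfiles(sc, path = "/mnt/landing/events/")

# Persist the stream into a table with the new writer
stream_write_table(events, "events_bronze")
```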
sparklyr 1.8.5

26 Mar 12:02

Fixes

  • Fixes quoting issue with dbplyr 2.5.0 (#3429)

  • Fixes Windows OS identification (#3426)

Package improvements

  • Removes dependency on tibble, all calls are now redirected to dplyr (#3399)

  • Removes dependency on rappdirs (#3401):

    • Backwards compatibility with sparklyr 0.5 is no longer needed
    • Replicates selection of cache directory
  • Converts spark_apply() to a method (#3418)

Spark improvements

  • Spark 2.3 is no longer considered maintained as of September 2019

    • Removes Java folder for versions 2.3 and below
    • Merges Scala file sets into Spark version 2.4
    • Re-compiles JARs for version 2.4 and above
  • Updates Delta-to-Spark version matching when using delta as one of the
    packages when connecting (#3414)

sparklyr 1.8.4

30 Oct 15:17
9fe4405


Compatibility with new dbplyr version

  • Fixes db_connection_describe() S3 consistency error (@t-kalinowski)

  • Addresses a new dbplyr error raised when accessing components of a remote
    tbl using $

  • Bumps the version of dbplyr to switch between the two methods to create
    temporary tables

  • Addresses translate_sql()'s new hard requirement of a con object. Done by
    passing the current connection or simulate_hive()
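
For illustration, the requirement can be met with dbplyr's built-in connection simulator when no live connection is available:

```r
library(dbplyr)

# translate_sql() now hard-requires a con object;
# simulate_hive() stands in for a real Hive connection.
translate_sql(mean(x, na.rm = TRUE), con = simulate_hive())
```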

Fixes

  • Small fix to spark_connect_method() arguments. Removes 'hadoop_version'

  • Improvements to handling pysparklyr load (@t-kalinowski)

  • Fixes 'subscript out of bounds' issue found by pysparklyr (@t-kalinowski)

  • Updates available Spark download links

Improvements

  • Removes dependency on the following packages:

    • digest
    • base64enc
    • ellipsis
  • Converts ml_fit() into an S3 method for pysparklyr compatibility

Test improvements

  • Improvements and fixes to tests (@t-kalinowski)

  • Fixes test jobs that should have included Arrow but did not

  • Updates to the Spark versions to be tested

  • Re-adds tests for development dbplyr

sparklyr 1.8.3

05 Sep 13:12


Improvements

  • Spark error messages are now cached instead of being displayed in full as an
    R error. The full relay used to overwhelm the interactive session's console
    or notebook because of the number of lines returned by Spark. By default,
    sparklyr now returns only the top of the Spark error message, which is
    typically the most relevant part. The full error can still be accessed using
    a new function called spark_last_error()

  • Reduces redundancy on several tests

  • Handles SQL quoting when the table reference contains multiple levels. The
    most common case is when a table name is passed using in_catalog() or
    in_schema().
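
A short sketch of the new error workflow (the connection and table reference are illustrative):

```r
library(sparklyr)
library(dplyr)
library(dbplyr)

sc <- spark_connect(master = "local")

# A failing command now surfaces only the top of the Spark error...
try(collect(tbl(sc, in_catalog("main", "sales", "orders"))))

# ...while the complete message remains available on demand:
spark_last_error()
```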

Java

  • Adds Scala scripts to handle changes in the upcoming version of Spark (3.5)
  • Adds new JAR file to handle Spark 3.0 to 3.4
  • Adds new JAR file to handle Spark 3.5 and above

Fixes

  • Prevents an error when na.rm = TRUE is explicitly set within pmax() and
    pmin(). These functions will now also purposely fail if na.rm is set to
    FALSE. In base R their default is na.rm = FALSE, but ever since these
    translations were released they have emitted no warning or error about it.
    For now, we will keep that behavior until a better approach can be figured
    out. (#3353)

  • spark_install() will now properly match when a partial version is passed to
    the function. Previously, passing '2.3' would match '3.2.3' instead of
    '2.3.x' (#3370)
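
The pmax()/pmin() change can be sketched as follows (the data and column names are illustrative):

```r
library(sparklyr)
library(dplyr)

sc  <- spark_connect(master = "local")
sdf <- copy_to(sc, data.frame(a = c(1, NA), b = c(2, 3)))

# Explicitly setting na.rm = TRUE no longer errors:
sdf %>% mutate(mx = pmax(a, b, na.rm = TRUE))

# na.rm = FALSE now fails on purpose, since the translation has
# always behaved as if na.rm were TRUE.
```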

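For example, partial version matching now resolves as intended:

```r
library(sparklyr)

# Resolves to the latest 2.3.x release rather than 3.2.3:
spark_install(version = "2.3")
```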
Package integration

  • Adds functionality to allow other packages to provide sparklyr additional
    back-ends. This effort is mainly focused on adding the ability to integrate
    with Spark Connect and Databricks Connect through a new package.

  • New exported functions to integrate with the RStudio IDE. They all have the
    same spark_ide_ prefix

  • Modifies several read functions to become exported methods, such as
    sdf_read_column().

  • Adds the spark_integ_test_skip() function, which allows other packages to
    use sparklyr's test suite. It gives the external package a way to indicate
    whether a given test should run or be skipped.

  • If installed, sparklyr will load the pysparklyr package

sparklyr 1.8.2

01 Jul 18:44

New Features

  • Adds Azure Synapse Analytics connectivity (@Bob-Chou, #3336)

  • Adds support for "parameterized" queries now available in Spark 3.4 (@gregleleu #3335)

  • Adds new DBI methods: dbValid and dbDisconnect (@alibell, #3296)

  • Adds overwrite parameter to dbWriteTable() (@alibell, #3296)

  • Adds database parameter to dbListTables() (@alibell, #3296)

  • Adds the ability to turn off predicate support (where(), across()) using
    options("sparklyr.support.predicates" = FALSE). Defaults to TRUE. Turning it
    off should accelerate dplyr commands because sparklyr won't need to process
    column types for every single piped command
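
For example, disabling the option (the option name is taken from the note above):

```r
# Turn off where()/across() predicate support so sparklyr skips
# re-reading column types on every piped dplyr command:
options("sparklyr.support.predicates" = FALSE)
```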

Fixes

  • Fixes Spark download locations (#3331)

  • Fix various rlang deprecation warnings (@mgirlich, #3333).

Misc

  • Switches upper version of Spark to 3.4, and updates JARs (#3334)

sparklyr 1.8.1

22 Mar 14:45
38f8bcf

Bug Fixes

  • Fixes consistency issues with dplyr's sample_n(), slice(), op_vars(), and sample_frac()

Internal functionality

  • Adds R-devel to GHA testing

sparklyr 1.8.0

21 Mar 19:07

Bug Fixes

  • Addresses Warning from CRAN checks

  • Addresses option(stringsAsFactors) usage

  • Fixes root cause of issue processing pivot wider and distinct (#3317 & #3320)

  • Updates local Spark download sources

sparklyr 1.7.8

16 Aug 20:45
2cc7e04

New features

  • Adds new metric extraction functions: ml_metrics_binary(),
    ml_metrics_regression() and ml_metrics_multiclass(). They work similarly to
    yardstick's metric extraction functions: they expect a table with the
    predictions and actual values, and return a concise tibble with the
    metrics. (#3281)

  • Adds new spark_insert_table() function. This allows one to insert data into
    an existing table definition without redefining the table, even when overwriting
    the existing data. (#3272 @jimhester)
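
A sketch of the yardstick-style metrics workflow; the truth argument name follows yardstick conventions and the exact signature may differ:

```r
library(sparklyr)

sc <- spark_connect(master = "local")
mtcars_tbl <- copy_to(sc, mtcars)

# Fit a model and score the data, then extract metrics
model <- ml_linear_regression(mtcars_tbl, mpg ~ wt + cyl)
preds <- ml_predict(model, mtcars_tbl)

# Returns a concise tibble of regression metrics
ml_metrics_regression(preds, truth = mpg)
```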

Bug Fixes

  • Restores "validator" functions to regression models. Removing them in a previous
    version broke ml_cross_validator() for regression models. (#3273)

Spark

  • Adds support to Spark 3.3 local installation. This includes the ability to
    enable and setup log4j version 2. (#3269)

  • Updates the JSON file that sparklyr uses to find and download Spark for
    local use. Worth noting: starting with Spark 3.3, the Hadoop version number
    in the download link no longer includes a minor version. So, instead of
    requesting 3.2, the version to request is 3.
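
For example, installing the newly supported version locally (note the major-only Hadoop version):

```r
library(sparklyr)

# Starting with Spark 3.3, request Hadoop "3" rather than "3.2":
spark_install(version = "3.3", hadoop_version = "3")
```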

Internal functionality

  • Removes workaround for older versions of arrow. Bumps arrow version
    dependency, from 0.14.0 to 0.17.0 (#3283 @nealrichardson)

  • Removes code related to backwards compatibility with dbplyr. sparklyr
    requires dbplyr version 2.2.1 or above, so the code is no longer needed.
    (#3277)

  • Begins centralizing ML parameter validation into a single function that will
    run the proper cast function for each Spark parameter. It also starts using
    S3 methods, instead of searching for a concatenated function name, to find the
    proper parameter validator. Regression models are the first ones to use this
    new method. (#3279)

  • sparklyr compilation routines have been improved and simplified.
    spark_compile() now provides more informative output when used. Tests for
    compilation were also added, along with a step to install Scala in the
    corresponding GHAs so that the new JAR build tests are able to run.
    (#3275)

  • Stops using package environment variables directly. Any package-level
    variable is now handled by a genv-prefixed function to set and retrieve
    values. This avoids the risk of having the exact same variable initialized
    in more than one R script. (#3274)

  • Adds more tests to improve coverage.

Misc

  • Addresses new CRAN HTML check NOTEs. It also adds a new GHA action to run the
    same checks to make sure we avoid new issues with this in the future.

sparklyr 1.7.6

27 May 14:17
  • Ensures compatibility with Spark version 3.2 (#3261)
  • Compatibility with new dbplyr version (@mgirlich)
  • Removes stringr dependency
  • Fixes augment() when the model was fitted via parsnip (#3233)

sparklyr 1.7.5

03 Feb 15:35

Misc

  • Addresses both CRAN Check Results warnings:
    • Un-exported object rlang::is_env()
    • pivot_wider() S3 consistency issue