Releases: svenkreiss/pysparkling

v0.6.2

13 Nov 18:39
  • make dependencies optional: boto, requests
  • compatibility
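
A common way to make a dependency optional is to attempt the import and fall back to `None`, so that only the features needing it fail. The sketch below illustrates that general pattern; `optional_import` is a hypothetical helper and is not taken from pysparkling's source.

```python
import importlib


def optional_import(name):
    """Return the named module, or None if it is not installed.

    Hypothetical helper for illustration; not part of pysparkling.
    """
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Modules such as boto or requests would be looked up like this;
# callers check for None before using the feature that needs them.
boto = optional_import("boto")
requests = optional_import("requests")
```

Code that needs one of these modules can then raise a clear error at the point of use instead of failing at import time.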

v0.6.1

10 Jan 21:14
0815791

testing continuous deployment

v0.6.0

13 Jul 07:22
  • Broadcast, Accumulator and AccumulatorParam by @alexprengere
  • support for increasing partition numbers in coalesce and repartition by @tools4origins

v0.5.0

03 May 10:34
  • fixes for HDFS thanks to @tools4origins
  • fix for empty partitions by @tools4origins
  • api fixes by @artem0 and @tools4origins
  • various updates for streaming submodule
  • various updates to lint and test system
  • logging: converted some info messages to debug
  • ... documentation for some point releases is missing

v0.4.1

27 May 20:24
  • retries for failed partitions
  • improve pysparkling.streaming.DStream
  • updates to docs

v0.4.0

11 Mar 14:09
  • major addition: pysparkling.streaming
  • updates to RDD.sample()
  • reorganized scripts and tests
  • added RDD.partitionBy()
  • minor updates to pysparkling.fileio

v0.3.23

06 Aug 19:57

small improvements to fileio and better documentation

v0.3.22

18 Jun 18:04
  • reimplement RDD.groupByKey()
  • clean up of docstrings

v0.3.21

31 May 12:28
  • faster text file reading by using io.TextIOWrapper for decoding
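
As a rough illustration of the technique named above, `io.TextIOWrapper` wraps a binary stream and decodes it incrementally while iterating over lines. This is a minimal standard-library sketch, not pysparkling's actual reader; the in-memory `io.BytesIO` stream stands in for a real file opened in binary mode.

```python
import io

# Stand-in for a binary file handle (an assumption for this example).
raw = io.BytesIO("first line\nzweite Zeile: über\n".encode("utf-8"))

# TextIOWrapper decodes the bytes lazily as lines are read.
text = io.TextIOWrapper(raw, encoding="utf-8")
lines = [line.rstrip("\n") for line in text]
```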

v0.3.20

01 May 23:38
  • Google Storage file system (using gs://)
  • dependencies: requests and boto are not optional anymore
  • aggregateByKey() and foldByKey() return RDDs
  • Python 3: use sys.maxsize instead of sys.maxint
  • flake8 linting
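
For context on the Python 3 item: `sys.maxint` was removed in Python 3, while `sys.maxsize` is available in both Python 2.6+ and Python 3. The sketch below shows one way such a constant might be used as an "effectively unlimited" default; the `take` function is a hypothetical example, not pysparkling's API.

```python
import sys


def take(items, n=sys.maxsize):
    """Return at most n items; the default acts as 'no limit'.

    Hypothetical example function, for illustration only.
    """
    return list(items)[:n]
```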