Releases: civisanalytics/civis-python

v1.9.0

25 Apr 20:35
60a86aa

1.9.0 - 2018-04-25

Fixed

  • Made table name parsing in civis.io more robust. You may now
    pass in table names such as schema."tablename.with.periods".
  • Added missing documentation for civis_file_to_table.
  • Include JSON files with pip distributions (#244)
  • Added flush to civis_to_file when passed a user-created buffer,
    ensuring the buffer contains the entire file when subsequently read.
  • Fix several tests in the test_io module (#248)
  • Travis tests for Python 3.4 are now restricted to pandas<=0.20, the
    last version which supported Python 3.4 (#249)
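
The flush fix above matters because Python file objects buffer writes. The sketch below uses only the standard library to show how a second reader may see an incomplete file until the writer flushes. The helper names write_payload and bytes_on_disk are hypothetical, and this illustrates the buffering behavior only; it is not the code inside civis_to_file.

```python
import os
import tempfile


def write_payload(buf, payload, flush=True):
    """Write bytes to a caller-supplied buffer (hypothetical helper).

    Flushing before returning means a caller who reads the underlying
    file right away sees everything that was written.
    """
    buf.write(payload)
    if flush:
        buf.flush()


def bytes_on_disk(path):
    """Return what a second, independent reader sees at this moment."""
    with open(path, "rb") as reader:
        return reader.read()


fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "wb") as buf:
        # Small writes usually sit in the in-process buffer.
        write_payload(buf, b"id,value\n", flush=False)
        seen_without_flush = bytes_on_disk(path)
        buf.flush()
        seen_with_flush = bytes_on_disk(path)
finally:
    os.remove(path)
```

With default buffering, seen_without_flush is typically empty, while seen_with_flush holds the full payload.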

Added

  • Added a utility function which can robustly split a Redshift schema name
    and table name which are presented as a single string joined by a "." (#225)
  • Added docstrings for civis.find and civis.find_one. (#224)
  • Executors in futures (and the joblib backend, which uses them) will now
    add "CIVIS_PARENT_JOB_ID" and "CIVIS_PARENT_RUN_ID" environment variables
    to the child jobs they create (#236)
  • Update default CivisML version to v2.2. This includes a new function
    ModelPipeline.register_pretrained_model which allows users to train
    a model outside of Civis Platform and use CivisML to score it at scale (#242, #247).
  • Added a new parameter dvs_to_predict to civis.ml.ModelPipeline.predict.
    This allows users to select a subset of a model's outputs for scoring (#241).
  • Added civis.io.export_to_civis_file to store results of a SQL query
    to a Civis file
  • Surfaced civis.find and civis.find_one in the Sphinx docs. (#250)
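
As a rough illustration of splitting a combined schema and table name, the sketch below leans on the standard library's csv module so that periods inside double quotes are not treated as separators. The function name split_schema_and_table is hypothetical and the approach is an assumption for illustration; it is not presented as the utility added in #225.

```python
import csv
import io


def split_schema_and_table(name):
    """Split 'schema.table' on unquoted periods (illustrative only).

    Double-quoted parts may themselves contain periods. Returns
    (schema, table); schema is None when no schema is given. The
    surrounding double quotes are dropped from the returned parts.
    """
    reader = csv.reader(io.StringIO(name), delimiter=".", quotechar='"')
    parts = next(reader)
    if len(parts) == 1:
        return None, parts[0]
    if len(parts) == 2:
        return parts[0], parts[1]
    raise ValueError("Cannot parse schema and table from %r" % name)


quoted = split_schema_and_table('schema."tablename.with.periods"')
bare = split_schema_and_table("my_table")
```

A name with more than one unquoted period is rejected here rather than guessed at, which is one possible design choice among several.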

Changed

  • Moved "Optional Dependencies" doc section to top of ML docs, and
    added clarifications for pre-defined models with non-sklearn
    estimators (#238)
  • Switched to pip install-ing dependencies for building the documentation (#230)
  • Added a merge rule for the changelog to .gitattributes (#229)
  • Default to "all" API resources rather than "base".
  • Updated documentation on algorithm hyperparameters to reflect changes with
    CivisML v2.2 release (#240)

v1.8.1

01 Feb 15:54
4233867

1.8.1 - 2018-02-01

Fixed

  • Added missing string formatting to a log emit in file multipart upload and
    corrected the ordering of parameters in another log emit (#217)

Changed

  • Updated CivisML 2.0 notebook (#214)
  • Reworded output of civis notebooks new CLI command (#215)

Added

  • Added a script for integration tests (smoke tests) (#216)

v1.8.0

23 Jan 14:42
fdee708

1.8.0 - 2018-01-23

https://github.com/civisanalytics/civis-python/milestone/13

Added

  • Documentation updated to reflect CivisML 2.1 features (#209)
  • civis.io.dataframe_to_civis, civis.io.csv_to_civis, and civis.io.civis_file_to_table functions now support the diststyle parameter.
  • New notebook-related CLI commands: "new", "up", "down", and "open".
  • Additional documentation for using the Civis joblib backend (#199)
  • Documented additional soft dependencies for CivisML (#203)

Changed

  • Changed ModelPipeline.train default for n_jobs from 4 to None, so that n_jobs will be dynamically calculated by default (#203)
  • Use "feather"-formatted files to send data from users to CivisML, if possible. Require this when using pd.Categorical types, since CSVs require us to re-infer column types, and this can fail. Using feather should also give a speed improvement; it reads and writes faster than CSVs and produces smaller files (#200).
  • ModelFuture objects will emit any warnings which occurred during their corresponding CivisML job (#204)
  • Removed line setting "n_jobs" from an example of CivisML prediction. Recommended use is to let CivisML determine the number of jobs itself (#211).
  • Update maximum CivisML version to v2.1; adjust fallback logic such that users get the most recent available release (#212).
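
To make the point about re-inferring column types concrete, here is a small standard-library sketch. It assumes a deliberately naive inference rule (the hypothetical infer_value function) and is not how CivisML reads data; it only shows how categorical labels that look numeric can change during a CSV round trip.

```python
import csv
import io


def infer_value(text):
    """Naive type inference of the kind a CSV reader has to perform."""
    try:
        return int(text)
    except ValueError:
        return text


# Categorical labels that happen to look like numbers.
original = ["007", "042", "100"]

buffer = io.StringIO()
writer = csv.writer(buffer)
for label in original:
    writer.writerow([label])

# Reading back: the file holds only text, so types must be guessed.
buffer.seek(0)
round_tripped = [infer_value(row[0]) for row in csv.reader(buffer)]
```

Here round_tripped holds integers and the leading zeros are gone. A format that stores column types alongside the data avoids this guessing step.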

Fixed

  • Restored the pre-v1.7.0 default behavior of the joblib backend by setting the remote_backend parameter default to 'sequential' as opposed to 'civis'. The default of 'civis' would launch additional containers in nested calls to joblib.Parallel. (#205)
  • If validation metadata are missing, ModelFuture objects will return None for metrics or validation metadata, rather than issuing an exception (#208)
  • Allowed callers to pass index and encoding arguments to the to_csv method through dataframe_to_civis.

Performance Enhancements

  • civis.io.file_to_civis now uses additional file handles for multipart upload instead of writing to disk to reduce disk usage
  • civis.io.dataframe_to_civis writes dataframes to disk instead of using an in memory buffer

v1.7.2

09 Jan 21:23

1.7.2 - 2018-01-09

Fixed

  • Relaxed requirement on cloudpickle version number (#187)
  • Restore previous behavior of civis.io.civis_to_csv when using "compression='gzip'" (#195)

v1.7.1

16 Nov 18:58
1b9065f

1.7.1 - 2017-11-16

Fixed

  • Specify escape character in civis.io.read_civis_sql when performing parallel unload
  • Fixed an issue uploading files in civis.io.file_to_civis
  • Reverted a performance enhancement that changed the format of the file produced by civis.io.civis_to_csv

v1.7.0

15 Nov 16:20
df6d465

1.7.0 - 2017-11-15

Changed

  • Updated CivisML template ids to v2.0 (#139)
  • Optional arguments to API endpoints now display in function signatures.
    Function signatures show a default value of "DEFAULT"; arguments will still
    only be transmitted to the Civis Platform API when explicitly provided. (#140)
  • APIClient.feature_flags has been deprecated to avoid a name collision
    with the feature_flags endpoint. In v2.0.0, APIClient.featureflags
    will be renamed to APIClient.feature_flags.
  • The following APIClient attributes have been deprecated in favor of the
    attribute that includes underscores:
    APIClient.bocceclusters -> APIClient.bocce_clusters
    APIClient.matchtargets -> APIClient.match_targets
    APIClient.remotehosts -> APIClient.remote_hosts
  • civis.io.csv_to_civis and civis.io.dataframe_to_civis functions now use
    civis.io.file_to_civis and civis.io.civis_file_to_table functions instead
    of separate logic
  • civis.io.file_to_civis, civis.io.csv_to_civis and civis.io.dataframe_to_civis
    now support files over 5GB
  • Refactor internals of CivisFuture and PollableResult to centralize handling
    of threads and pubnub subscription.
  • Updated API specification and base resources to include all general
    availability endpoints.
  • Changed civis.io.file_to_civis and civis.io.civis_to_file to allow
    strings for paths to local files in addition to just file/buffer objects.
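
The "DEFAULT" behavior described above is an instance of the sentinel-default pattern. The sketch below is an assumption about how such a pattern can be written, with a hypothetical list_items wrapper; it is not the generated client code.

```python
import inspect


class _Default:
    """Sentinel type whose repr shows up as DEFAULT in signatures."""

    def __repr__(self):
        return "DEFAULT"


DEFAULT = _Default()


def list_items(limit=DEFAULT, order=DEFAULT):
    """Hypothetical endpoint wrapper; returns the parameters it would send."""
    candidates = {"limit": limit, "order": order}
    # Only arguments the caller explicitly provided are transmitted.
    return {key: value for key, value in candidates.items()
            if value is not DEFAULT}


signature_text = str(inspect.signature(list_items))
nothing_sent = list_items()
only_limit = list_items(limit=10)
explicit_none = list_items(order=None)
```

Using a dedicated sentinel rather than None lets a caller pass None on purpose and still have it transmitted.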

Fixed

  • Fixed parsing of multiword endpoints. Parsing no longer removes underscores
    in endpoint names.
  • In civis.futures.ContainerFuture, return False when users attempt to cancel
    an already-completed job. Previously, the object would sometimes give a CivisAPIError
    with a 404 status code. This fix affects the executors and joblib backend, which
    use the ContainerFuture.
  • Tell flake8 to ignore a broad except in a CivisFuture callback.
  • Close open sockets (in both the APIClient and CivisFuture) when they're no
    longer needed, so as to not use more system file handles than necessary (#173).
  • Correct treatment of FileNotFoundError in Python 2 (#176).
  • Fixed parsing of endpoints containing hyphens. Hyphens are replaced with
    underscores.
  • Use civis.compat.TemporaryDirectory in civis.io.file_to_civis to be
    compatible with Python 2.7
  • Catch notifications sent up to 30 seconds before the CivisFuture connects.
    Fixes a bug where we would sometimes miss an immediate error on SQL scripts (#174).
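
Returning False when cancelling a completed job matches the convention of the standard library's concurrent.futures.Future. The short sketch below demonstrates that convention with plain Future objects; it does not use ContainerFuture itself.

```python
from concurrent.futures import Future

# A future that has already finished cannot be cancelled.
finished = Future()
finished.set_result("done")
cancelled_finished = finished.cancel()

# A future that has not started yet can be cancelled.
pending = Future()
cancelled_pending = pending.cancel()
```

Code that calls cancel() can therefore branch on the return value instead of catching an exception.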

Added

  • civis.resources.cache_api_spec function to make it easier to record the
    current API spec locally (#141).
  • Autospecced mock of the APIClient for use in testing third-party code which
    uses this library (#141).
  • Added etl, n_jobs, and validation_data arguments to
    ModelPipeline.train (#139).
  • Added cpu, memory, and disk arguments to ModelPipeline.predict
    (#139).
  • Added remote_backend keyword to the civis.parallel.make_backend_factory
    and civis.parallel.infer_backend_factory in order to set the joblib
    backend in the container for nested calls to joblib.Parallel.
  • Added the PyPI trove classifiers for Python 3.4 and 3.6 (#152).
  • civis.io.civis_file_to_table function to import an existing Civis file
    to a table
  • civis.io.file_to_civis function will now automatically retry uploads to
    the Civis Platform up to 5 times if there is an HTTPError, ConnectionError
    or ConnectionTimeout
  • Additional documentation about the use case for the Civis joblib backend.
  • Added a note about serializing ModelPipeline APIClient objects to the docstring.
  • Added civis notebooks download command-line interface command to facilitate
    downloading notebooks.
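
For readers unfamiliar with autospecced mocks, the sketch below shows the general technique with unittest.mock.create_autospec. ExampleClient and table_name are hypothetical stand-ins invented for this example; the mock shipped with the library may be constructed differently.

```python
from unittest import mock


class ExampleClient:
    """Hypothetical stand-in for an API client; not the real APIClient."""

    def get_table(self, table_id):
        raise RuntimeError("would make a network call")


def table_name(client, table_id):
    """Third-party code under test: it only needs client.get_table."""
    return client.get_table(table_id)["name"]


# The autospecced mock mirrors the class's methods and signatures.
fake = mock.create_autospec(ExampleClient, instance=True)
fake.get_table.return_value = {"name": "scores"}

result = table_name(fake, 123)

# Calling a method with the wrong arguments fails, as it would for real.
try:
    fake.get_table()
    signature_checked = False
except TypeError:
    signature_checked = True
```

The benefit over a plain Mock is that misspelled methods and wrong argument counts surface as test failures rather than passing silently.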

Performance Enhancements

  • civis.io.file_to_civis now takes advantage of multipart uploads to chunk
    files and perform I/O in parallel
  • civis.io.civis_to_csv and civis.io.read_civis_sql will always request
    data with gzip compression to reduce I/O. Also, they will attempt to fetch
    headers in a separate query so that data can be unloaded in parallel
  • civis.io.civis_to_csv with compression='gzip' currently returns a file
    with no compression. In a future release, compression='gzip' will return a
    gzip compressed file.
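
The idea behind chunked, parallel multipart uploads can be sketched with the standard library alone. The names chunk_ranges, upload_in_parts, and send_part are hypothetical, and the part size and worker count are arbitrary; this is a simplified picture, not the logic inside civis.io.file_to_civis.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_ranges(total_size, part_size):
    """Return (offset, length) pairs that together cover total_size bytes."""
    return [(start, min(part_size, total_size - start))
            for start in range(0, total_size, part_size)]


def upload_in_parts(data, part_size, send_part, max_workers=4):
    """Send each part through send_part(part_number, chunk) in parallel.

    Results are returned in part order, regardless of completion order.
    """
    ranges = chunk_ranges(len(data), part_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(send_part, number, data[offset:offset + length])
            for number, (offset, length) in enumerate(ranges, start=1)
        ]
        return [future.result() for future in futures]


ranges = chunk_ranges(10, 4)
sent = upload_in_parts(b"abcdefghij", 4, lambda number, chunk: (number, chunk))
```

In a real upload, send_part would perform a network request for one part; here it simply echoes its inputs so the chunking is visible.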

v1.6.2

08 Sep 12:58

1.6.2 - 2017-09-08

Changed

  • Added explanatory text to CivisML_parallel_training.ipynb (#126).

Fixed

  • Added ResourceWarning for Python 2.7 (#128).
  • Added TypeError for multi-indexed dataframes when used as input to
    CivisML (#131).
  • ModelPipeline.from_existing will warn if users attempt to recreate
    a model trained with a newer version of CivisML, and fall back on the
    most recent prediction template it knows of (#134).
  • Make the PaginatedResponse returned by LIST endpoints a full iterator.
    This also makes the iterator=True parameter work in Python 2.
  • When using civis.io.civis_to_csv, emit a warning on SQL queries which
    return no results instead of allowing a cryptic IndexError to surface (#135).
  • Fixed the example code snippet for civis.io.civis_to_multifile_csv.
    Also provided more details on its return dict in the docstring.
  • Pinned down sphinx_rtd_theme and numpydoc in dev-requirements.txt
    for building the documentation.
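
A "full iterator" in Python defines both __iter__ and __next__ (spelled next in Python 2). The sketch below shows what that can look like for paged results. PagedIterator and fetch_page are hypothetical names, and the stopping rule (an empty page ends iteration) is an assumption; this is not the PaginatedResponse implementation.

```python
class PagedIterator:
    """Iterate over items from successive pages (illustrative only)."""

    def __init__(self, fetch_page):
        self._fetch_page = fetch_page  # callable: page number -> list
        self._page_number = 1
        self._buffer = iter(())
        self._exhausted = False

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            try:
                return next(self._buffer)
            except StopIteration:
                if self._exhausted:
                    raise
                page = self._fetch_page(self._page_number)
                self._page_number += 1
                if not page:
                    # Assumption: an empty page means no more results.
                    self._exhausted = True
                    raise StopIteration
                self._buffer = iter(page)

    next = __next__  # Python 2 spelling of the same method


pages = {1: ["a", "b"], 2: ["c"]}
items = list(PagedIterator(lambda number: pages.get(number, [])))
```

Because the object is its own iterator, it works with next(), for loops, and list() alike.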

Added

  • Jupyter notebook with demonstrations of use patterns and abstractions in the Python API client (#127).

v1.6.1

22 Aug 17:52

1.6.1 - 2017-08-22

Changed

  • Catch unnecessary warning while importing xgboost in CivisML_parallel_training.ipynb (#121)

Fixed

  • Fixed bug where instantiating a new model via ModelPipeline.from_existing from an existing model with empty "PARAMS" and "CV_PARAMS" boxes fails (#122).
  • Users can now access the ml and parallel namespaces from the base civis namespace (#123).
  • Parameters in the Civis API documentation now display in the proper order (#124).

v1.6.0

27 Jul 14:39

1.6.0 - 2017-07-27

Changed

  • Edited example for safer null value handling (#101).
  • Make pubnub and joblib hard dependencies instead of optional dependencies (#110).
  • Retry network errors and wait for API rate limit refresh when using the CLI (#117).
  • The CLI now provides a User-Agent header which starts with "civis-cli" (#117)
  • Include pandas and sklearn-dependent code in Travis CI tests (#119).

Added

  • Version 1.1 of CivisML, with custom dependency installation from remote git hosting services (e.g., GitHub, Bitbucket). (#103)
  • Added email notifications option to ModelPipeline (#99).
  • Added custom joblib backend for multiprocessing in the Civis Platform. Public-facing functions are make_backend_factory, make_backend_template_factory, and infer_backend_factory. Includes a new hard dependency on cloudpickle to facilitate code transport (#102, #104, #106, #107, #111, #112, #113, #114).

Fixed

  • Fixed a bug where the version of a dependency for Python 2.7 usage was incorrectly specified (#96).
  • Non-seekable file-like objects can now be provided to civis.io.file_to_civis. Only seekable file-like objects will be streamed (#115).
  • The civis.ml.ModelFuture no longer raises an exception if its model job is cancelled (#116).
  • The CLI's API spec cache now expires after 24 hours instead of 10 seconds (#118).
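
The seekable versus non-seekable distinction above can be checked with the standard seekable() method on file-like objects. The sketch below assumes one possible fallback for non-seekable input (reading it fully into memory); the changelog does not say what the library does in that case, and prepare_for_upload and OneWayStream are hypothetical names.

```python
import io


def prepare_for_upload(file_obj):
    """Decide how to handle a file-like object (illustrative only).

    Seekable objects are passed through so they can be streamed.
    Anything else is read fully into memory first (an assumption).
    """
    seekable = getattr(file_obj, "seekable", None)
    if callable(seekable) and seekable():
        return "streamed", file_obj
    return "buffered", io.BytesIO(file_obj.read())


class OneWayStream(io.BytesIO):
    """Stand-in for a pipe or socket: readable but not seekable."""

    def seekable(self):
        return False


mode_seekable, _ = prepare_for_upload(io.BytesIO(b"abc"))
mode_one_way, buffered = prepare_for_upload(OneWayStream(b"abc"))
```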

v1.5.2

17 May 20:41

Fixed

  • Fixed a bug where ModelFuture.validation_metadata would not source training job metadata for a ModelFuture corresponding to a prediction job (#90).
  • Added more locks to improve thread safety in the PollableResult and CivisFuture.
  • Fix issue with Python 2/3 dependency management (#89).