Releases: civisanalytics/civis-python
v1.9.0
1.9.0 - 2018-04-25
Fixed
- Added more robust parsing for tablename parsing in io. You may now pass in tables like schema."tablename.with.periods".
- Adding in missing documentation for civis_file_to_table
- Include JSON files with pip distributions (#244)
- Added flush to civis_to_file when passed a user-created buffer, ensuring the buffer contains the entire file when subsequently read.
- Fix several tests in the test_io module (#248)
- Travis tests for Python 3.4 are now restricted to pandas<=0.20, the last version which supported Python 3.4 (#249)
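
The flush fix listed above can be illustrated with a minimal standard-library sketch. The helper name download_into is hypothetical and only stands in for a function that writes into a caller-supplied buffer; it is not the civis API. The sketch shows why flushing matters when the caller still holds the buffer open and the file is read through another handle.

```python
import os
import tempfile


def download_into(buf, payload):
    """Write payload into a caller-supplied binary buffer.

    Hypothetical stand-in for a download helper; not the civis API.
    """
    buf.write(payload)
    # Push buffered bytes to the operating system so the data is visible
    # to other readers even though the caller has not closed `buf` yet.
    buf.flush()


def demo():
    """Return True if a second handle sees the full payload."""
    payload = b"a,b\n1,2\n"
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "out.csv")
        with open(path, "wb") as buf:
            download_into(buf, payload)
            # Read through a second handle while `buf` is still open.
            with open(path, "rb") as reader:
                seen = reader.read()
    return seen == payload
```

Without the flush call, a small payload would likely still sit in the Python-level write buffer, and the second handle would read an empty or partial file.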
Added
- Added a utility function which can robustly split a Redshift schema name and table name which are presented as a single string joined by a "." (#225)
- Added docstrings for civis.find and civis.find_one. (#224)
- Executors in futures (and the joblib backend, which uses them) will now add "CIVIS_PARENT_JOB_ID" and "CIVIS_PARENT_RUN_ID" environment variables to the child jobs they create (#236)
- Update default CivisML version to v2.2. This includes a new function ModelPipeline.register_pretrained_model which allows users to train a model outside of Civis Platform and use CivisML to score it at scale (#242, #247).
- Added a new parameter dvs_to_predict to civis.ml.ModelPipeline.predict. This allows users to select a subset of a model's outputs for scoring (#241).
- Added civis.io.export_to_civis_file to store results of a SQL query to a Civis file
- Surfaced civis.find and civis.find_one in the Sphinx docs. (#250)
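
As an illustration of the schema and table name splitting described above, here is a minimal sketch built on the standard csv module. The function name split_schema_table is hypothetical and the library's own utility may behave differently (for example, in how it treats quotes); this only shows one way to avoid splitting on periods inside a double-quoted name.

```python
import csv


def split_schema_table(name):
    """Split 'schema.table' into (schema, table), honoring double quotes.

    Hypothetical helper for illustration; not the civis implementation.
    Surrounding double quotes are removed from the returned parts.
    """
    # csv.reader treats '.' as the field separator and '"' as the quote
    # character, so periods inside a quoted name are left alone.
    parts = next(csv.reader([name], delimiter=".", quotechar='"'))
    if len(parts) == 1:
        return None, parts[0]
    if len(parts) == 2:
        return parts[0], parts[1]
    raise ValueError("Cannot parse schema and table from: %r" % name)
```

With this sketch, split_schema_table('schema."tablename.with.periods"') returns ('schema', 'tablename.with.periods'), while an unquoted name with more than one period raises ValueError.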
Changed
- Moved "Optional Dependencies" doc section to top of ML docs, and added clarifications for pre-defined models with non-sklearn estimators (#238)
- Switched to pip install-ing dependencies for building the documentation (#230)
- Added a merge rule for the changelog to .gitattributes (#229)
- Default to "all" API resources rather than "base".
- Updated documentation on algorithm hyperparameters to reflect changes with
CivisML v2.2 release (#240)
v1.8.1
v1.8.0
1.8.0 - 2018-01-23
https://github.com/civisanalytics/civis-python/milestone/13
Added
- Documentation updated to reflect CivisML 2.1 features (#209)
- civis.io.dataframe_to_civis, civis.io.csv_to_civis, and civis.io.civis_file_to_table functions now support the diststyle parameter.
- New notebook-related CLI commands: "new", "up", "down", and "open".
- Additional documentation for using the Civis joblib backend (#199)
- Documented additional soft dependencies for CivisML (#203)
Changed
- Changed ModelPipeline.train default for n_jobs from 4 to None, so that n_jobs will be dynamically calculated by default (#203)
- Use "feather"-formatted files to send data from users to CivisML, if possible. Require this when using pd.Categorical types, since CSVs require us to re-infer column types, and this can fail. Using feather should also give a speed improvement; it reads and writes faster than CSVs and produces smaller files (#200).
- ModelFuture objects will emit any warnings which occurred during their corresponding CivisML job (#204)
- Removed line setting "n_jobs" from an example of CivisML prediction. Recommended use is to let CivisML determine the number of jobs itself (#211).
- Update maximum CivisML version to v2.1; adjust fallback logic such that users get the most recent available release (#212).
Fixed
- Restored the pre-v1.7.0 default behavior of the joblib backend by setting the remote_backend parameter default to 'sequential' as opposed to 'civis'. The default of 'civis' would launch additional containers in nested calls to joblib.Parallel. (#205)
- If validation metadata are missing, ModelFuture objects will return None for metrics or validation metadata, rather than issuing an exception (#208)
- Allowed callers to pass index and encoding arguments to the to_csv method through dataframe_to_civis.
Performance Enhancements
- civis.io.file_to_civis now uses additional file handles for multipart upload instead of writing to disk to reduce disk usage
- civis.io.dataframe_to_civis writes dataframes to disk instead of using an in memory buffer
v1.7.2
v1.7.1
1.7.1 - 2017-11-16
Fixed
- Specify escape character in civis.io.read_civis_sql when performing parallel unload
- Issue uploading files in civis.io.file_to_civis
- Revert performance enhancement that will change format of file produced by civis.io.civis_to_csv
v1.7.0
1.7.0 - 2017-11-15
Changed
- Updated CivisML template ids to v2.0 (#139)
- Optional arguments to API endpoints now display in function signatures. Function signatures show a default value of "DEFAULT"; arguments will still only be transmitted to the Civis Platform API when explicitly provided. (#140)
- APIClient.feature_flags has been deprecated to avoid a name collision with the feature_flags endpoint. In v2.0.0, APIClient.featureflags will be renamed to APIClient.feature_flags.
- The following APIClient attributes have been deprecated in favor of the attribute that includes underscores:
  APIClient.bocceclusters -> APIClient.bocce_clusters
  APIClient.matchtargets -> APIClient.match_targets
  APIClient.remotehosts -> APIClient.remote_hosts
- civis.io.csv_to_civis and civis.io.dataframe_to_civis functions now use civis.io.file_to_civis and civis.io.civis_file_to_table functions instead of separate logic
- civis.io.file_to_civis, civis.io.csv_to_civis and civis.io.dataframe_to_civis now support files over 5GB
- Refactor internals of CivisFuture and PollableResult to centralize handling of threads and pubnub subscription.
- Updated API specification and base resources to include all general availability endpoints.
- Changed civis.io.file_to_civis and civis.io.civis_to_file to allow strings for paths to local files in addition to just file/buffer objects.
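
The "DEFAULT" behavior described in this list follows a common sentinel pattern, sketched below. All names here (_Default, DEFAULT, build_request_params, and the limit and hidden arguments) are hypothetical and chosen for illustration; this is not the library's generated code. The point is that a sentinel lets the signature show a readable default while only explicitly passed arguments are sent, including an explicit None.

```python
class _Default:
    """Sentinel type whose repr appears as DEFAULT in signatures."""

    def __repr__(self):
        return "DEFAULT"


# Hypothetical module-level sentinel; compared by identity, never by value.
DEFAULT = _Default()


def build_request_params(name, limit=DEFAULT, hidden=DEFAULT):
    """Collect only the arguments the caller explicitly provided."""
    candidates = {"name": name, "limit": limit, "hidden": hidden}
    # Anything still bound to the sentinel was not passed by the caller,
    # so it is left out of the request entirely.
    return {key: value for key, value in candidates.items()
            if value is not DEFAULT}
```

Calling build_request_params("x") yields {'name': 'x'}, whereas build_request_params("x", hidden=None) keeps the explicit None, which a plain None default could not distinguish from "not provided".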
Fixed
- Fixed parsing of multiword endpoints. Parsing no longer removes underscores in endpoint names.
- In civis.futures.ContainerFuture, return False when users attempt to cancel an already-completed job. Previously, the object would sometimes give a CivisAPIError with a 404 status code. This fix affects the executors and joblib backend, which use the ContainerFuture.
- Tell flake8 to ignore a broad except in a CivisFuture callback.
- Close open sockets (in both the APIClient and CivisFuture) when they're no longer needed, so as to not use more system file handles than necessary (#173).
- Correct treatment of FileNotFoundError in Python 2 (#176).
- Fixed parsing of endpoints containing hyphens. Hyphens are replaced with underscores.
- Use civis.compat.TemporaryDirectory in civis.io.file_to_civis to be compatible with Python 2.7
- Catch notifications sent up to 30 seconds before the CivisFuture connects. Fixes a bug where we would sometimes miss an immediate error on SQL scripts (#174).
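
The two endpoint-parsing fixes above amount to a naming rule: keep underscores, and turn hyphens into underscores. A minimal sketch of such a rule follows. The function endpoint_to_attribute and the example paths are hypothetical, and the library's actual parser works from the API specification rather than from a bare path string.

```python
import re

# A usable Python attribute name: lowercase letters, digits, underscores.
_VALID_NAME = re.compile(r"[a-z_][a-z0-9_]*")


def endpoint_to_attribute(path):
    """Map the first segment of a URL path to an attribute name.

    Hypothetical helper for illustration; not the civis parser.
    """
    segment = path.strip("/").split("/")[0]
    # Underscores are preserved; hyphens become underscores.
    name = segment.replace("-", "_").lower()
    if not _VALID_NAME.fullmatch(name):
        raise ValueError("Not a usable attribute name: %r" % segment)
    return name
```

Under these assumptions, both "/match_targets/{id}" and "/match-targets/{id}" map to match_targets.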
Added
- civis.resources.cache_api_spec function to make it easier to record the current API spec locally (#141).
- Autospecced mock of the APIClient for use in testing third-party code which uses this library (#141).
- Added etl, n_jobs, and validation_data arguments to ModelPipeline.train (#139).
- Added cpu, memory, and disk arguments to ModelPipeline.predict (#139).
- Added remote_backend keyword to the civis.parallel.make_backend_factory and civis.parallel.infer_backend_factory in order to set the joblib backend in the container for nested calls to joblib.Parallel.
- Added the PyPI trove classifiers for Python 3.4 and 3.6 (#152).
- civis.io.civis_file_to_table function to import an existing Civis file to a table
- civis.io.file_to_civis function will now automatically retry uploads to the Civis Platform up to 5 times if there is an HTTPError, ConnectionError or ConnectionTimeout
- Additional documentation about the use case for the Civis joblib backend.
- Added a note about serializing ModelPipeline APIClient objects to the docstring.
- Added civis notebooks download command-line interface command to facilitate downloading notebooks.
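
The upload retry behavior mentioned in this list can be sketched generically as follows. This is an illustration using only built-in exception types, not the library's implementation: call_with_retries and FlakyUpload are hypothetical names, and the sketch assumes "up to 5 times" means five attempts in total.

```python
import time


def call_with_retries(func, max_attempts=5,
                      retry_on=(ConnectionError, TimeoutError), delay=0.0):
    """Call func, retrying on the given exceptions.

    Hypothetical helper; assumes max_attempts counts total attempts.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            # Re-raise once the final attempt has failed.
            if attempt == max_attempts:
                raise
            time.sleep(delay)


class FlakyUpload:
    """Simulated upload that fails a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("simulated network failure")
        return "uploaded"
```

For example, call_with_retries(FlakyUpload(2)) succeeds on the third call, while an upload that always fails raises ConnectionError after the fifth attempt.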
Performance Enhancements
- civis.io.file_to_civis now takes advantage of multipart uploads to chunk files and perform I/O in parallel
- civis.io.civis_to_csv and civis.io.read_civis_sql will always request data with gzip compression to reduce I/O. Also, they will attempt to fetch headers in a separate query so that data can be unloaded in parallel
- civis.io.civis_to_csv with compression='gzip' currently returns a file with no compression. In a future release, compression='gzip' will return a gzip compressed file.
v1.6.2
1.6.2 - 2017-09-08
Changed
- Added explanatory text to CivisML_parallel_training.ipynb (#126).
Fixed
- Added ResourceWarning for Python 2.7 (#128).
- Added TypeError for multi-indexed dataframes when used as input to CivisML (#131).
- ModelPipeline.from_existing will warn if users attempt to recreate a model trained with a newer version of CivisML, and fall back on the most recent prediction template it knows of (#134).
- Make the PaginatedResponse returned by LIST endpoints a full iterator. This also makes the iterator=True parameter work in Python 2.
- When using civis.io.civis_to_csv, emit a warning on SQL queries which return no results instead of allowing a cryptic IndexError to surface (#135).
- Fixed the example code snippet for civis.io.civis_to_multifile_csv. Also provided more details on its return dict in the docstring.
- Pinned down sphinx_rtd_theme and numpydoc in dev-requirements.txt for building the documentation.
Added
- Jupyter notebook with demonstrations of use patterns and abstractions in the Python API client (#127).
v1.6.1
1.6.1 - 2017-08-22
Changed
- Catch unnecessary warning while importing xgboost in CivisML_parallel_training.ipynb (#121)
Fixed
- Fixed bug where instantiating a new model via ModelPipeline.from_existing from an existing model with empty "PARAMS" and "CV_PARAMS" boxes fails (#122).
- Users can now access the ml and parallel namespaces from the base civis namespace (#123).
- Parameters in the Civis API documentation now display in the proper order (#124).
v1.6.0
1.6.0 - 2017-07-27
Changed
- Edited example for safer null value handling (#101).
- Make pubnub and joblib hard dependencies instead of optional dependencies (#110).
- Retry network errors and wait for API rate limit refresh when using the CLI (#117).
- The CLI now provides a User-Agent header which starts with "civis-cli" (#117)
- Include pandas and sklearn-dependent code in Travis CI tests (#119).
Added
- Version 1.1 of CivisML, with custom dependency installation from remote git hosting services (e.g., GitHub, Bitbucket). (#103)
- Added email notifications option to ModelPipeline (#99).
- Added custom joblib backend for multiprocessing in the Civis Platform. Public-facing functions are make_backend_factory, make_backend_template_factory, and infer_backend_factory. Includes a new hard dependency on cloudpickle to facilitate code transport (#102, #104, #106, #107, #111, #112, #113, #114).
Fixed
- Fixed a bug where the version of a dependency for Python 2.7 usage was incorrectly specified (#96).
- Non-seekable file-like objects can now be provided to civis.io.file_to_civis. Only seekable file-like objects will be streamed (#115).
- The
civis.ml.ModelFuture
no longer raises an exception if its model job is cancelled (#116). - The CLI's API spec cache now expires after 24 hours instead of 10 seconds (#118).