This file documents the version history of OpenDP. The links on each version number will take you to a comparison showing the source changes from the previous version.
- Polars:
- Polars: add make_private_quantile_expr #908
- Polars: bounded-DP mean via postprocessing #890
- Polars: add make_expr_laplace #829
- Polars: add make_expr_sum #819
- Polars: add make_expr_clip #868
- Polars: add make_private_aggregate #847
- Polars: initial LazyFrame and Expr parsers #1454
- Polars: add ExprDomain #795
- Polars: lazyframe_domain ffi #769
- Polars: series_domain ffi #767
- Polars: add FrameDomain #765
- Polars: add SeriesDomain #763
- Polars: add make_expr_col #797
- Usability:
- Steer users in the right direction if they try to call a domain descriptor #1512
- Warn if privacy loss is large #1457
- xfail usability tests #1465
- Measure __str__ -> __repr__ #1401
- Runtime error if non-function is passed to new_function #1355
- Error if missing arg param on measurement in R #1559
- Specialized message for mismatched domain #1511
- Python linting and typing:
- Call mypy and flake8 as subprocesses from pytest #1359
- More Python typing on context module #1472
- Require explicit imports #1220
- Use isinstance where appropriate #1221
- Fix unneeded f-strings #1217
- Leave only the ignores that we actually need #1219
- No more bare except #1303
- Fix masked mypy errors #1265
- Fix the flake8 warnings that really need it #1261
- Remove Any from generated python #1507
- Python typing: float implies int #1486
- Fix return signature on loss_of #1524
- Sphinx docs and examples:
- Grouping columns example #1508
- Python measurement examples #1550
- Use import opendp.prelude as dp in docs #1442
- Add link to Reference page #1297
- Python and R examples in tabs on quickstart (just CI) #1262
- link to language specific docs #1516
- Python example in API docs proposal #1439
- links between user guide and api reference #1458
- Update introductory paragraph, remove last outdated section #1446
- API ToC just down to modules #1447
- Enhance context docs #1386
- parallel directory for ancillary doc files #1371
- use sphinx-design; avoid raw html #1351
- List production applications #1352
- Update 404 template #1354
- Documentation reorg #1177
- Use appropriate shell syntax in notebook examples #1406
- In Python API docs, use the same examples for make and then #1576 #1575
- R docs and linting:
- R examples in docs via stand-alone files #1494
- Fix r-doc comment with missing closing tag #1493
- Add R example to "typical workflow" #1466
- One measurement example for R #1557
- R favicons #1298
- A concept on every R function, so the page is better organized #1299
- R doc README #1247
- Tidy up R docs header #1246
- Make the dependency on r-docs explicit... at the cost of slowing down the docs build #1248
- Subheadings in R docs #1245
- Do not generate NEWS.md #1302
- R linting #1344 #1408
- Renaming:
- Mechanisms:
- Developer docs and comments:
- For Rust example in getting-started docs, use normal cargo run rather than trying to run as script with nightly #1612
- Devs should install all optional dependencies #1522
- Explain docs build in each target language #1499
- Update LICENSE #1455
- Update and fix typos on dev instructions #1231
- Explain relationship between bindings and derive #1229
- Consolidate tools requirements #1444
- Add a badge for docs.rs #1435
- Explain extra installs for R / Add "is" on homepage #1448
- Explain that R install does not require pre-compiled code #1423
- Just add a note to explain duplication #1484
- CI and utilities:
- util script for RST to NB #1483
- replace list of "rm" with "git clean" #1301
- manylinux2014 -> manylinux_2_24 #1268
- Now if "stable" and "dry-run" are selected, will append "-dev" #1267
- Upgrade github actions #1378
- Minimal cargo test #1356
- Add number_of_spaces param to indent #1368
- LaTeX cache, temp, and output files #1361
- Speed up smoke-test, mostly by no longer freeing disk space #1579
- Package Python with cibuildwheel and setuptools-rust #1519
- Move Rust tests to standalone files #1533 #1548
- For consistency and simplicity, use --all-features #1526
- Follow-up with separate builds in smoke-test to fix R/Polars CI #1515
- Minimum Python version:
- Remove build_tool.py #1300
- Remove sphinx doctest tags #1450
- Remove dead_code markers that do not cause warnings in IDE #1380
- Remove putting-it-together.rst, and move its diagram #1353
- Remove @versioned from generated code #1263
- Fix Rust build warning by removing reference to poly #1226
- CI
- Python
- Regenerate python code to include example #1615
- Add get_np_csprng wrapper function, so we can remove the last skipif #1562
- Replace datetime.now() with constant: Previously, tests would only pass within a certain date range #1561
- Fix split_by_weights #1456
- Address numpy test failures #1348
- Add setuptools to requirements #1415
- Add setuptools, fix nightly? #1427
- Fix subcontext metric space #1443
- Rust
- R
0.9.2-dev - TBD
- Ignore nitpicky Sphinx warnings on old library versions #1218
0.9.1 - 2024-02-07
- Fix CI for GitHub release #1215
0.9.0 - 2024-02-07
- R language bindings #679
- All library functionality is available, except for defining your own library primitives in R code
- New transformations/measurements
- Expanded functionality of user-defined library primitives
- Proofs from Vicki Xu, Hanwen Zhang, Zachary Ratliff and Michael Shoemate
- The OpenDP Python package now supports PEP 561 type information #738
- The OpenDP Rust crate is now thread-safe #874
- Documentation, Typing and CI improvements from Chuck McCallum
- CI: MyPy type-checking, link-checking in docs, code coverage, Rust formatting
- Rust stack traces are now hidden by default #1138
- FFI module in Rust is now public, allowing you to write your own lightweight FFI #1150
- C dependencies on GMP/MPFR have been replaced with dashu #1141
- The OpenDP Rust library can now be built easily on Windows and is a much more lightweight Rust dependency
- TO argument on user-defined measurements is now optional #1147
- raw functions can now be chained as postprocessors onto measurements
- Imports in the Python context module no longer pollute the prelude #1187
0.8.0 - 2023-08-11
- Partial constructors: each make_* constructor now has a then_* variant #689 #761
- all make_* have gained two leading arguments: input_domain and input_metric
- all then_* have same arguments as make_*, sans input_domain and input_metric
- when chaining, then_* tunes to the previous transformation/metric space
- to migrate, replace make_* with then_*, and then remove redundant arguments
- #687 #690 #692 #712 #713 #798 #799 #802 #803 #804 #808 #810 #813 #815 #816
- (preview) Context API for Python, giving a more succinct alternative to >> #750
- context.query().clamp(bounds).sum().laplace().release()
- automatically tunes a free parameter (like the scale) to satisfy privacy-loss bound
- mediates queries to the interactive compositor/dataset inside context
- #749
- Support for aarch64 architecture on Linux #843
- Nightly builds can now be downloaded from PyPi: pip install opendp --pre #879 #880
- Proofs for make_row_by_row #688, make_clamp #512
#512 - Transformations throughout library support any valid combination of domain descriptors
- for example, all data preprocessors now also work under bounded DP
- Changed constructor names:
- make_base_laplace, make_base_discrete_laplace -> make_laplace #736
- make_base_gaussian, make_base_discrete_gaussian -> make_gaussian #800
- make_sized_bounded_sum, make_bounded_sum -> make_sum #801
- make_sized_bounded_mean -> make_mean #806
- make_sized_bounded_variance -> make_variance #807
- dp.c.make_user_measurement -> dp.m.make_user_measurement #884
- dp.c.make_user_transformation -> dp.m.make_user_transformation #884
- dp.c.make_user_postprocessor -> dp.new_function #884
- make_base_ptr -> make_base_laplace_threshold #849
- changed the privacy map to emit fixed (ε, δ) pairs
- Reordered arguments to make_user_transformation and make_user_measurement
- input_domain and input_metric now leading to enable then_* variants
- make_identity is now honest-but-curious in Python, but is general over all choices of domains/metrics #814
- (Rust-only) sparse histogram APIs have been updated to prepare for Python #756
- make_base_alp_with_hashers -> make_alp_state_with_hashers
- make_base_alp -> make_alp_state
- make_alp_histogram_post_process -> make_alp_queryable
- thank you Christian Lebeda! (https://github.com/ChristianLebeda)
- (Rust-only) Transformations and Measurements made read-only #706
- Infinite loop converting from ρ to ε when δ=0 #845
- All dataframe transformations, in anticipation of a new Polars backend in an upcoming release
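The partial-constructor change in 0.8.0 can be pictured with a small, library-free sketch. Everything below is hypothetical illustration rather than OpenDP code: the `Transformation` and `Partial` classes, the string-valued domains and metrics, and the `make_scale`/`then_scale` pair are all assumptions made for the example. It only shows the idea that a `then_*` call defers the two leading arguments until `>>` supplies them from the previous step.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Simplified stand-in for an (input_domain, input_metric) pair.
Space = Tuple[str, str]


@dataclass
class Transformation:
    input_space: Space
    output_space: Space
    function: Callable

    def __rshift__(self, other):
        # A partial constructor is completed with this step's output space.
        if isinstance(other, Partial):
            other = other.fix(*self.output_space)
        if other.input_space != self.output_space:
            raise ValueError("mismatched domain or metric")
        return Transformation(
            self.input_space,
            other.output_space,
            lambda x: other.function(self.function(x)),
        )


@dataclass
class Partial:
    constructor: Callable  # a make_* function still missing its leading arguments
    args: tuple

    def fix(self, input_domain, input_metric):
        return self.constructor(input_domain, input_metric, *self.args)


def make_scale(input_domain, input_metric, factor):
    # Hypothetical make_* constructor: input_domain and input_metric lead.
    space = (input_domain, input_metric)
    return Transformation(space, space, lambda x: [v * factor for v in x])


def then_scale(factor):
    # Hypothetical then_* variant: same arguments, sans the two leading ones.
    return Partial(make_scale, (factor,))


# Before: every constructor repeats the space. After: then_* infers it.
explicit = make_scale("vec<int>", "symmetric", 2) >> make_scale("vec<int>", "symmetric", 3)
chained = make_scale("vec<int>", "symmetric", 2) >> then_scale(3)
print(chained.function([1, 2]))  # [6, 12]
```

Migrating in this toy model mirrors the advice above: swap `make_scale` for `then_scale` in every position after the first and drop the repeated domain and metric.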
0.7.0 - 2023-05-18
- FFI and Python interfaces for creating and accessing Domains, Metrics, and Measures (#637)
- Queryables and supporting infrastructure for interactive Measurements (#618), (#675)
- Constructor for sequential composition of Measurements (#674)
- Checks for compatibility between pairings of Domains and Metrics/Measures (#604)
- Python opendp.extrinsics module for code contributions and proofs outside of Rust (#693)
- Docs: First Look at DP notebook (#666)
- Docs: Compositors notebook, with usage of interactive Measurements (#735)
- Incorporated Domain instances into some constructor signatures (#650)
- Simplified postprocessors to Function (from previous full Transformation) (#648)
- Moved some Domain logic from type-inherent constraints to runtime checks of more general types (#645), (#696)
- Remove SizedDomain in favor of a runtime size descriptor on VectorDomain
- Remove BoundedDomain in favor of a runtime bounds descriptor on AtomDomain
- Remove InherentNullDomain in favor of a runtime nullity descriptor on AtomDomain
- Removed the default Domain limitation on user-defined callbacks, and renamed constructors from make_default_user_XXX() to make_user_XXX (#650)
- Docs: Improved the clarity of the User Guide based on feedback (#639)
- Docs: Renamed the Developer Guide to Contributor Guide (#639)
- AllDomain in the Python bindings, with a warning to switch to AtomDomain (#645)
- The output_domain field of Measurement struct (#647)
- Switched from the backtrace crate to std::backtrace, and fixed some corner cases, for much faster backtrace resolution (#691)
- Whole-codebase reformat using rustfmt to minimize spurious churn in the future (#669)
0.6.2 - 2023-02-06
- support for user-defined callbacks under explicit opt-in
- researchers may construct their own transformations, measurements and postprocessors in Python
- these "custom" components may be interleaved with other components in the library
- expanded docs.opendp.org User Guide with more explanatory notebooks
- "contrib" proofs for CKS20 sampler algorithms
- "contrib" proof for ρ-zCDP to ε(δ)-DP conversion
- CITATION.cff #552
- cleanup of accuracy utilities #626
- discrete_gaussian_scale_to_accuracy returns an accuracy one too large when the scale is on the lower edge
- improve float precision of laplacian_scale_to_accuracy and accuracy_to_laplacian_scale
- Reported by Alex Whitworth (@alexWhitworth). Thank you!
- clamp negative epsilon in make_zCDP_to_approxDP when delta is large #621
- Reported by Marika Swanberg and Shlomi Hod. Thank you!
- resolve build warnings from metadata in version tags
0.6.1 - 2022-10-27
- docs.rs failed to render due to Katex dependency
0.6.0 - 2022-10-26
- Restructured and expanded documentation on docs.opendp.org
- Moved notebooks into the documentation site
- Updated developer documentation and added introductions to Rust and proof-writing
- Much more thorough API documentation and links to corresponding Rust documentation
- Documentation throughout the Rust library, as well as proof definition stubs
- Additional combinators for converting the privacy measure
- make_pureDP_to_fixed_approxDP to convert ε to (ε, 0)-approx DP
- make_pureDP_to_zCDP to convert ε to ρ
- Additional accuracy functions for discrete noise mechanisms
- discrete_laplacian_scale_to_accuracy
- discrete_gaussian_scale_to_accuracy
- accuracy_to_discrete_laplacian_scale
- accuracy_to_discrete_gaussian_scale
- make_b_ary_tree Lipschitz transformation. Use in conjunction with:
- make_consistent_b_ary_tree to retrieve consistent leaf node counts
- make_quantiles_from_counts to retrieve quantile estimates
- make_cdf to estimate a discretized cumulative distribution function
- make_subset_by, make_df_is_equal and make_df_cast_default transformations
- used for simple dataframe subsetting
- make_chain_tm combinator for postprocessing
- Updates for proof-writing:
- rust/src/lib.sty contains a collection of latex macros to aid in cross-linking and maintenance
- See the proof-writing section of the developer documentation
- PRs with .tex proof documents are rendered by a bot
- Documentation will now embed links to proof documents that are adjacent to source files
- Proof documents are automatically hosted and versioned on docs.opendp.org
- An initial proof for make_count (by @silviacasac, @cwagaman @gracetian6).
- Renamed meas to measurements, trans to transformations and comb to combinators
- Added an honest-but-curious feature flag to make_population_amplification
- Python bindings check that C integers do not overflow
- Fixed clamping behaviour on make_lipschitz_float_mul
- Let the type of the sensitivity supplied to make_base_discrete_gaussian vary according to type QI
- Fix FFI dispatch in fixed approximate DP composition
0.5.0 - 2022-08-23
- Account for finite data types in aggregators based on our paper CSVW22
- For the sum #467, variance #475 and mean #476
- Formalize privacy analysis of data ordering #465 #466
- Stability/privacy relations replaced with maps #463
- You can now call .map on transformations and measurements to directly get the tightest d_out
- Composition of measurements #482
- Permits arbitrary nestings of compositions of an arbitrary number of measurements
- Discrete noise mechanisms from CKS20
- make_base_discrete_laplace is equivalent to make_base_geometric, but executes in a constant-time number of operations
- make_base_discrete_gaussian for the discrete gaussian mechanism
- Add zero-concentrated differential privacy to the gaussian and discrete gaussian mechanisms
- Output measure is now always ZeroConcentratedDivergence<Q>, and output distance is in terms of rho
- Add combinator to cast a measurement's output measure from ZeroConcentratedDivergence<Q> to SmoothedMaxDivergence<Q>
- meas_smd = opendp.comb.make_zCDP_to_approxDP(meas_zcd)
- The SmoothedMaxDivergence<Q> measure represents distances as an ε(δ) privacy curve:
- Can construct a curve by invoking the map: curve = meas_smd.map(d_in)
- Can evaluate a curve at a given delta: epsilon = curve.epsilon(delta)
- Add make_fix_delta combinator to fix the delta parameter in a SmoothedMaxDivergence<Q> measure
- The resulting measure is FixedSmoothedMaxDivergence<Q>, where the output distance is an (ε, δ) pair
- eps, delta = make_fix_delta(meas_smd, delta=1e-8).map(d_in)
- The fixed measure supports composition (unlike the curve measure)
- Utility functions set_default_float_type and set_default_int_type to set the default bit depth of ints and floats
- Exponential search when bounds are not specified in binary search utilities #453
- Support for Apple silicon (aarch64-apple-darwin target)
- Switched to a single Rust crate (merged opendp-ffi into opendp)
- Updated documentation to reflect feedback from users and added more example notebooks
- Packaging for Contributor License Agreements
- Improved formatting of rust stack traces in Python
- Expanded error-indexes
- make_base_geometric in favor of the more efficient make_base_discrete_laplace
- Constant-time execution can still be accessed via make_base_discrete_laplace_linear
- make_base_analytic_gaussian in favor of the (now generally tighter) make_base_gaussian
- This would have been a deprecation, but updating to be consistent with forward maps is nontrivial
- Rust documentation on docs.rs is built with "untrusted" flag enabled
- Python documentation for historical versions is rebuilt on correct tag
- Avoid potential infinite loop in binary search utility
- Replace the underlying implementation of make_base_laplace and make_base_gaussian to address precision-based attacks
- Both measurements map input floats exactly to an integer discretization, apply discrete laplace or discrete gaussian noise, and then postprocess back to floats
- The discretization is on ℤ*2^k, where k can be configured, similar to the Google Differential Privacy Library
- In contrast to the Google library, the approximation to real sampling continues to improve as k is chosen to be smaller than -45. We choose a k of -1074, which matches the subnormal ULP, giving a tight privacy map
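To make the ℤ*2^k discretization above concrete, here is a minimal sketch using exact rational arithmetic. The function names and the round-to-nearest choice are assumptions for illustration, not the library's implementation, and the noise step that sits between the two functions in the mechanism is left out.

```python
from fractions import Fraction


def discretize(x: float, k: int) -> int:
    """Return the integer i for which i * 2**k is the multiple of 2**k nearest to x."""
    # Fraction(x) is the exact value of the float, so no rounding error creeps in here.
    return round(Fraction(x) / Fraction(2) ** k)


def to_float(i: int, k: int) -> float:
    """Postprocess an integer on the grid back to a float."""
    return float(Fraction(i) * Fraction(2) ** k)


# A coarse grid (k = -2, spacing 0.25) visibly moves the input.
print(to_float(discretize(0.3, -2), -2))  # 0.25

# With k = -1074 the grid spacing equals the smallest subnormal double,
# so every finite float already lies on the grid and survives the round trip.
print(to_float(discretize(0.1, -1074), -1074) == 0.1)  # True
```

In the mechanism described above, discrete laplace or discrete gaussian noise would be added to the integer returned by `discretize` before converting back.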
- Fixed function in make_randomized_response_bool
- from proofwriting by Vicki Xu and Hanwen Zhang #481
- Multiplicative difference in probabilities in linear-time discrete laplace sampler are now exact around zero
- eliminates an un-accounted δ < ulp(e^-(1/scale)) from differing conservative roundings
- Biased bernoulli sampler on float probabilities is now exact
- eliminates an un-accounted δ < 2^-500 in RR and linear-time discrete laplace sampler
- from proofwriting by Vicki Xu and Hanwen Zhang #496
- Added conservative rounding when converting between MPFR floats and native floats
- MPFR has a different exponent range, which could lead to unintended rounding of floats that are out of exponent range
- make_base_gaussian's output measure is now ZeroConcentratedDivergence.
- This means the output distance is now a single scalar, rho (it used to be an (ε, δ) tuple)
- Use adp_meas = opendp.comb.make_zCDP_to_approxDP(zcdp_meas) to convert to an ε(δ) curve.
- Use fadp_meas = opendp.comb.make_fix_delta(adp_meas) to change output distance from an ε(δ) curve to an (ε, δ) tuple
- fadp_meas.check(d_in, (ε, δ)) is equivalent to the check on make_base_gaussian in 0.4
- replace make_base_analytic_gaussian with make_base_gaussian
- replace make_base_geometric with make_base_discrete_laplace
- make_basic_composition accepts a list of measurements as its first argument (it used to have two arguments)
- slight increase in sensitivities/privacy utilization across the library as a byproduct of floating-point attack mitigations
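The relationship between ρ, an ε(δ) curve, and a fixed (ε, δ) pair in the 0.5.0 notes can be sketched without the library. The code below uses the well-known loose bound ε = ρ + 2√(ρ ln(1/δ)) purely as a stand-in; it is an assumption for illustration and not the conversion the library applies, and the `PrivacyCurve`, `fix_delta` and `compose_fixed` names are hypothetical.

```python
import math
from typing import Iterable, Tuple


def zcdp_to_epsilon(rho: float, delta: float) -> float:
    """Loose textbook bound: rho-zCDP implies (rho + 2*sqrt(rho*ln(1/delta)), delta)-DP."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return rho + 2.0 * math.sqrt(rho * math.log(1.0 / delta))


class PrivacyCurve:
    """An epsilon(delta) curve: one rho yields a whole family of (epsilon, delta) pairs."""

    def __init__(self, rho: float):
        self.rho = rho

    def epsilon(self, delta: float) -> float:
        return zcdp_to_epsilon(self.rho, delta)


def fix_delta(curve: PrivacyCurve, delta: float) -> Tuple[float, float]:
    """Collapse the curve to a single (epsilon, delta) pair."""
    return curve.epsilon(delta), delta


def compose_fixed(pairs: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Basic composition: fixed pairs simply add, which a curve cannot do directly."""
    pairs = list(pairs)
    return sum(e for e, _ in pairs), sum(d for _, d in pairs)


curve = PrivacyCurve(rho=0.5)
eps, delta = fix_delta(curve, delta=1e-8)
total_eps, total_delta = compose_fixed([(eps, delta), (eps, delta)])
```

Smaller δ values give larger ε along the curve, which is why a measurement that reports a curve has to be pinned to one δ before its loss can be summed with others.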
0.4.0 - 2021-12-10
- make_randomized_response_bool and make_randomized_response for local differential privacy.
- make_base_analytic_gaussian for a tighter, analytic calibration of the gaussian mechanism.
- make_population_amplification combinator for privacy amplification by subsampling.
- make_drop_null transformation for dropping null values in nullish data.
- make_find, make_find_bin and make_index transformations for categorical relabeling and binning.
- make_base_alp for histograms via approximate laplace projections from Christian Lebeda (https://github.com/ChristianLebeda)
- make_base_ptr for stability histograms via propose-test-release.
- Added floating-point numbers to the admissible output types on integer queries like make_count, make_count_by, make_count_by_categories and make_count_distinct.
- Simple attack notebook from Oren Renard (https://github.com/orespo)
- Support for Numpy data types.
- Release helper script
- Resolved memory leaks in FFI
- moved windows patch directory into /rust
- added minimum rust version of 1.56 and updated to the 2021 edition.
- dropped sized-ness domain requirements from make_count_by
- make_base_stability underestimated the sensitivity of queries. Removed in favor of make_base_ptr.
- Floating-point arithmetic throughout the library now has explicit rounding modes such that the budget is always slightly overestimated. There is still some potential for small floating-point leaks via rounding in floating-point aggregations.
- Fixed integer truncation issue in the sized bounded sum privacy relation.
- The resize relation is now looser to account for a worst-case situation where d_in records are removed, and d_in new records are imputed.
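For readers unfamiliar with the randomized response mechanisms introduced in 0.4.0, the boolean case can be sketched in a few lines. This is an illustrative stand-in, not the library's sampler: the function names are hypothetical, and Python's `random` module is used only for brevity, whereas a real implementation needs a cryptographically secure source and exact probability handling.

```python
import math
import random


def randomized_response_bool(arg: bool, prob: float, rng=random) -> bool:
    """Report the true value with probability prob, otherwise report its negation."""
    if not 0.5 <= prob < 1.0:
        raise ValueError("prob must be in [0.5, 1)")
    return arg if rng.random() < prob else not arg


def epsilon_of(prob: float) -> float:
    """Privacy loss of the mechanism above: ln(prob / (1 - prob))."""
    return math.log(prob / (1.0 - prob))


rng = random.Random(0)  # seeded only so the demonstration is repeatable
answers = [randomized_response_bool(True, 0.75, rng) for _ in range(10_000)]
truthful_rate = sum(answers) / len(answers)
print(round(epsilon_of(0.75), 4))  # 1.0986, i.e. ln(3)
```

Each respondent randomizes locally, so no trusted curator ever sees the true answers; the aggregate can still be debiased because the flip probability is public.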
0.3.0 - 2021-09-21
- All unvetted modules (which is currently all modules) are tagged with the "contrib" feature
- Programs must explicitly opt-in to access the "contrib" feature
0.2.4 - 2021-09-20
- Version tag
0.2.3 - 2021-09-20
- Version tag
0.2.2 - 2021-09-20
- User guide, developer guide, and general focus on documentation
- Examples folder has complete notebooks for getting started with the library
- Usability issues in the FFI layer for make_count_by_categories and make_count_by
- The FFI for make_identity ensures proper domain metric pairing
0.2.1 - 2021-09-09
- Functions to convert between accuracy and noise scale for laplace, gaussian and geometric noise
- Error messages when chaining include a plaintext description of the mismatched domains or metrics
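The accuracy conversions added in 0.2.1 rest on a simple tail bound, shown here for continuous Laplace noise only. This is a sketch under that assumption with hypothetical function names; the library's own functions may use different conventions and handle floating-point rounding more carefully. For Laplace noise with scale b, P(|X| ≥ a) = exp(-a/b), so the noise stays within a = b·ln(1/α) with probability 1 - α.

```python
import math


def laplace_accuracy(scale: float, alpha: float) -> float:
    """Accuracy a such that |noise| < a holds with probability 1 - alpha."""
    return scale * math.log(1.0 / alpha)


def laplace_scale_for(accuracy: float, alpha: float) -> float:
    """Inverse of laplace_accuracy: the scale that achieves the requested accuracy."""
    return accuracy / math.log(1.0 / alpha)


a = laplace_accuracy(scale=1.0, alpha=0.05)
print(round(a, 4))  # 2.9957
```

Reading the output: with scale 1.0, a release is within about 3.0 of the true value 95% of the time.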
0.2.0 - 2021-08-31
- User guide outline
- Initial exemplar python notebooks
- Binary search utilities in Python
- Vec<String> and HashMap<K, V> data loaders
- Resize transformation for making VectorDomain<D> sized
- TotalOrd trait for consistency with proofs
- General renaming of library interfaces. See issue #181.
- Scalar clamping
- Adjust output domain on make_count_by_categories to make it chainable with measurements
0.1.0 - 2021-08-05
- Initial release.
The format of this file is based on Keep a Changelog. It is processed by scripts when generating a release, so please maintain the existing format.
Whenever you're preparing a significant commit, add a bullet list entry summarizing the change under the X.Y.Z-dev heading at the top. Entries should be grouped in sections based on the kind of change. Please use the following sections, maintaining the same ordering. If the appropriate section isn't present yet, just add it by copying from those below.
When a new version is released, a script will turn the Unreleased heading into a new heading with appropriate values for the version, date, and link. Then the script will generate a new Unreleased section for future work. Please keep the existing dummy heading and link as they are, so that things operate correctly. Thanks!