Releases: parquet-go/parquet-go
v0.25.1
What's Changed
New features
- ability to register extra encodings by @MichaHoffmann in #184
- Configuration for using another encoding as default by @hbernardo in #259
- Add struct tags for altering the logical type of binary columns by @neilaram1 in #263
- Write sorted map keys by @achille-roussel in #283
- Add a Close method on the column writer by @fpetkovski in #285
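For context on the sorted map keys change (#283): Go randomizes map iteration order, so code that walks a map directly can emit its entries in a different order on every run. The sketch below shows the general technique of collecting and sorting the keys first. It uses only the standard library; sortedKeys is a hypothetical helper, and nothing here is taken from the parquet-go implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending order.
// Hypothetical helper for illustration, not part of parquet-go.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	m := map[string]int{"b": 2, "a": 1, "c": 3}
	// Iterating over the sorted keys gives the same order on every run.
	for _, k := range sortedKeys(m) {
		fmt.Println(k, m[k])
	}
}
```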
Bug fixes
- drop tablewriter dependency by @achille-roussel in #287
- Fix #275 (panic after writer reset with multiple row groups) by @abbat in #292
Other changes
- Consistent test naming for legacy issues (issue numbers from segmentio repo) by @jhump in #250
- Writer.ColumnWriters now returns concrete ColumnWriter type instead of ValueWriter by @jhump in #248
- Don't allow a schema with zero columns by @jhump in #247
- Expand make format with modernize by @fpetkovski in #252
- Patch up two gaps on buffer reuse by @joe-elliott in #261
- ci: update versions by @rockwotj in #256
- fix(buffer): Improve Buffer resilience to invalid sort column names by @iamrajiv in #273
- Perf: Skip dictionary decode if we've already done it by @joe-elliott in #271
- AsyncPages: Stop reading after err by @joe-elliott in #269
- Fix seeking to row start when row values break across two or more data pages by @vbekiaris in #277
- feat: SortingWriter.File() returns FileView by @jacobmarble in #282
- fix: missing int32 and int64 LogicalTypes by @jacobmarble in #278
- Fix #293 (apply SkipPageBounds to SortingWriter[T]) by @abbat in #296
Internal
- modernize by @achille-roussel in #251
New Contributors
- @MichaHoffmann made their first contribution in #184
- @hbernardo made their first contribution in #259
- @vbekiaris made their first contribution in #277
- @neilaram1 made their first contribution in #263
- @abbat made their first contribution in #292
Full Changelog: v0.25.0...v0.25.1
v0.25.0
What's Changed
New features
- validate size of fixed-length values written by @achille-roussel in #201
- Lazy load schema state by @achille-roussel in #205
- Support TIME logical format struct tags by @MichaelUrman in #143
- write bloom filters near page index by @achille-roussel in #219
- reader: Add ability to read back Key Value metadata by @saswatamcode in #203
- improve async mode by @achille-roussel in #222
- expose more file internals by @achille-roussel in #230
- lazily read async pages by @achille-roussel in #229
- expose column chunk statistics by @achille-roussel in #233
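As an illustration of what validating fixed-length values (#201) can look like: a fixed-length byte array column declares one length for all its values, so a value of any other length cannot be encoded correctly. The sketch below is a minimal standard-library check under that assumption; validateFixedLen and its error text are hypothetical and do not reflect the library's actual code or messages.

```go
package main

import "fmt"

// validateFixedLen reports an error when value does not have exactly
// typeLength bytes. Hypothetical helper, not the parquet-go API.
func validateFixedLen(value []byte, typeLength int) error {
	if len(value) != typeLength {
		return fmt.Errorf("fixed-length value has %d bytes, want %d", len(value), typeLength)
	}
	return nil
}

func main() {
	// A 16-byte column (for example, a UUID) accepts only 16-byte values.
	fmt.Println(validateFixedLen(make([]byte, 16), 16))
	fmt.Println(validateFixedLen([]byte("short"), 16))
}
```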
Bug fixes
- fix definition levels of typed optional list by @achille-roussel in #197
- fix: change r15 usage so its not clobbered with go plugin buildmode by @rhnvrm in #211
- Revert "fix: change r15 usage so its not clobbered with go plugin buildmode" by @achille-roussel in #214
- fix issue 199 by @achille-roussel in #215
- ensure we error instead of silently misconstructing rows when we encounter pages of repeated columns that do not start at the beginning of a row by @achille-roussel in #220
- test opening files with optimistic read by @achille-roussel in #227
- fix performance regression when reading rows from async pages by @achille-roussel in #246
Documentation
- doc: fix invalid comment mentioning errors that don't exist by @achille-roussel in #190
- docs: Fix MergeRowGroups example by @KasonBraley in #208
- CODEOWNERS: update for 2025 by @achille-roussel in #236
Other changes
- file: add option to skip reading magic bytes by @achille-roussel in #189
- optimistically read file footer by @achille-roussel in #192
- add read mode to NewRowGroupRowReader by @achille-roussel in #194
- add optimistic read file option by @achille-roussel in #195
- optimize parquet.(*Column).Fields by @achille-roussel in #198
- fix: Correct conversion order in convertToType function by @yigal100 in #209
- Expose FileColumnChunk with methods to provide a ReaderAt by @jpugliesi in #212
- feat: parameterize Time/Timestamp IsAdjustedToUTC by @jacobmarble in #216
- feat: export NodesAreEqual by @jacobmarble in #217
- feat: [Generic]Writer.FileMetaData() by @jacobmarble in #218
- release async pages when seeking by @achille-roussel in #221
- Add tag option to support isAdjustedToUTC by @jlordiales in #231
- ci(tempo): Ensure parquet-go compatibility with Tempo via CI tests by @iamrajiv in #238
- Rudimentary support for Variants when building a schema by @jhump in #245
- Add Writer.ColumnWriters by @jhump in #242
Internal
- use slices package by @achille-roussel in #191
- simplify the column index implementation by @achille-roussel in #226
New Contributors
- @KasonBraley made their first contribution in #208
- @yigal100 made their first contribution in #209
- @jpugliesi made their first contribution in #212
- @rhnvrm made their first contribution in #211
- @jacobmarble made their first contribution in #216
- @saswatamcode made their first contribution in #203
- @jlordiales made their first contribution in #231
- @iamrajiv made their first contribution in #238
Full Changelog: v0.24.0...v0.25.0
v0.24.0
What's Changed
Other changes
- deps: copy files from segmentio/encoding by @achille-roussel in #159
- deps: remove testify dependency by @achille-roussel in #160
- Fix panic in newIndexedPage by @jhump in #169
- parquet: use generic implementation of ordering functions in purego build by @achille-roussel in #171
- Use atomic pointer for column and offset indexes, so they can be lazily populated in a thread-safe way by @jhump in #168
- Enable parquet package support on s390x - rebased by @pavolloffay in #173
- Improve performance after s390x support by @pavolloffay in #174
- chore(readme): Add badges and refactor readme header by @iamanjali1003 in #177
- support backwards compatible map schemas by @rockwotj in #181
Internal
- maintenance: reduce the amount of unsafe constructs by @achille-roussel in #165
- simplify big endian compat by @achille-roussel in #182
New Contributors
- @jhump made their first contribution in #169
- @pavolloffay made their first contribution in #173
- @iamanjali1003 made their first contribution in #177
- @rockwotj made their first contribution in #181
Full Changelog: v0.23.0...v0.24.0
v0.23.0
What's Changed
Other changes
- hashprobe: remove dependency on runtime.aeskeysched by @asubiotto in #142
- add SkipBounds WriterOption to omit min/max bounds by @derekperkins in #147
- rm min/max funcs in favor of built-ins by @derekperkins in #148
Full Changelog: v0.22.0...v0.23.0
v0.22.0
What's Changed
This release contains critical bug fixes and upgrades for the code and the repository automation, as well as a new feature to decode logical parquet maps as Go map values when reading parquet values into any values.
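To make the map feature concrete: in the Parquet format, a logical MAP is stored as a repeated group of key/value entries, so decoding it into a Go map amounts to folding that list of entries into a map. The sketch below shows the idea with the standard library only. The keyValue type and toGoMap function are hypothetical names for illustration, string keys are an assumption, and the library's real decoding path is not shown here.

```go
package main

import "fmt"

// keyValue models one entry of a logical map as it is laid out in the
// file: a repeated group holding a key and a value.
// Hypothetical type for illustration only.
type keyValue struct {
	Key   string
	Value any
}

// toGoMap folds the repeated entries into a Go map. If a key appears
// more than once, the last entry wins.
func toGoMap(entries []keyValue) map[string]any {
	m := make(map[string]any, len(entries))
	for _, e := range entries {
		m[e.Key] = e.Value
	}
	return m
}

func main() {
	entries := []keyValue{{"a", 1}, {"b", "two"}}
	m := toGoMap(entries)
	fmt.Println(len(m), m["a"], m["b"])
}
```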
Bug fixes
- fix parquet specification version in file metadata by @achille-roussel in #132
- Sorting merge data corruption by @thorfour in #140
Other changes
- Read logical maps as go maps in Readany by @MichaelUrman in #128
- dependency: upgrade github.com/klauspost/compress to v1.17.8 by @achille-roussel in #133
- CI: parquet-mr => parquet-java by @achille-roussel in #141
New Contributors
- @MichaelUrman made their first contribution in #128
Full Changelog: v0.21.0...v0.22.0
v0.21.0
What's Changed
Bug fixes
- The minvalue of the column statistic incorrectly ignores empty string by @Shinena1998 in #131
- Fix columnPages SeekToRow by @forsaken628 in #126
Improvements
- Bump google.golang.org/protobuf from 1.30.0 to 1.33.0 by @dependabot in #116
- Fix s/condution/conduct/ by @grantwwu in #117
- Allow specifying delta on field of type time.Time by @infogulch in #115
- zstd: add concurrency option to codec by @derekperkins in #125
- CONTRIBUTING.md: try to codify merge policy by @kevinburkesegment in #41
- Use one page buffer per column rather than one buffer per page by @zolstein in #61
Internal
- add release workflow by @achille-roussel in #114
New Contributors
- @achille-roussel made their first contribution in #114
- @dependabot made their first contribution in #116
- @grantwwu made their first contribution in #117
- @infogulch made their first contribution in #115
- @derekperkins made their first contribution in #125
- @forsaken628 made their first contribution in #126
- @Shinena1998 made their first contribution in #131
Full Changelog: v0.20.1...v0.21.0
v0.20.1
What's Changed
- Estimate dictionary decode size by @fpetkovski in #102
- Bug: Respect current page offset in reslice by @joe-elliott in #109
Full Changelog: v0.20.0...v0.20.1
v0.20.0
What's Changed
- Fix slicing a missing page by @fpetkovski in #88
- Fix NumValues for missingColumnChunk by @fpetkovski in #86
- Fix column index reuse by @thorfour in #92
- Lazy-load column index by @fpetkovski in #84
- Pool column index buffers when performing comparisons by @asubiotto in #97
- fix SortingWriter by @gernest in #100
- Return errors for missing page indices by @fpetkovski in #94
New Contributors
- @fpetkovski made their first contribution in #88
- @thorfour made their first contribution in #92
- @asubiotto made their first contribution in #97
Full Changelog: v0.19.0...v0.20.0
v0.19.0
What's Changed
- docs: fix repo links by @haoxins in #62
- chore: Add cover flag to go test by @haoxins in #63
- Fix panic reading/writing any types by @Jeffail in #51
- Remove format.PageHeader caching by @mapno in #71
- fix merging empty row groups by @gernest in #72
- Fix minimum Go version in README documentation by @robertino in #77
- add support for field id by @gernest in #73
- Error when uncompressed page size exceeds the max int32 value by @stoewer in #81
New Contributors
- @haoxins made their first contribution in #62
- @Jeffail made their first contribution in #51
- @mapno made their first contribution in #71
- @robertino made their first contribution in #77
- @stoewer made their first contribution in #81
Full Changelog: v0.18.0...v0.19.0
v0.18.0
What's Changed
- Update README to state its community maintained by @gernest in #37
- Fix data corruption in column statistics by @zolstein in #54
- Add CODEOWNERS by @joe-elliott in #39
- compress/lz4: Use default Compressor for Fast LZ4 level by @ngotchac in #10
- Migrate WriterTest to use parquet-cli by @zolstein in #52
New Contributors
- @zolstein made their first contribution in #54
- @joe-elliott made their first contribution in #39
- @ngotchac made their first contribution in #10
Full Changelog: v0.17.0...v0.18.0