Skip to content

Releases: eBay/tsv-utils

tsv-summarize bug fixes

18 Feb 02:22
Compare
Choose a tag to compare

tsv-summarize

  • Newest operators were not hooked up correctly to command line args.
  • Correction to bash-completion

tsv-summarize: Missing field support

13 Feb 04:11
Compare
Choose a tag to compare

Two enhancements:

  • tsv-summarize: Missing value support. New command line options to either exclude or replace empty fields. This is a common pattern in some data sets. Also, some new operators related to missing values.
  • Bash-completion: Definitions enabling command option completion in bash shells. Needs to be manually installed. Details in the Tips & Tricks document.

Improved numeric printing in tsv-summarize

08 Feb 00:38
Compare
Choose a tag to compare

Fixed some edge cases where printing was being done using exponential notation where not intended.

tsv-summarize: numeric formating; Doc updates

06 Feb 08:34
Compare
Choose a tag to compare
  • tsv-summarize: Improved formatting of numeric values, especially when using the --p|float-precision option.
  • Reorganization of the documentation.

tsv-sample: Weighted reservoir sampling

21 Jan 23:58
Compare
Choose a tag to compare

New tool, tsv-sample, does sampling and randomization of data file lines. Both uniform and weighted random sampling is supported. Weighted sampling gets weights from a field in the input data. Implemented using reservoir sampling.

Copyright notice updates for 2017

09 Jan 10:04
Compare
Choose a tag to compare
v1.0.12

Copyright notice date updates. (#18)

tsv-filter and tsv-summarize updates

09 Jan 09:24
Compare
Choose a tag to compare
  • tsv-filter: New tests --is-numeric, --is-finite, --is-nan, is-infinity. These are useful to ensure a numeric test like --gt (greater than) are run only on field values with validly formatted numbers.
  • tsv-summarize: Take advantage the faster topN in DMD/Phobos version 2.073.

First release of tsv-append

04 Jan 02:29
Compare
Choose a tag to compare

tsv-append concatenates multiple TSV files, similar to the Unix cat utility. It is header aware, writing the header from only the first file. It also supports source tracking, adding a column indicating the original file to each row.

Concatenation with header support is useful when preparing data for traditional Unix utilities like sort and sed or applications that read a single file.

Source tracking is useful when creating long/narrow form tabular data. This format is used by many statistics and data mining packages.

Better command options: --help/help-verbose; --H|header

21 Dec 09:00
Compare
Choose a tag to compare

Minor improvements to command line arguments:

  • Use --help/help-verbose rather than --help/help-brief. The short version (-h | --help) is generally more useful than the long form.
  • Add -H as a short form of --header

Fix DUB build error

17 Dec 19:11
Compare
Choose a tag to compare
Fix DUB build error. Add common directory to tsv-filter build. Fix DM…

…D warning. (#10)