Releases: NVIDIA/aistore

3.23

28 May 14:49

Version 3.23 arrives three months after the previous one. Apart from datapath optimizations and bug fixes, most of the changes are grouped under the topics listed in the table of contents that follows.

Table of Contents

  • List Objects; Bucket Inventory
  • Selecting Primary at startup; Restarting cluster when node IPs change (K8s)
  • S3 (backend, frontend)
  • BLOBs
  • Mountpath labels
  • Reading shards; Reading from shards

List Objects; Bucket Inventory

  • S3 backend: S3 ListObjectsV2 may return a directory !6672
  • list very large buckets using bucket inventory !6682, !6684, !6686, !6689, !6692
  • list-objects: optimize for prefix; add 'dont-optimize' feature flag !6685
  • list very large buckets using bucket inventory (major update, API changes) !6695, !6698
  • list very large buckets using bucket inventory !6704
  • list-objects: support non-recursive operation (new) !6711, !6712
  • refactor and code-generate (message pack) list-objects results !6714
  • bucket inventory; generic no-recursion helper !6715
  • bucket inventory: support arbitrary schema; add validation !6769
  • list-objects: micro-optimize setting custom properties of remote objects !6770
  • list very large buckets using bucket inventory !6775, !6776, !6777, !6778
  • list very large buckets using bucket inventory (major) !6810, !6811
  • list very large buckets using bucket inventory !6815
  • list-objects: skip virtual directories !6835
  • list very large buckets using bucket inventory !6847, !6851, !6853
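
The non-recursive mode (!6711, !6712) and the skipping of virtual directories (!6835) can be pictured with a small model. This is a minimal sketch, assuming flat object names that use "/" as a separator; the function `list_non_recursive` is hypothetical and is not part of the aistore API.

```python
def list_non_recursive(keys, prefix=""):
    """Return (objects, virtual_dirs) found directly under `prefix`.

    `keys` is an iterable of flat object names; nothing below the first
    "/" after the prefix is expanded.
    """
    objects, dirs = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition("/")
        if sep:
            # Deeper than one level: report only the virtual directory.
            dirs.add(prefix + head + "/")
        elif rest:
            objects.append(key)
    return objects, sorted(dirs)


keys = ["a/1.txt", "a/b/2.txt", "a/b/c/3.txt", "top.txt"]
print(list_non_recursive(keys, "a/"))
```

A recursive listing of the same prefix would instead return all three names under `a/`.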

Selecting Primary at startup; Restarting cluster when node IPs change (K8s)

  • primary role: add 'is-secondary' environment; precedence !6746
  • 'original' & 'discovery' URLs (major) !6747, !6749
  • cluster config: new convention for primary URL; role of the primary during: initial deployment, cluster restart !6752, !6755
  • cluster restart with simultaneous change of primary (major) !6758, !6760, !6761
  • primary startup: always update node net-infos !6762
  • all proxies to store RMD (previously, only primary) !6764
  • node join: remove duplicate IP check (is redundant) !6783
  • K8s startup when proxies change their network infos !6785
  • primary startup: initial version of the cluster map !6787
  • non-primary startup: retry and refactor; factor in !6788
  • K8s: primary startup when net-infos change !6789

S3 (backend, frontend)

  • backend put-object interface; presigned S3 (refactoring & cleanup) !6662
  • default AWS region (cleanup) !6679
  • s3cmd: add negative testing !6681
  • backend: S3 ListObjectsV2 may return a directory !6672
  • backend: consolidate environment and defaults !6678
  • backend: retain S3-specific error code !6688, !6691
  • move presigned URLs code to backend package !6801
  • multipart upload: read and send next part in parallel !6803
  • backend: refactor and simplify !6819
  • new feature flag to enable (older) path-style addressing !6821
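
For readers unfamiliar with the two S3 addressing schemes mentioned in the last item, here is a minimal sketch of how the URLs differ. It only illustrates the general AWS convention; the helper `s3_url` and its parameters are assumptions made for this example and do not show how aistore or its feature flag is implemented.

```python
from urllib.parse import quote


def s3_url(bucket, key, region="us-east-1", path_style=False):
    """Build an S3 object URL in virtual-hosted or (older) path style."""
    host = f"s3.{region}.amazonaws.com"
    key = quote(key)  # percent-encode the key; "/" is left as-is
    if path_style:
        # Path style: the bucket is the first path element.
        return f"https://{host}/{bucket}/{key}"
    # Virtual-hosted style: the bucket is part of the host name.
    return f"https://{bucket}.{host}/{key}"


print(s3_url("my-bucket", "dir/obj.bin"))
print(s3_url("my-bucket", "dir/obj.bin", path_style=True))
```

Path-style addressing is typically needed for S3-compatible endpoints that cannot resolve per-bucket host names.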

BLOBs

  • config change: assorted feature flags now have bucket scope (major) !6664, !6666
  • Python: blob-download API !6687
  • Python: get and prefetch with blob-download !6708
  • blob downloader (minor ref) !6793
  • blob-downloader: finalize control structures; refactor !6812
  • GET via blob-download !6873
  • multiple blob-download jobs (fixes) !6876
  • prefetch via blob-downloader !6882

Mountpath labels

  • override-config, fspaths section (minor ref) !6718
  • config change, API change: mountpath labels (major) !6721, !6722, !6725, !6726, !6733, !6734, !6735, !6736, !6738
  • backward compatibility v3.22 and prior; bump CLI version !6740, !6742
  • log: mountpath labels vs shared filesystems; memory pressure !6744

Reading shards; Reading from shards

  • reading (from) shards: add read-until, read-one, and read-regex methods !6823
  • reading shards: read-until, read-one, read-regex !6824
  • WebDataset: add wds-key; add comments !6826
  • reading .TAR, .TGZ, etc. formatted objects (a.k.a. shards) - multiple selection !6827
  • GET request to select multiple archived files (feature) !6859
  • GET multiple archived files in one shot (feature) !6861, !6862, !6863, !6864, !6866
  • Python: GET multiple files from an archive (shard) !6860
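
To make "multiple selection" from a shard concrete, the sketch below selects archived files by regular expression on the client side, using only the Python standard library. It is an illustration under stated assumptions: `make_shard` and `read_matching` are hypothetical helpers, and the code does not reproduce the aistore server-side read-until, read-one, or read-regex methods.

```python
import io
import re
import tarfile


def make_shard(files):
    """Build an in-memory TAR from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_matching(shard, pattern):
    """Return {name: bytes} for each archived file whose name matches."""
    rx = re.compile(pattern)
    out = {}
    # "r:*" lets tarfile detect compression (.tar, .tgz, ...) on its own.
    with tarfile.open(fileobj=io.BytesIO(shard), mode="r:*") as tar:
        for member in tar:
            if member.isfile() and rx.search(member.name):
                out[member.name] = tar.extractfile(member).read()
    return out


shard = make_shard({"0001.jpg": b"img1", "0001.cls": b"7", "0002.jpg": b"img2"})
print(sorted(read_matching(shard, r"\.jpg$")))
```

Selecting on the server avoids transferring the whole shard; this client-side version only shows the selection semantics.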

Core

  • backend put-object interface (refactoring & cleanup) !6662
  • get-stats API vs attach/detach mountpaths !6669
  • unwrap URL errors; remove mux.unhandle; CLI: more tips !6673
  • removing a node from a 2-node cluster (in re: rebalance) !6674
  • POST /v1/buckets handler: add one more check to URI validation !6690
  • last byte (minor ref) !6694
  • project layout: move and consolidate all scripts !6699
  • extend RMD to reinforce cluster integrity checking !6702
  • micro-optimize fast-path fqn parsing !6707
  • continued refactoring !6709, !6710
  • security dependabot: fix #15 and #16 !6713
  • aisnode: remove logs from conf !6727
  • extract and unify cluster information; add flags !6741
  • copy shared FS capacity; color high/low usage pct; up cli !6743
  • node flags in a cluster map vs (node | cluster) restart; node equality !6765
  • receive cluster-level metadata (minor ref) !6766
  • dsort: write compressed tar !6771
  • dsort: read compressed tar; add linter !6772
  • backend: uniform naming, common base !6774
  • remove AIS_IS_PRIMARY environment (is obsolete) !6781
  • nlog: allow setting logging to STDERR flag in config !6791
  • feature flags fsync-put will now have (also) bucket scope !6804
  • cold GET: write locally and transmit in parallel (new) !6805, !6807
  • move atomic 'stopping' (ref) !6817
  • aisloader: add 's3-use-path-style' command line, to use older path-style addressing !6822
  • cold GET (fast): fclose and check !6825
  • speed-up batch jobs (prefetch, archive, copy/transform, multi-object evict/delete) !6830
  • LOM: add open-file method !6836
  • nlog: while stopping !6837
  • multi-object TCB/TCO; not in-cluster objects; multi-page fix !6840, !6842
  • xaction registry: when hk call is premature !6843
  • add metrics: get-size and put-size !6849
  • memsys/SGL: add compliant 'write-to' interface impl.; amend fast/simplified 'write-to' !6854, !6856, !6857
  • stats and metrics: report cumulative GET and PUT sizes in bytes !6855
  • datapath query parameters: preparse, reduce size !6858
  • stats: fix Prometheus label for total size !6871
  • imports (ref) !6878
  • move and rename 'node-state-info' and 'node-state-flags' (ref) !6879
  • new metric: node-state-flags (bitwise, gauge) !6880
  • add management alerts: out-of-space & low-capacity (major) !6883
  • add management alerts: out-of-memory & low-on-memory !6885
  • microbench: use math/rand/v2 !6886
  • transition to Go 1.22 math/rand/v2; crypto/rand reader !6887
  • dsort test: use rand.v2 !6888
  • transition to Go 1.22 math/rand/v2; add seeded-reader !6890
  • cleanup 'cos/math' (ref) !6891
  • tests: fix prefix-test for remote ais cluster !6893
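
The node-state-flags metric (!6880) is described as a bitwise gauge, meaning that one integer carries several independent conditions. The sketch below shows how such a value could be decoded. The flag names follow the alerts listed above, but the bit positions and the `NodeState` type are assumptions for illustration only and do not reflect the actual aistore encoding.

```python
from enum import IntFlag


class NodeState(IntFlag):
    # Hypothetical bit assignments, one bit per condition.
    OUT_OF_SPACE = 1
    LOW_CAPACITY = 2
    OUT_OF_MEMORY = 4
    LOW_ON_MEMORY = 8


def decode(gauge):
    """List the names of all conditions set in an integer gauge value."""
    return [flag.name for flag in NodeState if gauge & flag]


print(decode(5))
print(decode(0))
```

With this layout, an alerting rule can test a single condition with a bitwise AND rather than tracking one metric per condition.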

CLI

  • 'more' fixes !6665
  • more tips !6673
  • warn when switching cluster to operate in reverse proxy mode !6703
  • show feature flags symbolically !6705
  • backward compatibility v3.22 and prior; bump CLI version !6740
  • 'ais show cluster' to highlight nodes that are low on memory !6745
  • 'ls' and 'show object' to support size units (raw, SI, IEC) !6795
  • progress bar decorators; elapsed time !6797
  • fix used and available capacity !6806
  • fix 'show throughput' to not show throughput when !6813
  • quiet 'show cluster', 'show performance'; misplaced flags !6814
  • 'ais ls' help and inline examples; native GET: add query params !6816
  • copying remote objects; progress bar; usability !6839
  • extend 'ais gen-shards' to generate WD-formatted shards !6865
  • add '--count-and-time-only' option !6868, !6869
  • max-pages and limit !6870
  • stopping jobs !6875
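
The three size units supported by 'ls' and 'show object' (!6795) differ only in the divisor: raw prints the byte count, SI uses powers of 1000, and IEC uses powers of 1024. A minimal sketch follows; the function `format_size` and its exact output format are assumptions and may differ from what the CLI prints.

```python
def format_size(n, units="iec"):
    """Format a byte count as raw bytes, SI (kB, MB) or IEC (KiB, MiB)."""
    if units == "raw":
        return str(n)
    if units == "si":
        base, suffixes = 1000, ["B", "kB", "MB", "GB", "TB"]
    elif units == "iec":
        base, suffixes = 1024, ["B", "KiB", "MiB", "GiB", "TiB"]
    else:
        raise ValueError(f"unknown units: {units!r}")
    value, i = float(n), 0
    while value >= base and i < len(suffixes) - 1:
        value /= base
        i += 1
    if i == 0:
        return f"{n}B"  # below one unit: no fractional part
    return f"{value:.2f}{suffixes[i]}"


for units in ("raw", "si", "iec"):
    print(units, format_size(1536, units))
```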

Python

  • add test for invalid bucket name !6683
  • blob-download API !6687
  • add timeout option to client + version bump !6693
  • get and prefetch with blob-download !6708
  • tests constants and refactoring !6717
  • prefetch blob-download tests !6719
  • cluster performance API !6724
  • remote enabled tests cleanup refactored !6731
  • add missing job tests !6737
  • fix formatting issues !6753
  • PyTorch: add Iterable-style datasets for AIS Backend !6759
  • writer for image dataset !6767
  • AISSource: list all objects !6779
  • add example for dataset_writer !6794
  • add tests for dataset writer !6799
  • log missing attributes in write_dataset !6820
  • update docs !6844
  • add MultiShard Stream to PyTorch !6852
  • GET multiple files from an archive !6860

Build, CI

  • transition to Go 1.22 !6675
  • upgrade OSS packages !6680, !6750, !6768
  • lint: upgrade; Go 1.22 int range !6728, !6732
  • CI: MacOS fix !6729
  • remove HDFS backend !6773
  • upgrade golang.org/x/net !6831
  • lint; min/max shadow !6850
  • build: transition to Go 1.22 math/rand/v2 !6892
  • CI: maintenance !6838
  • lint: golangci-lint !6894

Documentation

  • docs: fix https getting-started !6668
  • docs: amend getting started !6670
  • docs: fix the broken table of contents link !6677
  • blog: Very large !6874

3.22

25 Feb 18:14

Highlights

  • Blob downloader
  • Multi-homing: support multiple user-facing network interfaces
  • Versioning and remote sync
    • execute in presence of out-of-band changes/deletions
    • support latest version: the capability to check in-cluster metadata and, possibly, GET, download, prefetch, and/or copy the latest remote (object) version
    • remote sync: same as above, plus: remove in-cluster object if its remote counterpart is not present (any longer)
    • both latest version and remote sync are supported in a variety of APIs (including GET primitive) and tools (CLI, aisloader)
  • Intra-cluster n-way mirroring
    • to withstand a loss of node(s), erasure coding is now optional
  • AWS S3 (frontend) API
    • multipart V2 (major upgrade); other productization
    • listing very large S3 datasets
    • support presigned S3 requests (beta)
  • List objects (job): show diff: in-cluster vs. remote
  • Prefetch (job): V2 (major upgrade)
  • Copy/transform (jobs): V2 (major upgrade)
  • AWS S3: migrate AWS backend to AWS SDK V2
  • Azure Blob Storage: transition to latest stable native SDK
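
The difference between "latest version" and "remote sync" in the versioning item above can be summarized as a small decision function. This is a sketch based only on the descriptions in these notes: the function `reconcile` and its action names are hypothetical, and the real implementation covers more cases (checksums, non-versioned buckets, and so on).

```python
def reconcile(local_version, remote_version, sync=False):
    """Decide what to do with one object.

    local_version:  version held in the cluster, or None if not present
    remote_version: version in the remote bucket, or None if deleted
    sync:           False for "latest version", True for "remote sync"
    """
    if remote_version is None:
        if local_version is None:
            return "not-found"
        # Only remote sync removes objects that vanished remotely.
        return "delete-local" if sync else "keep-local"
    if local_version != remote_version:
        return "fetch-remote"
    return "up-to-date"


print(reconcile("v1", "v2"))
print(reconcile("v1", None))
print(reconcile("v1", None, sync=True))
```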

See also: aistore features and brief overview.

Core

  • NVMe multipathing: pick alternative block-stats location !6432
  • rotate logs; remove redundant interfaces, other refactoring !6433
  • cold GET: add stats !6435
  • http(s) clients: unify naming, construction; reduce code !6438, !6439
  • don't escape URL paths; up cli !6441
  • dsort: sort records (minor) !6445
  • core: micro-optimize copy-buffer !6447
  • list-objects utilities and helpers; rerun list-objects code-gen: refactor and optimize; cleanup !6450, !6451
  • intra-cluster transport: zero-copy header !6455
  • Go API: (object, multi-object): ref !6456
  • add 'read header timeout'; docs: aistore environment variables !6459
  • core: support target multi-homing - comma-separated IPs (part one) !6464
  • package 'ais': continued refactoring; up cli !6466
  • support multiple user-facing network interfaces (multi-homing) !6467, !6468
  • when setting backend two (or more) times in a row !6469
  • core: (begin, abort, commit) job - corner cases !6470
  • in-cluster K8s environment: prune and cleanup, comment, and document !6471
  • multi-object PUT - variations !6473, !6474
  • unify PUT and PROMOTE destination naming !6475
  • APPEND (verb) to append if exists; amend metadata (major) !6476
  • EC: refactor and simplify erasure-coding datapath; docs: remove all gitlab references !6477
  • list-objects: enforce intra-cluster access, validate !6480
  • EC: remove redundant state; simplify !6481
  • Go API get-bmd; follow-up !6483
  • EC: cleanup manager: remove rlock and unused map - micro-optimize !6490
  • copy bucket: extend the command to sync remote bucket !6491
  • extend 'copy bucket' to sync remote !6494, !6495, !6497, !6498, !6499
  • don't compare checksums of different (checksum) types !6496
  • when deleting non-present (remote) object !6502
  • move transform/copy-bucket from 'mirror' package to 'xs' !6503
  • don't create data mover in a single-node cluster !6504
  • multi-object transform/copy (job): add missing cleanup !6506
  • multi-object transform & copy !6507
  • core: abort all (jobs) of a given kind; CLI 'ais stop'; strings: Damerau-Levenshtein !6508
  • revamp target initialization !6509
  • copy/transform remote, non-present !6510
  • revamp target initialization !6512, !6513
  • [API change] get latest version (feature) !6516
  • amend Prefetch; flush atime cache when shutting down !6517
  • amend metadata cache flushing logic (atime, prefetch, is-dirty) !6518
  • core: remote reader to support 'latest version' !6519
  • extend config ROM; follow-up !6520
  • Prefetch v2 !6521
  • backend error formatting; notification-listener name !6522
  • [API change] Prefetch v2; multi-object operations !6523
  • Prefetch v2; cold-get stats; put size !6524
  • [config change] versioning vs remote version changed or deleted !6525, !6526
  • add 'remote-deleted' stats counter; Prefetch: test more !6528
  • AWS backend not-found; job status; other cleanup !6529
  • core: refactor 'copy-object' interface, prep to sync remote => in-cluster !6531
  • [Cluster Config change] versioning vs remote version: remote changed, deleted !6532
  • copy/transform (bucket | multi-object); intra-cluster notifications !6533
  • revise/simplify 'is-not-exist' check; ldp.reader to honor sync-remote option !6537
  • pre-parse (log-modules, log-level); micro-optimize !6538
  • amend error handling: not-found vs list iterator; OOS !6539
  • jobs ("xactions"): add and log non-critical errors; join(error) and friends !6540
  • [API change] list-objects to report 'version-changed' (new) !6541
  • list-objects to report 'version-changed' (new) !6543, !6545
  • list-objects to report: 'version-changed', 'deleted' !6546
  • list-objects to support (in-cluster <=> remote) diff !6547, !6548
  • copy/transform with an option to sync remote: prune destination !6549
  • copy/transform --sync: add stress test, extract "pruning" logic !6550
  • revise and refine object write transaction (OWT) !6554, !6555
  • Go API: amend 'wait-idle' helper method !6558
  • copy/transform '--sync': use probabilistic filtering !6559
  • refactor list-range-prefix iterator !6560
  • multi-object copy/transform with '--sync' option !6561
  • S3 API (on the front): fix list-objects !6562, !6563
  • multi-object copy/transform with '--sync' option !6564
  • core: reset idle timer; xaction names (micro-optimizations) !6565
  • core: ETag in response headers !6569
  • S3 API (frontend): validate object names; multipart pathnames !6570
  • copy/transform with '--sync' option: add scripted test !6571, !6573
  • backend: special case to return 404 instead of 403 !6575
  • productize Azure backend !6576, !6578, !6580
  • S3 multipart: write-through all parts !6585
  • multipart upload: write-through all parts !6586
  • multipart upload: add extended error message; add stress test !6587
  • all supported backends: revisit range read (make it consistent across) !6589
  • introduce blob downloader (new) !6592
  • xaction (job) descriptor: remove unused specifiers !6593
  • blob downloader: add dedicated (non-generic) control path !6595
  • blob downloader (new) !6596, !6599, !6603
  • multipart upload: fix s3cmd to run elsewhere !6600, !6601
  • blob downloader (new) !6605, !6606, !6608
  • blob downloader (new); remote AIS cluster !6613
  • silent HEAD(bucket) !6614
  • leverage erasure coding to provide intra-cluster mirroring (new) !6615, !6616
  • blob downloader (new) !6618
  • S3 (frontend): support presigned S3 requests (new) !6621
  • intra-cluster mirroring: add integration test (no limit) !6622
  • blob downloader (new) !6628, !6629, !6631, !6632, !6633, !6639
  • add target's get-cold-blob interface; refactoring !6634
  • AWS backend: nil client !6636
  • Prefetch via blob-downloader: add 'blob-threshold' option !6637, !6638
  • blob-downloader: user abort; expected checksum !6646
  • Azure: ETag as object version; build !6647
  • Azure: transition from preview to stable 1.x (major) !6648
  • AWS backend: use sync.Map instead !6649, !6651
  • (AWS, GCP) backend: log extended error info; RC5 !6653
  • S3: presigned S3 requests; bucket config: add max-page-size !6657

Python

  • v1.4.17 release !6431
  • add support for self-signed certificates with or without verification !6465
  • add 'latest' flag for GET !6536
  • latest flag for prefetch and copy !6542
  • release 1.4.19 !6544
  • stress test for copy w/ '--sync' !6552
  • fix pylint to pass !6556
  • test multi-object copy with '--sync' flag !6567
  • fix black formatter issues in github CI !6582
  • github-CI lint - follow up !6583
  • support range read (offset, length) !6588
  • update common requirements !6609
  • bump SDK version !6610
  • lint: add more !6454

Bench

  • aisloader-composer: install docker alongside latest cri-o on CentOS !6436
  • aisloader-composer: fix install-docker and update OCI inventory !6446, !6449
  • aisloader-composer: update OCI inventory; avoid using reserved variables in playbooks !6452
  • aisloader-composer: update dashboard with k8s only networking visualization !6453
  • aisloader: support latest-version !6581
  • aisloader: add '--cached' flag !6623

Build, CI

  • refactor common 'k8s' package; up cli mod; docs !6434
  • build/minikube: skip making cli !6437
  • gitlab-CI: scheduled pipeline changes !6442
  • upgrade OSS packages !6443
  • lint: enable gocritic "huge-param" !6457
  • lint: add gosec linter !6462
  • gitlab: add etl label & rule !6488
  • github-CI: publish pypi package for aistore !6492
  • build: upgrade all minors !6501
  • rename 'cluster' package !6514
  • 'api' package not to import 'core' !6515
  • tests, tests, and more tests !6530
  • CI: fix HDFS docker image !6566
  • CI: remove HDFS build and tests !6572
  • deployment: add jq to init container for parsing JSON in Bash scripts !6577
  • CI: update tgt cnt for test short !6579
  • gitlab CI: add short test for cloud providers and long test for Azure !6584
  • build: new linter !6624
  • add github issue templates !6630
  • build: release candidate 4 (rc4) !6640
  • build: rc7; fixes !6658

Documentation

3.21

05 Nov 22:20

Highlights

  • cold GET: extract and micro-optimize the flow
  • sync Cloud bucket
    • leverage validate-warm-GET bucket config, and
    • extend it to support non-versioned Cloud buckets, and
    • optionally, delete (remotely deleted) objects
  • bucket sizing and counting:
    • support very large buckets that are not necessarily present in the cluster;
    • unify ais ls --summary and ais storage summary to utilize the same control message and flags
  • list, summarize, and lookup the properties of remote buckets without adding them to cluster's BMD
  • HTTPS:
    • support TLS configuration to authenticate clients
    • switch cluster from HTTP to HTTPS, and vice versa
  • optimize metadata cache
  • optimize capacity management
  • bug fixes, performance improvements

Core

  • set prime-time to amend local generation of globally unique IDs !6325
  • multi-object (archive, copy, transform) jobs: transport endpoint !6326
  • core: (maintenance, decommission, shutdown) transition w/ rebalancing !6327
  • core: (maintenance, decommission, shutdown) transition w/ rebalancing !6328
  • intra-cluster transport: make receive-side stats optional !6329
  • intra-cluster transport: reduce receive side contention !6330
  • fix channel full condition; rebalance-cluster; transport !6331
  • feature flags: add limited-coexistence; transport: track closed endpoints !6334
  • fix prime-time: add caller-is-primary; up cli module !6335
  • switch existing cluster between HTTPS and HTTP !6336
  • Go 1.21: use built-in min and max functions !6337
  • list-objects(remote-bucket-and-only-remote-props); Go 1.21 clear built-in !6339
  • Go 1.20: use typed atomic pointer, remove unsafe !6343
  • core: assorted micro-optimizations; remove read locks !6346
  • tweak multi-error join-err, remove error channel (minor) !6347
  • [API change] capacity management !6348
  • xxhash; field-align vol package !6349
  • bucket: new-query help; silent GET; test tools !6350
  • etl: adding fqn param to spec templates !6351
  • low-level control structs: bucket, namespace !6352
  • etl: Keras template fix !6355
  • etl: fix hello-world ais-etl tests !6356
  • core: don't recompute uname hash !6359
  • repackage HRW methods !6361
  • core: lom cache v2 (major update) !6362
  • refactor: downloader's diff resolver; control plane (receive BMD) !6363
  • core: lom metadata cache (cont-ed) !6365
  • dsort: error handling, assorted cleanups, more scripted tests !6366
  • core transactions: concurrency !6368
  • downloader: throttle; wait !6369
  • optimize cold GET !6370
  • global rebalance: log; minor edits !6373
  • core: update backend 'get-reader' API (all supported backends) !6374
  • core: validate-warm-get to support non-versioned buckets, and more !6375
  • validate-warm-get to support non-versioned buckets !6376
  • [API change] silent HEAD(object) request !6378
  • core: add load-unsafe (the faster way to load local metadata) !6382
  • total disk size: compute at startup, recompute on change !6383
  • [API change] new bucket summary; unify list-objects and summary !6384, !6386, !6387
  • add config.Rom to consolidate assorted "read-mostly" config values; refactor and unify !6388
  • [API change] new bucket summary (major update) !6390
  • mountpath jogger: support bucket query !6392
  • backend providers: do not include (checksum, version) if not asked to !6394
  • python: updated bucket info API !6395
  • feature flags: don't-add-remote & don't-head-remote; log: add s3 module; verbosity; !6398
  • support listing remote buckets without adding them to cluster's BMD !6399
  • concurrent HEAD(object) vs evict/create bucket - fix the race !6400
  • [API change] list and summarize remote buckets without adding remote buckets to cluster's BMD !6401
  • datapath query (dpq) !6402
  • Go-based API: response header to error message !6403
  • [API change] new bucket summary !6405, !6406
  • downloader: streamline and cleanup initialization sequence !6409
  • HTTPS: support TLS configuration !6410, !6411, !6412, !6413, !6414, !6415, !6416
  • assorted minor fixes !6417, !6418
  • core: cold GET: fast path & slow path !6419
  • cluster configuration: flip validate-cold-get !6420
  • downloader (major update); [API change]: xaction registry !6422
  • validate-warm-get: add scripted test utilizing remote ais cluster !6423
  • core: cold GET: fast path & slow path !6424, !6427
  • feature flags: add disable-fast-cold-get; show performance latency; up cli module !6425
  • refactor ais/utils !6429

Bench: aisloader and aisloader-composer

  • skip list objects for 100% put load !6332
  • composer: add playbook and script for initial aisloader copy !6333
  • composer: add support for aisloader --filelist option !6345
  • default value for duration should be infinite if num-epochs value is defined !6353
  • composer: add epochs option for GET workloads !6354
  • composer: add cluster name prefix to netdata sources for easier filtering !6357
  • new bucket not to be listed; usability !6358

CLI

  • typed does-not-exist error; misc !6358
  • always print dsort job description !6367
  • show cluster to report total num disks !6371
  • show performance: usability fixes, improvements !6379
  • show performance not to filter regex-selected zero columns !6380
  • attempt to copy/transform an empty remote bucket !6393
  • new bucket summary; evict multiple buckets in one shot; pretty print !6396, !6397
  • ais show bucket with an option to add remote bucket to cluster's BMD (effectively, create bucket) !6404
  • ais search: CLI command search results to include idiomatic extensions !6428

Build, CI, Deployment

  • tests: upon node shutdown: wait for the node to stop (tcp) listening !6338
  • CI: add gather-logs template for K8s tests !6340
  • deploy: ais with HTTPS in minikube !6364
  • build: bump urllib version !6372
  • tests: validate-warm-get (scripted) !6423
  • K8s playbooks: update kill aisloader command !6385
  • docs: validate-warm-get; assorted !6377
  • docs: add performance.md; inline help; rm all-columns flag (redundant) !6381
  • build: upgrade all minors !6389
  • CI: add checkmarx scan !6391
  • build: upgrade golangci-lint, add linters !6407, !6408

3.20

12 Sep 15:14

Core

  • tweak stop-maintenance logic; rebalance: cleanup log messages; assorted minor fixes !6288
  • do not timestamp err-aborted message !6290
  • [API change] dsort: remove extended metrics; add new counters; revise and refactor !6297
  • list-objects; house-keeper; aisloader, logger (assorted fixes) !6298
  • core stats: remove mutex and work channel - speed up !6299
  • slab allocator: remove stats mutex, do not sort !6300
  • consolidate and revise OOM handling !6301
  • ETL: require admin access to create & delete; add feature flag !6302
  • remove unused heartbeat tracker w/ minor ref !6308
  • reimplement keep-alive mechanism (major) !6309
  • keep-alive v2 (major update) !6312
  • keep-alive v2: remove timeout stats (control structure and code) !6317
  • keep-alive v2: add fast path !6320
  • micro-optimize get-all-running (jobs); atomic heard-from/timed-out !6321
  • node-restarted: remove 'lsof', use net dialer; fix node-decommissioning tests !6322

Tools and tests

  • CI: update fspath (aka mountpath) config for minikube-based aistore deployments !6289
  • aisloader: list and read s3 buckets directly !6291
  • aisloader: list, read, and write s3 buckets directly !6292
  • tests: K8s long tests (EchoGolang) fix !6293
  • aisloader: fix cleanup option for s3 bucket benchmarks !6294
  • aisloader: reimplement direct get from s3 - use SDK !6295
  • aisloader: show progress when listing s3 directly !6296
  • CLI: add show details param to etl !6304
  • tools: add check for ais etl deployment !6305
  • tools: add ETL_NAME var for CLI tests !6310
  • aisloader-composer: add playbook and script for clearing Linux Page Cache on all AIS targets !6311
  • aisloader-composer: add playbook for copying aws credentials !6314
  • tools: update check for aistore Kubernetes deployment !6315
  • CI: update github action version (all modules) !6316
  • CLI/ETL: support enumerated arg-type !6287, !6323

Build

  • upgrade all OSS packages (minor versions) !6313
  • transition to Go 1.21 !6318

3.19

29 Aug 17:31

Core

  • [API change] archive and download logs (feature) !6172, !6175
  • [API change] dsort: extend input format !6181
  • [API change] dsort spec; CLI: print job spec !6204
  • [API change] revise request spec (major upd) !6217
  • [API change] dsort: is now 'xaction' as well !6253
  • (downloader, dsort, ETL): disallow to run when out of space !6235
  • handle "DNS lookup fail" as one of the unreachable err types; nlog flush-exit !6164
  • when electing new primary; when joining nodes at startup !6165
  • k8s: Change prod k8s and docker default to not log all to stderr !6166
  • revise GFN !6167
  • stats runner is now responsible to periodically flush logs !6170
  • core: fail user attempt to abort global rebalance when !6184
  • new Go API; assorted fixes !6189
  • metasync BMD; up modules !6190
  • downloader: return not-found when not found !6196
  • start using scripted integration tests; CLI: 'dsort src dst spec' !6198
  • support S3 AWS profiles with alternative creds (feature) !6214
  • core: state transition => rebalance => (point of no return) !6216
  • amend low-level Go API check-response routine; add error type-code !6228, !6229
  • control plane: deserialize original error from call result !6230
  • xactions: when checking inactivity ("is idle") !6242, !6243
  • primary readiness vs cluster shutdown !6244
  • Go API: wait for xaction-related conditions !6245
  • assorted tuneups: space cleanup; housekeeping (HK) callback; log !6246
  • access control: when copying/transforming/dsorting to non-existing 'ais://' destination !6255
  • core: a call to update stats should never block !6257
  • core stats: add fast counters !6258, !6259, !6261
  • sparsify latency stats !6260
  • ETL: refactor and cleanup construction !6267
  • deploy/dev: updated minikube scripts !6272
  • new option to add Cloud bucket to aistore without checking accessibility !6275, !6277
  • un-throttle PUT mirroring; assorted changes !6278
  • feature: local generation of global (job) IDs !6280, !6282

Performance

  • Add distributed loader scripts and playbooks for using aisloader with multiple hosts !6156
  • pyaisloader: usability improvements !6215
  • Update Grafana dashboard to include latency statistics !6249
  • Reorganize benchmarks and related tools !6254
  • aisloader: no need to call rand for 100% or 50% read/write workloads !6256
  • aisloader-composer: add dashboard for DC network and disk !6266
  • aisloader: add an option to randomize gateways !6279
  • aisloader-composer: fix output files for GET bench !6283

Python

  • sdk: update ETL templates (docker migration) !6168
  • sdk: Release version 1.4.1 !6169
  • sdk: ETL templates (compress + ffmpeg decode) !6185
  • sdk: ETL templates (imagepullpolicy as always) !6191
  • sdk: adding keras_transform template !6200
  • sdk: ETL templates fix !6201
  • sdk: ETL templates (ffmpeg decode transformer) !6205
  • sdk: compress ETL template (updated usage) !6211
  • sdk: torchvision sample transformer ETL template !6221
  • sdk: fix comments (minor) !6240
  • sdk: update version !6248
  • sdk: increase timeout for torchvision transformer template (large image) !6252
  • sdk: updated torchvision transform ETL !6262
  • sdk: update dsort job info query and related tests !6265
  • sdk: switch ETL init code 'transform_url' boolean flag to 'arg_type' string !6269
  • docs: update ETL dev deployment for macOS !6163
  • ETL: keras template minor fix !6213
  • ETL: remove incorrect reference !6268
  • ETL: add 'arg-type=FQN' (new) !6271

Datasets (resize, resort, and shuffle)

  • [API change] dsort: extend input format !6181
  • dsort input format: iterate list, iterate range !6186, !6187
  • start using scripted integration tests; CLI: 'dsort src dst spec' !6198
  • add test scripts; memsys: init gmm only once !6192
  • refactoring and renaming !6193
  • move/consolidate error types; continued refactoring !6202
  • Go API change; add dsort/api.go; CLI: print job spec !6203
  • [API change]: dsort spec; CLI: print job spec !6204
  • CLI/dsort: extend inline help, pretty-print job spec; update docs !6206
  • dsort: continued refactoring (major update) !6208, !6209, !6210
  • free sgl on error; feature: any extension !6212
  • [API change] revise request spec (major upd) !6217
  • create destination on the fly !6218
  • record content path to retain full shard name !6219
  • output shard size estimation (rewrite) !6223
  • add is-compressed; refactor dsort-mem !6227
  • compressible shards (major) !6231
  • output ext; rcb buffer; fixes !6232
  • duplicated records (full coverage & stress); fixes !6233
  • fix tests; add stress !6234
  • rename subpackage, fix comments, refactor !6237
  • remove dsort-context, rewrite initialization !6238
  • static/stateless shard readers/writers; refactor and simplify !6239
  • two goroutines per each shard-distributing request !6241
  • [API change]: dsort: is now 'xaction' as well !6253
  • dsort: support generic abort-xaction API !6264
  • no need to block when sending shard records !6286

CLI

  • archive and download logs (feature) !6180
  • clarify "copying" vs "transforming" and "cached" vs "present" !6183
  • start using scripted integration tests; CLI: 'dsort src dst spec' !6198
  • dsort: extend inline help, pretty-print job spec; update docs !6206
  • dsort: Go API change; add dsort/api.go; CLI: print job spec !6203
  • 'archive get' is now a shortcut (an alias) !6222

Build, test, and tools

  • add test scripts; memsys: init gmm only once !6192
  • tests and tools: cleanup around stop-maintenance, wait-rebalance !6194
  • deployment: update local deployment script to allow target-only deployment with defined primary host !6195
  • deployment: optionally, skip deploying primary proxy !6197
  • start using scripted integration tests; CLI: 'dsort src dst spec' !6198
  • tools/generate shards: optimize buffer allocation !6224
  • deploy/dev: Add ansible deployment scripts for deploying locally on multiple nodes !6199
  • aistorage/CI docker image (lzma libraries) !6220
  • tests: init with cleanup and without !6226
  • CI: Retry stuck Python ETL tests in GitLab CI pipeline !6270
  • remove aisfs (FUSE) !6273
  • dev tools: readers; handle read from corrupted arch or non-arch !6250

Documentation

  • update getting started !6161
  • updated python sdk readme !6162
  • update ETL dev deployment for macOS !6163
  • update documentation with recent ETL changes !6173
  • CLI/dsort: extend inline help, pretty-print job spec; update docs !6206

3.18

09 Jul 18:56

Core

  • add htext to track restarted state; target run and misc !5966
  • cluster rebalance (scenarios) !5969, !5971, !5973, !5974, !5975, !5977, !5980, !5983, !5986, !5987, !5989, !5991, !5992, !5993, !5995, !6002
  • add 'cluster-ready' helper; use it to reinforce !5976
  • cleanup better when decommissioning; previous BMD at startup !5979
  • fs: reliable remove-all !5981, !5982, !5984
  • yet another buf pool !5985
  • do not modify cluster map when starting up; always skip logging idle disks !5988
  • rebalance (scenarios, major update) !5992
  • [API change]: core: rebalance (scenarios) !5993
  • rebalance (major update); when receiving new cluster map !5995
  • up modules; handle housekeeper registration race !5994
  • 'not present in the loaded cluster map' and similar startup validation !5996
  • shutdown or decommission a node that's already in maintenance !5998
  • transport: never establish a streaming connection to the peer that's in maintenance (or will be) !5999
  • metasync just-in-time; assorted refactoring (minor) !6001
  • maintenance mode: pre & post vs keepalive & metasync; CLI: more colored cues !6004
  • shutdown is also 'maintenance'; docs: adding-removing intro !6005
  • add meta package !6006, !6007
  • ETL: add arg-type parameter when initializing with code !6008
  • archive v2: support empty template (tar entire bucket); atime !6013
  • keep poi.atime in nanoseconds !6015
  • archive v2: append to arch; refactoring !6017
  • archive v2: up modules !6018
  • archive v2: part four (major) !6019
  • archive v2: detect an empty tar when appending, and handle !6020
  • archive v2: part six !6022
  • archive v2: mime detection !6024
  • archive v2: extend 'append-to-arch' to support tar.gz !6025, !6027
  • archive v2: tar and tgz append; fixes !6028
  • log filenames; overlapping run vs node-restarted !6029
  • archive v2: multi-object append-to-arch !6030
  • archive v2: multi-object append-to-arch !6033
  • archive v2: multi-object append-to-arch !6034
  • cleanup disk utils (minor) !6035
  • ios startup: run the command only once !6036
  • hide AuthN secret !6038
  • archive v2: append to zip !6041
  • archive v2: append to msgpack !6043
  • add cmn/archive package !6044
  • archive v2: write and copy via new 'cmn/archive' !6045
  • archive v2: append via new 'cmn/archive' !6046
  • [API change] archive v2: MIME vs file extensions !6047
  • ios: cleanup lsblk cache; CLI: refactor get-node-arg; up modules !6048
  • archive v2: remove msgpack; refactor !6051
  • archive v2: add '.tar.lz4' serialization (new) !6053
  • archive v2: tar.lz4 cont-d !6054
  • archive v2: lz4 features; checksum !6055
  • s3 compat: run E2E tests with correct HTTP/HTTPS mode !6057
  • [API change]: append to arch if exists !6062
  • [API change] append to arch if doesn't exist; CLI cont-d !6064
  • checksumming and buffering vs reader-from !6074
  • core: content-length universally; revise write-json and friends !6075, !6076
  • archive v2: [API change] put (files, dirs) with an option to append !6081, !6082
  • archive v2: quiesce faster, refine continue-on-error logic !6083
  • core: double-check target-in-maintenance, quiesce faster !6084
  • archive v2: finalize cmn/archive package !6085
  • archive v2: finalize cmn/archive package !6086
  • log verbosity: core and modules !6087
  • http client: disable compression; core: undefer & micro-optimize !6066
  • append to (non-existing) arch: an option to create !6068
  • mem-pool alloc/free symmetry: copy/transform & archive !6069
  • copy/transform, multi-archive: refactor Rx logic and error handling !6071
  • log verbosity: core and modules; remote cluster !6089
  • log verbosity: core and modules; remote cluster !6091
  • ec: minor refactoring !6092
  • archive v2: WD basename; get with extraction; Range !6093
  • archive v2: WD basename; get with extraction; Range !6094
  • archive v2: tools/archive utils !6095
  • archive v2: tools/archive utils !6096
  • compile-out asserts; super-verbose logging; log module 'mirror' (ref) !6097
  • log verbosity at runtime; log modules; remove glog; unify (major update) !6099
  • fields iterator; size converter; log rotation (fixes) !6101
  • [API change] get-bucket-info to count remote objects !6102
  • [API change] get-bucket-info to count remote objects !6103
  • [API change] get-bucket-info (part three); docs and CLI !6106
  • list-objects vs buckets: revise and refactor, add validation, clarify !6107
  • list-objects: introduce optional args (ref, cleanup) !6108
  • list-objects: mem-pool msgpack buffers !6109
  • kvdb: remove redundant err-not-found; amend dsort, downloader, authn !6110
  • x-lso must idle more time !6111
  • log modules (part three) !6112
  • add nlog (new logger) !6113, !6122, !6124
  • do not log perf counters when there's no change; sort the names !6119
  • fix disk usage call for clusters on macOS !6121
  • log etl events: spec parsed, pod ready, hpull/hpush !6125
  • cleanup fs.PathError; add object name validation !6127
  • extend dsort to support .tar.lz4 !6128
  • aisloader: cleanup output when running with json option !6129
  • xactions to return extended error info !6130
  • xactions: add and return errors (major upd) !6131, !6132
  • xactions: error handling cont-d; bucket-scope multi-object op-s !6133
  • jobs error handling and reporting; tests !6134, !6135, !6136
  • use Go 1.20 join-err; reduce default pre-election interval !6141
  • http request multiplexer !6142
  • dsort: use common cmn/archive pkg (major upd) !6143
  • archive v2: tar formats (USTAR, etc.) !6153
  • archive v2: extend list-objects and GET to operate inside archives !6155
  • archive v2: extend list-objects and GET to operate inside archives !6157
  • control plane: consistently propagate cluster map !6158
  • old EC metadata shouldn't terminate cluster-wide rebalance !6159
  • core: when primary goes down it notifies !6160

CLI

  • performance tabs: always show 'cluster idle' if idle !5997
  • stop-maintenance w/ no rebalance; stats idle-ness (minor) !6000
  • add transform-url flag (used when initializing ETL with code) !6010
  • misc. improvements !6011
  • mountpath completions; disable/detach; minor ref !6056
  • de-spaghettify put handler !6059
  • archive multi-object (cont-d) !6060
  • archive v2: CLI put, append, alias, docs !6072
  • move 'gen-shards' and extend it to support all formats !6073
  • archive v2: CLI: APPEND-to-arch is now on-par with multi-PUT !6077
  • archive v2: CLI: dry-run option is back !6078
  • archive v2: CLI: destination naming; dry-run; tips and examples !6079
  • list-objects: extended help, template with no ranges !6104
  • error and warning verbosity (major update) !6105
  • log modules (part two); CLI archive: pre-parse and add tip, advice !6100
  • archive help !6126
  • archive v2: CLI and docs !6149
  • archive v2: multiple CLI updates and improvements; is-archive bit !6150
  • remove k8s apimachinery pkg (minor) !6151

Python

  • sdk/python: Add option to pre-import modules when initializing ETL with code !5963
  • sdk/python: Refactor ETL to provide name once on object creation !5968
  • sdk/python: Release version 1.2.0 !5970
  • sdk/python: Fix python package links !5978
  • sdk/python: improve SDK testing, fix bucket eviction keep_md parameter !5990
  • sdk/python: Release version 1.2.2 !6012
  • sdk/python: Add dSort support in SDK !6032
  • sdk/python: Add wait to dsort abort test !6037
  • sdk/python: Fix dsort abort test on faster systems !6039
  • sdk/python: Add get_url option to bucket, object group, and objects !6042
  • sdk/python: Add interface for AIS sources that can be accessed via list of URLs !6052
  • PyTorch: Add new PyTorch DataPipe to iterate over URLs from various AIS sources !6058
  • update lint-tests to work with Python 3.11 (pytorch unsupported) !6061
  • sdk/python: Include user agent in SDK requests !6067
  • sdk/python Use msgpack content type when listing objects in bucket !6070
  • sdk/python: Add msgspec to pyproject dependencies !6088
  • sdk/python: Release SDK version 1.3.0 !6090
  • sdk/python: bucket summary + bucket info !6098
  • sdk/python: bck info integration test fix !6115
  • sdk/python: bck summary integration test fix !6117
  • sdk/python: bck summ + bck info fixes (revert and cleanup) !6120
  • sdk/python: pyaisloader !6123
  • pyaisloader: bucket utils (minor fixes) !6139
  • sdk/python: bucket summary and info (minor fixes) !6140
  • pyaisloader: total size/count fix !6146
  • sdk/python: Release version 1.4.0 !6152

Docs

  • add lifecycle.md !6009
  • WebDataset Blog Post part 1: Storing WebDataset in AIS !6014
  • WebDataset Blog Post part 2: AIS ETL on WebDataset shards !6016
  • update docs/tools and cmd/README !6050
  • WebDataset blog post pt 3 -- PyTorch DataPipe with ETL !6065
  • archive v2: CLI put, append, alias, docs !6072
  • archive v2: CLI: destination naming; dry-run; tips and examples !6079
  • archive v2: CLI and docs !6149

Build, CI

  • add local build options to k8s scripts and fix local registry !6021
  • build: upgrade all minors !6040
  • add pyaisloader stage !6148
  • k8s: fix deployment scripts for compatibility with the latest aisnode image !6154

3.17

11 Apr 00:46

Table of Contents

  • CLI v1.2
  • Python SDK v1.1.2
  • S3 compatibility and Botocore
  • API changes
  • Tests and Documentation
  • Core: bug fixes and improvements
  • Build and Continuous Integration
  • Extensions: Downloader, dSort, ETL

See also:

CLI

  • show all jobs !5645
  • start/stop job/xaction !5660
  • refresh rate and countdown; long-running 'show job' and friends !5651
  • 'show log node-name' to mimic 'tail -f' !5652, !5654
  • add custom duration flag and logic !5655
  • 'ais config (cluster|node|cli)', 'ais config reset', and friends !5656
  • bucket completions !5657
  • set-config to show all updates; tweak iter-fields reflection !5658
  • 'show job' to aggregate all categories and support all selections !5661
  • transition to using job display names (major) !5663
  • start, stop, show jobs and xactions (cont-d) !5665
  • amend and restructure jobs !5666
  • running xactions (completions) !5672
  • tweak config json printout; get-config from memory !5673
  • update backend config !5674
  • update backend config (part two) !5677
  • add footnote, marshal message only once !5678
  • remove xaction term and subcommand (everything is job now) !5692
  • suggest (targets, proxies, nodes) !5694, !5696
  • revise bash completions script !5697
  • remove 'xaction' (term and subcommand) !5698, !5699, !5700
  • 'show cluster': separate cluster nodes from all other (tab-tab) completions !5701
  • consolidate and refactor cluster map access !5704
  • tweak ais create bucket --props & ais bucket props set !5706
  • extend 'job start' to support (resilver, copy-bucket, rename-bucket) !5715
  • tweak listed props !5719
  • remove (cleanup) download and dsort jobs !5721
  • extend 'ais stop' to support --all|--regex !5722
  • 'show job' verbose option; unify usage args; ref PUT/APPEND !5723
  • rewrite command-not-found logic; add similar commands !5724
  • show jobs (major) !5726, !5727
  • bash autocomplete ordering improvements !5728
  • improvements (usability) !5729
  • add bucket cp alias !5730
  • flag printable name; split 'show job' in parts; usability !5736
  • further unify stopping, waiting-for, and showing jobs !5744
  • revise & amend 'show rebalance' - all permutations !5761
  • universal start-end formatting; template refactoring !5762
  • jobs grouping by name and, within name, by UUID !5764
  • complete etl-name transition !5767
  • ETL tools, UUID (part one) !5745, !5746, !5749, !5753, !5754, !5763
  • fix download/dsort progress !5769
  • new table to show target statistics !5788
  • 'ais show performance' (new) !5791, !5793, !5800, !5802, !5803, !5809, !5810, !5811, !5812, !5816
  • IEC, SI, and raw (bytes, nanoseconds) formatting (major) !5820
  • reduce code, simplify, cleanup !5821
  • IEC, SI, and raw (bytes, nanoseconds) formatting (major) !5823
  • disk stats: add average read/write sizes !5824
  • amend existing mountpath tab and add a new one !5833, !5834
  • expect node unreachable when iterating '--refresh' !5837
  • assorted usability; add 'no-color' config !5839
  • 'ais show performance': average (GET, PUT, etc.) sizes on the fly !5840
  • support new API to reset stats !5841, !5843
  • 'ais show performance': refactor throughput, add latency !5844
  • 'ais show performance': finalize latency tab !5847
  • 'ais show performance' cont-d !5848
  • 'ais show performance': finalize top-level tab !5850
  • 'ais show performance': add cluster-level throughput, beautify !5852
  • 'ais show performance': alias 'stats' and remove older code !5853
  • 'ais show performance': disk table v2 !5855
  • 'ais show performance': finalize disk table !5857
  • 'ais show performance': new mountpaths/disks/capacity table !5858, !5859
  • 'ais show performance': finalize capacity table !5861
  • refactor and cleanup multi-object put !5862
  • multi-object PUT: source dir, list/range; matching pattern !5865
  • fix concatenation logic, refactor progress bar !5867
  • copy bucket: support progress bars (copied objects and size) !5870
  • consistent timeout management !5871
  • copy/transform a list or range of objects: add progress bar !5873
  • copy/transform with progress bar: style, reuse !5874
  • multi-object PUT !5876
  • progress bar: all multi-object operations; universal 'wait-for' !5879
  • PUT multi-object - all flavors !5880
  • get multiple objects in one shot ("multi-object GET") !5884
  • GET destination & assorted fixes !5882
  • copy-bucket: prepend prefix, command helps, examples !5889, !5891
  • more inline help !5892
  • assorted improvements (minor) !5900
  • fix downloading with progress bar enabled !5903
  • how-to text: how to reconfigure remote ais cluster !5932
  • add CLI compatibility warning (new CLI vs old cluster and vice versa) !5952
  • cluster membership-changing operations (shutdown, decommission, et al): improve usability !5956
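
The "IEC, SI, and raw (bytes, nanoseconds) formatting" items above distinguish three ways of printing sizes. As a rough illustration only, the Python sketch below shows the difference between the three for byte counts. The function name `format_size`, its `units` argument, and the exact output layout are hypothetical and are not taken from the AIS CLI.

```python
def format_size(n: int, units: str = "iec") -> str:
    """Format a byte count as IEC (1024-based), SI (1000-based), or raw.

    Hypothetical helper for illustration; not the AIS CLI implementation.
    """
    if units == "raw":
        return str(n)  # raw: the plain number of bytes, no suffix
    if units == "iec":
        base, suffixes = 1024, ["B", "KiB", "MiB", "GiB", "TiB"]
    elif units == "si":
        base, suffixes = 1000, ["B", "kB", "MB", "GB", "TB"]
    else:
        raise ValueError(f"unknown units: {units!r}")

    value = float(n)
    suffix = suffixes[0]
    for suffix in suffixes:
        # Stop at the first unit that fits, or at the largest unit available.
        if value < base or suffix == suffixes[-1]:
            break
        value /= base
    if suffix == "B":
        return f"{n}B"  # whole bytes need no decimal places
    return f"{value:.2f}{suffix}"
```

The same 1536 bytes would print as `1.50KiB` in IEC mode, `1.54kB` in SI mode, and `1536` in raw mode, which is why a single explicit switch is easier to reason about than mixed conventions.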

Python: SDK library and ETL

  • Add unit tests for cluster class !5675
  • Add unit test for api client class !5676
  • Fix python ETL test workflow !5680
  • Add unit tests for Xaction class !5681
  • Add unit test for object class, fix object put data from a filepath !5682
  • Add unit tests for ETL class !5683
  • Add unit tests for sdk utils !5685
  • Restructure python subdirectories and containment !5686
  • Run python unit tests as part of default python test make option, include python ETL tests in all python test runs !5688
  • Only run python ETL tests when python labels are added !5702
  • update test utils to support running Python tests on Windows !5716
  • Add multi-object functionality !5720
  • Add tests and validation to object ranges, support leading zeros !5731
  • Add unit test for object group class !5732
  • Update documentation for python multi-object ops !5734
  • Use aws bucket for python sdk CI tests to test caching functions !5739
  • Add string template support to object groups !5741
  • Refactor all references from xaction to job !5742
  • Update ETL runtime defaults and add new python 3.11 option !5743
  • Fix remote bucket tests to avoid collisions !5748
  • Increment cloudpickle version and update github action to use 3.11 for tests !5756
  • Fix python test dependencies !5780
  • Refactor to simplify typing and tests !5789
  • Bump python SDK to 1.0.5 !5790
  • Update python sdk version !5792
  • Update pytorch integration README with compatibility issue !5794
  • Patch sdk to support torchdata integration !5795
  • Bump aistore package version !5798
  • Fix pylint and formatting !5799
  • Add ObjAttr type for returning additional object metadata !5805
  • Address lint warnings, general improvements !5808
  • Add writer option to object get !5813
  • Split README for different python projects !5814
  • Add PROMOTE functionality to objects !5818
  • Standardize pylint version, fix lint errors !5822
  • Improve object put behavior and add directory put options !5826
  • Use Pathlib over os.path !5827
  • Refactor multi-file put to bucket class !5830
  • Improve job interface and update promote options !5832
  • Update python package build tools, increase version to 1.0.9 !5835
  • Improve example documentation !5842
  • Improve input validation !5846
  • Set up proper logging, update constants !5849
  • Add job wait for idle status and fix job bucket filter !5863
  • Add multi-object copy !5866
  • Fix remote test fixture !5877
  • Add multi-object ETL !5878
  • Release 1.0.10 !5881
  • Expand multi-object examples !5883
  • Improve usability !5888
  • Follow-up for copy/transform prepend !5890
  • Improve object interface !5893
  • Add prefix_filter option to bucket copy !5899
  • Add flags and target options to bucket list_objects methods !5901
  • Release 1.1.0 !5910
  • Release 1.1.1 !5911
  • Improve python sdk examples !5913
  • Add cluster list_jobs, require id for individual job status query !5916
  • Add support for multi-object archive !5920
  • Update bucket params to use Bucket object, support namespaces !5925
  • Add prefix filter to bucket transform !5949
  • Release 1.1.2 !5950
  • PyTorch: add support for etl-name in AISDataset and AISFileLoaderIterDataPipe !5957

S3 compatibility and Botocore

  • Add botocore monkey patch alongside python SDK !5684
  • Move botocore and pytorch packages to python top-level, separate from sdk !5691
  • Add s3 compatibility testing with boto3 !5703
  • list-objects vs HEAD !5705
  • compute multipart md5 and set etag !5708
  • Improve s3 compat test documentation and update validated tests !5747
  • Pass S3 delete a list of objects !5854
  • fix infinite loop when listing objects; add bucket name to list object response !5864
  • return bucket creation date in UTC !5869
  • new flag NoRecursion to support S3 delimiter feature !5930
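
For context on the "compute multipart md5 and set etag" item above: the convention commonly observed with Amazon S3 multipart uploads is an ETag formed from the MD5 of the concatenated per-part MD5 digests, followed by a dash and the part count. The Python sketch below illustrates that convention. Whether AIS computes the value in exactly this way is an assumption, and the function name `multipart_etag` is hypothetical.

```python
import hashlib
from typing import Sequence


def multipart_etag(parts: Sequence[bytes]) -> str:
    """Return an S3-style multipart ETag for the given part payloads.

    Illustrative only: MD5 over the concatenated binary MD5 digests of
    each part, suffixed with "-<number of parts>".
    """
    digests = b"".join(hashlib.md5(part).digest() for part in parts)
    return f"{hashlib.md5(digests).hexdigest()}-{len(parts)}"
```

Note that the result differs from the plain MD5 of the whole object, so clients cannot use it to verify content without knowing the part boundaries.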

API changes and new APIs

  • add API to query multiple xactions via any IC proxy !5670, !5671
  • yet another API to query xactions (new) !5687
  • list-objects API: tweak listed props !5718
  • [API change] flatten xaction-snap control structure (major upd) !5740
  • [API change] ETL: tools, UUID (part nine) !5754
  • [API change] ETL: new query parameter to specify transform name !5755, !5765
  • [API change] init/start ETL to return xaction ID !5768
  • [API change] GET(object) !5770, !5772, !5773, !5775
  • [API change] remove xaction 'query-msg' (deprecated) !5783, !5784, !5785
  • get-object API: amend comments !5769
  • refactor api package (major) !5776
  • [new API] query metric names and kinds ('counter', 'latency', 'throughput', et al.) !5796
  • [API change] get node status !5811
  • [API change] core stats: consistency between stats-querying APIs !5817
  • [API change] PUT(object) !5828
  • [API change] amend capacity-disks-filesystems control !5831
  • [API change] add API to reset stats !...

3.16.rc2

31 Mar 14:23
v1.3.16

v3.16.rc2

3.12

13 Nov 19:47

Table of Contents

  • Remote AIS clusters
  • List objects
  • List and summarize buckets
  • S3 API
  • CLI
  • Documentation
  • Tests
  • Python SDK & ETL
  • Core
  • Refactoring & fixes
  • Tools
  • Build & CI

See also:

Remote AIS clusters

  • list remote ais buckets; fix remote namespace containment !5567, !5621
  • aliasing and bucket namespaces (major revision) !5622, !5623
  • ais gateways to cache remote ais clusters (information) !5630
  • extend remote ais errors - add cluster UUID !5631
  • remote ais aliasing !5632
  • [API change]: remote ais clusters (major) !5633, !5634
  • [API change] add HealthUptime; use it to show remote clusters !5636
  • remove support for multi-aliasing !5637
  • alias vs bucket metadata !5638

List objects

  • CLI to show CACHED (column) only when it's informative (and not when user asks ais ls ... --cached) !5554, !5555
  • rename list-objects structures; regenerate message pack !5542, !5543
  • amend and re-enforce single target rule for remote buckets !5556
  • introduce location property, phase-out target URL (property) !5557, !5559
  • add custom metadata, regenerate message pack; refactor !5564, !5565
  • list-objects version 2 (major): !5583, !5584, !5586, !5589, !5590, !5591, !5601, !5602, !5605, !5607
  • [backend API change] mem-pool 'list-objects' pages !5604
  • list-objects flow with a single target listing remote props (name, size, version) !5608
  • list remote pages v2: flow controls; ais gateways will now select single target to execute remote call !5609, !5610
  • list-objects to include (in listing result) custom object properties !5614
  • consistency in custom props across: get, put, head object, and list-objects !5615, !5616

List & summarize buckets

  • easy URL to support listing buckets via /gs, /az, and /ais endpoints !5486
  • list-buckets and get-bucket-info API; CLI summary table !5521, !5522
  • list-buckets version 2 (major update) !5524
  • unify 'list-buckets' and 'bucket-summary' !5530, !5531
  • shorter and faster list-buckets version: w/ presence but wo/ summary !5535, !5536
  • [API change] add presence filter; eliminate bucket-info control structure (is redundant); refactor !5540
  • list-buckets and bucket-summary cont-d !5541
  • bucket sizing: always compute two sizes: on-disk and sum objs !5544
  • bucket-summary on-disk (apparent) vs sum-all-objects, remote vs cached !5545
  • bucket-summary and list-buckets cont-d unification !5546, !5547
  • list-buckets by target: remote ais; hdfs and http; cloud; other !5567
  • rename/move bucket summary; align control structures !5568
  • list remote ais buckets; fix remote namespace containment !5621
  • [API change] list-buckets correctness for named bucket queries !5626

S3 API

  • multipart upload !5458, !5460, !5467, !5468
  • extend s3 API to support remote buckets (major) !5462
  • persistent multipart state (major) !5466
  • multipart GET !5472
  • multipart list active uploads and upload parts (major) !5474
  • return xml-formatted errors (compliance) !5480
  • multipart: refine and simplify !5482
  • revise list-buckets to include all providers !5485
  • always set header Server when redirecting !5495
  • amend s3 compatibility readme; add multipart example !5496
  • multipart: sha256 fix; md5 not checked but computed; aws s3api example !5500
  • xml-format errors (parts two and three) !5501
  • revise HEAD object implementation; multipart ETag; miscellaneous fixes !5504
  • compatibility: configurable root '/' access to AIS cluster !5572, !5573, !5574
  • compatibility: assorted error codes !5620

CLI

  • 'ais etl' to replace extension in transformed objects !5418
  • standardize error and warning coloring !5493
  • don't log and don't show backend secret !5506
  • improve unknown-command messaging, provide hints !5512
  • add ais object ls command; add present flags !5514
  • support updated HEAD(bucket) API; 'ais ls' and permutations to show obj props !5517
  • static linkage with cgo disabled !5526
  • rename aws, et al. backend provider constants !5528, !5529
  • option to display sizes either in bytes or KiB, MiB etc. units !5532
  • rename sub-packages; show-unmatched & hide-header !5534
  • bucket display-name helper (and consistent usage) !5539
  • flip bucket listing default to 'present' (was 'all accessible') !5550
  • option --all to list both buckets and objects (in remote buckets) !5551
  • multi-feature completions; action-warn and action-done !5575
  • show config subsections in JSON; multi-feature completions !5576
  • amend config JSON printout !5577
  • extend and amplify: (config | section | json) !5578
  • is-time-not-set, is-object-not-cached (for consistency) !5611
  • atime & ctime !5612
  • throughput must be averaged over an interval of time since the previous request !5643
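
The last item above describes throughput as an average over the interval since the previous request, rather than a cumulative average since startup. A minimal Python sketch of that calculation follows, assuming a hypothetical `Sample` record holding a cumulative byte counter and a timestamp. These names are illustrative and are not taken from the AIS code base.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical snapshot of a cumulative byte counter."""
    total_bytes: int   # cumulative bytes transferred so far
    timestamp: float   # seconds, e.g. from time.monotonic()


def interval_throughput(prev: Sample, curr: Sample) -> float:
    """Bytes per second averaged over the interval between two samples."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        return 0.0  # no time elapsed: nothing meaningful to report
    return (curr.total_bytes - prev.total_bytes) / elapsed
```

With this approach each refresh reflects only recent activity, so an idle cluster drops to zero instead of slowly decaying from a historical average.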

Documentation

Tests

  • python ETL !5410, !5461, !5476
  • python IO communicator !5415
  • tests: skipping short (minor) !5416
  • python/sdk: ETL stress test !5420
  • add s3 multipart test !5482
  • add ETL test for streaming with python runtime !5513
  • re-enable and amend object-properties tests across all providers !5515
  • amend list-buckets: add cases; up CLI !5569, !5570
  • reinforce multi-object tests, fix bucket-exists helper !5606
  • amend and fix multi-object range/list copy test !5619
  • amend list-buckets test that checks expected result !5631

Python SDK & ETL

  • ETL API part 1 !5370
  • sdk version !5374
  • code (re)formatting and misc !5381
  • reorg Makefile dependencies !5390
  • ETL APIs part 2 !5394
  • bucket ETL transform API !5405
  • bump sdk version to 1.0.3 !5408
  • IO communicator fixes !5413
  • fix transform bucket !5421, !5428
  • add examples !5441
  • add streaming runtime w/ refactoring and cleanup !5444
  • streaming init-code with chunk-size param (major) !5446, !5447
  • with temp debug !5448
  • init_code changes for before and after functions !5449
  • remove debug w/ minor ref !5450
  • functions filter (new), transform, and before/after !5451
  • runtime v2 fixes !5455
  • bootstrapper; by reference !5457
  • ETL API fixes !5464
  • update readme (etl init_code) !5475
  • extend hpush yaml with bitwise flags !5479
  • remove unused ETL functions; add test for streaming with python runtime !5513
  • update Jupyter notebook to reflect sdk changes !5548

Core

  • global rebalance: initiating new vs receiving delayed ACKs !5426
  • etl md: remove sgl, read-all !5470
  • error handling: introduce type codes (major upgrade) !5477, !5481
  • easy URL: support listing buckets via /gs, /az, and /ais endpoints !5486
  • easy URL: add readme with extended examples and comments !5487
  • user agent, internet browser: pretty-print JSON response !5492
  • add User-Agent to http request headers !5494
  • backward compatibility; remote backends !5498
  • consolidate http headers; add type-cast fail !5503
  • introduce error types unsupported and not-implemented-yet !5508
  • refactor GET(object) !5507
  • add lom.rename-file w/cleanups !5509
  • simplify lom.persist & recache/refresh !5510
  • update atime on PUT !5511
  • HEAD(bucket): extend the API in re: cluster metadata !5516
  • [API change] add api.GetBucketInfo !5519
  • enumerate and unify bucket and object 'presence' vs 'existence' !5520
  • fs: add size-bucket and refactor x-bucket-summary !5523
  • backend providers as simple KVs (minor) !5525
  • on the fly and (new) upon request remote bucket creation/addition !5538
  • object props: remove get-node, add location and helpers !5561, !5562, !5563
  • creating/adding buckets on the fly: further clarifications !5566
  • debug pkg: assert-msg is redundant; minor ref !5579
  • cos-assert; align (minor ref) !5587
  • periodic stats log: tweak idleness logic !5588
  • x-archive to always close data mover !5594
  • key-value d...

3.11

18 Jul 23:56

Highlights

Authentication and Access Control

  • CI: fix AuthN to run !5264
  • replace form3tech-oss/jwt and upgrade to jwt/v4 !5266
  • prep for production !5267
  • prep for production (major update) !5268
  • prep for production (part three) !5269
  • prep for production (part four) !5272
  • prep for production (aisloader) !5275
  • prep for production (clean_deploy) !5277
  • extend error info, add usage examples, tweak CLI help !5279
  • revamp ais auth CLI !5281
  • rename APIs for consistency; add CLI to update existing role !5283
  • E2E test: added basic AuthN test !5284
  • make cluster optional for role add/set; randomize E2E tests; unify auth-enabled checks !5285
  • rename AuthN entities for consistency !5288
  • add api package, isolate server internals (major) !5289
  • continued refactoring, assorted fixes !5291
  • add 'tok' package to consolidate all jwt calls; continued refactoring !5292
  • status 'forbidden' or 'unauthorized' to accompany all errors !5310
  • allow 3rd party client lookup ht://bucket !5312
  • separate enabling HTTPS for AuthN and AIS cluster when deploying !5313
  • load-token: if user-given location is empty then !5314
  • secret handshake when adding/updating AIS clusters !5316
  • revisit AIS calling and retrying; unify errors !5317
  • login by non-superusers: require cluster (to be explicitly specified) !5326
  • assorted fixes, improvements !5333
  • introduce 'show-cluster' permission; reinforce admin access !5350

Python SDK

  • add copy_bucket method !5225
  • add method xact_status !5226
  • move test to cluster ops file !5232, !5233
  • add xaction start !5237
  • restructuring (adding pytorch) !5238
  • PyTorch plugin (dataset/aisio) implementation for aistore !5242
  • fixed bug - list empty buckets !5251
  • changed error handling !5258
  • add iterator for bucket objects !5259
  • improved pytorch plugin !5274
  • release 0.9.0 fixes !5276
  • cluster health api !5280
  • health check for non existing urls !5282
  • syncing aisio.py !5286
  • rename_bucket() port !5304
  • test suite changes !5305
  • makefile dependency fix !5307
  • cluster health api !5278
  • added aisio example !5322
  • restructure api (bucket) !5323
  • fixing aisio example !5325
  • aisio example fixes !5328
  • restructure cluster !5335
  • obj streamer fixes !5337
  • blog post for ais file loader and lister !5338
  • using session for requests !5341
  • restructure object !5342
  • object method name changes !5343, !5346
  • restructure fixes !5348
  • bucket api fixes !5351
  • object get fix !5352
  • restructure xaction operations !5353
  • fixes and release prep !5355
  • minor readme edits !5356
  • readme changes wrt pytorch !5357
  • api docs and make file fixes !5359
  • removing aistore.pytorch imports from main init !5360
  • makefile fix (pipx path) !5367

GitHub Action

  • add github action for building ais !5265
  • add github workflows for lint, python-tests and build !5270
  • add test-short github workflow (Linux) !5271
  • github/workflow: changed go lint version !5366

CLI

  • confirm assorted operations: decommission, rm bucket, rm -rf !5224
  • revamp ais show cluster; add K8s POD names !5243, !5244
  • bucket extra-props-to-update; filter by provider !5256
  • allow setting permission on per-bucket basis for a role !5309
  • revise 'show config', add nested completions !5339
  • revise 'set config', add local vs inherited !5344
  • add --cluster filter flag to 'auth show role' command !5349

Bucket Summary

  • revise bucket summary (major) !5230
  • bucket summary: micro-optimize and simplify !5262
  • bucket summary: multi-bucket; add min/max/avg counters; fixes !5299
  • bucket summary: apparent vs on-disk sizing, formatting clarity !5300, !5302
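
For readers unfamiliar with the "apparent vs on-disk" distinction mentioned above: apparent size is the logical byte length of a file, while on-disk size is the space the filesystem has actually allocated for it. The sketch below illustrates the general idea on a POSIX system using only the Python standard library. It is not AIStore code and does not reflect how the bucket summary is implemented; the function name `apparent_and_on_disk` is hypothetical.

```python
import os


def apparent_and_on_disk(path):
    """Return (apparent_bytes, on_disk_bytes) for a single file.

    Hypothetical helper for illustration only; not part of AIStore.
    """
    st = os.stat(path)
    apparent = st.st_size  # logical length of the file in bytes
    blocks = getattr(st, "st_blocks", None)
    if blocks is None:
        # st_blocks is not available on every platform (e.g. Windows);
        # fall back to the apparent size in that case.
        return apparent, apparent
    # On POSIX systems st_blocks counts 512-byte units allocated to the file.
    return apparent, blocks * 512


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        apparent, on_disk = apparent_and_on_disk(name)
        print(f"{name}: apparent={apparent} on-disk={on_disk}")
```

The two numbers commonly differ: small files are rounded up to whole filesystem blocks, and sparse files can occupy far less space than their apparent size.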

Common environment and system filenames

  • move and refactor common environment; AuthN version and build time !5318
  • filename constants; environment variables !5324
  • consolidate all configs under $HOME/.config/ais !5327

Misc. Features

  • warn when decommissioning cluster, support '--rm-user-data' option !5221
  • ability to decommission node and join it back with a new identity !5222
  • anonymous public-access buckets: list-objects anonymously (w/ API change) !5330
  • s3: add the capability to set a custom S3 endpoint !5257, !5260

Docs

  • tweak ETL markdown !5227
  • updated getting_started and etl markdown !5228
  • changes to standalone deployment README !5253
  • minor indentation fix to standalone readme !5254
  • standalone ec2 deployment instructions and benchmarks !5261
  • python api reference !5294
  • python_api.md fixes !5297
  • bucket.md to reflect 'summary' changes !5308
  • update 'ais ls', add anonymous and more !5336
  • main readme !5358
  • s3 compatibility, 's3cmd' configuration and getting-started !5361
  • main readme, index.md, python.md !5362
  • amend and extend cli/config.md; fix config node reset !5363
  • add 's3cmd.md' !5364

Tests

  • random page, remote bucket !5234
  • remote bucket, object props !5235
  • extend cleanup to support (prior to test, remote bucket) !5240
  • ensure more !5245
  • fix iter-fields unit for recently added config extras !5290
  • reverse proxy test against GCS !5329

Build

  • upgrade gopkg !5223
  • fix statsd github URL !5229
  • new (and supported) JWT packages !5266

Performance micro-optimizations, bug fixes, refactoring

  • aisloader: rewrite SGL freeing piece !5236
  • refactor low-level is-bucket, bucket-init, and friends !5239, !5241
  • BMD: speed-up bucket init; add global namespace uname (const) to avoid allocations !5246
  • BMD atomic update vs backend props !5247
  • reuse buffer when iterating range template !5248
  • devtools: refactor and simplify !5249
  • bucket extra-props-to-update: include readonly; CLI: filter by provider !5256
  • yet another case to skip error logging for block devices !5273, !5295
  • cluster config: allow critical updates !5298
  • capacity: periodic log and refresh; version and build time; fixes !5315
  • local playground: loopback device sizing; fixes !5321
  • refactor and move docker pkg (formerly containers) !5331
  • list-objects flags: unify, rename, document, and reference !5332
  • config: restart-required selection; feature flags !5334
  • blog image fix !5347
  • assorted fixes and refactoring !5354
  • CI/CD: skip-ci & python-tests-only conflict fix !5365