v1.18.0
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
shefty committed Apr 7, 2023
1 parent 952e9bc commit 8f3a881
Showing 5 changed files with 214 additions and 3 deletions.
1 change: 1 addition & 0 deletions AUTHORS
@@ -61,6 +61,7 @@ Dipti Kothari <dkothar@amazon.com>
Dmitry Durnov <dmitry.durnov@intel.com>
Dmitry Gladkov <dmitry.gladkov@intel.com>
Doug Oucharek <dougso@me.com>
+Edgar Gabriel <Edgar.Gabriel@amd.com>
Eric Raut <eraut@amazon.com>
Erik Paulson <epaulson10@gmail.com>
Erik Paulson <erik.r.paulson@intel.com>
210 changes: 210 additions & 0 deletions NEWS.md
@@ -11,30 +11,240 @@ v1.18.0, Fri Apr 7, 2023

## Core

- rocr: fix offset calculation
- rocr: use ofi spinlock functions
- rocr: minor fixes
- neuron: convert warn to info for nrt_get_dmabuf_fd not found
- neuron: check existence of neuron devices during initialization
- neuron: Add support for neuron dma-buf
- ze: update ZE to support new driver index specification
- List variables read from config file
- Add switch to prefer system-config over environment
- Add basic system-config support for setting library variables
- Move peer provider defines into new header
- rocr: Support asynchronous memory copies
- rocr: Add support for ROCR IPC
- rocr: rename rocr data-structures
- synapseai: return 0 for host_register and host_deregister
- fabric: Improve log level of provider mismatch
- cuda: Allow CUDA IPC when P2P disabled
- ze: add ZE command list pool to reuse command lists
- cuda: implement cuda_get_xfer_setting for non cuda build
- cuda: adjust FI_HMEM_CUDA_ENABLE_XFER behavior
- cuda.c: Add const to param to remove warning
- Add IFF_RUNNING check to indicate iface is up and running
- io_uring support enhancements
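
The `IFF_RUNNING` item above refers to the standard interface flags reported by the operating system. As a minimal sketch of that kind of check (not libfabric's actual implementation), the following uses POSIX `getifaddrs()` to count interface entries that are both administratively up (`IFF_UP`) and operationally running (`IFF_RUNNING`). The function names are hypothetical and chosen only for illustration.

```c
#define _DEFAULT_SOURCE   /* expose IFF_* flags under strict -std= modes on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <net/if.h>       /* IFF_UP, IFF_RUNNING */
#include <ifaddrs.h>      /* getifaddrs, freeifaddrs */

/* Hypothetical helper: 1 if both IFF_UP and IFF_RUNNING are set, else 0. */
int flags_up_and_running(unsigned int flags)
{
	return (flags & IFF_UP) && (flags & IFF_RUNNING);
}

/* Hypothetical helper: same check applied to one getifaddrs() entry. */
int iface_is_up_and_running(const struct ifaddrs *ifa)
{
	assert(ifa != NULL);
	return flags_up_and_running(ifa->ifa_flags);
}

/*
 * Walk the system interface list and count entries that pass the check.
 * An interface appears once per address family, so this counts entries,
 * not unique interface names. Returns -1 if getifaddrs() fails.
 */
int count_running_ifaces(void)
{
	struct ifaddrs *list = NULL;
	struct ifaddrs *ifa;
	int count = 0;

	if (getifaddrs(&list) != 0)
		return -1;

	for (ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
		if (iface_is_up_and_running(ifa))
			count++;
	}

	freeifaddrs(list);
	return count;
}
```

An interface whose cable is unplugged typically still reports `IFF_UP` but not `IFF_RUNNING`, which is why testing both flags filters out links that are configured but not usable.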

## EFA

- Implement CUDA support on instance types that do not support GPUDirect RDMA
- Implement fi_write using device's RDMA write capability
- Enrich error messages with debug and connection info
- Implement support for FI_OPT_EFA_USE_DEVICE_RDMA in fi_setopt
- Implement support for FI_OPT_CUDA_API_PERMITTED in fi_setopt
- Add support for neuron dma-buf
- Use gdrcopy to improve the intra-node CUDA communication performance for small messages
- Use shm provider's FI_AV_USER_ID support
- Fix bugs in efa provider’s shm info initialization procedure

## Hooks

- dmabuf_peer_mem: Handle IPC handle caching in L0
- trace: Add trace log for CM operation APIs
- trace: Change tag in trace log to hex format
- trace: Enhance trace log for data transfer API calls
- trace: Add trace log for API fi_cq_readerr()
- trace: Add trace log for CQ operation APIs
- Add tracing hook provider

## Net

- Net provider optimizations have been integrated into the tcp provider.
- Net provider has been removed as a reported provider.

## OPX

- Fixes for Coverity scan issues
- Enhanced tag matching
- Tune expected recv for unaligned buffers
- Add finer logging granularity
- Reduce RTS immediate data and fix packet estimate for odd TID lengths
- Add additional sources for FI_OPX_UUID
- Exclude opx from build if missing needed defines
- Move some logs to optimized builds
- Fix build warnings for unused return code from posix_memalign
- Add reliability sanity check to detect when send buffer is illegally altered
- SDMA Completion workaround for driver cache invalidation race condition
- Fix replay payload pointer increment
- Handle completion counter across multiple writes in SDMA
- Cleanup pointers after free()
- Modify domain creation to handle soft cache errors
- Two biband performance improvements
- Fixes based on Coverity Scan related to auto progress patch
- Changed poll many argument to rx_caps instead of caps
- Resync with server configured for Multi-Engines (DAOS CART Self Tests)
- Remove import_monitor as ENOSYS case
- Address memory leaks reported on OFIWG issues page
- General code cleanup
- Add replays over SDMA
- Implement basic TID Cache
- Revert work_pending check change
- Fix use_immediate_blocks
- Restore state after replay packet is NULL
- Fix memory leak from early arrival packets
- Fix segfault in SHM operations from uninitialized value in atomic path
- Prevent SDMA work entries from being reused with outstanding replays
- Set runtime as default for OPX_AV
- Fix RTS replay immediate data
- Fix errors caught by the upstream libfabric Coverity Scan
- fi_getinfo - Support multiple HFI devices
- Support OFI_PORT and Contiguous endpoint addresses for CART & Mercury
- Add fi_opx_tid.h to Makefile.include
- Fix progress checks and default domain
- Revert is_intranode simplification
- Don't inline handle_ud_ping function
- Allow atomic fetch ops to use SDMA for sufficiently large counts
- Cleaned up FI_LOG_LEVEL=warn output
- Cleaned up unused macros for FI_REMOTE_COMM and FI_LOCAL_COMM
- Reset default progress to FI_PROGRESS_MANUAL
- Fixed GCC 10 build error with Auto Progress
- Add support for FI_PROGRESS_AUTO
- Use max allowed packet size in SDMA path when expected TID is off
- Expected receive (TID) rendezvous
- RMA Read/Write operations over SDMA
- Remove origin_rs from cts and dput packet header
- Fix for hang in DAOS CART tests
- Use single IOV for bounce buffer in SDMA requests
- Check for FI_MULTI_RECV with bitwise OR instead of AND
- Fix for intermittent intra-node deadlock hang (DAOS CART tests)
- Fix to RPC transport error failure (DAOS CART tests)
- Fix for context->buf set to NULL
- Fix bad asserts
- Ensure atomicity of atomic ops
- fi_opx_cq_poll_inline count and head check fix
- Fix intermittent intra-node hang causing RPC timeouts (DAOS CART tests)

## PSM3

- Update provider to sync with IEFS 11.4.1.1.2
- Fix warnings from build
- Add oneapi ZE support to OFI configure

## RXD

- Ignore error path in av_close return

## RXM

- Handle NULL av in rxm_freeall_conns()
- Implement the FI_OPT_CUDA_API_PERMITTED option
- Write "len" field for remote write
- Ignore error path domain_close return
- Free coll_pool on ep close
- Update rxm to use util_cq FI_PEER support functions
- Fix incorrect CQ completion field
- Rename srx to msg_srx
- Disable FI_SOURCE if not requested
- Memory leaks removed
- Set offload_coll_mask based on actual configuration
- Report on coll offload capabilities with OFI_OFFLOAD_PROV_ONLY
- Fabric sets up collective offload fabric
- Create eq for collective offload provider
- Close collective providers ep when rxm_ep is closed
- Fix incorrect use of OFI_UNUSED()
- Rework collective support to use collective provider(s)

## SHM

- Fix potential deadlock in smr_generic_rma()
- smr_generic_rma() write error completion with positive errno
- Update SHM to use ROCR
- Fix incorrect discard call when cleaning up unexpected queues
- Separate smr_generic_msg into msg and tagged recv
- Fix start_msg call
- Implement the FI_OPT_CUDA_API_PERMITTED option
- Assert not valid atomic op
- Fix a bug in smr_av_insert
- Optimize locking on the SAR path
- Remove unneeded sar_cnt
- Optimize locking
- Enable multiple GPU/interface support
- Remove HMEM specific calls from atomic path
- Use util_cq FI_PEER support
- Import shm as device host memory
- Add HMEM flag to smr region
- Fix user_id support
- Write tx err comp to correct cq
- Fix index when setting FI_ADDR_USER_ID

## TCP

- Provider source has been replaced by net provider source
- Removed incorrect reporting of support for FI_ATOMIC
- Do not save unmatched messages until we have the peer's fi_addr
- Use internal flag for FI_CLAIM messages, versus a reserved tag bit
- Fix updating error counter when discarding saved messages
- Allow saved messages to be received after the underlying ep has been closed
- Enhanced debug logging in connection path
- Force CM progress on unconnected ep's when posting data transfers
- Support connect and accept calls with io_uring
- Fix segfault accessing an invalid fi_addr
- Add io_uring support for CM message exchange
- Move CM progress from fabric to EQ to improve multi-threaded performance
- Fix small memory leak destroying an EQ
- Fix race where same rx entry could be freed twice
- Handle NULL av in rdm ep cleanup
- Reduce stack use for epoll event array

## UCX

- New provider targeting Nvidia fabrics that layers over libucp

## Util

- Fix the behavior of cq_read for FI_PEER
- rocr: Fix compilation issue
- cuda: Use correct debug string calls
- Free cq->peer_cq on close
- Remove extra new line from av insert log
- Check for count = 0 in ofi_ip_av_insert
- rocr: Add support for ROCR IPC
- Add FI_PEER support to util_cq
- Disable FI_SOURCE if not requested
- Remove FID events from the EQ when closing endpoint
- Rework collective support to be a peer collective provider(s)
- Allow FI_PEER to pass CQ, EQ and AV attr checking
- Remove annoying WARNING message for FI_AFFINITY
- Add utility collective provider

## Verbs

- Implement the FI_OPT_CUDA_API_PERMITTED option
- Add support for ROCR IPC

## Fabtests

- Add fi_setopt_test unit test
- Update ze device registration calls
- fi-rdmabw-xe: Always use host buffer for synchronization
- Fix bug in posting RMA operation
- fi_cq_data: Extend test to fi_writedata
- fi_cq_data: Extend validation of completion data
- Rename fi_msg_inject tests to fi_inject_test to reflect its use
- fi_rdm_stress: Add count option to json key/pair options
- Add and fix OOB option handling in several tests
- fi_eq_test: Fix incorrect return value
- fi_rdm_multi_client: Increase the size of ep name buffer
- Add FI_MR_RAW to default mr_mode
- Support larger control messages needed by newer providers
- fi-rdmabw-xe: Update to work with the ucx provider
- fi_ubertest: Cleanup allocations in failure cases
- Change ft_reg_mr to not assume hmem iface & device
- fi_multinode: Bugfix multinode test for ze + verbs
- fi_multinode: Remove unused validation print
- fi_multinode: Skip tests for unsupported collective operations
- fi_ubertest: Fix data validation with device memory
- fi_peek_tagged: Restructure and expand test

v1.17.1, Fri Mar 3, 2023
========================
2 changes: 1 addition & 1 deletion configure.ac
@@ -8,7 +8,7 @@ dnl
dnl Process this file with autoconf to produce a configure script.

AC_PREREQ([2.60])
-AC_INIT([libfabric], [1.18.0rc2], [ofiwg@lists.openfabrics.org])
+AC_INIT([libfabric], [1.18.0], [ofiwg@lists.openfabrics.org])
AC_CONFIG_SRCDIR([src/fabric.c])
AC_CONFIG_AUX_DIR(config)
AC_CONFIG_MACRO_DIR(config)
2 changes: 1 addition & 1 deletion fabtests/configure.ac
@@ -5,7 +5,7 @@ dnl
dnl Process this file with autoconf to produce a configure script.

AC_PREREQ(2.57)
-AC_INIT([fabtests], [1.18.0rc2], [ofiwg@lists.openfabrics.org])
+AC_INIT([fabtests], [1.18.0], [ofiwg@lists.openfabrics.org])
AC_CONFIG_AUX_DIR(config)
AC_CONFIG_MACRO_DIR(config)
AC_CONFIG_HEADERS(config.h)
2 changes: 1 addition & 1 deletion include/windows/config.h
@@ -174,7 +174,7 @@
#define PACKAGE_TARNAME PACKAGE

/* Define to the version of this package. */
-#define PACKAGE_VERSION "1.18.0rc2"
+#define PACKAGE_VERSION "1.18.0"

/* Define to the full name and version of this package. */
#define PACKAGE_STRING PACKAGE_NAME " " PACKAGE_VERSION
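
Because `PACKAGE_STRING` is built by adjacent string-literal concatenation, changing `PACKAGE_VERSION` alone is enough to update the full name-and-version string. The sketch below illustrates this in isolation. The value of `PACKAGE` and the `is_release_candidate()` helper are assumptions made for the example and are not taken from this commit.

```c
#include <assert.h>
#include <string.h>

/* Assumed value for illustration; the real header defines PACKAGE elsewhere. */
#define PACKAGE "libfabric"
#define PACKAGE_NAME PACKAGE
#define PACKAGE_VERSION "1.18.0"

/* Adjacent string literals are merged by the compiler into one literal. */
#define PACKAGE_STRING PACKAGE_NAME " " PACKAGE_VERSION

/*
 * Hypothetical helper: returns 1 if the version string contains an
 * "rc" release-candidate suffix such as "1.18.0rc2", otherwise 0.
 */
int is_release_candidate(const char *version)
{
	assert(version != NULL);
	return strstr(version, "rc") != NULL;
}
```

Under these assumptions `PACKAGE_STRING` expands to a single literal containing the name, a space, and the final release version, with no `rc2` suffix left over from the release candidate.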