v1.18.0

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
ofiwg · Apr 7, 2023 · 8f3a881
1 parent 952e9bc
commit 8f3a881

Showing 5 changed files with 214 additions and 3 deletions.
diff --git a/AUTHORS b/AUTHORS
@@ -61,6 +61,7 @@ Dipti Kothari <dkothar@amazon.com>
 Dmitry Durnov <dmitry.durnov@intel.com>
 Dmitry Gladkov <dmitry.gladkov@intel.com>
 Doug Oucharek <dougso@me.com>
+Edgar Gabriel <Edgar.Gabriel@amd.com>
 Eric Raut <eraut@amazon.com>
 Erik Paulson <epaulson10@gmail.com>
 Erik Paulson <erik.r.paulson@intel.com>

diff --git a/NEWS.md b/NEWS.md
@@ -11,30 +11,240 @@ v1.18.0, Fri Apr 7, 2023
 
 ## Core
 
+- rocr: fix offset calculation
+- rocr: use ofi spinlock functions
+- rocr: minor fixes
+- neuron: convert warn to info for nrt_get_dmabuf_fd not found
+- neuron: check existence of neuron devices during initialization
+- neuron: Add support for neuron dma-buf
+- ze: update ZE to support new driver index specification
+- List variables read from config file
+- Add switch to prefer system-config over environment
+- Add basic system-config support for setting library variables
+- Move peer provider defines into new header
+- rocr: Support asynchronous memory copies
+- rocr: Add support for ROCR IPC
+- rocr: rename rocr data-structures
+- synapseai: return 0 for host_register and host_deregister
+- fabric: Improve log level of provider mismatch
+- cuda: Allow CUDA IPC when P2P disabled
+- ze: add ZE command list pool to reuse command lists
+- cuda: implement cuda_get_xfer_setting for non cuda build
+- cuda: adjust FI_HMEM_CUDA_ENABLE_XFER behavior
+- cuda.c: Add const to param to remove warning
+- Add IFF_RUNNING check to indicate iface is up and running
+- io_uring support enhancements
+
 ## EFA
 
+- Implement CUDA support on instance types that do not support GPUDirect RDMA
+- Implement fi_write using device's RDMA write capability
+- Enrich error messages with debug and connection info
+- Implement support for FI_OPT_EFA_USE_DEVICE_RDMA in fi_setopt
+- Implement support for FI_OPT_CUDA_API_PERMITTED in fi_setopt
+- Add support for neuron dma-buf
+- Use gdrcopy to improve the intra-node CUDA communication performance for small messages
+- Use shm provider's FI_AV_USER_ID support
+- Fix bugs in efa provider’s shm info initialization procedure
+
 ## Hooks
 
+- dmabuf_peer_mem: Handle IPC handle caching in L0
+- trace: Add trace log for CM operation APIs
+- trace: Change tag in trace log to hex format
+- trace: Enhance trace log for data transfer API calls
+- trace: Add trace log for API fi_cq_readerr()
+- trace: Add trace log for CQ operation APIs
+- Add tracing hook provider
+
 ## Net
 
+- Net provider optimizations have been integrated into the tcp provider.
+- Net provider has been removed as a reported provider.
+
 ## OPX
 
+- Fixes for Coverity scan issues
+- Enhanced tag matching
+- Tune expected recv for unaligned buffers
+- Add finer logging granularity
+- Reduce RTS immediate data and fix packet estimate for odd TID lengths
+- Add additional sources for FI_OPX_UUID
+- Exclude opx from build if missing needed defines
+- Move some logs to optimized builds
+- Fix build warnings for unused return code from posix_memalign
+- Add reliability sanity check to detect when send buffer is illegally altered
+- SDMA Completion workaround for driver cache invalidation race condition
+- Fix replay payload pointer increment
+- Handle completion counter across multiple writes in SDMA
+- Cleanup pointers after free()
+- Modify domain creation to handle soft cache errors
+- Two biband performance improvements
+- Fixes based on Coverity Scan related to auto progress patch
+- Changed poll many argument to rx_caps instead of caps
+- Resync with server configured for Multi-Engines (DAOS CART Self Tests)
+- Remove import_monitor as ENOSYS case
+- Address memory leaks reported on OFIWG issues page
+- General code cleanup
+- Add replays over SDMA
+- Implement basic TID Cache
+- Revert work_pending check change
+- Fix use_immediate_blocks
+- Restore state after replay packet is NULL
+- Fix memory leak from early arrival packets
+- Fix segfault in SHM operations from uninitialized value in atomic path
+- Prevent SDMA work entries from being reused with outstanding replays
+- Set runtime as default for OPX_AV
+- Fix RTS replay immediate data
+- Fix errors caught by the upstream libfabric Coverity Scan
+- fi_getInfo - Support multiple HFI devices
+- Support OFI_PORT and Contiguous endpoint addresses for CART & Mercury
+- Add fi_opx_tid.h to Makefile.include
+- Fix progress checks and default domain
+- Revert is_intranode simplification.
+- Don't inline handle_ud_ping function
+- Allow atomic fetch ops to use SDMA for sufficiently large counts
+- Cleaned up FI_LOG_LEVEL=warn output
+- Cleaned up unused macros for FI_REMOTE_COMM and FI_LOCAL_COMM
+- Reset default progress to FI_PROGRESS_MANUAL
+- Fixed GCC 10 build error with Auto Progress
+- Add support for FI_PROGRESS_AUTO
+- Use max allowed packet size in SDMA path when expected TID is off
+- Expected receive (TID) rendezvous
+- RMA Read/Write operations over SDMA
+- Remove origin_rs from cts and dput packet header
+- Fix for hang in DAOS CART tests
+- Use single IOV for bounce buffer in SDMA requests.
+- Check for FI_MULTI_RECV with bitwise OR instead of AND
+- Fix for intermittent intra-node deadlock hang (DAOS CART tests)
+- Fix to RPC transport error failure (DAOS CART tests)
+- Fix for context->buf set to NULL
+- Fix bad asserts
+- Ensure atomicity of atomic ops
+- fi_opx_cq_poll_inline count and head check fix
+- Fix intermittent intra-node hang causing RPC timeouts (DAOS CART tests)
+
 ## PSM3
 
+- Update provider to sync with IEFS 11.4.1.1.2
+- Fix warnings from build
+- Add oneapi ZE support to OFI configure
+
+## RXD
+
+- Ignore error path in av_close return
+
 ## RXM
 
+- Handle NULL av in rxm_freeall_conns()
+- Implement the FI_OPT_CUDA_API_PERMITTED option
+- Write "len" field for remote write
+- Ignore error path domain_close return
+- Free coll_pool on ep close
+- Update rxm to use util_cq FI_PEER support functions
+- Fix incorrect CQ completion field
+- Rename srx to msg_srx
+- Disable FI_SOURCE if not requested
+- Memory leaks removed
+- Set offload_coll_mask based on actual configuration
+- Report on coll offload capabilities with OFI_OFFLOAD_PROV_ONLY
+- Fabric sets up collective offload fabric
+- Create eq for collective offload provider
+- Close collective providers ep when rxm_ep is closed
+- Fix incorrect use of OFI_UNUSED()
+- Rework collective support to use collective provider(s)
+
 ## SHM
 
+- Fix potential deadlock in smr_generic_rma()
+- smr_generic_rma() write error completion with positive errno
+- Update SHM to use ROCR
+- Fix incorrect discard call when cleaning up unexpected queues
+- Separate smr_generic_msg into msg and tagged recv
+- Fix start_msg call
+- Implement the FI_OPT_CUDA_API_PERMITTED option
+- Assert not valid atomic op
+- Fix a bug in smr_av_insert
+- Optimize locking on the SAR path
+- Remove unneeded sar_cnt
+- Optimize locking
+- Enable multiple GPU/interface support
+- Remove HMEM specific calls from atomic path
+- Use util_cq FI_PEER support
+- Import shm as device host memory
+- Add HMEM flag to smr region
+- Fix user_id support
+- Write tx err comp to correct cq
+- Fix index when setting FI_ADDR_USER_ID
+
 ## TCP
 
+- Provider source has been replaced by net provider source
+- Removed incorrect reporting of support for FI_ATOMIC
+- Do not save unmatched messages until we have the peer's fi_addr
+- Use internal flag for FI_CLAIM messages, versus a reserved tag bit
+- Fix updating error counter when discarding saved messages
+- Allow saved messages to be received after the underlying ep has been closed
+- Enhanced debug logging in connection path
+- Force CM progress on unconnected ep's when posting data transfers
+- Support connect and accept calls with io_uring
+- Fix segfault accessing an invalid fi_addr
+- Add io_uring support for CM message exchange
+- Move CM progress from fabric to EQ to improve multi-threaded performance
+- Fix small memory leak destroying an EQ
+- Fix race where same rx entry could be freed twice
+- Handle NULL av in rdm ep cleanup
+- Reduce stack use for epoll event array
+
 ## UCX
 
+- New provider targeting Nvidia fabrics that layers over libucp
+
 ## Util
 
+- Fix the behavior of cq_read for FI_PEER
+- rocr: Fix compilation issue
+- cuda: Use correct debug string calls
+- Free cq->peer_cq on close
+- Remove extra new line from av insert log
+- Check for count = 0 in ofi_ip_av_insert
+- rocr: Add support for ROCR IPC
+- Add FI_PEER support to util_cq
+- Disable FI_SOURCE if not requested
+- Remove FID events from the EQ when closing endpoint
+- Rework collective support to be a peer collective provider(s)
+- Allow FI_PEER to pass CQ, EQ and AV attr checking
+- Remove annoying WARNING message for FI_AFFINITY
+- Add utility collective provider
+
 ## Verbs
 
+- Implement the FI_OPT_CUDA_API_PERMITTED option
+- Add support for ROCR IPC
+
 ## Fabtests
 
+- Add fi_setopt_test unit test
+- Update ze device registration calls
+- fi-rdmabw-xe: Always use host buffer for synchronization
+- Fix bug in posting RMA operation
+- fi_cq_data: Extend test to fi_writedata
+- fi_cq_data: Extend validation of completion data
+- Rename fi_msg_inject tests to fi_inject_test to reflect its use
+- fi_rdm_stress: Add count option to json key/pair options
+- Add and fix OOB option handling in several tests
+- fi_eq_test: Fix incorrect return value
+- fi_rdm_multi_client: Increase the size of ep name buffer
+- Add FI_MR_RAW to default mr_mode
+- Support larger control messages needed by newer providers
+- fi-rdmabw-xe: Update to work with the ucx provider
+- fi_ubertest: Cleanup allocations in failure cases
+- Change ft_reg_mr to not assume hmem iface & device
+- fi_multinode: Bugfix multinode test for ze + verbs
+- fi_multinode: Remove unused validation print
+- fi_multinode: Skip tests for unsupported collective operations
+- fi_ubertest: Fix data validation with device memory
+- fi_peek_tagged: Restructure and expand test
 
 v1.17.1, Fri Mar 3, 2023
 ========================

diff --git a/configure.ac b/configure.ac
@@ -8,7 +8,7 @@ dnl
 dnl Process this file with autoconf to produce a configure script.
 
 AC_PREREQ([2.60])
-AC_INIT([libfabric], [1.18.0rc2], [ofiwg@lists.openfabrics.org])
+AC_INIT([libfabric], [1.18.0], [ofiwg@lists.openfabrics.org])
 AC_CONFIG_SRCDIR([src/fabric.c])
 AC_CONFIG_AUX_DIR(config)
 AC_CONFIG_MACRO_DIR(config)

diff --git a/fabtests/configure.ac b/fabtests/configure.ac
@@ -5,7 +5,7 @@ dnl
 dnl Process this file with autoconf to produce a configure script.
 
 AC_PREREQ(2.57)
-AC_INIT([fabtests], [1.18.0rc2], [ofiwg@lists.openfabrics.org])
+AC_INIT([fabtests], [1.18.0], [ofiwg@lists.openfabrics.org])
 AC_CONFIG_AUX_DIR(config)
 AC_CONFIG_MACRO_DIR(config)
 AC_CONFIG_HEADERS(config.h)

diff --git a/include/windows/config.h b/include/windows/config.h
@@ -174,7 +174,7 @@
 #define PACKAGE_TARNAME PACKAGE
 
 /* Define to the version of this package. */
-#define PACKAGE_VERSION "1.18.0rc2"
+#define PACKAGE_VERSION "1.18.0"
 
 /* Define to the full name and version of this package. */
 #define PACKAGE_STRING PACKAGE_NAME " " PACKAGE_VERSION