Merge pull request #9391 from rakhmets/topic/news-1.15.0
NEWS: Added 1.15.0 section.
shamisp committed Sep 28, 2023
2 parents e674114 + 583791e commit 348d14f
Showing 1 changed file with 18 additions and 48 deletions.
66 changes: 18 additions & 48 deletions NEWS
@@ -11,53 +11,7 @@
### Features:
### Bugfixes:

## 1.15.0-rc6 (September 20, 2023)
### Bugfixes:
#### UCP
* Fixed assertion when sending from noncontig GPU buffer to managed buffer.

## 1.15.0-rc5 (September 12, 2023)
### Bugfixes:
#### UCP
* Fixed the data race on endpoint configurations.

## 1.15.0-rc4 (August 30, 2023)
### Bugfixes:
#### RDMA CORE (IB, ROCE, etc.)
* Fixed dma-buf based memory region registration
* Fixed memory handle data corruption when PCIe relaxed ordering is enabled
#### UCS
* Fixed lane selection, adding bandwidth estimation for Sapphire Rapids family

## 1.15.0-rc3 (August 8, 2023)
### Bugfixes:
#### UCP
* Fixed endpoint reconfiguration issues because of assymetrical selection
#### UCT
* Check dmabuf kernel support in ROCm memory domain
#### UCM
* Fixed conditional jump patching
#### Tools
* Fixed memory access flags in perftest

## 1.15.0-rc2 (July 27, 2023)
### Features:
#### RDMA CORE (IB, ROCE, etc.)
* Implemented is_reachable_v2 for IB interfaces
#### Build
* Enabled build with binutils 2.40
* Added versioned dependency to switch between packages with the same names

### Bugfixes:
#### UCP
* Fixed endpoint reconfiguration error due to wrong locality detection
#### RDMA CORE (IB, ROCE, etc.)
* Fixed performance degradation when indirect atomic key is not supported by the hardware
* Fixed remote access error to strict-order key because of wrong offset
#### GPU (CUDA, ROCM)
* Fixed CUDA IPC performance degradation after libnuma removal

## 1.15.0-rc1 (May 10, 2023)
## 1.15.0 (September 28, 2023)
### Features:
#### UCP
* Added 2-stage pipeline protocol in the new protocol infrastructure
@@ -75,6 +29,7 @@
* Added base implementation of is_reachable_v2 API using intra/inter flag
* Introduced MD capability for non-blocking registration memory types
#### RDMA CORE (IB, ROCE, etc.)
* Added implementation of is_reachable_v2 routine to IB interface
* Added option to control CQE zipping per CQ RX/TX direction
* Added option to specify how DCI selects port under RoCE LAG
* Added hw_dcs to the list of policies to select DCI by an endpoint
@@ -104,12 +59,17 @@
* Added user-side memcpy option for AM benchmarks in ucx_perftest
* Added wireshark LUA dissectors for some UCX protocols
#### Build
* Added support for binutils 2.40
* Added versioned dependency to switch between packages with the same names
* Added a separate xpmem deb subpackage
* Added aarch64 support to the binary distribution pipeline
* Removed dependency on libnuma

### Bugfixes:
#### UCP
* Fixed assertion when sending from non-contiguous GPU buffer to managed buffer
* Fixed the race condition on endpoint configurations
* Fixed endpoint reconfiguration issues due to asymmetrical selection
* Fixed endpoint reconfiguration error due to wrong locality detection
* Fixed crash during connection manager cleanup
* Fixed rkey index calculation for rendezvous protocol
* Fixed rcache dump function
@@ -123,20 +83,29 @@
* Fixed CPU/device atomics selection in the new protocol infrastructure
* Multiple fixes in the new protocol infrastructure information output
#### UCT
* Added check for dmabuf kernel support in ROCm memory domain
* Fixed exported memh packing
* Fixed an error in checking return status of multi-threaded memory registration function
#### RDMA CORE (IB, ROCE, etc.)
* Fixed dma-buf based memory region registration
* Fixed memory handle data corruption when PCIe relaxed ordering is enabled
* Fixed performance degradation when indirect atomic key is not supported by the hardware
* Fixed remote access error to strict-order keys because of wrong offset
* Added check for UAR support to memory domain opening
* Fixed updating port counters for devx qp
* Fixed ibv_create_cq error message on node without Infiniband
* Fixed performance degradation due to using 2 paths on NDR400 by default
* Removed unnecessary async lock which otherwise would block UD progress
#### GPU (CUDA, ROCM)
* Fixed CUDA IPC performance degradation due to libnuma removal
#### UCS
* Fixed lane selection and added bandwidth estimation for Sapphire Rapids family
* Fixed displaying wrong environment variable suggestions
* Fixed VFS warning output
* Fixed SEGV in ucs_debug_backtrace_next(), upon previous SEGV handling, due to ENOMEM situation
* Fixed memory corruption when using UCX_MPOOL_FIFO=y
#### UCM
* Fixed conditional jump patching
* Fixed mremap() override
#### GPU (CUDA, ROCM)
* Fixed usage of dmabuf when the buffer is not page-aligned
@@ -148,6 +117,7 @@
#### Tests
* Fixed wrong usage of ep_close in examples
#### Tools
* Fixed memory access flags in perftest
* Removed support for librte from perf
* Fixed worker flush deadlock when using multiple workers in ucx_perftest
#### Build