Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iox-#2177 Update SPSC SoFi #2214

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

albtam
Copy link
Contributor

@albtam albtam commented Mar 5, 2024

  • add comments to clarify concurrent behavior
  • fix memory orders

Pre-Review Checklist for the PR Author

  1. Add a second reviewer for complex new features or larger refactorings
  2. Code follows the coding style of CONTRIBUTING.md
  3. Tests follow the best practice for testing
  4. Changelog updated in the unreleased section including API breaking changes
  5. Branch follows the naming format (iox-123-this-is-a-branch)
  6. Commits messages are according to this guideline
  7. Update the PR title
    • Follow the same conventions as for commit messages
    • Link to the relevant issue
  8. Relevant issues are linked
  9. Add sensible notes for the reviewer
  10. All checks have passed (except task-list-completed)
  11. All touched (C/C++) source code files from iceoryx_hoofs are added to ./clang-tidy-diff-scans.txt
  12. Assign PR to reviewer

Notes for Reviewer

Checklist for the PR Reviewer

  • Commits are properly organized and messages are according to the guideline
  • Code according to our coding style and naming conventions
  • Unit tests have been written for new behavior
  • Public API changes are documented via doxygen
  • Copyright owner are updated in the changed files
  • All touched (C/C++) source code files from iceoryx_hoofs have been added to ./clang-tidy-diff-scans.txt
  • PR title describes the changes

Post-review Checklist for the PR Author

  1. All open points are addressed and tracked via issues

References

@albtam
Copy link
Contributor Author

albtam commented Mar 5, 2024

From #2178 (comment)

@elBoberido:

I could make your example pass miri by making both compare and swap loops use either compare_exchange or compare_exchange_weak.
But when increasing the number of pushes a lot it still detects (false positive?) races. The tests also need to be done in a different way, e.g. the cancelation points of the loop become more complex due to the overflow behavior.

Copy link

codecov bot commented Mar 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.06%. Comparing base (b2cd72b) to head (04837bb).
Report is 142 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2214      +/-   ##
==========================================
- Coverage   79.09%   78.06%   -1.03%     
==========================================
  Files         431      432       +1     
  Lines       16581    15904     -677     
  Branches     2300     2301       +1     
==========================================
- Hits        13114    12415     -699     
- Misses       2616     2644      +28     
+ Partials      851      845       -6     
Flag Coverage Δ
unittests 77.88% <100.00%> (-1.00%) ⬇️
unittests_timing 14.98% <46.15%> (-0.35%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...concurrent/buffer/include/iox/detail/spsc_sofi.hpp 100.00% <ø> (ø)
...concurrent/buffer/include/iox/detail/spsc_sofi.inl 91.83% <100.00%> (+2.36%) ⬆️

... and 238 files with indirect coverage changes

Copy link
Member

@elBoberido elBoberido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for letting you wait so long for the review. I needed a fresh mind to review lock-free code :)

- add comments to clarify concurrent behavior
- fix memory orders
@elBoberido
Copy link
Member

@albtam just to be sure that you are not waiting on feedback. I'll wait until the CI (at least the github actions) are green.

@albtam
Copy link
Contributor Author

albtam commented Mar 27, 2024

@albtam just to be sure that you are not waiting on feedback. I'll wait until the CI (at least the github actions) are green.

@elBoberido Back to you

Comment on lines 187 to 189
// While memory synchronization is not needed with m_readPosition, we need
// memory_order_acquire to avoid the reordering of the operation
uint64_t currentReadPosition = m_readPosition.load(std::memory_order_acquire);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This corresponds to the m_readPosition.compare_exchange_weak store operation in pop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, what I wanted to say is that the memory synchronization is not required. But to avoid the reordering, we need a corresponding store/release. So in the end, there will be a memory synchronization but this is only a side effect. Do you agree?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree. What do you think about talking about happens before relationships instead of memory synchronization and reordering? This is a more generic term and from the context it becomes clear if the happens before relationship is required for memory or for logic.

In this specific case the comment could become something like We need to establish a happens before relationship with 'm_writePosition.load(...)'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@elBoberido Sorry, can you remind me what's the happens-before relationship we want to enforce here? Since it's a load/acquire, it doesn't influence the order of operations above (so it can't be m_writePosition.load(...)).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. This specific one acts as a sync point to a store in pop and I think we can remove it since the CAS below now has the correct ordering and does an acquire load on success. We might need to change the failure ordering from relaxed to acquire to still act as a sync point to the store in pop but I'm not fully sure about this. Maybe @elfenpiff, @FerdinandSpitzschnueffler, @MatthiasKillat @budrus could also share their expertise.

We can do the same with the CAS in the pop method but then the early return needs to be removed and replaced with the previous logic again. Overall, I think it would be more performant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might need to change the failure ordering from relaxed to acquire to still act as a sync point to the store in pop but I'm not fully sure about this.

I think we do need to make it acquire

We can do the same with the CAS in the pop method but then the early return needs to be removed and replaced with the previous logic again. Overall, I think it would be more performant.

Not sure to understand what you want to do here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean we can make uint64_t currentReadPosition = m_readPosition.load(std::memory_order_acquire); in pop also relaxed but then the early return needs to be removed again and the previous logic with popWasSuccessful needs to be re-introduced.

Overall, this would lead to only one acquire and one release for m_writePosition and two acq_rel/acquire for m_readPosition which simplifies the relation of the sync points.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, changed

@albtam
Copy link
Contributor Author

albtam commented Apr 10, 2024

@elBoberido Thanks. I addressed most of your comments. There are still some points to clarify

@elBoberido
Copy link
Member

@albtam you can undo some of my suggestions 😅

Copy link
Member

@elBoberido elBoberido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This somehow becomes a never ending story but with the revert of the relaxed ordering in the places where acq_rel was required, we might be able to relax some of the acquire orderings. Unfortunately this is something which is hard to see when there are too many wrong orderings but once the ordering is at least as strong as required, we can identify the locations which definitely need the strong ordering and can try to relax some orderings which act as sync points when we identify new sync points or change the algorithm a bit.

@albtam
Copy link
Contributor Author

albtam commented Apr 23, 2024

I pushed some of your proposals. There is still something unclear. Lock-free is hard :P

Copy link
Member

@elBoberido elBoberido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, lock-free is hard 😅

Btw, can you please adjust your commit messages with the issue number.

Before this is merged, it would be good to run test_stress_spsc_sofi. Ideally each of the three test for a few hours. The stress time can only be adjusted in the source code.

@@ -206,14 +206,14 @@ inline bool SpscSofi<ValueType, CapacityValue>::push(const ValueType& valueIn, V
// Another issue might be that two consecutive pushes (not concurrent) happen on different CPU
// cores without synchronization, then the memory also needs to be synchronized for the overflow
// case.
// Memory order failure is memory_order_relaxed since there is no further synchronization ////
// Memory order failure is memory_order_relaxed since there is no further synchronization
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the order is changed to memory_order_acquire, the comment is not true anymore 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

Comment on lines 187 to 189
// While memory synchronization is not needed with m_readPosition, we need
// memory_order_acquire to avoid the reordering of the operation
uint64_t currentReadPosition = m_readPosition.load(std::memory_order_acquire);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean we can make uint64_t currentReadPosition = m_readPosition.load(std::memory_order_acquire); in pop also relaxed but then the early return needs to be removed again and the previous logic with popWasSuccessful needs to be re-introduced.

Overall, this would lead to only one acquire and one release for m_writePosition and two acq_rel/acquire for m_readPosition which simplifies the relation of the sync points.

@albtam
Copy link
Contributor Author

albtam commented May 1, 2024

@elBoberido I implemented your last recommendations.

@elBoberido
Copy link
Member

@albtam sorry, this totally slipped my attention. The changes look good. Did you have a chance to run the stress tests?
It would also be great if you could run the Miri tests with the current memiry orders.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug in lock-free sofi Decipher weird comment in SoFi Update SPSCFiFo
2 participants