Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Lnear Congruential Generator to generate work-stealing permutation #193

Open
mratsim opened this issue Feb 24, 2023 · 1 comment
Open
Labels
enhancement :shipit: New feature or request

Comments

@mratsim
Copy link
Owner

mratsim commented Feb 24, 2023

A tricky issue in message-passing based work-requesting/work-stealing is how to efficiently select victims, especially with a large number of cores.

  • The initial intuition is to use a bitset, but an uint64 can only address up to 64 cores.
    • What if we have a 8-socket Xeon Platinum 9282 with 56 cores - 112 threads each
      hence a total of 896 threads?
      A 896-bit bitset would be huge, and we don't really want to have this be a compile-time parameter.
  • The current solution in Weave is unsatisfactory, we send to the victims an address to our own set of tried/untried victims. This works with shared-memory but not in a distributed computing setting.

Instead we can use a Linear Congruential Generator: https://en.wikipedia.org/wiki/Linear_congruential_generator

The following recurrence generates all random numbers mod m without repetition:

Xₙ₊₁ = aXₙ+c (mod m)

if and only if (Hull-Dobell theorem):

  1. c and m are coprime
  2. a-1 is divisible by all prime factors of m
  3. a-1 is divisible by 4 if m is divisible by 4

The only state to send in steal request would be a and c. Xₙ is the victim ID and m is either the number of threads or for ease of implementation the next power of 2.

Nim code: https://github.com/mratsim/constantine/blob/1dfbb8bd4f2fefc4059f4aefb040f0b3ba2918cb/constantine/platforms/threadpool/threadpool.nim#L84-L128

iterator pseudoRandomPermutation(randomSeed: uint32, maxExclusive: int32): int32 =
  ## Create (low-quality) pseudo-random permutations for [0, max)
  # Design considerations and randomness constraint for work-stealing, see docs/random_permutations.md
  #
  # Linear Congruential Generator: https://en.wikipedia.org/wiki/Linear_congruential_generator
  #
  # Xₙ₊₁ = aXₙ+c (mod m) generates all random number mod m without repetition
  # if and only if (Hull-Dobell theorem):
  # 1. c and m are coprime
  # 2. a-1 is divisible by all prime factors of m
  # 3. a-1 is divisible by 4 if m is divisible by 4
  #
  # Alternative 1. By choosing a=1, all conditions are easy to reach.
  #
  # The randomness quality is not important besides distributing potential contention,
  # i.e. randomly trying thread i, then i+1, then i+n-1 (mod n) is good enough.
  #
  # Assuming 6 threads, co-primes are [1, 5], which means the following permutations
  # assuming we start with victim 0:
  # - [0, 1, 2, 3, 4, 5]
  # - [0, 5, 4, 3, 2, 1]
  # While we don't care much about randoness quality, it's a bit disappointing.
  #
  # Alternative 2. We can choose m to be the next power of 2, meaning all odd integers are co-primes,
  # consequently:
  # - we don't need a GCD to find the coprimes
  # - we don't need to cache coprimes, removing a cache-miss potential
  # - a != 1, so we now have a multiplicative factor, which makes output more "random looking".

  # n and (m-1) <=> n mod m, if m is a power of 2
  let maxExclusive = cast[uint32](maxExclusive)
  let M = maxExclusive.nextPowerOfTwo_vartime()
  let c = (randomSeed and ((M shr 1) - 1)) * 2 + 1 # c odd and c ∈ [0, M)
  let a = (randomSeed and ((M shr 2) - 1)) * 4 + 1 # a-1 divisible by 2 (all prime factors of m) and by 4 if m divisible by 4

  let mask = M-1                                   # for mod M
  let start = randomSeed and mask

  var x = start
  while true:
    if x < maxExclusive:
      yield cast[int32](x)
    x = (a*x + c) and mask                         # ax + c (mod M), with M power of 2
    if x == start:
      break
@mratsim mratsim added the enhancement :shipit: New feature or request label Feb 24, 2023
@dumblob
Copy link

dumblob commented Feb 27, 2023

You are genius! I use congruential generators for many different use cases but this use case I never thought of. So simple yet so efficient. Congratulations!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement :shipit: New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants