Can't load 14 billion credentials at once #31

Open
kpcyrd opened this issue Mar 29, 2018 · 1 comment
Labels
scheduler Related to the badtouch scheduler

Comments

@kpcyrd
Owner

kpcyrd commented Mar 29, 2018

It turns out the scheduler is surprisingly inefficient at loading very large lists. After some math, it's clear it needs to be redesigned to allow lists of that size in one go:

On a 64bit system, even just collecting the pointers of all newlines takes a very large amount of memory:

14_000_000_000 * 8 = 112_000_000_000  # bytes, ~104.3 GiB

Three important bits we need to keep in mind for feature parity with the current system:

  • due to threading we need to be able to process this list at multiple positions at once
  • to measure progress, we need to know how many credentials we've processed, but also how many we have to process in total
  • jobs can fail and need to be rescheduled

To support lists that large, we'd have to change the scheduler design:

generator thread

  • Open the list of credentials
  • Scan the whole file and count newlines
  • Seek back to 0
  • Start the worker threads
  • Fill a size-limited mpsc queue with credentials, then block at send
  • Every time a worker receives from the queue, send unblocks and the next line can be loaded and inserted into the queue.

Memory-wise, this would be one of the most lightweight solutions.
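A rough sketch of what that generator/worker split could look like, using a bounded std::sync::mpsc::sync_channel; the file name, worker count, and queue size are placeholders and this isn't actual badtouch code:

```rust
use std::fs::File;
use std::io::{BufRead, BufReader};
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

const WORKERS: usize = 4;       // placeholder thread count
const QUEUE_SIZE: usize = 1024; // upper bound on credentials held in memory

fn main() -> std::io::Result<()> {
    // Bounded channel: send() blocks once QUEUE_SIZE lines are in flight.
    let (tx, rx) = mpsc::sync_channel::<String>(QUEUE_SIZE);
    let rx = Arc::new(Mutex::new(rx));

    let mut handles = Vec::new();
    for _ in 0..WORKERS {
        let rx = rx.clone();
        handles.push(thread::spawn(move || loop {
            // One worker at a time waits on the channel; recv() fails once
            // the generator has dropped the sender, which ends the worker.
            let line = match rx.lock().unwrap().recv() {
                Ok(line) => line,
                Err(_) => break,
            };
            // parse the credentials and test them here
            let _ = line;
        }));
    }

    // Generator: stream the file line by line, blocking whenever the queue is full.
    let reader = BufReader::new(File::open("credentials.txt")?);
    for line in reader.lines() {
        tx.send(line?).expect("all workers exited");
    }
    drop(tx); // closes the channel so the workers shut down

    for handle in handles {
        handle.join().unwrap();
    }
    Ok(())
}
```

Since std::sync::mpsc only has a single consumer, the receiver is shared behind a mutex here; an MPMC channel (crossbeam-channel, for example) would avoid that extra lock.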

offset + limit

This could be applied to dict-style runs as well:

  • Skip offset number of attempts
  • Submit limit number of attempts
  • Ignore everything else
    This would also allow resuming aborted jobs (assuming the offset has been saved) or distributing tests across machines (especially for dict-style runs).

It would be quirky to use, though.
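As a sketch (assuming line-based credential lists; the function and parameter names are made up for illustration), the scheduling side mostly boils down to skip/take over the line iterator:

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};

// Return only the slice of the list this run is responsible for.
// The skipped lines are still read from disk, but never kept in memory.
fn scheduled_lines(path: &str, offset: usize, limit: usize) -> io::Result<Vec<String>> {
    let reader = BufReader::new(File::open(path)?);
    reader
        .lines()
        .skip(offset) // skip `offset` attempts
        .take(limit)  // submit `limit` attempts
        .collect()    // everything past offset + limit is ignored
}
```

A resumed run would pass whatever offset was persisted when the previous run aborted; a distributed run would hand each machine a disjoint offset/limit window.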

zero-copy + chunk assignment

To avoid the overhead that comes from our data structures, we could map the whole file into RAM and operate on slices. Since we need to process the list in parallel, we could split the file into chunks of a specific size; each worker then processes its chunk independently, with no synchronization needed until it reaches the end of its chunk.

This still requires enough RAM to load the whole file at once.
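A sketch of the chunk assignment, assuming the memmap2 crate for the mapping (the issue doesn't commit to a specific crate); chunk boundaries are pushed forward to the next newline so no credential is split between two workers:

```rust
use memmap2::Mmap;
use std::fs::File;

// Split the mapped file into roughly `chunks` newline-aligned byte ranges.
fn chunk_bounds(data: &[u8], chunks: usize) -> Vec<(usize, usize)> {
    let approx = data.len() / chunks.max(1);
    let mut bounds = Vec::new();
    let mut start = 0;
    while start < data.len() {
        // Aim for `approx` bytes, then extend to the next newline.
        let mut end = (start + approx).min(data.len());
        while end < data.len() && data[end] != b'\n' {
            end += 1;
        }
        if end < data.len() {
            end += 1; // include the newline itself
        }
        bounds.push((start, end));
        start = end;
    }
    bounds
}

fn main() -> std::io::Result<()> {
    let file = File::open("credentials.txt")?;
    // Safety: the file must not be modified while it is mapped.
    let mmap = unsafe { Mmap::map(&file)? };
    for (start, end) in chunk_bounds(&mmap, 4) {
        // In the real design each worker would own one of these slices.
        for line in mmap[start..end].split(|&b| b == b'\n') {
            let _ = line; // parse and test the credential here
        }
    }
    Ok(())
}
```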

Mutex<Cursor>

We can simply scan the file in the main thread, count the credentials, seek back to 0, and then put the file handle behind a mutex:

  • Lock the bufreader
  • Read an entry
  • Release the mutex
  • Parse the credentials and test them

This would require adding an error message to the msg loop, since reading from the file might fail in a non-recoverable way.
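A minimal sketch of this variant, with the BufReader behind an Arc<Mutex<...>>; the file name and thread count are placeholders:

```rust
use std::fs::File;
use std::io::{BufRead, BufReader};
use std::sync::{Arc, Mutex};
use std::thread;

fn main() -> std::io::Result<()> {
    let reader = Arc::new(Mutex::new(BufReader::new(File::open("credentials.txt")?)));

    let mut handles = Vec::new();
    for _ in 0..4 {
        let reader = reader.clone();
        handles.push(thread::spawn(move || loop {
            let mut line = String::new();
            // Lock the bufreader and read one entry; the guard is released
            // as soon as this statement ends.
            let read = reader.lock().unwrap().read_line(&mut line);
            match read {
                Ok(0) => break, // end of file
                Ok(_) => {
                    // parse the credentials and test them, outside the lock
                    let _ = line.trim_end();
                }
                // this is the non-recoverable failure that would have to be
                // reported back through the msg loop
                Err(_) => break,
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }
    Ok(())
}
```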


Note that there's also some overhead from the way the threadpool currently works, which allocates some memory for each job we want to run. While this isn't much, keep in mind that even a single byte per credential would add up to 14 GB.

In the end, I'm not sure if tests that large are realistic and how much effort should go into this.

@kpcyrd kpcyrd added the scheduler Related to the badtouch scheduler label Mar 29, 2018
@kpcyrd
Owner Author

kpcyrd commented Apr 1, 2018

#40 significantly reduces RAM usage
