Better CPU utilisation for wide systems during analysis #889

tomato42 · 2023-11-29T12:06:12Z

Feature request

Is your feature request related to a problem? Please describe

When running a medium-sized analysis (n=2M, k=34) on large systems (128 cores or more), there is a significant amount of idle time: for 10-20% of the wall-clock time the CPUs are not 100% utilised. It would be nice if the processing were more aggressive in using the CPUs.

This is caused by the analysis script creating a new mp.Pool for every statistic it calculates:

tlsfuzzer/tlsfuzzer/analysis.py

Lines 392 to 396 in a2d1236

    
    with mp.Pool(self.workers) as pool:
        pvals = list(pool.imap_unordered(
            self._mt_process_runner,
            zip(comb, repeat(sum_func), repeat(args)),
            job_size))

tlsfuzzer/tlsfuzzer/analysis.py

Lines 787 to 791 in a2d1236

    
    with mp.Pool(self.workers, initializer=self._import_diffs,
                 initargs=(_diffs,)) as pool:
        cent_tend = pool.imap_unordered(
            self._cent_tend_of_random_sample,
            chain(repeat(job_size, reps // job_size), [reps % job_size]))

Describe the solution you'd like

The script should create one Pool of workers for the whole analysis and then keep reusing it.
The data should be passed between the main process and the workers through memory-mapped files (that should also help with memory pressure on systems with little memory relative to the CPU count).
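
As a rough illustration of the idea (not tlsfuzzer code), the sketch below creates a single pool, has every worker map a shared file once in the pool initializer, and then reuses the same workers for two different statistics. All names (`_attach`, `_chunk_sum`, `_chunk_max`, `analyse`) are hypothetical, the input is assumed to be a non-empty list of floats, and the `fork` start method is assumed (POSIX only); with `spawn` the entry point would need an `if __name__ == "__main__":` guard.

```python
import mmap
import multiprocessing as mp
import os
import tempfile
from array import array

_data = None  # per-worker view of the shared samples, set by _attach()


def _attach(path):
    """Pool initializer: map the shared file once per worker process."""
    global _data
    with open(path, "rb") as f:
        # The mapping stays valid after the file object is closed.
        buf = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Interpret the raw bytes as native C doubles without copying them.
    _data = memoryview(buf).cast("d")


def _chunk_sum(bounds):
    start, stop = bounds
    return sum(_data[start:stop])


def _chunk_max(bounds):
    start, stop = bounds
    return max(_data[start:stop])


def analyse(values, workers=2, job_size=4):
    """Return (mean, maximum) of values using one pool for both statistics."""
    # Write the samples to a file that the workers will memory-map.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        array("d", values).tofile(f)
        path = f.name
    # Only index ranges travel over the pipes, never the samples themselves.
    chunks = [(i, min(i + job_size, len(values)))
              for i in range(0, len(values), job_size)]
    ctx = mp.get_context("fork")  # assumption: POSIX system
    try:
        with ctx.Pool(workers, initializer=_attach,
                      initargs=(path,)) as pool:
            # First statistic.
            total = sum(pool.imap_unordered(_chunk_sum, chunks))
            # Second statistic, same workers, no new pool start-up.
            largest = max(pool.imap_unordered(_chunk_max, chunks))
    finally:
        os.unlink(path)
    return total / len(values), largest
```

In a real analysis the pool would presumably be created once per run and handed to every method that currently opens its own `mp.Pool`; the file format and chunking above are only placeholders for whatever layout the analysis data actually uses.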

Describe alternatives you've considered

The alternative is to just wait longer or get larger machines, but then the inefficiencies become more and more apparent.

Additional context

n/a


tomato42 added the enhancement (new feature to be implemented) and complex (issues that require good knowledge of tlsfuzzer internals) labels Nov 29, 2023

tomato42 changed the title ~~Better CPU utilisation for wide systems~~ Better CPU utilisation for wide systems during analysis Dec 3, 2023

tomato42 added the help wanted label Jan 2, 2024
