GSoC 2019 Application SHIKSHA RAWAT : Benchmarks and performance
SymPy, being a computer algebra system, offers great functionality and extensive support for symbolic computing. But speed is also an important feature, and at present it is difficult to figure out where SymPy's bottlenecks lie.
My work on SymPy during the summer would be to find these bottlenecks, create benchmarks for them, and then improve the overall performance of SymPy.
NAME: Shiksha Rawat
Email: shiksharawat01@gmail.com
Phone: +91 9893715599
University: Birla Institute Of Technology, Mesra, Ranchi, India
I am a second-year undergraduate student of Computer Science and Engineering at Birla Institute Of Technology, Mesra, Ranchi, India.
As a student of the computer science department, my primary interests have always been mathematics and computer science. I have two years of programming experience in Python and have developed several projects with it.
I am reasonably comfortable with Git and GitHub, having used them over the past months while contributing to SymPy and on other work.
GitHub Profile : https://github.com/shiksha11
Because of my keen interest in benchmarking and profiling code, I went through the issues labelled "Performance" and found https://github.com/sympy/sympy/issues/16249 intriguing. Below, I describe how I found the bottleneck behind the slow performance of Min/Max and how I tried to substitute it with code of better complexity.
Going through the comments and the part of the codebase involved, I deduced that performance suffers from the high complexity (O(n**2)) of the function _find_localzeros, which is called while computing the output. The complexity is high because the function compares every pair of arguments.
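To see why this is quadratic, the pairwise scan can be modelled with a toy sketch (the function below is my own illustration, not SymPy's actual _find_localzeros):

```python
# Toy model of the quadratic behaviour: every pair of arguments is
# compared once, so n arguments cost n*(n-1)/2 comparisons.
def pairwise_max(values):
    comparisons = 0
    survivors = set(values)          # assumes distinct values
    for i, a in enumerate(values):
        for b in values[i + 1:]:
            comparisons += 1
            survivors.discard(min(a, b))   # drop the loser of each pair
    return survivors.pop(), comparisons

print(pairwise_max(list(range(50))))   # (49, 1225)
```

For 50 distinct values this performs 50*49/2 = 1225 pairwise comparisons, which is the quadratic growth described above.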
Input: finding the maximum value among 50 numbers.
The results I got after profiling are:
```
3553 function calls in 0.006 seconds

Ordered by: standard name

ncalls tottime percall cumtime percall filename:lineno(function)
642 0.000 0.000 0.001 0.000 <frozen importlib._bootstrap>:1009(_handle_fromlist)
1 0.000 0.000 0.006 0.006 <ipython-input-58-f01740fe9027>:1(new1)
50 0.000 0.000 0.003 0.000 <ipython-input-58-f01740fe9027>:2(<genexpr>)
642 0.001 0.000 0.002 0.000 <ipython-input-58-f01740fe9027>:33(_is_connected)
1 0.001 0.001 0.006 0.006 <ipython-input-58-f01740fe9027>:6(_find_localzeros)
1 0.000 0.000 0.006 0.006 <string>:1(<module>)
1 0.000 0.000 0.000 0.000 basic.py:121(__hash__)
1 0.000 0.000 0.000 0.000 basic.py:96(__new__)
1 0.000 0.000 0.003 0.003 libmpf.py:410(from_float)
1 0.000 0.000 0.000 0.000 libmpf.py:64(dps_to_prec)
1 0.000 0.000 0.000 0.000 numbers.py:1109(_hashable_content)
1 0.000 0.000 0.000 0.000 numbers.py:1367(__hash__)
48 0.000 0.000 0.000 0.000 numbers.py:2008(__new__)
25 0.000 0.000 0.000 0.000 numbers.py:2205(__hash__)
1 0.000 0.000 0.000 0.000 numbers.py:739(__hash__)
1 0.000 0.000 0.003 0.003 numbers.py:954(__new__)
49 0.000 0.000 0.003 0.000 sympify.py:76(sympify)
1 0.000 0.000 0.000 0.000 {built-in method __new__ of type object at 0x10993d6a8}
1 0.000 0.000 0.006 0.006 {built-in method builtins.exec}
642 0.001 0.000 0.001 0.000 {built-in method builtins.hasattr}
26 0.000 0.000 0.000 0.000 {built-in method builtins.hash}
1330 0.000 0.000 0.000 0.000 {built-in method builtins.id}
55 0.000 0.000 0.000 0.000 {built-in method builtins.isinstance}
1 0.000 0.000 0.000 0.000 {built-in method builtins.max}
1 0.000 0.000 0.000 0.000 {built-in method builtins.round}
1 0.003 0.003 0.003 0.003 {built-in method gmpy2._mpmath_create}
1 0.000 0.000 0.000 0.000 {built-in method math.frexp}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
26 0.000 0.000 0.000 0.000 {method 'update' of 'set' objects}
```
Analyzing the results, I found that _is_connected is called a large number of times (642 here, for the pairwise comparisons), and this number increases manifold as the number of values passed as arguments grows.
Breaking the _find_localzeros function into subparts, I deduced that the number of comparisons can be reduced if the arguments are sorted first. This approach works only for numerical values, not symbolic ones. Since Python's built-in sort runs in O(n*log n), I used it to improve performance, which eliminates the _is_connected calls for numerical values entirely.
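The idea can be sketched as follows (a simplified illustration of the approach, not the actual patch; plain strings stand in for symbolic arguments):

```python
from numbers import Number

def reduce_max_args(args):
    # Split the arguments into numeric and symbolic parts, reduce the
    # numeric part with a single O(n log n) sort, and leave only the
    # survivors for the expensive pairwise comparison step.
    numeric = [a for a in args if isinstance(a, Number)]
    symbolic = [a for a in args if not isinstance(a, Number)]
    survivors = list(symbolic)
    if numeric:
        survivors.append(sorted(numeric)[-1])   # only the largest number survives
    # the pairwise _is_connected-style comparison would now run over
    # `survivors`, which contains at most one numeric value
    return survivors

print(reduce_max_args([3, 'x', 7, 'y', 5]))   # ['x', 'y', 7]
```

With purely numeric input the pairwise step has nothing left to do, which is why the _is_connected calls disappear in the second profile.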
I profiled again using the same input as in the above case and got the following result.
```
2680 function calls in 0.005 seconds

Ordered by: standard name

ncalls tottime percall cumtime percall filename:lineno(function)
20 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:1009(_handle_fromlist)
50 0.000 0.000 0.000 0.000 <ipython-input-59-9afdc0d4cbe0>:16(<genexpr>)
1 0.000 0.000 0.005 0.005 <ipython-input-59-9afdc0d4cbe0>:3(new1)
50 0.000 0.000 0.000 0.000 <ipython-input-59-9afdc0d4cbe0>:4(<genexpr>)
1 0.000 0.000 0.005 0.005 <ipython-input-59-9afdc0d4cbe0>:7(_find_localzeros)
1 0.000 0.000 0.005 0.005 <string>:1(<module>)
17 0.000 0.000 0.000 0.000 basic.py:121(__hash__)
4 0.000 0.000 0.002 0.000 basic.py:583(is_comparable)
4 0.000 0.000 0.000 0.000 basic.py:617(<listcomp>)
4 0.000 0.000 0.000 0.000 basic.py:630(func)
5 0.000 0.000 0.000 0.000 basic.py:96(__new__)
94 0.000 0.000 0.000 0.000 boolalg.py:313(__nonzero__)
114 0.000 0.000 0.000 0.000 boolalg.py:376(__nonzero__)
208 0.000 0.000 0.000 0.000 boolalg.py:406(<lambda>)
4 0.000 0.000 0.000 0.000 evalf.py:1265(<lambda>)
4 0.000 0.000 0.000 0.000 evalf.py:1303(evalf)
4 0.000 0.000 0.000 0.000 evalf.py:1363(evalf)
4 0.001 0.000 0.002 0.000 expr.py:1743(as_real_imag)
12 0.000 0.000 0.000 0.000 libmpf.py:330(from_int)
1 0.000 0.000 0.000 0.000 libmpf.py:410(from_float)
12 0.000 0.000 0.001 0.000 libmpf.py:549(mpf_cmp)
12 0.000 0.000 0.001 0.000 libmpf.py:601(mpf_lt)
5 0.000 0.000 0.000 0.000 libmpf.py:64(dps_to_prec)
7 0.001 0.000 0.001 0.000 libmpf.py:677(mpf_add)
7 0.000 0.000 0.001 0.000 libmpf.py:772(mpf_sub)
4 0.000 0.000 0.000 0.000 numbers.py:1088(_new)
1 0.000 0.000 0.000 0.000 numbers.py:1109(_hashable_content)
4 0.000 0.000 0.000 0.000 numbers.py:1124(_as_mpf_val)
4 0.000 0.000 0.003 0.001 numbers.py:1333(__lt__)
17 0.000 0.000 0.000 0.000 numbers.py:1367(__hash__)
4 0.000 0.000 0.000 0.000 numbers.py:1745(__eq__)
8 0.000 0.000 0.000 0.000 numbers.py:1802(__lt__)
8 0.000 0.000 0.000 0.000 numbers.py:2001(_as_mpf_val)
48 0.000 0.000 0.000 0.000 numbers.py:2008(__new__)
4 0.000 0.000 0.000 0.000 numbers.py:2159(__eq__)
3 0.000 0.000 0.000 0.000 numbers.py:2169(__gt__)
201 0.000 0.000 0.001 0.000 numbers.py:2178(__lt__)
257 0.001 0.000 0.001 0.000 numbers.py:2205(__hash__)
4 0.000 0.000 0.000 0.000 numbers.py:2546(__nonzero__)
17 0.000 0.000 0.000 0.000 numbers.py:739(__hash__)
8 0.000 0.000 0.000 0.000 numbers.py:80(mpf_norm)
1 0.000 0.000 0.000 0.000 numbers.py:954(__new__)
4 0.000 0.000 0.000 0.000 sympify.py:13(__init__)
428 0.000 0.000 0.001 0.000 sympify.py:375(_sympify)
526 0.001 0.000 0.001 0.000 sympify.py:76(sympify)
5 0.000 0.000 0.000 0.000 {built-in method __new__ of type object at 0x10993d6a8}
1 0.000 0.000 0.000 0.000 {built-in method builtins.all}
1 0.000 0.000 0.005 0.005 {built-in method builtins.exec}
44 0.000 0.000 0.000 0.000 {built-in method builtins.hasattr}
258 0.000 0.000 0.000 0.000 {built-in method builtins.hash}
87 0.000 0.000 0.000 0.000 {built-in method builtins.isinstance}
3 0.000 0.000 0.000 0.000 {built-in method builtins.len}
13 0.000 0.000 0.000 0.000 {built-in method builtins.max}
4 0.000 0.000 0.000 0.000 {built-in method builtins.min}
5 0.000 0.000 0.000 0.000 {built-in method builtins.round}
1 0.000 0.000 0.004 0.004 {built-in method builtins.sorted}
13 0.000 0.000 0.000 0.000 {built-in method gmpy2._mpmath_create}
15 0.000 0.000 0.000 0.000 {built-in method gmpy2._mpmath_normalize}
7 0.000 0.000 0.000 0.000 {built-in method gmpy2.bit_length}
8 0.000 0.000 0.000 0.000 {built-in method gmpy2.mpz}
1 0.000 0.000 0.000 0.000 {built-in method math.frexp}
1 0.000 0.000 0.000 0.000 {method 'add' of 'set' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
16 0.000 0.000 0.000 0.000 {method 'get' of 'dict' objects}
```
The profiling results show that in the substituted code not only has the number of function calls decreased substantially (from 3553 to 2680), but the time has also dropped from 0.006 to 0.005 seconds, and this is for only 50 values. The difference will be even greater as the number of arguments increases.
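The scaling claim can be checked with a pure-Python timing sketch (no SymPy involved; the helper names are my own):

```python
import random
import timeit

def pairwise_survivor(values):
    # Full pairwise pass, as in the quadratic approach: Theta(n**2).
    survivors = set(values)
    for i, a in enumerate(values):
        for b in values[i + 1:]:
            survivors.discard(min(a, b))   # loser of each pair is dropped
    return survivors.pop()

def sorted_survivor(values):
    # Sort once and take the last element: O(n log n).
    return sorted(values)[-1]

for n in (50, 200, 800):
    data = random.sample(range(10**6), n)   # distinct random values
    assert pairwise_survivor(data) == sorted_survivor(data)
    tq = timeit.timeit(lambda: pairwise_survivor(data), number=5)
    ts = timeit.timeit(lambda: sorted_survivor(data), number=5)
    print(f"n={n:4d}: pairwise/sorted time ratio = {tq / ts:.0f}x")
```

The ratio between the two timings grows as n grows, which is exactly the behaviour the profiles above hint at.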
### BENCHMARKING

I benchmarked the current and the optimized code for an input of 50 numbers. The results are plotted in the graph linked below (x-axis: array size; y-axis: time in seconds):
https://drive.google.com/file/d/195SsV2QK1-T7pcR7woLHbywAphkF2dpZ/view?usp=sharing
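Such benchmarks can also be expressed in the airspeed-velocity (asv) style used by the sympy_benchmarks repository; the class below is my own sketch (class, method, and parameter names are assumptions), relying on asv's convention of discovering `time_*` methods and timing them once per value in `params`:

```python
class TimeMaxManyArgs:
    # asv runs time_* methods once per entry in `params`
    params = [10, 50, 200]          # number of arguments passed to Max
    param_names = ['n']

    def setup(self, n):
        # deferred import so the timed method measures only the Max call
        from sympy import Max, Integer
        self.Max = Max
        self.args = [Integer(i) for i in range(n)]

    def time_max_many_args(self, n):
        self.Max(*self.args)
```

Running `asv run` against the current and the patched commits would then produce comparison graphs like the one linked above.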
More or less, the project can be broken into three stages:
STAGE 1
Figuring out the main bottlenecks in SymPy: the best way to find these bottlenecks is to profile the part of the codebase involved in a particular issue and then analyze the results.
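A minimal recipe for this stage, using Python's cProfile and pstats on a stand-in workload (the function here is a placeholder for whatever SymPy call an issue points at):

```python
import cProfile
import io
import pstats

def workload():
    # stand-in for the SymPy computation under investigation
    return sum(i * i for i in range(10000))

profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# sort by cumulative time and show the ten hottest entries
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats('cumulative').print_stats(10)
print(out.getvalue())
```

Sorting by cumulative time surfaces the functions responsible for most of the runtime, which is how _is_connected showed up in the case study above.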
STAGE 2
Designing a substitute: a better, optimized approach has to be found, and its validity can be tested by profiling it and comparing the output with the profile of the current code, as I did in the case study above.
STAGE 3
Improving performance: if an appropriate substitute is found, making the required changes and running the Travis CI build.
I think the most important part of this project will be correctly identifying the patches or files in the codebase that slow computation down and then adding benchmarks for them, since this pinpoints the code that has to be substituted to improve performance.
I would like to break the process into smaller parts, which will give me better insight into the problems. Going through the issues tagged "Performance" in the SymPy repository, I found the following intriguing; I would like to benchmark and profile them and then find substitutes to improve performance, following the steps of the case study above:
- solve is very slow for linear equation system
- Sum of many variable is slow
- Printing expressions in exceptions can be expensive
- Matrix multiplication is slow
My semester will end on 1st May; hence I am not including the time before 1st May 2019 in the timeline. Listed below is the tentative timeline I would follow:
Pre Selection Period + Community Bonding Period
May 2 - May 27
The goal in this period would be:
- Thorough analysis of the present benchmarking repository of SymPy (https://github.com/sympy/sympy_benchmarks) and digging out the regressions that could be investigated further. I will add any issues worth investigating to the list I prepared above.
Since my work will always depend on the results of profiling and benchmarking and on the bottlenecks I find, the timeline below is tentative and may change according to those results.
May 27 - June 10 (Week 1, 2)
Profiling and benchmarking half of the issues in the list above, then analyzing the results to identify the bottlenecks. (All the steps described here would be carried out the way I described in the case study above.)
June 10 - June 24 (Week 3, 4) Adding new benchmarks to the SymPy benchmarks repository as needed, based on the results of the previous weeks, and starting to substitute the bottlenecks with better algorithms (where possible).
Phase 1 Evaluation
June 24 - July 8 (Week 5, 6) Profiling and benchmarking the other half of the issues in the list above and analyzing the results to identify the bottlenecks.
July 8 - July 22 (Week 7, 8) Adding new benchmarks as needed, based on the results of the previous weeks, and substituting the bottlenecks with better algorithms (where possible).
Phase 2 Evaluation
July 22 - August 19 (Week 9, 10, 11, 12) Having profiled and added the required benchmarks for as many regressions and issues as possible in the previous weeks, my main aim in this period would be substituting the bottlenecks.
August 19 - August 26 (Week 13) Completion of any backlog before marking the completion of the project.
Final Submission
I have no prior plans or commitments during this summer. My end-semester exams will end on 1st May, and the new semester starts around 15th July (but I can manage academics and GSoC work in parallel, and will compensate in advance during the community bonding period). Hence I can give about 50 hours per week throughout the whole program.
The following is a list of the merged/open pull requests I have created (in chronological order):
#15744 Mod function: made changes to give simplified output when the dividend is a multiple of the divisor. (Closed)
#15842 Printing: added the root_notation flag in latex.py and pretty.py, and updated the docstrings and test files accordingly. (Merged)
#15901 Printing: added the root_notation flag in mathml.py. (Merged)
#16509 Performance: made changes to improve computation time when the number of arguments to Min/Max is large. (Open)
REFERENCES
- https://stackoverflow.com/questions/5232052/what-are-the-different-code-benchmarking-techniques
- https://www.quora.com/What-is-the-time-complexity-of-the-Python-built-in-sorted-function