
Simplify the codebase by avoiding metaprogramming #696

Open
oscarbenjamin opened this issue May 8, 2023 · 9 comments
Labels: enhancement (new feature requests or implementation), need decision
Milestone: 1.4

@oscarbenjamin

Following on from #683 (comment) concerning the speed of creating a context object; see also the past discussion in gh-178.

The mpmath codebase uses a number of metaprogramming features that increase complexity for what I suspect is relatively little benefit (mostly DRY and micro-optimisation). These make the codebase hard to follow and also defeat static code analysis, which is increasingly integrated into modern Python editors. I propose to remove these and move towards ordinary Python code with plain functions, methods and attributes.

Specifically I suggest:

  • Eliminate dynamic class generation. Make e.g. mpf a simple type that just has _mpf_ and ctx attributes.
  • Write methods in ordinary Python code rather than generating them dynamically either with exec or by attaching them from decorators.
  • Use more traditional methods for reducing boilerplate, such as reusable functions.

One example of this is dynamically created classes like:

class PythonMPContext:

    def __init__(ctx):
        ctx._prec_rounding = [53, round_nearest]
        ctx.mpf = type('mpf', (_mpf,), {})
        ctx.mpc = type('mpc', (_mpc,), {})
        ctx.mpf._ctxdata = [ctx.mpf, new, ctx._prec_rounding]
        ctx.mpc._ctxdata = [ctx.mpc, new, ctx._prec_rounding]
        ctx.mpf.context = ctx
        ctx.mpc.context = ctx
        ctx.constant = type('constant', (_constant,), {})
        ctx.constant._ctxdata = [ctx.mpf, new, ctx._prec_rounding]
        ctx.constant.context = ctx

There are also functions and methods created from code strings at runtime:
def binary_op(name, with_mpf='', with_int='', with_mpc=''):
    code = mpf_binary_op
    code = code.replace("%WITH_INT%", with_int)
    code = code.replace("%WITH_MPC%", with_mpc)
    code = code.replace("%WITH_MPF%", with_mpf)
    code = code.replace("%NAME%", name)
    np = {}
    exec(code, globals(), np)
    return np[name]

_mpf.__eq__ = binary_op('__eq__',
    'return mpf_eq(sval, tval)',
    'return mpf_eq(sval, from_int(other))',
    'return (tval[1] == fzero) and mpf_eq(tval[0], sval)')
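
The proposal is to write such methods out as ordinary Python code. A rough sketch (simplified, not a drop-in replacement) of the generated __eq__ above as a directly written method, using the same mpf_eq, from_int and fzero helpers from mpmath.libmp:

def __eq__(self, other):
    # Sketch only: the same three cases as the template above, written out
    # directly so that editors and static analysis can see the method.
    sval = self._mpf_
    if hasattr(other, '_mpf_'):
        return mpf_eq(sval, other._mpf_)
    if isinstance(other, int):
        return mpf_eq(sval, from_int(other))
    if hasattr(other, '_mpc_'):
        tval = other._mpc_
        return tval[1] == fzero and mpf_eq(tval[0], sval)
    return NotImplemented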

We also have methods that are dynamically added to classes:
def __init__(self):
    cls = self.__class__
    for name in cls.defined_functions:
        f, wrap = cls.defined_functions[name]
        cls._wrap_specfun(name, f, wrap)
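
With plain methods the equivalent could simply be written in the class body. A minimal sketch (illustrative only; the real wrappers also handle complex arguments, precision bumps and exceptions), reusing the existing convert/make_mpf/_prec_rounding context helpers and mpf_cos from mpmath.libmp:

def cos(ctx, x):
    # Sketch only: cos written as an ordinary context method instead of
    # being attached via _wrap_specfun when the context is created.
    prec, rounding = ctx._prec_rounding
    x = ctx.convert(x)
    return ctx.make_mpf(mpf_cos(x._mpf_, prec, rounding))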

Some of these metaprogramming steps happen at import time and others when a context object is created, and they add measurable overhead to the cost of creating a context:

In [9]: from mpmath import MPContext, mp

In [10]: %timeit MPContext()
538 µs ± 16.6 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [11]: %timeit mp.clone()
561 µs ± 27.3 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [12]: %timeit mp.cos(1)
10.9 µs ± 253 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [18]: %prun -s cumulative [mp.clone() for _ in range(1000)]
         863004 function calls in 1.736 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    1.736    1.736 {built-in method builtins.exec}
        1    0.000    0.000    1.736    1.736 <string>:1(<module>)
        1    0.002    0.002    1.736    1.736 <string>:1(<listcomp>)
     1000    0.005    0.000    1.734    0.002 ctx_mp.py:297(clone)
     1000    0.022    0.000    1.718    0.002 ctx_mp.py:63(__init__)
     1000    0.007    0.000    0.831    0.001 ctx_base.py:42(__init__)
     1000    0.364    0.000    0.783    0.001 functions.py:18(__init__)
     1000    0.355    0.000    0.775    0.001 ctx_mp.py:96(init_builtins)
   213000    0.270    0.000    0.378    0.000 ctx_mp_python.py:1014(_wrap_specfun)
    43000    0.356    0.000    0.372    0.000 ctx_mp_python.py:979(_wrap_libmp_function)
   220000    0.080    0.000    0.080    0.000 {built-in method builtins.setattr}
     1000    0.073    0.000    0.073    0.000 ctx_mp_python.py:585(__init__)
   256000    0.048    0.000    0.048    0.000 {method 'get' of 'dict' objects}
    17000    0.020    0.000    0.038    0.000 rational.py:31(__new__)
    14000    0.017    0.000    0.037    0.000 ctx_mp_python.py:336(__new__)
     1000    0.023    0.000    0.023    0.000 matrices.py:749(__init__)
    26000    0.020    0.000    0.020    0.000 {built-in method builtins.getattr}
    17000    0.016    0.000    0.018    0.000 rational.py:7(create_reduced)
     1000    0.008    0.000    0.015    0.000 ctx_base.py:52(_init_aliases)
    24000    0.011    0.000    0.011    0.000 {built-in method __new__ of type object at 0x7f2bc4270a40}
     1000    0.005    0.000    0.010    0.000 ctx_mp_python.py:612(_set_prec)
     1000    0.008    0.000    0.010    0.000 inverselaplace.py:668(__init__)
     1000    0.004    0.000    0.006    0.000 quadrature.py:461(__init__)
     5000    0.003    0.000    0.006    0.000 ctx_mp_python.py:597(make_mpf)
     1000    0.003    0.000    0.004    0.000 libmpf.py:59(prec_to_dps)
     1000    0.002    0.000    0.002    0.000 ctx_base.py:458(memoize)
     1000    0.002    0.000    0.002    0.000 rszeta.py:54(__init__)
     2000    0.002    0.000    0.002    0.000 quadrature.py:21(__init__)
     2000    0.002    0.000    0.002    0.000 {built-in method builtins.max}
     1000    0.001    0.000    0.001    0.000 ctx_mp_python.py:602(make_mpc)
     1000    0.001    0.000    0.001    0.000 {built-in method builtins.round}
     4000    0.001    0.000    0.001    0.000 inverselaplace.py:17(__init__)
     1000    0.001    0.000    0.001    0.000 ctx_mp_python.py:620(<lambda>)
     1000    0.001    0.000    0.001    0.000 ctx_mp_python.py:607(default)
     1000    0.001    0.000    0.001    0.000 {method 'update' of 'dict' objects}
     1000    0.000    0.000    0.000    0.000 {method 'items' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Ideally the cost of creating a new context object would be small enough that it could be worthwhile to create one just for a single function call. The context object only has a handful of genuine attributes, so it should not be this expensive to create:

In [29]: class Context:
    ...:     __slots__ = ('prec',)
    ...:     def __init__(self, prec=53):
    ...:         self.prec = prec
    ...: 

In [30]: %timeit Context()
235 ns ± 25.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

I think that it would be better to rewrite these things using ordinary Python code, with a primary emphasis on keeping the code simple and easy to understand and maintain. Future versions of CPython will also boost the performance of much Python code, but mpmath will potentially not benefit as much from those optimisations because they generally target simple, straightforward code.

With a simpler codebase it would also be easier to aim for bigger performance improvements by making it easier to swap different backends in and out and to leverage e.g. python_flint if available. The small performance benefits achieved by this metaprogramming could be made a lot greater just by having an optional C/Cython backend, at least for basic types like mpf, but any such backend in a non-dynamically-typed language would not be able to replicate the metaprogramming. The python_flint module could house a faster version of mpf and MPContext if it were to be used as an optional backend for mpmath.

@skirpichev
Collaborator

That seems reasonable. However,

Make e.g. mpf a simple type that just has _mpf_ and ctx attributes.

Are you suggesting the ctx attribute as an instance property?

I think we could use contextvars.ContextVar to store the current context and provide an API to get/set it. Then we could make ctx.mpf a simple type, and mpf.context would just query the current context. I believe this could be a relatively simple change.
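
Something along these lines, as a minimal sketch (made-up names, not a proposed API):

import contextvars

# Sketch only: one module-level ContextVar holds the current context;
# mpf.context queries it instead of relying on a class attribute of a
# dynamically generated per-context type.
_current_context = contextvars.ContextVar('mpmath_context')

def get_context():
    return _current_context.get()

def set_context(ctx):
    return _current_context.set(ctx)

class mpf:
    __slots__ = ('_mpf_',)

    def __init__(self, value):
        self._mpf_ = value

    @property
    def context(self):
        return get_context()

A real implementation would install a default context at import time; note that contextvars also makes the current context per-thread/per-task, which may or may not be the semantics we want.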

(Another issue with _mpf is that we have libmpf.py, which implements floating-point arithmetic in a functional style. I'm not sure whether this layer simplifies the codebase or adds a significant performance benefit.)

Write methods in ordinary Python code rather than generating them dynamically either with exec or by attaching them from decorators.

I doubt that the second one is actually a problem. Why do you think so?

There are also functions and methods created from code strings at runtime

See #702.
The remaining cases are make_hyp_summator() and identify().

@oscarbenjamin
Author

Are you suggesting the ctx attribute as an instance property?

Effectively it already is an instance attribute, but it is implemented by having a dynamically created class with a class attribute. I'm suggesting not having the dynamically generated class and instead just having ctx as an instance attribute.

Currently it is like this:

class _mpf:
    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return "mpf(%s) (%s)" % (self.value, self.ctx)

class Context:
    def __init__(self, name):
        self.name = name
        self.mpf_type = type("mpf", (_mpf,), {})
        self.mpf_type.ctx = self

    def mpf(self, value):
        return self.mpf_type(value)

    def __repr__(self):
        return "Context(%s)" % self.name

ctx1 = Context("default")
ctx2 = Context("other")

a = ctx1.mpf(1)
b = ctx2.mpf(2)
print("a =", a)
print("b =", b)

I would instead write it like:

class mpf:
    def __init__(self, value, ctx):
        self.value = value
        self.ctx = ctx

    def __repr__(self):
        return "mpf(%s) (%s)" % (self.value, self.ctx)

class Context:
    def __init__(self, name):
        self.name = name

    def mpf(self, value):
        return mpf(value, self)

    def __repr__(self):
        return "Context(%s)" % self.name

ctx1 = Context("default")
ctx2 = Context("other")

a = ctx1.mpf(1)
b = ctx2.mpf(2)
print("a =", a)
print("b =", b)

@oscarbenjamin
Author

Write methods in ordinary Python code rather than generating them dynamically either with exec or by attaching them from decorators.

I doubt that the second one is actually a problem. Why do you think so?

One problem with the decorator approach as currently implemented is that it is slow when creating a context object (see the profile above).

Another problem is simply that it makes the codebase complicated to navigate and understand. Many different files and functions need to be inspected just to understand exactly how one function calls another. Usually I would follow things like that by jumping from one function to another in my editor using "go to definition" (gd in vim), but that fails, as does all static analysis such as type checking and code completion, when the methods are assigned dynamically.

@asmeurer
Contributor

I agree with this. I often find it hard to figure out where stuff is because of this. Especially if the only point of the metaprogramming is to avoid some code duplication, it would be better to either use standard methods of avoiding duplication (like factoring things into functions or subclasses), or to just duplicate the code.

@skirpichev
Collaborator

Another example is the multiple pickling issues with the current codebase.

Effectively it already is an instance attribute, but it is implemented by having a dynamically created class with a class attribute.

On the other hand, this is a huge change in semantics: right now we have a different mpf type per context. You suggest instead something like gmpy2.mpfr.

I did some experiments in the https://github.com/skirpichev/mpmath/tree/simple-mpf branch. Not sure it's worth a draft PR; let me know. But I'm surprised there are only ~20 test failures, a decent part of them being broken isinstance(foo, mp.mpf) checks. All the changes are in ctx_mp_python.py so far; the rest are things like the type(m)(-1) -> -ctx.one change in elliptic.py, which could be accepted in the master branch.

@oscarbenjamin
Author

On the other hand, this is a huge change in semantics: right now we have a different mpf type per context.

Is that a huge change in semantics? Do you think that code outside of mpmath really relies on these dynamic types?

Probably something in SymPy would break but that is just because SymPy reaches into the internals of mpmath too much.

I did some experiments in the https://github.com/skirpichev/mpmath/tree/simple-mpf branch. Not sure it's worth a draft PR; let me know.

The commit looks good to me.

But I'm surprised there are only ~20 test failures, a decent part of them being broken isinstance(foo, mp.mpf) checks.

I'm not so surprised because the dynamically created classes do not really achieve anything in terms of behaviour.

@skirpichev
Collaborator

Do you think that code outside of mpmath really relies on these dynamic types?

Maybe. Don't get me wrong, I like this idea. In the long term, with this, we could probably even use (optionally) gmpy's mpfr/mpc instead of mpmath's mpf/mpc.

But this will not be a backward-compatible change.

Probably something in SymPy would break but that is just because SymPy reaches into the internals of mpmath too much.

Let's test that: diofant/diofant#1356 (I think the situation will be more or less the same for SymPy).

UPD: and yes, it does break.

The commit looks good to me.

WIP: #719

I'm not so surprised because the dynamically created classes do not really achieve anything in terms of behaviour.

There are other issues (e.g. pickling), which aren't simple to fix.

@skirpichev
Collaborator

Ok, now there are only pickling-related failures.

@skirpichev
Collaborator

Now pickling works.

@skirpichev added this to the 1.4 milestone on Apr 1, 2024