
Wrong result for digamma integral representation #26523

Open
rogerbalsach opened this issue Apr 21, 2024 · 16 comments · May be fixed by #26563

Comments

@rogerbalsach

I was trying to evaluate the following integral using sympy 1.12 (see complete list of packages below)

package list
# packages in environment at /home/user/anaconda3/envs/sympy:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_openmp_mutex             5.1                       1_gnu  
bzip2                     1.0.8                h5eee18b_5  
ca-certificates           2024.3.11            h06a4308_0  
expat                     2.6.2                h6a678d5_0  
ld_impl_linux-64          2.38                 h1181459_1  
libffi                    3.4.4                h6a678d5_0  
libgcc-ng                 11.2.0               h1234567_1  
libgomp                   11.2.0               h1234567_1  
libstdcxx-ng              11.2.0               h1234567_1  
libuuid                   1.41.5               h5eee18b_0  
mpmath                    1.3.0           py312h06a4308_0  
ncurses                   6.4                  h6a678d5_0  
openssl                   3.0.13               h7f8727e_0  
pip                       23.3.1          py312h06a4308_0  
python                    3.12.3               h996f2a0_0  
readline                  8.2                  h5eee18b_0  
setuptools                68.2.2          py312h06a4308_0  
sqlite                    3.41.2               h5eee18b_0  
sympy                     1.12            py312h06a4308_0  
tk                        8.6.12               h1ccaba5_0  
tzdata                    2024a                h04d1e81_0  
wheel                     0.41.2          py312h06a4308_0  
xz                        5.4.6                h5eee18b_0  
zlib                      1.2.13               h5eee18b_0
import sympy as sym
t, z = sym.symbols('t z')
sym.integrate((1 - t**z) / (1 - t), (t, 0, 1))  # Should evaluate to digamma(z+1) + EulerGamma

The integration returns oo + I*pi. The integral is a standard representation of the digamma function.
sympy is able to do the integral correctly for the concrete values of z I have tried, but not for a general symbolic z.
The integral also gives the incorrect result even if I add the constraint that z is positive.

Also, sympy doesn't seem to be able to recognise the integral representations of the polygamma functions, though in that case it simply returns the unevaluated integral.
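Before digging into sympy's behaviour, here is a quick numerical sanity check of the identity itself (my addition, not part of the original report; it uses mpmath rather than sympy's integrate):

```python
# Numerically verify that ∫₀¹ (1 - t**z)/(1 - t) dt = digamma(z + 1) + EulerGamma
# for a sample value of z. The singularity at t = 1 is removable.
from mpmath import mp, quad, digamma, euler, mpf

mp.dps = 30
zval = mpf('2.5')  # arbitrary sample value
integral = quad(lambda t: (1 - t**zval) / (1 - t), [0, 1])
expected = digamma(zval + 1) + euler
print(integral, expected)  # the two values agree to high precision
```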

@oscarbenjamin
Contributor

Looks like this comes from meijerg:

In [11]: integrate((1 - t**z) / (1 - t), (t, 0, 1), meijerg=False)
Out[11]: Integral((t**z - 1)/(t - 1), (t, 0, 1))

In [12]: integrate((1 - t**z) / (1 - t), (t, 0, 1), meijerg=True)
Out[12]: -z*lerchphi(exp_polar(2*I*pi), 1, z + 1)*gamma(z + 1)/gamma(z + 2) - lerchphi(exp_polar(2*I*pi), 1, z + 1)*gamma(z + 1)/gamma(z + 2) + oo

In [13]: integrate((1 - t**z) / (1 - t), (t, 0, 1))
Out[13]: oo + I*pi

@oscarbenjamin
Contributor

The result from meijerg=True looks like it would be correct if the oo was EulerGamma:

In [58]: integrate((1 - t**z) / (1 - t), (t, 0, 1), meijerg=True).simplify()
Out[58]: -lerchphi(exp_polar(2*I*pi), 1, z + 1) + oo

The exp_polar there, if replaced by 1, would give -lerchphi(1, 1, z + 1), which is maybe equivalent to digamma(z + 1) (i.e. polygamma(0, z + 1)).

@rogerbalsach
Author

@oscarbenjamin, I am not familiar with the Lerch transcendent, but evaluating the indefinite integral I get the following:

import sympy as sym

t = sym.Symbol('t', nonnegative=True)
z = sym.Symbol('z')
sym.Integral((1 - t**z) / (1 - t), t).doit().simplify()

which results in -t**(z + 1)*lerchphi(t, 1, z + 1) - log(t - 1).
This seems to agree with WolframAlpha.
So, it looks like whatever the problem is, it must be in the evaluation of the limits.
This seems plausible given that at t=1 the log term diverges (log(1 - 1) = log(0)), while at t=0 it evaluates to log(0 - 1) = I*pi.
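A direct numerical check supports this (my addition; mpmath also implements lerchphi, so the primitive can be evaluated just below t = 1):

```python
# Evaluate F(t) = -t**(z+1)*lerchphi(t, 1, z+1) - log(t - 1) near t = 1 and
# compare with digamma(z + 1) + EulerGamma - I*pi (sample z = 1/2).
from mpmath import mp, mpf, mpc, lerchphi, log, digamma, euler, pi

mp.dps = 25
zval = mpf('0.5')
t = 1 - mpf('1e-9')
F = -t**(zval + 1) * lerchphi(t, 1, zval + 1) - log(t - 1)  # log of a negative real: +I*pi branch
expected = digamma(zval + 1) + euler - mpc(0, 1) * pi
print(F, expected)  # agree up to terms of order (1 - t)*log(1 - t)
```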

@oscarbenjamin
Contributor

which results in -t**(z + 1)*lerchphi(t, 1, z + 1) - log(t - 1).

That's coming from meijerg's indefinite integration:

In [8]: integrate((1 - t**z) / (1 - t), t, meijerg=True).simplify()
Out[8]: Piecewise((-(t**(z + 1)*lerchphi(..., 1, z + 1) + log(t - 1)), t > 1),
                  (-(t**(z + 1)*lerchphi(..., 1, z + 1) + log(1 - t)), True))

In [9]: integrate((1 - t**z) / (1 - t), t).simplify()
Out[9]: -t**(z + 1)*lerchphi(t, 1, z + 1) - log(t - 1)

So, it looks like whatever the problem is, it must be in the evaluation of the limits.

Maybe. Note that meijerg's definite integration is not really handled by computing the indefinite integral in the form shown. I think it actually computes the definite integral using G-functions and then tries to express the result in ordinary-looking functions at the end.

@rogerbalsach
Author

Ok. First things first: this is the first time I have looked into the sympy integration code, so I might be missing a lot of things (for example, I don't really know what the meijerg method is).
I've been playing a bit with the debugger for the following code

import sympy as sym

t = sym.Symbol('t', nonnegative=True)
z = sym.Symbol('z')
f = (1 - t**z) / (1 - t)
I = sym.Integral((1 - t**z) / (1 - t), (t, 0, 1))

I.doit()

and it seems that it is doing the following:
First, it computes the indefinite integral

antideriv = self._eval_integral(
    function, xab[0], **eval_kwargs)

which evaluates to -t**(z + 1)*z*lerchphi(t, 1, z + 1)*gamma(z + 1)/gamma(z + 2) - t**(z + 1)*lerchphi(t, 1, z + 1)*gamma(z + 1)/gamma(z + 2) - log(t - 1) , which is the same expression I had before.
The first thing I think is relevant is that I don't get any piecewise function, and as @oscarbenjamin mentioned

In [8]: integrate((1 - t**z) / (1 - t), t, meijerg=True).simplify()
Out[8]: Piecewise((-(t**(z + 1)*lerchphi(..., 1, z + 1) + log(t - 1)), t > 1),
                  (-(t**(z + 1)*lerchphi(..., 1, z + 1) + log(1 - t)), True))

In [9]: integrate((1 - t**z) / (1 - t), t).simplify()
Out[9]: -t**(z + 1)*lerchphi(t, 1, z + 1) - log(t - 1)

but I don't think this is a problem since log(t - 1) = log(1 - t) + I*pi, thus it should not matter for the definite integral.
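The branch identity used here can be confirmed directly in sympy (a quick check I added):

```python
# For 0 < t < 1, t - 1 is a negative real, so on the principal branch
# log(t - 1) = log(1 - t) + I*pi.
import sympy as sym

tval = sym.Rational(1, 3)                 # arbitrary value in (0, 1)
lhs = sym.log(tval - 1)                   # log of a negative rational auto-evaluates
rhs = sym.log(1 - tval) + sym.I * sym.pi
print(sym.simplify(lhs - rhs))  # 0
```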

After that, it indeed tries to evaluate the primitive in the two limits in line

evalued = Add(*others)._eval_interval(x, a, b)

Then, for t = 0, the primitive evaluates to I*pi (sometimes I also get terms of the type 0**z). Or, if we use log(1 - t), it evaluates to 0.

The problem then comes at t = 1, which is evaluated in

C = limit(self, x, c, "+" if left else "-")

This limit evaluates to infinity, which is the wrong part.
Indeed, after simplifying the expression, as I mentioned before, the primitive can be written as
$$-t^{1 + z} \Phi(t, 1, 1 + z) - \log(1 - t)$$

The limit as $t \to 1^-$ is indeed the expected result $\psi(z+1) + \gamma_E$. This can be seen from the series expansion of the Lerch transcendent around unit argument (for $s = 1$):
$$z^a \Phi(z, 1, a) = \sum_{k=1}^\infty \zeta(1-k, a) \frac{\log^k(z)}{k!} + \psi(1) - \psi(a) - \log(-\log(z))$$
As the argument tends to $1^-$ the sum vanishes and, since $\psi(1) = -\gamma_E$, the right-hand side tends to $-\gamma_E - \psi(a) - \log(-\log(z))$. Applying this with argument $t$ and $a = 1 + z$, the $\log(-\log(t))$ term coming from $-t^{1+z}\Phi$ cancels against the $-\log(1-t)$ of the primitive, leaving $\gamma_E + \psi(1+z)$.
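That expansion can be spot-checked numerically (my addition; the value a = 1.7 is arbitrary):

```python
# Near unit argument (with s = 1):
#   z**a * lerchphi(z, 1, a) ~ -EulerGamma - digamma(a) - log(-log(z)),
# using psi(1) = -EulerGamma; the zeta-sum terms vanish as log(z) -> 0.
from mpmath import mp, mpf, lerchphi, log, digamma, euler

mp.dps = 25
a = mpf('1.7')
zz = 1 - mpf('1e-6')
lhs = zz**a * lerchphi(zz, 1, a)
rhs = -euler - digamma(a) - log(-log(zz))
print(lhs - rhs)  # tends to 0 as zz -> 1; error is O((1 - zz)*log(1 - zz))
```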

Thus, in conclusion, it looks like the limit is the reason for the wrong value.
As I said at the beginning, I don't know much about how sympy handles integration, and I have not passed any fancy parameters like meijerg, so maybe in that case the integral is handled differently, but from your response, it seems to fail in a similar way.

@oscarbenjamin
Contributor

This limit evaluates to infinity, which is the wrong part.

What exactly is the expression whose limit is being computed? Can you reproduce the problem just using limit rather than integrate?

@rogerbalsach
Author

The expression is the primitive function: -t**(z + 1)*z*lerchphi(t, 1, z + 1)*gamma(z + 1)/gamma(z + 2) - t**(z + 1)*lerchphi(t, 1, z + 1)*gamma(z + 1)/gamma(z + 2) - log(t - 1)

Indeed this can be reproduced with the following code:

import sympy as sym

t = sym.Symbol('t', positive=True)
z = sym.Symbol('z')

F = (-t * t**z * z * sym.lerchphi(t, 1, z + 1) * sym.gamma(z + 1) / sym.gamma(z + 2)
     - t * t**z * sym.lerchphi(t, 1, z + 1) * sym.gamma(z + 1) / sym.gamma(z + 2)
     - sym.log(t - 1))
# An equivalent form
F = -t**(z+1)*sym.lerchphi(t, 1, z+1) - sym.log(t - 1)

L = sym.Limit(F, t, 1, '-')
L.doit()

This returns oo, but as I justified above, it should be digamma(z+1) + EulerGamma (- I*pi).
The limit is also wrong in WolframAlpha.

@oscarbenjamin
Contributor

So it looks like the bug is in limit...

In [13]: L.args[0].subs(z, S(1)/2).subs(t, 0.9999).n()
Out[13]: 0.61365563825509 - 3.14159265358979*I

In [14]: L.args[0].subs(z, S(1)/2).limit(t, 1, '-')
Out[14]: oo

@rogerbalsach
Author

rogerbalsach commented Apr 23, 2024

Ok, I have looked into the limit implementation, and after going through the rabbit hole, I found the origin of the problem.

import sympy as sym

a = sym.Symbol('a')
e = sym.Symbol('varepsilon', positive=True)

f = sym.lerchphi(1 - e, 1, a)
F = - (1 - e)**a * f - sym.log(e)

f._eval_as_leading_term(e, logx=None, cdir=1)  # Out: lerchphi(1, 1, a)
F._eval_as_leading_term(e, logx=None, cdir=1)  # Out: -lerchphi(1, 1, a) - log(varepsilon)

The Lerch transcendent has no _eval_as_leading_term implementation, and thus it calls the Function._eval_as_leading_term

def _eval_as_leading_term(self, x, logx=None, cdir=0):
    """Stub that should be overridden by new Functions to return
    the first non-zero term in a series if ever an x-dependent
    argument whose leading term vanishes as x -> 0 might be encountered.
    See, for example, cos._eval_as_leading_term.
    """

If I add the following code to lerchphi (this is by no means a complete implementation),

def _eval_as_leading_term(self, x, logx=None, cdir=0):
    from sympy.functions.special.gamma_functions import digamma
    z, s, a = self.args
    z0, s0, a0 = (arg.subs(x, 0).cancel() for arg in self.args)
    if s == 1 and z0 == 1 and not a.has(x):
        lt = -log(-log(z))._eval_as_leading_term(x, logx=logx, cdir=cdir)
        return lt - S.EulerGamma - digamma(a)
    return super()._eval_as_leading_term(x, logx=logx, cdir=cdir)

the previous code runs as expected:

import sympy as sym

a = sym.Symbol('a')
e = sym.Symbol('varepsilon', positive=True)

f = sym.lerchphi(1 - e, 1, a)
F = - (1 - e)**a * f - sym.log(e)

f._eval_as_leading_term(e, logx=None, cdir=1)  # Out: -log(varepsilon) - polygamma(0, a) - EulerGamma
F._eval_as_leading_term(e, logx=None, cdir=1)  # Out: polygamma(0, a) + EulerGamma

However, this still does not solve the original problem with the integration. The reason is that the function F is the result of simplifying the primitive function; see my previous comment.
The problem would be trivially solved by adding a .simplify() in

antideriv = self._eval_integral(
    function, xab[0], **eval_kwargs)

But I guess that simplify is a very expensive function, so you might prefer not to do it.

Taking the limit of the unsimplified primitive function, we arrive at the line

coeff, ex = newe.leadterm(z, cdir=cdir)

which returns coeff = (z*log(t)*gamma(z + 1) + z*gamma(z + 1)*polygamma(0, z + 1) + EulerGamma*z*gamma(z + 1) - I*pi*z*gamma(z + 1) + log(t)*gamma(z + 1) - log(t)*gamma(z + 2) + gamma(z + 1)*polygamma(0, z + 1) + EulerGamma*gamma(z + 1) - I*pi*gamma(z + 1))/gamma(z + 2),
which is the correct coefficient, as can be seen after applying the simplify function.
The problem is in line

if not coeff.has(z):

The previous expression fails this test. So the problem could also be solved by simply simplifying the coefficient, but again, I don't know if this might have a negative impact on performance.
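For what it's worth, the coefficient quoted above really does collapse to a constant; a check I added (numeric, so it does not rely on simplify's output form):

```python
# The leadterm coefficient from above should equal
# polygamma(0, z + 1) + EulerGamma - I*pi, independently of t, because
# gamma(z + 2) = (z + 1)*gamma(z + 1) makes the log(t) terms cancel exactly.
import sympy as sym
from sympy import gamma, log, polygamma, EulerGamma, I, pi

t, z = sym.symbols('t z')
coeff = (z*log(t)*gamma(z + 1) + z*gamma(z + 1)*polygamma(0, z + 1)
         + EulerGamma*z*gamma(z + 1) - I*pi*z*gamma(z + 1)
         + log(t)*gamma(z + 1) - log(t)*gamma(z + 2)
         + gamma(z + 1)*polygamma(0, z + 1) + EulerGamma*gamma(z + 1)
         - I*pi*gamma(z + 1))/gamma(z + 2)
target = polygamma(0, z + 1) + EulerGamma - I*pi
diff = complex((coeff - target).subs({z: sym.Rational(1, 2), t: 3}).n())
print(abs(diff))  # numerically zero
```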

After that, it tries to apply the Gruntz algorithm, which still gives the wrong result:

import sympy as sym

from sympy import gamma, log, lerchphi

t = sym.Symbol('t', positive=True)
z = sym.Symbol('z')

F = (-t * t**z * z * lerchphi(t, 1, z + 1) * gamma(z + 1) / gamma(z + 2)
      - t * t**z * lerchphi(t, 1, z + 1) * gamma(z + 1) / gamma(z + 2)
      - log(t - 1))
sym.gruntz(F, t, 1, '-')  # Wrong result

So I guess that trying to understand why the gruntz function returns the wrong result might be the best option.

@oscarbenjamin
Contributor

So looks like the root problem is a bug in gruntz. I don't know the algorithm well enough to investigate easily.

@rogerbalsach
Author

Well, I wouldn't say that the root problem is in gruntz. I think the root problem is the missing implementation of lerchphi._eval_as_leading_term. Once this function is implemented, gruntz is able to compute the limit correctly if the expression is simplified beforehand.
Of course, there must be some other bug in gruntz for it to return an incorrect limit when the expression is not simplified beforehand.
I will try to investigate this also, but I'm also not familiar with the algorithm.

@oscarbenjamin
Contributor

Okay, so two issues then:

  1. lerchphi needs _eval_as_leading_term for limits to be computed properly.
  2. gruntz returns incorrect results when _eval_as_leading_term is missing.

To me the second problem is a bug, but the first is just a missing feature. Of course, it is still fine to fix the first but not the second if the second is not easy to fix, but it does mean that the bug is still there.

@rogerbalsach
Author

Ok, I was using the latest public version of sympy (1.12), where apparently there was a bug: whenever _eval_as_leading_term was not defined, Function._eval_as_leading_term was called, which simply did the substitution x = x0. This was fixed in 57acc60.
I am now working with the master branch, which does not have this problem, and thus I can agree that the missing implementation is not a bug but a missing feature.

Working now with the master branch, the first problem I face is the following:

import sympy as sym

t = sym.Symbol('t', positive=True)
z = sym.Symbol('z')

expr = -t**(z + 1)*sym.lerchphi(t, 1, z + 1) - sym.log(t - 1)
sym.gruntz(expr, t, 1, '-')  # Out: oo

The problem is on these lines in Function._eval_nseries:

term = e.subs(x, S.Zero)
if term.is_finite is False or term is S.NaN:
raise PoleError("Cannot expand %s around 0" % (self))

When doing the substitution, the output is lerchphi(1, 1, z+1), which is infinite, but it doesn't have the is_finite flag set, so the code continues assuming it is a finite value, which causes the wrong limit.
Thus, if I understand it correctly, the bug is in the current implementation of Function._eval_nseries: right now, it assumes everything is finite unless explicitly marked as infinite. I guess this is the desired behaviour in almost all cases, but when it does not hold, it gives a wrong result. Has the option of assuming everything is infinite unless explicitly marked as finite ever been considered? It is obviously much less practical, and probably a lot of limits that are currently computed correctly would fail if this changed, but these silent bugs would be removed.

By defining lerchphi._eval_is_finite in this way:

def _eval_is_finite(self):
    z, s, a = self.args
    if z == 1 and s == 1:
        return False

the Gruntz algorithm no longer returns the wrong result; it fails with the following error:

Traceback (most recent call last):
  File "/home/roger/Desktop/Sympy/sympy/sympy/series/gruntz.py", line 557, in mrv_leadterm
    lt = f.leadterm(w, logx=logw)
         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/roger/Desktop/Sympy/sympy/sympy/core/expr.py", line 3533, in leadterm
    raise ValueError(filldedent("""
ValueError: 
cannot compute leadterm(-(1 - _w)**(z + 1)*lerchphi(1 - _w, 1, z + 1)
- log(-_w), _w). The coefficient should have been free of _w but got t
- lerchphi(1 - _w, 1, z + 1) - I*pi

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/roger/Desktop/Sympy/sympy/sympy/core/mul.py", line 1962, in _eval_nseries
    coeff, exp = t.leadterm(x)
                 ^^^^^^^^^^^^^
  File "/home/roger/Desktop/Sympy/sympy/sympy/core/expr.py", line 3533, in leadterm
    raise ValueError(filldedent("""
ValueError: 
cannot compute leadterm(lerchphi(1 - _w, 1, z + 1), _w). The
coefficient should have been free of _w but got lerchphi(1 - _w, 1, z
+ 1)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/roger/Desktop/Sympy/sympy_debug.py", line 57, in <module>
    res = sym.gruntz(expr, t, 1, '-')
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/roger/Desktop/Sympy/sympy/sympy/series/gruntz.py", line 733, in gruntz
    r = limitinf(e0, z)
        ^^^^^^^^^^^^^^^
  File "/home/roger/Desktop/Sympy/sympy/sympy/core/cache.py", line 72, in wrapper
    retval = cfunc(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/roger/Desktop/Sympy/sympy/sympy/series/gruntz.py", line 453, in limitinf
    c0, e0 = mrv_leadterm(e, x)
             ^^^^^^^^^^^^^^^^^^
  File "/home/roger/Desktop/Sympy/sympy/sympy/core/cache.py", line 72, in wrapper
    retval = cfunc(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/roger/Desktop/Sympy/sympy/sympy/series/gruntz.py", line 563, in mrv_leadterm
    _series = f._eval_nseries(w, n=n0+incr, logx=logw)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/roger/Desktop/Sympy/sympy/sympy/core/add.py", line 507, in _eval_nseries
    terms = [t.nseries(x, n=n, logx=logx, cdir=cdir) for t in self.args]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/roger/Desktop/Sympy/sympy/sympy/core/expr.py", line 3396, in nseries
    return self._eval_nseries(x, n=n, logx=logx, cdir=cdir)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/roger/Desktop/Sympy/sympy/sympy/core/mul.py", line 1986, in _eval_nseries
    facs = [t.nseries(x, n=ceiling(n-n0), logx=logx, cdir=cdir) for t in self.args]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/roger/Desktop/Sympy/sympy/sympy/core/expr.py", line 3396, in nseries
    return self._eval_nseries(x, n=n, logx=logx, cdir=cdir)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/roger/Desktop/Sympy/sympy/sympy/core/function.py", line 731, in _eval_nseries
    raise PoleError("Cannot expand %s around 0" % (self))
sympy.core.function.PoleError: Cannot expand lerchphi(1 - _w, 1, z + 1) around 0

This should indeed be the expected result.
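For context (my addition): lerchphi(1, 1, a) is the harmonic-type series sum(1/(k + a) for k >= 0), which diverges like log(n), so having _eval_is_finite return False here is justified:

```python
# Partial sums of lerchphi(1, 1, a) = sum_{k>=0} 1/(k + a) for a = 2:
# they grow like log(n) and never converge.
a = 2.0
for n in (10**2, 10**4, 10**6):
    s = sum(1.0 / (k + a) for k in range(n))
    print(n, s)  # consecutive sums differ by about log(100) ~ 4.6
```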

However, the following code still gives a "wrong" result

import sympy as sym

t = sym.Symbol('t', positive=True)
z = sym.Symbol('z')

expr = -t**(z + 1)*sym.lerchphi(t, 1, z + 1) - sym.log(t - 1)
L = sym.Limit(expr, t, 1, '-')
res = L.doit()  # Out: -lerchphi(1, 1, z + 1) + oo

Of course, one could argue that this is not strictly a wrong result, since lerchphi(1, 1, z + 1) = oo and therefore it is returning oo - oo. The reason is that once the gruntz algorithm fails, it calls the heuristics algorithm.

In summary:

  • An implementation of lerchphi._eval_as_leading_term is needed if we want to get the correct limit.
  • An implementation of lerchphi._eval_is_finite is needed if we want the gruntz algorithm to raise an error instead of returning the wrong result.
  • The heuristics algorithm still gives the "wrong" result.

@oscarbenjamin
Contributor

Has it ever been considered the option of assuming everything is infinite unless explicitly marked as finite?

The problem is an expression like this:

In [1]: x = symbols('x')

In [2]: print(x.is_finite)
None

I would say that it is not robust to depend on expr.subs(x, 0).is_finite in general.

  • An implementation of lerchphi._eval_as_leading_term is needed if we want to get the correct limit.
  • An implementation of lerchphi._eval_is_finite is needed if we want the gruntz algorithm to raise an error instead of returning the wrong result.
  • The heuristics algorithm still gives the "wrong" result.

I agree with all these points. Adding the missing methods is a useful thing to do regardless of anything else.

@rogerbalsach
Author

The problem is an expression like this:

In [1]: x = symbols('x')

In [2]: print(x.is_finite)
None

Ok, as I said, I'm not sure it is worth fixing this problem, since 99% of the time it should work fine and changing it would probably break a lot of limits. However, here is an option: we could add a condition like term.is_Function and not term.is_finite to require that, when we deal with functions, the expression be explicitly finite to continue. This should avoid the problem with symbols.

Then, regarding heuristics, I found the problem. It is essentially the same as the rest: when heuristics is called on a function, it simply computes the limits of each argument and then evaluates the function at those limits, which doesn't work if the function is infinite there. Thus, I would add

diff --git a/sympy/series/limits.py b/sympy/series/limits.py
index b3976e5512..e9553e58fc 100644
--- a/sympy/series/limits.py
+++ b/sympy/series/limits.py
@@ -115,7 +115,7 @@ def heuristics(e, z, z0, dir):
                     l = limit(e3, z, z0, dir)
                     rv = l * Mul(*r2)
 
-            if rv is S.NaN:
+            if rv is S.NaN or e.is_Function and rv.is_finite is False:
                 try:
                     from sympy.simplify.ratsimp import ratsimp
                     rat_e = ratsimp(e)

I don't know whether it is better to use rv.is_finite is False (i.e., only catch when the function is explicitly marked as infinite) or not rv.is_finite (i.e., catch everything unless the function is explicitly finite).

Also, I realised there is another potential bug in heuristics, specifically in this part:

try:
    from sympy.simplify.ratsimp import ratsimp
    rat_e = ratsimp(e)
except PolynomialError:
    return
if rat_e is S.NaN or rat_e == e:
    return
return limit(rat_e, z, z0, dir)

The idea is to simplify the expression and try the limit again, with the implicit assumption that if heuristics is called again, e will already be simplified, so the check rat_e == e will catch this and return. However, the expression initially passed to the limit function (i.e., rat_e) is not the same expression that limit will pass to the heuristics function; in particular, there is this line:

e = powsimp(e)

This is problematic since, in some cases, ratsimp and powsimp are inverses of each other:

import sympy as sym
from sympy.simplify import ratsimp, powsimp

t = sym.Symbol('t', positive=True)
z = sym.Symbol('z')

expr = t**(z+1)
rat_e = ratsimp(expr)  # t*t**z
pow_rat_e = powsimp(rat_e)  # t**(z + 1) == expr

So, even though I haven't found this problem in any real example (since the functions need to be complicated enough, and the limits need to fail in a particular way, to trigger it), this has the potential to fall into an infinite loop.

With this, the limit of expr = -t**(z + 1)*sym.lerchphi(t, 1, z + 1) - sym.log(t - 1) is now correctly computed.

However, everything here still depends on the original expression being simplified. The original integral now evaluates to
oo*sign(-z*gamma(z + 1)/gamma(z + 2) - gamma(z + 1)/gamma(z + 2) + 1) + I*pi
so at least now the user is able to manually identify this as an oo * 0 case and realise that the limit cannot be trusted (compared with the initial result, which simply gave oo + I*pi).
The problem seems to appear in this line in the implementation of limitinf:

e = e.rewrite('tractable', deep=True, limitvar=x)

This rewrites the expression
-z*(1 - 1/t)*(1 - 1/t)**z*lerchphi(1 - 1/t, 1, z + 1)*gamma(z + 1)/gamma(z + 2) - (1 - 1/t)*(1 - 1/t)**z*lerchphi(1 - 1/t, 1, z + 1)*gamma(z + 1)/gamma(z + 2) - log(-1/t),
which can be successfully simplified, into
-z*(1 - 1/t)*(1 - 1/t)**z*lerchphi(1 - 1/t, 1, z + 1)*exp(loggamma(z + 1))*exp(-loggamma(z + 2)) - (1 - 1/t)*(1 - 1/t)**z*lerchphi(1 - 1/t, 1, z + 1)*exp(loggamma(z + 1))*exp(-loggamma(z + 2)) - log(-1/t),
which not even the full simplify function is able to handle.
In particular, then in the line

c0, e0 = mrv_leadterm(e, x)

it returns
(-z*exp(loggamma(z + 1) - loggamma(z + 2)) - exp(loggamma(z + 1) - loggamma(z + 2)) + 1, -1)
which is not correct since the c0 coefficient is 0.
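The vanishing of that coefficient is easy to confirm (my check), since exp(loggamma(z + 1) - loggamma(z + 2)) is just gamma(z + 1)/gamma(z + 2) = 1/(z + 1):

```python
# The c0 returned by mrv_leadterm is identically zero:
# -z/(z + 1) - 1/(z + 1) + 1 == 0.
import sympy as sym

z = sym.Symbol('z')
c0 = (-z*sym.exp(sym.loggamma(z + 1) - sym.loggamma(z + 2))
      - sym.exp(sym.loggamma(z + 1) - sym.loggamma(z + 2)) + 1)
val = complex(c0.subs(z, sym.Rational(1, 2)).n())
print(abs(val))  # zero up to rounding
```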
I have no idea how to solve it besides my original idea of simplifying the result. I would personally simplify the coeff before doing this check:

if not coeff.has(z):
    if ex.is_positive:
        return S.Zero
    elif ex == 0:
        return coeff
    elif ex.is_negative:
        if cdir == 1:
            return S.Infinity*sign(coeff)
        elif cdir == -1:
            return S.NegativeInfinity*sign(coeff)*S.NegativeOne**(S.One + ex)
        else:
            return S.ComplexInfinity
    else:
        raise NotImplementedError("Not sure of sign of %s" % ex)

This would indeed fix the original problem. Of course, a full simplify might not be the best idea for performance, so we could call more specific functions. For this particular example, a call to gammasimp would be enough.
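Indeed, for the gamma ratio appearing here, gammasimp alone does the reduction (a quick check I added):

```python
# gammasimp reduces the ratio without invoking the full simplify machinery,
# using gamma(z + 2) = (z + 1)*gamma(z + 1).
import sympy as sym

z = sym.Symbol('z')
ratio = sym.gamma(z + 1) / sym.gamma(z + 2)
reduced = sym.gammasimp(ratio)
print(reduced)  # 1/(z + 1)
```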

@oscarbenjamin
Contributor

I'm finding it difficult to follow all of this so I'm not sure exactly what to suggest. Probably there are various things that could be fixed. I would say let's start with the most obvious fixes and get them in first.

In general, simplify should not be used in library code, not just because of potential slowness but also because it is heuristic and its output is not well defined.

More generally it would be better if the limits etc code used some sort of canonical representation rather than lots of heuristic checks and simplification.

@rogerbalsach rogerbalsach linked a pull request May 4, 2024 that will close this issue