Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segfault testing src/sage/libs/gap/element.pyx on python 3.12 #37026

Open
tornaria opened this issue Jan 7, 2024 · 13 comments · May be fixed by #37951
Open

Segfault testing src/sage/libs/gap/element.pyx on python 3.12 #37026

tornaria opened this issue Jan 7, 2024 · 13 comments · May be fixed by #37951
Labels
Milestone

Comments

@tornaria
Copy link
Contributor

tornaria commented Jan 7, 2024

Concerning the error in src/sage/libs/gap/element.pyx, it happens in line 2489:

for i in range(100):
        rnd = [ randint(-10,10) for i in range(randint(0,7)) ]
        # compute the sum in GAP
        _ = libgap.Sum(rnd)
        try:
            libgap.Sum(*rnd)
            print('This should have triggered a ValueError')
            print('because Sum needs a list as argument')
        except ValueError:
            pass

getting Killed due to segmentation fault. Running this code in the terminal works, the error appears only in tests and with python 3.12.

Originally posted by @enriqueartal in #37011 (comment)

@tornaria tornaria changed the title Segfault testing src/sage/libs/gap/element.pyx Segfault testing src/sage/libs/gap/element.pyx on python 3.12 Jan 7, 2024
@tornaria tornaria added the t: bug label Jan 7, 2024
@dimpase
Copy link
Member

dimpase commented Jan 7, 2024

my eyes bleed from this line:

        rnd = [ randint(-10,10) for i in range(randint(0,7)) ]

How about

        rnd = random.choices(range(-10,10), k=randint(0,7))

?

See https://docs.python.org/3/library/random.html#random.choices

@dimpase
Copy link
Member

dimpase commented Jan 7, 2024

maybe it segfaults on libgap.Sum(*rnd) when rnd==[] ?

@dimpase
Copy link
Member

dimpase commented Mar 11, 2024

This also pops up on Fedora 39 with system Python 3.12, see e.g. https://groups.google.com/g/sage-release/c/oTnTrEnB3YU/m/tZiLm66ZAwAJ

@dimpase
Copy link
Member

dimpase commented Mar 11, 2024

But I can't reproduce this with Python 3.12.2

@enriqueartal
Copy link
Contributor

Is it possible that both @jaapspies and me are missing some python3 system package?

@enriqueartal
Copy link
Contributor

At some point this segfault occured only in tests, now it happens also in the command line

@kiwifb
Copy link
Member

kiwifb commented Apr 22, 2024

my eyes bleed from this line:

        rnd = [ randint(-10,10) for i in range(randint(0,7)) ]

How about

        rnd = random.choices(range(-10,10), k=randint(0,7))

?

See https://docs.python.org/3/library/random.html#random.choices

Tried that but it does not have an effect

sage: import random ## line 2490 ##
sage: for i in range(100):
    rnd = random.choices(range(-10,10), k=randint(0,7))
    # compute the sum in GAP
    _ = libgap.Sum(rnd)
    try:
        libgap.Sum(*rnd)
        print('This should have triggered a ValueError')
        print('because Sum needs a list as argument')
    except ValueError:
        pass ## line 2491 ##
------------------------------------------------------------------------
/usr/lib/python3.12/site-packages/cysignals/signals.cpython-312-x86_64-linux-gnu.so(+0xa0d4)[0x7fbe2865c0d4]
/usr/lib/python3.12/site-packages/cysignals/signals.cpython-312-x86_64-linux-gnu.so(+0xa196)[0x7fbe2865c196]
/usr/lib/python3.12/site-packages/cysignals/signals.cpython-312-x86_64-linux-gnu.so(+0xcbf6)[0x7fbe2865ebf6]
/usr/lib64/libc.so.6(+0x3a210)[0x7fbe2945f210]

@tornaria
Copy link
Contributor Author

There's a more detailed trace in #36407 (comment)

@antonio-rojas
Copy link
Contributor

antonio-rojas commented May 1, 2024

Caused by GCC commit https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=049ec9b981d1f4f97736061d5cf7d0ae990b57d7

Upstream report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

Can be reproduced by calling any libgap function in exactly 3 parameters, eg. libgap.AbelianGroup(0,0,0)

@dimpase
Copy link
Member

dimpase commented May 3, 2024

@sheerluck has kindly disassembled broken element....so, resulting from compilation of element.c. I report the result here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872#c12
(that is, code generation is broken, the call to GAP_CallFunc3Args() is formed with one parameter less than it should be)

Fixing gcc is above my payrate, though :-)

@mkoeppe mkoeppe added this to the sage-10.4 milestone May 3, 2024
dimpase added a commit to dimpase/sage that referenced this issue May 6, 2024
@dimpase dimpase linked a pull request May 6, 2024 that will close this issue
5 tasks
@dimpase
Copy link
Member

dimpase commented May 8, 2024

more discussion on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872 seems to indicate that gcc is actually not broken, it's our code that is (missing volatile declarations).
See also https://stackoverflow.com/questions/7996825/why-volatile-works-for-setjmp-longjmp

If I am not mistaken, we have

sig_on()
...
GAP_CallFunc3Args(foo0,...,foo3)
...
sign_off()

and the longmp() triggered somewhere in libgap, called by GAP_CallFunc3Args, invalidates the registers' contents.
Now, if foo3, say, was held in a register, it is gone for good. Thus unwinding this call, which involves a GAP object
held in foo3, is no longer possible, and triggers a segfault.

Looks like a lot of volatile to add all over the place: (anywhere libgap, singular (which also uses setjmp/longjmp) is called, any other setjmp/longjmp-ing upstream code?

@dimpase
Copy link
Member

dimpase commented May 13, 2024

See https://trofi.github.io/posts/312-the-sagemath-saga.html for a detailed analysis. Not sure whether the detail that GAP uses setjmp/longjmp to return from GAP_CallFunc3Args() here in the 1st place.

@dimpase
Copy link
Member

dimpase commented May 13, 2024

By the way, the Python 3.12 is only needed to produce a crash. Otherwise, we get a memory leak, as these temporary GAP objects are not cleared.

$ ./sage
┌────────────────────────────────────────────────────────────────────┐
│ SageMath version 10.4.beta6, Release Date: 2024-05-12              │
│ Using Python 3.11.9. Type "help()" for help.                       │
└────────────────────────────────────────────────────────────────────┘

 sage: libgap.count_GAP_objects()
0
sage: libgap.Sum(1,2,42)
---------------------------------------------------------------------------
GAPError                                  Traceback (most recent call last)
Cell In[2], line 1
----> 1 libgap.Sum(Integer(1),Integer(2),Integer(42))

File /home/scratch/scratch2/dimpase/sage/sage/src/sage/libs/gap/element.pyx:2514, in sage.libs.gap.element.GapElement_Function.__call__()
   2512 try:
   2513     sig_GAP_Enter()
-> 2514     sig_on()
   2515     if n == 0:
   2516         result = GAP_CallFunc0Args(self.value)

GAPError: Error, no method found! Error, no 1st choice method found for `SumOp' on 3 arguments
sage: libgap.count_GAP_objects()
4
sage: libgap.Sum(3,5,7)
---------------------------------------------------------------------------
GAPError                                  Traceback (most recent call last)
Cell In[4], line 1
----> 1 libgap.Sum(Integer(3),Integer(5),Integer(7))

File /home/scratch/scratch2/dimpase/sage/sage/src/sage/libs/gap/element.pyx:2514, in sage.libs.gap.element.GapElement_Function.__call__()
   2512 try:
   2513     sig_GAP_Enter()
-> 2514     sig_on()
   2515     if n == 0:
   2516         result = GAP_CallFunc0Args(self.value)

GAPError: Error, no method found! Error, no 1st choice method found for `SumOp' on 3 arguments
sage: libgap.count_GAP_objects()
7

dimpase added a commit to dimpase/sage that referenced this issue May 14, 2024
vbraun pushed a commit to vbraun/sage that referenced this issue May 15, 2024
    
This appears to fix sagemath#37026 (segfaults in src/sage/libs/gap/element.pyx
with Python 3.12 and gcc 13.2.1)

See also
sagemath#36407 (comment)

The corresponding gcc 13.2.1's bug (or feature) is being dealt with
here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872
### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#37951
Reported by: Dima Pasechnik
Reviewer(s): Dima Pasechnik, Gonzalo Tornaría
vbraun pushed a commit to vbraun/sage that referenced this issue May 18, 2024
    
This appears to fix sagemath#37026 (segfaults in src/sage/libs/gap/element.pyx
with Python 3.12 and gcc 13.2.1)

See also
sagemath#36407 (comment)

The corresponding gcc 13.2.1's bug (or feature) is being dealt with
here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872
### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#37951
Reported by: Dima Pasechnik
Reviewer(s): Dima Pasechnik, Gonzalo Tornaría
vbraun pushed a commit to vbraun/sage that referenced this issue May 18, 2024
    
This appears to fix sagemath#37026 (segfaults in src/sage/libs/gap/element.pyx
with Python 3.12 and gcc 13.2.1)

See also
sagemath#36407 (comment)

The corresponding gcc 13.2.1's bug (or feature) is being dealt with
here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872
### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#37951
Reported by: Dima Pasechnik
Reviewer(s): Dima Pasechnik, Gonzalo Tornaría
vbraun pushed a commit to vbraun/sage that referenced this issue May 24, 2024
    
This appears to fix sagemath#37026 (segfaults in src/sage/libs/gap/element.pyx
with Python 3.12 and gcc 13.2.1)

See also
sagemath#36407 (comment)

The corresponding gcc 13.2.1's bug (or feature) is being dealt with
here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872
### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#37951
Reported by: Dima Pasechnik
Reviewer(s): Dima Pasechnik, Gonzalo Tornaría
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants