Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

declare the last arg to GAP_CallFunc3Args volatile #37951

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

dimpase
Copy link
Member

@dimpase dimpase commented May 6, 2024

This appears to fix #37026 (segfaults in src/sage/libs/gap/element.pyx with Python 3.12 and gcc 13.2.1)

See also #36407 (comment)

The corresponding gcc 13.2.1's bug (or feature) is being dealt with here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872

📝 Checklist

  • The title is concise and informative.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.
  • I have created tests covering the changes.
  • I have updated the documentation and checked the documentation preview.

⌛ Dependencies

@dimpase
Copy link
Member Author

dimpase commented May 6, 2024

CI seems to be shot, but the tests for element.pyx work, and also libgap.AbelianGroup(0,0,0) does not cause a crash any more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since sig_GAP_Enter() calls sig_error(), shouldn't it be called after the call to sig_on()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this ought to be fixed, but it's a different bug. Flipping these orders don't cure the issue at hand, that volatile declaration is still needed. I'm not sure it's a Sage or a gcc bug.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I was just wondering. Since I don't have gcc 14 I can't reproduce myself. Since we don't know what is the cause of the problem, we don't know whether the volatile fixes the issue or we just push around the optimizer to avoid it. The nesting of setjmp() (one for gap, one for sig_on) seems to complicate things.

Looking at the assembler, it seems that gcc is using %ebx for both t in #define sig_GAP_Enter() {int t = GAP_Enter(); if (!t) sig_error();} and for __pyx_t_6 (which contains the third parameter). Making the third parameter volatile avoids the conflict t, but maybe it's better to mark t volatile instead (we don't know if another gcc will place the second parameter in %ebx, etc). Does that work?

I.e.,

--- a/src/sage/libs/gap/gap_includes.pxd
+++ b/src/sage/libs/gap/gap_includes.pxd
@@ -28,7 +28,7 @@ cdef extern from "gap/calls.h" nogil:
 
 cdef extern from "gap/libgap-api.h" nogil:
     """
-    #define sig_GAP_Enter()  {int t = GAP_Enter(); if (!t) sig_error();}
+    #define sig_GAP_Enter()  {volatile int t = GAP_Enter(); if (!t) sig_error();}
     """
     ctypedef void (*GAP_CallbackFunc)()
     void GAP_Initialize(int argc, char ** argv,

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's actually gcc 13.2.1 which gives this trouble on Gentoo, not gcc 14.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah.... I'm on 13.2.0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the assembler, it seems that gcc is using %ebx for both t in #define sig_GAP_Enter() {int t = GAP_Enter(); if (!t) sig_error();} and for __pyx_t_6 (which contains the third parameter). Making the third parameter volatile avoids the conflict t, but maybe it's better to mark t volatile instead (we don't know if another gcc will place the second parameter in %ebx, etc). Does that work?

no, it doesn't. And, after all, t in int t = GAP_Enter(); gets out of the way almost immediately, before the list for arguments for GAP_CallFunc3Args is even created.

@mkoeppe
Copy link
Member

mkoeppe commented May 7, 2024

CI seems to be shot

@dimpase The bullet: #36982

Copy link

github-actions bot commented May 8, 2024

Documentation preview for this PR (built with commit 72e6b66; changes) is ready! 🎉
This preview will update shortly after each push to this PR.

@mkoeppe
Copy link
Member

mkoeppe commented May 13, 2024

LGTM based on the GCC bugzilla discussion cited.
If nothing else, it's a harmless change.
Let's merge it.

@dimpase
Copy link
Member Author

dimpase commented May 14, 2024

This is certainly a minimal hotfix, and we should fully expect more trouble of this sort as C/C++ compilers optimise more, and ref-counted GAP objects we use end up needing post-processing after a longjmp from libgap. Postprocessing which can be invalidated by a longjmp, if the respective handle is held in a non-volatile local variable.

A better, more efficient way, would be having a non-ref-counted version of GAP objects, which don't need this flaky post-processing, and can be used as headache-free automatic variables.

vbraun pushed a commit to vbraun/sage that referenced this pull request May 15, 2024
    
This appears to fix sagemath#37026 (segfaults in src/sage/libs/gap/element.pyx
with Python 3.12 and gcc 13.2.1)

See also
sagemath#36407 (comment)

The corresponding gcc 13.2.1's bug (or feature) is being dealt with
here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872
### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#37951
Reported by: Dima Pasechnik
Reviewer(s): Dima Pasechnik, Gonzalo Tornaría
vbraun pushed a commit to vbraun/sage that referenced this pull request May 18, 2024
    
This appears to fix sagemath#37026 (segfaults in src/sage/libs/gap/element.pyx
with Python 3.12 and gcc 13.2.1)

See also
sagemath#36407 (comment)

The corresponding gcc 13.2.1's bug (or feature) is being dealt with
here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872
### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#37951
Reported by: Dima Pasechnik
Reviewer(s): Dima Pasechnik, Gonzalo Tornaría
vbraun pushed a commit to vbraun/sage that referenced this pull request May 18, 2024
    
This appears to fix sagemath#37026 (segfaults in src/sage/libs/gap/element.pyx
with Python 3.12 and gcc 13.2.1)

See also
sagemath#36407 (comment)

The corresponding gcc 13.2.1's bug (or feature) is being dealt with
here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872
### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#37951
Reported by: Dima Pasechnik
Reviewer(s): Dima Pasechnik, Gonzalo Tornaría
vbraun pushed a commit to vbraun/sage that referenced this pull request May 24, 2024
    
This appears to fix sagemath#37026 (segfaults in src/sage/libs/gap/element.pyx
with Python 3.12 and gcc 13.2.1)

See also
sagemath#36407 (comment)

The corresponding gcc 13.2.1's bug (or feature) is being dealt with
here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114872
### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#37951
Reported by: Dima Pasechnik
Reviewer(s): Dima Pasechnik, Gonzalo Tornaría
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Segfault testing src/sage/libs/gap/element.pyx on python 3.12
3 participants