Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Debugger compatibility #683

Closed
tkf opened this issue Apr 30, 2019 · 25 comments
Closed

Debugger compatibility #683

tkf opened this issue Apr 30, 2019 · 25 comments

Comments

@tkf
Copy link
Member

tkf commented Apr 30, 2019

It seems we may need to use disable_sigint to support the JuliaInterpreter.jl out-of-the-box https://discourse.julialang.org/t/debugger-fails-on-pycall/23560/3

At the moment, users have to do push!(JuliaInterpreter.compiled_modules, PyCall): https://discourse.julialang.org/t/debugger-fails-on-pycall/23560/6

@stevengj You didn't want to put disable_sigint for all ccall #574 (comment) but maybe replacing only the sigatomic_begin-sigatomic_end pairs is OK?

@yuyichao
Copy link
Collaborator

Afaict if the interpretor has different behavior than normal execution it's the interpreters bug.

@tkf
Copy link
Member Author

tkf commented Apr 30, 2019

Thanks. So it looks like push!(compiled_methods, which(disable_sigint, Tuple{Function})) etc. are added in PR JuliaDebug/JuliaInterpreter.jl#108 to workaround JuliaDebug/JuliaInterpreter.jl#106 and just postponed solving the real bug.

But I'm still interested in #574 so moving to disable_sigint is something I want to do anyway. It's just more PyCall code would become debuggable once JuliaInterpreter.jl fix the issue. So I don't think there is any harm in using disable_sigint. It just makes PyCall rely less on non-public functions.

@yuyichao
Copy link
Collaborator

And reading through the list of compiled methods from JuliaInterpreter, they all seems like temporary workaround and none of them seems like the correct long term solution. This means that changing other's code is not the correct solution either.

In this and also a few other cases like pointerset, the problem is simply that the underlying function requires special treatment. For this case specifically, without reading the code or trying myself, I assume the interpreter simply blindly executed the ccall, which causes the error. However, the signal handling is something a debugger has to deal with. The correct thing to do here is to intercept the ccall, use it to update a counter maintained by the interpreter for the current task instead of performing the ccall. Depending on the signal handing user interface you want to expose, the interpreter could itself run either with the signal delivery enabled or disabled and it could run the user code with the signal handling disabled if the counter is not-0.

just postponed solving the real bug.

Correct. And that's pretty much all what I see in that list as mentioned above.

But I'm still interested in #574 so moving to disable_sigint is something I want to do anyway

The correct solution there is likely a cfunction flag.

@yuyichao
Copy link
Collaborator

Closing since this is clearly a debugger bug and it doesn't make much sense to workaround it here when there's already an easy workaround for the user. whether to use disable_sigint is an independent issue that already has discussion elsewhere so no reason to do it here.

@tkf
Copy link
Member Author

tkf commented Apr 30, 2019

Closing this issue is OK, I guess. I don't think there is an active issue tracking only disable_sigint vs sigatomic_(begin|end). PR #574 is mixing it with different harder issues. But opening a separate issue that does this is a cleaner approach.

The correct solution there is likely a cfunction flag.

What do you mean by this? We have code like

pyjlwrap_call_ptr = @cfunction(pyjlwrap_call, PyPtr, (PyPtr,PyPtr,PyPtr))

and you are suggesting to improve it somehow? Something is better than PyPtr?

@yuyichao
Copy link
Collaborator

No, I mean it needs to be a Julia change. The only thin you need a guaranteed on is that n signal is delivered before you can catch it. It has men mentioned a few times but I cannot find it.....

@tkf
Copy link
Member Author

tkf commented Apr 30, 2019

It would be nice if there is an issue in Julia repository that I can subscribe to.

By the way, I just tried the repro code I posted in JuliaLang/julia#29498 and now it works with Julia 1.3.0-DEV.121 (dc6c7c7e6f). Maybe JuliaLang/julia#31191 [1] fixed what you had in mind? But it seems the issue fixed by it was more about the world-age.

[1] mentions JuliaLang/julia#29498

@yuyichao
Copy link
Collaborator

What did I had in mind? That julia issue doesn't seem to have anything to do with signal handling.

@tkf
Copy link
Member Author

tkf commented Apr 30, 2019

That's why I was puzzled.

@yuyichao
Copy link
Collaborator

What are you puzzled by? I'm not sure what you mean by "what I had in mind" but the only thing I've mentioned in this thread that you didn't understand that is related to #574 is adding new function to cfunction. It has nothing to do with anything else and has nothing to do with this issue. I don't believe there's an issue for it since it was previously only mentioned in the context of PyCall signal handling either in one of the issues here or on discourse.

I never once mentioned either JuliaLang/julia#29498 or JuliaLang/julia#31191 so I never had anything in particular about them in mind, apart from that it doesn't seem to be signal handling related at all. And if you are just puzzled by why it's fixed or why you can hit a non-signal handling related issue, then, well, that's just how things usually works..........................

@tkf
Copy link
Member Author

tkf commented Apr 30, 2019

You said

The correct solution there is likely a cfunction flag.

From your later clarifications, my understanding was that you were saying some Julia internal related to cfunction and signal handler has to be fixed to resolve #574. However, it seems that #574 is fixed by JuliaLang/julia#31191 while it seems, as I said, that

the issue fixed by it was more about the world-age

and that the PR JuliaLang/julia#31191 does not have

anything to do with signal handling

as you said.

So, my understanding is that #574 is fixed by a patch that has nothing to do with the "correct solution" I thought you had in mind. That's why I was puzzled.

I never once mentioned either JuliaLang/julia#29498 or JuliaLang/julia#31191

JuliaLang/julia#29498 is the "Julia mirror" of #574. That's why I'm mentioning it. I'm mentioning JuliaLang/julia#31191 because, as I said, JuliaLang/julia#31191 is linking JuliaLang/julia#29498 and JuliaLang/julia#31191 may have fixed JuliaLang/julia#29498.

And if you are just puzzled by why it's fixed or why you can hit a non-signal handling related issue, then, well, that's just how things usually works

I don't understand this sentence. Are you saying the signal handler bug is still there and JuliaLang/julia#31191 just hid the symptom without fixing the cause?

@yuyichao
Copy link
Collaborator

From your later clarifications, my understanding was that you were saying some Julia internal related to cfunction and signal handler has to be fixed to resolve #574

Errr, #574 is a pull request and that's all what I was talking about. There's nothing to "fix" for a PR. In the context of a PR, I think it's pretty clear that what I was saying is the implementation of it, especially since I explicitly quoted your "moving to disable_sigint". I'm mentioning it because I don't think moving to disable_sigint has anything to do with that issue and it is also not the correct solution for the issue that PR is trying to fix.

However, it seems that #574 is fixed by JuliaLang/julia#31191 while it seems, as I said, that

If you are talking about the issue you hit in #574, you should be more clear on that........ It wasn't even mentioned anywhere in this issue before my mentioning of the correct solution.

So, my understanding is that #574 is fixed by a patch that has nothing to do with the "correct solution" I thought you had in mind. That's why I was puzzled.

So, no it's not. The issue you hit in #574 is unrelated to signal handling nor the correct solution. #574 is a PyCall PR and needs changes in julia to correctly fix.

JuliaLang/julia#29498 is the "Julia mirror" of #574

And no, really.... #574 is about signal handling, JuliaLang/julia#29498 is not...... A mirror issue is when closing one closes the other one. You merely have some dependencies in between due to the current implementation in the PR. The two are not equivalent...

I don't understand this sentence

I'm making very wield guesses of what you are puzzled by. Since you were puzzled by me mentioning the issue you hit in #574 being not signal handling related, I was guessing you were puzzled by or didn't realized that JuliaLang/julia#29498 is not really related to #574 or signal handling at all. From

Are you saying the signal handler bug is still there and JuliaLang/julia#31191 just hid the symptom without fixing the cause?

It seems that I wasn't too off. To clarify again, JuliaLang/julia#29498 is not a signal handler bug. JuliaLang/julia#29498 might have been fixed JuliaLang/julia#31191. I didn't check either so I don't know but either way, neither of the two fix or hide any signal handling bug. There are AFAICT no julia signal handler bug mentioned anywhere in any linked issues in this thread so far. The only thing I said that'll help implementing the correct solution in #574 is a small feature addition, not a bug.

@yuyichao
Copy link
Collaborator

I don't think there is an active issue tracking only disable_sigint vs sigatomic_(begin|end)

And I don't see a reason there need to be one. Debugger is completely irrelevant since it's a debugger bug that already has a workaround. The two doesn't make any difference for #574 (the PR, not any issue you hit in that PR though those issues have even less to do with either functions). Only minor issue being sigatomic_* being unexported functions. This being the only reason and that's what I believe can just fit in #574, which would be removing all signal handling code from ccall site.

@tkf
Copy link
Member Author

tkf commented Apr 30, 2019

especially since I explicitly quoted your "moving to disable_sigint"

OK. You seem to get caught up with the first sentence in that paragraph. The PR #574 at that point was mentioned for its aspect on the cleanup of API usage; i.e., stop using the Julia internal functions sigatomic_begin and sigatomic_end. It is clear if you look at the last sentence of that paragraph:

It just makes PyCall rely less on non-public functions.

Note also that API cleanup is how I mentioned #574 in the OP as well. Having said that, I agree that the sentence you quoted alone is very confusing. I'm sorry about confusing you.

I'm mentioning it because I don't think moving to disable_sigint has anything to do with that issue and it is also not the correct solution for the issue that PR is trying to fix.

Yes, I know. Note that PR #574 already uses disable_sigint. So the "issue" wouldn't have manifested if it were solved by disable_sigint.

If you are talking about the issue you hit in #574, you should be more clear on that........

I was indeed sloppy about distinction between the PR #574 itself and the blocker issue of #574. I'll try to be clear.

Since you were puzzled by me mentioning the issue you hit in #574 being not signal handling related

No, the opposite. My guess for what blocks #574 has been the world age issue revealed in JuliaLang/julia#29498. That's why I was confused with your comments indicating that merging #574 requires some changes in Julia's signal related code. This was very different from my expectation.

The only thing I said that'll help implementing the correct solution in #574 is a small feature addition, not a bug.

Thanks. This explains a lot. From my point of view, merging #574 is blocked by the bug JuliaLang/julia#29498 because it makes PyJulia not usable.

On the other hand, you seems to be focusing on the fact that disable_sigint/reenable_sigint is not enough to guarantee that signal handling is re-enabled inside Julia functions called from Python? It would be great to have such guarantee. However, IIUC it just improves whatever we can do with #574. The lack of such guarantee is not blocking #574 to be merged.

@yuyichao
Copy link
Collaborator

you seems to be focusing on the fact that disable_sigint/reenable_sigint is not enough to guarantee that signal handling is re-enabled inside Julia functions called from Python

No, it's possible to implement that but it's a really bad way to do it and AFAICT #574 isn't doing that correctly.

What you have in #574 is equivalent to/no better than not having any signal handling code, which is why it really doesn't make sense to add so many disable_sigint to begin with. The only thing missing in that approach (and one important thing missing from your PR) is to re-enable the signal after you setup the try-catch block to make sure,

no signal is delivered before you can catch it

This is only needed for callback so it should be handled there. And the only thing you need is a sigatomic on the cfunction which is the new feature you need.

@tkf
Copy link
Member Author

tkf commented Apr 30, 2019

it really doesn't make sense to add so many disable_sigint to begin with

I thought any ccall that may invoke Julia functions needs disable_sigint? If so, how do you reduce the number of disable_sigint in #574? You said "it's possible to implement that." Does it mean it's possible to implement what PR #574 does by fewer calls to disable_sigint using current Julia? Or are you saying it's possible with enhancement in Julia?

a sigatomic on the cfunction which is the new feature you need

Does this eliminate the needs of disable_sigint, provided that all Julia callback uses this feature?

Also, how is it "better than not having any signal handling code?" If you want to handle signals while calling Python functions, I think you need to call PyErr_SetInterrupt and let Python handle the signal. However, it looks like Julia does not have a way to register a custom handler JuliaLang/julia#14675.

@yuyichao
Copy link
Collaborator

I thought any ccall that may invoke Julia functions needs disable_sigint

What do you mean by "need".

You said "it's possible to implement that." Does it mean it's possible to implement what PR #574 does by fewer calls to disable_sigint using current Julia? Or are you saying it's possible with enhancement in Julia?

I mean the PR as is doesn't need the disable_sigint since it's not implementing the cfunction side correctly anyway. Without enhancement in julia there is a way to implement the signal handling correct in cfunction and with defer_sigint on every ccall. With enhancement in julia there's a way to do the same without any defer_sigint on any ccall.

Does this eliminate the needs of disable_sigint, provided that all Julia callback uses this feature?

Eliminate the need for what..... You only need disable_sigint if the signal is handled more carefully in cfunction anyway and such need can be eliminated with changes in julia base.

how is it "better than not having any signal handling code?

I said, #574 is no better than not having any signal handling code. And I'm only talking about sigatomic_begin, sigatomic_end, disable_sigint, reenable_sigint, not anything anything python related. You certainly still need exception handling code and that's the code that interacts with python, not signal handling.

@tkf
Copy link
Member Author

tkf commented Apr 30, 2019

I thought any ccall that may invoke Julia functions needs disable_sigint

What do you mean by "need".

I thought any ccall that may invoke Julia functions must be wrapped by disable_sigint. That's what you said in https://discourse.julialang.org/t/19431/7

I mean the PR as is doesn't need the disable_sigint since it's not implementing the cfunction side correctly anyway.

What do you mean by "implementing the cfunction side?" reenable_sigint usage? What is the correct implementation?

how is it "better than not having any signal handling code?

I said, #574 is no better than not having any signal handling code

(You misunderstood what it means. It's not it = #574. Please notice that I don't quote "no.")

I wanted to know in what way "a sigatomic on the cfunction" (= it) improves signal handling during ccall and/or Julia functions called inside ccalled C functions. So far, it sounds like the new feature you are talking about just eliminates disable_sigint wrapping ccall and reenable_sigint in the callback but does not introduce any improvements other than convenience (of not needing to write disable_sigint). On the other hand, you said #574 is "a really bad way" so there is clearly something you have in mind. I'd like to know what is it. Performance improvements?

You certainly still need exception handling code and that's the code that interacts with python, not signal handling.

I'm pointing out one direction of really improving #574 feature-wise, by handling signals during ccall. It perhaps does not match with what you have in mind.

@yuyichao
Copy link
Collaborator

That's what you said in https://discourse.julialang.org/t/19431/7

Yes, that's assuming a complete implementation on the cfunction side. If the behavior of #574 satisfies what you need then you don't need disable_sigint.

What do you mean by "implementing the cfunction side?" reenable_sigint usage? What is the correct implementation?

re-enable the signal after you setup the try-catch block to make sure,

Please notice that I don't quote "no."

Well, then I assume you realized that your quote completely changed the meaning of the original sentence. I'm not sure why you are quoting then..... Which is probably not important but that's really confusing..............

So far, it sounds like the new feature you are talking about just eliminates disable_sigint wrapping ccall

Yes

reenable_sigint in the callback

No.

but does not introduce any improvements other than convenience
On the other hand, you said #574 is "a really bad way"

Well, removing disable_sigint also remove all the code it comes with. In fact, I don't care about convinience at all. I'm perfectly fine with requiring complex syntax as long as the final result (the code being executed) matches what it should be. This is especially true since macros can fix most unnecessary syntax complexity anyway. I only care about the code being executed/work being done. It just doesn't make sense to run additional code for every single ccall. As I said, your real issue is in the callback and there's no reason to fix it elsewhere.

I'm pointing out one direction of really improving #574 feature-wise, by handling signals during ccall. It perhaps does not match with what you have in mind.

Well, I have no idea what API's are used by python for signal handling and I have no idea how that's related to this discussion. AFAICT, before you mentioning it, all the discussions about signals has been about how julia handles signals, which is why I feel it's OK to use signal handling to refer to julia signal handling.

@tkf
Copy link
Member Author

tkf commented Apr 30, 2019

re-enable the signal after you setup the try-catch block to make sure,

It's not clear to me what you mean by that. My guess has been

reenable_sigint() do
    try
        ...
    catch
        ...
    end
end

is bad, and

try
    reenable_sigint() do
        ...
    end
catch
    ...
end

is good. Is it correct? I suppose I still need to wrap all the ccalls that may call the callback function that has reenable_sigint() in it by disable_sigint, right?

Also, my understanding for why the second code is correct is that this use of reenable_sigint makes sure that the code between catch and end is guaranteed to be not interrupted. It would be great if you can confirm that.

That's what you said in https://discourse.julialang.org/t/19431/7

Yes, that's assuming a complete implementation on the cfunction side.

It's not clear to me what do you mean by "implementing the cfunction side." Is it reenable_sigint usage or the possible improvements in Julia?

If the behavior of #574 satisfies what you need then you don't need disable_sigint.

So you are saying there are two possible correct implementations to achieve the goal of #574:

  1. Use disable_sigint and reenable_sigint correctly. Presumably as in the second code example above.
  2. Do not use disable_sigint and reenable_sigint anywhere in PyCall.

How is the second implementation consistent with the documentation of disable_sigint?

Disable Ctrl-C handler during execution of a function on the current task, for calling external code that may call julia code that is not interrupt safe.

I've been assuming that interrupting the CPython runtime in the middle of its execution is not safe.

@yuyichao
Copy link
Collaborator

Is it correct?

Yes.

I suppose I still need to wrap all the ccalls that may call the callback function that has reenable_sigint() in it by disable_sigint, right?

Currently, yes, but that's really bad.

It's not clear to me what do you mean by "implementing the cfunction side."

Errr, you just said it above. The correct order of try catch and reenable_sigint.

So you are saying there are two possible correct implementations to achieve the goal of #574:

No. I'm saying there is another possible implementation that will behave just like #574 as currently implemented.

How is the second implementation consistent with the documentation of disable_sigint?
I've been assuming that interrupting the CPython runtime in the middle of its execution is not safe.

There's no inconsistency. ccall will never be interrupted even without disable_sigint. "interrupting the CPython runtime in the middle of its execution" will never happen no matter what you do. (OK, the force sigint throwing is the only way for that to happen but that's a safenet to kill something that is not possible to disable)

@yuyichao
Copy link
Collaborator

And in case it's not clear, the only problem with or without disable_sigint or sigatomic_* is throwing an exception out of a cfunction, that can only happen in julia code, not (julia-unaware) C code.

@tkf
Copy link
Member Author

tkf commented May 1, 2019

"interrupting the CPython runtime in the middle of its execution" will never happen no matter what you do.

OK, so I think I've been misinterpreting the documentation. If I'm allowed to put braces in English it would be:

for calling {external code that may call {julia code that is not interrupt safe}}

and not

for calling {{external code that may call julia code} that is not interrupt safe}

Is it correct?

If that's the case, why wrap ccall with disable_sigint? Isn't it enough to do

try
    ...
catch
    disable_sigint() do
        ...
    end
end

in the Julia callbacks that may be called via ccall? Is it for the safety in case you don't control what callbacks are passed?

@yuyichao
Copy link
Collaborator

yuyichao commented May 1, 2019

Is it correct?

Yes.

why wrap ccall with disable_sigint? Isn't it enough to do

Because the exception could be thrown before the try or after the last end or also after catch before disable_sigint or in between the two ends.

That's why I said it's a really local problem to the callback, in particular the entry and exit of the callback that the user cannot control. It's possible to fix there and it should be fixed there.

@tkf
Copy link
Member Author

tkf commented May 1, 2019

Good. That clarifies a lot. Thank you very much.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants