Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New profiler breaks libgit2 ssh transport #2721

Open
tjk opened this issue Mar 24, 2023 · 4 comments
Open

New profiler breaks libgit2 ssh transport #2721

tjk opened this issue Mar 24, 2023 · 4 comments
Assignees
Labels
bug Involves a bug community Was opened by a community member profiling Involves Datadog profiling

Comments

@tjk
Copy link

tjk commented Mar 24, 2023

EDIT: While disabling the new profiler seems to fix the issue, there might be an underlying problem related to different libssl due to different libssh2 that is just heightened with this feature on... will continue investigating and report back.


Current behaviour

Setting profiling.advanced.force_enable_new_profiler to true makes using rugged (libgit2) ssh transport break.

Steps to reproduce

Make sure to compile rugged with :ssh feature, and the do the following:

creds = Rugged::Credentials::SshKey.new(username: 'git', publickey: '/path/to/key.pub', privatekey: '/path/to/key')
Rugged::Repository.clone_at('ssh://git@github.com/<username>/<repo>.git', '/tmp/some-directory', credentials: creds)

You will see an error like Failed to open SSH channel: Error waiting on socket (Rugged::SshError)

Environment

  • ddtrace version: 1.9.0
  • Configuration block (Datadog.configure ...): see above
  • Ruby version: 2.7.7
  • Operating system: Debian Buster
  • Relevant library versions: rugged 1.5.0.1 (means libgit2 1.5.0)
@tjk tjk added bug Involves a bug community Was opened by a community member labels Mar 24, 2023
@ivoanjo ivoanjo self-assigned this Mar 27, 2023
@ivoanjo
Copy link
Member

ivoanjo commented Mar 27, 2023

Thanks a lot @tjk for the report. This definitely looks similar to the mysql2 issue another customer reported (see known issues in the release notes).

You mentioned that

EDIT: While disabling the new profiler seems to fix the issue, there might be an underlying problem related to different libssl due to different libssh2 that is just heightened with this feature on... will continue investigating and report back.

Please do let us know about you find out! I'll work to reproduce this as well... For now indeed the best solution is to not use the new profiler for this application.

@ivoanjo ivoanjo added the profiling Involves Datadog profiling label Apr 4, 2023
@ivoanjo
Copy link
Member

ivoanjo commented Apr 4, 2023

I was able to reproduce this easily -- thanks for the clear notes! It's indeed the same issue as called out in the release notes for the mysql2 gem.

I can confirm the observation that the issue seems to be triggered in the ssh transport, and when cloning via http there seems to be no issue. It's unclear to me if the issue right now is in libssh2 (used by the ssh transport) or libgit2 (I don't think it comes from rugged directly).

@tjk: Actually getting a fix in may take a while, since I'll need to report this to the rugged maintainers (and possibly the libgit2 and libssh2 maintainers as well).

Are you OK with not enabling the new profiler for this specific app while we work on this?

Alternatively, I can look into exposing an API that could be used to wrap the clone_at call to avoid the profiler interrupting it, if that's something you'd be interested in using in your application.

I'll also make sure to document this and add the rugged gem to the list of gems we detect and avoid enabling the new profiler for, so that others don't run into this issue while we work to get the full fix in place!

ivoanjo added a commit that referenced this issue Apr 4, 2023
This is useful when testing if a specific bit of code is affected by
sigprof signal delivery.

I'm using it to test the issue reported in #2721.

I think this helper is small enough/useful enough to keep around for
later.
ivoanjo added a commit that referenced this issue Apr 4, 2023
**What does this PR do?**:

This PR adds the `rugged` gem to the list of gems that prevent the new
CPU Profiling 2.0 profiler from being auto-enabled.

This is needed because the `rugged` gem (more specifically, its
C-level dependencies) do not correctly handle the unix signals that the
new profiler uses.

We plan to do more than just add this workaround, but until we do so,
this change will prevent customers from being bitten by the issue.

**Motivation**:

Make sure that customers being moved to the new CPU Profiling 2.0
profiler have a smooth experience.

**Additional Notes**:

N/A

**How to test the change?**:

The snippet shared by the customer in
<#2721> reproduces the
issue for me every time.

With this change, as expected, the issue no longer appears.
ivoanjo added a commit to DataDog/documentation that referenced this issue Apr 4, 2023
 ### What does this PR do?

This PR extends the troubleshooting steps added in #17370 with a new
known incompatiblity with the `rugged` gem.

(Originally reported in
DataDog/dd-trace-rb#2721 )

 ### Motivation

Document the known incompatibility.
cswatt added a commit to DataDog/documentation that referenced this issue Apr 4, 2023
…17552)

* [PROF-7409] Document Ruby profiler incompatibility with rugged gem

 ### What does this PR do?

This PR extends the troubleshooting steps added in #17370 with a new
known incompatiblity with the `rugged` gem.

(Originally reported in
DataDog/dd-trace-rb#2721 )

 ### Motivation

Document the known incompatibility.

* Update content/en/profiler/profiler_troubleshooting.md

---------

Co-authored-by: cecilia saixue watt <cecilia.watt@datadoghq.com>
@ivoanjo
Copy link
Member

ivoanjo commented Apr 12, 2023

Update: I've reported the issue to the libssh2 developers and also asked the rugged developers if they'd be open to shipping a workaround built-into the gem.

@ivoanjo
Copy link
Member

ivoanjo commented Jun 27, 2023

Looks like there's a PR to fix libssh2 ( libssh2/libssh2#1058 ), hopefully it will get accepted soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Involves a bug community Was opened by a community member profiling Involves Datadog profiling
Projects
None yet
Development

No branches or pull requests

2 participants