dsc_host crashing on CentOS 8 #1282

Open
brianloss opened this issue Dec 8, 2020 · 13 comments

@brianloss
Member

I have installed OMS Agent 1.13.33 on CentOS 8.2. The documentation indicates CentOS 8 is supported. However, the dmesg output shows segfault errors from dsc_host approximately every 15 minutes, several at a time. The messages look like the following:

[Tue Dec  8 22:58:05 2020] dsc_host[840579]: segfault at 28 ip 0000000000433fb4 sp 00007fff5089cdb0 error 4 in dsc_host[400000+94000]
[Tue Dec  8 22:58:05 2020] Code: 89 4c 24 30 64 48 8b 04 25 28 00 00 00 48 89 84 24 a8 00 00 00 31 c0 4d 85 e4 b0 04 0f 84 94 01 00 00 49 c7 04 24 00 00 00 00 <48> 8b 6f 28 49 89 fe 49 8b 50 18 48 8d 05 3e 19 04 00 49 89 cf 4c
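
For readers unfamiliar with these kernel lines, the fields can be decoded mechanically. The following is a minimal sketch, not anything taken from the agent itself: the helper name `decode_segfault` is made up for illustration, and the `error` field is interpreted using the standard x86 page-fault error code bits (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode).

```shell
# Decode a kernel "segfault at ..." line (POSIX sh sketch; helper name is hypothetical).
decode_segfault() (
    # Pull out: fault address, instruction pointer, error code, binary name, load base.
    fields=$(printf '%s\n' "$1" | sed -n \
        's/.*segfault at \([0-9a-f]*\) ip \([0-9a-f]*\) sp [0-9a-f]* error \([0-9a-f]*\) in \([^[]*\)\[\([0-9a-f]*\)+[0-9a-f]*].*/\1 \2 \3 \4 \5/p')
    [ -n "$fields" ] || exit 1
    set -- $fields
    addr=$((0x$1)); ip=$((0x$2)); err=$((0x$3)); bin=$4; base=$((0x$5))

    # x86 page-fault error code bits.
    cause=page-not-present; access=read; mode=kernel
    if [ $((err & 1)) -ne 0 ]; then cause=protection-violation; fi
    if [ $((err & 2)) -ne 0 ]; then access=write; fi
    if [ $((err & 4)) -ne 0 ]; then mode=user; fi

    # Offset of the faulting instruction inside the binary = ip - load base.
    printf 'binary=%s fault_addr=0x%x offset=0x%x access=%s mode=%s cause=%s\n' \
        "$bin" "$addr" "$((ip - base))" "$access" "$mode" "$cause"
)

decode_segfault 'dsc_host[840579]: segfault at 28 ip 0000000000433fb4 sp 00007fff5089cdb0 error 4 in dsc_host[400000+94000]'
# prints: binary=dsc_host fault_addr=0x28 offset=0x33fb4 access=read mode=user cause=page-not-present
```

The marked bytes in the Code line, `<48> 8b 6f 28`, decode to `mov 0x28(%rdi),%rbp`. Together with the fault address of 0x28, that is consistent with reading a field through a NULL pointer, although the thread itself does not confirm the root cause at that level.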
@johanburati

I'm seeing the same error on CentOS 8.1, CentOS 8.2, RHEL 8.1 and RHEL 8.2:

kernel: dsc_host[9960]: segfault at 28 ip 0000000000433fb4 sp 00007ffc964cb850 error 4 in dsc_host[400000+94000]
kernel: Code: 89 4c 24 30 64 48 8b 04 25 28 00 00 00 48 89 84 24 a8 00 00 00 31 c0 4d 85 e4 b0 04 0f 84 94 01 00 00 49 c7 04 24 00 00 00 00 <48> 8b 6f 28 49 89 fe 49 8b 50 18 48 8d 05 3e 19 04 00 49 89 cf 4c

@brianloss did you open a support request for this?

@brianloss
Member Author

@johanburati I thought this was the place to report issues. If there's somewhere else I should report, I'm happy to do so if you can point me in the right direction.

@johanburati

It does not appear that the developers are actively monitoring issues on this project.

If you are having this issue with a specific VM, I would suggest creating a support request by clicking New support request at the bottom of the left menu on that VM's blade in the Azure Portal.

If you do, please let me know the case number or ask the case owner to create a collab with joburati, and I will follow up to get this issue fixed.

@Klaas-

Klaas- commented Jan 13, 2021

It seems dsc_host is part of https://github.com/microsoft/PowerShell-DSC-for-Linux, linking author @zjalali

@johanburati

I cut an issue in the other repo: microsoft/PowerShell-DSC-for-Linux#764

@johanburati

johanburati commented Jan 15, 2021

Putting the workaround here as well so it is easier to find.

It's kind of the same thing as for RHEL 8 and other extensions: Azure/WALinuxAgent#1719

WORKAROUND

This should prevent dsc_host from crashing and filling up the filesystem with core files until the devs push a new release that fixes the issue for good.

The unversioned python command is missing on those images. You can create it and point it to python3 with the following command:

sudo alternatives --set python /usr/bin/python3

Or, if you are seeing issues with python3, you can install python2 and point python to python2:

sudo dnf install python2 -y
sudo alternatives --set python /usr/bin/python2

Tried it on a couple of VMs and did not see any crash after using either option.
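
Before applying either option, it may help to see what the python commands currently resolve to on the machine. This is a read-only sketch that changes nothing; the helper name `resolve_cmd` is made up for illustration, and it assumes GNU `readlink -f` is available (it is part of coreutils on CentOS/RHEL 8).

```shell
# Print the real path a command resolves to, or "missing" if it is not on PATH.
resolve_cmd() {
    path=$(command -v "$1" 2>/dev/null) || { echo "missing"; return 0; }
    # readlink -f follows the whole symlink chain, including /etc/alternatives links.
    readlink -f "$path" 2>/dev/null || echo "$path"
}

for name in python python3 python2; do
    printf '%-8s %s\n' "$name" "$(resolve_cmd "$name")"
done
```

If the `python` line shows `missing`, the machine is in the state the workaround is meant to address.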

@Klaas-

Klaas- commented Jan 15, 2021

I think Microsoft needs to have a real look at all their Python-dependent Linux software. There is another issue, #1292, that seems to have a similar problem :)

Anyway, I'll repeat here what I said in the other issue: this should be addressed by properly recognizing python, not by aliasing the python command. Red Hat does not recommend setting an unversioned python link (see man unversioned-python) on recent RHEL.

@brianloss
Member Author

@johanburati thanks for the follow-up. I kind of lost track of this while on break over the holidays, and didn't get around to filing a support case in Azure. Did you end up doing that, or should I still file it?

> Anyway, I'll repeat here what I said in the other issue: this should be addressed by properly recognizing python, not by aliasing the python command. Red Hat does not recommend setting an unversioned python link (see man unversioned-python) on recent RHEL.

The release notes for 1.13.33 explicitly state that there is no longer a requirement to install python 2 or alias the python command to it (see here). Those notes appear to be incorrect because of the "broken" dependency on PowerShell-DSC-for-Linux.

@Klaas-

Klaas- commented Jan 15, 2021

@brianloss as a customer myself, I would suggest filing an issue with Microsoft. That way Microsoft can properly qualify the impact of the issue. EDIT: I see in your profile you're with Microsoft :D so maybe you can escalate the existing issue internally. 121011325001609

@johanburati

johanburati commented Jan 16, 2021

@brianloss there are already a couple of support cases open for this, the issue has been escalated, and from what I've heard the devs are working on a new version that will work with python3.

It is true that what I submitted is more of a workaround, but at least it will prevent the program from crashing and generating core files until the devs release a new version that fixes the issue.

I included the instructions to install python2 just in case, because when I first reported the issue with the CentOS/RHEL 8 images and python (Azure/WALinuxAgent#1719), some extensions had other issues with python3. That was in 2019, so maybe they all work fine with python3 now. I've seen notes that indicate some components still do not work with python3, but I did not see any issue after using the workaround in my lab.

I hope this will help other people having this issue.

@johanburati

@Klaas- thanks for sharing the case number, I've sent you an update by email.
@brianloss please check the notes on @Klaas-'s ticket, I have added the link to the internal ticket with the devs.

@1lolbus1

Thanks for the solution, @johanburati!
I ended up using a symlink to the same effect:

ln -s /usr/bin/python3 /usr/bin/python

@Klaas-

Klaas- commented May 17, 2021

This seems to be fixed by omsconfig-1.1.1-930.x86_64. You should no longer recommend the python workarounds in the docs :)
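
To check whether a machine already has that omsconfig build or a later one, something like the following might do. It is only a sketch under stated assumptions: the threshold 1.1.1-930 comes from the comment above, it assumes later builds keep the fix, the helper name `version_ge` is made up for illustration, and `sort -V` is used as an approximation of RPM's own version comparison.

```shell
FIXED_VERSION="1.1.1-930"   # build reported as fixed in the comment above

# Succeeds when version $1 is greater than or equal to version $2.
# Uses GNU sort -V, which approximates (but is not identical to) RPM's ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# rpm -q exits non-zero when the package is not installed.
if command -v rpm >/dev/null 2>&1 &&
   installed=$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' omsconfig 2>/dev/null); then
    if version_ge "$installed" "$FIXED_VERSION"; then
        echo "omsconfig $installed: at or above $FIXED_VERSION"
    else
        echo "omsconfig $installed: older than $FIXED_VERSION"
    fi
else
    echo "omsconfig not installed, or rpm unavailable"
fi
```

On a machine reporting an older build, the python workaround described earlier in the thread would presumably still be needed.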
