mmap 4096 bytes at (nil): Cannot allocate memory #561
Comments
Hi @srhb. This is a known issue that we've been discussing pretty extensively on taffybar's matrix channel. Believe it or not, it seems to somehow be related to a change in the kernel (it doesn't happen on any 5.* kernels). I haven't had a ton of time to look into the root cause yet, but one issue that we have is that we don't have a consistent repro. Do you have a set of steps you can take that consistently cause this issue? Have you tried removing individual widgets to maybe determine which one might be causing it? |
Edit: Nope, false alarm. The mem graph alone does it too.
|
This also still exhibits the problem:
|
@srhb Okay that's pretty helpful, but also a bit puzzling. There's not too much going on when there are no widgets. |
I just started seeing this after a system update, too. My core dump message looked slightly different than srhb's though; the stack trace at the end also mentions
Full logs at the time of the crash just in case there is anything else relevant I've missed:
(The I'm not doing anything with virtual monitors. My taffybar is also run using the Home Manager module, but I build my config statically via a cabal package and run that rather than the binary Home Manager would provide. My system and my home environment are both managed by a single Nix flake. When I started seeing this was just after an update (yesterday) that did this:
So the difference is ultimately traceable, if we can figure out what to look at (I could trace the exact taffybar source used in both of those states for example, but there haven't been any commits in that time, let alone releases, so they should be the same). I'm pretty sure that change didn't update the Linux kernel from 5.* (it's now 6.1.27), but I'm going to roll back to the previous generation and check after I finish writing this comment. All that said, I might be wrong about taffybar only coredumping since that update. It's only this morning that taffybar has been dying entirely, but over the past few weeks I have been noticing it going completely blank for a fraction of a second every now and then. It occurs to me that might have actually been it dying and being restarted by systemd, and it's only since yesterday's update that it's coredumping so many times in a row that it exceeds systemd's auto-restart limit. :( |
Okay, my previous version is on Linux kernel 6.1.25. So there was a kernel update (to 6.1.27), but not the major update from 5 (that happened a couple of months ago, according to my generation history). Taffybar hasn't coredumped within a few minutes. I have seen the flicker I was talking about, and no coredump in the logs, so that wasn't the same thing after all. So something changed between those two nixpkgs commits that is triggering this. If it is the Linux kernel, then it's something between 6.1.25 and 6.1.27. I'm happy to try anything you think would generate useful information. |
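For anyone else trying to tell whether their machine falls in the window described above, here is a minimal sketch that compares a kernel release string against the two versions from this comment (6.1.25 as the last one seen working, 6.1.27 as the first one seen crashing). The helper names are hypothetical, and the window is just this one report, not a confirmed bisection.

```python
import platform
import re

# Versions taken from the comment above; this is one user's observation,
# not a confirmed bisection of the kernel change.
LAST_SEEN_GOOD = (6, 1, 25)
FIRST_SEEN_BAD = (6, 1, 27)


def parse_kernel_release(release: str) -> tuple:
    """Extract the leading numeric version, e.g. '6.1.27-foo' -> (6, 1, 27)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing patch component (e.g. '5.15') is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def in_suspect_window(release: str) -> bool:
    """True if the release is newer than the last good kernel and no newer than the first bad one."""
    version = parse_kernel_release(release)
    return LAST_SEEN_GOOD < version <= FIRST_SEEN_BAD


def running_kernel() -> str:
    """Release string of the running kernel (on Linux, the same value as `uname -r`)."""
    return platform.release()
```

Calling `in_suspect_window(running_kernel())` on the affected machine gives a quick yes/no. Tuples compare element by element, which is why the version is parsed into integers rather than compared as a string.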
@cumber Cool, all of that info is pretty helpful. I was never 100% sure which kernel version caused the problem, so a difference between 6.1.25 and 6.1.27 could be the cause, although it is surprising that it would be such a minor version change.
Can you just put taffybar logging in ultra verbose mode? |
I'll also note this. I was seeing this problem consistently on my machine for a while, then I did a nixos system rebuild where I bumped nixpkgs and it went away. Interestingly, my kernel version is 6.1.23 atm. I've already garbage collected my old generation so I'm not sure what I was on before, but I doubt it was a later version. |
I will say it takes longer to trigger with no widgets. With just the clock widget I can trigger it much, much more often. |
So, I switched back to the Nixos build with kernel 6.1.27, and ran with the minimal Taffybar config from @srhb's earlier comment. I just added:
before the I then ran for over 4 hours without seeing any coredumps. However when I took the logging calls out, rebuilt and I then went switched back to my full normal Taffybar config, and added the DEBUG logging in there. It also seems more stable when I have the DEBUG logging than normal, but I did see it coredump while I had DEBUG logging enabled. Here's the relevant section of
|
If I'm not just seeing things (or it's a coincidence), and the extra logging does make it less likely to happen, that probably suggests some sort of race condition somewhere? |
I found a thread talking about an issue that sounds like it could be the same issue: https://bbs.archlinux.org/viewtopic.php?id=282429 There the conclusion was that it's a kernel bug that's been fixed in 6.2.9, and also that GHC 9.4 avoids the problem. If that's true then nothing to do here, really. (The details of the GHC & kernel level went completely over my head, so I couldn't get an impression of how likely it was that unrelated problems would present with the same |
I'm out of time for tinkering with my config today, but is Taffybar already GHC 9.4 ready? I'll try building with 9.4 and see if that fixes it (default GHC in nixpkgs is 9.2 in my versions) |
I just tried with kernel 6.2.14 and it still crashes for me (with my regular setup) |
I mean I don't think there's any reason taffybar wouldn't work with a newer ghc, but I haven't tried it. You should be able to switch the ghc version by using an alternative haskellpackages. |
Thanks for all your investigation @cumber. Looks like the issue might be happening when we are allocating space for the gtk.Image. I wonder if we can replicate the issue quickly by just making a ton of those in a loop. |
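As a much cruder cousin of the "allocate a ton in a loop" idea, here is a sketch that simply creates many anonymous 4096-byte mappings, the size named in the error message, and reports how many succeeded. This is an assumption-heavy sanity check at the kernel level only: it does not go through GHC's runtime or GTK, so a clean run here says nothing definitive about the taffybar crash, and a real repro would still need a Haskell program creating images. The function name is made up for illustration.

```python
import mmap

PAGE_SIZE = 4096  # size from the "mmap 4096 bytes" error message


def map_many_pages(count: int) -> int:
    """Create `count` anonymous PAGE_SIZE mappings; return how many succeeded."""
    mappings = []
    try:
        for _ in range(count):
            # fileno -1 requests an anonymous mapping, not one backed by a file.
            mappings.append(mmap.mmap(-1, PAGE_SIZE))
    except OSError as err:
        # ENOMEM here would correspond to "Cannot allocate memory".
        print(f"mapping {len(mappings) + 1} failed: {err}")
    finally:
        succeeded = len(mappings)
        for mapping in mappings:
            mapping.close()
    return succeeded


if __name__ == "__main__":
    print(map_many_pages(1000), "mappings succeeded")
```

Keeping the count well below the system's mapping limit (`vm.max_map_count` on Linux) avoids confusing an ordinary resource limit with the bug being chased.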
In the link that @cumber posted it was indicated that the patch still hasn't made it into mainline yet (which seems strange, it's a pretty serious bug) I think. Maybe trying to compile with ghc 9.4 is our best bet for now. |
Someone said the patch was queued for 6.2.9, and then someone else said upgrading to 6.2.9 fixed it, so I had assumed the earlier "not in mainline yet" comment was out of date. But now I see that someone else later reported the issue with 6.3.1, so maybe not. |
Building with ~ghc94 in nixpkgs is going to take some work with haskell-gi, because,
Can maybe look at that later in the week, if no one beats me to it. |
@srhb See https://discourse.nixos.org/t/haskell-ghc-9-4-4-flake-extra-dependencies/24777 which seems to be the same issue. I've tried doing what was suggested, but it seems to have required a recompile of everything: |
@cumber the build is now working with ghc94, but I don't think it seems to fix the issue. Mind giving it a shot? EDIT: nvm. I was an idiot before and was still actually using my old taffybar. ghc94 definitely seems to fix the issue. |
Looks promising so far! I'll report back in a day or so, zero crashes yet. |
Still no crashes. I'd say we're home free, thanks! 👌 |
@srhb well, I mean it's not really fixed per se. There's just a workaround, which is to use a newer ghc. I'm thinking maybe we should leave this open for the sake of discoverability until it's fixed in the kernel? |
Fine by me! :) |
@IvanMalison I'm having a lot of trouble building Taffybar with GHC 9.4; what process did you use? The changes you made to get GHC 9.4 builds working seem to have been all in flake.nix, so I presumed I would need to build from the flake to benefit from them. (As opposed to overriding the source used in the nixpkgs packaging of taffybar, or anything like that) Since my normal config is to use Taffybar as a library to build an executable with my own Cabal project, I added "github:taffybar/taffybar/master" as an input to my system flake, applied Then I tried just cloning and building your flake independently, to take all of my stuff out of the mix, but I can't get that to work either. I get an error about |
w.r.t. this specific issue, I don't think taffybar uses the gtk3 package anymore (it switched to the gi-gtk package 3 years ago). Perhaps you have gtk as a dependency of your library even though it isn't necessary?
https://gist.github.com/IvanMalison/44b042239d7eb913a11033422d4be4c8 My suspicion is that it's just about nixpkgs version, but also, the way I'm fixing all of these libraries is a little jank. You should be able to use: to do it in a more principled way. |
@cumber also if you have your dotfiles somewhere, I could probably take a look and see if I can figure out what the issue is. |
@cumber Also, it would be great if you joined the taffybar matrix chat: https://matrix.to/#/#taffybar:matrix.org I can probably help you in realtime a bit better there, and I'd love to have more taffybar users (especially ones like you that seem to have haskell and nix experience and could maybe help others) there! |
@IvanMalison You're right, the With that removed, using the taffybar overlays does work. I've been running with a GHC-9.4-built Taffybar for a couple of hours now, and no segfaults! |
Dangit, just had another coredump. First one in about 12 hours of uptime since I got the 9.4 build running. The log messages were slightly different here, though. There's no message about Here's the full log just in case it's useful (probably not though):
Any idea whether this is even the same issue? Given I'm on a "temporary" setup until the kernel issue (and the Cabal issue that required all those flake.nix hacks) is fixed and makes it into nixpkgs, I'd be happy to just wait and see if it still happens once everything is settled. If it's hours between crashes systemd just restarts it and it's not a big deal. |
Hmm, so it is a segfault, but it doesn't seem like it is the same issue, because I don't see the mmap message. I don't think we can definitively rule out that it's the same thing, but I'd lean toward it being different. Keep an eye on it? The fact that it seems much less frequent is also a sign that it's likely different, but I would be curious to see if this does happen again and with what frequency. |
wdym by temporary setup? Actually I think we should be able to get taffybar building on ghc94 as standard in nixpkgs without too much effort. I just haven't gotten around to it yet. |
@IvanMalison By temporary, I only meant using your flake instead of just using I'll admit I haven't fully grokked what was actually done to get GHC 9.4 builds working, beyond a vague notion that it's explicitly providing a bunch of non-Haskell libraries that used to be found via deeper dependencies. |
Nixpkgs now seems to have a kernel new enough to have fixed the issue that was crashing Taffybar, so no longer need to build with GHC 9.4 (which avoided the issue). See taffybar/taffybar#561
It's happened again a couple of times. Caught the stack trace again just now:
Same as last time. Still not the My system is now running Linux kernel 6.3.3, which seems to have fixed the original issue. I had switched back to building from nixpkgs (with the default |
It now works with the latest NixOS unstable (NixOS 23.11pre494976.7c67f006ea0 Tapir) with 6.1.34 kernel |
Describe the bug
Taffybar often coredumps for me
To Reproduce
I'm not quite sure, but it might be related to virtual monitors set up with xrandr (I split my ultrawide into 3 panes like this:)
It seems to happen sporadically in this setup. I am also using the gtk-sni tray.
I am unsure what kind of debug information would be helpful here. I can rebuild taffybar with whatever flags will be helpful.
Expected behavior
No coredump please 😸
Version information
4.0.0 from the overlay in this repo:
taffybar: github:taffybar/taffybar/5d1685f87ecbf283119110d002813d82f74342ea
Installed via the home-manager taffybar module
home.services.taffybar.enable = true