Dunkel is dead #66

Open
inducer opened this issue Oct 3, 2022 · 4 comments

Comments

@inducer
Contributor

inducer commented Oct 3, 2022

Currently, there are lots of messages like this in dmesg:

[Mon Oct  3 13:52:42 2022] EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
[Mon Oct  3 13:52:42 2022] EDAC sbridge MC1: CPU 0: Machine Check Event: 0 Bank 7: cc00418000010091
[Mon Oct  3 13:52:42 2022] EDAC sbridge MC1: TSC 0 
[Mon Oct  3 13:52:42 2022] EDAC sbridge MC1: ADDR 2031c14940 
[Mon Oct  3 13:52:42 2022] EDAC sbridge MC1: MISC 150481a86 
[Mon Oct  3 13:52:42 2022] EDAC sbridge MC1: PROCESSOR 0:406f1 TIME 1664823350 SOCKET 0 APIC 0
[Mon Oct  3 13:52:42 2022] EDAC MC1: 262 CE memory read error on CPU_SrcID#0_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0x2031c14 offset:0x940 grain:32 syndrome:0x0 -  OVERFLOW area:DRAM err_code:0001:0091 socket:0 ha:0 channel_mask:2 rank:1)
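
To see which DIMM the reports point at, the `EDAC MCn:` summary lines can be tallied per DIMM label. The following is only a rough sketch, not something that was run on dunkel: it assumes that the number right after `EDAC MC1:` is the error count for that report and that the token after ` on ` is the DIMM label, as in the excerpt above. The names `EDAC_LINE` and `tally_edac_errors` are made up for illustration.

```python
import re
from collections import Counter

# Matches the per-report summary line, e.g.
# "EDAC MC1: 262 CE memory read error on CPU_SrcID#0_..._DIMM#0 (channel:1 ..."
# Assumption: the leading number is the error count carried by that report.
EDAC_LINE = re.compile(
    r"EDAC (?P<mc>MC\d+): (?P<count>\d+) (?P<kind>CE|UE) .*? on (?P<label>\S+) \("
)


def tally_edac_errors(lines):
    """Sum reported error counts per (controller, DIMM label, CE/UE)."""
    totals = Counter()
    for line in lines:
        match = EDAC_LINE.search(line)
        if match:
            key = (match["mc"], match["label"], match["kind"])
            totals[key] += int(match["count"])
    return totals


# The summary line from the excerpt above, plus one line that should be ignored.
sample = [
    "[Mon Oct  3 13:52:42 2022] EDAC sbridge MC1: HANDLING MCE MEMORY ERROR",
    "[Mon Oct  3 13:52:42 2022] EDAC MC1: 262 CE memory read error on "
    "CPU_SrcID#0_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0x2031c14 "
    "offset:0x940 grain:32 syndrome:0x0 -  OVERFLOW area:DRAM "
    "err_code:0001:0091 socket:0 ha:0 channel_mask:2 rank:1)",
]

for (mc, label, kind), count in tally_edac_errors(sample).items():
    print(f"{mc} {label}: {count} {kind}")
```

If every report names the same label, that would suggest a single bad DIMM rather than a broader problem. The `OVERFLOW` flag in the excerpt also hints that the true count may be higher than what is reported.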

cc @rckirby

@inducer
Contributor Author

inducer commented Oct 3, 2022

They seem to be happening every few seconds or so.

Rebooted it, to see if that helps.

@rckirby

rckirby commented Oct 3, 2022 via email

@inducer
Contributor Author

inducer commented Oct 15, 2022

https://www.complang.tuwien.ac.at/anton/failing-memory.html has a description of someone troubleshooting a similar issue.

@inducer
Contributor Author

inducer commented Oct 19, 2022

@lukeolson What's the latest here? @kaushikcfd will move the GPU out of dunkel this afternoon so that the GPU stays usable.

inducer changed the title from "Dunkel may have a memory failure" to "Dunkel is dead" on Apr 14, 2024

2 participants