cpu-o3: executing MFENCE in two parallel SMT threads causes hang #1049
Comments
I think I've found the root cause of the issue. Thread 0 exits in the same cycle that thread 1's MFENCE instruction [sn:74] is sent from IEW to commit (via the

    void
    CPU::removeThread(ThreadID tid)
    {
        .....
        // Flush out any old data from the time buffers.
        for (int i = 0; i < timeBuffer.getSize(); ++i) {
            timeBuffer.advance();
            fetchQueue.advance();
            decodeQueue.advance();
            renameQueue.advance();
            iewQueue.advance(); // This causes any completed instructions in-flight to commit, even those belonging to different threads, to be lost!
        }
        ....
One fix would just be to clear out all thread-specific state from each time buffer (or just
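The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToySlot`, `ToyTimeBuffer`, and `mfenceReachesCommit` are hypothetical names invented for illustration, not gem5's real classes, and the one-cycle IEW-to-commit delay is assumed. It shows that advancing a shared delay buffer once per slot on thread exit also discards an entry that another thread had in flight.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

constexpr std::size_t MaxThreads = 2;  // assumption: two SMT threads

// Hypothetical slot: completed sequence numbers, kept per hardware thread.
struct ToySlot {
    std::array<std::vector<int>, MaxThreads> completed;
};

// Hypothetical delay line shared by all threads (not gem5's TimeBuffer).
class ToyTimeBuffer {
  public:
    explicit ToyTimeBuffer(std::size_t delay) : slots(delay + 1) {}
    std::size_t getSize() const { return slots.size(); }
    ToySlot &producerSlot() { return slots.back(); }
    const ToySlot &consumerSlot() const { return slots.front(); }
    // Drop the oldest slot and append a fresh, empty one.
    void advance() { slots.pop_front(); slots.emplace_back(); }
  private:
    std::deque<ToySlot> slots;
};

// Returns true if thread 1's instruction [sn:74] is visible to the
// consumer (commit) one cycle after the producer (IEW) wrote it.
bool mfenceReachesCommit(bool flushOnThreadExit) {
    ToyTimeBuffer iewQueue(1);  // assumed one-cycle delay
    iewQueue.producerSlot().completed[1].push_back(74);
    if (flushOnThreadExit) {
        // Mirrors the flush loop: one advance per slot, for all threads.
        for (std::size_t i = 0; i < iewQueue.getSize(); ++i)
            iewQueue.advance();
    }
    iewQueue.advance();  // the normal end-of-cycle advance
    const auto &seen = iewQueue.consumerSlot().completed[1];
    return std::find(seen.begin(), seen.end(), 74) != seen.end();
}
```

In this model, `mfenceReachesCommit(false)` returns true and `mfenceReachesCommit(true)` returns false: the flush triggered by thread 0's exit empties every slot, so commit never sees thread 1's completed instruction.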
Fix gem5#1049. Clear only thread-specific state from the O3 CPU time buffers, rather than clearing *all* state for *all* threads. Change-Id: I48cd50e39b3cd5e0068ddfc19a03d9bbd3f31bd5
Fix gem5#1049. Clear only thread-specific state from the O3 CPU time buffers, rather than clearing *all* state for *all* threads. Specifically, this patch adds a `clearStates()` method for all O3 time buffer structs in src/cpu/o3/comm.hh. Upon thread exit, `CPU::removeThread()` now invokes this method for all structs in each time buffer, rather than flushing out the time buffers (which nukes the states for all threads, not just the exiting one). Change-Id: I48cd50e39b3cd5e0068ddfc19a03d9bbd3f31bd5
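The shape of this fix might look roughly like the following sketch. It is not the actual patch: `ToyIewSlot`, `removeThreadFrom`, and `holds` are hypothetical names, the `clearStates(ThreadID)` signature is an assumption, and the real structs in src/cpu/o3/comm.hh carry different fields. The point is only that each slot drops the exiting thread's entries in place, with no `advance()` calls.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using ThreadID = std::size_t;          // assumption: small thread index
constexpr std::size_t MaxThreads = 2;  // assumption: two SMT threads

// Hypothetical slot type standing in for one time buffer struct.
struct ToyIewSlot {
    std::array<std::vector<int>, MaxThreads> completed;  // sequence numbers

    // Drop only the entries that belong to the exiting thread.
    void clearStates(ThreadID tid) { completed[tid].clear(); }
};

// Clear the exiting thread's state from every slot without advancing,
// so entries belonging to other threads stay where they are.
void removeThreadFrom(std::vector<ToyIewSlot> &slots, ThreadID tid) {
    for (ToyIewSlot &slot : slots)
        slot.clearStates(tid);
}

// Helper: does any slot still hold sequence number `sn` for `tid`?
bool holds(const std::vector<ToyIewSlot> &slots, ThreadID tid, int sn) {
    for (const ToyIewSlot &slot : slots) {
        const std::vector<int> &v = slot.completed[tid];
        if (std::find(v.begin(), v.end(), sn) != v.end())
            return true;
    }
    return false;
}
```

With a two-slot buffer holding an entry for each thread, calling `removeThreadFrom(slots, 0)` removes thread 0's entry while thread 1's [sn:74] remains available to commit.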
Thanks @nmosier! I think your solution is correct.
Describe the bug
Executing an x86 `mfence` instruction in two sibling SMT threads on the O3 CPU causes the CPU to get stuck. Specifically, it appears that one of the `mfence`s reaches the head of the ROB but never becomes ready to execute.
This simple assembly proof of concept (`mfence.asm`) triggers the bug:
Affects version
develop @ c54039d
gem5 Modifications
None
To Reproduce
scons build/X86/gem5.opt
nasm -felf64 -o mfence.o mfence.asm && ld -o mfence mfence.o
./build/X86/gem5.opt configs/deprecated/example/se.py --cpu-type=X86O3CPU --smt --caches -c './mfence;./mfence'
Terminal Output
As you can see, the simulation times out after 10 seconds without both threads exiting.
Expected behavior
Both threads should exit immediately.
Host Operating System
Ubuntu 22.04
Host ISA
X86
Compiler used
GCC 11.4.0
Additional information
None yet