New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DolphinQt: JIT Widget Refresh #12714
Draft
mitaclaw
wants to merge
8
commits into
dolphin-emu:master
Choose a base branch
from
mitaclaw:jit-widget-refresh
base: master
Could not load branches
Branch not found: {{ refName }}
Could not load tags
Nothing to show
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
mitaclaw
force-pushed
the
jit-widget-refresh
branch
4 times, most recently
from
April 17, 2024 21:30
d2e8ef4
to
a9c3c6c
Compare
AArch64 platforms without LLVM's disassembler available (e.g. Windows on ARM) have no alternative HostDisassembler to use. Might be worth looking into this in the future. |
mitaclaw
force-pushed
the
jit-widget-refresh
branch
2 times, most recently
from
April 19, 2024 08:08
e5d7213
to
fd5e22c
Compare
mitaclaw
force-pushed
the
jit-widget-refresh
branch
from
May 8, 2024 03:16
fd5e22c
to
d99baa6
Compare
mitaclaw
force-pushed
the
jit-widget-refresh
branch
4 times, most recently
from
May 10, 2024 04:46
2da5f42
to
7f3b68a
Compare
It now supports variable-sized data payloads and memory range freeing. It's a little faster, too.
This saves two register pushes / pops.
WritePC is now needed far less, only for instructions that end the block. Unfortunately, WritePC still needs to update `PowerPCState::npc` to support the false path of conditional branch instructions. Both drawbacks should be smoothed over by optimized cached instructions in the future.
This was a bigger performance boost than I expected.
I tried making InterpretAndCheckExceptions test only the relevant exceptions (EXCEPTION_DSI, EXCEPTION_PROGRAM, or both) using templating, but didn't find a significant performance boost in it. As I am learning, the biggest bottleneck is the number of callbacks emitted, not usually the actual contents of them.
Accessible from DolphinQt and Android.
I added a new `get_stats` member function to the upstream for use in the JIT Widget Refresh.
mitaclaw
force-pushed
the
jit-widget-refresh
branch
from
May 10, 2024 04:49
7f3b68a
to
ea5f6b0
Compare
Fulfilling a certain six-year-old todo, and revamping LogGeneratedX86 as a bonus.
mitaclaw
force-pushed
the
jit-widget-refresh
branch
from
May 10, 2024 07:10
ea5f6b0
to
16c4ef7
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Dolphin's current (soon to be former) JIT widget, even back when it was the JIT window in DolphinWx, has never really done much. The only thing it is equipped to do is respond to the
CodeWidget::RequestPPCComparison(u32 addr)
signal emitted by the "PPC vs Host" action in the Code View widget's context menu. Furthermore, this single thing it does is misleading thanks to a series of questionable choices. In short, the JIT widget (attempts) to show a disassembly of recompiled code and the PowerPC code it came from.How and where does it get all of this from just a single effective address? That's where it gets ugly.
First, the current JIT widget does the host code disassembly by calling
DisassembleBlock
, which in turn usesJitInterface::GetHostCode
.JitInterface::GetHostCode
searches to find a JitBlock with a starting address matching the provided effective address... or any JitBlock with a starting address up to -2000 away. It also needs to know the PowerPC feature flags of the block for this lookup, so it is just uses whatever is current†. If any block is found,JitInterface::GetHostCode
will next check if the effective address lies within the bounds ofblock->effectiveAddress
toblock->effectiveAddress + block->originalSize
††.JitBlock::originalSize
is the size of the originalPPCAnalyst::CodeBuffer
that was recompiled for the JitBlock, and these code buffers do not necessarily represent a contiguous sequence of instructions! What's more more, thanks to branch following, the same effective address can be recompiled multiple times by far-off JitBlocks with starting addresses this function is designed to disregard. I repeat, multiple times; this function does not account for the fact that there could even be more than one answer.† While this would be unacceptable in my refreshed JIT widget, I admit it was probably good enough within the context of the current JIT widget which only receives an address from the Code View widget, which is also bound by the current feature flags.
†† Actually, the upper bound is off-by-one due to it using
>=
rather than>
.Putting all of that aside, the
JitInterface::GetHostCode
will, if a JitBlock is found, return aJitInterface::GetHostCodeResult
containing the starting effective address of the JitBlock, a pointer within the near code cache, and the size of the recompiled near code†, otherwise it will return some flavor ofJitInterface::GetHostCodeError
.DisassembleBlock
will thenstd::visit
the variant to either disassemble†† the near code, or get an error message. All of this is then returned via a second result struct,DisassembleResult
, and finally we are back in the current JIT widget's update function.† Information about the far code cache is completely left out of this exchange; the current JIT widget was designed before that existed.
†† In the case of the CachedInterpreter, the disassembly is done despite the data not containing host instructions. The result is nonsense.
After putting the text from the
DisassembleResult
into a QTextBrowser†† for the host code disassembly, the horror continues. Next, a whole slew of PPCAnalyst classes are spun up with a bunch of hard-coded configurations† to perform an analysis on the PPC address from theDisassembleResult
. This is to create a CodeBuffer like what was originally used to recompile the near code. With this new CodeBuffer in hand, it is displayed in the other QTextBrowser†† for the PPC code disassembly. Finally, some extra stats about instruction counts and blowup are calculated to be stuffed into the end of the PPC code box "since it's generally the shortest". One of these statistics is an "estimated cycle count", which is not accurate for any block with early exits.† While there may be other incongruencies, the one I am aware of is that these hard-coded configuration include
OPTION_CONDITIONAL_CONTINUE
andOPTION_BRANCH_FOLLOW
, which the CachedInterpreter does not use.†† I just want to vent a little. Why did this widget use QTextBrowsers for this instead of QPlainTextEdits? Why was it so important to make the error messages be italic with HTML tags?
So that's everything the current JIT widget is able to do, and honestly it's not very good at it. Technically, the JIT widget also has a refresh button that can run the update function again using the last effective address, but I'm not counting that. So you might be wondering what is the point of the table that takes up half of the widget, and the answer is: it does absolutely nothing! In JITWidget.cpp, this comment says it all:
dolphin/Source/Core/DolphinQt/Debugger/JITWidget.cpp
Line 147 in 3527d97
So with introductions out of the way and motivations clear, I introduce my JIT Widget Refresh. Here are some screenshots of it in action:
Yes, that is fourteen columns. Don't worry, they are can be configured to hide and move around just like in my last major project, the Branch Watch dialog. It's almost difficult to explain all of the features this refreshed JIT widget has, but I will try anyway. First of all, the table now shows JitBlocks contained by the JitCache. This information is only available when emulation is paused. Also shown when emulation is paused is free memory and memory fragmentation info from the JIT on the status bar. Selecting any row will show a disassembly of that JitBlock's original CodeBuffer†, near host code, far host code, and instruction blowup stats on the status bar. The CachedInterpreter is even fully supported, featuring a disassembly of the simple bytecode it uses in the near code box. There are two actions in the context menu for the table. "View Code" shows the selected JitBlock's†† effective address in the Code widget. "Erase Block(s)" will erase individual JitBlocks from the JitCache, forcing those JitBlocks to recompile again without a full JitCache clear. Along the top of the widget are various controls. First, there are filters for min / max effective address, symbol name (derived from effective address), plus a filter for "Recompiles Physical Address" that allows you to see only JitBlocks that recompile a given physical address. Whenever the "PPC vs Host" action is signaled from the Code View widget, the address is translated and put into this filter, solving the multiple recompilations problem. This widget also has buttons mirroring actions in the JIT menu of the MenuBar, including "Clear Cache", a new "Wipe Profiling" action, and a button to start and stop software JIT profiling. Pro tip: you can click on the status bar to show memory stats if the instruction count stats are what's currently being shown.
† This is now done by storing a stripped-down copy of the original code buffer in the JitBlock, and this is only copied into the JitBlock when the debug interface is enabled.
†† When multiple rows are selected, the row that is right clicked (look for the dotted border on the cell) is the JitBlock which the effective address is taken from.
Unfortunately, I was a bit opportunistic with the following change; I wasn't satisfied with just fixing up one crusty relic at a time. There is a compile-time feature in the x86_64 JIT called LogGeneratedX86. Because the changes I made for the JIT widget made it so easy to do, I've updated this old feature to work for all three JITs. I know, not good for review purposes, but it just made sense and was hard to resist. Did you know the current LogGeneratedX86 doesn't even compile in the current master branch?
Alright, lightning round for the remaining changes. UICommon's Disassembler has been totally rewritten to not suck so bad and is now part of Common. Android also has access to JIT profile data wiping in the same debug menu I put the other JIT profiling stuff in. The Qt menu bar as well (this was implied earlier). I added a new
Common::Unreachable
function to mirror C++23'sstd::unreachable
, because I wanted this widget to be as optimized as possible. 🦀JitBlockData::codeSize
is gone 🦀. The physical addressstd::set
is now moved into the JitBlock instead of copied, it should be a lot faster that way. Oh, and I tried something new with the Qt class layouts this time, and I know it sort of violates a coding style guideline by having multiple public/private sections, but I found that by ordering it in this way, it made function dependencies flow chronologically in the header file and the cpp file really nicely. Maybe consider it? Alright, thank you for reading. I'm open to suggestions for how to split this into multiple commits, btw.