New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ignore page_collection_lock #1838
base: dev
Are you sure you want to change the base?
Conversation
@@ -677,9 +676,7 @@ page_collection_lock(struct uc_struct *uc, tb_page_addr_t start, tb_page_addr_t | |||
continue; | |||
} | |||
if (page_trylock_add(uc, set, index << TARGET_PAGE_BITS)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But you also remove these function calls, which have side effects obviously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think page_trylock_add
also does nothing other than allocating memories since it's ignoring locks. or did I miss anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another side effect is page_entry_destroy
call when destroying objects, but it's also same. page_lock
is no-op in Unicorn, so I think it's a meaningless call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
page_collection_lock
is used by tb_invalidate_phys_range
, which is not non-ops in my experience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
page_trylock_add
also inserts new pages, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm interesting. I'm running it without page_collection_lock
/unlock
, but no problem yet. tb_invalidate_phys_range
uses the page collection only for page_collection_unlock
call.
page_trylock_add
does not insert new pages. It creates a page entry and insert it to the tree, but it's not used anywhere. I guess it's only for managing order of locks.
For the speedup, I suggest you sharing the backtrace and |
Here is a snippet of profiling. Profiled by WPR and tested on Windows. Dev branch
With this fix:
Percentage means a proportion of CPU usage. Execution time is reduced from 370s to 116s. |
That's it and it matches my guess, this is slow because unicorn (qemu) has to ensure previous writes does invalidate any previous dirty code pages. This is specially handled due to Self Modifying Code (refer to qemu whitepaper if you wish).
I think a better solution is to check how qemu solves this because I feel that it shouldn't have too much overhead. Maybe we miss some core mechanism when manipulating qemu code. |
Oh, thanks for the hint. Now I see this problem: Unicorn always returns |
Thanks for your hint. This is an old known issue I notice when I developed Unicorn2. The root cause is a bit long story but keep it for short: for uc1 compatibility . However, now we have some ctl for flushing code cache so this could be improved in some way but I need to figure it out. |
Unicorn is already ignoring page locks, so we also don't need to manage the page collection and its tree objects. It helps us to avoid redundant object allocations and speed up the execution 3x faster for my case.