Skip to content
This repository has been archived by the owner on Nov 1, 2020. It is now read-only.

Question: Software write watch and Background GC? #8291

Open
RalfKornmannEnvision opened this issue Aug 31, 2020 · 8 comments
Open

Question: Software write watch and Background GC? #8291

RalfKornmannEnvision opened this issue Aug 31, 2020 · 8 comments

Comments

@RalfKornmannEnvision
Copy link
Contributor

Maybe I am trying to open a box that I should better left closed. In this case just tell me that this is currently way beyond the scope.

I noticed that the runtime code for the ARM64 write barrier has changed since I originally ported it. Now it contains support for  FEATURE_MANUALLY_MANAGED_CARD_BUNDLES and FEATURE_USE_SOFTWARE_WRITE_WATCH_FOR_GC_HEAP. Therefore I was porting these changes over and tried to make use of them.

As there was some code missing in StompWriteBarrier I copied the parts over from the runtime, too. 
This got me to the point were the GC is starting the background thread and is trying to do a background collection. Unfortunately it runs in an assert in a validate method.

The question is if it even make sense at this point to try getting it working or is there just too much stuff missing?

And in any case what parts of my changes do you want to have if any at all?

PS The FEATURE_MANUALLY_MANAGED_CARD_BUNDLES part alone seems to work. At least the code in the writer barrier is executed without crashing. But I am not sure id there is a real test case to ensure it works correctly.

@jkotas
Copy link
Member

jkotas commented Aug 31, 2020

The only thing missing for background GC should be the changes in the write barrier helper.

You may also need to add this to InlinedBulkWriteBarrier in C++ runtime code. It should match InlinedSetCardsAfterBulkCopyHelper in CoreCLR.

The rest should just work. It should be pretty doable to turn this on.

@RalfKornmannEnvision
Copy link
Contributor Author

Will update the other places, too.

In the meantime I found the issue why the background GC gives me an assert. The GC expect that virtual commit behaves in a specific way. All committed memory should be cleared but only the first time a page is committed. When the memory for the background GC is committed it might commit the same page twice as this memory is part of a large block with multiple sections. The sections them self are not aligned to page boundaries. Unfortunately the virtual memory I have doesn't behave that well and I end up with either uninitialized memory or the page is cleared again the second time it is committed.

From a performances point of view the best solution would properly to add a new feature to the GC were is just clear the sections it needs to be zeroed after a commit. But as this would be only need in some special cases were CoreRT needs to run on a system were the virtual memory manager doesn't behave that well I am unsure if you would accept such a pull request. I expect to see this mostly in bare metal situations were the memory is not shared between different processes and therefore there is no risk of leaking information.

If not I need to add some code in the GC to OS interface that makes sure that the same memory is not cleared again when committed multiple times.

@jkotas
Copy link
Member

jkotas commented Sep 1, 2020

The GC should have a good idea about the sections that are committed already. Can this be fixed by fixing the ranges that the GC is passing into the virtual alloc calls?

@RalfKornmannEnvision
Copy link
Contributor Author

Maybe. I haven't stepped through the GC code enough to be sure.

The overlapping commit is caused at gc_heap::commit_mark_array_by_range  (line 25929)
gc.cpp 

    size_t beg_word = mark_word_of (begin);
    size_t end_word = mark_word_of (align_on_mark_word (end));
    uint8_t* commit_start = align_lower_page ((uint8_t*)&mark_array_addr[beg_word]);
    uint8_t* commit_end = align_on_page ((uint8_t*)&mark_array_addr[end_word]);
    size_t size = (size_t)(commit_end - commit_start);

during gc_heap::commit_mark_array_bgc_init for multiple segments. (line 26002)

The code knows the exact range it needs to be committed but align it to the next page boundaries to make sure the region is completely included. It might be that the first page is always already committed but I don't know enough about the GC to be sure about this.

@RalfKornmannEnvision
Copy link
Contributor Author

Found an issue related to this that keep the Background GC from working, too: #8292

@RalfKornmannEnvision
Copy link
Contributor Author

Beside of this double commit issue the background GC is now working. I did some more debugging and it looks like that this memory block contains different management data for a segment. While the whole block is reserved in one go the actual commit is split in two parts. Directly after the reserve anything beside the mark array at the end of the block. The commit of the mark array is postponed until the background GC thread is started. Based on this it looks save for me to move the start of the commit for the mark array to the next page if it starts somewhere in the page that is already committed.

@jkotas
Copy link
Member

jkotas commented Sep 1, 2020

It is best to submit GC fixes to dotnet/runtime repo, and get better testing and review from GC experts there.

@RalfKornmannEnvision
Copy link
Contributor Author

Well it's not really a bug as any of the currently officially supported OS have a virtual memory manager that behaves nicely. I will open a issue over there and ask what solution they prefer

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants