
Benchmark #2

Open · zhoub opened this issue Mar 30, 2016 · 16 comments

zhoub (Contributor) commented Mar 30, 2016

Platform: Windows

Test file: Zombie Static

https://drive.google.com/open?id=0B-wMG_5skSbASk5DamxyV2FpMEE
3M

Local Git

12.206579s wall, 0.499203s user + 3.151220s system = 3.650423s CPU (29.9%)

Milliways

0.323581s wall, 0.312002s user + 0.015600s system = 0.327602s CPU (101.2%)

_store.mwdb 3.2M_

Ogawa

0.282190s wall, 0.109201s user + 0.171601s system = 0.280802s CPU (99.5%)

Test file: Zombie HDF5 2sec

https://drive.google.com/open?id=0B-wMG_5skSbASmE4MHpTbUZRVm8
196M

Local Git

238.399374s wall, 37.689842s user + 52.572337s system = 90.262179s CPU (37.9%)

Milliways

38.366685s wall, 35.147025s user + 3.198021s system = 38.345046s CPU (99.9%)

_store.mwdb 264M_

Ogawa

22.015381s wall, 11.013671s user + 10.951270s system = 21.964941s CPU (99.8%)

aghiles commented Mar 30, 2016

Cool... do revisions work well in Milliways?

pberto (Contributor) commented Mar 30, 2016

We are still testing, but the speed is amazing. Now I will log a couple of questions for @panta.

pberto (Contributor) commented Mar 30, 2016

So, the first question is about size: I think we could get a smaller file if we tweak the block size. What do you think @panta?

My second question is about the file. With Milliways we have everything in one file (say archive.mwdb). Now it would be great if this file could be archive.abc, so that the repository is no longer a directory. I understand this may not be possible since this is handled by libgit/git, but I want to ask anyway in case you have any ideas.

panta (Contributor) commented Mar 30, 2016

Well, for starters, speed is not yet amazing IMHO. I think I can do better, maybe better than Ogawa; the code is not yet optimized for speed :)

Then, regarding block size: we can experiment, but 4k is probably the best size for disk I/O performance. We can improve space utilization by using more advanced logic in space allocation, or by using a really fast compression algorithm like LZ4. Keep in mind that when changing the block size it's also necessary to change the B+-tree B factor (currently 68, see the BTreeFileStorage_Compute_Max_B() function).
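
As a rough illustration of that relation: the key, child-pointer and header sizes below are assumed for the sketch and are not milliways' actual on-disk node layout (that is what BTreeFileStorage_Compute_Max_B() encodes).

```cpp
// Rough sketch of how the maximum B factor follows from the block size.
// All sizes here are assumptions for illustration, not milliways' real layout.
#include <cstddef>

constexpr std::size_t compute_max_b(std::size_t block_size,
                                    std::size_t key_size   = 20,  // assumed key size
                                    std::size_t child_size = 8,   // assumed 64-bit child offsets
                                    std::size_t header     = 32)  // assumed per-node header
{
    // A node must fit in one block: header + B*key + (B+1)*child <= block_size
    return (block_size - header - child_size) / (key_size + child_size);
}

// e.g. compute_max_b(4096) with these assumed sizes; the real B (68 for 4k
// blocks) depends on the actual per-node key/value layout.
```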

About the file: it may be possible, but we'd need to store the refs db inside milliways as well (quite easy) and probably modify libgit2 to dispense with the .git directory completely (no idea how complex that is), which also means keeping .git/config and other amenities inside milliways. We need to investigate this further.

panta added this to the ng.it milestone Mar 30, 2016
pberto (Contributor) commented Mar 30, 2016

Speed: it would be awesome if we could read faster than Ogawa. Really, it would be insane. Do your best.

Block size: is 4k (and the corresponding B+-tree factor) the best for I/O on all operating systems? Remember that Windows is the one that suffers the most. On the other hand, with the new SSD storage this is minimized a lot.

Advanced space allocation / LZ4: whatever can be done fast, I am up for it.

File: let's put this on the back burner for now, but maybe you could ask on the git/libgit mailing list to get the info.

panta (Contributor) commented Apr 1, 2016

OK, the latest commit in panta/pre-alpha/optimizations-1 is about 7x faster than the initial version. In multiverse there will be some other overhead, but it should be significantly faster than before. There are other smaller optimizations possible.
Let me know.

pberto (Contributor) commented Apr 1, 2016

I am literally drooling.

pberto (Contributor) commented Apr 5, 2016

Hello Marco,

OK, we now have a cross-platform build.
Read/write performance seems good (let's do some profiling later and figure out if things can be improved). Right now we have some caches where the size is actually larger. Can you try it on the optimusBotOgawa.abc? We have 1 GB (milliways) vs 768 MB (ogawa). Note that this is just one commit; we should gain from the second commit onward (unless the topology changes completely).

pberto (Contributor) commented Apr 5, 2016

Katana Robot Testing (anime keyframes, but no deform in robot)

24 frames, 2 motion samples.
Done in Maya 2015, MacBook Pro with SSD.

| Backend | Write Time | Size 1st commit (Size on Disk) | Size after 2nd commit | Read Time |
| --- | --- | --- | --- | --- |
| HDF5 | 7.4 s | 136.5 MB | 273 MB | 1.4 s |
| Ogawa | 6.4 s | 134.4 MB | 269 MB | 1.1 s |
| Git | 18.2 s | 69 (95.6) MB | | 9.2 s |
| Git Milliways | 9.7 s | 157.3 MB | 157.6 MB | 3.1 s |

Zombie (Full Deform)

96 frames, 2 motion samples.
Done in Maya 2015, MacBook Pro with SSD.

| Backend | Write Time | Size 1st commit (Size on Disk) | Size after 2nd commit | Read Time |
| --- | --- | --- | --- | --- |
| HDF5 | 22.0 s | 305.6 MB | 611 MB | 5.9 s |
| Ogawa | 15.9 s | 287.7 MB | 576 MB | 3.8 s |
| Git | 56.6 s | 169.5 (253) MB | 338 (499.6) MB | 15.9 s |
| Git Milliways | 25.8 s | 405.3 MB | 810.5 MB | 8.6 s |

40K cubes (copies, not instances)

1 frame, 1 motion sample.
Done in Maya 2015, MacBook Pro with SSD.

| Backend | Write Time | Size 1st commit (Size on Disk) | Read Time |
| --- | --- | --- | --- |
| HDF5 | 139.0 s | 231.5 MB | 960 s |
| Ogawa | 99.0 s | 41.2 MB | 904 s |
| Git | 820.1 s | 57.7 (1350) MB | not done |
| Git Milliways | 173.7 s | 96.4 MB | 994 s |

pberto (Contributor) commented Apr 6, 2016

Added more results.

panta (Contributor) commented Apr 6, 2016

It's definitely better than with the "classic" git backend, except for the worse space utilisation, but I think there is still considerable margin for improvement.
Here are some possible optimizations I've identified:

  • hint block and node cache operations with the operation kind (e.g. to avoid reading a block from disk if the block is about to be overwritten with new contents)
  • explicitly cache the last search result in key-value store operations, since git almost always performs a has() - get() - put() sequence on the same key (see the sketch after this list)
  • pin (wire down) in the caches the blocks and nodes that are used every time
  • evaluate different cache architectures
  • check whether eliminating shared pointers and directly using the actual blocks and nodes would provide benefits
  • when using milliways as a backend, skip JSON (and maybe even msgpack) and use our fast binary serialization
  • misc minor optimizations
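
A minimal sketch of the second point, assuming a generic key-value wrapper (the names and types here are invented for illustration and are not the milliways API): caching the last lookup lets a has()/get() pair on the same key hit the underlying store only once.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Illustrative only: a one-entry memo in front of a key-value store, so that
// git's typical has() -> get() -> put() sequence touches the store just once.
class CachedKVStore {
public:
    bool has(const std::string& key) { return lookup(key).has_value(); }

    std::optional<std::string> get(const std::string& key) { return lookup(key); }

    void put(const std::string& key, const std::string& value) {
        store_[key] = value;
        last_key_ = key;            // keep the memo coherent after a write
        last_value_ = value;
    }

private:
    std::optional<std::string> lookup(const std::string& key) {
        if (last_key_ == key)       // served from the one-entry memo
            return last_value_;
        auto it = store_.find(key); // stand-in for the real on-disk search
        if (it == store_.end())
            return std::nullopt;
        last_key_ = key;
        last_value_ = it->second;
        return last_value_;
    }

    std::unordered_map<std::string, std::string> store_;  // stand-in backing store
    std::optional<std::string> last_key_;
    std::optional<std::string> last_value_;
};
```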

aghiles commented Apr 6, 2016

The problem I see is that the space used at the first commit is always larger than with Ogawa. That is a bit of a show stopper, because 80% of all Alembic assets (and possibly even more) will only ever have one commit. Am I missing something here?

pberto (Contributor) commented Apr 6, 2016

I have no doubt there is plenty of room for optimization, so let's get on with it! 👍

Some comments:

  • even the classic git backend had trouble with production scenes: you can see how it starts to get close to Ogawa in size in the robot test, and it completely dies in the 40K cubes nightmare.
  • Ideally I would like Milliways to be the default writing choice; if the optimizations Marco suggests work out nicely, it will be the natural choice. Basically the classic git backend cannot be used for large assets... I would even remove it from the choice of backends in the DCC App UI (still leaving it available for conversion purposes).
  • personally I think it would be nice to bypass JSON/msgpack and use seriously™ (that's the name, no? :) ). One question though: would it still be possible to abcconvert to a classic git cache afterwards?

pberto (Contributor) commented Apr 6, 2016

Made further edits to my comments above.

panta (Contributor) commented Apr 6, 2016

Regarding space utilisation: we can improve there as well. I am focused on speed now, but I'll tackle this after that.

Yes, seriously is the name, seriously :) And yes, when converting to "classic" git we have to use the old format, JSON and all (it's too useful to be able to edit a text format from time to time).

pberto (Contributor) commented Apr 7, 2016

@panta according to libgit2/libgit2#3566 we are not using any compression. My opinion is that right now zlib is not even being used; we need to experiment with putting in LZ4 or whatever compression makes sense, as this will affect performance.
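
For reference, a minimal LZ4 round trip with the standard LZ4 C API looks roughly like this (a sketch only, not the actual integration; error handling and block bookkeeping are simplified):

```cpp
// Illustrative only: an LZ4 round trip for a block payload using the plain
// LZ4 C API; a real integration would also persist the uncompressed size.
#include <lz4.h>
#include <stdexcept>
#include <vector>

std::vector<char> compress_block(const char* src, int src_size) {
    const int max_dst = LZ4_compressBound(src_size);
    std::vector<char> dst(max_dst);
    const int written = LZ4_compress_default(src, dst.data(), src_size, max_dst);
    if (written <= 0)
        throw std::runtime_error("LZ4 compression failed");
    dst.resize(written);
    return dst;
}

std::vector<char> decompress_block(const char* src, int compressed_size, int original_size) {
    std::vector<char> dst(original_size);
    const int read = LZ4_decompress_safe(src, dst.data(), compressed_size, original_size);
    if (read < 0)
        throw std::runtime_error("LZ4 decompression failed");
    return dst;
}
```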
