Abysmal performance for hash algorithms in dart:crypto #5
Comments
Comment by larsbak: As a starting point we should add a benchmark to trace performance of MD5: #import("dart:io"); main() { Set owner to @madsager.
Comment by madsager: There is a lot that can be done to improve performance here. The data is being copied around way too much. Also, the code overflows smi range and (at least it used to be the case that) a lot of the medium-size integer operations ended up in runtime calls. cc @iposva-google.
Comment by madsager: Added Triaged label.
Comment by iposva-google: Mads, let's work together on creating a repeatable test case. I don't think file reading is really needed here.
Comment by lrhn: Removed Area-Library label.
Comment by iposva-google: I wrote a little test just now and it looks like we can close this bug. Abysmal is certainly not what we see anymore:
import "dart:io"; readFileFully(var path) { main() { // Now compute the MD5 hash on the read bytes.
Gives this set of results on my MacBook:
dalbe[runtime] ./xcodebuild/ReleaseX64/dart --package-root=./xcodebuild/ReleaseIA32/packages/ ~/dart/Bug4611.dart
dalbe[runtime] time md5 ./xcodebuild/ReleaseX64/dart
This means we calculate the MD5 hash for a 9.41MB file in 810ms, which is significantly better than the reported 6MB in 22 seconds.
Added AssumedStale label.
Comment by kevmoo: Removed Library-Crypto label.
This may be a slightly contrived example, but I was recently comparing Dart performance to Python performance on a set of 'programming challenges' and am still seeing VERY weak performance from Dart's MD5 hashing. The following code computes about 10 million hashes and takes roughly 18-20 seconds on my MacBook, whereas the equivalent Python code (using hashlib) runs in under 5 seconds.
Of course hashlib is implemented in C, so the speed is to be expected, but it also clearly demonstrates that Dart is nowhere near the same ballpark without implementing this stuff in C.
Thanks for the report. Would you mind providing the version of Dart that you used? Did you have checked mode turned on? And what version of crypto?
I actually just updated my comment; it seems there was an issue with my initial Python run, and the real number is closer to 5 seconds vs. Dart at 20 seconds. This is using Dart 1.13.0. The numbers are a little more reasonable now, but it's still a significant performance hit.
Thanks for the additional comments. I'm not sure this issue should be closed. Reopening. cc @sgmitrovic if he's curious.
I see that |
Is this issue still being worked on?
On my laptop (Ubuntu 19.10):
$ dart2native dart_md5.dart -o dart_md5
$ time dart_md5 video.mpg
54cd56fa1fcf144b21d48357e4e694a3
real 0m1,498s
user 0m1,584s
sys 0m0,038s
$ time md5sum video.mpg
54cd56fa1fcf144b21d48357e4e694a3 video.mpg
real 0m0,191s
user 0m0,187s
sys 0m0,004s
On my Raspberry Pi (Raspbian):
$ dart2native dart_md5.dart -o dart_md5
$ time dart_md5 video.mpg
54cd56fa1fcf144b21d48357e4e694a3
real 0m7.923s
user 0m7.972s
sys 0m0.181s
$ time md5sum video.mpg
54cd56fa1fcf144b21d48357e4e694a3 video.mpg
real 0m0.646s
user 0m0.514s
sys 0m0.133s
Here's the relevant part of my code:
static Future<String> _hash(File file) async =>
    (await md5.bind(file.openRead()).first).toString();
Dart VM version: 2.7.0
Originally opened as dart-lang/sdk#4611
This issue was originally filed by dha...@google.com
Test code:
new File('recordroyale.ogg').readAsBytes().then((buf) {
var md5 = new MD5();
md5.update(buf);
print(md5.digest());
});
where 'recordroyale.ogg' is about six megabytes. This takes 22 seconds on my machine. Shelling out to md5sum(1) takes under 0.02 seconds. I understand that Dart won't ever be as fast as C, but with this disparity, the hash algorithms should be implemented in C[++] rather than Dart.
I had a vague suspicion that this was slow due to unnecessary allocations, but the allocations are mostly small and consumed quickly. Refactoring to eliminate unnecessary allocations resulted in no change.
My use case involved checksumming a byte array that existed only in memory at the relevant location; it is far less convenient to write data to a temporary file, shell out to md5sum(1), then delete that file.
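For anyone trying to reproduce the in-memory case today, here is a minimal sketch, assuming a recent `package:crypto` where the top-level `md5` object and its `convert` method are available (the `new MD5()` / `update` / `digest` calls in the test code above are from the older API). The helper name `md5Hex` and the synthetic six-megabyte buffer are hypothetical stand-ins, not part of the original report, and no timing result is implied.

```dart
import 'dart:typed_data';

import 'package:crypto/crypto.dart';

/// Hypothetical helper: returns the lowercase hex MD5 digest of [bytes].
String md5Hex(List<int> bytes) => md5.convert(bytes).toString();

void main() {
  // Assumption: a synthetic buffer of about six megabytes stands in for the
  // in-memory byte array described in the report.
  final data = Uint8List(6 * 1024 * 1024);
  for (var i = 0; i < data.length; i++) {
    data[i] = i & 0xff;
  }

  // Time only the hashing step, not the buffer setup.
  final watch = Stopwatch()..start();
  final hex = md5Hex(data);
  watch.stop();

  print('md5: $hex');
  print('elapsed: ${watch.elapsedMilliseconds} ms');
}
```

Running this under the VM and as a compiled binary, and comparing against `md5sum` on a file with the same contents, gives a file-I/O-free measurement along the lines iposva-google suggested.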