Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try Camillo's dictionaries #174

Open
GoogleCodeExporter opened this issue Mar 24, 2015 · 8 comments
Open

Try Camillo's dictionaries #174

GoogleCodeExporter opened this issue Mar 24, 2015 · 8 comments

Comments

@GoogleCodeExporter
Copy link

Look at mail thread titled "Migrated HashTable from ss to SmalltalkHub".

Original issue reported on code.google.com by tinchod...@gmail.com on 11 Feb 2013 at 4:12

@stale
Copy link

stale bot commented May 18, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will remain open but will probably not come into focus. If you still think this should receive some attention, leave a comment. Thank you for your contributions.

@stale stale bot added the stale label May 18, 2021
@stale stale bot removed the stale label Oct 30, 2021
@theseion
Copy link
Owner

Might be nice to revisit at some point.

@tinchodias
Copy link
Collaborator

I searched the subject in my email account and found it, but tried with Google to place here a URL to the mailing-list discussion and didn't find it.

However, I can summarize it: In Feb, 10th 2013, Stef sent an email announcing the repository migration of HashTable, then Sven asked what about it, and Camillo Bruni answered:

I think it's a dictionary implementation based on buckets (as any other decent
dictionary implementation does...)

I've shown in my master thesis that a simple bucket-based dictionary can outperform our poor-mans single array collection dictionary in almost every operation. The only drawback is the slightly bigger size in memory.

http://scg.unibe.ch/archive/masters/Brun11a.pdf

I just looked at sthub and found a match: http://smalltalkhub.com/Moose/HashTable/
But I remembered there is gh/pharo-containers gathering this kind of projects and found a repo that should be a more recent successor: https://github.com/pharo-containers/Containers-HashTable

@theseion
Copy link
Owner

It's neither. That HashTable is pretty much a slower version of the standard dictionary with some additional abstractions (e.g. uses a wrapping element). Cami's dicts are part of Pinocchio and Pinocchio includes the benchmark suite he describes in his thesis. Pretty nice!

http://squeaksource.com/@7M9PFWFa1IUf1xjv/jOl2eyM7

@theseion
Copy link
Owner

I spent a bit of time analyzing FLLargeIdentityDictionary using Camillo's benchmarks. I noticed that the performance is generally comparable to Camillo's dictionaries but becomes really bad when inserting a large number of objects with little hash collisions, which is pretty much our standard case (since we use #identityHash). The issue is that we use a fixed number of buckets and these buckets can, therefore, become very large which, in turn, is an issue because we need to iterate over the entire bucket and check every entry before we know that it's a new entry.

The other issue I found, although I didn't look at it too much, is, that SmallInteger>>largeIdentityHash performs much worse then SmallInteger>>identityHash, both in terms of performance and (from what I saw in my brief look) hash distribution.

I will analyse a bit more and then maybe replace FLLargeIdentityDictionary with an adapted version of P4IdentityDictionary.

@stale
Copy link

stale bot commented Jan 12, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will remain open but will probably not come into focus. If you still think this should receive some attention, leave a comment. Thank you for your contributions.

@stale stale bot added the stale label Jan 12, 2022
@theseion
Copy link
Owner

Not stale

@stale stale bot removed the stale label Jan 12, 2022
@stale
Copy link

stale bot commented Mar 13, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will remain open but will probably not come into focus. If you still think this should receive some attention, leave a comment. Thank you for your contributions.

@stale stale bot added the stale label Mar 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants