Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement the Hashable interface on Container classes ? #65

Open
castarco opened this issue Oct 16, 2016 · 14 comments
Open

Implement the Hashable interface on Container classes ? #65

castarco opened this issue Oct 16, 2016 · 14 comments

Comments

@castarco
Copy link

Hi,

is there a reason to not implement the Hashable interface on the "container" classes? I think that it will be very useful in order to compare containers.

It seems that the current implementation forces users to loop over collections to compare them. I know that if this comparison is implemented as a method, it will work the same way, but I think is better to implement the comparison one single time rather than reimplement it again and again in "userland".

@rtheunissen
Copy link
Member

I like this, but I'm not sure how I would handle the hash value. I would prefer to implement an equals method on Collection, but not implement Hashable.

@tyx
Copy link

tyx commented Nov 15, 2016

Hi @rtheunissen

any plan about this equals feature ? Can help ?

Just see that new \Ds\Set([1, 3]) == new \Ds\Set([1, 2, 3]) returns true and it makes my unit tests very sad.

@rtheunissen rtheunissen added this to the 2.0.0 milestone Nov 28, 2016
@rtheunissen rtheunissen self-assigned this Nov 28, 2016
@shouze
Copy link

shouze commented Nov 28, 2016

Ok, this is not a BC break feature... any chance to get it backported to 1.x or does it need a big internal refactoring?

@rtheunissen
Copy link
Member

Doesn't need internal refactoring, I'm really just lumping everything into 2.0 because it's my active branch. Happy to implement in 1.x now and just merge into 2.0.

@rtheunissen rtheunissen removed this from the 2.0.0 milestone Nov 28, 2016
@rtheunissen
Copy link
Member

rtheunissen commented Feb 12, 2017

See #67 for status on equals.

If we did implement Hashable on collections, they would simply use spl_object_hash and $this === $that, which is what the extension already falls back on.

We either have to use and support equals everywhere, or not at all.

An example of a consistency I want to avoid at all costs:

$a = new Ds\Set([1, 2, 3]);
$b = new Ds\Set(["1", "2", "3"]);

$a == $b;       // true, because of loose property comparison.
$a === $b;      // false, because they are not the same instance.
$a->equals($b); // false, because the sets don't contain the same values.

The inconsistency is between the == and the equals, which is something we can fix in the extension, but not the polyfill. This is the primary blocker. There is no way to implement "not the same instance but strict comparison of properties" in userland PHP as at 7.1.

Like atoum/atoum#671 mentions, even if we implement equals on Collection,== won't behave as expected and won't resolve the problem outlined in this issue.

@morrisonlevi
Copy link

As mentioned on another ticket it might be good to extract the equals method into an Equatable interface which Hashable extends from. I'm not sure collections should be Hashable but they ought to be able to check for equality.

The issues with ==, ===, and equals are not that important in my opinion. Users cannot use == and === on userland objects reliably already, so providing equals and documenting it is the best way forward, in my opinion.

@Fleshgrinder
Copy link

Usually an Equatable, Equalable, Identifiable, … interface goes hand in hand with the hash method because they actually just compare the hashes of the objects.

interface Identifiable {
    function equals($other): bool;
    function hash(): string;
}

trait IdentifiableTrait {
    public function equals($other): bool {
        return $other instanceof $this && $other->hash() === $this->hash();
    }

    public function hash(): string {
        return \spl_object_hash();
    }
}

final class SomeEntity implements Identifiable {
    use IdentifiableTrait;

    private $id;

    public function hash(): string {
        return \md5(__CLASS__ . $this->id, \false);
    }
}

final class SomeValueObject implements Identifiable {
    use IdentifiableTrait;

    private $some_prop;

    public function hash(): string {
        return \md5(\var_export($this, \true), \false);
    }
}

This might be a way how you could implement it.

@morrisonlevi
Copy link

morrisonlevi commented Nov 29, 2017

Usually an Equatable, Equalable, Identifiable, … interface goes hand in hand with the hash method because they actually just compare the hashes of the objects.

I'm not sure where you get that idea but that is fundamentally broken. The whole reason equals is required by Hashable is to distinguish between two objects whose hashes collide but they aren't the same. The hash gets you into the correct bucket or region and then you iterate until you find one that equals matches.

Edit to expand on this a bit: If $lhs->equals($rhs) is true then $lhs->hash() == $rhs->hash() must also be true. However the latter does not imply the former, meaning that just because two hashes are considered equal does not mean the objects are. This is because fast hash values (which we want) are typically lossy. Their goal is to get us into the correct bucket or region and then we can use the more expensive but completely accurate equals to narrow in on the exact location.

@Fleshgrinder
Copy link

The purpose of the hash function as it is used here in the extension is what you describe. However, that is different in programming languages that have equality built-in, e.g.:

Also, as explained in #67, lhs.equals(rhs) does not imply lhs.hash == rhs.hash for all types, only for most.

@morrisonlevi
Copy link

Also, as explained in #67, lhs.equals(rhs) does not imply lhs.hash == rhs.hash for all types, only for most.

Can you directly link me to the explanation? I see some type-juggling issues but not where two objects which are of the same type, are Hashable, and are equal by their equals call do not have the same hash code.

@Fleshgrinder
Copy link

@morrisonlevi there you go: #67 (comment)

@morrisonlevi
Copy link

Floating point values are not objects, do not implement Hashable, and thus do not have an equals method. While INF and NAN semantics are something to be aware of it's not the same case as I'm talking about here. In every case where you have a Hashable object you must use the semantics outlined or your type is not well-behaved and will not work properly in the collection.

@Fleshgrinder
Copy link

Not in PHP, that is true. What about two DateTime instances that have the same time but are in different time zones? Are they equal? Probably. Do they have the same hash? Probably not.

I’m only trying to make clear that partial equality and ordering is something that is true for some types. The constraint you are laying out is normal for the keys of hash map implementations (if x == y then x.hash == y.hash) because they are broken if that does not hold.

@rtheunissen
Copy link
Member

I think the decision here is that only Hashable types should be supported by hash-based structures, and the contract assumes that if x == y then x.hash == y.hash. Containers that are mutable will not implement Hashable. You would need to convert to an immutable type first.

Hashable and Equatable should be separated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants