Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tiered compilation should generate optimized overrides for ValueType.GetHashCode and ValueType.Equals #36613

Open
YairHalberstadt opened this issue May 17, 2020 · 7 comments
Labels
area-VM-coreclr tenet-performance Performance related issue
Milestone

Comments

@YairHalberstadt
Copy link
Contributor

It's well known that if a valuetype is used as a dictionary or hashset key, or in any other situation where .Equals or .GetHashCode is often used, you should override the default ValueType.Equals and .GetHashCode as the default implementation is reflection based and very slow.

For example here is a benchmark comparing a struct using the Equals and GetHashCode roslyn generates to one without any override:

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Collections.Generic;

public static class Program
{
    public static void Main()
    {
        BenchmarkRunner.Run<Benchmarks>();
    }
}

public class Benchmarks
{
    [Benchmark]
    public void ValueTypeEquals()
    {
        var dictionary = new Dictionary<S1, int>();
        for (int i = 0; i < 10000; i++)
            dictionary[new S1 { s = i.ToString(), i = i }] = i;
    }

    [Benchmark]
    public void GeneratedEquals()
    {
        var dictionary = new Dictionary<S2, int>();
        for (int i = 0; i < 10000; i++)
            dictionary[new S2 { s = i.ToString(), i = i }] = i;
    }
}

struct S1
{
    public string s;
    public int i;
}

struct S2 : IEquatable<S2>
{
    public string s;
    public int i;

    public override bool Equals(object obj)
    {
        return obj is S2 s && Equals(s);
    }

    public bool Equals(S2 other)
    {
        return s == other.s &&
               i == other.i;
    }

    public override int GetHashCode()
    {
        return HashCode.Combine(s, i);
    }
}
Method Mean Error StdDev
ValueTypeEquals 7.146 ms 0.1570 ms 0.4629 ms
GeneratedEquals 3.902 ms 0.1078 ms 0.3177 ms

However the override is functionally identical to the default one, and so it seems pointless to generate this for every struct used as a dictionary key. It's also too easy to forget to do so for a struct which needs it, or alternatively do it for too many structs and lead to code bloat.

Is it possible for the Jit to generate the same override at runtime? Whilst it would not be practical to do so for every struct, tiered compilation might be able tell you which structs have Equals and GetHashCode used in hot paths.

@YairHalberstadt YairHalberstadt added the tenet-performance Performance related issue label May 17, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.Runtime untriaged New issue has not been triaged by the area owner labels May 17, 2020
@EgorBo
Copy link
Member

EgorBo commented May 17, 2020

It's well known that if a valuetype is used as a dictionary or hashset key, you should override the default ValueType.Equals and .GetHashCode

it should always be overridden if used. Your S1 and S2 implementations are not the same, e.g.

int hc1 = new S1 { s = "Hello", i = 10 }.GetHashCode();
int hc2 = new S1 { s = "Hello", i = 12 }.GetHashCode();

hc1 and hc2 are the same value for S1 and different for S2. The only case you can rely on default GetHashCode for value types is when your struct consists of primitives without padding in between.

@YairHalberstadt
Copy link
Contributor Author

What is the reason not to use all the fields in GetHashCode?

@EgorBo
Copy link
Member

EgorBo commented May 17, 2020

What is the reason not to use all the fields in GetHashCode?

// change the implementation here to use all fields instead of just the 1st one.

@YairHalberstadt
Copy link
Contributor Author

So it would theoretically be possible to make both changes in one go, and use all the fields for GetHashCode, and then optimize that on the fly when it seems necessary.

@EgorBo
Copy link
Member

EgorBo commented May 17, 2020

So it would theoretically be possible to make both changes in one go, and use all the fields for GetHashCode, and then optimize that on the fly when it seems necessary.

Maybe? I personally agree with #9802 (comment)

@mangod9 mangod9 added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI and removed area-System.Runtime labels May 26, 2020
@AndyAyersMS
Copy link
Member

The jit can't generate overrides like this; that has to be done when the method table for the type is created. Changing this to area-vm.

@AndyAyersMS AndyAyersMS added area-VM-coreclr and removed area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels May 27, 2020
@mangod9 mangod9 added this to the Future milestone Jun 22, 2020
@mangod9 mangod9 removed the untriaged New issue has not been triaged by the area owner label Jun 22, 2020
@GSPP
Copy link

GSPP commented Jul 12, 2020

Related: #4414

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-VM-coreclr tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

6 participants