Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FirstOrDefault returns lambda function on empty collections from .NET 6 and forward #1761

Open
Simon900225 opened this issue Dec 14, 2023 · 8 comments

Comments

@Simon900225
Copy link

Simon900225 commented Dec 14, 2023

If you run
List[Object]().FirstOrDefault(lambda x: x)
after the appropriate imports the result will be something like this
<function <lambda$172> at 0x000000000000002B>
when it should be None

It does work as expected if you instead of object use another type like this
List[DateTime]().FirstOrDefault(lambda x: x)

I reproduced this in the IronPythonConsole by changing target framework to .NET 6 (same issue in .NET 7 and .NET 8)

Tested on
IronPython 3.4.1 DEBUG (3.4.1.1000) [.NETCoreApp,Version=v6.0 on .NET 6.0.25 (64-bit)] on win32

It seems like there is an explicit cast to PythonFunction when it should still be an Object
image

@slozier
Copy link
Contributor

slozier commented Dec 14, 2023

For reference, see also: IronLanguages/ironpython2#812

In .NET 6, the overloads are:

public static TSource FirstOrDefault<TSource> (this IEnumerable<TSource> source, TSource defaultValue); // new in .NET 6
public static TSource? FirstOrDefault<TSource> (this IEnumerable<TSource> source, Func<TSource, bool> predicate);

So in the example where TSource is object it picks the first one. This sort of makes sense, since lambda x: x isn't a Func<object, bool> it picks the object overload. Although I can see that in practice that is not the expected outcome...

I'm not super familiar with the method resolution code but it might be possible to give more weight to the Func overload if you're passing in a Python function.

Other question that comes to mind is does this behave the same way for non-extension methods?

@Simon900225
Copy link
Author

Okay. Now I understand the issue.
So a "fix" for this is wrapping the lambda in a Func[Object,bool] like so List[Object]().FirstOrDefault(Func[Object,bool](lambda x: x))

@BCSharp
Copy link
Member

BCSharp commented Dec 14, 2023

Other question that comes to mind is does this behave the same way for non-extension methods?

Yes, it does.

I'm not super familiar with the method resolution code but it might be possible to give more weight to the Func overload if you're passing in a Python function.

I think it might be a good idea, as long as the number of parameters match. I'm not sure what would be best in case of Action parameters, but those cases are more rare than Func.

@Simon900225
Copy link
Author

I'm not sure I've done it correctly but I've found and solved the issue for my test cases at least.

I added the following code to CanConvertFrom in PythonOverloadResolver

if (fromType == typeof(PythonFunction)) {
    if (toParameter.Type.IsGenericType &&
        toParameter.Type.GetGenericTypeDefinition() == typeof(Func<,>)) {
        return true;
    }
}

This makes it so that it accepts PythonFunctions as Func<,> parameters and it then selects the correct overload without any other changes.

I've tested this both in IronPython2 and IronPython3. It also solves this issue (IronLanguages/ironpython2#812)

If you think this is a correct way to solve it I could create a PR in both repos.

@BCSharp
Copy link
Member

BCSharp commented Dec 22, 2023

From memory, the place looks about right. However I see that it will try to convert any Python function to a .NET function with one parameter. Using your modified code, what happens when you run List[Object]().FirstOrDefault(lambda x, y: True)?

(expected: the function object, which you would get in C#)

@slozier
Copy link
Contributor

slozier commented Dec 22, 2023

Using your modified code, what happens when you run List[Object]().FirstOrDefault(lambda x, y: True)?

Seems to work as you expect.

Side note which is not really related to the proposed change, but on .NET Framework using the current codebase this raises a TypeError and then breaks subsequent calls to the "good" lambda:

>>> List[object]().FirstOrDefault(lambda x: x)
>>> List[object]().FirstOrDefault(lambda x, y: True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: The type arguments for method 'FirstOrDefault' cannot be inferred from the usage. Try specifying the type arguments explicitly.
>>> List[object]().FirstOrDefault(lambda x: x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: The type arguments for method 'FirstOrDefault' cannot be inferred from the usage. Try specifying the type arguments explicitly.

@BCSharp
Copy link
Member

BCSharp commented Dec 22, 2023

this raises a TypeError and then breaks subsequent calls to the "good" lambda:

The way the overload resolver works is that it caches results of previous resolutions for a given call site keyed by argument types. This greatly improves the resolution speed. Apparently it also caches resolution failures, which I don't see having much value (what is a benefit of a fast resolution failure if it is still an exception?) but maybe I am overlooking something.

BTW, a way to bypass the problem after breaking the resolver (and indeed to avoid it altogether on all frameworks) is to wrap the lambda in System.Func.

@BCSharp
Copy link
Member

BCSharp commented Dec 22, 2023

Seems to work as you expect.

Not exactly. After getting the lambda object from the call List[Object]().FirstOrDefault(lambda x, y: True), subsequent calls with the "good" lambda return the lambda rather than None:

>>> List[Object]().FirstOrDefault(lambda x: True)
>>> List[Object]().FirstOrDefault(lambda x, y: True)
<function <lambda$174> at 0x000000000000002B>
>>> List[Object]().FirstOrDefault(lambda x: True)
<function <lambda$176> at 0x000000000000002C>

I think the root cause of it (and the above mentioned failures on .NET Framework as well) is that IronPython is not using the DLR resolver caching mechanism as it was intended and designed. If a resolution succeeds or fails for an argument type and the cache is updated with that type as key, the DLR assumes it applies to all instances of that type. If this assumption would not be correct, the DLR has an option to designate a specific instance as the cache key. This last bit is theoretical, I got it from reading the DLR docs; I have never actually seen it used in the IronPython codebase. Besides being less efficient, it would not be practical in this particular case as all lambdas are different objects.

PythonFunction represents any Python function or lambda, I don't see a type that represents a function with exactly 1 argument. So the way IronPython uses the resolution cache could have been an "imperfect but practical" choice, which unfortunately breaks down in some situations as the .NET API evolves. So what would be another "imperfect but practical" tweak now?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants