Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Int32.TryParse and F#+ tryParse give different results #541

Open
abelbraaksma opened this issue Mar 27, 2023 · 6 comments
Open

Int32.TryParse and F#+ tryParse give different results #541

abelbraaksma opened this issue Mar 27, 2023 · 6 comments
Labels

Comments

@abelbraaksma
Copy link
Member

Description

Basically (tested with culture on en-US).

> let value: int option = tryParse "123,45";;
val value: int option = Some 12345

> Int32.TryParse "123,45";;
val it: bool * int = (false, 0)

In other words, integers cannot have thousand-separators with Int32.TryParse, but can with tryParse.

Conversely, it doesn't allow decimals, which is good:

> let value: int option = tryParse "123.45";;
val value: int option = None

> Int32.TryParse "123.45";;
val it: bool * int = (false, 0)

Likewise there's a similar difference with e-notation:

> let value: int option = tryParse "12345e4";;
val value: int option = Some 123450000

> Int32.TryParse "12345e4";;
val it: bool * int = (false, 0)

Repro steps

See above.

Expected behavior

From the description of the method, you'd expect the same behavior as the .NET functions, but they allow certain illegal values, or at least ambiguous, considering how .NET parses numbers.

Actual behavior

See above. This is caused by NumberStyles.Any, which is not used by Int32.TryParse or Decimal.TryParse. These use NumberStyles.Integer and NumberStyles.Decimal respectively (see source of .NET), which limit the allowed range.

Known workarounds

Create your own types or call Int32.TryParse directly, or create your own helper function.

Related information

Most recent F#+ and tested with .NET 6, but assuming it is the same on .NET 7.

@abelbraaksma
Copy link
Member Author

Related: regardless of whether we'd fix this issue or not, perhaps we can update the documentation to specify that parse and tryParse uses InvariantCulture. Currently, it not being mentioned, may lead people to believe it works the same as XXX.Parse/TryParse from the BCL, that is, that it changes behavior depending on the current culture.

@wallymathieu
Copy link
Member

Yes, I think it makes sense to have it documented as using invariant culture.

@gusty
Copy link
Member

gusty commented Apr 2, 2023

As always with parsers there's the question of how smart my parser should be, where do you draw the line?

Our current parsers uses NumberStyles.Any which makes them very flexible, you could argue they should be smarter and make them able to parse "thirtyseven" as 37, or make them very strict and expect "clean" numbers, without any , or . separator, or redundant + signs.

The problem is, parsers are kind of the inverse of printers, but it's not bijective as there are many ways to print and many ways to parse.

We can only implement one way to print, however making parsers so strict that they only can parse that specific printing way could be a nice property but a pain to work with in real world scenarios.

So we can come up with this property: print >> parse = id (and we could add FsChecks for it) which doesn't imply parse >> print = id.

How different styles we allow? That's an interesting discussion (thanks for kicking it off), I think we should think in practical cases.

Then as you say we have the compatibility problem, if we eventually decide to make them more strict, but we can always implement a parseStrict something like ParseExact in order to satisfy cases where the developer is more interested in validating than getting results as far as practical.

@abelbraaksma
Copy link
Member Author

abelbraaksma commented Apr 17, 2023

Our current parsers uses NumberStyles.Any

While I agree with everything you say, the problem with Any is that invalid numbers are parsed, which I don't think is part of what most people consider valid input.

Since this is about calling the parse methods of .NET, I think, apart from forcing Invariant Culture, that we should stay as close as possible to what .NET itself implements, to stick with the "principle of least suprise".

Certainly, creating invalid parsing functions will not have been the goal here. But changing this leads to a backwards compat issue. I'd vote for your proposal with parseStrict.

Note that this is not the same as ParseExact, as the standard Parse functions on number types are quite strict already. The only big problem with these is the absence of invariant culture in the default behavior.

@gusty
Copy link
Member

gusty commented Apr 18, 2023

OK, let's go with parseStrict, then the question remains for V2, I mean we can do breaking changes then but still which should be the default ? We have some time to think about it.

@abelbraaksma
Copy link
Member Author

abelbraaksma commented Jul 3, 2023

I would set the defaults to what I wrote before:

we should stay as close as possible to what .NET itself implements, to stick with the "principle of least suprise".

not sure if parseStrict is the ideal choice, but I can't think of anything better right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants