
Bad regex expressions - "." not escaped, "/fr(i(day)?)?/" instead of "/^fr(i(day)?)?/" and more #281

Open
mdeweerd opened this issue May 28, 2023 · 0 comments


mdeweerd commented May 28, 2023

Related to #241, #249 and possibly others.

My first issue was that Date.parse("10 mars 2023") resolved to a date in May (current month).

It turns out that the original expression for Tuesday ("mardi" in French) matched because it was "^ma(r(.(di)?)?)?". The unescaped '.' matches any character, so "mars" was matched as a day of the week. A similar issue arises when matching 'mai' in the month of June. I fixed this by adding an end-of-word condition.
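
To make the wildcard problem concrete, here is a minimal sketch in plain JavaScript, outside Datejs. The helper `matchedPrefix` and the intermediate pattern `strict` are only for illustration and are not part of the library. It suggests that escaping the dot alone is not enough, since a prefix of "mars" or "mai" still matches.

```javascript
// Original day-of-week pattern from this issue: the "." is an unescaped wildcard.
const loose = new RegExp("^ma(r(.(di)?)?)?", "i");
// Hypothetical intermediate variant: same pattern with the dot escaped.
const strict = new RegExp("^ma(r(\\.(di)?)?)?", "i");

// Illustrative helper: return the text consumed by the pattern, or null.
function matchedPrefix(re, text) {
  const m = re.exec(text);
  return m ? m[0] : null;
}

const looseMars = matchedPrefix(loose, "mars");   // "mars": the "." swallowed the "s"
const strictMars = matchedPrefix(strict, "mars"); // "mar": still a partial match
const strictMai = matchedPrefix(strict, "mai");   // "ma": still a partial match

console.log({ looseMars, strictMars, strictMai });
```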

When I tested Date.parse("vendredi next week"), it failed because the en-US regex for Friday did not have the leading ^, while the French culture file has the entry "/^fr(i(day)?)?/": "^ve(n(\\.|dredi)?)?", with the ^ as for the other days. So I added the ^ to the en-US regex.
It's possible, though, that the day-of-week expressions need to have the leading "^" removed to make Date.parse("next vendredi") work.
There is a test case for that; maybe it passed by chance.
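
As a rough illustration of what the leading "^" changes, the sketch below applies both forms of the en-US Friday pattern to whole input strings. This is plain regex behaviour only; how Datejs tokenizes the input before applying these patterns is not shown here, and the `i` flag is an assumption.

```javascript
// Friday pattern without and with the leading anchor (the "i" flag is assumed).
const unanchored = /fr(i(day)?)?/i;
const anchored = /^fr(i(day)?)?/i;

// Without "^" the pattern can match anywhere in the string.
const unanchoredMidString = unanchored.test("next friday"); // true
// With "^" the day name must be at the start of the tested text.
const anchoredMidString = anchored.test("next friday");     // false
const anchoredAtStart = anchored.test("friday next week");  // true

console.log({ unanchoredMidString, anchoredMidString, anchoredAtStart });
```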

Also, an expression like "^ma(r(.(di)?)?)?" is wrong because it does not match all of "mardi", the normal name of the day: the '.' consumes the "d", so "(di)?" cannot match and only "mard" is matched. It has to be "^ma(r(\.|di)?)?", so that "ma", "mar", "mar." and "mardi" all work. But that can still partially match "mai", which is another valid date field value. So we need "^ma(r\.|(r(di)?)?\b)", which matches "ma" and "mar" only when they are whole words.
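
The three candidate expressions can be compared side by side with a small sketch like the one below. It is plain JavaScript, independent of Datejs; the names `candidates`, `inputs` and `classify` are made up for the example, and the last pattern is written with the dot in its first alternative escaped.

```javascript
// Candidate patterns for "mardi" discussed in this issue.
const candidates = {
  original: new RegExp("^ma(r(.(di)?)?)?"),
  escaped: new RegExp("^ma(r(\\.|di)?)?"),
  bounded: new RegExp("^ma(r\\.|(r(di)?)?\\b)"),
};

// Valid Tuesday spellings, plus two month names that must not match.
const inputs = ["ma", "mar", "mar.", "mardi", "mars", "mai"];

// Sort the inputs into full matches, partial (prefix-only) matches and misses.
function classify(re) {
  const result = { full: [], partial: [], none: [] };
  for (const text of inputs) {
    const m = re.exec(text);
    if (m === null) result.none.push(text);
    else if (m[0] === text) result.full.push(text);
    else result.partial.push(text);
  }
  return result;
}

for (const [name, re] of Object.entries(candidates)) {
  console.log(name, JSON.stringify(classify(re)));
}
```

Only the word-boundary variant rejects both "mars" and "mai" while still accepting the four Tuesday spellings in full.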

Applying this to the fr-FR culture file gives the following working entries.
I also added accent-free versions: "aout" is the official preference over "août", and a lot of French people type without accents to speed up typing.

    "/jul(y)?/": "juil(\\.|let)?",
    "/aug(ust)?/": "ao[u\u00fb]t",
    "/sep(t(ember)?)?/": "sept(\\.|embre)?",
    "/oct(ober)?/": "oct(\\.|obre)?",
    "/nov(ember)?/": "nov(\\.|embre)?",
    "/dec(ember)?/": "d[e\u00e9]c(\\.|embre)?",
    "/^su(n(day)?)?/": "^di(m(\\.|anche)?)?",
    "/^mo(n(day)?)?/": "^lu(n(\\.|di)?)?",
    "/^tu(e(s(day)?)?)?/": "^ma(r\\.|(r(di)?)?\\b)",
    "/^we(d(nesday)?)?/": "^me(r(\\.|credi)?)?",
    "/^th(u(r(s(day)?)?)?)?/": "^je(u(\\.|di)?)?",
    "/^fr(i(day)?)?/": "^ve(n(\\.|dredi)?)?",
    "/^sa(t(urday)?)?/": "^sa(m(\\.|edi)?)?",

and in the Default CultureInfo:

    fri: "/^fr(i(day)?)?/",
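
As a quick check of the accent-tolerant month patterns in the fr-FR listing, the sketch below compiles two of them directly with `RegExp`. Wrapping them in `^...$` and adding the `i` flag are assumptions made for this standalone test, not something the culture file does.

```javascript
// Month patterns copied from the fr-FR listing, wrapped for whole-word testing.
const august = new RegExp("^(?:ao[u\u00fb]t)$", "i");
const december = new RegExp("^(?:d[e\u00e9]c(\\.|embre)?)$", "i");

// Accented and accent-free spellings should both be accepted.
const augustOk = ["aout", "ao\u00fbt"].every((s) => august.test(s));
const decemberOk = ["dec", "d\u00e9c.", "decembre", "d\u00e9cembre"].every((s) =>
  december.test(s)
);
// A different month must still be rejected.
const augustRejectsOther = !august.test("avril");

console.log({ augustOk, decemberOk, augustRejectsOther });
```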

If somebody makes a fork with these changes, please leave a comment here.

@mdeweerd mdeweerd changed the title Bad regex expressions - "." not escaped, "/fr(i(day)?)?/" instead of "/^fr(i(day)?)?/" Bad regex expressions - "." not escaped, "/fr(i(day)?)?/" instead of "/^fr(i(day)?)?/" and more May 29, 2023