Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Regex in Query #698

Open
jogibear9988 opened this issue May 31, 2017 · 19 comments · May be fixed by #4392
Open

Use Regex in Query #698

jogibear9988 opened this issue May 31, 2017 · 19 comments · May be fixed by #4392
Labels
status: has-pr There is active PR for issue type: feature
Milestone

Comments

@MaceWindu
Copy link
Contributor

If by support you mean handling of RegEx class usage in linq queries, then no.

@jogibear9988
Copy link
Member Author

I'll look if this is possible to implement...

@sdanyliv
Copy link
Member

sdanyliv commented Jun 1, 2017

It's just another extension. PR will be appreciated.
I do not think that we have to deal with RedEx class, it is closer to Sql.Like implementation.

@ili
Copy link
Contributor

ili commented Jun 2, 2017

for example

[Sql.Expression(ProviderName.MySql, "{0} REGEXP {1}")]
public static bool RegExp(string regex, string data)
{
    throw new NotImplementedException();
}

db.GetTable<Person>().Where(_ => RegExp("bla-bla-bla", _.Name)).ToList()

the only thing it would be sensitive to database RegEx dialect.

To make honest support it would be needed to create extension and convert c# RegEx to DB RegEx. If somebody knows how to convert one RegEx dialect to other it would be easy to implement support like:

public static class Sql
{
    public static class Regex
    {
        public static bool Match(Regex regex, string data) => regex.IsMatch(data);
    }
}

db.GetTable<Person>().Where(_ => Sql.Regex.Match(new Regex("bla-bla-bla"), _.Name)).ToList();

@MaceWindu MaceWindu added this to the 1.9.0 milestone Jun 12, 2017
@jogibear9988
Copy link
Member Author

pull #1077

@jogibear9988
Copy link
Member Author

Is it also possible to support this:

			var q =
				from t in db.Category
				where Regex.IsMatch(t.CategoryName, "Condim.*")
				select t;

it means use the .NET Regex Class. Where can I look to implement this?

@sdanyliv
Copy link
Member

sdanyliv commented Apr 3, 2018

It is possible. But, how different regexp syntax in other databases? If dialects are too different - i prefer do not introduce Regex.IsMatch replacement.

@jogibear9988
Copy link
Member Author

Don't know how different they are. But I think most should be the same (if it is supported).

Where do I have to look to implement smth. like this?

@sdanyliv
Copy link
Member

sdanyliv commented Apr 3, 2018

We have to be sure that syntax is not different. We supports more than 10 databases and it will introduce undefined behavior for others. If you want to replace Regex.IsMatch function - you have to create all needed transformations for other databases. As i have mentioned before - it's not worth our time.

@jogibear9988
Copy link
Member Author

@sdanyliv
Copy link
Member

sdanyliv commented Apr 8, 2018

Link is about third party npgsql EF Provider - only Postgre SQL.
I suggest to open API for such kind of tricks
https://github.com/linq2db/linq2db/blob/master/Source/LinqToDB/Linq/Expressions.cs#L448
If you know what are you doing in your project - go on.

@Hermholtz
Copy link

Of course regular expression syntax can have subtle differences in every engine, even in different versions of the same engine. Only some basic regular expressions, using the common subset of features may work. It's just like the SQL language itself. It's pretty much standardized, but only to a limited degree.

Here are my findings:

  • Prior to 8.0.4, MySQL was using the Henry Spencer regular expression library to support regular expression operations.
  • Starting with MySQL 8.0.4, ICU’s Regular Expressions package is used.
  • PostgreSQL uses POSIX Regular Expressions with some extensions borrowed from Perl and Tcl, calls them "ARE".
  • Oracle generally implements POSIX Regular Expressions but also includes selected parts of PCRE.
  • SQL Server doesn't have any RE support by default. If you want it, you have to resort to using a CLR integration (for example this library), or one of the external languages available in latest versions: R, Java or Python. You then have 4 flavors of regular expressions possible - each of those differs a bit from the others. And to use it you first have to set it up:
    • In .NET you have to have a database into which you'll import the CLR functions with CREATE ASSEMBLY followed by several CREATE FUNCTIONs and during runtime qualify the functions with the DB name.
    • In R and Python you have to CREATE EXTERNAL LANGUAGE and then use sp_execute_external_script with a bunch of code, and for Java you have to compile it upfront and load into the server with CREATE EXTERNAL LIBRARY.
  • In SQLite there's the REGEXP keyword, but it doesn't have the regular expression functionality until you load an extension containing the regexp() function. Then the flavor of supported regexps will be whatever the extension provide.
  • That's enough. If you want more information, please check your favorite engine's documentation.

So, having a truly portable regular expressions in LINQ to DB is a hard task. I think implementing a simple generic wrappers is doable, but it's going to be a supportability nightmare. Users will have hard time understanding why are there error messages returned by SQL Server when their perfect LINQ to DB should just run.

@jods4
Copy link
Contributor

jods4 commented Feb 7, 2022

My take on this:

(1) I don't see why someone would attempt to create an abstraction layer to unify all regexps flavors:

  • It's a very big task that requires a lot of difficult effort and testing;
  • It will incur extra parsing/compilation work at runtime;
  • It won't work for dynamically built regexes based on data;
  • Users will be confused as they need to learn a different regex flavor than the actual one from their db (speaking from experience, I have to use 3 different regexes flavors at my job and man it's mentally taxing);
  • It opens a long-term, huge, can of worms. I can foresee the bug reports on specific RDBMS or even RDBMS point releases, the issues asking support for a provider-specific non-standard extension that is not in your lingua franca, etc.

(2) On the other hand, just like all other provider-specific SQL extensions linq2db exposes, I see value in using these functions even when they're not portable.
In fact I have mapped Oracle regexp_like myself because we need to use it in our queries.

I see linq2db more as "full access to RDMBS power" than a "minimal common layer between all RDBMS". In fact, even basic operations can't be bridged between providers, did you know that 'abc' + null does not yield the same result in SQL Server and Oracle? Try to blindly port your code from one to the other now 🤣. Cross-provider code is always a lot of work and care.

Bottom line: I think adding those as some kind of provider-specific bool Match(string, string) with no attempt to look at the regex itself is useful, simple and the way to go.

@jogibear9988
Copy link
Member Author

My intention was not to have portable regexes for every database, but a unified way to use the regexes, even if they work different on every database.

We have a UI with a tableview where a customer could build different complex filters
image
and here I also wanted to support regex. On every customer these could be different cause he used another databases (we support oracle, mysql, mssql, sqlite and postgres), but our code to execute the regex could be the same.

And also in unitests where we ran our queries agains objects, it would work if we support it with standard regex.match

@jogibear9988
Copy link
Member Author

But atm. I do not work on this (okay the issue is already 5 years old :-) )

@MaceWindu MaceWindu removed this from the Future milestone Dec 24, 2022
jogibear9988 added a commit that referenced this issue Jan 26, 2024
@jogibear9988
Copy link
Member Author

fixed in #4392

@viceroypenguin viceroypenguin linked a pull request Jan 26, 2024 that will close this issue
@viceroypenguin
Copy link
Contributor

don't close until the PR is merged.

@jogibear9988
Copy link
Member Author

/azp run test-all

@jods4
Copy link
Contributor

jods4 commented Jan 26, 2024

Something new to add to the discussion of: "should we simply use Regex.IsMatch as the entry point (assuming we pass the pattern "as-is" with no attempt at adapting to RDBMS), or should we add our own Sql.RegexLike":

Now VS IDE is attempting to provide colorization and intellisense for known string DSLs, including regexes. I think it would be even more confusing to get pattern tooltips and completions that are incorrect in this instance.

@MaceWindu MaceWindu added the status: has-pr There is active PR for issue label Jan 28, 2024
@MaceWindu MaceWindu added this to the In-progress milestone Jan 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: has-pr There is active PR for issue type: feature
Development

Successfully merging a pull request may close this issue.

7 participants