
Add support for Oracle: implementation and tests #699

Open
8 of 9 tasks
Ste1io opened this issue Sep 30, 2023 · 81 comments

@Ste1io
Collaborator

Ste1io commented Sep 30, 2023

PetaPoco has "unofficially" supported Oracle for quite some time now... I'd like to implement the integration tests for Oracle so that can finally change, and we can claim official support without a big "but" attached. This will also allow us to confidently investigate, address, and close multiple open issues, most of which can likely be fixed readily once tests are possible.

Happy to spearhead this myself alongside the remaining test refactoring.

Addresses #626
Addresses #613

@Ste1io Ste1io self-assigned this Sep 30, 2023
@Curlack
Contributor

Curlack commented Sep 30, 2023 via email

@Ste1io
Collaborator Author

Ste1io commented Oct 2, 2023

Let me know how setup goes with your local environment, @Curlack, and if you encounter anything you think should be added to the updated wiki page.

Given your background in Oracle, I don't mind passing this one over to you if you want, or collaborating together on it, whichever is best for you.

@Curlack
Contributor

Curlack commented Oct 2, 2023 via email

@Curlack
Contributor

Curlack commented Oct 2, 2023

I'm going to need the Docker setup for Oracle. I can see a comment for Oracle pointing to the "sath89/oracle-12c:latest" image, but to be honest, I have no clue how all the Docker stuff works. Will figure it out as I go. Not sure if we'll start with 12c and go up to 23c, or if 12c will, for all intents and purposes, always be enough.

PS: Also wanted to ask if this will be our primary medium of communication and (since we're 6 hours apart) what times would suit you best?

@Ste1io
Collaborator Author

Ste1io commented Oct 3, 2023

For the docker images, most of it's been trial and error and checking docs... the sath89 image (that commented-out service predates my recent PR) is no longer available when I checked (apparently due to a DMCA takedown from Oracle around 2019). AFAIK, 19c is the current LTS, so that would be my suggestion. Including 12c for backwards compatibility with the older Oracle provider (the orphan in the root directory, I suppose) may be necessary.

Matrix testing hasn't really been brought up yet, but you'll notice I added comments about firebird also as the current image used for it is a couple major versions behind, like 12c.

Just noticed I neglected to update the .vscode extensions with it, but if you install the Docker extension for VS Code, testing images is fairly straightforward: right-click the compose.yml after making your changes and select the services you want to fire up.

@Curlack
Contributor

Curlack commented Oct 9, 2023

So far I've been hitting brick walls. I'm trying to build a docker image using the Docker Images for Oracle repository. It should be simple: you only need to execute the buildContainerImage.sh file, but I keep getting a "not found" error in the terminal. My VSCode session is connected to WSL and Docker Desktop is running. Not sure what I'm missing.
Want to build the "Oracle Database Free" (23c) image with 19c compatibility. Any pointers would be welcome.
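One frequent cause of a bare "not found" when executing a shell script from a Windows checkout under WSL is CRLF line endings: the kernel reads the shebang as "/bin/sh\r" and reports that interpreter as not found (a missing execute bit gives a similar error). A minimal, self-contained demonstration of the symptom and the fix (dos2unix achieves the same thing as the sed call; the file path is arbitrary):

```shell
# Simulate a script saved with Windows (CRLF) line endings:
printf '#!/bin/sh\r\necho hello\r\n' > /tmp/crlf-demo.sh
chmod +x /tmp/crlf-demo.sh

# Strip the trailing carriage returns, then the script runs normally:
sed -i 's/\r$//' /tmp/crlf-demo.sh
/tmp/crlf-demo.sh
```

If buildContainerImage.sh was cloned via Windows git with autocrlf enabled, converting the line endings (or re-cloning from inside WSL) usually clears this class of error.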

@Curlack
Contributor

Curlack commented Oct 10, 2023

Success at last! Used the installation described on the Oracle Container Registry. Opted for version 23c (free). Also installed the Oracle SQL Developer IDE (less "consoley", hehehe). Turns out compatibility has nothing to do with database features, only disk format. Also, I just needed to read the same thing enough times for it to "click" :)

The command I ended up using is:

docker run --name oracle-free-v23c -d -p 1522:1521 -p 5700:5500 -e ORACLE_PWD=petapoco -e ORACLE_CHARACTERSET=AL32UTF8 container-registry.oracle.com/database/free:latest

One thing I'm not sure of, is how the docker-compose.yml file should be changed.

Now that I'm able to connect I can maybe begin with small scaffold of oracle test group and work my way up as I go.

@Ste1io
Collaborator Author

Ste1io commented Oct 11, 2023

Nicely done @Curlack. Translating from the command line to yml is typically fairly straightforward (-e entries become environment variables, -p entries become port mappings). Something like this should work, added under services... untested:

oracle-free-v23c:
  image: container-registry.oracle.com/database/free:latest
  container_name: oracle23c_petapoco
  environment:
    - ORACLE_PWD=petapoco
    - ORACLE_CHARACTERSET=AL32UTF8
  ports:
    - "1522:1521"
    - "5700:5500"

@Ste1io
Collaborator Author

Ste1io commented Oct 11, 2023

Now that I'm able to connect I can maybe begin with small scaffold of oracle test group and work my way up as I go.

The SqlServer and SQLite tests, and the accompanying test providers provide a fairly straightforward bare-bones multi-driver implementation. As with the other tests, instantiate a class inheriting from one of the abstract classes to activate those tests; override or skip any as needed, or add additional test methods if specific to oracle.

If you want me to help scaffold an initial oracle test class just lmk.

I'm waiting on @pleb to review my changes to the tests, but you should be safe rebasing onto the changes from my feature branch for #701 (Ste1io:Ste1io/separate-dbms-driver-tests) and building up the oracle tests from the changes I did there.

[edit] PR approved ^, thanks @pleb! :)

@Curlack, I've setup an upstream branch, feature/oracle-support, which you can fork to your local. Any changes you or I make can be PR'd back to this central feature branch. This should streamline our collaboration and help keep things organized.

I'll merge any changes on Development into this feature branch as needed, so you can fetch and rebase onto this branch (upstream/feature/oracle-support) to keep in sync.

@Curlack
Contributor

Curlack commented Oct 16, 2023

Quick update from my side. Btw, thanks for the branch, good idea.
docker-compose.yml updated and working (thanks for the tip). Beginning to understand docker a little better now.
app.config and appsettings.json file changes in place and oracle database connection successful. Using Oracle.ManagedDataAccess 21.12.0 nuget package.
Created OracleTestProvider.cs, generated a setup script, and "et voilà!" I was ready to start converting build scripts to their Oracle equivalents... until I got to the Stored Procedures :-(.

CREATE PROCEDURE dbo.SelectPeople
AS
BEGIN
	SET NOCOUNT ON;
	SELECT * FROM [People]
END

This simple procedure returns a dataset as if it's a table. Only difference is we execute it instead of selecting from it. After doing some reading and even consulting Bing AI, they all mention an output parameter of type Ref Cursor, but the caller has to explicitly bind the parameter.

As of 12c, Oracle boasts the ability to return query results to the caller implicitly via DBMS_SQL.RETURN_RESULT and DBMS_SQL.GET_NEXT_RESULT.

I suppose if it's possible to do by any means, it's fine, as long as the end result prevents the caller from modifying the underlying table, which is why these kinds of procs exist in the first place imo. Just having a bit of a tough time trying to get around this one.

Any ideas or alternatives guys?
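For comparison, the implicit-results variant mentioned above might look something like this in Oracle 12c+ (a hypothetical sketch, untested against the PetaPoco schema):

```sql
-- No OUT parameter: the open cursor is handed back to the client
-- driver implicitly via DBMS_SQL.RETURN_RESULT (available from 12c).
CREATE OR REPLACE PROCEDURE SelectPeople
AS
    c SYS_REFCURSOR;
BEGIN
    OPEN c FOR SELECT * FROM People;
    DBMS_SQL.RETURN_RESULT(c);
END;
```

The caller then executes the procedure and reads the result set like an ordinary query, without binding a cursor parameter, provided the ADO.NET driver supports implicit result sets.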

@Ste1io
Collaborator Author

Ste1io commented Oct 18, 2023

Nicely done, @Curlack.

My initial inclination would be to implement an Oracle-specific override of

/// <param name="commandType">The type of command to execute.</param>
/// <param name="sql">The SQL statement.</param>
/// <param name="args">The parameters to embed in the SQL statement.</param>
/// <inheritdoc cref="IQuery.Query{T}(string, object[])"/>
protected virtual IEnumerable<T> ExecuteReader<T>(CommandType commandType, string sql, params object[] args)

A Ref Cursor would need to be passed in, and of course this would also require overrides of the Fetch/Query helpers as well. Then, inside, push the cursor param containing the output from the stored proc into our command...

A naïve approach being something to the effect of

using (var cmd = CreateCommand(_sharedConnection, commandType, sql, args))
{
    if (_provider is Providers.OracleDatabaseProvider && commandType == CommandType.StoredProcedure)
    {
        var refCursorParam = /* create parameter here */;
        cmd.Parameters.Add(refCursorParam);
    }
    //...

So given the stored procedure in question,

CREATE OR REPLACE PROCEDURE SelectPeople (p_cursor OUT SYS_REFCURSOR)
AS -- or IS, not sure off hand
BEGIN
    OPEN p_cursor FOR
    SELECT * FROM People;
END;

We'd end up with something akin to

[Fact]
public virtual void QueryProc_NoParam_ShouldReturnAll()
{
    var results = DB.QueryProc<Person>("SelectPeople", "p_cursor").ToArray();
    results.Length.ShouldBe(6);
}
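For the /* create parameter here */ placeholder, the binding with Oracle.ManagedDataAccess might look roughly like this (a hypothetical, untested sketch; the parameter name is assumed to match the proc):

```csharp
// Sketch using Oracle.ManagedDataAccess.Client: bind the SYS_REFCURSOR
// output parameter before executing the stored procedure.
var refCursorParam = new OracleParameter("p_cursor", OracleDbType.RefCursor)
{
    Direction = ParameterDirection.Output
};
cmd.Parameters.Add(refCursorParam);
```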

@Ste1io
Collaborator Author

Ste1io commented Oct 18, 2023

docker-compose.yml updated and working (thanks for the tip). Beginning to understand docker a little better now.
app.config and appsettings.json file changes in place and oracle database connection successful. Using Oracle.ManagedDataAccess 21.12.0 nuget package.

When you get a chance, if you want to open a PR against the upstream's oracle feature branch with these changes (excluding the unfinished changes being discussed, ofc), I'll pull them in so we're both working in the same test environment, and lock in your contribution with these changes.

@Curlack
Contributor

Curlack commented Oct 18, 2023 via email

@Ste1io
Collaborator Author

Ste1io commented Oct 18, 2023

If you're happy with ref cursor, I'll finish the rest of the build script then submit a PR.

Using Ref Cursor seems to me the least invasive option and most viable direction given the scope of this branch, and should be implementable by extending the current code without introducing any major breaking changes to the API.

I do think it would be prudent to leave the door open to the newer approach, however, I see that more as a future extension of the library's Oracle support than a requirement for official support.

You have more experience working with Oracle than me, also, so if you feel strongly either way, I'm keen to hear. I'd like to hear @pleb and @asherber's thoughts as well.

@Curlack
Contributor

Curlack commented Oct 18, 2023 via email

@pleb
Member

pleb commented Oct 19, 2023

Love your work @Ste1io and @Curlack.

Don't forget to update the readme for oracle support and add yourselves to the PetaPoco is maintained and extended by line :)

@Ste1io
Collaborator Author

Ste1io commented Oct 20, 2023

Don't forget to update the readme for oracle support

There are a few other places throughout the docs as well that I'm already in the process of updating; will be sure to do that. @Curlack hammered this out fast, very excited with the progress.

@Curlack
Contributor

Curlack commented Oct 20, 2023

Always a pleasure. Glad I'm able to contribute to such an awesome codebase. 😎

@Curlack
Contributor

Curlack commented Oct 21, 2023

Quick update: I've added the integration tests for Oracle and ran the tests. Couple of them passed, but the majority failed. But we expected as much...

Looking into the first failure, Delete_GivenPoco_ShouldDeletePoco: on the very first line, i.e. the Insert(person), I got Oracle.ManagedDataAccess.Client.OracleException : ORA-01465: invalid hex number. This is most likely due to the RAW(16) data type used for storing Guid.

As I'm debugging, I saw the following insert statement:

INSERT INTO "PEOPLE" ("ID","FULLNAME","AGE","HEIGHT","DOB") VALUES (:0,:1,:2,:3,:4)

I found that the default implementation of EscapeSqlIdentifier in PetaPoco.Providers.OracleDatabaseProvider is confusing the issue.
An identifier is either delimited (surrounded by double quotes), which makes it case sensitive, or ordinary, which has limits (it must start with a letter and can only contain letters, digits and underscores) but is case insensitive.

public override string EscapeSqlIdentifier(string sqlIdentifier) => $"\"{sqlIdentifier.ToUpperInvariant()}\"";

By default, Oracle objects will be created in uppercase unless surrounded by double quotes. The user is pretty much in complete control of naming tables, columns, etc.

Shortcut would be to just return the identifier that was passed in, but I can also use Regex to determine whether quotes are required and include them if not already included.

It seems the intention was to automagically give the user the correct syntax, so obviously I'm leaning towards the latter.
The question is what do you guys prefer @Ste1io, @pleb, @asherber ?

PS: Technically this is me getting distracted, so I just wanted to mention this, put a FIXME or TODO, and continue.

@Ste1io
Collaborator Author

Ste1io commented Oct 21, 2023

Great timing, @Curlack. Was just getting on to check if you had started on that or wanted me to get rolling on that.

Shortcut would be to just return the identifier that was passed in, but I can also use Regex to determine whether quotes are required and include them if not already included.

For now, I'm leaning towards adding a TODO and changing the EscapeSqlIdentifier for Oracle to essentially be a no-op, as you mentioned. We can sort out how best to handle that between now and finishing the remaining tests. Personally, I'm not that keen on using regex for this, even if compiled; that should be a fairly simple check?

On that same note, I don't believe there's any error checking going on for multiple escaping, or whether that's even an issue to begin with. Figured I'd mention it nonetheless, since it seems Oracle's use of quotes could become problematic if double-escaped, though I could be wrong. E.g. (using a legal but discouraged table name): SELECT * FROM ""Foo Table"".

Worth noting that there is no escaping in the stored procedures in Oracle's sql build script, currently; if there's any in the create statements, that could be causing a fair few test failures as well.
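The "fairly simple check" could amount to something like the following (a hypothetical sketch, not the current provider code; Oracle ordinary identifiers must start with a letter and may contain letters, digits, _, $ and #):

```csharp
// Sketch: leave already-delimited identifiers alone, pass "ordinary"
// identifiers through unquoted, and delimit everything else.
public override string EscapeSqlIdentifier(string sqlIdentifier)
{
    if (sqlIdentifier.StartsWith("\"") && sqlIdentifier.EndsWith("\""))
        return sqlIdentifier; // already delimited; avoid double-escaping

    var isOrdinary = char.IsLetter(sqlIdentifier[0]) &&
        sqlIdentifier.All(c => char.IsLetterOrDigit(c) || c == '_' || c == '$' || c == '#');

    return isOrdinary ? sqlIdentifier : $"\"{sqlIdentifier}\"";
}
```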

@Ste1io
Collaborator Author

Ste1io commented Oct 21, 2023

btw @Curlack, so you're aware and to prevent any unnecessary work... I'm currently wrapping up some general cleanup across the tests in a separate private branch, proofing my doc comments from earlier, and also addressing my inline todo's (those that don't involve breaking changes).

@Curlack
Contributor

Curlack commented Oct 21, 2023

So the "invalid hex number" issue was due to the hypens in the value.

  1. Either OracleDatabaseProvider needs to handle Guid a little differently for the RAW(16) data type, i.e. Id.ToString("N") instead of Id.ToString(), OR the column type should be VARCHAR2(36), i.e. we're not going to support the recommended UNIQUEIDENTIFIER data type. Telling the OracleDatabaseProvider it HasNativeGuidSupport doesn't work.
  2. DATETIME is also finicky. OracleDatabaseProvider needs to use the TO_TIMESTAMP or TO_TIMESTAMP_TZ functions, and be sure to at least abbreviate the month in the format string to not confuse day for month when the day is less than 13, lol.

The following sql successfully inserted the record.

INSERT
    INTO PEOPLE
        (ID,
        FULLNAME,
        AGE,
        HEIGHT,
        DOB)
    VALUES
        ('9b3e6f88d0eb4fa387fad7a33b09d2d9',
        'Peta',
        18,
        180,
        TO_TIMESTAMP_TZ('1945-DEC-01 05:09:04 -8:00', 'yyyy-mon-dd hh24:mi:ss tzh:tzm'))

I'll make the "noop" change and push the tests as-is. Then we divide and conquer, unless you also want me to change the db scripts based on the route we decide to take for item number 1 above?

@Ste1io
Collaborator Author

Ste1io commented Oct 21, 2023

Divide and conquer works for me. If you don't see any issues with how the db scripts are currently, no need to change them until we need to. I'd err on the side of "best practice" when it gets to db scripts in general.

VARCHAR2(36) sounds like the right approach to me if we're dealing with the GUIDs as strings... 9b3e6f88d0eb4fa387fad7a33b09d2d9 appears to be the hexadecimal conversion of the string. What about working with GUIDs as 16-byte arrays? That's commonly used for GUID storage, and might be what Oracle is wanting in this case? [EDIT: Just noticed you had (16) after RAW, so I guess you already went that route initially.]

Re timestamps, sounds like DateTimeOffset might be appropriate here, beyond the script usage... Need to double check how the other db's are handling datetime fields again as a refresher.

dateTimeOffset.ToString("yyyy-MMM-dd HH:mm:ss zzz", CultureInfo.InvariantCulture);

@Curlack
Contributor

Curlack commented Oct 21, 2023

Ok cool. I changed the db scripts to use VARCHAR2(36), I agree with you, keep it simple. We won't have any issues with primary keys and referential integrity, but it will be case sensitive. Small price to pay considering it's all handled by the system.

Good news: over 60% of the tests pass.
Bad news: the ones that are failing, are tough.

@Ste1io
Collaborator Author

Ste1io commented Oct 21, 2023

Good news: over 60% of the tests pass.
Bad news: the ones that are failing, are tough.

Easy. New game plan:

(meme image)

Ste1io added a commit that referenced this issue Oct 21, 2023
Merging changes from private feature branch to collaboratively address known failing integration tests. See #699 (comment) for details.
@Ste1io
Collaborator Author

Ste1io commented Oct 29, 2023

Yes...not sure what causes it, but restarting the msaccess service fixes it.

@Curlack
Contributor

Curlack commented Oct 29, 2023

I don't think you mean "windows service", but here it goes...
How do I do that?

@Curlack
Contributor

Curlack commented Oct 29, 2023

Btw, quickly trying to retype everything from my head. Wanted to undo a local commit and clicked "Reset (--hard)" by accident, and like magic all my changes were gone, ai! Still have some big parts of the code changes left to work with. Hold thumbs.

PS: I'm still crying a little bit.

@Ste1io
Collaborator Author

Ste1io commented Oct 29, 2023

Btw, quickly trying to retype everything from my head. Wanted to undo a local commit and clicked "Reset (--hard)" by accident, and like magic all my changes were gone, ai! Still have some big parts of the code changes left to work with. Hold thumbs.

PS: I'm still crying a little bit.

OK, don't freak out... :) Any other git commands besides that one? And any additional flags with the reset command?

For reference:

git reset --hard will undo your last commit, including the changes in it (e.g., your working tree is reset to the state it was in prior to that commit)
git reset --soft will undo the last commit, but leave those changes staged (e.g., your working tree is reset to the prior commit, with that commit staged; useful if you need to fix something in the previous commit, assuming it hasn't been pushed to remote yet).
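For illustration, both behaviors can be reproduced in a throwaway repository (a sketch; assumes git is installed, and the paths and inline identity are arbitrary):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git -c user.email=demo@example.com -c user.name=demo commit -q --allow-empty -m "first"
echo work > file.txt
git add file.txt
git -c user.email=demo@example.com -c user.name=demo commit -q -m "second"

git reset --soft HEAD~1   # "second" is undone, but file.txt stays staged
git status --porcelain    # shows: A  file.txt

git reset --hard HEAD     # discards the staged change entirely
[ ! -f file.txt ] && echo "file gone"
```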

You can possibly still recover the changes from the last commit that was reset using reflog. Is that what you're wanting to do at this point @Curlack?

@Ste1io
Collaborator Author

Ste1io commented Oct 29, 2023

Don't make any commits. If you have any changes on your working tree, stash them. Then run git reflog, and grab the hash of the missing commit. You might need to guess at this one, but assuming you haven't done anything since the reset, it's probably the last commit hash. Then make a new branch (like "recovered-branch") from that commit hash with git branch recovered-branch <hash>. Check out that branch (git checkout recovered-branch), and let me know if your changes are there.

@Ste1io
Collaborator Author

Ste1io commented Oct 29, 2023

I don't think you mean "windows service", but here it goes... How do I do that?

I was referring to restarting the msaccess db instance.

@Curlack
Contributor

Curlack commented Oct 29, 2023

Don't make any commits. If you have any changes on your working tree, stash them. Then run git reflog, and grab the hash of the missing commit. You might need to guess at this one, but assuming you haven't done anything since the reset, it's probably the last commit hash. Then make a new branch (like "recovered-branch") from that commit hash with git branch recovered-branch <hash>. Check out that branch (git checkout recovered-branch), and let me know if your changes are there.

(screenshot)

Man, you know your stuff! Unfortunately I don't see any recovered changes, but then I also lost changes I hadn't committed yet (local only). Anyway, almost done recalling from memory. Thanks bro!

@Curlack
Contributor

Curlack commented Oct 29, 2023

Ok so all tests now pass, see #709.
Hope the branch issues I had on my local branch/fork didn't bleed into the PR.

Calling it a night, tired of doing these changes twice hahaha.
Enjoy!

@Ste1io Ste1io changed the title Add integration testing for Oracle, for official DBMS support Add support for Oracle: implementation and tests Oct 30, 2023
@Curlack
Contributor

Curlack commented Nov 1, 2023

Found a weird issue when running tests requiring a custom provider and/or mapper in parallel with tests using the defaults. It seems the handling of the data, especially in my case (TitleCase vs UPPERCASE), bleeds from one into the other.
On further investigation, it turns out that it's not right for PocoData to use a global cache (_pocoDatas) but not take the mapper into consideration. Seems like the cache key should be Type AND mapper for `PocoData`.

I bet the same thing is true for PocoFactories cache.

It can be reproduced fairly easily by debugging two tests in parallel, as long as they both use the same poco type (e.g. Note) with different mappers. Whoever gets to ForType first will dictate the way queries get generated.

Something like this will do:

private class CustomConventionMapper : ConventionMapper
{
    public CustomConventionMapper() => MapColumn = Customize(MapColumn);

    private Func<ColumnInfo, Type, PropertyInfo, bool> Customize(Func<ColumnInfo, Type, PropertyInfo, bool> original)
    {
        return (ci, t, pi) =>
        {
            if (!original(ci, t, pi)) return false;

            ci.ColumnName = ci.ColumnName.ToUpperInvariant();
            return true;
        };
    }
}

The issue came up in the Fetch_ForDynamicTypeGivenSql_ShouldReturnValidDynamicTypeCollection test (one of many). The line giving the error was in QueryTests.AddOrders when inserting a new Person.

I can run either all the "Ordinary" or all the "Delimited" tests in parallel, but not both together... Pop goes the weasel!
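A hedged sketch of the composite-key change being suggested (hypothetical code; the actual _pocoDatas field and PocoData constructor in PetaPoco may differ):

```csharp
// Sketch: key the PocoData cache on (poco type, mapper) so that two
// different mappers never share cached column metadata for the same type.
private static readonly ConcurrentDictionary<Tuple<Type, IMapper>, PocoData> _pocoDatas =
    new ConcurrentDictionary<Tuple<Type, IMapper>, PocoData>();

public static PocoData ForType(Type type, IMapper mapper) =>
    _pocoDatas.GetOrAdd(Tuple.Create(type, mapper), _ => new PocoData(type, mapper));
```

Note this relies on the mapper instance being reference-equal (or implementing equality) across calls; otherwise the cache degenerates into one entry per mapper instance.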

I think my changes are ready to be committed. Can I create the PR or would you like me to take a stab at fixing this issue beforehand?

PS: I'm still having those pesky commits attached to my tail.

@Curlack
Contributor

Curlack commented Nov 1, 2023

I was able to flatten the history (sync fork/fetch/pull/rebase), unfortunately I wasn't able to get rid of the commits since 22 Oct.

This is what my branch looks like atm.
(screenshot)

Tried checkout and cherry-picking again, but was losing too many changes when I looked at the diffs, so just rebased to origin.
Tried a couple of things, got scared and stopped. Kept me up till 3AM this morning.
So I'm still picking up these commits.

(screenshots)

Anything I should/can do to try and get rid of it (without losing changes)?

@asherber
Collaborator

asherber commented Nov 1, 2023

On the PocoData cache issue: #517

@asherber
Collaborator

asherber commented Nov 1, 2023

On the git issue: Which branch in your repo, and what are you trying to do with it?

@Curlack
Contributor

Curlack commented Nov 1, 2023

My feature/oracle-support branch is apparently 13 commits behind when 10 of them have already been merged into the upstream branch of the same name. The original issue came in when I didn't sync my fork after those merges.
I want to know if there is something I can do to have my PR only contain the changes not yet merged.

In the meantime I've been thinking about making a backup of the code on my drive, recreating the fork, and copying the code over the fresh download (should only show the differences). Would that work?

EDIT: Would also get rid of the dud PR, lol

@asherber
Collaborator

asherber commented Nov 1, 2023

13 commits behind what? What branch do you want to merge into, and which 3 commits do you want to keep?

@Curlack
Contributor

Curlack commented Nov 1, 2023

13 commits behind what? What branch do you want to merge into, and which 3 commits do you want to keep?

See my screenshots above and this one below (I mean ahead).

(screenshot)

When I click "Compare & pull request", I see the commits from the previous 2 PRs @Ste1io already merged into the upstream branch. OMG I'm screwing this up really bad, lol. Sorry, I'm not yet fluent in git.

So the plan is to create the PR from this branch against that upstream branch.

@asherber
Collaborator

asherber commented Nov 1, 2023

On git: The O'Reilly Git Pocket Guide is an excellent intro, even if it is 10 years old. And all of Pro Git is free online.

Every git commit has a hash value (the 7 chars on the right of the GH screenshot above). Which three commits do you want to keep?

As for your comment about making a backup of the code on your drive, git calls that a clone, and it's the usual way of working with a repo. You forked PetaPoco into your account, now you clone it and work locally. All your dev work and commits are made locally, and then when you're ready they're pushed to your fork on GitHub. Then you open a pull request to get those changes merged to the main PetaPoco repo.

@Curlack
Contributor

Curlack commented Nov 1, 2023

Thanks so much. I'll study that in my spare time.
I wanted to keep bbcfcb0, 4c84dea and b0649a0, but I realize I can't do that, otherwise I'll lose changes again. So what I'll do is start a new fork and clone it, copy my files over to the new folder from the backup, and then go through the changes and do proper commits before the PR. Thanks again for the assist.

@asherber
Collaborator

asherber commented Nov 1, 2023

Hold on, there might be an easier way.

@asherber
Collaborator

asherber commented Nov 1, 2023

An important thing to keep in mind is that a git commit is not (just) a changeset. Technically, it's defined by the hash value. When you do things like rebasing and force pushing, you're creating entirely new commits which happen to have the same content as other commits. Now when you try to merge, git sees new commits (hashes it doesn't know about), but it will also understand what changes need to be made to the source files.

For example, I see your commit 31bcf, "Cater for optional parameter prefix". Looks like you rebased or did something else which created 0dfc8 -- same content, different commit. Depending on what you're merging, git will try to preserve both commits, but all that does is pollute the commit history (which we try to avoid) -- if they've got the same content, you're actually not introducing any other changes.

Take a look at feature/oracle-support...asherber:foo. Never mind the list of commits -- does that represent the contents of what you want to merge?

@Curlack
Contributor

Curlack commented Nov 1, 2023

Eureka! That's exactly what I was after, unfortunately I saw your message too late. Started new fork already.
I couldn't unclutter the commit history. Did you do that straight from the web UI?

@asherber
Collaborator

asherber commented Nov 1, 2023

The GitHub UI is not good for any kind of branch management. What I did locally was create a new branch from 719731 and then cherry pick bbcfc and b0649.

Here are git commands you could use locally to accomplish this, using your existing oracle-support branch. Note that this will throw away any commits on your branch earlier than 71973.

git checkout feature/oracle-support
git reset --hard 71973
git cherry-pick bbcfc
git cherry-pick b0649
git push --force

I do many git things from the command line, but a GUI tool is invaluable to see the branch relationships. My personal favorite is https://gitextensions.github.io/

@Curlack
Contributor

Curlack commented Nov 1, 2023

At long last, success. All Oracle tests are passing (#710). @asherber I only needed the cache fix, plus a custom provider and mapper for the "Ordinary" identifiers case. As far as dynamic object casing is concerned, I created overrides for the failing tests to use the correct casing coming from the poco factory, which the user would be aware of, since he's the one that created the custom provider and mapper.
This time, touch points in the core were kept to a minimum, and all the "Delimited" tests passed with out-of-the-box functionality without breaking a sweat.

Viva PetaPoco! 😄 ✌️

@Ste1io
Collaborator Author

Ste1io commented Nov 5, 2023

Found a weird issue when running tests requiring a custom provider and/or mapper in parallel with tests using the defaults. It seems the handling of the data, especially in my case (TitleCase vs UPPERCASE), bleeds from one into the other. On further investigation, it turns out that it's not right for PocoData to use a global cache (_pocoDatas) but not take the mapper into consideration. Seems like the cache key should be Type AND mapper for `PocoData`.

@Curlack what's the status on this? If this is indeed a bug, could you file an issue for it specifically so it's not forgotten? That should be patched separately outside of the feature branch, though, as a bugfix. Still have a lot of catching up to do on here, so apologies in advance if I'm touching on something that's already been handled.

On the plus side, I was able to spend most of the morning pulling last week's thoughts and code together, so I have something coherent for you two to rip into. :)

PS: I'm still having those pesky commits attached to my tail.

Did you get your fork and branches squared away with git?

@Curlack
Contributor

Curlack commented Nov 5, 2023

@Ste1io I'll submit another PR for this once the core change has been finalized on the develop branch. Issue has already been logged (See #517). You can follow our discussion in PR #711.

I was able to get rid of my tail btw, thanks.

@Ste1io
Collaborator Author

Ste1io commented Nov 6, 2023

@Curlack, I've got an early day tomorrow and have to get off shortly, so I'll post the discussion I mentioned in my last message tomorrow. It details quite a few things that I encountered over the weekend while exploring some approaches to the quoting issue with runtime objects like anonymous types and dynamic objects.

Regarding my solution, however, here's a very abbreviated synopsis, so you and @asherber can mull over it between now and then.

Several months ago, I created a helper class, PocoDataHelper<T> designed to act as a bridge of sorts (I guess decorator would be the more correct term) between PocoData and the escape methods in IProvider. Goals:

  • Remove the boilerplate code currently necessary (creating a PocoData instance, followed by verbose calls to the provider, which wrap even more verbose linq queries for looking up the column name)
  • Accommodate more readable and expressive code in my personal projects
  • Promote more refactor-friendly and maintainable code, leveraging expressions to my advantage to eliminate the use of strings wherever possible
  • Provide intellisense support when referencing typed columns (which comes for free thanks to expressions)
  • Make it easier for the end user - aka myself - to do the right thing because it's easier than cutting corners or c/ping from another section of code
  • Do all this in a non-intrusive manner by applying the open/closed principle

Regarding my use of expressions, it's worth noting that my implementation does not incur the typical performance hit as is usually the case with compiled expressions: there is actually no compilation being done, I'm simply extracting the property name from the expression, and doing a lookup for it in the PocoData.Columns cache.
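
To make the no-compilation point concrete, here's a rough sketch of the kind of extraction I mean (the class and method names here are hypothetical placeholders, not the actual helper's API; the real helper then uses the extracted name only as a lookup key into PocoData.Columns):

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical sketch: pull the property name out of a member-access
// expression without ever calling Expression.Compile().
public static class ExpressionNameExtractor
{
    public static string GetPropertyName<T>(Expression<Func<T, object>> expr)
    {
        // Value-type members get wrapped in Convert(...) when the lambda
        // returns object, so unwrap the conversion first.
        var body = expr.Body is UnaryExpression unary ? unary.Operand : expr.Body;

        if (body is MemberExpression member)
            return member.Member.Name; // e.g. c => c.Status yields "Status"

        throw new ArgumentException("Expression must be a simple member access.", nameof(expr));
    }
}
```

Since no delegate is ever compiled, the cost is just a couple of node inspections plus a dictionary lookup.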

An optional IMapper can be injected during construction, with the same default the core library already uses. I'm also testing a version today that sets the PocoData.TableData.TableName property when an instance is constructed from the non-generic base PocoDataHelper class constructor, to further eliminate repeated code when used with runtime objects.

Escaping/quoting behavior is configured on a per-instance basis for flexibility, with the ability to override it as needed when referencing an identifier, such as in this test:

var pd = new PocoDataHelper<Order>(DB, escapeIdentifiers: false);
var sql = $"SELECT * FROM {pd.GetTableName(escapeIdentifier: true)} " +
          $"WHERE {pd.GetColumnName(c => c.Status, escapeIdentifier: true)} = @0";

My intentions were to finish full test coverage and do some perf profiling before bringing it to the table as something to consider adding, but as it currently offers the most feasible and safest solution to handling the intricacies with Oracle, I'm letting her enter the fray early. I've also been using it in two projects since originally writing her, with no problems.

I'm sure you have an idea of the direction I'm suggesting by now, but just to be clear, a few considerations below. The current implementation I'm using for identifier escaping could be applied to the Oracle-specific configurations also, with sane defaults. Currently I'm leaning towards something along the lines of the following:

QuotingPolicy: Never (default), Always, WhenNecessary
CasingPolicy: FoldToUpper (default), FoldToLower, PreserveCase

This could be set on a global level,

var db = DatabaseConfiguration.Build()
                              .UsingConnectionStringName("Oracle")
                              .UsingQuotingPolicy(QuotingPolicy.Never)     // Never (default), Always, WhenNecessary
                              .UsingCasingPolicy(CasingPolicy.FoldToUpper) // FoldToUpper (default), FoldToLower, PreserveCase
                              .Create();

which the Oracle-specific helper class would respect unless overridden for specific instances (per-table override) or call sites (per-column override).
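
To sketch how the helper might actually apply those two policies to an identifier (entirely hypothetical names and logic, not a final API; the folding mirrors how Oracle treats unquoted identifiers as upper case):

```csharp
public enum QuotingPolicy { Never, Always, WhenNecessary }
public enum CasingPolicy { FoldToUpper, FoldToLower, PreserveCase }

public static class OracleIdentifier
{
    // Hypothetical helper: fold the identifier per the casing policy,
    // then decide quoting per the quoting policy.
    public static string Apply(string name, QuotingPolicy quoting, CasingPolicy casing)
    {
        var folded = casing switch
        {
            CasingPolicy.FoldToUpper => name.ToUpperInvariant(),
            CasingPolicy.FoldToLower => name.ToLowerInvariant(),
            _ => name
        };

        var needsQuotes = quoting == QuotingPolicy.Always ||
                          (quoting == QuotingPolicy.WhenNecessary && RequiresQuoting(folded));

        return needsQuotes ? $"\"{folded}\"" : folded;
    }

    private static bool RequiresQuoting(string name)
    {
        // Lower-case letters, or characters outside letters/digits/underscore,
        // only survive in Oracle when the identifier is quoted.
        foreach (var c in name)
            if (char.IsLower(c) || !(char.IsLetterOrDigit(c) || c == '_'))
                return true;
        return false;
    }
}
```

With WhenNecessary, an all-caps identifier stays bare while anything mixed-case or exotic gets quoted, which is the least surprising behavior for existing Oracle schemas.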

Keep in mind, one reason there are currently so many overloads for the base CRUD functionality is to pass through the data needed for that particular database operation when dealing with a runtime type. An anonymous or dynamic type obviously can't have attributes. So the primary key gets passed in through the call to Fetch, Query, Update, and all the others.

Currently there's one Oracle-specific attribute, SequenceName, along with several others that haven't been brought up yet in these discussions, but which do exist and will rear their ugly heads the minute we think we've got it. If we're not using a typed class and attributes, how do we pass it through?

Option 1: More overloads, add a parameter for it, handle it at the core level
Option 2: Add it to our decorator class, which extends PocoDataHelper, perform any necessary logic there, then let the user's method call work with our fixed-up final value no differently than with any other provider.

This also has the added benefit of being able to specify something, such as quoted/unquoted, in a flexible manner even inline, with no adverse effects. True, it can be done manually, though I think we all agree that argument has nothing going for it given how error-prone it is (besides locking you into that DBMS without a major overhaul that touches everything everywhere). A trivial example just to illustrate:

// Quoted or unquoted column name, with the following signature:
db.SomeOperation<T>(string tableName, string primaryKeyColumnName);

// At callsite:
dynamic myFooTable = //...
var pdh = new PocoDataHelper(db, myFooTable, "Id"); // assume unquoted is default
db.SomeOperation<dynamic>(tableName: "FooTable", pdh.GetColumnName("BarColumn", true)); // quoted override for db column 'BarColumn'
db.SomeOperation<dynamic>(tableName: "FooTable", pdh.GetColumnName("BazColumn")); // unquoted default for db column 'BAZCOLUMN'

In closing:

  • Yes, I'm introducing a new design pattern, which I'm usually hesitant to do. However, it improves the usability of the entire codebase, not just Oracle, and provides an extensible and flexible tool to accommodate Oracle's edge cases without introducing potential regressions from invasive core library changes.
  • Implementing Oracle-specific behavior from the outside, without coupling it to core code that touches other providers, ensures that none of the current providers are affected. This means no performance hit on other providers, and no change to the current API surface (out of necessity to deal with Oracle).
  • Quoting logic would be handled here in PocoDataHelper<T> (or a derived OraclePocoDataHelper<T>), with the escaping method in OracleDatabaseProvider being a non-transformative pass-through method.
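
For illustration, the non-transformative pass-through could be as minimal as this (a hypothetical sketch assuming PetaPoco's DatabaseProvider base class, not the actual implementation):

```csharp
// Hypothetical sketch: with quoting decided up-front by the helper,
// the Oracle provider's escape method becomes a no-op that returns
// whatever identifier it's handed, instead of wrapping everything
// in double quotes.
public class OracleDatabaseProvider : DatabaseProvider
{
    public override string EscapeSqlIdentifier(string sqlIdentifier)
        => sqlIdentifier;
}
```

That keeps the provider dumb and predictable, and concentrates all the Oracle casing/quoting intelligence in one place.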

This was far less "brief" than I meant when I started, but hopefully it gives everyone something to chew on, and I'm looking forward to your thoughts.

I updated all synchronous test methods for both PetaPoco.Tests.Integration/Databases/QueryTests.cs and SQLite's derived QueryTests, with all tests passing. Please take a few minutes to look over the diffs for those two linked files, as well as the class itself (under 85 lines), in the context of Oracle's identifier handling, especially with anonymous and dynamic types.

Commit history, including diffs from the test cases, is linked here.

@Curlack
Contributor

Curlack commented Nov 6, 2023

I like this a lot. Can't wait to see the full version.

@Curlack
Contributor

Curlack commented Nov 8, 2023

Would this PocoDataHelper also be able to make edge cases like #657 easier to deal with?

@Ste1io
Collaborator Author

Ste1io commented Nov 9, 2023

I've only briefly skimmed over the thread, and that was a few months ago, so I can't say for sure until I can review it a little more thoroughly; at this point you probably would know better than I.

The property-to-column map is just a simple string lookup that saves PetaPoco's cached column on first access; no need for more than that in this case, really. As with the column cache, though, it really depends on whether the performance gains from avoiding repeated lookups into the global cache justify the (relatively little) extra data storage incurred, as well as the cost of synchronizing cache flushes in case any long-lived instances are in use. I didn't want to invest too much time into caching until I had a chance to profile it, though; what I originally wrote it for, it does damn well, so until now the rest was just a bonus.

Since it's self-contained and only represents a single table per instance, I guess it could potentially help, from what little I read, possibly with a couple of minor adjustments. Just keep in mind it still uses the global cache as the source of truth on the first column lookup, if that makes any difference.
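
To sketch the kind of first-access caching I mean (hypothetical names; in the real helper, PetaPoco's PocoData.Columns is the global source of truth, represented here by a plain delegate):

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical sketch: a per-instance map from property name to column
// name, seeded lazily from a global lookup on first access.
public class ColumnNameCache
{
    private readonly ConcurrentDictionary<string, string> _local =
        new ConcurrentDictionary<string, string>();
    private readonly Func<string, string> _globalLookup;

    public ColumnNameCache(Func<string, string> globalLookup)
        => _globalLookup = globalLookup;

    // First call per property hits the global lookup; subsequent calls
    // are served from the local dictionary.
    public string Get(string propertyName)
        => _local.GetOrAdd(propertyName, _globalLookup);

    // Long-lived instances would need this called whenever the global
    // cache is flushed, or they'd serve stale column names.
    public void Flush() => _local.Clear();
}
```

That stale-entry risk on flush is exactly the synchronization concern mentioned above, and why profiling should come before committing to the cache.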
