Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Getting two database streams from the same database connection in parallel #427

Open
mickvikt opened this issue Mar 22, 2023 · 5 comments
Open
Labels
question Further information is requested

Comments

@mickvikt
Copy link

mickvikt commented Mar 22, 2023

Hello, nice package you have here for ETL in .NET!
I have a question regarding database streams:

Suppose I need to define two streams for two different tables from the same ODBC database connection:

        var arch1 = contextStream
            .CrossApplySqlServerQuery("first query from source 1", o => o
                    .FromQuery("select * from dbo.carr")
                    .WithMapping(i => new
                    {
                        carr_code = i.ToColumn("carr_code"),
                        carr_name = i.ToColumn("carr_name")
                    })
                , "source1")
            .Select("create row to save source1 from first query", i => new { i.carr_name, i.carr_code });
        
        var arch2 = contextStream
            .CrossApplySqlServerQuery("second query from source 1", o => o
                    .FromQuery("select * from dbo.carr_old")
                    .WithMapping(i => new
                    {
                        carr_code_old = i.ToColumn("carr_code"),
                        carr_name_old = i.ToColumn("carr_name")
                    })
                , "source1")
            .Select("create row to save source1 from second query", i => new { i.carr_name_old, i.carr_code_old });

When executing this I get this error from the ODBC driver:

Unhandled exception. Paillave.Etl.Core.JobExecutionException: Job execution failed
 ---> System.Data.Odbc.OdbcException (0x80131937): ERROR [HY000] [Microsoft][ODBC SQL Server Driver]Connection is busy with results for another hstmt

This implies that the cursor for the connection has to be looped through to completion before being able to get another stream. Is there any way that this could be remedied? The source database is an old SQL Server 2000, so we're forced to use ODBC.
Thank you.

BTW, it works if I output one stream to a file and then define a new stream sequentially like this:

        var arch = contextStream
            .CrossApplySqlServerQuery("query from source 1", o => o
                    .FromQuery("select * from dbo.carr")
                    .WithMapping(i => new
                    {
                        carr_code = i.ToColumn("carr_code"),
                        carr_name = i.ToColumn("carr_name")
                    })
                , "source1")
            .Select("create row to save source1", i => new { i.carr_name, i.carr_code })
            .ToTextFileValue("to file for source1", @"C:\temp\carr_out_arch.csv",
                FlatFileDefinition.Create(f => new { carr_name = f.ToColumn("Name"), carr_code = f.ToColumn("Code") })
                    .IsColumnSeparated('|'))
            .WriteToFile("save to source1 output file", i => i.Name)
            .CrossApplySqlServerQuery("query from source 2", o => o
                    .FromQuery("select * from dbo.carr_old")
                    .WithMapping(i => new
                    {
                        carr_code_old = i.ToColumn("carr_code"),
                        carr_name_old = i.ToColumn("carr_name")
                    })
                , "source1")
            .Select("create row to save source2", i => new { i.carr_name_old, i.carr_code_old })
            .ToTextFileValue("to file for source2", @"C:\temp\carr_out_arch2.csv",
                FlatFileDefinition.Create(f => new { carr_name_old = f.ToColumn("Name"), carr_code_old = f.ToColumn("Code") })
                    .IsColumnSeparated('|'))
            .WriteToFile("save to source2 output file", i => i.Name);
@Radamonas
Copy link

Just worth to mention, that this is the new beta version which we are using to exercise data transfers from old SQL instance.

@paillave
Copy link
Owner

I am working on ODBC and OLEDB at the moment. I will let you know when it is OK. The related issue is this one: #423

@paillave paillave added duplicate This issue or pull request already exists question Further information is requested and removed duplicate This issue or pull request already exists labels Mar 22, 2023
@paillave
Copy link
Owner

I realise I didn't answer correctly to your question.
Actually, OLDB drivers don't support concurrent calls like mentionned here: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/d51d71e3-512f-4143-a8db-3e39286cb746/microsoftodbc-sql-server-driverconnection-is-busy-with-results-for-another-hstmt?forum=sqldataaccess

So it you want to avoid parallelization, you will have to use WaitWhenDone operator. An example of how to use it is here: #294 (comment)

@Radamonas
Copy link

Radamonas commented Mar 23, 2023

@paillave could you add WaitWhenDone to documentation, this was a surprise for me as I haven't found any mentions on the main page of this. If that not absolute.

@paillave
Copy link
Owner

It will be added... the (complete) documentation is an ongoing project!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants