Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Running a stored procedure only once after all records in a CSV file have been saved to the database #445

Open
mickvikt opened this issue Jun 8, 2023 · 2 comments
Assignees
Labels
documentation Documentation may be added or completed on the portal question Further information is requested

Comments

@mickvikt
Copy link

mickvikt commented Jun 8, 2023

Let's say we have a data pipeline like this:

        var stream = contextStream
            .ToSqlCommand("Create tables", Queries.CreateTables)
            .CrossApplyFolderFiles("Get data file", "data.csv", true)
            .CrossApplyTextFile("Parse data", FlatFileDefinition.Create(
                i => new
                {
                    code = i.ToColumn(0),
                    name = i.ToColumn(1)
                }).IsColumnSeparated(','))
            .SqlServerSave("Populate data table", o => o
                .SeekOn(i => code)
                .ToTable("[dbo].[some_table]")
            )
	   .ToSqlCommand("Run a stored procedure", Queries.RunStoredProcedure);

A CSV is parsed, its records are saved to a table on a SQL Server database and then I want to run a stored procedure which depends on that data.

If I try to run it and the CSV file contains 5 rows, stored procedure is being run 5 times, and I only need to run it once.

I tried to create a second data pipeline to run the stored procedure like this:

        var stream = contextStream
            .ToSqlCommand("Create tables", Queries.CreateTables)
            .CrossApplyFolderFiles("Get data file", "data.csv", true)
            .CrossApplyTextFile("Parse data", FlatFileDefinition.Create(
                i => new
                {
                    code = i.ToColumn(0),
                    name = i.ToColumn(1)
                }).IsColumnSeparated(','))
            .SqlServerSave("Populate data table", o => o
                .SeekOn(i => code)
                .ToTable("[dbo].[some_table]")
            );

       contextStream
	   .ToSqlCommand("Run a stored procedure", Queries.RunStoredProcedure);         

but when I try to execute it, stored procedure is not being run after whole CSV import is complete and rows are stored in the table, it's being run somewhere in between and the stored procedure can't get the data it needs since it has not been saved yet.

How can I signal to the pipeline, that I need to run the stored procedure just once?
I tried to create a separate DefineProcess method for running SP, but that seems really clumsy.

What would you advice?
Thank you.

@paillave
Copy link
Owner

I suggest you to use WaitWhenDone operator the following way:

        var stream = contextStream
            .ToSqlCommand("Create tables", Queries.CreateTables)
            .CrossApplyFolderFiles("Get data file", "data.csv", true)
            .CrossApplyTextFile("Parse data", FlatFileDefinition.Create(
                i => new
                {
                    code = i.ToColumn(0),
                    name = i.ToColumn(1)
                }).IsColumnSeparated(','))
            .SqlServerSave("Populate data table", o => o
                .SeekOn(i => code)
                .ToTable("[dbo].[some_table]")
            );

       contextStream
	   .WaitWhenDone("wait everything is done", stream)
	   .ToSqlCommand("Run a stored procedure", Queries.RunStoredProcedure);

@paillave paillave self-assigned this Jun 12, 2023
@paillave paillave added question Further information is requested documentation Documentation may be added or completed on the portal labels Jun 12, 2023
@mickvikt
Copy link
Author

mickvikt commented Jun 12, 2023 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Documentation may be added or completed on the portal question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants