Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chunk calls to DBT based on string length #1354

Conversation

willbowditch
Copy link
Contributor

@willbowditch willbowditch commented Jan 5, 2024

Fixes issue #1353

Bug

In some scenarios the argument list Elementary calls DBT with when uploading source freshness exceeds the limit of the operating system.

This is because the chunking of calls to DBT is currently based on the number of sources and not the length of the arguments. So some combination of sources results in longer argument lists than others.

This PR changes the chunking to use the character length and adds a buffer to allow for a bit of variation.

Testing

I've tested this locally by patching elementary in our deployment and confirming the results are uploaded.

On occasion 100 sources will result in an argument length that is too long, instead chunk using the serialised json string representation of the argument
@haritamar
Copy link
Collaborator

Hi @willbowditch !
Thank you for opening this PR and sorry for only just commenting on it.

I noticed that meanwhile we made a different change to upload_source_freshness that allows controlling the chunk size + lowering the default:
#1379

I know this is not the same, but the mechanisms are now conflicting.
If you feel this is still needed - I believe it makes sense to combine the two mechanisms so the rows_per_insert param would keep functioning (e.g. the default of it can be None, but if it's not None, then in any case don't create chunks with more rows than this value).

I'm closing this PR for now, but if you'd like please feel free to re-open with the changes above (or an alternative that makes sense)

@haritamar haritamar closed this May 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants