Search before asking
I searched in the issues and found nothing similar.
Motivation
In streaming mode, the unaware bucket mode uses the FIFOSplitAssigner, so we can avoid bin-packing several files into one split. IMO, this has two benefits:
In the FIFOSplitAssigner, we have a work-stealing mechanism that allows for higher total throughput as the split size decreases.
We can avoid the issue of skewed files, for example when two files have similar row counts but very different file sizes.
Solution
Generate one split per file in unaware bucket mode.
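To illustrate the difference, here is a minimal Python sketch, not Paimon's actual implementation. The `DataFile` class, the `bin_pack_splits` and `one_file_per_split` helpers, the target size, and the sample files are all hypothetical and only model the idea: packing by file size can put very different row counts into each split, while one file per split keeps splits small and uniform so a FIFO assigner can hand them out one at a time.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataFile:
    # Hypothetical stand-in for a data file's metadata.
    name: str
    size_mb: int
    row_count: int


def bin_pack_splits(files: List[DataFile], target_mb: int) -> List[List[DataFile]]:
    """Ordered packing: keep adding files to the current split until the
    next file would push it past target_mb, then start a new split."""
    splits: List[List[DataFile]] = []
    current: List[DataFile] = []
    current_mb = 0
    for f in files:
        if current and current_mb + f.size_mb > target_mb:
            splits.append(current)
            current, current_mb = [], 0
        current.append(f)
        current_mb += f.size_mb
    if current:
        splits.append(current)
    return splits


def one_file_per_split(files: List[DataFile]) -> List[List[DataFile]]:
    """Proposed behaviour: every file becomes its own split."""
    return [[f] for f in files]


def rows_per_split(splits: List[List[DataFile]]) -> List[int]:
    return [sum(f.row_count for f in split) for split in splits]


# Assumed sample data: same row counts, very different on-disk sizes.
files = [
    DataFile("a", size_mb=10, row_count=1000),
    DataFile("b", size_mb=10, row_count=1000),
    DataFile("c", size_mb=10, row_count=1000),
    DataFile("d", size_mb=100, row_count=1000),
]

packed = bin_pack_splits(files, target_mb=100)
single = one_file_per_split(files)

print(rows_per_split(packed))  # [3000, 1000]
print(rows_per_split(single))  # [1000, 1000, 1000, 1000]
```

In this toy input, size-based packing gives one reader three times as many rows as the other, whereas one file per split produces four equal units of work that idle readers can pick up as they finish.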
Anything else?
No response
Are you willing to submit a PR?
I'm willing to submit a PR!