vcat without copy #3336

Open
xgdgsc opened this issue May 24, 2023 · 3 comments

Comments

@xgdgsc
Contributor

xgdgsc commented May 24, 2023

Currently this is possible by applying CatViews to the columns and then constructing a DataFrame with copycols=false. Would it be better if this became a built-in?

@bkamins bkamins added this to the 1.6 milestone May 24, 2023
@bkamins
Member

bkamins commented May 24, 2023

Typically we prefer to follow the "composability" pattern. This means that users who want such functionality are advised to get it from the extra package.

An additional point is that most of the time DataFrames.jl users expect operations like push! to work on a data frame. If I understand correctly how CatViews.jl works, that would not be possible here. Right?

@xgdgsc
Contributor Author

xgdgsc commented May 25, 2023

What happens when a DataFrame constructed with copycols=false is push!ed to? Is there a copy without warning?

I just thought that if construction by columns without copy is already supported, it might make sense to support construction by rows without copy too. Or move all no-copy construction to another package?

I don't know whether issues in lesser-known packages, like ahwillia/CatViews.jl#23, would be more discoverable in either case. Composability is nice if the user already knows a lot of packages. https://github.com/mcabbott/LazyStack.jl has an interesting summary of the alternatives, which seem hard for a beginner to choose between (I only discovered this page after fixing CatViews). Indeed it's hard to decide the boundaries, as the lengthy Fixing Package Fragmentation discussion shows.

@bkamins
Member

bkamins commented May 25, 2023

> Is there a copy without warning?

No. In that case the source object is mutated. In your case it cannot be mutated, so you would get an error.
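A minimal sketch of that behavior, assuming a recent DataFrames.jl release. The variable names are made up for illustration, and a plain `view` stands in for a non-resizable column type such as a CatView (CatViews.jl itself is not used here):

```julia
using DataFrames

# A plain Vector supports push!, so the data frame grows it in place.
v = [1, 2, 3]
df = DataFrame(a = v, copycols = false)
@assert df.a === v          # the column is the very same object as v

push!(df, (a = 4,))
@assert length(v) == 4      # the source vector was mutated; no copy was made

# Assumption: a view behaves like other non-resizable column types here.
w = view([10, 20, 30], 1:3)
df2 = DataFrame(a = w, copycols = false)
failed = try
    push!(df2, (a = 40,))
    false
catch
    true
end
@assert failed              # push! raised an error instead of copying
@assert nrow(df2) == 3      # df2 still has its original three rows
```

The exact error type and any logged message may differ between versions, so the sketch only checks that an error is raised.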

> it might make sense to support construction by rows without copy too.

Construction by rows must copy, as DataFrame uses column storage internally.
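As a small illustration (a sketch with made-up data, not a statement about internals beyond what is said above), building a data frame from a vector of rows produces fresh column vectors:

```julia
using DataFrames

# Rows given as NamedTuples; the names and values are illustrative only.
rows = [(a = 1, b = 2.0), (a = 2, b = 3.0)]
df = DataFrame(rows)

@assert df.a == [1, 2]
@assert df.a isa Vector{Int}   # values were gathered into a new column vector

df.a[1] = 10
@assert rows[1].a == 1         # the original rows are unaffected
```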

> Indeed it's hard to decide the boundaries

This can be hard indeed. In DataFrames.jl we use the rule (relevant to your case) that we do not introduce new column types; users can use column types from other packages. Sadly, even this rule is violated by stack, but that is for historical reasons, dating from when DataFrames.jl was a catch-all package.

@bkamins bkamins modified the milestones: 1.6, 1.7 Jul 10, 2023