separator parameter in pivot() has no effect #15775

Closed
2 tasks done
jsarbach opened this issue Apr 19, 2024 · 5 comments · Fixed by #15957
Labels
documentation Improvements or additions to documentation python Related to Python Polars

Comments

@jsarbach
Contributor

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl
df = pl.DataFrame({'ix': [0, 1], 'col1': ['a', 'b'], 'col2': [1, 2], 'val': [1, 2]})
df.pivot(index='ix', columns=['col1', 'col2'], values='val')
df.pivot(index='ix', columns=['col1', 'col2'], values='val', separator='_')
df.pivot(index='ix', columns=['col1', 'col2'], values='val', separator='letsseparate')

Log output

shape: (2, 3)
┌─────┬─────────┬─────────┐
│ ix  ┆ {"a",1} ┆ {"b",2} │
│ --- ┆ ---     ┆ ---     │
│ i64 ┆ i64     ┆ i64     │
╞═════╪═════════╪═════════╡
│ 0   ┆ 1       ┆ null    │
│ 1   ┆ null    ┆ 2       │
└─────┴─────────┴─────────┘

Issue description

According to the documentation, separator is used as the separator/delimiter in generated column names. However, it seems to be ignored, including the default _.

Expected behavior

>>> pl.DataFrame({'ix': [0, 1], 'col1': ['a', 'b'], 'col2': [1, 2], 'val': [1, 2]}).pivot(index='ix', columns=['col1', 'col2'], values='val')
shape: (2, 3)
┌─────┬─────────┬─────────┐
│ ix  ┆ a_1     ┆ b_2     │
│ --- ┆ ---     ┆ ---     │
│ i64 ┆ i64     ┆ i64     │
╞═════╪═════════╪═════════╡
│ 0   ┆ 1       ┆ null    │
│ 1   ┆ null    ┆ 2       │
└─────┴─────────┴─────────┘

Installed versions

--------Version info---------
Polars:               0.20.21
Index type:           UInt32
Platform:             Linux-6.5.0-27-generic-x86_64-with-glibc2.38
Python:               3.10.6 (main, Oct  7 2022, 20:19:58) [GCC 11.2.0]
----Optional dependencies----
adbc_driver_manager:  <not installed>
cloudpickle:          2.2.0
connectorx:           <not installed>
deltalake:            <not installed>
fastexcel:            0.10.3
fsspec:               2022.8.2
gevent:               <not installed>
hvplot:               <not installed>
matplotlib:           <not installed>
nest_asyncio:         <not installed>
numpy:                1.23.4
openpyxl:             3.0.10
pandas:               2.0.3
pyarrow:              15.0.0
pydantic:             1.10.2
pyiceberg:            <not installed>
pyxlsb:               <not installed>
sqlalchemy:           <not installed>
xlsx2csv:             0.8.2
xlsxwriter:           <not installed>

jsarbach added the bug, needs triage, and python labels on Apr 19, 2024
MarcoGorelli added the documentation label and removed the bug and needs triage labels on Apr 20, 2024

@MarcoGorelli
Collaborator

Hey, the separator is only used when there are multiple values columns.

@jsarbach
Contributor Author

Ah I see, thanks for clarifying.

@reswqa
Collaborator

reswqa commented Apr 22, 2024

@jsarbach fancy updating the document about this?

@cmdlineluser
Contributor

Are there any plans to allow expressions?

i.e.

df.pivot(
    index="ix",
    columns=pl.format("{}_{}", "col1", "col2"),
    values="val",
)

That would replace the current workaround:

(df.with_columns(columns=pl.format("{}_{}", "col1", "col2"))
   .pivot(index="ix", columns="columns", values="val")
)

# shape: (2, 3)
# ┌─────┬──────┬──────┐
# │ ix  ┆ a_1  ┆ b_2  │
# │ --- ┆ ---  ┆ ---  │
# │ i64 ┆ i64  ┆ i64  │
# ╞═════╪══════╪══════╡
# │ 0   ┆ 1    ┆ null │
# │ 1   ┆ null ┆ 2    │
# └─────┴──────┴──────┘

@jsarbach
Contributor Author

> @jsarbach fancy updating the document about this?

Ok :-)
