The right way to convert a column ? #321

Open
leobenkel opened this issue Sep 4, 2018 · 3 comments

Comments


leobenkel commented Sep 4, 2018

Hello,
I am getting started with Frameless and am having a hard time converting my code, which is based on Spark DataFrames, to Frameless.
The point where I am currently blocked is how to override a column.

Let's say I have a DataFrame with columns col1, col2, ..., myColumn.
myColumn is a String because it came from a database export, but in the database this column is actually a Seq[String], so I now need to convert it back to its original type.
I used to do:

df
  .withColumn("myColumn", toArray($"myColumn"))

How would you do the same thing with Frameless? Do you need two case classes, together with withColumnTuple and dropTuple?

Contributor

imarios commented Sep 11, 2018

Hi @leobenkel, sorry I missed this question! Yes, withColumnTuple is the way to add and drop columns with Frameless. If you go with withColumn, you will need to define a new case class.
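
For readers landing here, one possible shape of the "two case classes" approach is sketched below. It is only a sketch under assumptions that are not in the thread: the names Raw, Parsed and toVector are hypothetical, the export is assumed to be comma-separated, and Vector[String] stands in for Seq[String] because Vector is a collection type Frameless has long had an encoder for. It uses deserialized.map, which runs a plain Scala function over each row instead of a column expression, so it trades some Spark optimization for simplicity.

```scala
import frameless.TypedDataset

// Hypothetical schema before the conversion: myColumn is still a String.
final case class Raw(index: Long, myColumn: String)

// Hypothetical schema after the conversion: same field names, new type.
final case class Parsed(index: Long, myColumn: Vector[String])

object ConvertColumn {
  // Assumption: the database export joined the values with commas.
  def toVector(s: String): Vector[String] =
    if (s.isEmpty) Vector.empty
    else s.split(",").map(_.trim).toVector

  // Rebuilds each row as a Parsed, so the column keeps its name
  // while its type changes from String to Vector[String].
  def convert(ds: TypedDataset[Raw]): TypedDataset[Parsed] =
    ds.deserialized.map(r => Parsed(r.index, toVector(r.myColumn)))
}
```

Because the result is typed as TypedDataset[Parsed], the columns stay named index and myColumn rather than becoming tuple positions.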

@leobenkel
Author

But when I use withColumnTuple, I lose the names of all my columns. If I have index, feature, label and I use withColumnTuple, I expect to see index, feature, label, _1, but instead I see _1, _2, _3, _4.

Contributor

imarios commented Nov 15, 2018

@leobenkel for some reason I missed this comment, apologies! In the case where you need to keep the types, it's better to use a projection. For example, (x: X).project[B]. The catch here is that you will need to define a new type B. The docs have more examples on this.
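
As a rough illustration of the projection mentioned above, the sketch below assumes two hypothetical case classes, X and B, where the fields of B are a subset of the fields of X with matching names and types. The field names are borrowed from the earlier comment, and their types are guesses made only for the example.

```scala
import frameless.TypedDataset

// Hypothetical source type; field types are assumed for illustration.
final case class X(index: Long, feature: Vector[Double], label: Int)

// Hypothetical target type: a named subset of X's fields.
final case class B(index: Long, label: Int)

object ProjectExample {
  // project[B] keeps only the fields of X that B declares,
  // matched by name and type, so the result has named columns.
  def keepNamed(ds: TypedDataset[X]): TypedDataset[B] =
    ds.project[B]
}
```

If B declares a field that X does not have, or has it with a different type, the call should fail to compile rather than fail at runtime.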
