varlen features implementations #29

Open
minarastgar opened this issue Aug 28, 2020 · 12 comments

Comments

@minarastgar

Do you have a plan to implement Varlen sparse features and different pooling layers?

@jackguagua
Member

Pooling layers? Could you be more specific?

@minarastgar
Author

I mostly meant varlen sparse features, for example a sequence of item_ids. Each item id is a sparse feature, and the last 10 items purchased by a user form a sequence of item-id embeddings, which can be aggregated with a pooling layer such as average pooling.
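To make the request concrete, here is a minimal sketch of average pooling over a variable-length list of item ids. It is plain Python, not DeepTables code: `build_embedding_table`, `average_pool`, the item names, and the embedding size of 4 are all made up for illustration, and random vectors stand in for embeddings a model would learn.

```python
import random

EMBEDDING_DIM = 4  # illustrative size, not a library default


def build_embedding_table(vocab, dim=EMBEDDING_DIM, seed=0):
    """Map each item id to a random dense vector (stand-in for learned weights)."""
    rng = random.Random(seed)
    return {item: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for item in vocab}


def average_pool(item_ids, table, dim=EMBEDDING_DIM):
    """Average the embeddings of a variable-length list of item ids."""
    if not item_ids:
        # An empty history pools to the zero vector.
        return [0.0] * dim
    vectors = [table[item] for item in item_ids]
    # zip(*vectors) walks the vectors one dimension at a time.
    return [sum(column) / len(vectors) for column in zip(*vectors)]


table = build_embedding_table(["item_1", "item_2", "item_3"])

# Histories of different lengths pool to vectors of the same size.
pooled_short = average_pool(["item_1"], table)
pooled_long = average_pool(["item_1", "item_2", "item_3", "item_2"], table)
```

Whatever the length of the purchase history, the pooled result has a fixed size, so it can be fed to the rest of the network like any other dense feature.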

@jackguagua
Member

Got it. It will be supported in the next release.

@minarastgar
Author

Thanks for your quick reply. Looking forward to it. Any ETA for the next release?

@jackguagua
Member

It should be around October this year.

@minarastgar
Author

Sorry for bugging you. I wonder whether the release mentioned above is available yet. Thank you very much.

@jackguagua
Member

I'm very sorry that the original plan has slipped; some other urgent tasks came up over the past two months. I will do my best to release this feature by the end of November. Sorry again.

@jackguagua jackguagua added the enhancement New feature or request label Nov 2, 2020
oaksharks added a commit that referenced this issue Nov 25, 2020
@jackguagua
Member

@minarastgar varlen features are ready. See #44 (comment) for details.

@minarastgar
Author

@jackguagua thank you so much. This is absolutely fantastic.

@minarastgar
Author

minarastgar commented Jan 28, 2021

Hi @jackguagua, I have a quick question about varlen features. Say there is a varlen feature such as a stream of movie_ids, plus a categorical feature that is the movie_id we want to show to the user. We want a single movie_id embedding that is used both by movie_id and by the stream of movie_ids. How can I specify that stream_of_movie_ids and movie_id share the same embedding?

```python
# (the opening of this call is missing from the original comment)
    task=consts.TASK_REGRESSION,
    categorical_columns=["movie_id", "user_id", "gender", "occupation", "zip", "title", "age"],
    metrics=['mse'],
    fixed_embedding_dim=True,
    embeddings_output_dim=4,
    apply_gbm_features=False,
    apply_class_weight=True,
    earlystopping_patience=5,
    var_len_categorical_columns=[('stream_of_movie_ids', "|", "max")])
```

@jackguagua
Member

DT can't do that right now. I'm not quite clear on the purpose of doing this. If you have code that implements it with Keras, please send it to me for reference.

@minarastgar
Author

Let me clarify. Say we have a list of movie ids [movie_id1, movie_id2, ..., movie_id10], which are the last 10 movies watched by the user. We also have a target movie, movie_id100, which is a sparse feature. For both the stream (the list of movie ids) and the sparse feature (target_title), we want to build the embeddings from movie_id. We do not want separate embeddings for entities in the stream and in the sparse feature, because they come from the same root, which is movie_id.
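As a rough illustration of the idea, the sketch below keeps one lookup table for movie_id and uses it both for the target movie and for the watched-movie stream. It is plain Python and not DeepTables API (per the reply above, DT does not support this); `SharedEmbedding` and its methods are hypothetical names, and random vectors stand in for learned weights. In Keras, the same effect is usually obtained by calling one `Embedding` layer instance on both inputs, since a layer that is called twice reuses its weights.

```python
import random


class SharedEmbedding:
    """A single lookup table reused by every feature drawn from one id space."""

    def __init__(self, vocab, dim=4, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        # Random vectors stand in for weights that training would learn.
        self.weights = {v: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for v in vocab}

    def lookup(self, item_id):
        """Embedding for one categorical value, e.g. the target movie."""
        return self.weights[item_id]

    def lookup_sequence(self, item_ids, pooling="max"):
        """Pooled embedding for a variable-length list of ids from the same table."""
        vectors = [self.weights[item] for item in item_ids]
        if not vectors:
            return [0.0] * self.dim
        if pooling == "max":
            return [max(column) for column in zip(*vectors)]
        if pooling == "mean":
            return [sum(column) / len(vectors) for column in zip(*vectors)]
        raise ValueError(f"unknown pooling: {pooling}")


movie_ids = [f"movie_id{n}" for n in range(1, 101)]
movie_embedding = SharedEmbedding(movie_ids)

history = [f"movie_id{n}" for n in range(1, 11)]  # last 10 movies watched
target = "movie_id100"                            # movie we want to show

# Both features read from the same table, so a given movie_id always maps
# to the same vector whether it appears in the stream or as the target.
history_vector = movie_embedding.lookup_sequence(history, pooling="max")
target_vector = movie_embedding.lookup(target)
model_input = history_vector + target_vector  # concatenated dense input
```

Because there is only one table, any update to a movie's vector during training would affect both the stream feature and the target feature, which is the behaviour being asked for here.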
