Classification of multivariate time series with both categorical and continuous values #4628

Open
Sandy4321 opened this issue Jul 30, 2023 · 5 comments
Labels
Feature Request New feature requested in system

Comments

@Sandy4321

I cannot find this capability.

Short description

Classification of multivariate time series with both categorical and continuous values

Features: color | weight | gender | height | age | target

Example data samples:

Sequence 1 (target: YES)
time 1 | black  | 56 | m | 160 | 34 |
time 2 | white  | 77 | f | 170 | 54 |
time 3 | yellow | 87 | m | 167 | 43 |
time 4 | white  | 55 | m | 198 | 72 |
time 5 | white  | 88 | f | 176 | 32 | YES

Sequence 2 (target: NO)
time 1 | yellow | 86 | f | 120 | 14 |
time 2 | white  | 27 | m | 150 | 44 |
time 3 | yellow | 58 | f | 156 | 22 | NO

Sequence 3 (target: YES)
time 1 | yellow | 8  | f | 177 | 23 |
time 2 | white  | 15 | m | 138 | 82 |
time 3 | yellow | 8  | f | 177 | 23 |
time 4 | white  | 15 | m | 138 | 82 | YES

etc.
For example, can it be done with
https://vowpalwabbit.org/docs/vowpal_wabbit/python/latest/examples/search_sequence.html#

How this suggestion will help you/others

There seem to be no such capabilities available on the web, but a solution to this task is important in practice.
You would then be the first to offer it.

Possible solution/implementation details

I do not know. Do you know of any GitHub repos with code to do this?

Example/links if any

None

@Sandy4321 Sandy4321 added the Feature Request New feature requested in system label Jul 30, 2023
@cheng-tan
Collaborator

You can try search sequence. If the sequence is short, add a namespace for each time step and use one long context to predict the binary label. For weight, height, and age, you can experiment with different encodings and break them into groups, like the example in this post: https://stackoverflow.com/questions/28640837/vowpal-wabbit-how-to-represent-categorical-features
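As a rough illustration of "breaking into groups", a continuous value can be turned into a categorical bucket token. This is only a sketch: the helper name `bucket_feature` and the bucket boundaries are hypothetical and would need tuning on real data.

```python
from bisect import bisect_right


def bucket_feature(name, value, edges):
    """Map a continuous value to a token such as 'weight_bucket_2'.

    `edges` is an ascending list of bucket boundaries. Buckets are
    numbered from 1, and a value equal to a boundary falls into the
    bucket above that boundary.
    """
    return f"{name}_bucket_{bisect_right(edges, value) + 1}"


# Hypothetical boundaries, chosen only for illustration.
WEIGHT_EDGES = [30, 60, 90]

print(bucket_feature("weight", 56, WEIGHT_EDGES))
print(bucket_feature("weight", 8, WEIGHT_EDGES))
```

Bucket tokens let a linear learner fit a different weight per range, at the cost of losing ordering information inside each bucket.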

@Sandy4321
Author

Great, thanks.
Do you have a code example with some data as input, please?

@cheng-tan
Collaborator

Since the sequence is not long, maybe try classification first before sequence search? Each sequence will be one example for VW.

An example input would be (see the input format wiki):

1 |a time_1 black weight_bucket_1 m height_bucket_3 age_bucket_3 |b time_2 white weight_bucket_2 f height_bucket_3 age_bucket_5 ....
0 |a time_1 yellow weight_bucket_3 f height_bucket_1 age_bucket_1 ....
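Lines of this shape can be generated with a few lines of plain Python. The sketch below assumes each time step has already been converted to a list of string tokens; the function name `sequence_to_vw_line` is hypothetical and is not part of the VW API.

```python
import string


def sequence_to_vw_line(label, steps):
    """Format one sequence as a single VW text-format example.

    `steps` is a list of time steps, each a list of string tokens.
    Time step i goes into its own namespace: a, b, c, ...
    """
    # VW's interaction options (such as -q) identify a namespace by its
    # first character, so this sketch uses single letters and therefore
    # supports at most 26 time steps.
    if len(steps) > len(string.ascii_lowercase):
        raise ValueError("this sketch supports at most 26 time steps")
    parts = [str(label)]
    for namespace, tokens in zip(string.ascii_lowercase, steps):
        parts.append(f"|{namespace} " + " ".join(tokens))
    return " ".join(parts)


sequence = [
    ["black", "weight_bucket_1", "m", "height_bucket_3", "age_bucket_3"],
    ["white", "weight_bucket_2", "f", "height_bucket_3", "age_bucket_5"],
]
print(sequence_to_vw_line(1, sequence))
```

Because the VW text format is sparse, sequences with different numbers of time steps simply produce lines with different numbers of namespaces. Note that VW's logistic loss expects labels of -1 and 1 rather than 0 and 1.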

@Sandy4321
Author

That was only an example to establish a common language.
In real life, the sequences are long.
Also, each sequence has a different number of time steps:

sequence 1 -> 5 time steps
sequence 2 -> 3 time steps
sequence 3 -> 4 time steps

@JohnLangford
Member

The easiest reasonable solution seems to be creating a custom learning-to-search application. To do this, you would modify and/or create a new task (like those here: https://github.com/VowpalWabbit/vowpal_wabbit/blob/master/vowpalwabbit/core/src/reductions/search/search_sequencetask.cc or in the parent directory). You would want to have a potentially large number of dummy labels associated with the nonterminal in order to allow for the transmission of information about the sequence to the terminal, and then you would have only the prediction at the end affect the declared loss.

Alternatively (and with more work), you could implement an RNN and work from that.
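To make the RNN option a little more concrete, here is a minimal, untrained forward pass in pure Python. It is only a sketch under several assumptions: the class `TinyRNN`, the helper `encode_step`, the category lists, and the ad hoc scaling constants are all hypothetical, and training (backpropagation through time) is omitted. In practice a deep learning framework would be the more likely choice.

```python
import math
import random

COLORS = ["black", "white", "yellow"]  # assumed full set of colors
GENDERS = ["m", "f"]


def encode_step(color, weight, gender, height, age):
    """One-hot encode categorical fields, roughly scale continuous ones."""
    vec = [1.0 if color == c else 0.0 for c in COLORS]
    vec += [1.0 if gender == g else 0.0 for g in GENDERS]
    vec += [weight / 100.0, height / 200.0, age / 100.0]  # ad hoc scaling
    return vec


class TinyRNN:
    """Elman-style RNN that reads a whole sequence, then predicts once."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        self.n_hidden = n_hidden
        self.w_in = mat(n_hidden, n_in)       # input -> hidden
        self.w_rec = mat(n_hidden, n_hidden)  # hidden -> hidden
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]

    def predict_proba(self, steps):
        """Return P(target = YES) for a sequence of any length."""
        h = [0.0] * self.n_hidden
        for x in steps:
            h = [
                math.tanh(
                    sum(w * xi for w, xi in zip(self.w_in[j], x))
                    + sum(w * hi for w, hi in zip(self.w_rec[j], h))
                )
                for j in range(self.n_hidden)
            ]
        # Only the final hidden state feeds the prediction.
        logit = sum(w * hi for w, hi in zip(self.w_out, h))
        return 1.0 / (1.0 + math.exp(-logit))


sequence_2 = [
    encode_step("yellow", 86, "f", 120, 14),
    encode_step("white", 27, "m", 150, 44),
    encode_step("yellow", 58, "f", 156, 22),
]
model = TinyRNN(n_in=8, n_hidden=4)
print(model.predict_proba(sequence_2))
```

The loop runs once per time step, so sequences of 3, 4, or 5 steps need no padding, and a single prediction is produced at the end of each sequence.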
