How to prepare training data, especially the size? #2

Open
FajunChen opened this issue Jun 10, 2018 · 11 comments

@FajunChen

FajunChen commented Jun 10, 2018

I have a batch of time series data for regression analysis. Every timestamp has 30 features. Initially the data are prepared as NumPy ndarrays. Then I transform them into tensor datasets and set batch_size=15 for the data loader, like this:

data_tensors = TensorDataset(torch.Tensor(x_tr), torch.Tensor(y_tr))
loader_tr = DataLoader(
    data_tensors, batch_size=batch_size, shuffle=False, num_workers=4)

However, I got the following error:

~/miniconda3/envs/py36/lib/python3.6/site-packages/echotorch/nn/ESNCell.py in forward(self, u, y, w_out)
    128 
    129                 # Compute input layer
--> 130                 u_win = self.w_in.mv(ut)
    131 
    132                 # Apply W to x

RuntimeError: mv: Expected 1-D argument vec, but got 0-D

It looks like the forward method needs the parameter "u" to be a 3-D tensor, and time_length needs to be set explicitly. Does time_length mean the number of reservoirs? But we already have hidden_dim.

I am quite confused about how to prepare the training data for LiESN. Could you please help me?

@nschaetti
Owner

Hi!

The input to the LiESN/ESN should be a 3D tensor of size "batch size" x "time length" x "input dimension". The input dimension is set when the LiESN is created. To use a batch size greater than one, all the input time series must have the same length. When that is not the case, I use batch_size = 1.
What is the size of your input tensor (what do you get if you print x.size())?
It seems that ut has zero dimensions, so your input is probably 1D.
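
A minimal sketch of the shape convention described above, using plain PyTorch only (no EchoTorch calls). The array sizes and variable names are illustrative assumptions, not taken from the reporter's data.

```python
import numpy as np
import torch

# Hypothetical data: one time series of 200 steps with 30 features per step.
x_np = np.random.rand(200, 30).astype(np.float32)

# A 2-D array (time, features) has no batch axis yet.
x = torch.from_numpy(x_np)
print(x.size())  # torch.Size([200, 30])

# Add a leading batch axis to get (batch, time, features) = (1, 200, 30).
u = x.unsqueeze(0)
print(u.size())  # torch.Size([1, 200, 30])

# Several series can share one batch only if they all have the same length.
series = [torch.rand(200, 30) for _ in range(4)]
batch = torch.stack(series, dim=0)  # shape (4, 200, 30)
```

Printing the size before passing a tensor to the network, as suggested above, is the quickest way to confirm that all three axes are present.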

I hope this helps.

Nils

@FajunChen
Author

Hi Nils,

Thank you for your reply.

However, I am afraid your guess is not exactly correct. In the example above, x_tr is a 2D training ndarray with shape (10000, 30), while y_tr is an ndarray of shape (10000, 1).

What should I do? Could you please help me find a way to reshape the input data? Thanks a lot.

@nschaetti
Owner

Hi,

So your input data is a 30-dim time series of length 10000, right?

The class TensorDataset takes samples along the first dimension of x_tr, so the tensor you give to the ESN is probably of size "batch_size" x 30.
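
The behaviour described above can be checked with a short sketch that uses only PyTorch. The random data stands in for the reporter's arrays, and the indexing at the end is an assumption about how a (batch, time, features) consumer would read the tensor; it is not EchoTorch's actual code.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data with the shapes reported in this thread.
x_tr = torch.rand(10000, 30)
y_tr = torch.rand(10000, 1)

loader = DataLoader(TensorDataset(x_tr, y_tr), batch_size=15, shuffle=False)
xb, yb = next(iter(loader))

# Each batch is 2-D: 15 rows sampled along the first axis, 30 features each.
print(xb.size())  # torch.Size([15, 30])

# Code that expects (batch, time, features) would index [batch, time]
# and receive a 0-D scalar instead of a 1-D feature vector.
ut = xb[0, 0]
print(ut.dim())  # 0
```

A 0-D value at that position is consistent with the "Expected 1-D argument vec, but got 0-D" message reported at the top of the thread.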

If x_tr is a single dataset, you can give it directly to the ESN after adding a batch dimension:

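import torch
from torch.autograd import Variable

# x_tr has shape (10000, 30) and y_tr has shape (10000, 1)
u = torch.Tensor(x_tr)
y = torch.Tensor(y_tr)
# Add a batch dimension: (batch, time, features)
u = u.view(1, -1, 30)
y = y.view(1, -1, 1)
u, y = Variable(u), Variable(y)
esn(u, y)
esn.finalize()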

Can you show me the complete code?

Regards,

Nils

@FajunChen
Author

FajunChen commented Jun 25, 2018

Dear Nils,

My code is as follows:

import torch
from torch.autograd import Variable
import echotorch.nn as etnn


class TorchEsnModelTrainer(object):
    def pre_fit(self, dfx, y=None):
        x = torch.Tensor(dfx).view(dfx.shape[0], -1, dfx.shape[1])
        y = torch.Tensor(y).view(y.shape[0], -1, 1)

        return Variable(x), Variable(y)

    def train(self, x_tr, y_tr, hidden_size=60, **kwargs):
        """ train the model """
        num_features = x_tr.shape[1]
        x_tr, y_tr = self.pre_fit(x_tr, y_tr)

        # model
        esn = etnn.LiESN(
            num_features,
            hidden_size,
            1,
            learning_algo='inv',
        )

        esn(x_tr, y_tr)
        esn.finalize()
        self.model = esn

        return self.model

Since the raw input x_tr is a 10000 x 30 ndarray, I first use pre_fit to transform it into the required format and then feed it to the ESN model. This time, my notebook quit immediately with the following error:

** On entry to SLASWP, parameter number 6 had an illegal value
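
For reference, a small sketch comparing the reshape used in pre_fit with the one suggested earlier in the thread. It uses zero-filled placeholder arrays of the reported shapes and only NumPy and PyTorch calls. Whether the shape difference is what triggers the SLASWP message is not confirmed anywhere in this thread.

```python
import numpy as np
import torch

# Placeholder arrays with the shapes reported in this thread.
dfx = np.zeros((10000, 30), dtype=np.float32)
y = np.zeros((10000, 1), dtype=np.float32)

# Same reshape as pre_fit: the first axis keeps all 10000 rows,
# so the result is 10000 separate sequences of length 1.
x_a = torch.from_numpy(dfx).view(dfx.shape[0], -1, dfx.shape[1])
print(x_a.size())  # torch.Size([10000, 1, 30])

# Reshape suggested earlier: one sequence of length 10000.
x_b = torch.from_numpy(dfx).view(1, -1, dfx.shape[1])
y_b = torch.from_numpy(y).view(1, -1, 1)
print(x_b.size())  # torch.Size([1, 10000, 30])
print(y_b.size())  # torch.Size([1, 10000, 1])
```

The two views hold the same numbers but describe very different inputs under the "batch size" x "time length" x "input dimension" convention.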

Thank you very much for your patience.

Regards,

@sebastienwood

sebastienwood commented Nov 6, 2018

Hi,

I'm trying to replicate the examples provided (https://github.com/nschaetti/EchoTorch/blob/master/examples/timeserie_prediction/narma10_esn.py). The same kind of issue appears:

RuntimeError: size mismatch, [100 x 10], [1] at /Users/soumith/minicondabuild3/conda-bld/pytorch_1524590658547/work/aten/src/TH/generic/THTensorMath.c:1928

The line that triggers this error is:
u_win = self.w_in.mv(ut)

It seems related to the issue you ran into @FajunChen.

Also, when trying a custom dataset and using view(1, -1, input_dim) to conform to PyTorch's RNN format, the error moves to a different line:
RuntimeError: size mismatch, [100 x 1], [5] at /Users/soumith/minicondabuild3/conda-bld/pytorch_1524590658547/work/aten/src/TH/generic/THTensorMath.c:1928
y_wfdb = self.w_fdb.mv(yt)
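
Both messages come from a matrix-vector product whose vector length does not match the matrix's column count. A minimal sketch with plain PyTorch, reading the sizes in the first message as a 100 x 10 matrix and a length-1 vector (an interpretation of the error text, not of EchoTorch's internals):

```python
import torch

# Sizes taken from the first error message above; the names are illustrative.
w_in = torch.rand(100, 10)

# mv needs a 1-D vector whose length equals the number of matrix columns.
ut_ok = torch.rand(10)
out = w_in.mv(ut_ok)
print(out.size())  # torch.Size([100])

# A vector of length 1 does not match the 10 columns and raises RuntimeError.
ut_bad = torch.rand(1)
try:
    w_in.mv(ut_bad)
    raised = False
except RuntimeError as err:
    raised = True
    print(err)
```

The exact wording of the exception depends on the PyTorch version, so it may differ from the messages quoted in this comment.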

@nschaetti maybe one way to address this issue in the future would be to include a utility that imports and automatically converts tabular data such as .csv files?
Thanks!

@jlousada315

Hi,

I get the same error while trying to run the Switch Attractor example. The input_dim is equal to 1 by default, but if I try to change it I get a size mismatch error.
Can you help?
Thanks in advance!

@FajunChen
Author

FajunChen commented May 5, 2019 via email

@jlousada315

Then how do I fix it?

@matthewewreed

I've become interested in solving the problem that gives rise to this bug (having an input dimension greater than 1). Coupled nonlinear systems are, on their own, pretty cool. Having accurate models (over relatively short timescales) would be extraordinarily useful. I've brute-forced my way through every Windows bug and edited Nils' code so it can be called without producing pickling errors. But now I can't even figure out where w_in.mv is defined.

@sebastienwood

If I'm not mistaken, this was corrected by commit 6e1ea94 two months ago.

I believe @nschaetti may want to review this issue and decide whether to close it! :)

@matthewewreed

Interesting. I'm using the corrected code.

@nschaetti added this to To do in EchoTorch 1.0 on Jun 17, 2020