How to increase the context length? #70

Open
Riofd opened this issue May 11, 2024 · 4 comments
Labels: FAQ (Frequently asked question)

Comments


Riofd commented May 11, 2024

Thanks for your great work. Chronos achieves good performance on our own dataset with zero-shot prediction. However, it seems that Chronos is unable to capture long-period patterns. For example, I have a feature with a weekly cycle and data at 15-minute steps, so a single cycle is 96 * 7 = 672 steps long, which exceeds the model's context_length. Is there any way for the model to capture such periodic features, or can I only retrain the model?

lostella added the FAQ (Frequently asked question) label May 12, 2024
lostella (Contributor) commented:

Hi @Riofd. Indeed, the model internally limits the context length (currently happens here).

You can increase the model context length after loading it, as in the example below. However, the models were trained on windows of data of limited length, so they may not be able to make sense of the increased context; that is something to verify experimentally. I believe that properly handling longer contexts will require pretraining the models with longer windows of data, but let us know if the following works!

Here I use pipeline.embed so that the encoder output shape can be inspected, but the patch has the same effect if you call pipeline.predict instead:

import torch
from chronos import ChronosPipeline

pipeline = ChronosPipeline.from_pretrained("amazon/chronos-t5-tiny")

# dummy series longer than the default context length of 512
context = torch.ones(size=(2000,))

# get encoder embeddings
embedding, _ = pipeline.embed(context=context)
print(embedding.shape)

# patch the context length
pipeline.tokenizer.config.context_length = 1024

# get encoder embeddings again
embedding, _ = pipeline.embed(context=context)
print(embedding.shape)

outputs

torch.Size([1, 513, 256])
torch.Size([1, 1025, 256])

where 513 is 512 (the original model context length) plus one for the EOS token embedding, and similarly 1025 = 1024 + 1 after patching.
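For an actual forecast with the longer context, here is a minimal sketch (not from this thread), assuming the pipeline.predict API shown in the repository README; the all-ones series and the 64-step horizon are placeholders:

import torch
from chronos import ChronosPipeline

pipeline = ChronosPipeline.from_pretrained("amazon/chronos-t5-tiny")

# raise the context length limit before predicting, as above
pipeline.tokenizer.config.context_length = 1024

# placeholder series: one week of 15-minute data is 96 * 7 = 672 steps
context = torch.ones(size=(672,))

# sample forecasts for the next 64 steps
forecast = pipeline.predict(context=context, prediction_length=64)
print(forecast.shape)  # (num_series, num_samples, prediction_length)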


Riofd commented May 13, 2024

Thank you for your reply. I tried your approach and indeed got different results compared with context_length=512, but the forecasting performance decreases significantly. Maybe I should try fine-tuning it?

lostella (Contributor) commented:

Yes, fine-tuning may be the best way to go. We added code and some instructions here; let me know if you run into any issues.
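For reference, a minimal sketch (not from this thread) of bumping the context length in a training config before fine-tuning; the config path and the field names (context_length, model_id, random_init) are assumptions based on the example configs under scripts/training/configs/, so check them against the actual files:

import yaml

# load one of the example training configs shipped with the repo (assumed path)
with open("scripts/training/configs/chronos-t5-small.yaml") as f:
    config = yaml.safe_load(f)

# train on longer windows so the model sees full weekly cycles (96 * 7 = 672 < 1024)
config["context_length"] = 1024
# start from the pretrained checkpoint rather than random weights, i.e. fine-tune
config["model_id"] = "amazon/chronos-t5-small"
config["random_init"] = False

with open("chronos-t5-small-long-context.yaml", "w") as f:
    yaml.safe_dump(config, f)

# then launch training with the repo's script, e.g.:
#   python scripts/training/train.py --config chronos-t5-small-long-context.yaml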

Riofd closed this as completed May 30, 2024
abdulfatir (Contributor) commented:

Let's keep this open as a FAQ.

abdulfatir reopened this May 30, 2024