Error when running "python training_script.py --batch_size 100 --dataset_name IWSLT --language_direction G2E" #6

Open
minertom opened this issue Nov 29, 2021 · 2 comments

Comments

@minertom

I'm not sure what is going on here, but as best I can tell, a gzip file seems to be missing.

Thank You
Tom

Traceback (most recent call last):
File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/tarfile.py", line 1670, in gzopen
t = cls.taropen(name, mode, fileobj, **kwargs)
File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/tarfile.py", line 1647, in taropen
return cls(name, mode, fileobj, **kwargs)
File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/tarfile.py", line 1510, in __init__
self.firstmember = self.next()
File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/tarfile.py", line 2311, in next
tarinfo = self.tarinfo.fromtarfile(self)
File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/tarfile.py", line 1102, in fromtarfile
buf = tarfile.fileobj.read(BLOCKSIZE)
File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/gzip.py", line 292, in read
return self._buffer.read(size)
File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/_compression.py", line 68, in readinto
data = self.read(len(byte_view))
File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/gzip.py", line 479, in read
if not self._read_gzip_header():
File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/gzip.py", line 427, in _read_gzip_header
raise BadGzipFile('Not a gzipped file (%r)' % magic)
gzip.BadGzipFile: Not a gzipped file (b'<!')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "training_script.py", line 192, in <module>
train_transformer(training_config)
File "training_script.py", line 103, in train_transformer
train_token_ids_loader, val_token_ids_loader, src_field_processor, trg_field_processor = get_data_loaders(
File "/home/tom/Downloads/pytorch-original-transformer/utils/data_utils.py", line 223, in get_data_loaders
train_dataset, val_dataset, src_field_processor, trg_field_processor = get_datasets_and_vocabs(dataset_path, language_direction, dataset_name == DatasetType.IWSLT.name)
File "/home/tom/Downloads/pytorch-original-transformer/utils/data_utils.py", line 151, in get_datasets_and_vocabs
train_dataset, val_dataset, test_dataset = dataset_split_fn(
File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/site-packages/torchtext/datasets/translation.py", line 144, in splits
path = cls.download(root, check=check)
File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/site-packages/torchtext/data/dataset.py", line 191, in download
with tarfile.open(zpath, 'r:gz') as tar:
File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/tarfile.py", line 1617, in open
return func(name, filemode, fileobj, **kwargs)
File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/tarfile.py", line 1674, in gzopen
raise ReadError("not a gzip file")
tarfile.ReadError: not a gzip file

@minertom changed the title on Nov 29, 2021
from: Error when running "python training_script.py --batch_size 100 --dataset_name IWSLT --languate_direction G2E
to: Error when running "python training_script.py --batch_size 100 --dataset_name IWSLT --language_direction G2E

@Lyttonkeepfoing

I'm getting the same bug now. How can it be solved?

@dejangrubisic

Same problem here. Is there a solution?
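
One detail in the first traceback may help narrow this down: gzip reports the magic bytes b'<!', whereas a gzip stream always starts with b'\x1f\x8b'. A file that begins with "<!" is often an HTML page (for example "<!DOCTYPE html>"), so the archive may have been downloaded as a web page rather than being missing. Below is a minimal sketch for checking that, not a fix. The helper name describe_download and the demo file names are hypothetical, and the real location of the downloaded archive is not shown in this thread.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def describe_download(path):
    """Classify a downloaded file by its first two bytes (hypothetical helper)."""
    with open(path, "rb") as f:
        head = f.read(2)
    if head == GZIP_MAGIC:
        return "gzip"
    if head == b"<!":
        # Matches the bytes in the traceback; typical of "<!DOCTYPE html>".
        return "html"
    return "unknown"


if __name__ == "__main__":
    # Demo on two throwaway files. In practice, pass the path of the
    # archive that torchtext saved (assumption: you locate it yourself).
    with tempfile.TemporaryDirectory() as tmp:
        good = os.path.join(tmp, "good.tgz")
        bad = os.path.join(tmp, "bad.tgz")
        with gzip.open(good, "wb") as f:
            f.write(b"payload")
        with open(bad, "wb") as f:
            f.write(b"<!DOCTYPE html><html></html>")
        print(describe_download(good))
        print(describe_download(bad))
```

If the saved archive comes back as "html", opening it in a text editor should show what the server actually returned, which is probably more informative than the tarfile.ReadError.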
