🐛 Describe the bug
I am new to OLMo and want to continue training (i.e. fine-tune) several of the checkpoints listed in the CSV under checkpoints/official.
I followed the instructions in the README and downloaded a checkpoint via its link, but loading always fails with 'RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory'.
According to Stack Overflow answers about this error, it can be caused by a corrupted checkpoint file or an incompatible torch version. I tried different checkpoints and torch versions from 2.0.0 to 2.3.0, but the error persists. The download progress also reaches 100%, so the checkpoint files do not appear to be corrupted.
Here is my terminal command:
torchrun --nproc_per_node=1 scripts/train.py configs/official/OLMo-1B.yaml --load_path=https://olmo-checkpoints.org/ai2-llm/olmo-small/46zc5fly/step369000-unsharded --save_folder=/opt/data/private/OLMo/olmo/step369000 --wandb=null
And the error:
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
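Since a progress bar reaching 100% does not by itself prove the file on disk is intact, one way to test the corruption hypothesis is to check whether each downloaded file is a readable zip archive, which is the default format written by `torch.save` since PyTorch 1.6. The following is a minimal sketch using only the Python standard library. The function name `check_checkpoint_file` is hypothetical, and the paths to check are passed on the command line because the exact file names inside the checkpoint directory and the local cache location are not stated here.

```python
import sys
import zipfile
from pathlib import Path


def check_checkpoint_file(path):
    """Return a short diagnosis string for one downloaded checkpoint file.

    Hypothetical helper for illustration; it is not part of OLMo or PyTorch.
    """
    p = Path(path)
    if not p.is_file():
        return "missing"
    if p.stat().st_size == 0:
        return "empty"
    # The zip index ("central directory") sits at the end of the file, so a
    # truncated download typically fails this check.
    if not zipfile.is_zipfile(p):
        return "not a zip archive"
    with zipfile.ZipFile(p) as zf:
        # testzip() reads every member and verifies its CRC; it returns the
        # name of the first bad member, or None if all members are fine.
        bad = zf.testzip()
    return "ok" if bad is None else f"corrupt member: {bad}"


if __name__ == "__main__":
    # Usage (assumed file locations): python check_ckpt.py /path/to/file.pt ...
    for arg in sys.argv[1:]:
        print(arg, "->", check_checkpoint_file(arg))
```

A result of "not a zip archive" or "empty" would point to an incomplete or failed download rather than a torch version problem. Note that `testzip()` reads the whole file, so it can take a while on large checkpoints.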
Versions
Python 3.8.10
accelerate==0.25.0
ai2-olmo==0.3.1
aiofiles==23.2.1
aiohttp==3.8.6
aiosignal==1.3.1
albumentations==1.3.1
altair==5.1.2
annotated-types==0.6.0
antlr4-python3-runtime==4.9.3
anyio==4.0.0
apache-beam==2.55.1
async-timeout==4.0.3
attrs==23.1.0
backports.zoneinfo==0.2.1
beautifulsoup4==4.12.3
bitsandbytes==0.41.3.post2
boto3==1.34.93
botocore==1.34.93
braceexpand==0.1.7
cached-path==1.6.2
cachetools==5.3.3
certifi==2019.11.28
chardet==3.0.4
charset-normalizer==3.3.1
click==8.1.7
clip==0.2.0
clip-benchmark==1.5.0
cloudpickle==2.2.1
cmake==3.27.7
contourpy==1.1.1
crcmod==1.7
cycler==0.12.1
dataclasses==0.6
datasets==2.14.5
dbus-python==1.2.16
dill==0.3.1.1
dnspython==2.6.1
docker-pycreds==0.4.0
docopt==0.6.2
exceptiongroup==1.1.3
ExifRead-nocycle==3.0.1
fastapi==0.104.1
fastavro==1.9.4
fasteners==0.19
ffmpy==0.3.1
filelock==3.12.4
fire==0.4.0
fonttools==4.44.0
frozenlist==1.4.0
fsspec==2023.9.2
ftfy==6.1.1
gdown==5.1.0
gitdb==4.0.11
GitPython==3.1.40
google-api-core==2.18.0
google-auth==2.29.0
google-cloud-core==2.4.1
google-cloud-storage==2.16.0
google-crc32c==1.5.0
google-resumable-media==2.7.0
googleapis-common-protos==1.63.0
gradio==3.39.0
gradio-client==0.7.0
grpcio==1.62.2
h11==0.14.0
hdfs==2.7.3
httpcore==1.0.2
httplib2==0.22.0
httpx==0.25.1
huggingface-hub==0.22.2
idna==2.8
imageio==2.31.6
img2dataset==1.42.0
importlib-resources==6.1.1
Jinja2==3.1.2
jmespath==1.0.1
joblib==1.3.2
Js2Py==0.74
jsonpickle==3.0.4
jsonschema==4.20.0
jsonschema-specifications==2023.11.1
kiwisolver==1.4.5
lazy-loader==0.3
lightning-utilities==0.11.2
linkify-it-py==2.0.2
lit==17.0.3
loguru==0.7.2
loralib==0.1.2
markdown-it-py==3.0.0
MarkupSafe==2.1.3
matplotlib==3.7.3
mdit-py-plugins==0.3.3
mdurl==0.1.2
mpmath==1.3.0
multidict==6.0.4
multiprocess==0.70.15
networkx==3.1
numpy==1.21.0
nvidia-cublas-cu11==11.10.3.66
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu11==11.7.101
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu11==8.5.0.96
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu11==10.9.0.58
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu11==10.2.10.91
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu11==11.4.0.1
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu11==11.7.4.91
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu11==2.14.3
nvidia-nccl-cu12==2.19.3
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvtx-cu11==11.7.91
nvidia-nvtx-cu12==12.1.105
objsize==0.7.0
omegaconf==2.3.0
open-clip-torch==2.23.0
openai-clip==1.0.1
opencv-python-headless==4.8.1.78
orjson==3.9.10
packaging==23.2
pandas==1.5.3
pathtools==0.1.2
peft==0.7.1
Pillow==9.1.1
pkgutil-resolve-name==1.3.10
promise==2.3
proto-plus==1.23.0
protobuf==3.20.3
psutil==5.9.6
pyarrow==10.0.1
pyarrow-hotfix==0.6
pyasn1==0.6.0
pyasn1-modules==0.4.0
pycocoevalcap==1.2
pycocotools==2.0.7
pydantic==2.5.1
pydantic-core==2.14.3
pydot==1.4.2
pydub==0.25.1
pygments==2.17.2
PyGObject==3.36.0
pyjsparser==2.7.1
pymongo==4.7.0
pyparsing==3.1.1
PySocks==1.7.1
python-apt==2.0.0+ubuntu0.20.4.7
python-dateutil==2.8.2
python-multipart==0.0.6
pytz==2023.3.post1
PyWavelets==1.4.1
PyYAML==6.0.1
quant-cuda==0.0.0
qudida==0.0.4
referencing==0.31.0
regex==2023.10.3
requests==2.31.0
requests-unixsocket==0.2.0
rich==13.7.1
rpds-py==0.13.0
rsa==4.9
s3transfer==0.10.1
safetensors==0.4.3
scikit-image==0.21.0
scikit-learn==1.3.2
scipy==1.10.1
semantic-version==2.10.0
sentencepiece==0.1.99
sentry-sdk==1.33.1
setproctitle==1.3.3
shortuuid==1.0.11
six==1.14.0
smmap==5.0.1
sniffio==1.3.0
soupsieve==2.5
starlette==0.27.0
sympy==1.12
termcolor==2.3.0
threadpoolctl==3.2.0
tifffile==2023.7.10
timm==0.9.10
tokenizers==0.19.1
toolz==0.12.0
torch==2.0.0
torch-summary==1.4.5
torchaudio==0.9.0
torchmetrics==1.3.2
torchvision==0.15.2+cu117
tqdm==4.66.1
transformers==4.40.1
triton==2.0.0
typing-extensions==4.11.0
tzdata==2023.3
tzlocal==5.2
uc-micro-py==1.0.2
urllib3==1.26.18
uvicorn==0.24.0.post1
wandb==0.12.21
wcwidth==0.2.8
webdataset==0.2.72
websockets==11.0.3
xxhash==3.4.1
yarl==1.9.2
zipp==3.17.0
zstandard==0.22.0