This repository has been archived by the owner on Apr 8, 2024. It is now read-only.

Error training intel optimised version of model #5

Open
krishnashed opened this issue Feb 1, 2023 · 0 comments

I'm trying to train the Intel-optimised version of the model on a machine with 12 cores and 16 GB of RAM, without any NVIDIA GPU.

Key errors in the trace:
2023-02-01 09:39:13.852788: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2023-02-01 09:39:13.852845: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2023-02-01 09:39:13.852900: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ubuntu): /proc/driver/nvidia/version does not exist

WARNING:tensorflow:Gradients do not exist for variables ['tf_bert_model/bert/pooler/dense/kernel:0', 'tf_bert_model/bert/pooler/dense/bias:0'] when minimizing the loss. If you're using model.compile(), did you forget to provide a lossargument?

2023-02-01 09:39:48.954473: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at mkl_batch_matmul_op.cc:126 : RESOURCE_EXHAUSTED: OOM when allocating tensor with shape[128,12,128,64] and type float on /job:localhost/replica:0/task:0/device:CPU:0 by allocator mklcpu
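
For scale, the tensor named in the OOM message can be sized by hand. This is only a rough sketch: it assumes the shape reads as [batch, attention heads, sequence length, head size] and that "float" means 4-byte float32, neither of which the log confirms. `tensor_bytes` is a hypothetical helper, not part of TensorFlow.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Return the size in bytes of a dense tensor with the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


# Shape copied from the OOM message; 4 bytes per element is an assumption.
failing_shape = (128, 12, 128, 64)
size_mib = tensor_bytes(failing_shape) / 2**20
print(f"batch 128: {size_mib:.0f} MiB per tensor")

# Assumption: the first dimension is the batch size, so shrinking it
# shrinks this allocation proportionally.
smaller_mib = tensor_bytes((32, 12, 128, 64)) / 2**20
print(f"batch 32:  {smaller_mib:.0f} MiB per tensor")
```

One such tensor is about 48 MiB, which is small next to 16 GB, so the failure probably comes from many tensors of this size being held at once during training rather than from this single allocation. If the first dimension really is the batch size, a smaller batch would shrink each of them proportionally.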

The entire output from running the model is available here:
https://drive.google.com/file/d/120TYttMpS56WRbVDUBtLijrY5yZLhsaO/view?usp=share_link
