Failed to run for the dataset DRKG #21

Open
chiajungchang opened this issue Jan 15, 2021 · 7 comments
Comments

@chiajungchang

Hi,

Thanks for all the work. It looks amazing and I am looking forward to integrating my data with other diseases.
Unfortunately, I cannot run the DRKG training code on my machine, which only has CPUs.

The command is
"dglke_train --dataset DRKG --data_path ./train --data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' --model_name TransE_l2 --batch_size 64 --neg_sample_size 256 --hidden_dim 400 --gamma 12.0 --lr 0.1 --max_step 100000 --log_interval 1000 --batch_size_eval 16 -adv --regularization_coef 1.00E-07 --test --num_thread 1 --num_proc 8 --neg_sample_size_eval 10000"

and the output is
Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293713 test triples.
|Train|: 5286834
random partition 5286834 edges into 8 parts
part 0 has 660855 edges
part 1 has 660855 edges
part 2 has 660855 edges
part 3 has 660855 edges
part 4 has 660855 edges
part 5 has 660855 edges
part 6 has 660855 edges
part 7 has 660849 edges
/opt/conda/lib/python3.7/site-packages/dgl/base.py:25: UserWarning: multigraph will be deprecated.DGL will treat all graphs as multigraph in the future.
warnings.warn(msg, warn_type)
|valid|: 293713
|test|: 293713
Bus error (core dumped)

A similar command works fine for other data. For example, the following ran successfully:
"dglke_train --model_name TransE_l2 --dataset FB15k --batch_size 1000 --neg_sample_size 200 --hidden_dim 400 --gamma 19.9 --lr 0.25 --max_step 3000 --log_interval 100 --batch_size_eval 16 --test -adv --regularization_coef 1.00E-09 --num_thread 1 --num_proc 8"

Thanks again for sharing.

@classicsong
Collaborator

Can you provide the following information:
DGL-KE version
DGL version
PyTorch version
CPU or GPU (it seems it is CPU)
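
One possible way to gather the three version numbers in a single step is sketched below. It assumes Python 3.8 or newer (for `importlib.metadata`) and assumes the packages are installed under the distribution names `dglke`, `dgl`, and `torch`; CUDA builds of DGL may be published under a different name, in which case the entry shows as not installed. The helper name `collect_versions` is made up for this example.

```python
from importlib import metadata  # available in Python 3.8+

# Assumed distribution names; adjust if your install uses different ones.
PACKAGES = ("dglke", "dgl", "torch")


def collect_versions(packages=PACKAGES):
    """Return a dict mapping each package name to its installed version."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of raising.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version}")
```

The printed lines can be pasted straight into a reply.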

@chiajungchang
Author

chiajungchang commented Jan 15, 2021 via email

@classicsong
Collaborator

Can you try --num_proc 1?
How much memory does your CPU machine have?

@chiajungchang
Author

chiajungchang commented Jan 15, 2021 via email

@classicsong
Collaborator

This may be an out-of-memory (OOM) problem. You can increase --num_thread accordingly and also try --num_proc 2 or 4.
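
To check how much memory is available before picking a --num_proc value, a minimal sketch like the following may help. It assumes a Linux machine. It also reports free space on `/dev/shm`, because a bus error in multi-process PyTorch programs is often associated with an undersized shared-memory mount (for example inside a container); nothing in this thread confirms that this is the cause here. The function names are made up for this example.

```python
import os
import shutil


def total_ram_bytes():
    """Total physical memory in bytes, read through POSIX sysconf."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def shm_free_bytes(path="/dev/shm"):
    """Free bytes on the shared-memory mount, or None if it does not exist."""
    if not os.path.isdir(path):
        return None
    return shutil.disk_usage(path).free


def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 2**30


if __name__ == "__main__":
    print(f"Total RAM: {to_gib(total_ram_bytes()):.1f} GiB")
    free = shm_free_bytes()
    if free is None:
        print("/dev/shm not found")
    else:
        print(f"/dev/shm free: {to_gib(free):.1f} GiB")
```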

@chiajungchang
Author

chiajungchang commented Jan 15, 2021 via email

@yxu1168

yxu1168 commented Oct 8, 2022

Hello @classicsong,

I tried to run the Train_embeddings notebook in Anaconda Jupyter on a CPU.
The command is:
"!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' --model_name TransE_l2 --batch_size 512
--neg_sample_size 256 --hidden_dim 400 --gamma 12.0 --lr 0.1 --max_step 100000 --log_interval 1000 --batch_size_eval 16 -adv --regularization_coef 1.00E-07 --test --num_thread 1 --num_proc 1 --neg_sample_size_eval 10000 "

I got the error below:
'DGLBACKEND' is not recognized as an internal or external command,
operable program or batch file.

If I remove !DGLBACKEND=pytorch,
I get another error:
File "", line 1
dglke_train --dataset DRKG --data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' --model_name TransE_l2 --batch_size 512
^
SyntaxError: invalid syntax

Any advice or ideas on how to fix this?
Thank you very much!
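
For reference, the first message ('DGLBACKEND' is not recognized as an internal or external command) is what the Windows command prompt prints, and the `VAR=value command` prefix is POSIX shell syntax that it does not understand. The second error arises because, without the leading `!`, the notebook cell is parsed as Python rather than passed to a shell. One possible workaround, sketched below, is to set the environment variable from Python and launch `dglke_train` through `subprocess`. It assumes `dglke_train` is on the PATH of the notebook's environment; the function names and the `dry_run` switch are made up for this example, and the arguments are copied from the command above.

```python
import os
import subprocess


def build_train_command():
    """Argument list for dglke_train, copied from the command in this comment."""
    return [
        "dglke_train",
        "--dataset", "DRKG",
        "--data_files", "drkg_train.tsv", "drkg_valid.tsv", "drkg_test.tsv",
        "--format", "raw_udd_hrt",
        "--model_name", "TransE_l2",
        "--batch_size", "512",
        "--neg_sample_size", "256",
        "--hidden_dim", "400",
        "--gamma", "12.0",
        "--lr", "0.1",
        "--max_step", "100000",
        "--log_interval", "1000",
        "--batch_size_eval", "16",
        "-adv",
        "--regularization_coef", "1.00E-07",
        "--test",
        "--num_thread", "1",
        "--num_proc", "1",
        "--neg_sample_size_eval", "10000",
    ]


def training_env():
    """Copy of the current environment with the DGL backend set to PyTorch."""
    return dict(os.environ, DGLBACKEND="pytorch")


def run_training(dry_run=True):
    """Print the command when dry_run is True; otherwise launch the training."""
    cmd = build_train_command()
    if dry_run:
        print(" ".join(cmd))
        return None
    # check=True raises CalledProcessError if dglke_train exits with an error.
    return subprocess.run(cmd, env=training_env(), check=True)


if __name__ == "__main__":
    run_training(dry_run=True)
```

Calling `run_training(dry_run=False)` in a notebook cell starts the actual training. Passing the arguments as a list also avoids the problem of a long `!` command being split across lines.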
