Running a test case on Windows produces the error below. How can it be resolved? #26

Open
2778817374 opened this issue Feb 28, 2024 · 5 comments

Comments

@2778817374

Test code:
import xuance
runner = xuance.get_runner(method='dqn',
                           env='classic_control',
                           env_id='CartPole-v1',
                           is_test=False)
runner.run()
Error:
test_dqn.py:None (test_dqn.py)
test_dqn.py:6: in <module>
runner.run()
xuance\torch\runners\runner_drl.py:93: in run
self.agent.save_model("final_train_model.pth")
xuance\torch\agents\agent.py:90: in save_model
self.learner.save_model(model_path)
xuance\torch\learners\learner.py:25: in save_model
torch.save(self.policy.state_dict(), model_path)
C:\Users\G11.conda\envs\xuance_env\lib\site-packages\torch\serialization.py:422: in save
with _open_zipfile_writer(f) as opened_zipfile:
C:\Users\G11.conda\envs\xuance_env\lib\site-packages\torch\serialization.py:309: in _open_zipfile_writer
return container(name_or_buffer)
C:\Users\G11.conda\envs\xuance_env\lib\site-packages\torch\serialization.py:287: in __init__
super(_open_zipfile_writer_file, self).__init__(torch._C.PyTorchFileWriter(str(name)))
E RuntimeError: Parent directory E:\JYC_CODE\强化学习\xuance./models/dqn does not exist.
collected 0 items / 1 error

Is this caused by Windows file paths differing from Linux file paths? How should I fix it? Thanks~

@wenzhangliu
Collaborator

Hello, could you share your torch and xuance version numbers? I would like to rule out a version problem first.
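One way to read both version numbers from the active environment is through the standard library's `importlib.metadata`. This is a minimal sketch: the helper name `installed_version` is hypothetical, and it assumes the packages were installed as distributions named `torch` and `xuance` (for example with pip). When xuance is run directly from a source checkout without being installed, no metadata exists and the helper returns `None`.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed distribution version, or None if it is not installed.

    Hypothetical helper for illustration; not part of xuance.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them if your install differs.
    for name in ("torch", "xuance"):
        print(name, installed_version(name))
```

Running `pip show xuance` in the same environment reports the same metadata from the command line.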

@2778817374
Author

Thank you.
torch==1.13.0
xuance=='v1.0.9'
I found the xuance version in the library's __init__.py. Is that the right way to check it?

@2778817374
Author

In addition, when running the demo_marl.py test, I changed parser.add_argument("--device", type=str, default="cuda:0") to
parser.add_argument("--device", type=str, default="cpu"),
and the code then failed with:
Traceback (most recent call last):
  File "E:\JYC_CODE\xuance\demo_marl.py", line 22, in <module>
    is_test=parser.test)
  File "E:\JYC_CODE\xuance\xuance\common\common_tools.py", line 241, in get_runner
    runner = run_REGISTRYargs[0].runner if type(args) == list else run_REGISTRYargs.runner
  File "E:\JYC_CODE\xuance\xuance\torch\runners\runner_pettingzoo.py", line 101, in __init__
    self.marl_agents.append(REGISTRY_Agent[arg.agent](arg, self.envs, arg.device))
  File "E:\JYC_CODE\xuance\xuance\torch\agents\multi_agent_rl\maddpg_agents.py", line 21, in __init__
    policy = REGISTRY_Policyconfig.policy
  File "E:\JYC_CODE\xuance\xuance\torch\policies\deterministic_marl.py", line 499, in __init__
    normalize, initialize, activation, device)
  File "E:\JYC_CODE\xuance\xuance\torch\policies\deterministic_marl.py", line 447, in __init__
    actor_hidden_size, normalize, initialize, activation, device)
  File "E:\JYC_CODE\xuance\xuance\torch\policies\deterministic_marl.py", line 393, in __init__
    layers.extend(mlp_block(input_shape[0], action_dim, None, nn.Tanh, initialize, device)[0])
  File "E:\JYC_CODE\xuance\xuance\torch\utils\layers.py", line 15, in mlp_block
    linear = nn.Linear(input_dim, output_dim, device=device)
  File "C:\Users\G11.conda\envs\xuance_env\lib\site-packages\torch\nn\modules\linear.py", line 96, in __init__
    self.weight = Parameter(torch.empty((out_features, in_features), **factory_kwargs))
TypeError: empty() received an invalid combination of arguments - got (tuple, dtype=NoneType, device=str), but expected one of:

  • (tuple of ints size, *, tuple of names names, torch.memory_format memory_format, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
  • (tuple of SymInts size, *, torch.memory_format memory_format, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
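
How should this be resolved?
Also, I see that the torch version used in the code is 1.13.0. If I switch to the GPU build of torch, which version would be appropriate? Does it need to match the CUDA and cuDNN versions already installed on my machine? My CUDA version is 12.0.140. Do I need to downgrade CUDA and cuDNN? Sorry for the trouble, and thank you!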
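
Before changing any local CUDA or cuDNN installation, it can help to check what the installed torch build reports about itself. This is a minimal sketch using `torch.__version__`, `torch.version.cuda` and `torch.cuda.is_available()`. The helper names `describe_torch_build` and `resolve_device` are hypothetical and not part of xuance, and the fallback to "cpu" is an illustrative choice rather than the project's behavior.

```python
def describe_torch_build() -> dict:
    """Collect what the installed torch build reports about itself."""
    info = {"torch": None, "built_with_cuda": None, "cuda_available": False}
    try:
        import torch
    except ImportError:
        return info  # torch is not installed in this environment
    info["torch"] = torch.__version__
    info["built_with_cuda"] = torch.version.cuda  # None on CPU-only builds
    info["cuda_available"] = torch.cuda.is_available()
    return info


def resolve_device(requested: str = "cuda:0") -> str:
    """Return `requested`, falling back to "cpu" when CUDA cannot be used."""
    if requested.startswith("cuda") and not describe_torch_build()["cuda_available"]:
        return "cpu"
    return requested


if __name__ == "__main__":
    print(describe_torch_build())
    print(resolve_device("cuda:0"))
```

With such a helper, the demo's argument could be declared as `parser.add_argument("--device", type=str, default=resolve_device("cuda:0"))`, so the default never needs to be edited by hand on a CPU-only machine.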

@wenzhangliu
Collaborator

wenzhangliu commented Feb 28, 2024

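Hello, the second problem is unrelated to the torch version. It is a small bug introduced when we recently optimized the MARL code. It has now been fixed and the latest code has been uploaded. We apologize for the inconvenience.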

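The first problem, however, has never occurred on my machine. I use macOS, and like you I suspect it is caused by Windows resolving paths differently from Linux. Thank you for raising this issue. We will test thoroughly on Windows and rule out similar problems.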
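
Whatever the root cause turns out to be, the RuntimeError in the first traceback states that the parent directory of the model path does not exist when `torch.save` is called. One possible workaround is to create that directory just before saving. This is a minimal sketch: the helper name `ensure_parent_dir` is hypothetical and not part of xuance or torch, and the stand-in file write below only keeps the example runnable without torch.

```python
import os


def ensure_parent_dir(model_path: str) -> str:
    """Create the parent directory of model_path if it is missing.

    Hypothetical helper for illustration; not part of xuance or torch.
    Returns the normalized path so it can be passed straight to a save call.
    """
    # normpath collapses redundant "./" segments; on Windows it also
    # converts "/" separators to "\".
    normalized = os.path.normpath(model_path)
    parent = os.path.dirname(normalized)
    if parent:  # an empty parent means the current directory, which exists
        os.makedirs(parent, exist_ok=True)
    return normalized


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as root:
        target = ensure_parent_dir(
            os.path.join(root, "models", "dqn", "final_train_model.pth")
        )
        # With torch installed, the save would look like:
        #     torch.save(policy.state_dict(), target)
        with open(target, "wb") as f:  # stand-in write so the sketch runs without torch
            f.write(b"placeholder")
        print(os.path.exists(target))
```

Creating `./models/dqn` by hand in the working directory before calling `runner.run()` should have the same effect for a single run.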

@2778817374
Copy link
Author

Thank you so much!!!
