Windows 11: CUSOLVER_STATUS_INTERNAL_ERROR when training an image classification task in PaddleX #1734

Open
jiaolongxue opened this issue Sep 15, 2023 · 3 comments
@jiaolongxue

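Problem description

On Windows 11, running PaddleX and starting training for an image classification task produces an error.
image

Single GPU: RTX 4090

Reproduction

Training parameters

image

Error log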

This log file path is E:\paddlex\projects\P0005\T0007\err.log
Note: entries marked WARNING/INFO are only warnings or informational messages, not errors
D:\soft-install\PaddleX_GUI_2.1.0_win10\paddle\tensor\creation.py:130: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe. 
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  if data.dtype == np.object:
Process Process-1:4:
Traceback (most recent call last):
  File "multiprocessing\process.py", line 297, in _bootstrap
  File "multiprocessing\process.py", line 99, in run
  File "paddlexui\pms\model_tasks\tasks.py", line 73, in _call_paddlex_train
  File "paddlexui\pms\model_tasks\train\classification.py", line 118, in train
  File "D:\soft-install\PaddleX_GUI_2.1.0_win10\paddlex\cv\models\classifier.py", line 888, in __init__
    model_name=model_name, num_classes=num_classes, **params)
  File "D:\soft-install\PaddleX_GUI_2.1.0_win10\paddlex\cv\models\classifier.py", line 70, in __init__
    self.net = self.build_net(**params)
  File "D:\soft-install\PaddleX_GUI_2.1.0_win10\paddlex\cv\models\classifier.py", line 75, in build_net
    **params)
  File "D:\soft-install\PaddleX_GUI_2.1.0_win10\paddlex\ppcls\arch\backbone\legendary_models\pp_lcnet.py", line 352, in PPLCNet_x1_0
    model = PPLCNet(scale=1.0, **kwargs)
  File "D:\soft-install\PaddleX_GUI_2.1.0_win10\paddlex\ppcls\arch\backbone\legendary_models\pp_lcnet.py", line 183, in __init__
    stride=2)
  File "D:\soft-install\PaddleX_GUI_2.1.0_win10\paddlex\ppcls\arch\backbone\legendary_models\pp_lcnet.py", line 93, in __init__
    bias_attr=False)
  File "D:\soft-install\PaddleX_GUI_2.1.0_win10\paddle\nn\layer\conv.py", line 656, in __init__
    data_format=data_format)
  File "D:\soft-install\PaddleX_GUI_2.1.0_win10\paddle\nn\layer\conv.py", line 135, in __init__
    default_initializer=_get_default_param_initializer())
  File "D:\soft-install\PaddleX_GUI_2.1.0_win10\paddle\fluid\dygraph\layers.py", line 424, in create_parameter
    default_initializer)
  File "D:\soft-install\PaddleX_GUI_2.1.0_win10\paddle\fluid\layer_helper_base.py", line 378, in create_parameter
    **attr._to_kwargs(with_initializer=True))
  File "D:\soft-install\PaddleX_GUI_2.1.0_win10\paddle\fluid\framework.py", line 3137, in create_parameter
    initializer(param, self)
  File "D:\soft-install\PaddleX_GUI_2.1.0_win10\paddle\fluid\initializer.py", line 719, in __call__
    stop_gradient=True)
  File "D:\soft-install\PaddleX_GUI_2.1.0_win10\paddle\fluid\framework.py", line 3167, in append_op
    kwargs.get("stop_gradient", False))
  File "D:\soft-install\PaddleX_GUI_2.1.0_win10\paddle\fluid\dygraph\tracer.py", line 45, in trace_op
    not stop_gradient)
OSError: (External) CUSOLVER error(7). 
  [Hint: 'CUSOLVER_STATUS_INTERNAL_ERROR'. An internal cuSolver operation failed. This error is usually caused by a cudaMemcpyAsync() failure.To correct: check that the hardware, an appropriate version of the driver, and the cuSolver library are correctly installed. Also, check that the memory passed as a parameter to the routine is not being deallocated prior to the routine’s completion.] (at ..\paddle/fluid/platform/device_context.h:418)
  [operator < gaussian_random > error]
@NeTChengGH

OP, same question here. I am hitting exactly the same problem.

@NeTChengGH

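RTX 40-series GPUs have problems running with CUDA 11.2. On Windows 11 I installed CUDA 11.7 and cuDNN 8.4.1, then went to the CUDA installation directory and copied its bin folder into the PaddleX root directory, replacing the existing files. After that I ran the exe again and training started normally.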
image
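
The copy step described above could be scripted roughly as follows. This is only a minimal sketch under assumptions, not a procedure confirmed by the PaddleX maintainers: it assumes that "copy and replace" means copying the `*.dll` files from the CUDA `bin` folder directly into the PaddleX root and overwriting same-named files. The function name `copy_cuda_dlls` and the `.bak` backup step are hypothetical additions for safety, and the example paths in the comment are assumptions to adjust for your own installation.

```python
import shutil
from pathlib import Path


def copy_cuda_dlls(cuda_bin, paddlex_root, backup_suffix=".bak"):
    """Copy every *.dll from cuda_bin into paddlex_root.

    Hypothetical helper: a DLL that would be overwritten is first saved
    next to itself with backup_suffix appended, so the change can be undone.
    Returns the list of destination paths that were written.
    """
    cuda_bin = Path(cuda_bin)
    paddlex_root = Path(paddlex_root)
    if not cuda_bin.is_dir():
        raise FileNotFoundError(f"CUDA bin directory not found: {cuda_bin}")
    if not paddlex_root.is_dir():
        raise FileNotFoundError(f"PaddleX root not found: {paddlex_root}")

    copied = []
    for dll in sorted(cuda_bin.glob("*.dll")):
        target = paddlex_root / dll.name
        if target.exists():
            backup = target.with_name(target.name + backup_suffix)
            # Keep only the first backup so the original bundled DLL survives reruns.
            if not backup.exists():
                shutil.copy2(target, backup)
        shutil.copy2(dll, target)
        copied.append(target)
    return copied


# Example call (both paths are assumptions, adjust them to your machine):
# copy_cuda_dlls(
#     r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin",
#     r"D:\soft-install\PaddleX_GUI_2.1.0_win10",
# )
```

Close the PaddleX GUI before running something like this, since Windows may refuse to overwrite DLLs that are in use. To roll back, restore the `.bak` files over the copied ones.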

@green512

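> RTX 40-series GPUs have problems running with CUDA 11.2. On Windows 11 I installed CUDA 11.7 and cuDNN 8.4.1, then went to the CUDA installation directory and copied its bin folder into the PaddleX root directory, replacing the existing files. After that I ran the exe again and training started normally.

Exactly right. I followed your instructions and the error is resolved.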
