Describe the bug
I'm not able to switch my tuner to a RemoteLauncher using SageMaker. I get the following error:
Traceback (most recent call last):
  File "remote_main.py", line 53, in <module>
    tuner = Tuner.load(tuner_path)
  File "/opt/ml/code/syne_tune/tuner.py", line 564, in load
    tuner = dill.load(f)
  File "/opt/conda/lib/python3.8/site-packages/dill/_dill.py", line 272, in load
    return Unpickler(file, ignore=ignore, **kwds).load()
  File "/opt/conda/lib/python3.8/site-packages/dill/_dill.py", line 419, in load
    obj = StockUnpickler.load(self)
  File "/opt/conda/lib/python3.8/site-packages/dill/_dill.py", line 409, in find_class
    return StockUnpickler.find_class(self, module, name)
  File "/opt/ml/code/syne_tune/backend/sagemaker_backend/sagemaker_backend.py", line 24, in <module>
    from sagemaker.interactive_apps import TensorBoardApp
ModuleNotFoundError: No module named 'sagemaker.interactive_apps'
2024-02-08 14:43:13,239 sagemaker-training-toolkit INFO Waiting for the process to finish and give a return code.
2024-02-08 14:43:13,239 sagemaker-training-toolkit INFO Done waiting for a return code. Received 1 from exiting process.
2024-02-08 14:43:13,240 sagemaker-training-toolkit ERROR Reporting training FAILURE
2024-02-08 14:43:13,240 sagemaker-training-toolkit ERROR ExecuteUserScriptError:
ExitCode 1
ErrorMessage "ModuleNotFoundError: No module named 'sagemaker.interactive_apps'"
Command "/opt/conda/bin/python3.8 remote_main.py --no_tuner_logging False --store_logs False --tuner_path tuner/"
2024-02-08 14:43:13,240 sagemaker-training-toolkit ERROR Encountered exit_code 1
2024-02-08 14:43:37 Uploading - Uploading generated training model
2024-02-08 14:43:37 Failed - Training job failed
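The traceback shows the failure happening while the tuner is unpickled: loading it imports `sagemaker_backend.py`, which in turn imports `sagemaker.interactive_apps`. A small diagnostic along the following lines could be run inside the training container to confirm whether that submodule is present in the installed `sagemaker` package. This is only a sketch; the helper name `has_module` is made up for illustration and is not part of Syne Tune or the SageMaker SDK.

```python
import importlib.util


def has_module(dotted_name: str) -> bool:
    """Return True if `dotted_name` can be located.

    Hypothetical helper for diagnosis. Note that find_spec imports the
    parent packages of a dotted name, but not the target module itself.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        # Raised when a parent package (e.g. `sagemaker` itself) is
        # missing or fails to import.
        return False


if __name__ == "__main__":
    for name in ("sagemaker", "sagemaker.interactive_apps"):
        status = "found" if has_module(name) else "MISSING"
        print(f"{name}: {status}")
```

If `sagemaker` is found but `sagemaker.interactive_apps` is missing, that would be consistent with the container's `sagemaker` package differing from the one the Syne Tune code was written against, though the log alone does not confirm this.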
To reproduce
Steps to reproduce the behavior:
Use SageMaker trial backend
Use RemoteLauncher with SageMaker tuner backend
Expected behavior
The remote job installs the necessary dependencies and then orchestrates the HPO job.
Paste the output of the pip freeze command below:
Latest Syne Tune version on a SageMaker notebook, launching the remote tuner on SageMaker.
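Since no `pip freeze` output is included, the exact package versions on the notebook and in the container are unknown. As a rough sketch, the versions most relevant to this error could be printed in both environments with the standard library and then compared. The distribution names `syne-tune`, `sagemaker`, and `dill` are assumptions about how the packages are published, and `installed_version` is a hypothetical helper.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names below are assumed, not taken from the report.
    for dist in ("syne-tune", "sagemaker", "dill"):
        version = installed_version(dist)
        print(f"{dist}: {version if version is not None else 'not installed'}")
```

Running this once on the notebook and once inside the remote job would show whether the two environments resolve to different `sagemaker` versions.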
Hey @austinmw, sorry for the delay. Could you post a script that reproduces the error? From the error trace, it looks like you are resuming a previous tuning job; is that right? Does launch_height_sagemaker_remotely.py in the examples work for you?