Bug: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 205. in Chinese path, not in English path #1887

Open
huangliang0828 opened this issue Jul 12, 2023 · 0 comments

huangliang0828 commented Jul 12, 2023

Bug Description

My local Python scripts (mainly ones that use autokeras) raise the following error when they sit under a directory path containing Chinese characters, but run normally under an English-only path.
File D:\ProgramData\miniconda3\envs\autokeras_1\lib\site-packages\tensorflow\python\eager\execute.py:54 in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 205: invalid continuation byte
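
A minimal sketch of one way a message like this can arise, assuming (the traceback does not confirm it) that a path containing Chinese characters is encoded with Windows code page 936 (GBK) somewhere and later decoded as UTF-8. The path below is hypothetical; the real project directory is not given in the report.

```python
# Hypothetical path with a Chinese component; not taken from the report.
chinese_path = "D:\\中文\\autokeras"
ascii_path = "D:\\project\\autokeras"

# Bytes a process using code page 936 (GBK) might produce for the path.
gbk_bytes = chinese_path.encode("gbk")

try:
    gbk_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError as exc:
    # GBK double-byte sequences are generally not valid UTF-8.
    decoded_ok = False
    print(exc)

# An ASCII-only path has the same bytes in both encodings, so it round-trips.
round_trip = ascii_path.encode("gbk").decode("utf-8")
print(decoded_ok, round_trip == ascii_path)
```

The reported byte 0xd5 lies in the range GBK uses for the first byte of a double-byte character, which is consistent with this reading but does not prove it.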

Dataset used
from sklearn.datasets import fetch_20newsgroups
fetch_20newsgroups(subset="train", shuffle=True, random_state=42, categories=categories)
or others:
tf.keras.utils.get_file("train.csv", TRAIN_DATA_URL)
or even:
mnist.load_data(), imdb.load_data()

I don't know what the problem is.
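
To narrow down which location is involved, a small check like the one below can list the non-ASCII components of the working directory and the home directory (where downloaded datasets are often cached). This is only a diagnostic sketch; `non_ascii_parts` is a hypothetical helper name, not part of autokeras or TensorFlow.

```python
import os
from pathlib import PureWindowsPath


def non_ascii_parts(path):
    """Return the components of `path` that contain non-ASCII characters."""
    return [part for part in PureWindowsPath(path).parts if not part.isascii()]


# Inspect the directories the script is most likely to read from or write to.
for label, location in (("cwd", os.getcwd()), ("home", os.path.expanduser("~"))):
    print(label, location, non_ascii_parts(location))
```

An empty list for every location would suggest the offending path comes from somewhere else, for example a project or log directory passed explicitly to the library.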

Setup Details

Include the details about the versions of:

  • OS type and version: Windows 10 64-bit, miniconda 2023
  • Python: 3.9.13 or 3.9.16; Spyder 5.3.3
  • autokeras: >=1.0.20
  • keras-tuner: >=1.1.0
  • scikit-learn: >=1.0.1
  • numpy: >=1.21.5
  • pandas: >=1.3.5
  • tensorflow: >=2.9.1 or 2.10

Additional context

In CMD, the code page was set with chcp 65001 (UTF-8) or chcp 936 (GBK); the error occurs with either setting.
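
Since chcp changes the console code page, which is not necessarily the encoding Python uses for files and paths, it may help to print what the interpreter actually reports. The sketch below uses only standard-library calls; `encoding_report` is a hypothetical helper name.

```python
import locale
import sys


def encoding_report():
    """Collect the encodings the running interpreter reports."""
    return {
        "filesystem": sys.getfilesystemencoding(),
        "preferred": locale.getpreferredencoding(False),
        "stdout": getattr(sys.stdout, "encoding", None),
        "utf8_mode": bool(sys.flags.utf8_mode),
    }


report = encoding_report()
for name, value in report.items():
    print(f"{name}: {value}")
```

Running this in the same Spyder or CMD session that fails would show whether the preferred encoding stays at cp936 regardless of the chcp setting.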

All scripts hit the UnicodeDecodeError at the clf.fit(.....) call:
Search: Running Trial #1
Hyperparameter     | Value   | Best Value So Far
text_block_1/bl... | vanilla | ?
........
optimizer          | adam    | ?
learning_rate      | 0.001   | ?

Epoch 1/100
Traceback (most recent call last):

Cell In[7], line 1
clf.fit(doc_train, label_train,epochs=100, verbose=2)
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\tasks\text.py:160 in fit
history = super().fit(
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\auto_model.py:292 in fit
history = self.tuner.search(
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\engine\tuner.py:193 in search
super().search(
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\keras_tuner\engine\base_tuner.py:179 in search
results = self.run_trial(trial, *fit_args, **fit_kwargs)
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\keras_tuner\engine\tuner.py:304 in run_trial
obj_value = self._build_and_fit_model(trial, *args, **copied_kwargs)
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\engine\tuner.py:101 in _build_and_fit_model
_, history = utils.fit_with_adaptive_batch_size(
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\utils\utils.py:88 in fit_with_adaptive_batch_size
history = run_with_adaptive_batch_size(
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\utils\utils.py:101 in run_with_adaptive_batch_size
history = func(x=x, validation_data=validation_data, **fit_kwargs)
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\autokeras\utils\utils.py:89 in <lambda>
batch_size, lambda **kwargs: model.fit(**kwargs), **fit_kwargs
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\keras\utils\traceback_utils.py:67 in error_handler
raise e.with_traceback(filtered_tb) from None
File D:\ProgramData\miniconda3\envs\autokeras\lib\site-packages\tensorflow\python\eager\execute.py:54 in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 205: invalid continuation byte
