UnicodeDecodeError #546

Open
abedhammoud opened this issue Apr 27, 2024 · 2 comments

Comments

@abedhammoud

I am getting this error when running the hook

-   repo: https://github.com/numpy/numpydoc
    rev: v1.7.0
    hooks:
      - id: numpydoc-validation
        exclude: (test|docs|labs|alant-st|apps)/.*

I am not really sure if it is my code that is causing this error, or the hook itself.

C:\Users\abed\.cache\pre-commit\repoh5hb9w09\py_env-python3.12\Lib\site-packages\numpydoc\docscrape.py:456: UserWarning: Unknown section Return
  warn(msg)
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\abed\.cache\pre-commit\repoh5hb9w09\py_env-python3.12\Scripts\validate-docstrings.EXE\__main__.py", line 7, in <module>
  File "C:\Users\abed\.cache\pre-commit\repoh5hb9w09\py_env-python3.12\Lib\site-packages\numpydoc\hooks\validate_docstrings.py", line 400, in main
    findings.extend(process_file(file, config_options))
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\abed\.cache\pre-commit\repoh5hb9w09\py_env-python3.12\Lib\site-packages\numpydoc\hooks\validate_docstrings.py", line 336, in process_file
    module_node = ast.parse(file.read(), filepath)
                            ^^^^^^^^^^^
  File "C:\Users\abed\miniconda3\envs\adev\Lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 7689: character maps to <undefined>

@larsoner
Collaborator

Trying a fix in #550!

@larsoner
Collaborator

Based on our tests I'm not sure if this is expected or not 🤔

The system encoding on Windows is cp1252 by default IIRC, so in that sense it makes sense that a UTF-8 file would be a problem. Maybe we need to provide an option somewhere to specify the encoding, or update the pre-commit docs on how to set it for the project / all systems? It seems like you can hack in env vars, so we could maybe use PYTHONUTF8 or something, but that's not pretty 🙁
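
For reference, here is a minimal sketch of one way to read source files without depending on the system encoding. It is not necessarily what #550 does; `parse_source` is a hypothetical helper name, and the demo file is made up for illustration. It uses the standard library's `tokenize.open`, which detects the encoding from a BOM or PEP 263 coding cookie and otherwise falls back to UTF-8. The demo file contains a right double quotation mark (U+201D), whose UTF-8 encoding ends in byte 0x9d, the same byte reported in the traceback above.

```python
import ast
import tempfile
import tokenize
from pathlib import Path


def parse_source(filepath):
    """Parse a Python file without relying on the locale's default encoding.

    Hypothetical helper for illustration; not part of numpydoc.
    """
    # tokenize.open() picks the encoding from a BOM or PEP 263 cookie and
    # defaults to UTF-8, independent of the system code page.
    with tokenize.open(str(filepath)) as file:
        return ast.parse(file.read(), str(filepath))


with tempfile.TemporaryDirectory() as tmp:
    # Made-up demo module whose docstring contains curly quotes.
    demo = Path(tmp) / "demo.py"
    demo.write_bytes('def f():\n    """Says \u201chi\u201d."""\n'.encode("utf-8"))

    # Decoding as cp1252 fails because byte 0x9d is undefined in that code page.
    try:
        demo.read_text(encoding="cp1252")
        cp1252_failed = False
    except UnicodeDecodeError:
        cp1252_failed = True

    # Reading through tokenize.open() succeeds on the same file.
    module_node = parse_source(demo)
    docstring = ast.get_docstring(module_node.body[0])
```

Passing `encoding="utf-8"` to `open()` would also work for most files, but would ignore an explicit coding cookie. Setting `PYTHONUTF8=1` in the environment is the other route mentioned above; it changes the default for the whole interpreter rather than for this one read.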
