UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 512: invalid start byte #4

Open
minn333 opened this issue Jul 3, 2022 · 2 comments

minn333 commented Jul 3, 2022

Dear author, I encountered a UnicodeDecodeError while running mutpred_merge.py. I tried to fix it by changing the read_csv call to
data = pd.read_csv("intermediates/scores/" + filename, names=cols, header=None, sep="|", encoding='unicode_escape')

but that did not fix it.

A new error appeared instead: UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 4598-4599: truncated \xXX escape

Do you have any suggestions?

Thanks a lot!


Traceback (most recent call last):
  File "/$User/MutPredMerge-master/mutpred_merge.py", line 202, in <module>
    merged_variants = merge()
  File "/$User/MutPredMerge-master/mutpred_merge.py", line 110, in merge
    data = pd.read_csv("intermediates/scores/" + filename, names=cols, header=None, sep="|")
  File "/$PATH/snakemake/lib/python3.10/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/$PATH/snakemake/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 680, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/$PATH/snakemake/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 575, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/$PATH/snakemake/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 933, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "/$PATH/snakemake/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1235, in _make_engine
    return mapping[engine](f, **self.options)
  File "/$PATH/snakemake/lib/python3.10/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 75, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 544, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 734, in pandas._libs.parsers.TextReader._get_header
  File "pandas/_libs/parsers.pyx", line 847, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas/_libs/parsers.pyx", line 1952, in pandas._libs.parsers.raise_parser_error
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 512: invalid start byte
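
A note on the attempted fix: 'unicode_escape' interprets backslash sequences such as \x in the file rather than decoding raw bytes, which is probably why it produced the second error. One possible workaround, offered only as a sketch, is to try UTF-8 first and fall back to latin-1, which maps every byte to a character and therefore cannot raise a decode error. The helper name `read_pipe_table` is hypothetical and not part of MutPredMerge, and whether latin-1 is the file's real encoding is an assumption; if the file is actually binary or corrupted, this only hides the problem.

```python
import pandas as pd


def read_pipe_table(path, cols):
    """Read a '|'-separated file, falling back to latin-1 if it is not valid UTF-8.

    Hypothetical helper for illustration; not part of mutpred_merge.py.
    """
    try:
        return pd.read_csv(path, names=cols, header=None, sep="|", encoding="utf-8")
    except UnicodeDecodeError:
        # latin-1 assigns a character to every byte value, so decoding cannot fail.
        # Non-ASCII characters may still be wrong if the true encoding differs.
        return pd.read_csv(path, names=cols, header=None, sep="|", encoding="latin-1")
```

In mutpred_merge.py this would stand in for the call on line 110, for example `data = read_pipe_table("intermediates/scores/" + filename, cols)`. It is still worth checking what the byte 0x89 is doing in the file, since a scores file would normally be plain ASCII.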

Collaborator

trberg commented Jul 5, 2022

What type of input data are you using? Can you share the first few rows of your input file?
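
To collect the first few rows without hitting the same decode error, something like the following could be used. It is a minimal sketch using only the standard library; the function names `head_lines` and `bytes_around` are made up for this example. Note that the position reported in the error (512) may be relative to the chunk being decoded rather than an absolute file offset, so treat the offset as a starting point.

```python
def head_lines(path, n=5):
    """Return the first n lines, rendering undecodable bytes as \\xNN escapes."""
    lines = []
    with open(path, "r", encoding="utf-8", errors="backslashreplace") as fh:
        for _ in range(n):
            line = fh.readline()
            if not line:  # end of file reached before n lines
                break
            lines.append(line.rstrip("\n"))
    return lines


def bytes_around(path, offset, context=16):
    """Return the raw bytes from `context` bytes before `offset` to `context` after."""
    with open(path, "rb") as fh:
        fh.seek(max(0, offset - context))
        return fh.read(2 * context)
```

For example, `print(head_lines("intermediates/scores/" + filename))` and `print(bytes_around("intermediates/scores/" + filename, 512))` would show the top of the file and the raw bytes near the reported position.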

Author

minn333 commented Jul 7, 2022 via email
