UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 1701: character maps to <undefined> #2

Open
pablonieto0981 opened this issue Dec 30, 2021 · 1 comment


pablonieto0981 commented Dec 30, 2021

Any idea how to fix this? I am getting the following error:

File "C:/Users/******/bookNLP.py", line 28, in <module>
booknlp.process(input_file, output_directory, book_id)

File "C:\Users*******\Anaconda3\envs\BookNLP\lib\site-packages\booknlp\booknlp.py", line 17, in process
self.booknlp.process(inputFile, outputFolder, idd)

File "C:\Users******\Anaconda3\envs\BookNLP\lib\site-packages\booknlp\english\english_booknlp.py", line 426, in process
genderEM=GenderEM(tokens=tokens, entities=entities, refs=refs, genders=self.gender_cats, hyperparameterFile=self.gender_hyperparameterFile)

File "C:\Users******\Anaconda3\envs\BookNLP\lib\site-packages\booknlp\english\gender_inference_model_1.py", line 71, in __init__
self.read_hyperparams(hyperparameterFile)

File "C:\Users******\Anaconda3\envs\BookNLP\lib\site-packages\booknlp\english\gender_inference_model_1.py", line 167, in read_hyperparams
header=file.readline().rstrip()

File "C:\Users*******\Anaconda3\envs\BookNLP\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]

UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 1701: character maps to <undefined>


pablonieto0981 commented Dec 30, 2021

Just fixed it 👍🏽. Basically, I did this:

In booknlp\english\gender_inference_model_1.py, I changed open(filename) to open(filename, encoding='UTF8'), like this:

def read_hyperparams(self, filename):
	self.hyperparameters={}
	with open(filename, encoding='UTF8') as file:
		header=file.readline().rstrip()
		gender_mapping={}
		for idx, val in enumerate(header.split("\t")[2:]):
			if val in self.genderID:
				gender_mapping[self.genderID[val]]=idx+2
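
For anyone hitting the same error, here is a minimal sketch of why the change above helps. It assumes the hyperparameter file is UTF-8 encoded and that open() without an encoding argument fell back to the Windows locale encoding (cp1252, as the cp1252.py frame in the traceback suggests). The sample file contents and the read_header helper are hypothetical stand-ins, not BookNLP code.

```python
import os
import tempfile


def read_header(path, encoding=None):
    """Return the first line of a text file, decoded with the given encoding."""
    with open(path, encoding=encoding) as f:
        return f.readline().rstrip()


# Hypothetical stand-in for the hyperparameter file: a tab-separated header
# containing a right double quotation mark (U+201D). Its UTF-8 bytes are
# E2 80 9D, and byte 0x9d has no mapping in Python's cp1252 codec.
sample = "name\tcount\t\u201dshe\u201d\n"
fd, path = tempfile.mkstemp(suffix=".tsv")
with os.fdopen(fd, "w", encoding="utf-8", newline="") as f:
    f.write(sample)

# Decoding as cp1252 (what a Windows default typically gives) fails.
try:
    read_header(path, encoding="cp1252")
    cp1252_failed = False
except UnicodeDecodeError as err:
    cp1252_failed = True
    print("cp1252 failed:", err)

# Decoding as UTF-8 (the fix from this comment) succeeds.
header = read_header(path, encoding="utf-8")
print(header.split("\t"))

os.remove(path)
```

Passing encoding explicitly makes the read independent of the machine's locale, which is likely why the same code can work on Linux or macOS and fail on Windows.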
