Handling UTF-8-sig for encoding? #102

Open
b-meson opened this issue Oct 27, 2022 · 0 comments

b-meson commented Oct 27, 2022

Hi all,

I was recently trying to use csvlink to filter the data from two well-formatted data sets. I followed the documentation, but it did not work: I repeatedly got the following error even though the field "sample" is present in both of my files.

csvlink: error: Could not find field 'sample' in input

Ultimately, I dumped the CSV and noticed that my header was printing as \ufeffsample, which led me to figure out that this was a byte order mark (BOM) issue. I made the following change to csvlink.py and the code ran for me.

-                self.input_1 = open(self.configuration['input'][0], encoding='utf-8').read()
+                self.input_1 = open(self.configuration['input'][0], encoding='utf-8-sig').read()
             except IOError:
                 raise self.parser.error("Could not find the file %s" %
                                    (self.configuration['input'][0], ))

             try:
-                self.input_2 = open(self.configuration['input'][1], encoding='utf-8').read()
+                self.input_2 = open(self.configuration['input'][1], encoding='utf-8-sig').read()
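For anyone who wants to reproduce the symptom outside csvlink, here is a minimal sketch of the difference between the two codecs. The sample bytes and the `header_fields` helper are made up for illustration and are not part of csvlink.

```python
import csv
import io

# Hypothetical file contents: a UTF-8 CSV that starts with a byte order mark.
RAW_WITH_BOM = b"\xef\xbb\xbfsample,value\nA1,10\n"
RAW_WITHOUT_BOM = b"sample,value\nA1,10\n"


def header_fields(data, encoding):
    """Decode the bytes with the given encoding and return the CSV header row."""
    text = data.decode(encoding)
    return next(csv.reader(io.StringIO(text)))


# With plain 'utf-8' the BOM is kept as '\ufeff' on the first field name,
# so a lookup for 'sample' fails.
plain = header_fields(RAW_WITH_BOM, "utf-8")

# With 'utf-8-sig' the codec consumes a leading BOM if one is present.
stripped = header_fields(RAW_WITH_BOM, "utf-8-sig")

# 'utf-8-sig' also decodes files that have no BOM at all.
no_bom = header_fields(RAW_WITHOUT_BOM, "utf-8-sig")

print("sample" in plain)     # False
print("sample" in stripped)  # True
print("sample" in no_bom)    # True
```

Since `utf-8-sig` reads files both with and without a BOM, the change above should not affect inputs that already worked.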