DictReader seems to not be handling utf-encoding properly #154

Open
thomasyu888 opened this issue Apr 22, 2023 · 3 comments
Labels
bug Something isn't working

thomasyu888 commented Apr 22, 2023

Description of the bug

This test samplesheet seems to fail check_samplesheet.py. I removed the data and kept only the header row (see the attached file below):

cat test.csv 
sample,fastq_1,fastq_2,seq_type

Command used and terminal output

import csv
import sys

# sniff_format and required_columns come from check_samplesheet.py (not shown here).
with open("test.csv", "r") as in_handle:
    reader = csv.DictReader(in_handle, dialect=sniff_format(in_handle))
    # Validate the existence of the expected header columns.
    if not required_columns.issubset(reader.fieldnames):
        req_cols = ", ".join(required_columns)
        sys.exit(1)

reader.fieldnames
['\ufeffsample', 'fastq_1', 'fastq_2', 'seq_type']
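
The leading '\ufeff' in the first field name is a byte order mark (BOM), which suggests test.csv was saved as UTF-8 with a BOM. A minimal sketch of the difference, assuming that is the cause: Python's "utf-8-sig" codec drops a leading BOM when one is present and otherwise behaves like "utf-8". The `read_fieldnames` helper and the temporary file are hypothetical and only mimic the attached file; this is not the pipeline's actual fix.

```python
import csv
import os
import tempfile


def read_fieldnames(path, encoding):
    """Return the CSV header of `path`, decoded with the given encoding."""
    with open(path, "r", encoding=encoding, newline="") as handle:
        return csv.DictReader(handle).fieldnames


# Recreate a header-only samplesheet that starts with a UTF-8 BOM
# (assumption: this matches how the attached test.csv was saved).
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "test.csv")
with open(path, "wb") as handle:
    handle.write(b"\xef\xbb\xbfsample,fastq_1,fastq_2,seq_type\n")

# Plain UTF-8 keeps the BOM as part of the first column name.
with_bom = read_fieldnames(path, "utf-8")

# utf-8-sig strips the BOM, so the first column name is clean.
without_bom = read_fieldnames(path, "utf-8-sig")

print(with_bom)
print(without_bom)
```

With the BOM left in place, a check such as `required_columns.issubset(reader.fieldnames)` fails because '\ufeffsample' is not equal to 'sample'.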

Relevant files

test.csv

System information

No response

A quick fix is to read the file in and write it back out with pandas, but I thought I would report this.
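
For anyone without pandas at hand, the same kind of clean-up can be sketched with the standard library alone. This assumes the only problem is a leading UTF-8 BOM; `strip_utf8_bom` and the file names are hypothetical, not part of the pipeline.

```python
import codecs
import os
import tempfile


def strip_utf8_bom(src_path, dst_path):
    """Copy src_path to dst_path, dropping a leading UTF-8 BOM if present."""
    with open(src_path, "rb") as src:
        data = src.read()
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    with open(dst_path, "wb") as dst:
        dst.write(data)


# Demo on a throwaway file that mimics the header-only samplesheet.
work_dir = tempfile.mkdtemp()
original = os.path.join(work_dir, "test.csv")
fixed = os.path.join(work_dir, "test_fixed.csv")
with open(original, "wb") as handle:
    handle.write(codecs.BOM_UTF8 + b"sample,fastq_1,fastq_2,seq_type\n")

strip_utf8_bom(original, fixed)

with open(fixed, "rb") as handle:
    cleaned = handle.read()
print(cleaned)
```

Working on raw bytes leaves line endings and every other character untouched, and a file that has no BOM is copied unchanged.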

@thomasyu888 thomasyu888 added the bug Something isn't working label Apr 22, 2023
@christopher-mohr christopher-mohr added this to the 2.1 milestone Jun 21, 2023
@christopher-mohr
Collaborator

Thanks for reporting this and sorry for the late reply. We will fix this before the next release.

@martinabetti-97

Thank you, Thomas, for suggesting the quick fix. However, this approach is not working for me. Did you rewrite the CSV with the standard pd.to_csv and the default parameters? @thomasyu888

@thomasyu888
Author

Hi @martinabetti-97, I did use the standard pd.to_csv with the default parameters. I don't remember which version of pandas I was using.
