Chapter 03 Yelp Dataset has a Typo #30

Open
amancioandre opened this issue Jun 11, 2020 · 0 comments

Hi everyone,

The Chapter 3 code fails to load the Yelp data because the last line of the dataset is truncated and its opening quote is never closed:

Line Review
73357: "1","Capital City Transfer han

Passing the nrows argument with the number of rows minus one (73356), so that the broken last line is never read, fixed it for me:

train_reviews = pd.read_csv(args.raw_train_dataset_csv, header=None, names = ['rating', 'review'], nrows=73356)

Or

train_reviews = pd.read_csv(args.raw_train_dataset_csv, header=None, names = ['rating', 'review'], error_bad_lines=False)

Or just append a closing " to the end of that line.
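
For anyone who prefers to repair the file rather than skip the row, here is a minimal sketch of the "append a quote" fix. It assumes the CSV escapes internal quotes by doubling them (""), so a well-formed file contains an even number of " characters. The function name repair_unterminated_quote and the two-row sample are made up for illustration and are not part of the book's code or the real dataset.

```python
import csv
import io


def repair_unterminated_quote(text: str) -> str:
    """Append a closing quote when the CSV text has an odd number of '"'.

    Heuristic only: it assumes quotes inside fields are escaped by
    doubling them, so balanced files always have an even quote count.
    """
    if text.count('"') % 2 == 0:
        return text  # quotes are balanced; nothing to repair
    stripped = text.rstrip("\r\n")
    # Re-attach any trailing newline after the added closing quote.
    return stripped + '"' + text[len(stripped):]


# Tiny stand-in for the truncated file (not the real Yelp data).
sample = '"2","Great food"\n"1","Capital City Transfer han'
repaired = repair_unterminated_quote(sample)

# The standard csv module now parses both rows cleanly.
rows = list(csv.reader(io.StringIO(repaired)))
print(rows)
```

In practice you would read the raw CSV into a string, run it through the function, write the result to a new file, and point pd.read_csv at that file without needing nrows.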

Still, it would be nice to fix this typo in the dataset itself.
