header_row not working when loading from file #195

Open
Mindavi opened this issue Apr 13, 2022 · 0 comments

I'm trying to load a file from disk using a csv::CSVReader. I'm setting up the CSVFormat to auto-guess the file format.

However, I'm having trouble skipping the first row(s) of a csv file using this library.

I made an example test file and test case to show what's going wrong.

skip_rows.csv:

a;b;c;d
this;is;before;header
this;is;before;header_too
timestamp;distance;angle;amplitude
22857782;30000;314159;0
22857786;30000;314109;0

test_read_csv_file.cpp:

// Could be added to test_read_csv_file.cpp
TEST_CASE("Skip rows loaded from file", "[skip_rows_file]")
{
  auto format = csv::CSVFormat::guess_csv();
  format.header_row(3);

  csv::CSVReader reader("skip_rows.csv", format);

  std::vector<std::string> expected = {
      "timestamp", "distance", "angle", "amplitude"
  };

  // header_row(3) is zero-indexed, so the fourth line of the file should supply the column names
  REQUIRE(expected == reader.get_col_names());
}

Test result

PS C:\csv-parser> .\build\tests\Debug\csv_test.exe "[skip_rows_file]"
Filters: [skip_rows_file]

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
csv_test.exe is a Catch v2.12.1 host application.
Run with -? for options

-------------------------------------------------------------------------------
Skip rows loaded from file
-------------------------------------------------------------------------------
C:\csv-parser\tests\test_read_csv_file.cpp(78)
...............................................................................

C:\csv-parser\tests\test_read_csv_file.cpp(90): FAILED:
  REQUIRE( expected == reader.get_col_names() )
with expansion:
  { "timestamp", "distance", "angle", "amplitude" }
  ==
  { "a", "b", "c", "d" }

===============================================================================
test cases: 1 | 1 failed
assertions: 1 | 1 failed

As can be seen, the header_row setting is not honored in this case, and the header is derived from the first row.

(After a quick look at the code, this may be intended behavior. If it is, please close this issue 👍. It's surprising either way, so I wanted to note it.)

I have two workarounds for anyone who runs into this:

  1. Hardcode the delimiter (this reduces flexibility but does work)
  2. Call guess_format first and build the CSVFormat from its result (example below), so that the reader does no auto-guessing during construction

Guess workaround:

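// Guess the delimiter up front, separately from constructing the reader
csv::CSVGuessResult guess = csv::guess_format("filename.csv");
csv::CSVFormat fmt;
// Pass the guessed delimiter explicitly so the reader itself does no auto-guessing
fmt.delimiter(guess.delim).header_row(3);
csv::CSVReader reader("filename.csv", fmt);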
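
For anyone who wants to see the intended behavior without the library, here is a rough, library-independent sketch of the same idea: guess the delimiter first, then take the header from a fixed, zero-indexed row. This is not how csv-parser implements it. `split_line`, `guess_delimiter` and `header_at` are hypothetical helpers made up for illustration, the delimiter guess only looks at the first line, and quoting is not handled.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split one line on a single-character delimiter.
// No quoting or escaping support; a trailing empty field is dropped.
std::vector<std::string> split_line(const std::string& line, char delim) {
    std::vector<std::string> fields;
    std::string field;
    std::istringstream in(line);
    while (std::getline(in, field, delim)) {
        fields.push_back(field);
    }
    return fields;
}

// Hypothetical helper: pick the candidate that occurs most often in the first line.
// Ties go to the earlier candidate in the list.
char guess_delimiter(const std::string& first_line,
                     const std::vector<char>& candidates = {',', ';', '\t', '|'}) {
    char best = candidates.front();
    std::ptrdiff_t best_count = -1;
    for (char c : candidates) {
        std::ptrdiff_t n = std::count(first_line.begin(), first_line.end(), c);
        if (n > best_count) {
            best = c;
            best_count = n;
        }
    }
    return best;
}

// Hypothetical helper: return the fields of the zero-indexed header row,
// or an empty vector if the file has fewer lines than that.
std::vector<std::string> header_at(const std::vector<std::string>& lines,
                                   std::size_t header_row, char delim) {
    if (header_row >= lines.size()) {
        return {};
    }
    return split_line(lines[header_row], delim);
}
```

With the lines of skip_rows.csv loaded into a vector, `header_at(lines, 3, guess_delimiter(lines[0]))` yields the timestamp/distance/angle/amplitude row, which is what the test case above expects from the reader.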