Getting Count #948

Closed
RifS opened this issue Mar 12, 2018 · 5 comments

@RifS

RifS commented Mar 12, 2018

I have an issue similar to #92, but I wasn't sure whether I should bump an issue that is over 5 years old, so I'm creating a new one.

I'm aware that calling Count() on the enumerated records moves the cursor to the end, so any further operation or foreach loop over the records returns nothing.

The outline of my code is like this:

    var myCSV = new CsvReader(File.OpenText(txtBox_CSVFile.Text));
    myCSV.Configuration.RegisterClassMap<MyCSVClassMap>();
    
    var record = new MyRecordClass();
    var allRecords = myCSV.EnumerateRecords(record);
    int recordsLength = allRecords.Count<MyRecordClass>();

    foreach (var r in allRecords)
    {
        /* some read/write operation here
        */
        scannedCSVEntry++;
        progress.value = scannedCSVEntry/recordsLength;
    }

recordsLength is important here because I need to update the progress bar while I'm processing each entry.

But how do I get around this?

@JoshClose
Owner

There are a few ways.

  1. Pull the whole thing into memory by doing a ToList. You can then do several operations over the data, including Count.
  2. Parse the whole file. You can use just the parser to go through the whole file and count the records. This should be pretty fast because it's not creating any class objects or doing any type conversions.
  3. Just count the line endings in the file. You can then use the Context.RawRow property to do your progress.
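
The first option can be sketched without CsvHelper at all. In the minimal example below, `ReadRecords` is a hypothetical stand-in for a forward-only record source (it is not a CsvHelper API), and `ProcessWithProgress` is likewise made up for illustration. It is written as a C# 9+ top-level program, and it assumes the whole file fits comfortably in memory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical stand-in for a forward-only record source:
// each line is yielded once, as the underlying reader advances.
static IEnumerable<string> ReadRecords(TextReader source)
{
    string line;
    while ((line = source.ReadLine()) != null)
    {
        yield return line;
    }
}

// Buffers every record first, then records the progress fraction after each one.
static List<double> ProcessWithProgress(TextReader source)
{
    // ToList() walks the reader exactly once and keeps the results in memory,
    // so Count and the foreach below both operate on the buffered copy.
    List<string> records = ReadRecords(source).ToList();
    var progress = new List<double>();
    int processed = 0;
    foreach (string record in records)
    {
        processed++;
        // Cast to double: int / int would truncate to 0 until the last record.
        progress.Add((double)processed / records.Count);
    }
    return progress;
}

List<double> steps = ProcessWithProgress(new StringReader("a,1\nb,2\nc,3\nd,4"));
Console.WriteLine(string.Join(" ", steps));
```

The trade-off is memory: the list holds every record at once, which is why the other two options may be preferable for very large files.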

@RifS
Author

RifS commented Mar 14, 2018

Thanks,
I finally picked method 2.

StreamReader sr = File.OpenText(txtBox_CSVFile.Text);
int recordsLength = 0;
while(sr.ReadLine() != null)
{
    ++recordsLength;
}
recordsLength--; // discount 1 line because there are column headers in first row.
Console.WriteLine("recordsLength : " + recordsLength);
sr.Close();

Does EnumerateRecords() automatically close the stream after hydrating all the entries? I'm wondering whether I should call Close() after calling the function.

@JoshClose
Owner

My suggestion is to wrap in a using block.

using(StreamReader sr = File.OpenText(txtBox_CSVFile.Text))
{
    int recordsLength = 0;
    while(sr.ReadLine() != null)
    {
        ++recordsLength;
    }
    recordsLength--; // discount 1 line because there are column headers in first row.
    Console.WriteLine("recordsLength : " + recordsLength);
}
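
As a rough illustration of what the using block buys you, here is a small self-contained sketch (C# 9+ top-level program). `CountDataRows` and `IsDisposed` are hypothetical helper names, a `StringReader` stands in for the file, and the `Math.Max` guard for empty input is an addition that is not in the code above. It says nothing about what EnumerateRecords() itself does with the stream; it only shows that the reader is disposed once the block exits, with no explicit Close() call.

```csharp
using System;
using System.IO;

// Counts the lines after the header row, mirroring the loop above.
static int CountDataRows(TextReader source)
{
    int lines = 0;
    while (source.ReadLine() != null)
    {
        lines++;
    }
    // Subtract the header row, but never report a negative count for empty input.
    return Math.Max(0, lines - 1);
}

// Returns true if the reader rejects further reads, i.e. it has been disposed.
static bool IsDisposed(TextReader source)
{
    try
    {
        source.Peek();
        return false;
    }
    catch (ObjectDisposedException)
    {
        return true;
    }
}

var input = new StringReader("name,qty\napple,1\npear,2\n");
int rows;
using (input)
{
    rows = CountDataRows(input);
}
// The using block has already disposed the reader at this point.
Console.WriteLine($"rows = {rows}, disposed = {IsDisposed(input)}");
```

Note that counting lines only matches the record count when no field contains an embedded line break inside quotes.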

@Liero

Liero commented Nov 26, 2018

If you just want to update a progress bar, using the stream's Position worked for me with large files, without reading the entire file first:

using(StreamReader sr = File.OpenText(txtBox_CSVFile.Text))
{
    var csvReader = new CsvHelper.CsvReader(sr);
    while (csvReader.Read())
    {
        // fraction of the file read so far, between 0 and 1
        double progress = (double)sr.BaseStream.Position / sr.BaseStream.Length;
    }
}

It should also work with GetRecords, but I haven't tested it:

foreach(var record in csvReader.GetRecords<MyRecordClass>())
{
    double progress = (double)sr.BaseStream.Position / sr.BaseStream.Length;
}

@svrooij

svrooij commented Nov 21, 2019

I've checked your solution:

foreach(var record in csvReader.GetRecords<MyRecordClass>())
{
    double progress = (double)sr.BaseStream.Position / sr.BaseStream.Length;
}

which didn't work for me. But you can do this:

foreach(var record in csvReader.GetRecords<MyRecordClass>())
{
    double progress = (double)csvReader.Context.CharPosition / csvReader.Context.CharsRead;
}

to get the desired result.
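
One possible reason the BaseStream.Position approach can misbehave (an assumption, not something confirmed in this thread) is buffering: a StreamReader pulls bytes from the underlying stream in chunks, so Position reflects what has been buffered rather than what the caller has consumed. The sketch below uses only the .NET base library, not CsvHelper, and `FractionAfterFirstLine` is a hypothetical helper name. With a payload smaller than the reader's buffer, the ratio is already 1 after the first line.

```csharp
using System;
using System.IO;
using System.Text;

// Returns BaseStream.Position / Length after reading only the first line
// of the payload through a StreamReader.
static double FractionAfterFirstLine(byte[] data)
{
    using (var stream = new MemoryStream(data))
    using (var sr = new StreamReader(stream))
    {
        sr.ReadLine(); // consume a single line of text
        // Position counts bytes the reader has pulled into its buffer,
        // not bytes the caller has actually consumed.
        return (double)sr.BaseStream.Position / sr.BaseStream.Length;
    }
}

byte[] small = Encoding.UTF8.GetBytes("id,name\n1,a\n2,b\n3,c\n");
Console.WriteLine(FractionAfterFirstLine(small));
```

On files much larger than the buffer the ratio still advances in steps, which is usually fine for a progress bar and would be consistent with the report above that it worked with large files.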
