fixlengths -- insert extra commas not at end #323

ggrothendieck · 2023-06-09T13:56:57Z

One reason to need fixlengths is that one field contains multiple comma-separated subfields but is not quoted. If that field is the last one then fixlengths can be used, but not if it is, say, the second last. It would be nice if the insertion point for the extra commas could be specified. For example, -1 would mean that the extra comma(s) are inserted at the last comma. If a line contains no commas then the commas are still added at the end.

Here is an example of sample input taken from https://stackoverflow.com/questions/76423878/reading-a-csv-file-into-r-which-contains-comma-separated-values-in-single-obser/76427295#76427295

    clothes,colours,size
    shirt,blue,green,grey,small
    shirt,yellow,black,small
    shorts,blue,medium
    shorts,black,large

The corresponding output would be

    clothes,colour1,colour2,colour3,size
    shirt,blue,green,grey,small
    shirt,yellow,black,,small
    shorts,blue,,,medium
    shorts,black,,,large

although I think it would be sufficient if it did not deal with the header since that can always be skipped by whatever program is reading it in.

To be clear, this gawk program from the same source accepts that input and produces that output for this particular example.

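    # To run: gawk -f process.awk myfile.csv > myfile2.csv
    # To configure: edit header= line as needed
    BEGIN {
        header = "clothes,colour1,colour2,colour3,size"

        # keep only the commas of the header, here ",,,,"
        commas = gensub(/[^,]/, "", "g", header)
        ncommas = length(commas)
        FS = OFS = ","
    }
    NR == 1 { print header; next } # skip input header & use header variable instead
    {
        # replace the last comma (comma number NF-1) by as many commas as are
        # needed to bring the line up to ncommas commas in total
        if (NF > 1) print gensub(",", substr(commas, 1, ncommas - NF + 2), NF-1)
        else print $0 commas  # no commas in this line: append them all at the end
    }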
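
For readers who do not use gawk, the requested -1 behaviour can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name `pad_at_last_comma` and its `width` parameter are hypothetical and not part of any existing tool, and the line is split naively on commas with no handling of quoted fields.

```python
def pad_at_last_comma(line: str, width: int, sep: str = ",") -> str:
    """Pad a delimited line to `width` fields by inserting empty fields
    at the last separator (hypothetical helper, naive split, no quoting)."""
    fields = line.split(sep)
    missing = width - len(fields)
    if missing <= 0:
        # Already has enough fields: leave the line untouched.
        return line
    if len(fields) == 1:
        # No separator in the line: the extra separators go at the end.
        return line + sep * missing
    # Insert the empty fields just before the last field.
    return sep.join(fields[:-1] + [""] * missing + fields[-1:])


if __name__ == "__main__":
    rows = [
        "shirt,blue,green,grey,small",
        "shirt,yellow,black,small",
        "shorts,blue,medium",
        "shorts,black,large",
    ]
    for row in rows:
        print(pad_at_last_comma(row, 5))
```

Applied to the data rows of the sample input with a width of 5, this yields the same data rows as the expected output above. The header line is not handled, in line with the remark that the reading program can skip it.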
