
jasonsparc/dsvparser-ahk2

Parse CSV files! Parse TSV files! Parse PSV files?!

Yes!! Parse it all! All the DSV files!

DSV parser for AHK v2!

A simple utility for reliably parsing delimiter-separated values (DSV) in AutoHotkey v2 scripts, whether comma-separated (CSV), tab-separated (TSV), or something else, possibly even something exotic.

For AutoHotkey v1, check out https://github.com/jasonsparc/DSVParser-AHK

Features

  • RFC 4180 compliant.
  • Supports newlines and other weird characters in cells enclosed in text qualifiers.
  • Allows custom delimiters and text qualifiers.
  • Supports multiple delimiters (like Microsoft Excel).
  • Supports multiple qualifiers (unlike Microsoft Excel).
  • Proper support for malformed inputs (e.g., "hello" world "foo bar" will be parsed as hello world "foo bar").
    • Achieved by treating cells as composed of two components: a text-qualified part (i.e., any raw string, excluding unescaped qualifier characters), and a delimited text part (i.e., any raw string, including qualifier characters, except newlines and delimiter characters).
    • The behavior for ill-formed cells is therefore well defined.
    • The above treatment is also similar to that of Microsoft Excel.
  • Recognizes many ASCII and Unicode line break representations.
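
As a quick check of the malformed-input rule above, the following sketch feeds the example string to the premade CSVParser. It assumes dsvparser-ahk2.ahk sits in the same folder as the script; the variable names are only illustrative.

```autohotkey
#Requires AutoHotkey v2.0
#Include dsvparser-ahk2.ahk  ; assumption: the library file is next to this script

; A text-qualified part ("hello") followed by unqualified text that itself contains quotes.
raw := '"hello" world "foo bar"'

table := CSVParser.ToArray(raw)

; There is no comma in `raw`, so the whole input forms a single cell.
; Per the feature list, that cell should read: hello world "foo bar"
MsgBox table[1][1]
```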

Example

Basic usage

Download dsvparser-ahk2.ahk (see footnote 1), then include it in your script (via #Include) as a library.

Once you've done that, here's how you might use the library:

; Load a TSV data string
tsvStr := FileRead("data.tsv")

; Parse the TSV data string
MyTable := TSVParser.ToArray(tsvStr)

; Do something with `MyTable`

MsgBox MyTable[2][1] ; Access 1st cell of 2nd row

; ... do something else with `MyTable` ...

; Convert into a CSV, with custom line break settings
csvStr := CSVParser.FromArray(MyTable, "`n", false)

if (FileExist("new-data.csv"))
    FileDelete("new-data.csv")
FileAppend(csvStr, "new-data.csv")

And there's more!

Both TSVParser and CSVParser are premade instances of the class DSVParser. To read and write in other formats, create a new instance of DSVParser and specify your desired configuration.

Here's a DSVParser for pipe-separated values (a.k.a. bar-separated values):

global BSVParser := DSVParser("|")
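
Since a custom instance is a DSVParser just like the premade ones, it can presumably be used the same way. The sketch below assumes the library file is in the script's folder and uses made-up sample data; only ToArray and FromArray, as shown in the basic usage, are called.

```autohotkey
#Requires AutoHotkey v2.0
#Include dsvparser-ahk2.ahk  ; assumption: the library file is next to this script

BSVParser := DSVParser("|")

; Hypothetical sample data: two rows of three pipe-separated cells.
psvStr := "id|name|city`n1|Ada|London"

rows := BSVParser.ToArray(psvStr)

; Walk the result the same way `MyTable` is indexed: rows first, then cells.
for rowIndex, row in rows
{
    for colIndex, cell in row
    {
        OutputDebug "row " rowIndex ", col " colIndex ": " cell "`n"
    }
}

; Write the table back out, reusing the arguments from the CSV example.
MsgBox BSVParser.FromArray(rows, "`n", false)
```

The per-cell lines go to the debugger output (visible in a tool such as DebugView), so the script shows only one message box.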

Many more utility functions are provided for parsing and formatting DSV strings, including parsing just a single DSV cell.

Check out the source code! It's really just a tiny file.

Why not just use Loop parse?

AutoHotkey v2 comes with Loop parse _, "CSV", which lets you quickly parse a single line of CSV text. If your string contains several lines, however, it is still treated as one line of CSV. To work around this, you could first split the string into lines (using either Loop read on the file or Loop parse _, "`n", "`r") and then parse each line separately. But that ignores the fact that a CSV cell is allowed to contain multiple lines. Yes, all in a single CSV cell! Once your CSV data includes such cells, Loop parse can no longer handle it correctly.
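
To make the problem concrete, here is a small sketch that uses only built-in Loop parse on made-up data. The sample holds two records, one of which has a cell spanning two lines, yet the line-by-line approach sees three lines.

```autohotkey
#Requires AutoHotkey v2.0

; Two records; the second cell of the first record spans two lines.
csv := 'Alice,"line one`nline two"`nBob,ok'

lineCount := 0
Loop parse csv, "`n", "`r"
{
    lineCount += 1
    line := A_LoopField  ; save the outer field before the inner loop reuses A_LoopField
    cells := ""
    Loop parse line, "CSV"
        cells .= "[" A_LoopField "] "
    OutputDebug "line " lineCount ": " cells "`n"
}

; The data holds 2 records, but splitting on newlines produced 3 lines,
; because the newline inside the quoted cell was treated as a record break.
MsgBox "Records in data: 2`nLines after splitting: " lineCount
```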

For the initial motivation behind this library, see the forum post: “[Library] DSV Parser - AutoHotkey Community”.

P.S. This library can even be used to parse a CSV inside a CSV, inside a CSV, inside a CSV, inside a…—whatever “RFC 4180” allows.

Footnotes

  1. Tip: Right-click the dsvparser-ahk2.ahk link, then choose "Save link as…" or your browser's equivalent.