Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation: if|table, priority of search patterns #2181

Open
PSLLSP opened this issue Mar 5, 2024 · 12 comments
Open

Documentation: if|table, priority of search patterns #2181

PSLLSP opened this issue Mar 5, 2024 · 12 comments
Labels
csv The csv file format, csv output format, or generally CSV-related. docs Documentation-related.

Comments

@PSLLSP
Copy link

PSLLSP commented Mar 5, 2024

This is just a request to update documentation for CSV import, command if|table

IF table can have several patterns that match record. Important point is that hledger uses THE LATEST pattern that matches the expression. Information describing this behavior should be added to the documentation.

Example, records in CSV file don't have full date, only year and month. We want to translate that information in the way that payment is done in the last day of the month, like 2020-04 -> 2024-04-30. This is not straight forward task, months have different number of days, between 28 and 31 and February can have 28 or 29 days.

NOTE: The payment is done on the first day of the next month but I do not see any "easy" way to code such translation without coding rule for each month, like 2020-04 -> 2020-05-01 or 2020-12 -> 2021-01-01. That is why I use the last day of the month, it is easier to code...

The translation of date from Y-M format to Y-M-D is coded with if|table; note that rules matching any February are before rules those defines years where February has 29 days. Rule with higher priority has to be added to the end of the "if table".

IF|table cannot have "else" rule but such rule can be defined with default value that could be overwritten with a rule in the table.

$ cat demo8.csv
2021-12,1749424,EUR,3.61
2021-11,1722985,EUR,3.48
2021-10,1693457,EUR,2.70
2021-09,1675066,EUR,3.33
2021-08,1634982,EUR,5.41
2021-07,1626884,EUR,5.41
2021-06,1586496,EUR,5.23
2021-05,1558019,EUR,5.41
2021-04,1535096,EUR,5.23
2021-03,1513987,EUR,5.41
2021-02,1473282,EUR,4.90
2021-01,1435894,EUR,5.41
2020-12,1405040,EUR,5.41
2020-11,1381326,EUR,5.23
2020-10,1363741,EUR,4.83
2020-09,1316067,EUR,4.83
2020-08,1289920,EUR,7.05
2020-07,1264568,EUR,10.15
2020-06,1235759,EUR,10.15
2020-05,1200729,EUR,10.15
2020-04,1165835,EUR,10.15
2020-03,1145395,EUR,10.15
2020-02,1122715,EUR,10.58
2020-01,1091401,EUR,10.14
$ cat demo8.csv.rules
# hledger import rules

fields        ym, code, curr, amou

description   DEMO|Payment %ym
status        *

amount1       %amou %curr

account1      expenses:demo
account2      assets:bank:card

# Format "%Y-%m/%d" is a workaround. I tried to use format "%Y-%m-%d" but I cannot enter expression like "%ym-31"
# Expression "%ym/31" is accepted without any error...

date-format   %Y-%m/%d

# handle date: 2020-04 -> 2020-04-30
# default date, month has 31 days; this works like "else" rule...
date %ym/31

# define the last day for each month that has less than 31 days
# February with 28 days has lower priority than February with 29 days
if|date
%ym -02|%ym/28
%ym -04|%ym/30
%ym -06|%ym/30
%ym -09|%ym/30
%ym -11|%ym/30
%ym 2016-02|%ym/29
%ym 2020-02|%ym/29
%ym 2024-02|%ym/29
%ym 2028-02|%ym/29
%ym 2032-02|%ym/29

# debug
comment \n%1,%2,%3,%4

Run it:

$ hledger -f demo8.csv print
2020-01-31 * (1091401) DEMO|Payment 2020-01
    ; 2020-01,1091401,EUR,10.14
    expenses:demo          10.14 EUR
    assets:bank:card

2020-02-29 * (1122715) DEMO|Payment 2020-02
    ; 2020-02,1122715,EUR,10.58
    expenses:demo          10.58 EUR
    assets:bank:card

2020-03-31 * (1145395) DEMO|Payment 2020-03
    ; 2020-03,1145395,EUR,10.15
    expenses:demo          10.15 EUR
    assets:bank:card

2020-04-30 * (1165835) DEMO|Payment 2020-04
    ; 2020-04,1165835,EUR,10.15
    expenses:demo          10.15 EUR
    assets:bank:card

2020-05-31 * (1200729) DEMO|Payment 2020-05
    ; 2020-05,1200729,EUR,10.15
    expenses:demo          10.15 EUR
    assets:bank:card

2020-06-30 * (1235759) DEMO|Payment 2020-06
    ; 2020-06,1235759,EUR,10.15
    expenses:demo          10.15 EUR
    assets:bank:card

2020-07-31 * (1264568) DEMO|Payment 2020-07
    ; 2020-07,1264568,EUR,10.15
    expenses:demo          10.15 EUR
    assets:bank:card

2020-08-31 * (1289920) DEMO|Payment 2020-08
    ; 2020-08,1289920,EUR,7.05
    expenses:demo           7.05 EUR
    assets:bank:card

2020-09-30 * (1316067) DEMO|Payment 2020-09
    ; 2020-09,1316067,EUR,4.83
    expenses:demo           4.83 EUR
    assets:bank:card

2020-10-31 * (1363741) DEMO|Payment 2020-10
    ; 2020-10,1363741,EUR,4.83
    expenses:demo           4.83 EUR
    assets:bank:card

2020-11-30 * (1381326) DEMO|Payment 2020-11
    ; 2020-11,1381326,EUR,5.23
    expenses:demo           5.23 EUR
    assets:bank:card

2020-12-31 * (1405040) DEMO|Payment 2020-12
    ; 2020-12,1405040,EUR,5.41
    expenses:demo           5.41 EUR
    assets:bank:card

2021-01-31 * (1435894) DEMO|Payment 2021-01
    ; 2021-01,1435894,EUR,5.41
    expenses:demo           5.41 EUR
    assets:bank:card

2021-02-28 * (1473282) DEMO|Payment 2021-02
    ; 2021-02,1473282,EUR,4.90
    expenses:demo           4.90 EUR
    assets:bank:card

2021-03-31 * (1513987) DEMO|Payment 2021-03
    ; 2021-03,1513987,EUR,5.41
    expenses:demo           5.41 EUR
    assets:bank:card

2021-04-30 * (1535096) DEMO|Payment 2021-04
    ; 2021-04,1535096,EUR,5.23
    expenses:demo           5.23 EUR
    assets:bank:card

2021-05-31 * (1558019) DEMO|Payment 2021-05
    ; 2021-05,1558019,EUR,5.41
    expenses:demo           5.41 EUR
    assets:bank:card

2021-06-30 * (1586496) DEMO|Payment 2021-06
    ; 2021-06,1586496,EUR,5.23
    expenses:demo           5.23 EUR
    assets:bank:card

2021-07-31 * (1626884) DEMO|Payment 2021-07
    ; 2021-07,1626884,EUR,5.41
    expenses:demo           5.41 EUR
    assets:bank:card

2021-08-31 * (1634982) DEMO|Payment 2021-08
    ; 2021-08,1634982,EUR,5.41
    expenses:demo           5.41 EUR
    assets:bank:card

2021-09-30 * (1675066) DEMO|Payment 2021-09
    ; 2021-09,1675066,EUR,3.33
    expenses:demo           3.33 EUR
    assets:bank:card

2021-10-31 * (1693457) DEMO|Payment 2021-10
    ; 2021-10,1693457,EUR,2.70
    expenses:demo           2.70 EUR
    assets:bank:card

2021-11-30 * (1722985) DEMO|Payment 2021-11
    ; 2021-11,1722985,EUR,3.48
    expenses:demo           3.48 EUR
    assets:bank:card

2021-12-31 * (1749424) DEMO|Payment 2021-12
    ; 2021-12,1749424,EUR,3.61
    expenses:demo           3.61 EUR
    assets:bank:card
@simonmichael
Copy link
Owner

Is that something you would like to help with ?

I gave if tables a serious try recently but I found them difficult to understand and not very widely useful. I'd be more inclined to remove them, personally.

@simonmichael simonmichael added docs Documentation-related. csv The csv file format, csv output format, or generally CSV-related. labels Mar 5, 2024
@PSLLSP
Copy link
Author

PSLLSP commented Mar 5, 2024

Is that something you would like to help with ?

Not really. It is just a request to improve documentation. I agree that if|tables are not easy to work with and could be confusing.
This example is nice use of if table, it is not so much confusing and explains how to use them (else rule, priority of rules).


Other way to define "else" rule, it has to be the first entry in the table (it has the lowest priority and matches any record):

# date has to be defined, it is mandatory field...
# The default value, it will be overwritten with values from the table.
date  9999-99-99

# The first rule defines default value (else value), the rule matches any record and
# it defines that month has 31 days.
# Following rules can change that
if|date
%ym .|%ym/31
%ym -02|%ym/28
%ym -04|%ym/30
%ym -06|%ym/30
%ym -09|%ym/30
%ym -11|%ym/30
%ym 2016-02|%ym/29
%ym 2020-02|%ym/29
%ym 2024-02|%ym/29
%ym 2028-02|%ym/29
%ym 2032-02|%ym/29

@simonmichael
Copy link
Owner

simonmichael commented Mar 5, 2024 via email

@jmtd
Copy link
Collaborator

jmtd commented Mar 5, 2024

I don't use iftables, but

We want to translate that information in the way that payment is done in the last day of the month, like 2020-04 -> 2024-04-30. This is not straight forward task, months have different number of days, between 28 and 31 and February can have 28 or 29 days.

This is similar to one of the motivations for adding rewrite rules -- I wanted to warp the credit card repayment postings to the start of the billing period, so I regex-match the year-month bit, and hard-code a different day.

However this only works for my card where the start of the billing period is in the same month as the statement. I have one credit card where this holds and one where it doesn't.

To address the other, I was thinking we might need some kind of date manipulation support in rewrite rules. Although I wouldn't like to implement them!

In the meantime, I have an ugly series of 12 if statements which match the month portion like this

if
%type DD
& %description HALIFAX
& %date ../02/(....)
    comment2 date:\1-01-14
    account2 liabilities:clarity

The reason I mention this, I had hoped to try and tackle it with if tables but found that they don't help this problem at all.

@adept
Copy link
Collaborator

adept commented Mar 7, 2024

I think this feature needs someone to champion it and do the work of making it more usable, otherwise it's for the chop sooner or later.

I think I wrote it originally and I use it extensively, and I would be very sad to see it go. @simonmichael , what kind of work (besides documentation) do you envision it needing?

Semantically, an "if table" of N rows behaves exactly the same as would N "if" statements, one per table row - it just occupies about 4x less lines (and significantly fewer characters, and removes the repetition, and ...), so I would argue that it is not significantly more confusing than N if statements

Edit: I have 2200 lines worth of "if tables" and while it is not hard to write a generator that would convert them into if statements to be included in the csv rules file, I would rather avoid doing this if possible

@adept
Copy link
Collaborator

adept commented Mar 7, 2024

Actually, looking at the documentation, I would argue that the behavior is documented (emphasis in the below is mine):

An if table like the above is interpreted as follows:
try all of the matchers;
whenever a matcher succeeds, assign all of the values on that line to the corresponding hledger fields;
later lines can overrider earlier ones.
It is equivalent to this sequence of if blocks:

if MATCHERA
  HLEDGERFIELD1 VALUE1
  HLEDGERFIELD2 VALUE2
  ...

if MATCHERB
  HLEDGERFIELD1 VALUE1
  HLEDGERFIELD2 VALUE2
  ...

if MATCHERC
  HLEDGERFIELD1 VALUE1
  HLEDGERFIELD2 VALUE2
  ...

@simonmichael
Copy link
Owner

Ahaha, hello @adept. I didn't remember. There you are. Sorry to threaten your feature. :)

You're the first person I've found relying on it. Could you help us understand how/for what you use it ?

In my experience (multiple tries) it is extremely non-intuitive for "normal people" at present. The work needed is at least handling UX/doc requests like this one. Also I haven't found a good use case of my own where it helped, yet. It needs more examples/motivation.

I had vague ideas of how variations of it could be useful to more people. But I'll wait to hear what you tell us.

@adept
Copy link
Collaborator

adept commented Mar 7, 2024

Ahaha, hello @adept. I didn't remember. There you are. Sorry to threaten your feature. :)

No worries, I knew you would not shoot to kill without warning :)

My primary use-case is this: for CSV import rules, I usually have account1 fixed in the rules file, and then based on the CSV line, I assign account2, and potentially a comment. Over time, you accumulate lots of rules that look roughly the same:

if SOMETHING
  account2    is:this
  comment    is this

Another way to present them would be:

if|account2|comment
SOMETHING|is:this:account|with this comment
SOMETHING ELSE|is:this:other:account|
AND THIS|is:yet:another:account|with a comment

This is way more compact, and it is also dead easy to add new rules -- maybe even in automated fashion: https://github.com/adept/full-fledged-hledger/wiki/Sorting-unknowns

@adept
Copy link
Collaborator

adept commented Mar 7, 2024

Sorry, forgot one more thing. I have lots of csv files and a build system to convert them to journals. Tabular machine-readable nature of the (majority of) the rules means that I could easily check which rules affect which input files. Then, when rules change, i can rebuild just the affected journals: https://github.com/adept/full-fledged-hledger/wiki/Speeding-up

@PSLLSP
Copy link
Author

PSLLSP commented Mar 8, 2024

I miss a possibility to disable a row in if|table, like to comment it. And I miss comments to explain why some row is in if|table; this can improve readability of rules.

Something like this:

if|date
;%ym .|%ym/31  ; this matches every raw
.|%ym/31  ; this matches every raw too
%ym -02|%ym/28
; this is just a comment
%ym -04|%ym/30
;%ym -05|%ym/31  ; this rule is not needed, was disabled for test and once verified, it will be removed
%ym -06|%ym/30

@adept
Copy link
Collaborator

adept commented Mar 8, 2024

Comment lines in the table body seem sensible, I made a PR for this.

Comments inside field assignments feel more controversial (plus, I don't immediately see how to implement them AND allow user to use ";" or "#" in, say, "comment" field assignments in the table), so I haven't done them.

@baig
Copy link

baig commented Mar 10, 2024

I am using if tables. Anyone who is using https://github.com/adept/full-fledged-hledger is using if tables. My HLedger setup is based on adept's full fledge ledger.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
csv The csv file format, csv output format, or generally CSV-related. docs Documentation-related.
Projects
None yet
Development

No branches or pull requests

5 participants