Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[💰20$ bounty] Inherited Tag Values are not Overwritten #1950

Open
nobodyinperson opened this issue Nov 17, 2022 · 22 comments · May be fixed by #1955
Open

[💰20$ bounty] Inherited Tag Values are not Overwritten #1950

nobodyinperson opened this issue Nov 17, 2022 · 22 comments · May be fixed by #1955
Labels
A-WISH Some kind of improvement request, hare-brained proposal, or plea. bounty Thar's some kind o' loot on offer.. docs Documentation-related. journal The journal file format, and its features.

Comments

@nobodyinperson
Copy link
Contributor

nobodyinperson commented Nov 17, 2022

Thank you very much for hledger, it is awesome! 🎉

Apparently there is a misleading behaviour when inheriting values of tags.

The docs (https://hledger.org/1.27/hledger.html#tags-1) mention that tags are inherited, but only mention value-less tags.

Consider this book.hledger:

2022-11-17 Aldi
    ; concerns: me
    Assets               -30 €
    Costs:Food            20 €
    Loaned                10 €  ; concerns: you, note: 👈 This should OVERWRITE the concerns: me from above, shouldn't it?

For consistency, higher-level tag values should overwrite tags values lower down the chain, right? At least a transaction/account shouldn't have two values of the same tag at once, right? But:

❯ hledger --version
hledger 1.27.1, linux-x86_64

❯ hledger -f book.hledger balance  --pivot=concerns
               -10 €  me
                10 €  you
--------------------
                   0  
# ☝️ This is expected.


❯ hledger -f book.hledger register Loaned  --pivot=concerns
2022-11-17 Aldi    you              10 €          10 €
# ☝️ This is expected.

❯ hledger -f book.hledger register tag:concerns=me --pivot=concerns
2022-11-17 Aldi    me              -30 €         -30 €
                   me               20 €         -10 €
                   you              10 €             0 # 👈 why is this line here? I explicitly filtered for tag:concerns=me but a line with concerns=you appears?

❯ hledger -f book.hledger register Loaned tag:concerns=me
2022-11-17 Aldi          Loaned                 10 €          10 €
# ☝️ Why does this appear? 

❯ hledger -f book.hledger register Loaned tag:concerns=you
2022-11-17 Aldi          Loaned                 10 €          10 €
# ☝️ This is expected.

Conclusion

It seems that tag values are not overwritten down the chain but somehow still kept around. This can lead to confusion and inconsistencies in register for example.

Proposal

I propose that tag values down the line should overwrite higher-level tag values.

@nobodyinperson nobodyinperson added the A-BUG Something wrong, confusing or sub-standard in the software, docs, or user experience. label Nov 17, 2022
@nobodyinperson nobodyinperson changed the title Inherited Tag Values are not Overwritten [💰20$ bounty] Inherited Tag Values are not Overwritten Dec 2, 2022
@chrislemaire
Copy link
Contributor

Looks like it should be easy enough to change by tinkering with the postingAllTags function in Posting.hs. My solution would be to have this function filter the tags coming from transactions to only include those that have not been used by the posting already. This way, we let postings override all tags of the same name of their parents. My remaining question is whether this is desired behaviour, which a dev might be able to answer better.

Also, would it be enough to do this just to postings, or might it be useful to also do this for accounts inheriting tags from parent accounts?

@chrislemaire chrislemaire linked a pull request Dec 5, 2022 that will close this issue
@simonmichael simonmichael added the journal The journal file format, and its features. label Dec 6, 2022
@simonmichael
Copy link
Owner

simonmichael commented Dec 6, 2022

Thanks for the discussion and PR, and sorry for the slow response. After looking into it.. yes, though docs don't explain it well, tags are multi-valued; they collect and add new values, rather than overriding a single value. This gets tricky, there are multiple cases to consider:

  1. Same tag on sibling items, eg the same tag on multiple transactions
  2. Inheriting tags from a parent item, eg posting inheriting tags from (a) transaction or (b) account
  3. Acquiring tags from a child item, eg transaction acquiring tags from postings

Currently in all of these cases we add rather than override values. This explains the behaviour you're seeing, and has the virtue of being simple and consistent.

You expect it to override instead, in cases 2a and 2b ? This sounds reasonable, if it does not add confusion. I feel there might be ambiguities lurking, we should work out a complete set of tests.

@simonmichael
Copy link
Owner

simonmichael commented Dec 6, 2022

PS re this example:

❯ hledger -f book.hledger register tag:concerns=me --pivot=concerns
2022-11-17 Aldi    me              -30 €         -30 €
                   me               20 €         -10 €
                   you              10 €             0 # 👈 why is this line here? 

I explicitly filtered for tag:concerns=me but a line with concerns=you appears?

The Loaned posting also inherits me from the transaction, though you is shown as it is the most specific value for the posting (a partial kind of overriding). To exclude the posting, you would have to add not:tag:concerns=you there.

The pivoting makes things a little less clear, here are a few more views:

$ hledger bal -N tag:concerns=me   # all three postings inherited `me`
               -30 €  Assets
                20 €  Costs:Food
                10 €  Loaned
$ hledger bal -N tag:concerns=you   # only Loaned has `you`
                10 €  Loaned
$ hledger tags concerns --values  # the `concerns` tag's multiple values
me
you

@simonmichael simonmichael added bounty Thar's some kind o' loot on offer.. docs Documentation-related. A-WISH Some kind of improvement request, hare-brained proposal, or plea. A-BUG Something wrong, confusing or sub-standard in the software, docs, or user experience. and removed A-BUG Something wrong, confusing or sub-standard in the software, docs, or user experience. A-WISH Some kind of improvement request, hare-brained proposal, or plea. labels Dec 6, 2022
@nobodyinperson
Copy link
Contributor Author

tags are multi-valued; they collect and add new values, rather than overriding a single value

Ah, that explains it!

You expect it to override instead, in cases 2a and 2b ?

Yes, I expected child tag specifications to override parent specifications. For all my tags use cases, this is the only sensible behaviour I can think of.

To exclude the posting, you would have to add not:tag:concerns=you there.

This is complicated as there might be very many (unknown) values to exclude.

@simonmichael
Copy link
Owner

simonmichael commented Dec 6, 2022

I have clarified the current behaviour at https://hledger.org/dev/hledger.html#tag-values, for a start.

@chrislemaire
Copy link
Contributor

Would it be an idea to be able to add something to the list of tags to enable this type of behaviour? Something similar to this special not:... tag. Perhaps something like this:

2022-11-17 Aldi
    ; concerns: me   tag-behavior:override
    Assets               -30 €
    Costs:Food            20 €
    Loaned                10 €  ; concerns:you

Perhaps even having configurations for this at the top-level? I don't have enough time to scroll through documentation to see how this could be implemented exactly today, but it doesn't sound infeasible.

@nobodyinperson
Copy link
Contributor Author

nobodyinperson commented Dec 6, 2022

The problem with the current behaviour is then that children can't remove a tag value, right? This would be possible kind of possible with the override behaviour but require manual redundant specification of the parent tags if the old behaviour is to be imitated.

@nobodyinperson
Copy link
Contributor Author

I think the only viable way of introducing the override behaviour would be to introduce a {journal,file,transaction}-wide setting like @chrislemaire suggested. The current behaviour should remain the default because people might depend on it.

@simonmichael
Copy link
Owner

I don't think options for this will be worth their cost - better if we can figure out and provide the best default behaviour, and if necessary document workarounds..

@simonmichael simonmichael added A-WISH Some kind of improvement request, hare-brained proposal, or plea. and removed A-BUG Something wrong, confusing or sub-standard in the software, docs, or user experience. labels Dec 6, 2022
@simonmichael
Copy link
Owner

It's true that children can't remove tags inherited from parents, but maybe that's for the best ? If not, we should have some examples of real-world situations which need this functionality.

@nobodyinperson
Copy link
Contributor Author

Reasons against tag accumulation

  • There is no way of removing or resetting tags down the line. If tags can be accumulated, they should also be removable. Otherwise the only alternative is to not set a tag on the top transaction level and explicitly specify all tags manually for each individual posting except for that one posting where you want the tag to have a different value:

    ; with tag overwriting
    2022-12-21 Today's work
        ; client: A
        Work                  3h
        Work                  2h
        Work                  1h
        Work                  1h  ; client: B
        Billable
    
    ; with (current) tag accumulation
    2022-12-21 Today's work
        Work                  3h  ; client: A
        Work                  2h  ; client: A
        Work                  1h  ; client: A
        Work                  1h  ; client: B 
        Billable
    

    The above is a toy example, at least for me transactions can have very many postings (easily 10-20), so reducing copy-pastyness is an argument.

  • There is no way to explicitly list all values a tag has (accumulated), not even with hledger print --explicit, hlegder print -O json or hledger print -O csv (here to some degree as the transaction comment is given so with some parsing it is available)

  • If tags can indeed have multiple values, then why aren't all of them displayed/used with --pivot? From the above example:

    > hledger -f book.hledger register Work --pivot=client
    2022-12-21 Today's work         A                               3h            3h
                                    A                               2h            5h
                                    A                               1h            6h
                                    B                               1h            7h    <--- if tags are accumulated, then why does this only show 'B', not also 'A'?
    
    
    > hledger -f book.hledger balance Work --pivot=client
                      6h  A              <----- if tags are accumulated, then why does client A only have these 6h, not all 7h?
                      1h  B
    --------------------
                      7h                 <----- if tags are accumulated, then this sum should give 8h as the 1h with two client values should count twice (another discussion if that's desirable...)

Migration Paths

Switching to Tag Overwriting

If nobody actually uses the 'tag accumulation' feature, switching to 'tag overwriting' should be fully backwards-compatible. I can't think of a scenario where you would want actual tag accumulation. Especially as it's so hidden and not visible also with --pivot as explained above.

Introducing a New Tag-Setting Syntax

If there is actual use of the tag accumulation behaviour, how about making adding, removing and resetting tag values an explicit operation:

  • tag=: value removes all values of tag tag and sets it to value
  • tag+: value adds value value to tag tag
  • tag: value for backwards-compatibility could also add the value
  • tag-: value removes value value from tag tag

But to be honest, I'd just switch to tag overwriting as a default behaviour.

@nobodyinperson
Copy link
Contributor Author

By the way, @chrislemaire's changes in #1955 produce the intuitive behaviour of overwriting tag values for me, thanks!

@simonmichael
Copy link
Owner

simonmichael commented Jan 5, 2023

Re that example: I see what you mean, but it's not very compelling because you could simply do:

2022-12-21 Today's work
    Work:A                  3h
    Work:A                  2h
    Work:A                  1h
    Work:B                  1h
    Billable

I think I have described "adding/accumulating values" versus "overriding values" poorly, and in fact I'm confused about these now. And it's true that it's not easy to see the tags in effect on a particular posting (--pivot is an indirect way).

So consider this example: here account, transaction and posting each have a uniquely-named tag and a same-named tag:

account Work  ; account-tag:A, tag:A

2023-01-01
    ; transaction-tag:T, tag:T
    Work                  1h
    Work                  2h  ; posting-tag:P, tag:P
    Billable

I used some debug code to see what's in each posting's ptags field, and have added that as an extra posting comment below:

$ hledger print  # annotated
2023-01-01
    ; transaction-tag:T
    Work                  1h                   ; [ ( "account-tag", "A" ), ( "tag", "A" ) ]
    Work                  2h  ; posting-tag:P  ; [ ( "posting-tag", "P" ), ( "tag", "P" ), ( "account-tag", "A" ), ( "tag", "A" ) ]
    Billable                                   ; []

This shows that internally,

  • A posting inherits tags from its account, but not (directly) from its transaction
  • Multiple instances of the same tag are accumulated, but kept separate, not combined in any way (see tag: above).

But to users, reports with tag queries behave as follows (at least, register does) - this shows that when it comes to queries, postings do inherit the transaction's tags as well (in effect):

$ hledger reg tag:posting-tag
2023-01-01                      Work                            2h            2h
$ hledger reg tag:account-tag
2023-01-01                      Work                            1h            1h
                                Work                            2h            3h
$ hledger reg tag:transaction-tag
2023-01-01                      Work                            1h            1h
                                Work                            2h            3h
                                Billable                       -3h             0

Also, when pivoting on a tag with multiple values, this shows that... only the first (of tag's values) is displayed. This should be documented I guess (and if someone wants to make it work differently, feel free):

$ hledger reg Work --pivot=tag
2023-01-01                      A                               1h            1h
                                P                               2h            3h

So I think the fine details of how tags behave can be specific to each command, but the general user-visible behaviour is as docs describe, currently:

  • postings inherit tags from account and transaction
  • transactions acquire tags from postings
  • multiple instances of the same tag are accumulated as separate tag:value pairs
  • tag values can not be removed or overridden at a lower level
  • --pivot shows only the first value of multi-valued tags

@simonmichael
Copy link
Owner

simonmichael commented Jan 5, 2023

Should print --all-tags or print --explicit-tags show all inherited/acquired tags on the transaction and postings ?

@nobodyinperson
Copy link
Contributor Author

I'd say print --all-tags.

@lukasbestle
Copy link

lukasbestle commented Jan 7, 2023

I have another use case that would benefit from tag overwriting:

account expenses:travel  ; tax: 7%

2023-01-07 Rail company
    expenses:travel  107 €
    assets:bank

2023-01-07 Long-distance bus company
    expenses:travel  119 €  ; tax: 19%
    assets:bank

Without overwriting tags, the expense posting of the second transaction would have tags for both tax rates, which breaks the tax reporting. Using separate accounts for each tax rate is not viable as the tax rates are not relevant for budgeting, so the extra accounts (which would be needed for every expense account) would decrease the overview a lot.


So I agree with @nobodyinperson's suggested migration path. I'd be fine with either:

  • the backwards-compatible solution (tag: value means accumulation, tag=: value allows overwriting and tag-: value allows removing parent tags) or
  • a new default behavior (tag: value means overwriting, tag+: value means accumulation and tag=: value/tag-: value allow the modification of parent tags).

@simonmichael
Copy link
Owner

simonmichael commented Jan 11, 2023

Tags are

  • "inherited downward"
    • from accounts to subaccounts
    • from accounts to postings
    • from transactions to postings
  • "acquired upward"
    • from postings to transactions

In the #1955 PR, tests for all of these would be good. Additionally, more real-world testing - there might be some consequences we haven't seen yet.

I'll begin a mail list thread to clarify support for this.

@the-solipsist
Copy link
Collaborator

  1. All else being equal, I think the backwards compatible solution should be preferred over a new default.
  2. I think -tag: and -tag:value and =tag:value are more intuitive than tag-: and tag-:value and tag=:value.
    Since a an inherited tag itself ought to be able to be able to be removed from inheritance, I find tag-: to be unintuitive, and think -tag: is more intuitive.

@chrislemaire
Copy link
Contributor

It looks to me like there is some consensus on overwriting tags as an option rather than the default should resolve this issue. I will take a look at implementing a solution and adding tests for some more complicated scenarios in the #1955 PR, taking into account the hierarchical ordering of Posting > Transaction > Account > Parent Account.

If there is still debate over the exact syntax, I think it should be easy enough to change after implementing most of the PR, so I will start working on it now.

@simonmichael
Copy link
Owner

I don't think the sample size is large enough; no-one on the list has responded yet. I am personally against adding syntax for multiple tag behaviours as I said.

@chrislemaire
Copy link
Contributor

Ah I see, I'll hold off on it then until you have had response on the mailing list or until a different solution comes up. An issue I thought of with the syntax as well was that you might need to interpret tags left-to-right as well, adding some more cognitive depth to understanding tags. This is because situations like the following (with multiple values for the same tag assigned in a single comment):

2023-01-07 Long-distance bus company
    expenses:a  1€  ; =tag:value1; tag:value2
    assets:bank

which might be different from:

2023-01-07 Long-distance bus company
    expenses:a  1€  ; tag:value1; =tag:value2
    assets:bank

Unless I'm missing some already implemented nicer syntax for declaring multiple values for one tag, it matters here how tags are interpreted (left-to-right or right-to-left).

@simonmichael
Copy link
Owner

simonmichael commented Jan 14, 2023

These are things that would help this to move forward, I think:

  1. Several persuasive realistic examples of need gathered in one place

  2. In the [WIP] Change inheriting valued tags to override #1955 PR, tests for all of these would be good.

  3. Additionally, more real-world testing - there might be some consequences we haven't seen yet.

  4. For journal-breaking changes, gathering definite support (not just silence) in the larger community (here on the tracker and on the mail list, over a suitable period of time). This takes a while and active promotion.

These are things I generally want to avoid:

  • Breaking people's workflow without sufficient justification and consensus

  • Adding complexity to syntax or functionality without rather strong need

Currently I personally am mildly in favour of making value-overriding the default and only behaviour, after taking some more time to explore it, build confidence and gather support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-WISH Some kind of improvement request, hare-brained proposal, or plea. bounty Thar's some kind o' loot on offer.. docs Documentation-related. journal The journal file format, and its features.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants