Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem with map auto initialization #170

Open
skejeton opened this issue Jun 27, 2022 · 4 comments
Open

Problem with map auto initialization #170

skejeton opened this issue Jun 27, 2022 · 4 comments
Labels
enhancement New feature or request question Further information is requested

Comments

@skejeton
Copy link
Contributor

skejeton commented Jun 27, 2022

I know this is mostly for Go parity, however this creates big problems when you access a map with a large value type that's gonna just sit around because it was never meant to be created.

I was converting my old code that used old maps, which returned pointers, and so I assumed the same semantics.

chunk_pos := ChunkPos{d_util.floor(tx/real(chunkW)), d_util.floor(ty/real(chunkH))}
chunk := &world.tilemap[layer][chunk_pos]
if chunk == null {
    return 0;
}

Now what's wrong here? Well my code worked as expected: I checked if chunk was null and returned zero. Except that branch never executes, instead it created a chunk I didn't want to be created, and wasted some RAM, but it still works as it coincides with zeroinit.

The first place I've actually encountered this as problematic is here:

if gl := &f.cache[rune]; gl != null {
    return gl^
}
f.cache[rune] = image.Image{crenderglyph(f.handle, rune, f.size/5.0)}
return f.cache[rune]

The code that is responsible for creating the image is never executed. Thankfully due to tophat's logs it didn't take me long to figure out why text wasn't rendering, the image indices were 0. Although it did take me a while to realize the image.Image struct was initialized to 0 because of auto init (or rather that auto-init was a thing).

There's few approaches I can think of:

  1. Deallocate/Garbage Collect map keys that were never explicitly written to.
    e.g. if we did some_map["test"] and it created and returned default zero initialized value,
    It would free that value at some point by the garbage collector.
    Drawback: If the key is periodically accessed it might mean allocating/freeing all the time, not a big deal though because if solved correctly, these periodic accesses should be rare and thus not have any noticeable performance impact.
  2. Return null from map if key is not found. Now this is not currently possible because the maps return value types, but I don't see any problem with returning pointer types.
    Drawback: Obviously maps will not be able to return values, but whether that's a good or a bad thing is debatable.
  3. This approach is a loosened second approach: if you access a map while taking the pointer of the value, like in &some_map["test"] it will not create a value, but instead return null if it doesn't exist.
    Drawback: This might be a bit inconsistent and confusing at first to people unfamiliar
  4. Warning when you access by value. Drawback is obvious: it's real annoying

This is open to discussion, I don't actually have critical problems with RAM usage. But the auto initialization (and maybe, accessing by value) might be a dangerous trope.
I'm not a Go developer, I don't know it beyond the basics, nor do I have any experience with it. I'm mostly with C and C++ background, so it might be unsubstantiated for me to complain.

@vtereshkov
Copy link
Owner

@skejeton I cannot agree with you on this point. What you call an auto initialization is a widely accepted approach. C++'s std::map does exactly this: https://onlinegdb.com/_2_bEnIeRK.

Maps in Go are more elaborate, as they return a valid (zero, not nil) value for a non-existing key, but do not add this key/value pair to the map itself: https://go.dev/play/p/5H-hXjGXzdl. The cost for the Go designers was high, as they had to treat map items as "semi-addressable" quantities, mentioned as special cases throughout the Go specification.

In Umka, you can always check if the key exists by calling validkey(m, key) whose effect is the same as m.count(key) != 0 in C++.

Regarding your proposals:

  1. Not feasible, as one cannot say at which instant the item should be collected as garbage.
  2. Somewhat confusing, as a user will expect to get an int (not ^int) after he/she put an int to the map.
  3. I myself thought of this solution, but rejected it as too complicated. This approach implies that the expression m["Hello"] means completely different things in p = &m["Hello"] and m["Hello"] = a. The former means a non-existing item whose "address" is null, while the latter means an existing item to which the value of a can be copied.

@vtereshkov vtereshkov closed this as not planned Won't fix, can't repro, duplicate, stale Jun 27, 2022
@skejeton
Copy link
Contributor Author

skejeton commented Jun 28, 2022

Hi I think it's worth to take a closer look at this issue.
In C++ std::map creates a new element on index because there's no access and assign []= operator, so to make it possible, the standard library uses a trick where [] operator returns a reference you can assign to with =. The reference obviously has to point to a valid memory address, so allocation (and default constructor call) is performed. It's mostly a hack used due to a limitation.

I think the mistake Go designers made should be learned from.

Not feasible, as one cannot say at which instant the item should be collected as garbage.

I don't think it's impossible, you can easily track if there's any accessible references to the item, and if there is, pass it up. If there's a single reference from the map itself (and the element wasn't assigned to), then you can free it without any problem. Unless I'm misunderstanding something.

Somewhat confusing, as a user will expect to get an int (not ^int) after he/she put an int to the map.

It's arguable if it's confusing or not, if the user would use the returned value as a temporary (or copied) variable, they would have to dereference it. However this is except for the case when it's a struct or an array, in which case accessing it's members is indistinguishable from pointers/values. Either way this is as simple as doing some_map[key]^ to just receive the copy.

I myself thought of this solution, but rejected it as too complicated. This approach implies that the expression m["Hello"] means completely different things in p = &m["Hello"] and m["Hello"] = a. The former means a non-existing item whose "address" is null, while the latter means an existing item to which the value of a can be copied.

It's a good point. I never viewed it that way, I just assumed []= and [] are different operations with no intersect. In my opinion though humans are heuristic creatures so trying to make a strictly logical thing might be less preferred over a shortcut solution. (after all, that's the rationale to why I think good syntax is nice, even if it's extra edge cases and complexity)

I'll leave this issue for later, when I make more use of maps and have a better understanding of them in a more practical manner, so as for now, this issue could be last on the list. It's also reminder for me to remember it's a scripting language so it should be designed for convenience, but there's many ways to solve that, with different drawbacks.

@vtereshkov
Copy link
Owner

@skejeton You convinced me to reopen it and continue discussion, though I'm not planning any immediate actions here.

In C++ std::map creates a new element on index because there's no access and assign []= operator, so to make it possible, the standard library uses a trick where [] operator returns a reference you can assign to with =.

Right. In Umka I have perform essentially the same trick, as I have no []= operator. Such an operator would cause problems like this:

type T = struct {x, y: int}
m := make(map[str]T)
m["Hello"] = T{5, 7}   // Allowed
m["World"].x = 9       // Not allowed?

@vtereshkov vtereshkov reopened this Jun 28, 2022
@skejeton
Copy link
Contributor Author

skejeton commented Jun 28, 2022

Sure, and thank you. I'll be sure to report anything I find interesting about this.
This is actually an interesting example:

m["World"].x = 9       // Not allowed?

And I think this may be just solved using the first method (which is luckily an implementation detail, so no code has to be changed). It should be a win-win even if difficult to execute. And indeed, don't hurry, and take your time, prioritize what's easier first. Take my issues with a grain of salt and better instead close them later as resolved or unplanned to have a better overall picture 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants