Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

genericize DataIdentity for container types #4591

Open
wants to merge 15 commits into
base: develop
Choose a base branch
from

Conversation

ab9rf
Copy link
Member

@ab9rf ab9rf commented May 14, 2024

this partial rework of the data identity system provides genericized support for container types using concepts, and in general cleans up and modernizes this code

this doesn't cover several the std::vector specializations, still figuring out the best way to do that so this can be considered WIP but this code works on Windows and so i am putting up a PR mainly to run CI tests and see if gcc likes this or not

the process of going through this code has identified a number of other possible refinements that should probably be done which i'll come back to after i've had some rest

ab9rf added 7 commits May 14, 2024 18:09
this partial rework of the data identity system provides genericized support for container types using concepts, and in general cleans up and modernizes this code

this doesn't cover several the std::vector specializations, still figuring out the best way to do this
fix compliance with gcc-12 for `requires` (this is because gcc-10 implemented the concepts proposal while gcc-12 implemented the final standard which is different; msvc is lax and accepts either)

also fix mistake in constraints for `isAssocContainer`
had the wrong type variable in the `container_identity` constructor, oops
this hides the underlying C++ type but this isn't necessarily useful anyway and exposing it requires complex nonportable code that is probably more work than it's worth
i'd like to not have to do this, but we're not there yet
Comment on lines 783 to 806
template<isIndexedContainer C>
struct DFHACK_EXPORT identity_traits<C> {
static type_identity* get() {
static generic_container_identity<C> identity{};
return &identity;
}
};

template<isAssocContainer C>
struct DFHACK_EXPORT identity_traits<C> {
static type_identity* get() {
static generic_assoc_container_identity<C> identity{};
return &identity;
}
};

template<isSetContainer C>
struct DFHACK_EXPORT identity_traits<C> {
static type_identity* get() {
static generic_set_container_identity<C> identity{};
return &identity;
}
};

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure these no longer need to be behind BUILD_DFHACK_LIB? My understanding is that had something to do with linking issues with plugins attempting to use these specializations, and it was deemed better/easier to not allow that at all, but I'm afraid I don't remember much more than that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm still exploring this - i think it has to do inconsistent typing of the get static method. there appears to be no reason why all of the get methods cannot return a type_identity* as the type is polymorphic anyhow

however, i'm still investigating this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the issue of duplicate implementations still exists, but in this case the static object being created is read-only so if there are two copies there should be no risk of harm. the problem that arises when there are multiple instances of the same specialization is when there is mutable data, but there's no mutable data in a type_identity

also, both gcc and msvc have improved their loading models since 2012 to reduce problems with duplicated implementations

there is a ton of stuff in our code that is there to work around compiler bugs and shortcomings, or even C++ language shortcomings, that were fixed, in some cases, over a decade ago

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the issue of duplicate implementations still exists, but in this case the static object being created is read-only so if there are two copies there should be no risk of harm.

Are we sure the address of that object isn't used as an identifier anywhere? I'm pretty sure at least something in the Lua layer passes around pointers to type_identity objects and compares them for equality.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least some places where the pointer to type_identity is used:

  • Several registry tables:
    • The table at &DFHACK_TYPEID_TABLE_TOKEN maps type_identity* pointers (as raw lightuserdata) to the corresponding DF type object (e.g. identity_traits<df::item>::get() maps to the lua df.item).
    • The table at &DFHACK_TYPETABLE_TOKEN is a bidirectional map of type_identity* to metatable
  • In DF object metatables:
    • The value at &DFHACK_IDENTITY_FIELD_TOKEN is the object's type_identity. This is the value with a "userdata" key visible in printall(debug.getmetatable(df.item)), for example.
      • This is used in is_type_compatible(), but that already returns false for all IDTYPE_CONTAINER identities anyway, so your change has no effect there.
      • It's also returned by get_object_identity(), but that doesn't seem to be used for identification anywhere (although the result is passed on to several calls that I didn't trace...). Mainly, the identity is used by calling methods that should return the same thing for duplicate copies of the identity.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, unless I'm missing something else, I think duplicates might be ok. Especially for templated container types, for which several operations that use the type_identity aren't implemented to begin with.

@@ -113,10 +114,10 @@ namespace DFHack

class DFHACK_EXPORT container_identity : public constructed_identity {
type_identity *item;
enum_identity *ienum;
type_identity *ienum;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why this changed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

specifically because of associative containers, the identity key type of an associative container goes there

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I wonder if there is a way to enforce that this is an enum_identity for non-associative containers. Probably complicated.

mostly so this code will pass CI - in the long term i'd prefer something different
@Bumber64
Copy link
Contributor

Bumber64 commented May 16, 2024

Seems to work for me. I can pass a vector into Lua and modify it.

@myk002 myk002 requested a review from lethosor May 16, 2024 22:24
template<typename T> struct type_name<std::deque<T>> {
static const inline std::string name = "deque";
};
template<typename T, int sz> struct type_name<std::array<T, sz>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be

Suggested change
template<typename T, int sz> struct type_name<std::array<T, sz>> {
template<typename T, size_t sz> struct type_name<std::array<T, sz>> {

to match properly.

I am debating removing the name = "container" from the base template to catch this sort of thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's what I was testing with: DFHack/df-structures#757

Copy link
Member Author

@ab9rf ab9rf May 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am debating removing the name = "container" from the base template to catch this sort of thing.

i was ambivalent on that issue originally. it's not hard for us to add a new specialization but it's more of a burden for a developer of an external plugin. maybe only include the generic case in the library build? that would cause any unspecialized in core code to error at compile time, while allowing plugin developers freedom to use specialized containers for their own purposes (although i'm not sure what the scenario for that is, tbh) nah i don't think that does what i want.. maybe someone else would have a better idea

Copy link
Member

@lethosor lethosor May 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Support for adding new types in plugins is limited anyway. Their identities won't be registered in the global registry tables in #4591 (comment) (although I'm a little unsure whether any container types are registered there).

Really, my intent of breaking the base template would be to catch specializations like array<T, int> that are intended to work but don't. It took me a while to figure out why a std::array was printing as "container" instead of "array".

fix parameter type of `std::array`

Co-authored-by: Alan <lethosor@users.noreply.github.com>
@Bumber64
Copy link
Contributor

Bumber64 commented May 24, 2024

Regarding std::unordered_map, the keys (i.e., first) aren't accessible by Lua. Running pairs on it gives 0-n for keys and just the value of second for values (i.e., not {first, second}.)

Any attempt to insert into std:unordered_map<int, std::string> via Lua got me the error Temporaries of type string not supported. Same for std::unordered_set<std::string> and ordered versions.

@ab9rf
Copy link
Member Author

ab9rf commented May 24, 2024

Regarding std::unordered_map, the keys (i.e., first) aren't accessible by Lua. Running pairs on it gives 0-n for keys and just the value of second for values (i.e., not {first, second}.)

Any attempt to insert into std:unordered_map<int, std::string> via Lua got me the error Temporaries of type string not supported. Same for std::unordered_set<std::string> and ordered versions.

The Lua interface to C++ associative containers is at best shoehorned and needs to be reconceptualized. This PR may not be the best place to do this.

@Bumber64
Copy link
Contributor

Bumber64 commented May 24, 2024

The Lua interface to C++ associative containers is at best shoehorned and needs to be reconceptualized. This PR may not be the best place to do this.

Should I open an issue at this time? I don't need it for anything I'm working on. I just remembered it being mentioned on Discord as something to test. If nobody ever ends up needing it, then it'd be kind of pointless. (I guess Bay12 might use the type eventually?)

@ab9rf
Copy link
Member Author

ab9rf commented May 24, 2024

If nobody ever ends up needing it, then it'd be kind of pointless. (Bay12 might use the type eventually?)

Bay12 is already using std::maps in several places (notably in some of the interface-related stuff) and I seem to recall myk mentioning that having them fully available from lua would be useful in some places already.

Our current Lua interfaces only supports reading the "values" list as if it were a regular array, and does not support modifications at all. I personally believe that it would be good to move past these limitations.

@ab9rf
Copy link
Member Author

ab9rf commented May 24, 2024

Let me add here that this code is not really ready to go yet. I really want to extend this to encompass std::vector<T*> which right now is still being handled by the old stl_ptr_vector_identity, but this isn't just a matter of removing the latter specialization, changes also have to be made to the generic_container_identity for the special handling that vectors of pointers require.

I did test to see what happens if you just remove the specialization: the launcher sprouts hundreds of scrollbars, which is both hilarious and entirely unhelpful.

@lethosor
Copy link
Member

Should I open an issue at this time? I don't need it for anything I'm working on. I just remembered it being mentioned on Discord as something to test.

Was that something I said? I didn't mean that unordered_map should have any particular functionality, just that this code should (a) compile and (b) not have any regressions when handling instances of unordered_map. The behavior you're pointing out is a known limitation that was present before this change. Not great, but not a regression.

Copy link
Member

@myk002 myk002 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this deserve a changelog line (probably under API)?

@ab9rf
Copy link
Member Author

ab9rf commented May 27, 2024

does this deserve a changelog line (probably under API)?

lemme finish writing it first lol

@lethosor
Copy link
Member

I'm not sure it does. It's really not a user-facing change. The only aspect that could affect some external devs is the ability to define std::arrays in df-structures, so I would probably put it under Structures, if anything.

@ab9rf
Copy link
Member Author

ab9rf commented May 29, 2024

the main "features" this PR offers are the ability to pass a wider range of types into Lua from C++, especially from plug-ins, future proofing the identity system to possibly support a broader range of container types (which we expect to be likely to arise as we continue to extend/revise structures), and general code quality (at least i would like to think so)

all of this is basically being driven by bay12's increasing use of C++20 paradigms that our identity system was never designed to work with. the existing system having was basically designed using C++03 + "boost but not really boost" which are at best dated by modern standards, but which worked very well with toady's C++98ish style of coding DF itself, and which doesn't necessarily play that well with C++20 which we are increasingly moving toward both because C++20 has some really nice paradigms (i am especially fond of std::range) but more because bay12 (with putnam now on board) is clearly moving rather quickly in that direction

i'm very much cognizant of the fact that we have, in recent revisions, added several types to structures that are currently being handled as opaque types within the Lua identity system but which really ought to have more meaningful Lua semantics. while this PR does not itself resolve this issue, i'd like to think that it's a step toward a more flexible identity management system that will hopefully enable us to do so. specific goals i have in mind here include making std::function objects directly callable from Lua (without explicit C++ glue code), allowing Lua code to manipulate the contents of C++ associative containers (including at the very least std::map, std::unordered_map, and std::set), and transparent support in Lua of types wrapped with std::optional or std::shared_pointer

anyway, i don't think any of this merits a changelog since it's 90% inside baseball and really only impacts the core team, at least initially. we didn't changelog #4406 as far as i know, and this is basically the same type of change as that was

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Todo
Development

Successfully merging this pull request may close these issues.

None yet

4 participants