Code built with g++ with link time optimization (LTO) fails with a "Trying to save an unregistered polymorphic type" exception. The same code works fine without LTO. This is on a stock Fedora 37 machine (with gcc 12.2.1, cereal 1.3.2).
My code is a large mixed C++ and Python project, but I boiled it down to a minimal reproducer here: cereal_test.zip
A.h defines and registers a polymorphic type Wrapped and a class Container that stores a shared_ptr<Wrapped>. B.h registers a Wrapped subclass BWrapped. We build two dynamic libraries libA.so and libB.so and wrap each with SWIG so they can be used from Python as A.py and B.py (note we build only the A wrapper with -flto):
If we then try to serialize a Container object that contains a BWrapped in Python (the _get_as_binary method uses cereal to write Container to a BinaryOutputArchive and then returns the resulting data), it fails:
$ cat test.py
import A, B
w = B.BWrapped()
c = A.Container(w)
print(c._get_as_binary())
$ python3 test.py
terminate called after throwing an instance of 'cereal::Exception'
what(): Trying to save an unregistered polymorphic type (BWrapped).
If we rebuild A without LTO though, it works fine:
It looks like the problem is that LTO causes StaticObject to not work correctly. If we add to A.h a function
void show_a_output_binding_map() {
  auto const & bindingMap = cereal::detail::StaticObject<cereal::detail::OutputBindingMap<cereal::BinaryOutputArchive>>::getInstance().map;
  std::cerr << "A map is at " << &bindingMap << std::endl;
}
and a similar function to B.h, then with LTO we see:
$ cat test.py
import A, B
A.show_a_output_binding_map()
B.show_b_output_binding_map()
w = B.BWrapped()
c = A.Container(w)
print(c._get_as_binary())
$ python3 test.py
A map is at 0x7f029b6fec40
B map is at 0x7f029b3ff540
terminate called after throwing an instance of 'cereal::Exception'
what(): Trying to save an unregistered polymorphic type (BWrapped).
In other words, StaticObject is no longer a process-wide singleton: A and B each get their own copy of the binding map, so when B registers BWrapped, A cannot see it. (Without LTO, the address printed for A map and B map is the same.)
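For readers unfamiliar with the pattern, here is a minimal sketch of a function-local-static singleton and the kind of address check used above. It is an illustration only, not cereal's actual code: `Singleton`, `BindingMap`, `binding_map_address` and `registration_is_visible` are hypothetical names. Within a single binary there is one instance, so a registration made through one accessor is visible through another. The failure reported here corresponds to each shared library ending up with its own copy of the static.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a per-type singleton holder (not cereal's code).
template <class T>
struct Singleton {
    static T& get() {
        // One instance exists per copy of this function that ends up in the
        // process. If two shared libraries each keep a private copy, there
        // are two instances and two different addresses.
        static T instance;
        return instance;
    }
};

// Hypothetical registry type: maps a type name to some binding id.
using BindingMap = std::map<std::string, int>;

// Same idea as the diagnostic in the report: expose the address for comparison.
inline const void* binding_map_address() {
    return &Singleton<BindingMap>::get();
}

// Registers a name through one call and looks it up through another.
inline bool registration_is_visible() {
    Singleton<BindingMap>::get()["BWrapped"] = 1;
    return Singleton<BindingMap>::get().count("BWrapped") == 1;
}
```

Comparing the value of `binding_map_address()` as seen from each library is the same test as printing the "A map" and "B map" addresses.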
I see cereal has specific code (in detail/static_object.hpp) to try to prevent link optimization from breaking StaticObject, but it seems not to be working here. Obviously an easy workaround is "don't use LTO" but I'd like to find a better solution. I can modify the SWIG interface, so perhaps I can add some code to the generated modules that explicitly references StaticObject and so persuades the linker not to mangle the code?
In order to correctly serialize a polymorphic pointer
we need the most-derived type. cereal includes machinery
for this but it relies on the linker to make sure that
certain objects are unique in the process, and this doesn't
work well with link time optimization, or on Windows,
as per USCiLab/cereal#783. Provide our own similar
mechanism that registers Object subclasses in precisely
one place - the Object class itself in IMP.kernel.
FWIW, I see the exact same issue when building for Windows (I use MSVS 2015, for 64-bit). (The reproducer code is similar, except that functions need the usual dllexport/import tags so that DLLs work.)
Our workaround for now, linked above, adds a map of serialize/deserialize functions to our application itself, so we can be sure they're stored only in one place. Works for us but it is definitely not as general as cereal's polymorphic machinery.
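A rough sketch of what such an application-owned registry might look like is below. It is not the actual workaround code: `Object`, `BWrapped`, `SaveRegistry` and `type_name` are hypothetical names, and it uses only the standard library rather than cereal archives. The assumption is that the registry object itself would be defined out-of-line in exactly one shared library, so every module shares the same map. Types are keyed by a name returned from a virtual function rather than by linker-deduplicated statics.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical polymorphic root class of the application.
struct Object {
    virtual ~Object() = default;
    // Each subclass reports its own name; this is the registry key.
    virtual std::string type_name() const = 0;
};

// Hypothetical subclass that would live in another library.
struct BWrapped : Object {
    int value = 0;
    std::string type_name() const override { return "BWrapped"; }
};

// Hypothetical registry of save functions, keyed by type name. In a real
// build, the single instance would be owned by one library only.
class SaveRegistry {
public:
    using SaveFn = std::function<void(const Object&, std::ostream&)>;

    void add(const std::string& name, SaveFn fn) {
        assert(fn && "save function must be callable");
        savers_[name] = std::move(fn);
    }

    // Writes "<type name>:<payload>"; returns false for unregistered types.
    bool save(const Object& obj, std::ostream& out) const {
        auto it = savers_.find(obj.type_name());
        if (it == savers_.end()) return false;
        out << it->first << ':';
        it->second(obj, out);
        return true;
    }

private:
    std::map<std::string, SaveFn> savers_;
};
```

A subclass would register itself once at module load time by calling `add` with its name and a function that downcasts and writes its fields. Deserialization could use a second map from name to factory function in the same way.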