
Custom KnowledgeRecord Types

James Edmondson edited this page Apr 3, 2020 · 14 revisions

FEATURE DEPRECATED

As of 3.3.0, this feature is deprecated. We removed it because of complications with Boost long-term support and the requirement of Boost Filesystem, which was causing issues with moving to UE4 as a simulation environment for GAMS. If you are interested in this feature, let us know, and we will try to prioritize its reinclusion. However, there is no planned resurrection for this tool at the moment.

This feature lives in v3.2.3, at the latest.


Madara's KnowledgeBase and KnowledgeRecord support several useful types natively: integers, doubles, vectors of both, strings, and blobs. Where appropriate, it is best to use those types. They will be accessible by any KnowledgeBase, and a variety of useful functions will work with them, including overloaded math operators and indexing operations. For types which cannot easily be represented by those native types, KnowledgeRecord, and thus KnowledgeBase, can store an Any type, provided by Madara, which can in turn store custom types directly.


Supported Types

There are two main kinds of objects supported by Any: native C++ types, and Cap'n Proto messages. Supported native C++ types must be default constructible, copy constructible, and support serialization via either the Cereal Library, or via Madara's translation system to a compatible Cap'n Proto message struct. Note that this translation system is an alternative to using Cap'n Proto generated types and schemas directly.

Note for Shield AI developers: any type which supports the internal "BufferedStore" class will also support Any.

For more, see the Creating a Custom Type section.

Native C++ Types

Storing Custom Native Types in KnowledgeBase

For many use cases, you need not use the Any type directly. To store a supported type, call the set_any or emplace_any methods. The set_any method takes a std::string key or VariableReference as first argument, and an object to store as second. This object will be stored by value, but will be copied or moved depending on whether it is given as an lvalue or rvalue reference. For emplace_any, supply the same first argument as set_any, but the remaining arguments are passed directly to the appropriate Any constructor. For example:

using namespace madara; using namespace knowledge;
KnowledgeBase kb;
using strvec = std::vector<std::string>;

// Constructs vector, and moves into "foo"
kb.set_any("foo", strvec{"a", "b", "c", "d"});

// Use mk_init helper to pass initializer lists
kb.emplace_any<strvec>("bar", mk_init({"e", "f", "g"}));

You can also create KnowledgeRecord objects holding an Any, with an Any-compatible stored type:

// Copy/move value into record
KnowledgeRecord rec0(Any(std::vector<float>{1.1, 2.2, 3.3}));


// Construct in-place within record
KnowledgeRecord rec1(tags::any<strvec>, 10, "a");

// Move records into knowledge base
kb.set("rec0", std::move(rec0));
kb.set("rec1", std::move(rec1));

Reading Custom Native Types from the KnowledgeBase

To get back the stored data, use get() as normal to get the KnowledgeRecord, which provides a to_any method. This method accepts a template type parameter, and returns a copy of the stored data, provided the type exactly matches what was stored originally. It is not enough for the requested type to be convertible from the stored type, or to be a base type of it. It must be the exact same type, or a BadAnyAccess exception will be thrown. Continuing the above example:

auto foo = kb.get("foo").to_any<strvec>();
assert(foo.size() == 4);
assert(foo[1] == "b");

// Stores an int, since that's the default type of an integer literal
kb.set_any("baz", 10);

assert(kb.get("baz").to_any<int>() == 10);

// Throws BadAnyAccess; short does not exactly match int
kb.get("baz").to_any<short>(); 

// This version explicitly names the type, to avoid confusion
kb.emplace_any<int>("baz2", 10); 
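
The exact-match rule is the same one the C++17 standard library applies to std::any. The following sketch uses std::any as a stand-in for Madara's Any, so std::bad_any_cast plays the role of BadAnyAccess. The helper names holds_exactly and read_throws are hypothetical and are not part of Madara.

```cpp
#include <any>

// Hypothetical helper: true only if `value` holds exactly a T.
// The pointer form of std::any_cast returns nullptr on a mismatch
// instead of throwing.
template<typename T>
bool holds_exactly(const std::any &value)
{
  return std::any_cast<T>(&value) != nullptr;
}

// Hypothetical helper: true if reading `value` as a T throws, which is
// what a mismatched to_any<T>() is described as doing.
template<typename T>
bool read_throws(const std::any &value)
{
  try {
    (void)std::any_cast<T>(value);
    return false;
  } catch (const std::bad_any_cast &) {
    return true;
  }
}
```

With std::any baz = 10, holds_exactly<int>(baz) is true, while holds_exactly<short>(baz) and holds_exactly<long>(baz) are both false, even though int converts to either type.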

KnowledgeBase also provides share_any and take_any methods to access the internal shared_ptr that KnowledgeRecord holds. For more details on these methods, see their documentation section.

In addition, KnowledgeRecord provides get_any_ref and get_any_cref to obtain a reference to the stored value directly. This is less safe than using share_any, but more efficient. Ensure either that the KnowledgeBase is locked, or that you hold a copy of the KnowledgeRecord for as long as you use the reference, and only modify the object through the reference while holding the KnowledgeBase lock.

These access mechanisms require knowing the type stored in the Any. See Accessing Fields and Elements for avoiding this requirement.

Registering Native Types

Any native types which will be stored directly within an Any stored in a KnowledgeRecord or the KnowledgeBase must be registered with the Any type system so they can be deserialized automatically. If a type stored in a KnowledgeBase would be serialized, and isn't registered, a BadAnyAccess exception will be thrown. Types which will never be serialized (that is, never sent over the network or checkpointed to file) do not need to be registered.

Sub-member and element types within registered types are always known statically once the top level type is known. For example, primitive types and STL containers are not registered by default, but can still be used within registered structs. But if you wish to store such types directly into a KnowledgeBase or KnowledgeRecord as an Any, you should register them.

To register a type, call Any::register_type<T>(const char *) with your custom type T, and a tag string which will be used to identify the type portably, across processes. This tag is not copied, and is kept as the given pointer, so it must remain valid for the remainder of the program lifetime. Typically, a string literal is used. This string should be short if possible. It will be serialized along with each KnowledgeRecord storing an Any of that type. For example, for a type YourType:

Any::register_type<YourType>("YourType");

These functions should be invoked from main, before any threads are launched (or at least, before any KnowledgeBase accesses occur). In general, it's best to define a function in each namespace that registers that namespace's types, and call those functions as needed. A type can be registered under multiple tags, but only the first such tag will be used during serialization; all will be recognized during deserialization.

Note that registration is only needed for types stored directly.
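
The tag rules above (the tag pointer is kept rather than copied, the first tag is used when writing, and every tag is accepted when reading) can be modeled with a small standalone registry. This is only an illustrative sketch, not Madara's implementation. TagRegistry, tag_for, and tag_names are hypothetical names.

```cpp
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>

// Hypothetical stand-in for a tag registry; NOT Madara's implementation.
class TagRegistry
{
public:
  // Register T under `tag`. The pointer itself is kept for serialization,
  // so it must stay valid for the registry's lifetime (a string literal
  // is the usual choice).
  template<typename T>
  void register_type(const char *tag)
  {
    std::type_index type(typeid(T));
    by_tag_.emplace(tag, type);     // every tag is recognized when reading
    first_tag_.emplace(type, tag);  // emplace keeps the first tag registered
  }

  // Tag written when serializing a T, or nullptr if T is unregistered.
  template<typename T>
  const char *tag_for() const
  {
    auto it = first_tag_.find(std::type_index(typeid(T)));
    return it == first_tag_.end() ? nullptr : it->second;
  }

  // True if `tag` was registered for T; used when reading data back.
  template<typename T>
  bool tag_names(const std::string &tag) const
  {
    auto it = by_tag_.find(tag);
    return it != by_tag_.end() && it->second == std::type_index(typeid(T));
  }

private:
  std::map<std::string, std::type_index> by_tag_;
  std::map<std::type_index, const char *> first_tag_;
};
```

Registering one type under "Reading" and then under "LegacyReading" makes tag_for return "Reading", while tag_names accepts both tags.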

Cap'n Proto Types

Any can store Cap'n Proto messages, in their serialized form. Remember that this is different than storing a native C++ type with a Cap'n Proto serialization capability. Madara provides a template, CapnObject, which wraps and stores a Cap'n Proto generated type, which can in turn be stored in an Any. Madara also provides the GenericCapnObject type which can store any Cap'n Proto message, without needing a generated type. For example, assume a Cap'n Proto message called Point contains 3 doubles, x, y, and z, and has been compiled by capnp compile into a C++ type:

using namespace capnp;
MallocMessageBuilder msg;
auto builder = msg.initRoot<Point>();
builder.setX(42);
kb.set_any("my_point", CapnObject<Point>(msg));

You can use GenericCapnObject to create an any with any loaded schema, even without the type itself:

// Can be any loaded schema; see Cap'n Proto docs for details
auto schema = Schema::from<Point>();
auto builder2 = msg.initRoot<DynamicStruct>(schema);
builder2.set("x", 42);
kb.set_any("my_other_point", GenericCapnObject("point", msg));

GenericCapnObject is useful for sending data, but cannot be received. Unregistered types are rejected by Madara transports and checkpoint readers.

Registering Cap'n Proto Types

Typically, in C++ code, you should register Cap'n Proto message types using their generated type, wrapped in CapnObject<...>. For example:

Any::register_type<CapnObject<YourCapnType>>("YourCapnType");

This will provide static typing for the Cap'n Proto objects received over the network and read from disk.

Madara also provides RegCapnObject, which is similar to GenericCapnObject, except that it knows the schema of the object it holds. This is used primarily by the Java and Python ports, which cannot register Cap'n Proto messages as statically known C++ types.

To use RegCapnObject you must register a Cap'n Proto StructSchema with a tag name. You may register both a type and a schema to the same tag, but to avoid confusion, you should ensure that the type is a CapnObject<...>, and the schema is the corresponding schema. It is redundant to register_type a Cap'n Proto message as RegCapnObject, as that is the type used when a message corresponds to a registered schema, but no registered type.

Any::register_schema("YourType", your_schema);

See the Cap'n Proto documentation, and the test file $MADARA_ROOT/tests/test_any.cpp, for information about loading a StructSchema from Cap'n Proto schema files.

Once you've registered a tag, you can construct a RegCapnObject as:

auto builder3 = msg.initRoot<DynamicStruct>(your_schema);
Any a3(RegCapnObject("YourType", msg));
kb.emplace_any("my_point3", a3);

Reading Cap'n Proto Types

Any types provide a reader method for accessing a held Cap'n Proto message as a Cap'n Proto reader. This access can be either via specific generated type, or through the DynamicStruct provided by Cap'n Proto for reflection based access. There are 3 variants of the reader method:

#  Form                Return Type            Supported By
1  any.reader()        DynamicStruct::Reader  CapnObject<T>, RegCapnObject
2  any.reader(schema)  DynamicStruct::Reader  All CapnObject variants
3  any.reader<T>()     T::Reader              CapnObject<T>

If a reader form is called on an unsupported type, BadAnyAccess will be thrown. Note that for form 2, the schema is only used by GenericCapnObject. It is ignored for other types.

Native Type Reflection

The capabilities described previously on this page require knowing the Any type held in the KnowledgeRecord statically, at compile time. If this is not possible, Any provides runtime reflection capabilities to access internals of values stored in an Any. These capabilities are, of course, less efficient than the direct compile-time casting described above.

Note that these capabilities only apply to native C++ types, not Cap'n Proto generated types and schemas. For the latter, use the dynamic reflection features provided by Cap'n Proto itself.

Any and AnyRef

An Any object owns an object of any supported type, managing its lifetime like std::unique_ptr, except that it will automatically clone the stored object when the Any is copied. Madara also provides an AnyRef class, which provides many of the same capabilities as Any, except that it refers to a supported type by reference, without owning it. It can read from and modify the referenced object, but it cannot change its type. While Any is used to store custom types in KnowledgeRecords, AnyRef is used to reference fields and contained elements of those stored types.

An AnyRef is like a C++ pointer. It must be used with care. Typically, it should only be an ephemeral object, used while resolving fields and elements within an Any. Be careful storing one long-term. An AnyRef is invalidated if the object it points to is destructed, which includes any call to emplace() or set() on an Any it points into, even if these don't change the type. Use assign() to modify a stored object without destructing it.
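
The ownership model can be illustrated with a statically typed stand-in. The sketch below is not how Any is implemented (Any erases the stored type). CloningBox is a hypothetical name, and only the clone-on-copy behavior and the assign/set distinction are taken from the text above.

```cpp
#include <memory>
#include <utility>

// Hypothetical, statically typed stand-in for the ownership model.
template<typename T>
class CloningBox
{
public:
  explicit CloningBox(T value) : ptr_(std::make_unique<T>(std::move(value))) {}

  // Copying clones the stored object; std::unique_ptr alone is move-only.
  CloningBox(const CloningBox &other) : ptr_(std::make_unique<T>(*other.ptr_)) {}
  CloningBox(CloningBox &&) noexcept = default;

  // Modify the stored object in place; references to it stay valid.
  void assign(T value) { *ptr_ = std::move(value); }

  // Destroy the stored object and create a new one; references to the
  // old object are invalidated, even though the type is unchanged.
  void set(T value) { ptr_ = std::make_unique<T>(std::move(value)); }

  T &ref() { return *ptr_; }
  const T &ref() const { return *ptr_; }

private:
  std::unique_ptr<T> ptr_;
};
```

A reference obtained from ref() plays the role of an AnyRef here: it survives assign(), but must not be used after set().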

Accessing Fields and Elements

To support this feature, a for_each_field overload must be available for the stored type in an Any. It is not sufficient that Cereal library serialization be supported. You can call supports_fields() on an Any to determine if the stored type supports them.

The simplest way to access fields is by name, using a string. Any is callable, and when called with a string (std::string or const char *), it will attempt to find the field of that name, and return an AnyRef referring to it. The operator() of Any and AnyRef is a synonym for their ref() methods.

You can also access fields using AnyField objects. You can get the AnyField objects corresponding to a stored value by calling list_fields(), which returns a vector of them, or find_field(), which will find one by name. You can pass an AnyField instead of a name when referencing a field for a much faster lookup (no string comparisons, versus O(log n) string comparisons for name lookup, where n is the number of fields).

Given an AnyRef (or an Any), you can call the to<T>() method on it to retrieve a copy of the stored value, supporting limited type conversions (stored value will be knowledge_cast to a KnowledgeRecord, then knowledge_cast into requested type). Call supports_from_record() to determine if an Any or AnyRef supports this operation. If not, to<T>() will require exact type matching, just like ref<T>(). You can also call to_integer(), etc., to access data, for each type supported natively by KnowledgeRecord.

You can also call ref<T>() to access the stored value by reference. This call must match the stored value's type exactly. You can also call the AnyRef or Any with an object of type<T> (several are predefined in the madara::knowledge::tags namespace), which acts like a call to ref<T>().

You can also access contained elements by integer or string index, if the stored type supports it. Use supports_int_index() and supports_string_index() respectively to check if a type stored in an Any, or referenced by an AnyRef, supports those indexing modes. Call the at() method, or use operator[](), passing a size_t, const char * or const std::string & to index the stored value, and return an AnyRef to it.

struct Example {
  int i;
  double d;
  std::vector<double> dv;
  std::vector<Example> ev;
}; // Assume for_each_field is defined

kb.set_any("example", Example{3, 5.5,
  {1.1, 2.2, 3.3}, {
    {4, 6.6, {4.4, 5.5, 6.6}, {}},
    {5, 7.7, {7.7, 8.8, 9.9}, {}}}});

// When given no type, to_any() returns a copy of stored Any
Any any = kb.get("example").to_any();
assert(any("i").to_integer() == 3L);
assert(any("dv")[1].to_double() == 2.2);
assert(any("ev")[0]("dv")[2].to_integer() == 6);

// Iterate over each field, and print. Note that printing Anys always works, as it
// falls back to serializing to JSON if no other printing support is available.
for (const AnyField &field : any.list_fields()) {
  std::cout << field.name() << ": " << any(field) << std::endl;
}

// Iterate over each element, and print. Best to get size and AnyRef first.
AnyRef arr = any("ev")[0]("dv");
size_t n = arr.size();
for (size_t i = 0; i < n; ++i) {
  std::cout << i << ": " << arr[i] << std::endl;
}

Writing to Fields and Elements

You can use ref<T>() as described in the previous section to obtain a reference to the data referred to by an Any or AnyRef, then modify the data through that reference. This, however, requires exact type matching with the stored value.

You can also use KnowledgeRecord as an intermediate type for any stored value which supports knowledge_cast into it from KnowledgeRecord. Call supports_from_record() to determine if an Any or AnyRef supports this operation. You can call from_record() to invoke this support directly, or assign() to try direct assignment first, then fall back to from_record(). Note that these operations modify the stored object; they don't replace it, as the emplace() and set() methods do. As such, they are supported by AnyRef, whereas the latter are exclusive to Any.

In addition, AnyRef supports operator= as a synonym for assign(). Any itself does not, to avoid confusion over whether assign() semantics or set() semantics are used.

// assign through direct reference
any("ev")[1]("i").ref<int>() = 42;

// will assign directly, as the literal is an int and matches the stored value
any("ev")[1]("i") = 47;

// will convert to a KnowledgeRecord, then call `from_record`, which calls `to_integer`
any("ev")[1]("i") = 52.1;

// will also convert to a KnowledgeRecord
any("ev")[1]("i") = "57";
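
The conversion path in the last two assignments can be approximated without Madara. In the sketch below, Record is a hypothetical std::variant standing in for KnowledgeRecord, and to_integer and from_record are simplified guesses at the behavior named in the comments above. The real conversion rules may differ.

```cpp
#include <string>
#include <variant>

// Hypothetical stand-in for KnowledgeRecord: an integer, double, or string.
using Record = std::variant<long, double, std::string>;

// Convert whichever alternative is held to an integer. Doubles are
// truncated and strings are parsed (an assumption for this sketch).
long to_integer(const Record &rec)
{
  if (const long *i = std::get_if<long>(&rec)) return *i;
  if (const double *d = std::get_if<double>(&rec)) return static_cast<long>(*d);
  return std::stol(std::get<std::string>(rec));
}

// Modify the existing int in place through the intermediate record,
// rather than replacing the stored object.
void from_record(int &field, const Record &rec)
{
  field = static_cast<int>(to_integer(rec));
}
```

Under these assumptions, from_record(i, Record(52.1)) leaves i equal to 52, and from_record(i, Record(std::string("57"))) leaves it equal to 57.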

Creating a Custom Native Type

A type supported by Any must be default constructible, copy constructible, and support serialization via either the Cereal Library (a dependency used by Madara), or via translation to a Cap'n Proto struct type.

Cereal Serialization

To implement a native type backed by Cereal, you can either provide the functions expected by Cereal, or define a free function for_each_field in the same namespace as your type. The function should have the following signature, replacing YourType with your type:

template<typename Fun>
void for_each_field(Fun &&fun, YourType &val);

This function will be called with a functor (fun) and a reference (const or not) to a value of your type. Your for_each_field should call fun once for each field you want serialized, passing a const char * argument holding the field's name (used for JSON serialization) and a reference to the field within the given val. For example:

struct Example
{
  int a;
  double b;
  std::string c;
};
template<typename Fun>
void for_each_field(Fun &&fun, Example &val)
{
  fun("a", val.a);
  fun("b", val.b);
  fun("c", val.c);
}

For base classes, the base class should be referenced by calling for_each_field on a reference to that type. For example:

struct Derived : Example
{
  double x;
};
template<typename Fun>
void for_each_field(Fun &&fun, Derived &val)
{
  for_each_field(fun, (Example &)val);
  fun("x", val.x);
}

Your for_each_field should look exactly like above. Do not add conditionals, or temporary variables. Every time for_each_field is called, the functor passed must be called in exactly the same order, and with the same references, which must refer directly to the value's data members. If you require greater control over serialization, directly implement the serialize, save, and/or load functions that Cereal expects for your type. These will override the generic for_each_field-based versions Madara provides.
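
To see what a for_each_field overload enables, the sketch below passes two different functors to the same overload. The consumers field_names and describe are hypothetical and exist only for illustration. Madara supplies its own functors for serialization and reflection.

```cpp
#include <sstream>
#include <string>
#include <vector>

struct Example
{
  int a;
  double b;
  std::string c;
};

// Same shape as the for_each_field shown above.
template<typename Fun>
void for_each_field(Fun &&fun, Example &val)
{
  fun("a", val.a);
  fun("b", val.b);
  fun("c", val.c);
}

// Hypothetical consumer: collects the field names in visiting order.
std::vector<std::string> field_names(Example &val)
{
  std::vector<std::string> names;
  for_each_field([&](const char *name, auto &) { names.push_back(name); }, val);
  return names;
}

// Hypothetical consumer: renders "name=value;" for every field.
std::string describe(Example &val)
{
  std::ostringstream out;
  for_each_field([&](const char *name, auto &field) {
    out << name << '=' << field << ';';
  }, val);
  return out.str();
}
```

Because the functor is a template parameter, a single for_each_field definition serves every consumer, which is why the function body must visit the same members in the same order on every call.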

Cap'n Proto Serialization

Cap'n Proto's generated C++ types follow a specific reader/builder model that doesn't always map well to the types a system might already use, or otherwise would want to use. To allow arbitrary native C++ types to interoperate with Cap'n Proto (and be compatible with other users of the same schema), Madara provides an adaptation system allowing native C++ types to map to Cap'n Proto's generated types.

The adaptation system requires only a single macro, MADARA_CAPN_MEMBERS, to set up the mapping. Once this is done, the type may be used just like a Cereal-based type. For example, suppose we use the following Cap'n Proto schema:

using Cxx = import "c++.capnp";
$Cxx.namespace("geo_capn");

struct Point {
  x @0 :Float64;
  y @1 :Float64;
  z @2 :Float64;
}

Then we can write a corresponding native type and map it as follows:

struct Point
{
  double x = 0, y = 0, z = 0;

  Point() = default;
  Point(double x, double y, double z = 0) : x(x), y(y), z(z) {}

  double &getX() { return x; }
  double &getY() { return y; }
  double &getZ() { return z; }
};

MADARA_CAPN_MEMBERS(Point, geo_capn::Point,
    (x, X)
    (y, Y)
    (z, Z)
  )

The first two arguments are simple: the first is the name of the native C++ type, which must not have any namespace qualification. The macro must be used within the same namespace context as the type being mapped. The second argument is the name of the Cap'n Proto generated type, and may include namespace qualifiers.

The fields are listed as a sequence of parenthesized comma separated lists. There are no commas between the individual lists. Each list has two elements: a native field name, and the name Cap'n Proto uses to name the corresponding accessors in the generated type. This will be an UpperCamelCase version of the first element.

This 2-element form is a shorthand which assumes that the field of the native type has the same name as the one given in the schema for the Cap'n Proto message. It also assumes that the first element names a public data member or public accessor function member which returns a reference. If more control is needed, use the 3-element form. For example, an equivalent to the above macro invocation is:

MADARA_CAPN_MEMBERS(Point, geo_capn::Point,
    (x, X, [](Point &p) -> double & { return p.x; })
    (y, Y, &Point::getY)
    (z, Z, &Point::z)
  )

The third element can be a pointer-to-data-member, a pointer-to-member-function that takes no arguments and returns a reference, or any callable (function, lambda, functor, etc.) that takes a reference to the native type, and returns a reference to the field.

Field types map naturally. As long as a field can be accessed as the C++ type given in the native type definition, the conversion should work. There are several caveats, however: Cap'n Proto message types with Enums, Groups and Unions are not currently supported by the translation system, and only std::vector and std::array containers are supported, mapping to the Cap'n Proto List type.
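
The three accessor kinds accepted as the third element can all be called through one uniform interface in standard C++17, using std::invoke. The sketch below shows only that language mechanism and makes no claim about how MADARA_CAPN_MEMBERS is implemented. field_of is a hypothetical helper.

```cpp
#include <functional>
#include <utility>

struct Point
{
  double x = 0, y = 0, z = 0;
  double &getY() { return y; }
};

// Hypothetical helper: resolves a field through a lambda, a
// pointer-to-member-function, or a pointer-to-data-member.
// std::invoke handles all three and yields a reference in each case.
template<typename Accessor>
double &field_of(Point &p, Accessor &&accessor)
{
  return std::invoke(std::forward<Accessor>(accessor), p);
}
```

For a Point p, the calls field_of(p, [](Point &q) -> double & { return q.x; }), field_of(p, &Point::getY), and field_of(p, &Point::z) each return a writable reference to the corresponding member.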

Packaging Custom Native Types

If your types will be used with the Java and Python ports of Madara, you should package them into a shared library, using your platform's initializer function feature for shared libraries to register the types on load. For example, on Linux or MacOSX:

#include "madara/knowledge/Any.h"

using namespace madara;
using namespace knowledge;

namespace geo
{
  struct Point
  {
    double x = 0, y = 0, z = 0;

    Point() = default;
    Point(double x, double y, double z = 0) : x(x), y(y), z(z) {}
  };

  template<typename Fun>
  void for_each_field(Fun &&fun, Point &val)
  {
    fun("x", val.x);
    fun("y", val.y);
    fun("z", val.z);
  }

  void register_types()
  {
    Any::register_type<Point>("Point");
  }
}

extern "C" void madara_register_test_types(void) __attribute__((constructor));

extern "C" void madara_register_test_types(void)
{
  geo::register_types();
}

Compiled as a shared library, this would allow Java and Python to load and use the Point type in their port of Any.
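
One point to watch with load-time initializers: with GCC, the order in which constructor-attribute functions and the constructors of namespace-scope C++ objects run is not generally guaranteed, so state touched during registration is safest behind a function-local static. The sketch below illustrates that pattern with a hypothetical registered_tags() list standing in for a real type registry. It relies on the GCC/Clang constructor attribute, so it applies to Linux and macOS toolchains only.

```cpp
#include <string>
#include <vector>

// Hypothetical registry used only for this sketch. A function-local
// static is constructed on first use, so it is safe to touch from a
// load-time initializer.
std::vector<std::string> &registered_tags()
{
  static std::vector<std::string> tags;
  return tags;
}

// GCC/Clang extension: run this function when the executable or shared
// library is loaded, before main() starts.
extern "C" void register_on_load(void) __attribute__((constructor));

extern "C" void register_on_load(void)
{
  registered_tags().push_back("Point");
}
```

By the time main() runs, or a host process finishes loading the library, registered_tags() already contains "Point".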

Registering in MADARA tools

Tools like stk_inspect and karl can be configured to load and register Capnp types for use in inspecting variables and values. To use these tools with Capnp types, pass the following options, in this order:

-ni <schema directory>:              the schema directory in which to find a Capnp type
-n <tag:type>:                       the tag is the type name as registered in a MADARA knowledge base; the type is the name of the Capnp type as defined in a schema file
-nf <schema directory>/<type file>:  the schema file to load

An example loading:

stk_inspect -ni $GAMS_ROOT/src/gams/types -n Imu:Imu -nf $GAMS_ROOT/src/gams/types/Imu.capnp