Mapping Recipes for C/C++ Libraries

Introduction

With JavaCPP and the help of the native C++ compiler and its toolchain, we can easily call native functions and access data from C/C++ libraries. Ideally, when creating bindings for a native library, we need to write only one configuration class in Java, such as the following:

import org.bytedeco.javacpp.*;
import org.bytedeco.javacpp.annotation.*;
import org.bytedeco.javacpp.tools.*;

@Properties(
    value = @Platform(
        includepath = {"/path/to/include/"},
        preloadpath = {"/path/to/deps/"},
        linkpath = {"/path/to/lib/"},
        include = {"NativeLibrary.h"},
        preload = {"DependentLib"},
        link = {"NativeLibrary"}
    ),
    target = "NativeLibrary"
)
public class NativeLibraryConfig implements InfoMapper {
    public void map(InfoMap infoMap) {
    }
}

This goes along with the following commands, which first parse the header files into the target NativeLibrary class and then link everything together:

$ java -jar javacpp.jar NativeLibraryConfig.java
$ java -jar javacpp.jar NativeLibrary.java

Note: Any modifications to the target class NativeLibrary will get overwritten. If you would like to write additional code manually, consider using a helper class.

Also note that the last call outputs a shared library into a subdirectory named after the platform (linux-x86_64, macosx-x86_64, windows-x86_64, etc.), under the directory where the target NativeLibrary.class is located, and also copies any dependent libraries there. These need to be bundled as resources for the target class, either as files or inside a JAR file. The Loader automatically extracts them into its cache when necessary. For platforms such as Android that already feature a native loader that expects to find the libraries in another directory, that directory can be specified in the platform properties under the platform.library.path entry. For reference, we can consult all the default platform properties in the source code resources here:

https://github.com/bytedeco/javacpp/blob/master/src/main/resources/org/bytedeco/javacpp/properties/

Additionally, it is possible to further simplify these build steps with Maven and the included Mojo plugin, as shown with the JavaCPP Presets, which were created based on the recipes detailed below, so they also serve as examples to follow:

https://github.com/bytedeco/javacpp-presets/wiki/Create-New-Presets

However, until the parsing capabilities of JavaCPP improve, probably by relying on a full C++ compiler front end such as Clang (see issue #51), these simple instructions alone will typically fail unless we tweak the @Properties and provide more Info to the InfoMap. The kind of Info that we need to craft depends greatly on the content of the header files. This guide is structured so that we only need to consult the sections (the recipes) relevant to the tasks we are interested in completing:

Providing properties for each platform
Including multiple header files
Ignoring attributes and macros
Defining macros and controlling their blocks
Mapping macros to fields or methods
Skipping lines from header files
Specifying names to use in Java
Mapping a declaration to custom code
Redefining the code of a macro
Writing additional code in a helper class
Creating instances of C++ templates
Defining wrappers for basic C++ containers
Using adapters for C++ container types
Dealing with abstract classes and virtual methods

InfoMap.java also comes with default entries that one should be aware of as they can provide a good reference for some of the tasks explained below.

Providing properties for each platform

To provide a different set of @Platform properties for each platform, we can pass an array of them to the @Properties annotation. The names given in @Platform(value = {"..."}, ...) are matched against the platform name using String.startsWith() so that, for example, @Platform(value = "android-arm", ...) matches both android-arm and android-arm64, but not android-x86. Each matching @Platform further down the list overrides the settings of any previous ones, leading to configuration files that look like this:

@Properties(value = {
    @Platform(
        includepath = {"/path/to/generic/include/"},
        linkpath = {"/path/to/generic/lib/"},
        include = {"NativeLibrary.h"},
        link = {"NativeLibrary"}
    ),
    @Platform(
        value = {"android", "ios"},
        includepath = {"/path/to/include/for/mobile/"},
        linkpath = {"/path/to/lib/for/mobile/"}
    ),
    @Platform(
        value = "windows-x86",
        include = {"NativeLibrary.h", "HacksForWindows.h"},
        link = {"NativeLibraryForWindows"}
    )},
    // ...
)

It is not currently possible to aggregate the settings from multiple @Platform annotations. Matching platform names only with their prefixes does not always offer enough flexibility, so we might want to revisit this and allow more sophisticated ways to perform matching using, for example, regular expressions. Nevertheless, it is already possible to work around this with the BuildEnabled and LoadEnabled interfaces.
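The matching and overriding rules above can be sketched in plain Java. This is only a simplified model for illustration, not code from JavaCPP: the class PlatformMatchSketch, its Entry type, and the resolve() method are hypothetical names, and the sketch assumes that an entry without names applies to every platform, as the generic first @Platform in the example suggests.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified model of how @Platform entries are selected and merged; not JavaCPP code. */
class PlatformMatchSketch {
    /** Hypothetical stand-in for one @Platform annotation. */
    static final class Entry {
        final List<String> names;            // like @Platform(value = {...})
        final Map<String, String> settings;  // like linkpath, link, ...

        Entry(List<String> names, Map<String, String> settings) {
            this.names = names;
            this.settings = settings;
        }

        boolean matches(String platform) {
            // Assumption: an entry without names is generic and applies everywhere.
            return names.isEmpty() || names.stream().anyMatch(platform::startsWith);
        }
    }

    /** Each property of a later matching entry replaces the earlier value. */
    static Map<String, String> resolve(List<Entry> entries, String platform) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Entry entry : entries) {
            if (entry.matches(platform)) {
                result.putAll(entry.settings);
            }
        }
        return result;
    }

    // Mirrors part of the configuration shown above.
    static final List<Entry> ENTRIES = List.of(
        new Entry(List.of(), Map.of("linkpath", "/path/to/generic/lib/", "link", "NativeLibrary")),
        new Entry(List.of("android", "ios"), Map.of("linkpath", "/path/to/lib/for/mobile/")),
        new Entry(List.of("windows-x86"), Map.of("link", "NativeLibraryForWindows")));

    public static void main(String[] args) {
        for (String platform : List.of("linux-x86_64", "android-arm64", "windows-x86_64")) {
            System.out.println(platform + " -> " + resolve(ENTRIES, platform));
        }
    }
}
```

In this model, windows-x86_64 is matched by the windows-x86 prefix, so it picks up the Windows link value while keeping the generic linkpath, and a later value replaces an earlier one rather than being appended to it.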

Including multiple header files

The Parser class is responsible for parsing the header files found in the list of @Platform(include = {...}, ...) annotation values. Its preprocessor currently does not honor the #include directive: since it is generally unreliable to blacklist all the system header files we should not be mapping, it opts for a whitelist approach instead. This also prevents issues with circular inclusions and lets users specify exactly in which order the content should appear in the target class, even though this is not something C/C++ developers usually have to think about. For example, given the following two header files:

types.h

struct Data {
    // ...
};

functions.h

#include "types.h"

void function(Data data);

We would need to specify them both in this order: @Platform(include = {"types.h", "functions.h"}, ...). That needs to be done recursively for all header files, basically doing a topological sort manually (so there is obviously room for improvement here).
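When a library has many headers, the ordering can be worked out with a small script instead of by hand. The following is a minimal sketch of a depth-first topological sort, assuming we have already collected, by whatever means, a map from each header we want to include to the headers it includes. The class HeaderOrderSketch and its order() method are hypothetical and not part of JavaCPP.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Orders header files so that each one appears after the headers it includes. */
class HeaderOrderSketch {
    static List<String> order(Map<String, List<String>> includes) {
        Set<String> done = new LinkedHashSet<>();      // headers already placed, in order
        Set<String> visiting = new LinkedHashSet<>();  // headers on the current path
        for (String header : includes.keySet()) {
            visit(header, includes, visiting, done);
        }
        return new ArrayList<>(done);
    }

    private static void visit(String header, Map<String, List<String>> includes,
                              Set<String> visiting, Set<String> done) {
        if (done.contains(header)) {
            return;
        }
        if (!visiting.add(header)) {
            throw new IllegalStateException("Circular inclusion involving " + header);
        }
        // Place everything this header depends on before the header itself.
        for (String dependency : includes.getOrDefault(header, List.of())) {
            visit(dependency, includes, visiting, done);
        }
        visiting.remove(header);
        done.add(header);
    }

    public static void main(String[] args) {
        Map<String, List<String>> includes = new LinkedHashMap<>();
        includes.put("functions.h", List.of("types.h"));
        includes.put("types.h", List.of());
        System.out.println(order(includes)); // prints [types.h, functions.h]
    }
}
```

The resulting list can then be copied into the include value of the @Platform annotation. The map should only mention headers we actually want to map, since every dependency listed in it ends up in the output.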

Note: If the header files are in C and not contained within an extern "C" { } block, we need to list those in the @Platform(cinclude = { ... }, ...) annotation values instead.

Ignoring attributes and macros

One of the things the Parser tries to do is to translate #define macros into final variables in Java. It attempts to guess the type it should use, and it works well most of the time, but it fails often enough that one of the first things we need to do to fix parsing errors is to ignore the macros we are not interested in translating. It also often trips over compiler attributes, which can be used almost anywhere in the declarations. This includes, but is not limited to, things like calling conventions, memory alignment preferences, library import/export directives, assertions, exception handling, preconditions, and postconditions. One common pattern involves using macros to abstract away attributes that have similar meaning between compilers but that have different names, for example:

#ifdef _WIN32
#define EXPORTS  __declspec(dllexport)
#define NOINLINE __declspec(noinline)
#elif defined(__GNUC__)
#define EXPORTS  __attribute__((visibility ("default")))
#define NOINLINE __attribute__((noinline))
#else
#define EXPORTS
#define NOINLINE
#endif

In this case, we generally need to use this kind of Info to be able to parse the header files successfully:

infoMap.put(new Info("EXPORTS", "NOINLINE").cppTypes().annotations());

An empty but non-null Info.cppTypes list prevents the parser from trying to guess the type to assign to a variable, while an empty but non-null Info.annotations instructs it to consider it also like an attribute, but without any corresponding Java annotations, so its output is also empty.

Defining macros and controlling their blocks

There are two places where we can define a macro: In the @Platform(define = { ... }, ...) annotation values and with Info.define in the InfoMap. The first one is for the Generator, which simply outputs one #define line per string. The second one is used by the Parser to provide users with some generic control over which part of the file gets parsed. In this case, the conditional groups #if, #ifdef, and #ifndef do not get evaluated the usual way. The whole condition is matched as is with an Info to decide whether to parse the block or not. Further, if no Info matches, all blocks are parsed by default, regardless of the conditions. For example, a header file might already contain blocks like the following to prevent other tools like Doxygen or SWIG from tripping on some tricky piece of code:

#if !defined(DOXYGEN) && !defined(SWIG)
    // ...
#endif

JavaCPP will most likely have issues with these blocks as well, so it would be wise to add the following:

infoMap.put(new Info("!defined(DOXYGEN) && !defined(SWIG)").define(false));

However, we do not wish to skip those blocks at compile time, so we do not add them to a @Platform annotation, but we might want to define there other macros such as NDEBUG or USE_OPENMP to enable inlining of functions, parallel processing, etc, for example: @Platform(define = {"NDEBUG 1", "USE_OPENMP 1"}, ...).
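To make the matching rule concrete, here is a small model in plain Java of the behavior described above: the condition text is looked up verbatim, and a block is parsed unless an entry says otherwise. It is a sketch of the rule as stated in this section, not the Parser's implementation, and the class ConditionMatchSketch with its define() and shouldParse() methods is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of how conditional blocks are selected for parsing; not JavaCPP code. */
class ConditionMatchSketch {
    // Hypothetical stand-in for InfoMap entries created with Info.define(...).
    private final Map<String, Boolean> defines = new HashMap<>();

    ConditionMatchSketch define(String condition, boolean value) {
        defines.put(condition, value);
        return this;
    }

    /** The condition is not evaluated, only looked up as is; unknown conditions are parsed. */
    boolean shouldParse(String condition) {
        return defines.getOrDefault(condition, true);
    }

    public static void main(String[] args) {
        ConditionMatchSketch sketch = new ConditionMatchSketch()
                .define("!defined(DOXYGEN) && !defined(SWIG)", false);
        System.out.println(sketch.shouldParse("!defined(DOXYGEN) && !defined(SWIG)"));
        System.out.println(sketch.shouldParse("defined(_WIN32)"));
    }
}
```

Under this reading, a logically equivalent condition written differently, such as !defined(SWIG) && !defined(DOXYGEN), would not match the entry and would still be parsed, so each distinct condition string found in the headers needs its own Info.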

Mapping macros to fields or methods

Another thing that we might want to do with macros is to have them available as variables or methods. By default, macros that look like constants that can be translated easily into correct Java syntax will result in a public static final variable, for example:

#define VERSION MAJOR "." MINOR

By default, this gets translated into:

public static final String VERSION = MAJOR + "." + MINOR;

But if MAJOR or MINOR are not actually defined, or if they are defined to some type other than String, we will get a Java compilation error. Using the following Info, we can instead consider this macro as a function returning a value in the given C++ type:

infoMap.put(new Info("VERSION").cppTypes("const char*").translate(false));

Function-like macros do not get mapped to Java by default. However, after providing the C++ types, we will get methods to call them, for example:

#define SQUARE(x) x * x

With this Info:

infoMap.put(new Info("SQUARE").cppTypes("double", "double"));

This gives us:

public static native double SQUARE(double x);

Skipping lines from header files

When macros cannot be (mis)used to skip over just the right portions of header files, we can match the lines themselves against regular expressions. All we might have to go with could be comments, such as these ones, for example:

// START COMPLEX DECLARATIONS
// ...
// END COMPLEX DECLARATIONS

In this case, we could skip these lines with this Info, using the patterns that mark the start and the end of the sections, respectively:

infoMap.put(new Info("filename.h").linePatterns("// START COMPLEX DECLARATIONS", "// END COMPLEX DECLARATIONS").skip());

Note that the strings need to be regular expressions. Moreover, the remaining lines must not contain any syntax errors introduced by the lines skipped. Further, without Info.skip, this works in reverse, whitelisting the lines to parse instead.
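Because the markers are regular expressions, comment markers that contain characters such as * or ( need escaping. The sketch below models the skipping behavior in plain Java with java.util.regex, under two assumptions that are not confirmed by the text above: that a pattern has to match a whole line, and that the marker lines themselves are skipped along with the lines between them. The class LineSkipSketch is hypothetical and is not the Parser's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Simplified model of skipping header lines between two patterns; not JavaCPP code. */
class LineSkipSketch {
    static List<String> skip(List<String> lines, String startRegex, String endRegex) {
        Pattern start = Pattern.compile(startRegex);
        Pattern end = Pattern.compile(endRegex);
        List<String> kept = new ArrayList<>();
        boolean skipping = false;
        for (String line : lines) {
            if (!skipping && start.matcher(line).matches()) {
                skipping = true;   // assumption: the start marker itself is dropped
            }
            if (!skipping) {
                kept.add(line);
            }
            if (skipping && end.matcher(line).matches()) {
                skipping = false;  // assumption: the end marker is dropped too
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> header = List.of(
                "int a;",
                "/* START (complex) */",
                "template<class T> struct Tricky;",
                "/* END (complex) */",
                "int b;");
        // Pattern.quote() turns a literal marker into a safe regular expression.
        System.out.println(skip(header,
                Pattern.quote("/* START (complex) */"),
                Pattern.quote("/* END (complex) */")));
    }
}
```

The same Pattern.quote() call can be used to build the strings passed to Info.linePatterns when a marker contains regular expression metacharacters.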

Besides skipping line patterns, it is also possible to skip individual variable definitions:

infoMap.put(new Info("FFI_SYSV", "FFI_THISCALL", "FFI_FASTCALL", "FFI_STDCALL", "FFI_PASCAL", "FFI_REGISTER", "FFI_MS_CDECL").skip());

Specifying names to use in Java

By default, the Parser tries to use the same name as the C/C++ identifiers for the fields and methods of the peer classes, but it is possible to change them. In general, for struct, class, or union we can use Info.pointerTypes, while for others such as member variables and functions we use Info.javaNames, like this:

infoMap.put(new Info("full::namespace::TypeNameInCPP").pointerTypes("ClassNameInJava"));
infoMap.put(new Info("full::namespace::FunctionNameInCPP").javaNames("MethodNameInJava"));
infoMap.put(new Info("full::namespace::operator +(ns1::TypeA*, ns2::TypeB&)").javaNames("AddNameInJava"));

Note: For operator functions, the name needs to include one whitespace after the operator keyword, as in operator + above. Function parameters are in general optional, but if given, they must not include parameter names, one whitespace must follow each comma, and there must be no whitespace before or after *, &, (, or ). Moreover, the types should not be typedef aliases, but the real underlying type names. This is only a current limitation of the parser, not an inherent issue with how InfoMap can and should work.

Regarding typedef, since there is no equivalent in Java, the parser will always use the underlying type, whenever possible, but it only works for simple cases. One common pattern for C libraries is to alias struct pointers to another name, for example:

struct DataStruct { /* ... */ };
typedef struct DataStruct* DataHandle;

Although the parser should probably handle these situations better by default, for now, we need to provide this kind of Info to have it mapped in the expected way:

infoMap.put(new Info("DataStruct").pointerTypes("DataHandle"));
infoMap.put(new Info("DataHandle").valueTypes("DataHandle").pointerTypes("@Cast(\"DataHandle*\") PointerPointer", "@ByPtrPtr DataHandle"));

It is also possible to change the parent class of a Pointer subclass using Info.base, as long as the type we provide extends Pointer. It can also be Pointer itself, to force it back in the case where we are not interested in the parent class, for example:

infoMap.put(new Info("ChildClass").base("Pointer"));

Mapping a declaration to custom code

Sometimes the parser fails miserably, with no way to rectify the situation using additional Info. In this case, it is possible to provide custom Java code, which the parser will output as is, using Info.javaText. For example, setting a member variable in C++ may not always be possible, because of deleted functions and the like, which the parser is currently unable to understand. Although we could use Info.skip to ignore the field completely, we could also allow read-only access with an Info like this:

infoMap.put(new Info("DataStruct::aReadOnlyField").javaText("public native @MemberGetter @Const @ByRef FieldType aReadOnlyField();"));

Redefining the code of a macro

In the case of macros, it is also possible to redefine their entire content before they actually get processed. This might be useful, for example, when there is a function-like macro that appends a calling convention, an export directive, and other attributes that cause problems for the parser. In that case, we can nullify the macro with an Info.cppText like this:

infoMap.put(new Info("DECORATE").cppText("#define DECORATE(returnType) returnType"));

Writing additional code in a helper class

If the parser does not fail, but does not get it quite right, or if we want to provide additional functionality specific to Java, such as custom deallocators with Pointer.DeallocatorReference, we can place that code in a helper class. For a library named NativeLibrary, it might look like this:

import org.bytedeco.javacpp.*;
import org.bytedeco.javacpp.annotation.*;

public class NativeLibraryHelper extends NativeLibraryConfig {
    /** Registers a custom deallocator when the user calls our DataHandle.create(). */
    public static abstract class AbstractDataHandle extends Pointer {
        protected static class ReleaseDeallocator extends NativeLibrary.DataHandle implements Pointer.Deallocator {
            ReleaseDeallocator(NativeLibrary.DataHandle p) { super(p); }
            @Override public void deallocate() { NativeLibrary.releaseData(this); }
        }

        public AbstractDataHandle(Pointer p) { super(p); }

        public static NativeLibrary.DataHandle create() {
            NativeLibrary.DataHandle p = NativeLibrary.createData();
            if (p != null) {
                p.deallocator(new ReleaseDeallocator(p));
            }
            return p;
        }
    }

    public static void customDataMethod(NativeLibrary.DataHandle p) { /* ... */ }
}

And then the only other thing we need to specify is the fully qualified name of that class in the @Properties(..., helper = "...") annotation value:

@Properties(
    // ...
    target = "NativeLibrary",
    helper = "NativeLibraryHelper"
)
public class NativeLibraryConfig implements InfoMapper {
    public void map(InfoMap infoMap) {
        infoMap.put(new Info("DataStruct").pointerTypes("DataHandle").base("AbstractDataHandle"));
        // ...
    }
}

This allows the target class to inherit from the helper class, such that we can refer from the target class to any method or class defined in the helper class, as well as vice versa.

Creating instances of C++ templates

With C++ templates, it is not usually obvious which types should be used to create instances, and further how to name them, so we need to specify them manually. Fortunately, it is typically quite straightforward, in a manner similar to Specifying names to use in Java, again using Info.pointerTypes for data structures and Info.javaNames for functions, for example:

infoMap.put(new Info("data::Blob<float>").pointerTypes("FloatBlob"));
infoMap.put(new Info("data::Blob<double>").pointerTypes("DoubleBlob"));
infoMap.put(new Info("processor::process<double,data::Blob<float> >").javaNames("processFloatBlob"));
infoMap.put(new Info("processor::process<double,data::Blob<double> >").javaNames("processDoubleBlob"));

Note: Because of the current state of the parser, we need a whitespace between each pair of >, but there should not be any whitespace after commas between template arguments. Again, this is only a limitation of the current implementation of the parser, not an inherent issue with how InfoMap can and should work.
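Since these spacing rules are easy to get wrong by hand, a tiny helper can rewrite a template name into the expected form before it is passed to Info. The class TemplateNameSketch and its normalize() method below are hypothetical, not something JavaCPP provides, and the sketch only handles the two rules from the note above.

```java
/** Hypothetical helper that formats template names following the two spacing rules above. */
class TemplateNameSketch {
    static String normalize(String cppName) {
        // Caveat: this naive version would also split the shift operators
        // (operator>>), so it should only be applied to template type names.
        return cppName
                .replaceAll(",\\s+", ",")     // no whitespace after commas
                .replaceAll(">(?=>)", "> ");  // one whitespace between consecutive '>'
    }

    public static void main(String[] args) {
        System.out.println(normalize("processor::process<double, data::Blob<float>>"));
        // prints processor::process<double,data::Blob<float> >
    }
}
```

It could then be used as infoMap.put(new Info(TemplateNameSketch.normalize("data::Wrapper<data::Blob<float>>")).pointerTypes("FloatBlobWrapper")), where data::Wrapper and FloatBlobWrapper are made-up names for illustration.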

Defining wrappers for basic C++ containers

While containers such as std::vector and std::map are just templates, their definitions are quite complex and vary depending on the C++ compiler, so they are not portable. The Parser instead provides a set of common features for those basic containers. As with normal templates, we need to create instances manually with an Info for each, but to create a peer class, we also need to set Info.define, for example:

infoMap.put(new Info("std::vector<data::Blob<float> >").pointerTypes("FloatBlobVector").define());
infoMap.put(new Info("std::map<std::string,data::Blob<float> >").pointerTypes("StringFloatBlobMap").define());

The list of supported basic containers includes by default the ones listed in InfoMap.java, but it is also possible to append to that list other similar templates this way:

infoMap.put(new Info("basic/containers").cppTypes("templates::MyMap", "templates::MyVector"));

Using adapters for C++ container types

For some standard C++ container types, it is sometimes preferable to use an adapter to map them to existing Java types. The Generator provides a few adapters by default for std::string, std::wstring, std::vector, std::shared_ptr, and std::unique_ptr. Therefore, by default, the Parser maps those types directly to both Pointer types and standard Java types (String, int[], etc) using the corresponding annotations @StdString, @StdWString, @StdVector, @SharedPtr and @UniquePtr, as given in the defaults found in InfoMap.java. For @SharedPtr and @UniquePtr, since the namespace may sometimes be boost or std, we need to specify it in the @Platform annotation like this:

@Platform(compiler = "cpp11", define = {"SHARED_PTR_NAMESPACE std", "UNIQUE_PTR_NAMESPACE std"}, ... )

Be aware that existing Java types have limitations, for example, Java arrays cannot be resized while std::vector can. With these settings, the output can look like the following:

public static native void transform(@SharedPtr DataHandle arg0, @StdVector int[] parameters);

Users can create more adapters by themselves, and use them with the @Adapter annotation, either directly or on newly created annotations. At a minimum, we basically need to define a C++ class template with:

a constructor taking a const pointer (which can be an array) to the values, the size (which can be always 0 or 1 for some containers), and an owner pointer of the container itself (which may be null or equal to the value pointer),
an assign() method with the same set of parameters, but not const,
another constructor taking a reference to an existing container object, which can be an rvalue reference if required,
a static void deallocate(void *owner) method to call the destructor,
appropriate cast operators to return types needed by function calls, along with
member variables named ptr, size and owner, which basically mirror the state of the container, but outside of the container.

Each adapter instance is short-lived, so we cannot rely on the fields for anything that should persist. For example, the adapter required for a smart pointer similar to std::shared_ptr may look like this:

template<class T> class SmartPtrAdapter {
public:
    SmartPtrAdapter(const T* ptr, int size, void *owner) :
        ptr((T*)ptr),
        size(size),
        owner(owner),
        smartPtr2(owner != NULL && owner != ptr ? *(smart_ptr<T>*)owner : smart_ptr<T>((T*)ptr)),
        smartPtr(smartPtr2) { }
    SmartPtrAdapter(const smart_ptr<T>& smartPtr) :
        ptr(0),
        size(0),
        owner(0),
        smartPtr2(smartPtr),
        smartPtr(smartPtr2) { }
    void assign(T* ptr, int size, void* owner) {
        this->ptr = ptr;
        this->size = size;
        this->owner = owner;
        this->smartPtr = owner != NULL && owner != ptr ? *(smart_ptr<T>*)owner : smart_ptr<T>((T*)ptr);
    }
    static void deallocate(void* owner) {
        delete (smart_ptr<T>*)owner;
    }
    operator T*() {
        ptr = smartPtr.get();
        if (owner == NULL || owner == ptr) {
            owner = new smart_ptr<T>(smartPtr);
        }
        return ptr;
    }
    operator smart_ptr<T>&() {
        return smartPtr;
    }
    operator smart_ptr<T>*() {
        return ptr ? &smartPtr : 0;
    }
    T* ptr;
    int size;
    void* owner;
    smart_ptr<T> smartPtr2;
    smart_ptr<T>& smartPtr;
};

Along with the following annotation and Info.annotations:

@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD, ElementType.PARAMETER})
@Adapter("SmartPtrAdapter")
public @interface SmartPtr {
    /** template type */
    String value() default "";
}

// ...

infoMap.put(new Info("ns::smart_ptr").skip().annotations("@SmartPtr"));

Dealing with abstract classes and virtual methods

For abstract classes, or other classes that cannot be instantiated for reasons the Parser may not understand, such as deleted constructors, we can skip over the constructors with Info.purify. For classes containing virtual methods that we would like to override in Java, we can use Info.virtualize to have the parser annotate the methods with @Virtual annotations, which lets the Generator output the necessary machinery to get this working using a hidden concrete implementation and JNI callbacks. For this reason, we should not activate both settings together for abstract classes with pure virtual functions that end users need to implement, for example:

class Logger {
    protected:
    virtual void log(const std::string& message) = 0;
    virtual ~Logger() {}
};