aremmell/emblob

emblob

A CLI tool to be used as part of a build toolchain which generates linker input object files containing arbitrary files as binary objects (blobs) with accompanying C/C++ header files providing the necessary code to access the blob's data at runtime.

To put it in simple terms, emblob allows you to include files in your executable, such as (but not limited to):

  • Videos, images, sound effects/music tracks
  • Configuration files (JSON/YAML/TOML/INI/XML, etc.)
  • Version information (commit hash, time/date, build machine, etc.)
  • Web payloads (HTML/CSS/JS, etc.) for web-based GUIs
  • Public keys/certificates

In fact, any file can be embedded using emblob. Upon deployment of your application, you can utilize the automatically generated code to gain direct access to the raw data (on a per-file basis) of the files that are embedded in your executable (blobs). You can then do whatever you wish with this data: extract to disk, stream over a network connection, display as part of a GUI, and more.

emblob is known to work with versions of GCC and Clang which implement the C++20 standard on macOS, Linux, and FreeBSD (x64 and aarch64).

Windows/MSVC support is a possibility, if I get bored enough.

VS Code and the CMake Tools extension

  1. Load emblob.code-workspace and bring up the command palette
  2. Select CMake: Select Configure Preset and choose debug or release
  3. Select CMake: Build

CMake from the command-line:

  1. cmake -S . -B build
  2. cmake --build build --target emblob --target simple --target struct --clean-first

The CMake configuration is two-stage; it compiles emblob in the build directory, then it executes emblob with two separate sample input files, both located in the examples directory. For each of these input files, the following build products are generated (where {name} is the basename of the input file):

  • {name}.S: A linker assembly file containing instructions for the linker to embed the source file into {name}.o
  • {name}.o: A linker input object file which contains {name}.bin as a binary blob
  • emblob_{name}.h: A C/C++ header file containing routines to access binary blob data

Following the creation of these files, two example programs whose source code may also be found in the examples directory are compiled and linked with the object file generated by emblob. See example programs.

The name of the generated C and C++ compatible header file is dependent upon the value passed to emblob on the command-line for the --outfile/-o option. If you do not specify a value for this option, the basename of the input file (--infile/-i) is used. The header file's name will always be in the format emblob_{outfile}.h:

  • emblob -i gorilla.html → emblob_gorilla.h
  • emblob -i gorilla.html -o chimpanzee → emblob_chimpanzee.h
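
The naming rule above can be sketched in a few lines of C++. This is only an illustration, not emblob's own code: header_name is a hypothetical helper, and it assumes that "basename" means the input file's name with its directory and final extension removed, which matches the gorilla.html example but may differ from emblob's behavior for names with several dots.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of emblob): predicts the generated header's
// name from the --infile value and an optional --outfile value.
inline std::string header_name(const std::string& infile,
                               const std::string& outfile = "") {
    std::string base = outfile;
    if (base.empty()) {
        // Assumption: strip any leading directories...
        const std::size_t slash = infile.find_last_of('/');
        base = (slash == std::string::npos) ? infile : infile.substr(slash + 1);
        // ...and the final extension, keeping dotfiles such as ".config" intact.
        const std::size_t dot = base.rfind('.');
        if (dot != std::string::npos && dot != 0) {
            base.erase(dot);
        }
    }
    assert(!base.empty());
    return "emblob_" + base + ".h";
}
```

Calling header_name("gorilla.html") yields emblob_gorilla.h, and header_name("gorilla.html", "chimpanzee") yields emblob_chimpanzee.h, mirroring the two cases listed above.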

All generated functions have names in the format emblob_get_{outfile}_{what} where {what} represents the function's specific task or return type. All functions are declared static inline (and extern "C" if __cplusplus is defined).

  1. uint64_t emblob_get_{outfile}_size()

    Returns the size of the embedded blob, in bytes.

  2. const uint8_t* emblob_get_{outfile}_8()

    Returns a pointer to the embedded blob that may be used to access the blob's data one byte (8-bits) at a time.

  3. const uint16_t* emblob_get_{outfile}_16()

    Returns a pointer to the embedded blob that may be used to access the blob's data two bytes (16-bits) at a time.

  4. const uint32_t* emblob_get_{outfile}_32()

    Returns a pointer to the embedded blob that may be used to access the blob's data four bytes (32-bits) at a time.

  5. const uint64_t* emblob_get_{outfile}_64()

    Returns a pointer to the embedded blob that may be used to access the blob's data eight bytes (64-bits) at a time.

  6. const void* emblob_get_{outfile}_raw()

    Returns a pointer to the embedded blob that may be used to access the blob's data arbitrarily.
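
As a rough sketch of how these functions might be used together, the snippet below copies a blob into a std::string using the size and raw accessors. In a real program the two emblob_get_gorilla_* functions would come from including emblob_gorilla.h and linking gorilla.o; here they are replaced by clearly marked stand-ins so the snippet compiles on its own. The blob contents and the gorilla_as_string helper are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// STAND-INS for the generated header. A real build would instead use
// #include "emblob_gorilla.h" and link against gorilla.o.
static const std::uint8_t gorilla_stand_in[] = {'<', 'h', '1', '>', 'h', 'i',
                                                '<', '/', 'h', '1', '>'};
static inline std::uint64_t emblob_get_gorilla_size() {
    return sizeof(gorilla_stand_in);
}
static inline const void* emblob_get_gorilla_raw() {
    return gorilla_stand_in;
}

// Hypothetical helper: copies the blob into a std::string. The blob is not
// NUL-terminated, so the length must always come from the size function.
inline std::string gorilla_as_string() {
    const char* data = static_cast<const char*>(emblob_get_gorilla_raw());
    assert(data != nullptr);
    return std::string(data, static_cast<std::size_t>(emblob_get_gorilla_size()));
}
```

The same pattern applies to the typed accessors; with those, the number of complete elements is the blob size divided by the element width, and any trailing bytes need separate handling.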

Last but not least, emblob generates a linker input object (.o) file. As is the case with the generated header, its name is derived from the --outfile/-o option and has the format {outfile}.o. This is the file that physically contains the contents of the embedded blob, and it must be linked into your executable in order to be useful.

Your build system likely has its own unique syntax and structure for achieving this, but for the purposes of this document, I will simply demonstrate how this can be done manually using a compiler frontend such as GCC or Clang:

Given a source code file named my_application.cpp and a linker input object file generated by emblob named blob.o, the following command will:

  1. Compile the source code file and generate a linker input object file named my_application.o
  2. Link together my_application.o and blob.o into an executable named my_application
  3. Run the new executable

c++ -c my_application.cpp && c++ -o my_application my_application.o blob.o && ./my_application

The C++ source code for the example programs can be found in the examples directory. I used a free online hex editor to create the example input files, but any hex editor will do (or you can write a program to generate them).

The simplest use case scenario: the input is a binary file containing just 15 bytes (with values 0x01 through 0x0f). The program prints the size of the embedded blob in bytes, then the value of each byte in hexadecimal format.
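
The behavior described for the simple example could look roughly like the sketch below. It is not the example's actual source code. The function names assume the blob was generated with an outfile of simple, and the two emblob_get_simple_* functions are stand-ins for the generated header so that the snippet is self-contained; hex_dump_simple is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// STAND-INS for emblob_simple.h: the 15 bytes 0x01 through 0x0f.
static const std::uint8_t simple_stand_in[15] = {
    0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
    0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f};
static inline std::uint64_t emblob_get_simple_size() {
    return sizeof(simple_stand_in);
}
static inline const std::uint8_t* emblob_get_simple_8() {
    return simple_stand_in;
}

// Hypothetical helper: formats the blob's size, then each byte as two
// lowercase hex digits followed by a space.
inline std::string hex_dump_simple() {
    const std::uint8_t* bytes = emblob_get_simple_8();
    const std::size_t size = static_cast<std::size_t>(emblob_get_simple_size());
    assert(bytes != nullptr);

    std::string out = "size: " + std::to_string(size) + " bytes\n";
    char buf[4]; // two hex digits, one space, and the terminating NUL
    for (std::size_t i = 0; i < size; ++i) {
        std::snprintf(buf, sizeof(buf), "%02x ", static_cast<unsigned>(bytes[i]));
        out += buf;
    }
    return out;
}
```

Printing the returned string with std::fputs or std::cout would reproduce the kind of output the paragraph above describes.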

A particularly useful property of the C language (and by extension, C++) is the ability to map the contents of an embedded blob directly onto a data structure, and vice versa: a data structure may be serialized to a file quite easily. This example program demonstrates how you can create a custom binary file, embed it as a blob, obtain a pointer to its data, cast it to an appropriate type, and access members as if the data structure had been initialized at compile-time.

Note: this example does not take into account the endianness of the system it is running on—the example file is in little-endian format.
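
One way to sidestep the endianness caveat is to decode each field byte by byte instead of casting the blob pointer. The sketch below shows that alternative under stated assumptions: the Record layout is invented for illustration (the real example's structure may differ), the function names assume an outfile of struct, and the emblob_get_struct_* functions are stand-ins for the generated header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical record layout; the struct used by the real example may differ.
struct Record {
    std::uint32_t magic;
    std::uint16_t version;
    std::uint16_t flags;
};
static_assert(sizeof(Record) == 8, "Record is expected to have no padding");

// STAND-INS for emblob_struct.h: one Record stored in little-endian order.
static const std::uint8_t struct_stand_in[8] = {
    0x0d, 0x0c, 0x0b, 0x0a, // magic   = 0x0a0b0c0d
    0x02, 0x00,             // version = 2
    0x01, 0x80};            // flags   = 0x8001
static inline std::uint64_t emblob_get_struct_size() {
    return sizeof(struct_stand_in);
}
static inline const std::uint8_t* emblob_get_struct_8() {
    return struct_stand_in;
}

// Assembles `width` little-endian bytes (at most 4) into an integer. The
// result is the same on little-endian and big-endian hosts.
inline std::uint32_t read_le(const std::uint8_t* p, std::size_t width) {
    assert(width <= 4);
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < width; ++i) {
        value |= static_cast<std::uint32_t>(p[i]) << (8 * i);
    }
    return value;
}

// Hypothetical helper: decodes the blob into a Record field by field.
inline Record load_record() {
    assert(emblob_get_struct_size() >= sizeof(Record));
    const std::uint8_t* p = emblob_get_struct_8();
    Record r;
    r.magic = read_le(p, 4);
    r.version = static_cast<std::uint16_t>(read_le(p + 4, 2));
    r.flags = static_cast<std::uint16_t>(read_le(p + 6, 2));
    return r;
}
```

The direct cast described above is shorter and avoids the copy, so it remains a reasonable choice when the file and every target machine share the same byte order and structure layout.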

Name         Short name  Description                                                                                   Default value
--infile     -i          The relative path of the file to embed as a binary blob.                                      N/A
--outfile    -o          The basename of the output files (e.g. 'foo' will result in foo.S, foo.o, and emblob_foo.h).  Basename of the input file
--log-level  -l          Sets the console logging verbosity: [debug, info, warning, error, fatal].                     info
--version    -v          Prints emblob version information.                                                            N/A
--help       -h          Prints emblob usage information.                                                              N/A

When choosing a compiler frontend, emblob will attempt to read the CC environment variable. If it is empty, emblob will execute cc.

In order to choose a specific compiler frontend, simply set the CC environment variable to the name of the desired compiler (e.g. 'clang').

To temporarily set or override the CC environment variable for the duration of emblob's execution:

env CC={compiler} emblob {args}
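
The selection rule described above amounts to a simple fallback. The following sketch illustrates that logic only; it is not taken from emblob's source, and choose_compiler and compiler_from_environment are hypothetical names. It treats an unset CC the same as an empty one, which is an assumption.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: picks the compiler frontend from a CC value,
// falling back to "cc" when the value is missing or empty.
inline std::string choose_compiler(const char* cc_value) {
    const std::string chosen =
        (cc_value == nullptr || *cc_value == '\0') ? "cc" : cc_value;
    assert(!chosen.empty());
    return chosen;
}

// Reads CC from the process environment; std::getenv returns a null
// pointer when the variable is not set.
inline std::string compiler_from_environment() {
    return choose_compiler(std::getenv("CC"));
}
```

With this logic, running under env CC=clang selects clang, while an empty or missing CC selects cc.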
