James Edmondson edited this page Apr 3, 2020 · 29 revisions

TOOL DEPRECATED

As of 3.3.0, this tool is deprecated. We removed it because of complications with Boost long-term support and the requirement of Boost filesystem, which was causing issues with the move to UE4 as a simulation environment for GAMS. If you are interested in this feature, let us know, and we will try to prioritize its reinclusion. However, there is currently no plan to resurrect this tool.

The last release that includes this tool is v3.2.3.


The MADARA File Service

The MADARA File Service (MFS) is a tool included in MADARA that offers a request/response model for remote operators to retrieve files from an operational agent. The service includes features for compression, encryption, bandwidth monitoring, and other standard MADARA quality-of-service features, all available through command line options.

MFS Overview

Update: File contents are now sent as a madara::knowledge::containers::Vector instead of as a single large binary blob. This lets the MFS fragment a file into 60KB messages (the default size), so that all supported transports can transfer large files reliably and requesters can show progress bars on large file transfers.
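As a rough illustration of the fragmenting idea (not the MFS implementation), the sketch below splits a byte string into fixed-size chunks and reports progress as a fraction of fragments received. The names split_into_fragments and progress are hypothetical, and reading "60KB" as 60 * 1024 bytes is an assumption.

```python
# Assumption: "60KB" is interpreted as 60 * 1024 bytes.
FRAGMENT_SIZE = 60 * 1024


def split_into_fragments(data, fragment_size=FRAGMENT_SIZE):
    """Split data into consecutive chunks of at most fragment_size bytes."""
    if fragment_size <= 0:
        raise ValueError("fragment_size must be positive")
    return [data[i:i + fragment_size]
            for i in range(0, len(data), fragment_size)]


def progress(received, total):
    """Fraction of fragments received so far, between 0.0 and 1.0."""
    return received / total if total else 1.0


payload = bytes(150 * 1024)              # a 150 KiB dummy file
fragments = split_into_fragments(payload)
```

Because each fragment is a separate entry, a receiver can count the entries it holds against the total to drive a progress bar.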



Sandboxes

The MFS does not allow retrieval of arbitrary files from the agent. Instead, it allows the user to designate sandboxes that are essentially walled-off gardens from the rest of the file system.

Sandbox Overview

Sandboxes are defined via command line options such as -0f, similar to the option in the karl command line tool. The following file is located in the repository at $MADARA_ROOT/examples/settings/mfs_sandboxes.mf and shows how to configure a sandbox on the filesystem.

.sandbox.projects=#expand_env ("$(HOME)/projects");
.sandbox.projects.name="Projects Repo";
.sandbox.projects.description="A collection of coding projects";
.sandbox.files=#expand_env ("$(HOME)/files");
.sandbox.files.recursive=true;
.sandbox.files.name="File Repo";
.sandbox.files.description="A collection of assorted files";

In the above, ids are specified as alphanumeric identifiers after .sandbox. The path is loaded from the variable referenced by .sandbox.{s.id}, where s.id is the sandbox id. Because this file is a KaRL script, there are lots of system calls such as #expand_env that can be used to be more expressive in the creation of these variables. name and description are strings that define these metadata characteristics of the sandbox. recursive is a flag that indicates whether or not the sandbox includes recursive file indexes that look inside of the directories within the sandbox path.

The sandboxes are loaded at startup and are rescanned at runtime for changes in the number of files, their sizes, and their last_modified times.
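The scanning behavior described above could look roughly like the following sketch, which indexes a directory and compares two scans. This is an illustration only: scan_sandbox and changed_files are hypothetical names, and treating a sandbox as non-recursive by default is an assumption based on the sample settings file.

```python
import os


def scan_sandbox(path, recursive=False):
    """Return {relative_name: (size, last_modified)} for files under path."""
    index = {}
    for root, dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            info = os.stat(full)
            index[os.path.relpath(full, path)] = (info.st_size,
                                                  int(info.st_mtime))
        if not recursive:
            dirs.clear()  # stop os.walk from descending into subdirectories
    return index


def changed_files(old, new):
    """Names that were added, removed, or whose size/timestamp differ."""
    return sorted(n for n in set(old) | set(new) if old.get(n) != new.get(n))
```

Calling scan_sandbox periodically and passing consecutive results to changed_files yields the files whose digest entries would need refreshing.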


Digests

The MFS periodically posts digests of the files available in each sandbox. These digests include the last_modified time, the file name, and the file size for all files that may be served.

The digests take the form of:

{a.prefix}.sandbox.{s.id}.file.{f.name}.last_modified={f.last_modified}
{a.prefix}.sandbox.{s.id}.file.{f.name}.size={f.size}
{a.prefix}.sandbox.{s.id}.file.{f.name}={f.last_modified}

Note that the digest includes a convenience duplication of the last_modified timestamp that can be used to simplify parsing the digests in order to craft responses.
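To make the digest layout concrete, here is a minimal sketch that builds the three entries per file from a plain dictionary. It only mirrors the key pattern shown above; digest_entries is a hypothetical helper, and the real service writes these values into a MADARA knowledge base rather than a Python dict.

```python
def digest_entries(prefix, sandbox_id, files):
    """Build digest key/value pairs.

    files maps a file name to a (size, last_modified) tuple.
    """
    entries = {}
    for name, (size, last_modified) in files.items():
        base = "{}.sandbox.{}.file.{}".format(prefix, sandbox_id, name)
        entries[base + ".last_modified"] = last_modified
        entries[base + ".size"] = size
        entries[base] = last_modified  # convenience duplicate of the timestamp
    return entries


digest = digest_entries("agent.0", "files", {"a.txt": (10, 1585900000)})
```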


Requests

Requests are initiated through changes in knowledge over the network. The MFS has on-receive filters that respond to any knowledge sent by other applications with the same MADARA endpoints and the same buffer filters (e.g., encryption and compression). Requests are enqueued into a buffer that sits between the on-receive filter and a response generator, which handles the requests in the queue.

A request takes the form of:
{a.prefix}.sync.sandbox.{s.id}.file.{f.name}={f.last_modified}

Where a.prefix is the agent prefix used by the MFS. s.id is the id of the sandbox where the file is located. f.name is the filename of the file in reference to the path of the sandbox denoted by s.id. f.last_modified is the timestamp when f.name was last modified, according to the file copy that the requester has on its own file system. This timestamp is useful for the MFS in that it can decide if a file update is actually useful. If its own timestamp is less than or equal to the requester timestamp, no file update is sent because the requester has the most recent copy.

This allows requesters to simply repeat their requests and the MFS will handle the situation gracefully.
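The timestamp rule can be summarized in a few lines. The sketch below is an illustration of the comparison described above, not the service's code: sync_request and should_send are hypothetical names, and using 0 as the timestamp when the requester has no local copy is an assumption.

```python
def sync_request(prefix, sandbox_id, name, local_last_modified=0):
    """Return the (key, value) pair a requester would set.

    Assumption: 0 stands for "no local copy yet".
    """
    key = "{}.sync.sandbox.{}.file.{}".format(prefix, sandbox_id, name)
    return key, local_last_modified


def should_send(server_last_modified, requester_last_modified):
    """True only when the service's copy is strictly newer."""
    return server_last_modified > requester_last_modified


key, value = sync_request("agent.0", "files", "my_file", 1585900000)
```

Because an equal or newer requester timestamp results in no transfer, repeating the same request after a successful download costs only the request itself.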


Fragment Requests

You can also request that only specific fragments of a file be sent over the network. This is very useful if you requested a full file but only part of it arrived. Fragment requests are initiated in much the same way as full-file requests. The MFS has on-receive filters that respond to any knowledge sent by other applications with the same MADARA endpoints and the same buffer filters (e.g., encryption and compression). Fragment requests are enqueued into a buffer that sits between the on-receive filter and a response generator, which handles the requests in the queue. The main difference between a request and a fragment request is that the value of a fragment request is an integer array rather than a single integer holding the last modified timestamp.

A fragment request takes the form of:
{a.prefix}.sync.sandbox.{s.id}.file.{f.name}=[{f.frag#},{f.frag#}...]

Where a.prefix is the agent prefix used by the MFS. s.id is the id of the sandbox where the file is located. f.name is the filename of the file in reference to the path of the sandbox denoted by s.id. Each f.frag# is the number of a fragment of f.name that should be sent.

Once you've received all file fragments, you shouldn't need to request again.
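A requester needs to know which fragments are still missing before it can build the array. The sketch below shows one way to compute that list; missing_fragments and fragment_request are hypothetical helpers, and the numbering base is exposed as the first_index parameter because this page does not state it definitively.

```python
def missing_fragments(received, total, first_index=0):
    """Return the fragment numbers not yet received, in ascending order.

    received: iterable of fragment numbers already held.
    total: number of fragments the file was split into.
    first_index: assumed number of the first fragment.
    """
    have = set(received)
    return [i for i in range(first_index, first_index + total)
            if i not in have]


def fragment_request(prefix, sandbox_id, name, fragment_ids):
    """Return the (key, value) pair for a fragment request."""
    key = "{}.sync.sandbox.{}.file.{}".format(prefix, sandbox_id, name)
    return key, list(fragment_ids)


todo = missing_fragments({0, 1, 3}, total=5)
request = fragment_request("agent.0", "files", "my_file", todo)
```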


Request All

It is often useful to request all files within a sandbox. To do this, simply specify all_files in the sandbox with the sync request.

A request all takes the form of:
{a.prefix}.sync.sandbox.{s.id}.all_files=1

Where a.prefix is the agent prefix used by the MFS. s.id is the id of the sandbox where the files are located.


Responses

Responses are generated by the MFS and are basically just the contents of the file divided into fragments. All requesters may receive the same response, depending on connectivity of the agents and the types of configured transports. The file contents are set in the contents Vector container in a structure similar to the digest format variables last_modified and size.

The response takes the form of:

{a.prefix}.sandbox.{s.id}.file.{f.name}.contents.1={f.frag.1}
...
{a.prefix}.sandbox.{s.id}.file.{f.name}.contents.N={f.frag.N}
{a.prefix}.sandbox.{s.id}.file.{f.name}.contents.size={f.num_frags}
{a.prefix}.sandbox.{s.id}.file.{f.name}.crc={f.crc_32bit}

Where a.prefix is the agent prefix used by the MFS. s.id is the id of the sandbox where the file is located. f.name is the filename of the file in reference to the path of the sandbox denoted by s.id. f.frag.# is the fragmented contents of the file. f.num_frags is the number of fragments the file has been split into. f.crc_32bit is the 32 bit CRC hash that identifies the contents.
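On the receiving side, the fragments have to be joined in order and checked against the CRC. The following sketch shows the general idea using Python's zlib.crc32; whether the MFS computes its 32-bit CRC with the same polynomial is an assumption, and reassemble is a hypothetical helper rather than part of MADARA (FileFragmenter is the supported route).

```python
import zlib


def reassemble(fragments, expected_count, expected_crc=None):
    """Join fragments and optionally verify a CRC-32.

    fragments: {fragment_number: bytes}.
    Raises ValueError if fragments are missing or the CRC differs.
    """
    if len(fragments) != expected_count:
        raise ValueError("have {} of {} fragments".format(
            len(fragments), expected_count))
    data = b"".join(fragments[i] for i in sorted(fragments))
    if expected_crc is not None:
        # Assumption: the service uses the same CRC-32 variant as zlib.
        if (zlib.crc32(data) & 0xFFFFFFFF) != expected_crc:
            raise ValueError("CRC mismatch")
    return data
```

A ValueError from the fragment count check is the natural point at which to issue a fragment request for whatever is still missing.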


Delete Requests

Like file requests, delete requests are initiated through changes in knowledge over the network. The MFS has on-receive filters that respond to any knowledge sent by other applications with the same MADARA endpoints and the same buffer filters (e.g., encryption and compression). Delete requests are enqueued into a buffer that sits between the on-receive filter and a response generator, which handles the requests in the queue.

A delete request takes the form of:
{a.prefix}.delete.sandbox.{s.id}.file.{f.name}={f.last_modified}

Where a.prefix is the agent prefix used by the MFS. s.id is the id of the sandbox where the file is located. f.name is the filename of the file in reference to the path of the sandbox denoted by s.id. f.last_modified is the timestamp when f.name was last modified, according to the file copy that the requester has on its own file system. This timestamp is useful for the MFS in that it can decide if a file update is actually useful. If its own timestamp is less than or equal to the requester timestamp, no file update is sent because the requester has the most recent copy.

This allows requesters to simply repeat their requests and the MFS will handle the situation gracefully.


Delete All Requests

It is often useful to delete all files within a sandbox. To do this, simply specify all_files in the sandbox with the delete request.

A delete all request takes the form of:
{a.prefix}.delete.sandbox.{s.id}.all_files=1

Where a.prefix is the agent prefix used by the MFS. s.id is the id of the sandbox where the files are located.
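The sync and delete keys on this page share one shape, differing only in the action and in whether they target one file or all_files. A small helper makes that pattern explicit; mfs_key is a hypothetical name used only for illustration.

```python
def mfs_key(prefix, action, sandbox_id, name=None):
    """Build a sync or delete key for one file, or for all files if name is None."""
    if action not in ("sync", "delete"):
        raise ValueError("action must be 'sync' or 'delete'")
    target = "all_files" if name is None else "file." + name
    return "{}.{}.sandbox.{}.{}".format(prefix, action, sandbox_id, target)


delete_one = mfs_key("agent.0", "delete", "files", "my_file")
delete_all = mfs_key("agent.0", "delete", "files")
sync_all = mfs_key("agent.0", "sync", "files")
```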


Useful Command Line Args

Command line args for MFS can be printed out using -h or --help. Some of the more interesting command line arguments are:

-tdp|--transport-debug-prefix pfx : prefix in the knowledge base to save transport debug info
-sz|--send-hz hz                  : maximum messages per second (inf by default). If used, smoother
                                  : enforcement of sends per second (vs bursty enforcement of send
                                  : bandwidth)
-esb|--send-bandwidth bytes       : enforce send bandwidth of a certain number of bytes per second
--ssl password                    : encrypt with SSL (requires ssl feature during MADARA build)
--lz4                             : use LZ4 compression (requires lz4 feature during MADARA build)

Example Requester Invocation

$MADARA_ROOT/bin/mfs -0f $MADARA_ROOT/examples/settings/mfs_sandboxes.mf -m 239.255.0.1:4150 -esb 4000000 &
$MADARA_ROOT/bin/karl -i $MADARA_ROOT/examples/settings/mfs_request.mf -y 1 -ky -t 15 -m 239.255.0.1:4150
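For a sense of what a cap such as -esb 4000000 means in practice, the sketch below estimates a lower bound on transfer time. It is back-of-the-envelope arithmetic only: min_transfer_seconds is a hypothetical helper, and it assumes one 60 * 1024 byte fragment per message and ignores header, compression, and retransmission effects.

```python
import math


def min_transfer_seconds(file_size, send_bandwidth=None, send_hz=None,
                         fragment_size=60 * 1024):
    """Lower bound on transfer time under the given send limits.

    send_bandwidth: bytes per second cap, or None for no cap.
    send_hz: messages per second cap, or None for no cap.
    Assumption: each fragment travels in its own message.
    """
    fragments = math.ceil(file_size / fragment_size)
    limits = [0.0]
    if send_bandwidth:
        limits.append(file_size / send_bandwidth)
    if send_hz:
        limits.append(fragments / send_hz)
    return max(limits)


# A 100 MB file under the 4,000,000 bytes/second cap from the invocation above.
seconds = min_transfer_seconds(100000000, send_bandwidth=4000000)
```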

FileFragmenter Usage

The MADARA File Service (MFS) was written to allow programmatic and automated file transfer. To make it easier to receive file fragments and piece them back together, we have written a helper class called FileFragmenter. FileFragmenter can split a file into pieces (which is how MFS uses it) or reassemble the pieces from the knowledge base. Below are some examples of how to use the class in C++ and Python.

C++

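// kb is assumed to be an existing madara::knowledge::KnowledgeBase that has received the fragments
madara::knowledge::FileFragmenter fragmenter;
size_t received_size = fragmenter.from_kb(
  "agent.0.sandbox.files.file.my_file.contents", kb);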

Python

FileFragmenter example usage is below.

fragmenter = madara.knowledge.FileFragmenter()
received_size = fragmenter.from_kb("agent.0.sandbox.files.file.my_file.contents",kb)
file_size = fragmenter.file_size
file_contents = fragmenter.file_contents

2nd option: we have created an example file_receiver.py file that can be modified to receive files over ZMQ, Multicast, etc.

3rd option: we now include a function called get_file_progress, which works similarly to FileFragmenter.from_kb:

received_bytes = madara.utility.get_file_progress ("file/myimage.jpg", crc, file_size)
percentage = 100.0 * received_bytes / file_size