A Lua library for encoding and decoding the Erlang External Term Format.
Tested on Lua 5.1 - 5.4 and LuaJIT.
By default, this decodes and encodes with logic similar to erlpack, but decoders and encoders have options to customize how values are processed.
MIT (see file LICENSE).
Some files are third-party and contain their own licensing:
- csrc/thirdparty/bigint/bigint.h: Zero-clause BSD, details in source file.
- csrc/thirdparty/miniz/miniz.h: MIT license, details in csrc/thirdparty/miniz/LICENSE.
local decoder = etf.decoder(options) -- create a decoder
local decoded = decoder:decode('\131\116\0\0\0\2\100\0\1\97\97\1\100\0\1\98\97\2')
-- decoded will be a table like:
{ a = 1, b = 2 }
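To make the byte string in that example less opaque, here is a minimal pure-Lua sketch that walks it by hand. It is not how etf is implemented; `decode_tiny` is a hypothetical helper that understands only the three tags present in the sample (MAP_EXT = 116, ATOM_EXT = 100, SMALL_INTEGER_EXT = 97) plus the leading version byte 131.

```lua
-- Hypothetical helper, not part of etf: handles only MAP_EXT,
-- ATOM_EXT and SMALL_INTEGER_EXT, enough for the sample above.
local function decode_tiny(data)
  local pos = 1

  local function byte()
    local b = data:byte(pos)
    pos = pos + 1
    return b
  end

  -- read an n-byte big-endian unsigned integer
  local function uint(n)
    local v = 0
    for _ = 1, n do v = v * 256 + byte() end
    return v
  end

  local function term()
    local tag = byte()
    if tag == 97 then        -- SMALL_INTEGER_EXT: one unsigned byte
      return byte()
    elseif tag == 100 then   -- ATOM_EXT: 2-byte length, then the name
      local len = uint(2)
      local name = data:sub(pos, pos + len - 1)
      pos = pos + len
      return name
    elseif tag == 116 then   -- MAP_EXT: 4-byte pair count, then key/value pairs
      local out = {}
      for _ = 1, uint(4) do
        local key = term()
        out[key] = term()
      end
      return out
    end
    error('unsupported tag ' .. tostring(tag))
  end

  assert(byte() == 131, 'expected version byte 131')
  return term()
end

local t = decode_tiny('\131\116\0\0\0\2\100\0\1\97\97\1\100\0\1\98\97\2')
```

Here `t.a` is 1 and `t.b` is 2, matching the table shown above. The real decoder additionally applies the atom mapping and integer handling described in the following sections.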
options is an optional table with the following keys, all optional:
- use_integer - set to true to decode all integers as etf.integer userdata.
- use_float - set to true to decode all floats as etf.float userdata.
- version - specify the Erlang Term Format version you wish to decode. As far as I can tell, 131 is the only version in existence.
- atom_map - customize how Atom types are decoded. This can be a table, or a function that accepts a string (representing the atom name) and a boolean (true if the atom is a map key, false otherwise).
Here's how various Erlang types are mapped to Lua by default:
Supported | Erlang Type | Lua Type |
---|---|---|
[ ] | ATOM_CACHE_REF | |
[x] | ZLIB | (automatically decompressed and decoded) |
[x] | SMALL_INTEGER_EXT | number |
[x] | INTEGER_EXT | number or etf.integer (based on value) |
[x] | FLOAT_EXT | number |
[x] | PORT_EXT | table |
[x] | NEW_PORT_EXT | table |
[x] | V4_PORT_EXT | table |
[x] | PID_EXT | table |
[x] | NEW_PID_EXT | table |
[x] | SMALL_TUPLE_EXT | table |
[x] | LARGE_TUPLE_EXT | table |
[x] | MAP_EXT | table |
[x] | NIL_EXT | table (empty) |
[x] | STRING_EXT | string |
[x] | LIST_EXT | table |
[x] | BINARY_EXT | string |
[x] | SMALL_BIG_EXT | number or etf.integer |
[x] | LARGE_BIG_EXT | number or etf.integer |
[x] | REFERENCE_EXT | table |
[x] | NEW_REFERENCE_EXT | table |
[x] | NEWER_REFERENCE_EXT | table |
[x] | FUN_EXT | table |
[x] | NEW_FUN_EXT | table |
[x] | EXPORT_EXT | table |
[x] | BIT_BINARY_EXT | string |
[x] | NEW_FLOAT_EXT | number |
[x] | ATOM_UTF8_EXT | string or boolean or etf.null |
[x] | SMALL_ATOM_UTF8_EXT | string or boolean or etf.null |
[x] | ATOM_EXT | string or boolean or etf.null |
[x] | SMALL_ATOM_EXT | string or boolean or etf.null |
etf will figure out the maximum and minimum integer values that can be
safely handled by Lua at run-time. When any integer is decoded, it will
use Lua's number type if possible, and an etf.integer userdata if it's
outside the safe range.
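As a rough illustration of what a "safe range" means, the sketch below derives one in plain Lua. This is an assumption about the general idea, not a description of how etf computes its limits: on Lua 5.3 and later it uses the native integer bounds, and elsewhere it falls back to 2^53, the largest magnitude below which every integer is exactly representable in a double. The names `max_safe`, `min_safe`, and `fits_native` are hypothetical.

```lua
-- Assumption: Lua 5.3+ exposes native integer bounds; Lua 5.1/5.2 and
-- LuaJIT store numbers as doubles, which are exact for integers up to 2^53.
local max_safe = math.maxinteger or 2^53
local min_safe = math.mininteger or -(2^53)

-- Hypothetical check: would this value survive as a plain Lua number?
local function fits_native(n)
  return n >= min_safe and n <= max_safe
end
```

A value for which `fits_native` returns false is the kind of integer the decoder would hand back as an etf.integer userdata instead of a number.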
You can opt to have all integers be returned as etf.integer userdatas. The
benefit of this is all values will use the same type. On Lua 5.2 and later,
the etf.integer userdatas can be compared to regular Lua numbers, but on
Lua 5.1 you can only compare etf.integer values with other etf.integer values.
To enable this, create the decoder with the use_integer option set to true:
local etf = require'etf'
local decoder = etf.decoder({use_integer = true })
local val = decoder:decode('\131\97\1') -- returns an integer
print(debug.getmetatable(val).__name)
-- prints "etf.integer"
Erlang supports a concept of "atoms" which doesn't completely translate to Lua.
In Erlang, one can create a map like:
Map = #{ a => 1, b => false, c => hello }
In that example, a, b, c, false, and hello are all atoms. They're
essentially small strings that can be used for map keys, enums, etc.
Note that Erlang doesn't have a boolean type. false is just another atom.
By default, atoms are decoded with the following logic:
- If the atom is a map key (like a and b in the example), it's decoded as a string.
- If the atom is a value (like false and hello in the example), then:
  - Atom true is decoded as Lua's boolean true.
  - Atom false is decoded as Lua's boolean false.
  - Atom nil is decoded as etf.null, which is an atom userdata.
  - Anything else is decoded as a string.
This is meant to be compatible with erlpack, and to make decoded data as easy to handle as possible.
If your application has other atoms that need to be translated into values,
you can specify the atom_map parameter. This can be a table with string keys,
or a function. The function should accept a string parameter representing
the atom name, and a boolean representing if the atom is a map key or not.
The default logic can be represented as:
local function atom_map(str, is_key)
if is_key then return str end
if str == 'true' then
return true
elseif str == 'false' then
return false
elseif str == 'nil' then
return etf.null
end
return str
end
If, for example, you wanted to decode map keys as strings but keep values as atoms:
local function atom_map(str, is_key)
if is_key then return str end
return etf.atom(str)
end
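When an application has several extra atoms to translate, one option is to generate the function from a lookup table. The sketch below is illustrative only: `make_atom_map` is a hypothetical helper, the `undefined` entry is an invented example, and `null` is a local stand-in for etf.null so the snippet runs without the library.

```lua
-- Stand-in for etf.null (assumption) so this sketch is self-contained.
local null = setmetatable({}, { __tostring = function() return 'null' end })

-- Hypothetical helper: build an atom_map function from a table of overrides.
local function make_atom_map(overrides)
  return function(str, is_key)
    if is_key then return str end            -- map keys stay plain strings
    local mapped = overrides[str]
    if mapped ~= nil then return mapped end  -- explicit check so false is kept
    return str                               -- unknown atoms fall back to strings
  end
end

local atom_map = make_atom_map({
  ['true'] = true,
  ['false'] = false,
  ['nil'] = null,
  undefined = null, -- extra application-specific atom (illustrative)
})
```

With the real library you would use etf.null in place of the stand-in and pass the result as etf.decoder({ atom_map = atom_map }).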
SMALL_TUPLE_EXT, LARGE_TUPLE_EXT, LIST_EXT, and NIL_EXT
will be decoded into array-like tables (all keys are integers, they're consecutive,
and they start at 1).
The table will have a metatable set to indicate the original type - etf.tuple_mt
for tuples, and etf.list_mt for lists.
MAP_EXT will be decoded into a Lua table. By default, the keys are usually strings
(see above for how atoms are mapped). Values are mapped into the appropriate Lua type
according to the above table.
The table will have a metatable set to indicate it was a map - etf.map_mt.
The various PORT types (PORT_EXT, NEW_PORT_EXT, V4_PORT_EXT) will be decoded into
a table with the following fields:
- node - a string.
- id - a number or integer.
- creation - a number or integer.
The table will have a metatable set to etf.port_mt.
The PID types (PID_EXT, NEW_PID_EXT) will be decoded into a table with
the following fields:
- node - a string.
- id - a number or integer.
- serial - a number or integer.
- creation - a number or integer.
The table will have a metatable set to etf.pid_mt.
FUN_EXT will be decoded into a table with the following fields:
- numfree - a number or integer.
- pid - the previously-mentioned PID type.
- module - a string.
- index - a number or integer.
- uniq - a number or integer.
- free_vars - an array-like table of terms.
The table will have a metatable set to etf.fun_mt.
NEW_FUN_EXT will be decoded into a table with the following fields:
- size - a number or integer.
- arity - a number.
- uniq - a string.
- index - a number or integer.
- numfree - a number or integer.
- module - a string.
- oldindex - a number or integer.
- olduniq - a number or integer.
- pid - the previously-mentioned PID type.
- free_vars - an array-like table of terms.
The table will have a metatable set to etf.new_fun_mt.
EXPORT_EXT will be decoded into a table with the following fields:
- module - a string.
- function - a string.
- arity - a number or integer.
The table will have a metatable set to etf.export_mt.
The REFERENCE types (REFERENCE_EXT, NEW_REFERENCE_EXT, NEWER_REFERENCE_EXT) will
be decoded into a table with the following fields:
- node - a string.
- creation - a number or integer.
- id - an array-like table of numbers or integers.
The table will have a metatable set to etf.reference_mt.
local encoder = etf.encoder(options) -- create an encoder
local encoded = encoder:encode({ a = 1, b = 2 })
-- encoded will be a MAP_EXT with BINARY_EXT keys and SMALL_INTEGER_EXT values
options is an optional table with the following keys, all optional:
- version - specify the Erlang Term Format version you wish to encode. As far as I can tell, 131 is the only version in existence.
- compress - set to true to enable compression at the default level, or 0 through 9 to specify a compression level.
- value_map - customize how values are encoded. This can be a table or a function that accepts the value to be encoded, and a boolean indicating if the value is a table key.
Here's how various Lua types are mapped to Erlang Term Format by default:
Supported | Lua Type | Erlang Type |
---|---|---|
[x] | nil | a nil SMALL_ATOM_UTF8_EXT |
[x] | number | NEW_FLOAT_EXT, SMALL_INTEGER_EXT, INTEGER_EXT, SMALL_BIG_EXT, LARGE_BIG_EXT (as appropriate) |
[x] | boolean | SMALL_ATOM_UTF8_EXT |
[x] | string | BINARY_EXT |
[x] | table | NIL_EXT, LIST_EXT, or MAP_EXT |
[x] | userdata | (see details below) |
A table is classified as either map-like or list-like. If a table
has integer keys starting at 1, with no gaps, it's considered
list-like and will be encoded as a LIST_EXT.
If a table has no keys at all, it will be treated as a list
with zero items and encoded as a NIL_EXT (Erlang's version of an
empty list).
Otherwise, the table is considered map-like and will be encoded
as a MAP_EXT. All table keys will be encoded as strings (specifically
BINARY_EXT). This is meant to be compatible with erlpack.
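The classification rule above can be sketched in plain Lua. This is only an illustration of the rule as described, not the library's actual implementation, and `classify` is a hypothetical name. It relies on the fact that a set of distinct positive integer keys whose maximum equals their count must be exactly 1 through that count.

```lua
-- Hypothetical helper: predict which Erlang type a plain table maps to,
-- following the rule described above.
local function classify(t)
  local count, max = 0, 0
  for k in pairs(t) do
    -- any non-integer or non-positive key makes the table map-like
    if type(k) ~= 'number' or k < 1 or k ~= math.floor(k) then
      return 'MAP_EXT'
    end
    count = count + 1
    if k > max then max = k end
  end
  if count == 0 then
    return 'NIL_EXT'   -- no keys at all: empty list
  end
  if max == count then
    return 'LIST_EXT'  -- keys are exactly 1..count, no gaps
  end
  return 'MAP_EXT'     -- integer keys with gaps
end
```

For instance, `classify({ 'x', 'y' })` gives LIST_EXT, while `classify({ [1] = 'x', [3] = 'z' })` gives MAP_EXT because of the gap at index 2.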
etf allows creating various userdata to force a specific encoding:
Userdata | Erlang Type |
---|---|
etf.integer | SMALL_INTEGER_EXT, INTEGER_EXT, SMALL_BIG_EXT, LARGE_BIG_EXT as appropriate |
etf.float | NEW_FLOAT_EXT |
etf.string | STRING_EXT |
etf.binary | BINARY_EXT |
etf.atom | SMALL_ATOM_UTF8_EXT or ATOM_UTF8_EXT |
etf.tuple | TUPLE_EXT |
etf.list | LIST_EXT |
etf.map | MAP_EXT |
etf.port | NEW_PORT_EXT or V4_PORT_EXT |
etf.pid | NEW_PID_EXT |
etf.export | EXPORT_EXT |
etf.reference | NEWER_REFERENCE_EXT |
The etf.integer type will be encoded as the smallest possible integer type. So, an integer
in the range of an 8-bit unsigned integer will be encoded as a SMALL_INTEGER_EXT value,
an integer in the range of a 32-bit signed integer will be encoded as an INTEGER_EXT value,
and so on.
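The size selection can be pictured with a small pure-Lua sketch. It is an illustration of the ranges described above, not the library's code, and `integer_tag` is a hypothetical name. It takes a plain Lua number, so the LARGE_BIG_EXT case (magnitudes needing more than 255 bytes) can never occur here and is omitted.

```lua
-- Hypothetical helper: name the smallest Erlang integer type that can
-- hold n, where n is an integral Lua number.
local function integer_tag(n)
  if n >= 0 and n <= 255 then
    return 'SMALL_INTEGER_EXT'  -- 8-bit unsigned
  elseif n >= -2147483648 and n <= 2147483647 then
    return 'INTEGER_EXT'        -- 32-bit signed
  end
  -- anything wider is a sign byte plus little-endian magnitude bytes
  return 'SMALL_BIG_EXT'
end
```

Note that small negative values such as -1 do not fit the unsigned 8-bit range, so they land in INTEGER_EXT.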
Using these userdata with a custom value_map function allows precise control over
mapping. For example, if you want to use Atom types for all table keys, you could do:
local function value_map(val, is_key)
if is_key then
return etf.atom(val)
end
return val
end
local encoder = etf.encoder({value_map = value_map })
local binary = encoder:encode({ a = 1, b = 2 })
-- will return a MAP_EXT with atom keys and integer values
- decoder - function that returns a decoder userdata.
- decode - convenience function to decode without creating a decoder.
- encoder - function that returns an encoder userdata.
- encode - convenience function to encode without creating an encoder.
- atom - function that returns an atom userdata (requires a string).
- binary - function that returns a binary userdata (requires a string).
- string - function that returns a string userdata (requires a string).
- integer - function that returns an integer userdata (accepts a number, string, or none).
- float - function that returns a float userdata (accepts a number, string, or none).
- list - function that returns a list userdata (optionally accepts a table).
- map - function that returns a map userdata (optionally accepts a table).
- tuple - function that returns a tuple userdata (optionally accepts a table).
- export - function that returns an export userdata (requires a table matching EXPORT_EXT above).
- pid - function that returns a pid userdata (requires a table matching PID_EXT above).
- port - function that returns a port userdata (requires a table matching PORT_EXT above).
- reference - function that returns a reference userdata (requires a table matching REFERENCE_EXT above).
- maxinteger - an integer value representing the maximum integer that can be represented by Lua natively.
- mininteger - an integer value representing the minimum integer that can be represented by Lua natively.
- null - an atom that represents a nil atom.
- atom_mt - the atom userdata's metatable.
- integer_mt - the integer userdata's metatable.
- float_mt - the float userdata's metatable.
- binary_mt - the binary userdata's metatable.
- decoder_131_mt - the decoder userdata's metatable.
- encoder_131_mt - the encoder userdata's metatable.
- export_mt - the export userdata's metatable.
- fun_mt - the fun userdata's metatable.
- list_mt - the list userdata's metatable.
- map_mt - the map userdata's metatable.
- new_fun_mt - the new_fun userdata's metatable.
- pid_mt - the pid userdata's metatable.
- port_mt - the port userdata's metatable.
- reference_mt - the reference userdata's metatable.
- string_mt - the string userdata's metatable.
- tuple_mt - the tuple userdata's metatable.
- _VERSION - the module version as a string.
- _VERSION_MAJOR - the module's major version as a number.
- _VERSION_MINOR - the module's minor version as a number.
- _VERSION_PATCH - the module's patch version as a number.
- numsize - the size of a Lua number, in bytes.