Skip to content

Trl.Serialization aims to create a compact human readable general-purpose data representation system based on the definition of terms. These _terms_ should be familiar to any programmer because they are basically strings, numbers, and function symbols. This is exposed as a serialiser and deserialiser to make it useful for most applications.

License

Notifications You must be signed in to change notification settings

WCoetser/Trl.Serialization

Repository files navigation

Motivation

Why invent a new serialization system? My problems with JSON and XML are:

  • Data tend to be bulky when viewed in text editors
  • No pointer support
  • Duplication of information
  • Not self-transformative
  • Expression trees are enweildly
  • No support for predefined constants (ex. Pi)
  • Only support one root element or dataset per file

Trl.Serialization aims to adress these issues and create a compact human readable general-purpose data representation system based on the definition of terms. These terms should be familiar to any programmer because they are basically strings, numbers, and function symbols.

For example, let's say that you want to represent this data (JSON version):

[
    {
        "name": "Socrates",
        "born": -470,
        "location":
        {
            "city": "Athens",
            "country": "Greece"
        }
    },
    {
        "name": "Plato",
        "born": -423,
        "location":
        {
            "city": "Athens",
            "country": "Greece"
        }
    },
    {
        "name": "Aristotle",
        "born": -384,
        "location":
        {
            "city": "Stagira",
            "country": "Greece"
        }
    }
]

Representing this information in the Term Rewriting Langauge (TRL) will give:

root: (p1, p2, p3);

p1 => person<name, born, location>("Socrates", -470, athens);
p2 => person<name, born, location>("Plato", -423, athens);
p3 => person<name, born, location>("Aristotle", -384, stagira);

athens => location<city, country>("Athens", "Greece");
stagira => location<city, country>("Stagira", "Greece");

TRL is the language used by Trl.Serialization and is defined in Trl.TermDataRepresentation.

Simple example: Deserialization

Input:

root: (p1, p2, p3);

p1 => Person<Name, Born, Location>("Socrates", -470, athens);
p2 => Person<Name, Born, Location>("Plato", -423, athens);
p3 => Person<Name, Born, Location>("Aristotle", -384, stagira);

athens => Location<City, Country>("Athens", "Greece");
stagira => Location<City, Country>("Stagira", "Greece");

Sample program showing deserialization:

StringSerializer serializer = new StringSerializer();
var philosophers = serializer.Deserialize<List<Person>>(INPUT_DESERIALIZE);
foreach (var p in philosophers)
{
    Console.WriteLine($"Name = {p.Name}, Born = {p.Born}, Location = {p.Location.City}, Country = {p.Location.Country}");
}

Output:

Name = Socrates, Born = -470, Location = Athens, Country = Greece
Name = Plato, Born = -423, Location = Athens, Country = Greece
Name = Aristotle, Born = -384, Location = Stagira, Country = Greece

Simple example: Serialization

Input:

static Location ATHENS = new Location
{
    City = "Athens",
    Country = "Greece"
};

static Location STAGIRA = new Location
{
    City = "Stagira",
    Country = "Greece"
};

static Person[] INPUT_SERIALIZE = new Person[]
{
    new Person { Name = "Socrates", Born = -470, Location = ATHENS },
    new Person { Name = "Plato", Born = -423, Location = ATHENS },
    new Person { Name = "Aristotle", Born = -384, Location = STAGIRA }
};

Sample program showing serialization:

StringSerializer serializer = new StringSerializer();
var philosophers = serializer.Serialize(INPUT_SERIALIZE, prettyPrint: true);
Console.WriteLine(philosophers);

Output:

root: (Person<Born,Location,Name>(-470,L0,"Socrates"),Person<Born,Location,Name>(-423,L0,"Plato"),Person<Born,Location,Name>(-384,Location<City,Country>("Stagira","Greece"),"Aristotle"));
L0 => Location<City,Country>("Athens","Greece");

Multiple datasets in the same document

By convention, the root object being serialized/deserialized is referred to as root, ex.:

root: "Hello World";

Sometimes, you want to represent multiple datasets in the same file. For example, let's assume you want to be able to get Aristotles and Plato seperately. Then you could code it like this:

plato: Person<Name, Born, Location>("Plato", -423, athens);
aristotle: Person<Name, Born, Location>("Aristotle", -384, stagira);

athens => Location<City, Country>("Athens", "Greece");
stagira => Location<City, Country>("Stagira", "Greece");

It is now possible to get the datasets seperately:

var plato = serializer.Deserialize<Person>(INPUT_DESERIALIZE, "plato");
var aristotle = serializer.Deserialize<Person>(INPUT_DESERIALIZE, "aristotle");

Custom term names and inheritance

Sometimes you need to deserialize classes with inheritance. In this case you must create explicit mappings to specify which term maps to which subclass. You could, for example, have these class definitions:

public interface IShape { }
public class Circle : IShape { public double Radius { get; set; } }
public class Square : IShape { public double Width { get; set; } }

The NameAndTypeMappings class is used to set up mappings to subtypes:

var nameMappings = new NameAndTypeMappings();
nameMappings.MapTermNameToType<Circle>("circle");
nameMappings.MapTermNameToType<Square>("square");

These mappings can then be used with deserialization:

var serializer = new StringSerializer(nameAndTypeMappings: nameMappings);
IShape circle = serializer.Deserialize<IShape>("root: circle<Radius>(10);");
IShape square = serializer.Deserialize<IShape>("root: square<Width>(10);");

Named constants

Sometimes it is convenient to use named constants in instead of values. For example, let's say that you want to define PI (3.14...). This can be done with the NameAndTypeMappings class, ex.:

var nameMappings = new NameAndTypeMappings();
var serializer = new StringSerializer(nameAndTypeMappings: nameMappings);
nameMappings.MapIdentifierNameToConstant("Pi", Math.PI);

var output = serializer.Serialize(Math.PI);
Console.WriteLine(output);

Output:

root: Pi;

Term serialization and deserialization using constructors and deconstructors

Sometimes you need to create objects by invoking their constructors. In this scenario, you need to not specify a class member mapping list. For example, you may want to represent a .NET DateTime object like this:

root: datetime(2020,10,13);

Constructors will automatically be invoked when no class member mappings are given. This code can be used for the deserialization:

string INPUT_SERIALIZE = "root: datetime(2020,10,13);";
var output = serializer.Deserialize<DateTime>(INPUT_SERIALIZE);

Serialization is the opposite of this. During serialization we need something that is like a constructor for classes, but that gives you the constructor parameters as outputs. A neat new feature that fits this description very well is deconstructors, which were introduced in C# 7. Deconstructors can be written in classes or in extension methods. This allows us to create a deconstructor for the .NET build-in DateTime type:

public static class DateTimeExtensions
{
    public static void Deconstruct(this DateTime dateTime, out int year, out int month, out int day)
    {
        year = dateTime.Year;
        month = dateTime.Month;
        day = dateTime.Day;
    }
}

.NET deconstructors must have at least 2 out arguments. In Trl.Serialization decconstructors with 1 or 0 out parameters are also supported. Serialization code for making use of the DateTimeExtensions class looks like this:

var nameMappings = new NameAndTypeMappings();
var serializer = new StringSerializer(nameAndTypeMappings: nameMappings);
nameMappings.MapExtensionMethodDestructorsFromType(typeof(DateTimeExtensions));
nameMappings.MapTermNameToType<DateTime>("datetime");
DateTime INPUT_DESERIALIZE = new DateTime(2020, 7, 8);
var outputSerialized = serializer.Serialize(INPUT_DESERIALIZE);

Building expression trees

With all of the above features in place, it is possible to create expression trees. A full example of how this might work is given in the sample console app in this repository.

The first step would be to define a base class or interface for expressions. In the sample app, this is done in the BinaryOperator class. After this, the child classes used to represent expression tree operators need to be registered, ex.:

var nameMappings = new NameAndTypeMappings();
var serializer = new StringSerializer(nameAndTypeMappings: nameMappings);
nameMappings.MapTermNameToType<Add>("add");
nameMappings.MapTermNameToType<Sub>("sub");
nameMappings.MapTermNameToType<Mul>("mul");
nameMappings.MapTermNameToType<Div>("div");

In this code sample, the Add, Sub, Mul, and Div subclasses inherits from BinaryOperator. Each of this example, the subclasses also implement an interface called IExpression defining the Calculate method.

All of this could be used for deserialization and executing the expression tree:

string INPUT_DESERIALIZE = "root: div(mul(4,sub(add(3,2),1)),5);";
IExpression expr = serializer.Deserialize<IExpression>(INPUT_DESERIALIZE);
Console.WriteLine($"Result = {expr.Calculate()}");

Installation via Nuget

See https://www.nuget.org/packages/Trl.Serialization/ for nuget package.

Unit Test Code Coverage

Unit tests can be run using the .\test.ps1 script. This will generate a code coverage report in the .\UnitTestCoverageReport folder using Coverlet and ReportGenerator.

Code Coverage

Licence

Trl.Serialization is released under the MIT open source licence. See LICENCE.txt in this repository for the full text.

About

Trl.Serialization aims to create a compact human readable general-purpose data representation system based on the definition of terms. These _terms_ should be familiar to any programmer because they are basically strings, numbers, and function symbols. This is exposed as a serialiser and deserialiser to make it useful for most applications.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published