Implementation of data types

Sébastien Doeraene edited this page Nov 9, 2012 · 12 revisions

This section explains how to write new data types in the Object Model.

A data type in the Object Model is what we usually call a class in a classical object-oriented language. It is a collection of fields and methods acting on these fields.

The skeleton

To get started with a new data type, you should copy and paste this skeleton. As a running example, we will explain how to write the Cons data type.

You can put the following, e.g., in the file datatypes-decl.hh of the experiment application.

#include <mozart.hh>

namespace mozart {

//////////
// Cons //
//////////

// Stuff generated by the generator (don't worry about it now)
#ifndef MOZART_GENERATOR
#include "Cons-implem-decl.hh"
#endif

class Cons: public DataType<Cons> {
public:
  // The standard constructor (invoked by Cons::build(...))
  inline
  Cons(VM vm, RichNode head, RichNode tail);

  // The GR constructor (GR stands for Graph Replicator)
  // It is used e.g. for garbage collection, and always has the same signature
  inline
  Cons(VM vm, GR gr, Self from);

  // Any method you want, just as in a regular class
  // For example:
  StableNode* getHead() {
    return &_head;
  }

  StableNode* getTail() {
    return &_tail;
  }
private:
  // Any field you want, just as in a regular class
  // For Cons we'll have:
  StableNode _head;
  StableNode _tail;
};

// Stuff generated by the generator (don't worry about it now)
#ifndef MOZART_GENERATOR
#include "Cons-implem-decl-after.hh"
#endif

}

So what does this declaration say? At the C++ level, I guess you have figured it out: it is merely a class Cons inheriting from DataType<Cons> (an instance of CRTP, the Curiously Recurring Template Pattern). At the Object Model level, however, it is much more meaningful!

This piece of code declares a new data type, called Cons. This data type is non-transient (the default). Its type identity can be accessed with Cons::type() (declared in the superclass DataType<Cons>). Moreover, the declaration links this data type to a memory representation (the Cons class itself), to a means of garbage-collecting an entity of this type, etc.
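To make the CRTP remark more concrete, here is a minimal standalone sketch of how a base class template can hand out a per-type identity. It does not use the Mozart headers: ToyTypeInfo, ToyTypeInfoOf, ToyDataType, ToyCons and ToyAtom are hypothetical names invented for this illustration, and the real DataType<T> may well be organized differently.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for a type descriptor. This is NOT Mozart's TypeInfo.
struct ToyTypeInfo {
  std::string name;
};

// Hypothetical trait, specialized once per data type
// (in Mozart, the generator writes a comparable specialization).
template <class T>
struct ToyTypeInfoOf;

// CRTP base: Derived passes itself as the template argument, so the
// base can return the descriptor that belongs to Derived.
template <class Derived>
class ToyDataType {
public:
  static const ToyTypeInfo* type() {
    return ToyTypeInfoOf<Derived>::instance();
  }
};

class ToyCons: public ToyDataType<ToyCons> {};
class ToyAtom: public ToyDataType<ToyAtom> {};

template <>
struct ToyTypeInfoOf<ToyCons> {
  // One descriptor object per data type
  static const ToyTypeInfo* instance() {
    static const ToyTypeInfo info{"Cons"};
    return &info;
  }
};

template <>
struct ToyTypeInfoOf<ToyAtom> {
  static const ToyTypeInfo* instance() {
    static const ToyTypeInfo info{"Atom"};
    return &info;
  }
};

// Each data type gets its own stable identity through the shared base.
bool toyTypeIdentityDemo() {
  return ToyCons::type() == ToyCons::type()
      && ToyCons::type() != ToyAtom::type()
      && ToyCons::type()->name == "Cons";
}
```

The point of the pattern is that ToyCons::type() is written once in the base, yet resolves to a different descriptor for every derived data type.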

Part of this magic is implemented by a rather clever type system over the C++ type system, written as a collection of (variadic) template classes in the core object model headers (memword.hh, storage-decl.hh, store-decl.hh, type-decl.hh, typeinfo-decl.hh and datatype-decl.hh). The rest of the magic is just generated automatically by a clang-based generator.

Most of the things generated, you need not care about. They are true boilerplate. But you should know that it will generate a specialization of TypeInfoOf<T> for the type Cons, which contains the following:

template <>
class TypeInfoOf<Cons>: public TypeInfo {
private:
  typedef SelfType<Cons>::Self Self;
public:
  TypeInfoOf() : TypeInfo("Cons",
                          UUID(), // No UUID, sometimes there is one
                          /* copyable  = */ false,
                          /* transient = */ false,
                          /* feature = */ false,
                          /* structuralBehavior = */ sbTokenEq) {}

  /** Singleton instance of this class */
  static const TypeInfoOf<Cons>* const instance() {
    return &RawType<Cons>::rawType;
  }

  /** Type identity of Cons (wraps the singleton instance in a Type) */
  static Type type() {
    return Type(instance());
  }

  inline
  void gCollect(GC gc, RichNode from, StableNode& to) const;

  inline
  void gCollect(GC gc, RichNode from, UnstableNode& to) const;

  inline
  void sClone(SC sc, RichNode from, StableNode& to) const;

  inline
  void sClone(SC sc, RichNode from, UnstableNode& to) const;
};

In Cons, there are a few things you probably do not understand yet. What exactly is the type Self used in method signatures, for example? Do not let these bother you; we will explain them in due time. For most practical purposes, you can consider Self as an alias for Cons*.

For now, focus on the things you do understand:

  • A constructor, which takes a contextual VM, and the head and tail of the Cons,
  • Two fields, which are StableNode's, for storing the head and the tail,
  • Two accessor methods for the head and the tail.

A pretty regular C++ class, I should say.

This class actually defines the behavior of your data type entirely: its memory layout as well as the operations you can call on it.

The implementation of the constructors, as well as any non-trivial method, should be put in a file named datatypes.hh. The entire contents of this file must be hidden from the generator, because it is not compilable without the sources that the generator generates. For the minimal skeleton we showed above, it should contain the following:

#include <mozart.hh>

#include "datatypes-decl.hh"

#ifndef MOZART_GENERATOR

namespace mozart {

//////////
// Cons //
//////////

#include "Cons-implem.hh"

Cons::Cons(VM vm, RichNode head, RichNode tail) {
  _head.init(vm, head);
  _tail.init(vm, tail);
}

Cons::Cons(VM vm, GR gr, Self from) {
  gr->copyStableNode(_head, from->_head);
  gr->copyStableNode(_tail, from->_tail);
}

}

#endif // MOZART_GENERATOR

Again, you can implement the methods here as in any regular class. The regular constructor initializes the head and the tail with the parameters given. Nodes given as parameters are usually passed as RichNode's.

The GR constructor instructs the graph replicator that it should replicate from->_head (resp. from->_tail) into _head (resp. _tail). How it does so, you need not know at this point. Just make sure that, in your GR constructor, you:

  • Use gr->copyStableNode and/or gr->copyUnstableNode to copy nodes,
  • Use gr->copySpace to copy SpaceRef's,
  • Use gr->copyThread to copy Runnable*'s,
  • Use gr->copyStableRef to copy StableNode*'s,
  • Use the regular assignment operator of C++ to copy any other data (int, bool, etc.).
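To give an intuition for why nodes go through the replicator while plain data is simply assigned, here is a minimal standalone sketch of a toy replicator. It does not use the Mozart headers and is not how the real GR is implemented: ToyNode, ToyPair, ToyGR and the single copyNode method are hypothetical names invented for this illustration, and the deferred worklist is only one possible strategy.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

struct ToyPair;
class ToyGR;

// Toy node: an integer plus an optional reference to another entity.
struct ToyNode {
  int value;
  ToyPair* ref;
  ToyNode(): value(0), ref(nullptr) {}
};

struct ToyPair {
  ToyNode head, tail;
  int mark;  // plain data
  ToyPair(): mark(0) {}
  ToyPair(ToyGR& gr, const ToyPair& from);  // toy "GR constructor"
};

class ToyGR {
public:
  // Copies the node shallowly and defers the entity it refers to.
  void copyNode(ToyNode& to, const ToyNode& from) {
    to.value = from.value;
    to.ref = nullptr;
    if (from.ref != nullptr)
      pending_.push_back(std::make_pair(&to, from.ref));
  }

  // Replicates everything reachable from root; sharing and cycles survive.
  ToyPair* replicate(const ToyPair* root) {
    ToyPair* copy = copyEntity(root);
    while (!pending_.empty()) {
      std::pair<ToyNode*, const ToyPair*> item = pending_.back();
      pending_.pop_back();
      item.first->ref = copyEntity(item.second);
    }
    return copy;
  }

private:
  // Each original entity is copied at most once.
  ToyPair* copyEntity(const ToyPair* from) {
    auto it = copies_.find(from);
    if (it != copies_.end()) return it->second;
    owned_.push_back(std::unique_ptr<ToyPair>(new ToyPair(*this, *from)));
    copies_[from] = owned_.back().get();
    return owned_.back().get();
  }

  std::unordered_map<const ToyPair*, ToyPair*> copies_;
  std::vector<std::pair<ToyNode*, const ToyPair*> > pending_;
  std::vector<std::unique_ptr<ToyPair> > owned_;
};

ToyPair::ToyPair(ToyGR& gr, const ToyPair& from) {
  gr.copyNode(head, from.head);  // nodes go through the replicator
  gr.copyNode(tail, from.tail);
  mark = from.mark;              // plain data uses regular assignment
}

// Replicates a two-element cycle and checks the copy has the same shape.
bool toyReplicationDemo() {
  ToyPair a, b;
  a.head.value = 1; a.tail.ref = &b; a.mark = 7;
  b.head.value = 2; b.tail.ref = &a;
  ToyGR gr;
  ToyPair* copy = gr.replicate(&a);
  return copy != &a && copy->mark == 7 && copy->head.value == 1
      && copy->tail.ref != &b && copy->tail.ref->head.value == 2
      && copy->tail.ref->tail.ref == copy;
}
```

A plain assignment of the reference would leave the copy pointing into the old graph; routing it through the replicator is what lets the copy point at the copied entity instead.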

The generator

Now you have what you should write by hand. But some parts of the code are still missing. The generator will write them for you, but you need to instruct it to do so.

I will not expand on this now. In the experiment application, we have set up CMakeLists.txt so that the generator is run automatically on customlib.hh, which includes datatypes-decl.hh. Hence, you need not worry about it.

When working in the core aspects of Mozart, you should just modify coredatatypes-decl.hh (resp. coredatatypes.hh) so that it includes your files, e.g., cons-decl.hh (resp. cons.hh).

How to use your new data type

Now that you have defined your brand new data type, you'll want to use it. You must never instantiate a Cons directly; always go through the Cons::build() method instead. Using the as<Cons>() method, you may call any public method of Cons through a rich node.

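#include <iostream>

#include "customlib.hh"

// Assumes a valid VM named vm is in scope.

// Build a Cons whose head is 5 and whose tail is the atom nil
UnstableNode head = SmallInt::build(vm, 5);
UnstableNode tail = Atom::build(vm, MOZART_STR("nil"));
UnstableNode cons = Cons::build(vm, head, tail);

// A RichNode gives access to the type and to the public methods
RichNode richCons = cons;
std::cout << richCons.type()->getName() << std::endl;

// getHead() returns a StableNode*, hence the dereference
RichNode richHead = *richCons.as<Cons>().getHead();
std::cout << richHead.type()->getName() << std::endl;
std::cout << richHead.as<SmallInt>().value() << std::endl;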
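The build()/as<T>() pairing can be sketched in isolation as follows. This is a toy model that does not use the Mozart headers: ToyNode, ToyDataType, ToySmallInt and ToyCons are hypothetical names invented for this illustration, there is no VM argument, and the real build() allocates and tags nodes differently.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy node: a type tag plus an untyped pointer to the payload.
struct ToyNode {
  const char* typeName;
  std::shared_ptr<void> value;

  // The caller asserts the payload type, then gets at its public methods.
  template <class T>
  T& as() const { return *static_cast<T*>(value.get()); }
};

// CRTP base providing build(): the single place where payloads are
// allocated and paired with their type tag.
template <class Derived>
class ToyDataType {
public:
  template <class... Args>
  static ToyNode build(Args&&... args) {
    ToyNode node;
    node.typeName = Derived::name();
    node.value = std::make_shared<Derived>(std::forward<Args>(args)...);
    return node;
  }
};

class ToySmallInt: public ToyDataType<ToySmallInt> {
public:
  explicit ToySmallInt(int v): _value(v) {}
  static const char* name() { return "SmallInt"; }
  int value() const { return _value; }
private:
  int _value;
};

class ToyCons: public ToyDataType<ToyCons> {
public:
  ToyCons(ToyNode head, ToyNode tail)
    : _head(std::move(head)), _tail(std::move(tail)) {}
  static const char* name() { return "Cons"; }
  ToyNode* getHead() { return &_head; }
  ToyNode* getTail() { return &_tail; }
private:
  ToyNode _head;
  ToyNode _tail;
};

// Mirrors the usage above: build, then read the head back through as<T>().
int toyBuildDemo() {
  ToyNode head = ToySmallInt::build(5);
  ToyNode tail = ToySmallInt::build(0);
  ToyNode cons = ToyCons::build(head, tail);
  assert(std::string(cons.typeName) == "Cons");
  return cons.as<ToyCons>().getHead()->as<ToySmallInt>().value();
}
```

Because every value is created by build(), the type tag and the payload cannot get out of sync, which is one plausible reason for forbidding direct instantiation.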

Memory layout

The machinery of the Object Model takes care of all the details of memory management. But if you care about how exactly a node of type Cons behaves in memory, then you simply have the following: the first word in the node is Cons::type(), and the second word in the node is a Cons*. It points to an actual instance of Cons in memory, i.e., to an area with 4 words: 2 for _head and 2 for _tail.
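The word counts above can be checked with a small standalone sketch. It does not use the Mozart headers: ToyTypeInfo, ToyNode and ToyCons are hypothetical stand-ins, and the only assumption carried over from the text is that a node occupies two machine words (a type identity and a value word).

```cpp
#include <cassert>
#include <cstddef>

// Toy type descriptor; only its address matters here.
struct ToyTypeInfo {};

// A node is two machine words: the type identity and the value word.
struct ToyNode {
  const ToyTypeInfo* type;
  void* value;
};

// The payload of a toy Cons: two nodes, hence four words.
struct ToyCons {
  ToyNode head;
  ToyNode tail;
};

static_assert(sizeof(ToyNode) == 2 * sizeof(void*), "a node is two words");
static_assert(sizeof(ToyCons) == 4 * sizeof(void*), "a Cons is four words");

// Number of machine words occupied by the payload.
std::size_t toyConsWords() {
  return sizeof(ToyCons) / sizeof(void*);
}

// A node of type Cons: first word is the type, second points to the payload.
bool toyLayoutDemo() {
  static const ToyTypeInfo consType{};
  ToyCons payload = {{nullptr, nullptr}, {nullptr, nullptr}};
  ToyNode node = {&consType, &payload};
  return node.type == &consType
      && static_cast<ToyCons*>(node.value) == &payload;
}
```

The static_assert lines hold on mainstream platforms where all object pointers have the same size and the structs need no padding.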