cxw42/do-not-self-host

do-not-self-host

A development toolchain from the ground up, starting from assembly. Don't self-host your next language! Make it possible for us to build from source, from scratch, without needing a bootstrap package!

This is a long-term hobby project, so please do not expect regular updates :) However, I certainly welcome others who want to contribute.

Assumes a development environment that provides stdin/stdout and redirection.

Current status

  • ngb: VM (in C)
  • ngbasm: assembler (in Python)

Editor support

ngb assembly files have the extension .nas. A Vim syntax configuration is available here.

I'm not the only one

The Facebook Buck build system also doesn't self-host by default (although it can). The Buck FAQ says, in part:

Q: Why is Buck built with Ant instead of Buck?

A: Self-hosting systems can be more difficult to maintain and debug. If Buck built itself using Buck, then every time a change was made to Buck's source, the commit would have to include a new Buck binary that included that change. It would be easy to forget to include the binary, difficult to verify that it was the correct binary, and wasteful to bloat the Git history of the repository with binaries that could be rebuilt from source. Building Buck using Ant ensures we are always building from source, which is simpler to verify.

Installation and testing

The code is currently written in C and Python, but the build and test infrastructure runs in Perl. Tests use Perl's prove.

Building

  • If you don't already have it, install Perl (e.g., using perlbrew).
  • Install cpanminus.

Then build using:

perl Makefile.PL
cpanm --installdeps .
make
cd mtok
make

Once you have run the perl and cpanm steps, you shouldn't need to do so again if you are only working on the C/Python/ngbasm sources. Just run make as necessary.

Testing

Once you have done the build steps, run prove or make test in the top level of the repository.

Older notes

Based on crcx/Nga-Bootstrap, which provides:

  • naje - a basic assembler (Python)
  • nmfcx - a Machine Forth Cross Compiler (Retro)

In the pipeline:

  • NGA+:

    • Implement NGA VM in x86 assembly (NASM?)
    • Read/write stdin/stdout (port-based, a la retro? Maybe not - that's flexible, but perhaps more than we need).
    • Add support for record blocks A and B - configurable number of fields per block; aload, astore, bload, bstore, aread, awrite, bread, bwrite
    • .const
  • Minimal Infix High-Level Language (Minhi) - <program>::=<expression>+, and everything else is an expression.

    • Why expressions? Because infix expressions are easy to parse based on a table, as described in A Retargetable C Compiler: Design and Implementation.
    • Lexer written in NGA+ that takes source and outputs token stream
    • Parser written in NGA+ that takes token stream (block A) and outputs AST (block B)
    • Compiler that produces NGA+ assembly
    • Later, a compiler that produces x86 assembly
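
To illustrate why table-driven infix parsing is attractive, here is a minimal sketch in Python of a lexer that emits a token stream and a precedence-table parser that turns it into an AST. Everything specific in it is an assumption made for illustration: these notes do not define Minhi's operators, precedences, or token kinds, so the BINARY_OPS table, the optional ";" separator, and the tuple-shaped AST are all hypothetical. The planned lexer and parser would be written in NGA+, not Python.

```python
import re

# Hypothetical operator table: operator -> (precedence, right_associative).
# Minhi's real operator set is not specified; these entries are placeholders.
BINARY_OPS = {
    "=": (1, True),
    "+": (2, False),
    "-": (2, False),
    "*": (3, False),
    "/": (3, False),
}


def lex(source):
    """Split source text into (kind, text) tokens, skipping whitespace."""
    tokens = []
    for text in re.findall(r"\d+|[A-Za-z_]\w*|\S", source):
        if text[0].isdigit():
            tokens.append(("num", text))
        elif text[0].isalpha() or text[0] == "_":
            tokens.append(("id", text))
        else:
            tokens.append(("op", text))
    return tokens


def parse_program(tokens):
    """Parse <program> ::= <expression>+ into a list of nested-tuple ASTs."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("eof", "")

    def advance():
        nonlocal pos
        pos += 1

    def primary():
        kind, text = peek()
        if kind in ("num", "id"):
            advance()
            return (kind, text)
        if text == "(":
            advance()
            node = expression(1)
            if peek()[1] != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        raise SyntaxError(f"unexpected token {text!r}")

    def expression(min_prec):
        # Precedence climbing: the table alone decides how operators bind.
        left = primary()
        while True:
            kind, text = peek()
            if kind != "op" or text not in BINARY_OPS:
                return left
            prec, right_assoc = BINARY_OPS[text]
            if prec < min_prec:
                return left
            advance()
            right = expression(prec if right_assoc else prec + 1)
            left = (text, left, right)

    program = []
    while pos < len(tokens):
        program.append(expression(1))
        if peek()[1] == ";":  # hypothetical, optional expression separator
            advance()
    if not program:
        raise SyntaxError("a program needs at least one expression")
    return program


if __name__ == "__main__":
    for tree in parse_program(lex("x = 1 + 2 * 3; (1 + 2) * 3")):
        print(tree)
```

Adding an operator in this sketch means adding one row to the table; the parsing functions stay unchanged. That is the property that makes an expression-only language appealing for a small bootstrap parser.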

Future: to be determined... (but possibly a C compiler written in Minhi)