A development toolchain from the ground up, starting from assembly. Don't self-host your next language! Make it possible for us to build from source, from scratch, without needing a bootstrap package!
This is a long-term hobby project, so please do not expect regular updates :). However, I certainly welcome others who want to contribute.
Assumes a development environment that provides stdin/stdout and redirection.
- ngb: VM (in C)
- ngbasm: assembler (in Python)
ngb assembly files have the extension .nas. A Vim syntax configuration is available here.
The Facebook Buck build system also doesn't self-host by default (although it can). The Buck FAQ says, in part:
Q: Why is Buck built with Ant instead of Buck?
A: Self-hosting systems can be more difficult to maintain and debug. If Buck built itself using Buck, then every time a change was made to Buck's source, the commit would have to include a new Buck binary that included that change. It would be easy to forget to include the binary, difficult to verify that it was the correct binary, and wasteful to bloat the Git history of the repository with binaries that could be rebuilt from source. Building Buck using Ant ensures we are always building from source, which is simpler to verify.
The code is currently C and Python, but the infrastructure runs in Perl.
Tests use Perl's prove.
Then build using:
perl Makefile.PL
cpanm --installdeps .
make
cd mtok
make
Once you have run the perl and cpanm steps, you shouldn't need to do so again if you are only working on the C/Python/ngbasm sources. Just run make as necessary.
Once you have done the build steps, run prove or make test in the top level of the repository.
Based on crcx/Nga-Bootstrap, which provides:
- naje - a basic assembler (Python)
- nmfcx - a Machine Forth Cross Compiler (Retro)
In the pipeline:
- NGA+:
  - Implement NGA VM in x86 assembly (NASM?)
  - Read/write stdin/stdout (port-based, a la retro? Maybe not - that's flexible, but perhaps more than we need).
  - Add support for record blocks A and B - configurable number of fields per block; aload, astore, bload, bstore, aread, awrite, bread, bwrite
  - .const
- Minimal Infix High-Level Language (Minhi) - <program>::=<expression>+, and everything else is an expression.
  - Why expressions? Because infix expressions are easy to parse based on a table, as described in A Retargetable C Compiler: Design and Implementation.
  - Lexer written in NGA+ that takes source and outputs token stream
  - Parser written in NGA+ that takes token stream (block A) and outputs AST (block B)
  - Compiler that produces NGA+ assembly
  - Later, a compiler that produces x86 assembly
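To illustrate why a table makes infix expressions easy to parse, here is a minimal sketch of the lexer-to-parser flow in Python (the planned implementation is in NGA+, not Python). It uses precedence climbing, which is one common table-driven technique and not necessarily the exact scheme from the book. Everything named here is an assumption for illustration: the `OPERATORS` table and its entries, the `;` separator between expressions, the tuple-shaped AST, and the function names `tokenize`, `parse_expression`, `parse_primary`, and `parse_program`. Minhi's real syntax is not defined yet.

```python
import re

# Hypothetical operator table: symbol -> (precedence, associativity).
# Adding an operator means adding a row here, not writing a new parse function.
OPERATORS = {
    "=": (1, "right"),
    "+": (2, "left"),
    "-": (2, "left"),
    "*": (3, "left"),
    "/": (3, "left"),
}

TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")


def tokenize(source):
    """Lexer: turn source text into a list of (kind, text) tokens."""
    tokens = []
    for number, name, other in TOKEN_RE.findall(source):
        if number:
            tokens.append(("num", number))
        elif name:
            tokens.append(("name", name))
        elif other.strip():  # skip stray whitespace matches
            tokens.append(("op", other))
    return tokens


def parse_primary(tokens, pos):
    """Parse a number, a name, or a parenthesized expression."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    kind, text = tokens[pos]
    if kind == "num":
        return ("num", int(text)), pos + 1
    if kind == "name":
        return ("name", text), pos + 1
    if text == "(":
        node, pos = parse_expression(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ("op", ")"):
            raise SyntaxError("expected ')'")
        return node, pos + 1
    raise SyntaxError(f"unexpected token {text!r}")


def parse_expression(tokens, pos=0, min_prec=1):
    """Precedence climbing driven by OPERATORS; returns (ast, next_pos)."""
    left, pos = parse_primary(tokens, pos)
    while pos < len(tokens):
        kind, text = tokens[pos]
        if kind != "op" or text not in OPERATORS:
            break
        prec, assoc = OPERATORS[text]
        if prec < min_prec:
            break
        # Left-associative operators require a strictly tighter right operand.
        next_min = prec + 1 if assoc == "left" else prec
        right, pos = parse_expression(tokens, pos + 1, next_min)
        left = (text, left, right)
    return left, pos


def parse_program(source):
    """<program> ::= <expression>+ ; a ';' between expressions is assumed here."""
    tokens = tokenize(source)
    pos, program = 0, []
    while pos < len(tokens):
        node, pos = parse_expression(tokens, pos)
        program.append(node)
        if pos < len(tokens) and tokens[pos] == ("op", ";"):
            pos += 1
    return program
```

For example, `parse_program("x = 1 + 2 * 3")` yields a single AST in which `*` binds tighter than `+`, and `=` sits at the root. The whole grammar lives in one small table plus two short functions, which is the property that should make the parser practical to write in assembly.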
Future: to be determined... (but possibly a C compiler written in Minhi)