YJDoc2/The-Transpiler-Project

The Transpiler project

A transpiler that compiles a custom syntax to pure C, built with Flex and Bison.
It was written as a project while learning Flex and Bison.

About

This is a very simple syntax converter, which converts the custom syntax described in syntax.md to pure C files. It supports only the basics of C syntax, and does not support :

  • Multi-dimensional Arrays
  • Any kind of pointers
  • asm and compiler directives, along with preprocessor commands
  • bitwise operators
  • enums and unions
  • etc...

This is just supposed to be a fun project, and is not meant to be used in a production environment.

It introduces some things that are not directly supported in C, like :

  • let a = ...
  • for i in ... loops
  • classes ... or glorified structs

See project.md for how the features were developed step by step.

Thanks also to Yatharth for helping with the Windows makefile.

Build instructions

This can be built directly on Linux systems using the outermost Makefile, and is built and tested on Ubuntu 20.04.

For Windows, a program capable of running makefiles is needed, such as this one. Note that this is not tested, and should be used at one's own risk. The Windows makefile assumes that the C compiler is available under the name 'gcc'; if the name is different, change the variable CC in Makefile-windows to the correct name. It also assumes that the archiving program ar is available under the name 'ar'; change it accordingly if the name is different. Once run successfully, this generates an executable file named 'ttp' in the outermost folder.

NOTE that for normal use, the files in the transpiler folder should not be modified at all. Any modification, even an accidental one, to a .l or .y file forces Make to run the flex and bison programs on the next build. As those programs are not otherwise needed for non-development purposes, the binary may then fail to build again.

Usage :

The command line syntax is : ttp filename [ -hkl ][-o outfile]
where filename is the name of the file to be transpiled, and is a required argument.

  • h displays help.
  • k keeps the generated files even if there were errors in the transpilation; otherwise they are deleted.
  • l prints the line numbers as pre-processor directives, which compilers can use to report lines in the original file instead of the generated one. These may be slightly off from the actual lines.
  • o requires a name, which is given to the generated file instead of the name of the input file. NOTE that this does not affect the generated class files, which are always named 'class_classname.h' and 'class_classname.c'.
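
The README does not say which directive the l flag emits. Assuming it is the standard C `#line` directive, the sketch below shows how such a directive remaps the line number and file name a compiler reports. The file name `example.ttp` and both function names are made up for illustration.

```c
/* Hypothetical generated code: the directive below tells the compiler
   that the next line is line 10 of the file "example.ttp". */
#line 10 "example.ttp"
int reported_line(void) { return __LINE__; }

/* __FILE__ is remapped by the same directive. */
const char *reported_file(void) { return __FILE__; }
```

A diagnostic raised inside `reported_line` would be reported against `example.ttp`, line 10, rather than against the generated C file.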

For general users :

A few points to note before use :

All generated files contain these basic headers :

  • stdio.h
  • stdlib.h
  • stdbool.h
  • string.h
  • complex.h
  • math.h

The order of the code is maintained in the generated files, but not its adjacency. That is, if line 1 and line 2 are two consecutive lines, line 1 will come before line 2 in the generated files, but there may be some extra generated code between those two lines.

The transpiler does not check void-type function returns well: 5 + test() or test() + 5, where test is a function returning void, is transpiled without any error message, even though both are incorrect in C.

When reading input with the input action : for bool, a nonzero value is true and zero is false;
for complex, two floats are read, the real part first and then the imaginary part.
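
The input rules above can be sketched in C roughly as follows. This is not the code ttp generates; the function names are hypothetical, and `sscanf` on a string stands in for reading from standard input so the sketch can be run without a terminal.

```c
#include <complex.h>
#include <stdbool.h>
#include <stdio.h>

/* Read an integer and treat any nonzero value as true, zero as false.
   Returns false if the text could not be parsed. */
bool parse_bool(const char *text, bool *out) {
    int raw;
    if (sscanf(text, "%d", &raw) != 1) return false;
    *out = (raw != 0);
    return true;
}

/* Read two floats: the real part first, then the imaginary part.
   Returns false if fewer than two numbers were found. */
bool parse_complex(const char *text, float complex *out) {
    float re, im;
    if (sscanf(text, "%f %f", &re, &im) != 2) return false;
    *out = re + im * I;
    return true;
}
```

For example, the text `7` parses to true, and `1.5 -2` parses to the complex number 1.5 - 2i.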

When defining a string in the code, only single-line strings are accepted.

Note that in the complex number notation (a,b), a is the part that is not multiplied by I and b is the part that is multiplied by I. If a and/or b are themselves complex, the resulting number is still formed by those same multiplications by I.
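
A minimal C sketch of that rule, assuming a (a,b) literal corresponds to the expression a + b * I. The helper name `make_complex` is hypothetical and is not part of ttp.

```c
#include <complex.h>

/* Combine the two parts of an (a,b) literal: a is kept as is,
   b is multiplied by I. Both parameters are complex so that
   complex-valued parts are covered too. */
double complex make_complex(double complex a, double complex b) {
    return a + b * I;
}
```

With real parts, (1,2) gives 1 + 2i. With complex parts, (1 + 2i, 3 + 4i) gives (1 + 2i) + (3 + 4i) * I = (1 + 2i) + (-4 + 3i) = -3 + 5i.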

Bitwise operators & | ~ ^ << >>, enums and unions are not supported.

Only 1D arrays are supported.
The string type should be used for read-only / print-only strings. For input-able strings use charbuf : see syntax.md

Note that it is not necessary to declare a function before use, nor to use only functions that are declared in this file. The transpiler works in two stages: it scans all function signatures before the actual parsing, and any function that is not declared in the file is assumed to be void-type, so its result must be type cast.
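
The lookup with a void fallback might be sketched like this. It is an illustration only, not the project's code: the type names, the table contents and `return_type_of` are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef enum { TYPE_VOID, TYPE_INT, TYPE_FLOAT } RetType;

typedef struct {
    const char *name;
    RetType ret;
} FnSignature;

/* Stage 1 (assumed): signatures collected by a first scan of the file. */
static const FnSignature known_fns[] = {
    { "area",  TYPE_FLOAT },
    { "count", TYPE_INT   },
};

/* Stage 2 (assumed): while parsing a call, look the name up.
   A function that was never declared in this file falls back to void. */
RetType return_type_of(const char *name) {
    size_t n = sizeof(known_fns) / sizeof(known_fns[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(known_fns[i].name, name) == 0) return known_fns[i].ret;
    }
    return TYPE_VOID;
}
```

Here a call to `area` is typed as float, while a call to an external function such as `puts` is typed as void until the caller casts it.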

In for loops of the form for var in a .. b, there must be a space between a and '..' and between '..' and b.
Also note that to use a for loop over an array, the array must be declared in the same scope, in an enclosing scope, or globally. This does not work on arrays passed as parameters to functions.

Variables declared with let must be assigned an expression whose type can be inferred. These variables are also of normal type, i.e. neither static nor const.

Note that the fields of a class cannot be accessed if brackets are put around the variable, i.e. if a class variable A has a field B, it can be accessed as A.B but not as (A).B. The same applies to fn / method calls: if fn F returns a class-type variable that has a field B, the field can be accessed by

  • F().B
  • classname A = F();
    A.B
but cannot be accessed as (F()).B
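
In C terms, the two supported forms correspond to something like the sketch below. The struct `Point` and the function names are hypothetical stand-ins for a class and a fn. C itself would also accept `(F()).B`; the restriction is in ttp's grammar.

```c
/* Hypothetical class with a single field B, written as a plain struct. */
typedef struct { int B; } Point;

/* Hypothetical fn returning a class-type value. */
Point F(void) {
    Point p = { 42 };
    return p;
}

/* First supported form: F().B */
int field_direct(void) { return F().B; }

/* Second supported form: assign the result, then access A.B */
int field_via_var(void) {
    Point A = F();
    return A.B;
}
```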

For those who will look into the code

The project is divided into three folders : lib contains all the functions used in the parser, util contains the data structures required by the parser, and transpiler contains all the bison and flex files.

Do not go looking for ASTs, as there are none: the expressions only have to be printed in infix format in C, so ASTs are not used. This is the very reason error detection for void-returning functions is poor: if a void check is put at expr sign (here) expr, it gives a lot of shift/reduce conflicts, and if it is put on the value, it gives errors for externally declared functions, even when they are type cast.

The function declaration syntax is fn name(paramlist) -> ret type {...} because otherwise there were shift/reduce conflicts in topstmt, between declaration and fndeclaration, and I didn't want to use GLR.

The scheme used for recursively verifying expressions in fncalls is (kind of) ad hoc. It pushes the arglist ptr and the current type of the expression onto two individual stacks, and when it finishes parsing a fncall it pops those two back.
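
A rough sketch of that push/pop scheme, not taken from the project's source. `ArgList`, `ExprType`, `MAX_DEPTH` and the two function names are hypothetical, and the two stacks are kept in lockstep with one shared depth counter for brevity.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 64

typedef enum { TYPE_VOID, TYPE_INT, TYPE_FLOAT } ExprType;
typedef struct ArgList ArgList;  /* opaque; stands in for the parser's arglist */

static const ArgList *arg_stack[MAX_DEPTH];  /* stack 1: arglist pointers */
static ExprType type_stack[MAX_DEPTH];       /* stack 2: expression types */
static size_t depth = 0;

/* Called when the parser enters a function call: save the outer state. */
void enter_fncall(const ArgList *outer_args, ExprType outer_type) {
    assert(depth < MAX_DEPTH);
    arg_stack[depth] = outer_args;
    type_stack[depth] = outer_type;
    depth++;
}

/* Called when the call is fully parsed: restore the outer state. */
void leave_fncall(const ArgList **outer_args, ExprType *outer_type) {
    assert(depth > 0);
    depth--;
    *outer_args = arg_stack[depth];
    *outer_type = type_stack[depth];
}
```

For a nested call such as f(g(x)), the state for f's argument list is pushed when g is entered and restored when g is done, so checking of f's remaining arguments can continue where it left off.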

For using Valgrind for leak checks : if the normal installation gives an error because of the strlen bug, try uninstalling it, then either build it from the latest source or install it from apt-get / the respective package manager instead of from snap. Also install libc6-dbg:i386 using the package manager. The snap of Valgrind didn't work because of the strlen error.

For input-able strings, charbuf is introduced. It translates to char [ ][ ], and can be declared with [expr], [expr] = {...}, [] = {...}, [expr][expr], [expr][expr] = {...}, [][expr] = {...}. These essentially create char arrays or double arrays into which input can be read. As C itself allows taking input into statically allocated strings, e.g. char c[] = "...", ttp allows that as well by taking input into strings, but this will cause a seg fault at runtime. It is therefore best to create a charbuf [] to take string input, and a charbuf [][] to make an array of input-able strings.
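
In plain C, the two charbuf shapes behave roughly like the writable arrays in this sketch. The lowering shown in the comments is an assumption, the function names are hypothetical, and `sscanf` on a string stands in for reading from standard input.

```c
#include <stdio.h>
#include <string.h>

/* Returns the length of the first word of `input`,
   after reading it into a writable buffer. */
size_t first_word_length(const char *input) {
    char buf[32] = {0};  /* assumed lowering of a charbuf declared with [32] */
    /* %31s leaves room for the terminating '\0'. */
    if (sscanf(input, "%31s", buf) != 1) return 0;
    return strlen(buf);
}

/* Reads two words into an array of buffers; returns 1 if they are equal. */
int two_words_equal(const char *input) {
    char names[2][16] = {{0}};  /* assumed lowering of a charbuf declared with [2][16] */
    if (sscanf(input, "%15s %15s", names[0], names[1]) != 2) return 0;
    return strcmp(names[0], names[1]) == 0;
}
```

The width limits in the format strings keep the reads inside the buffers, which is the main reason to give a charbuf an explicit size.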

The reason for loops work only on arrays in scope, and not on arrays passed as pointers, is that C can track the size of arrays declared in the scope or globally, but once an array decays to a pointer, the scheme used for determining its size, sizeof(a)/sizeof(a[0]), no longer works.
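
The difference can be seen in a few lines of C. The function names are made up for illustration.

```c
#include <stddef.h>

/* In the declaring scope the array type is intact,
   so the element count can be computed. */
size_t count_in_scope(void) {
    int a[5] = {1, 2, 3, 4, 5};
    return sizeof(a) / sizeof(a[0]);
}

/* Once passed to a function the array is just a pointer:
   sizeof sees the pointer, not the array. */
size_t size_seen_by_callee(const int *a) {
    return sizeof(a);
}

size_t size_after_passing(void) {
    int a[5] = {1, 2, 3, 4, 5};
    return size_seen_by_callee(a);
}
```

`count_in_scope` yields 5, while `size_after_passing` yields the size of a pointer regardless of how many elements the array has, so the element count cannot be recovered inside the callee.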
