Skip to content

Latest commit

 

History

History
198 lines (114 loc) · 8.84 KB

howitworks.md

File metadata and controls

198 lines (114 loc) · 8.84 KB

How the tiny lisp interpreter of mine is working

Content

The List or Tree Structure

Probably one of the reasons why Lisp is still worth looking at in the 21st century.

The structure is build recursivly from the “tokenized” form or from without brackets. The required functions are all in reader.js. The evaluation copies this structure. However the functions for evaluation are in eval.js .

There is a function readForm which calls either readAtom or readList. Through readForm readList is interwoven with readAtom, which reads a single Element.

The structure is saved as array. In simple language each thing between parenthesis is actually an array itself.

So (+ 1 (* 2 3)) becomes [ “+”,”1” , [ “*” , “2”, “3” ] ] and then further types and additional information is added.

The function readList reads an whole such array.

The function readAtom is extended to recognize and set types.

Currently implemented are

  • SYMBOL … an alphanumeric string ; used for variables and functions
  • BOOLEAN … boolean values true and false
  • INTEGER … simple numbers without dot and minus



The REPL

REPL is short for repeat , evaluate … etc. There is even an npm package for that because it is so common and that is used in repl.js.

By the way you start the whole thing with

<center>node repl.js . </center>

The Evaluation

... or the E in REPL is actually the heart of the whole thing and it is by nature very recursive.

Basically like in Thue or in other simple languages based on String Rewriting Theorem you replace one part with just another and then do that thing over and over again.

Lisp uses its “list” or better tree or graph structure. It uses also different environments – more about that in another chapter.

JavaScript knows associative arrays and both functions and variables are stored in an associative array. For ease I call and to be compatible with the sayings and writings of others I call this “environment”.

A variable usually marked as “type: SYMBOL” is looked up in the array and then it is replaced by the value it represents.

Just to give a sample:

(define x 3 ) // roughly the same as JS var x=3 

an expression (+ x 1) is replaced with (+ 3 1) as 3 is the value of x and then in the next call or execution of function evalList in function evaluate the +-function is executed and (+ 3 1) is replaced with (4).

(4) and other similar looking things are called singletons by me and they are removed from their parenthesis in function evalForm. This means they used to be an array in an array before but now become part of an array. In the case of (4) , it just becomes 4.

The functions evalForm, evalList and evalAtom are anaolog to the functions readForm, readList and readAtom in file reader.js and already discussed in the first chapter.

It is important that those two things are analog to each other because otherwise it would make the whole interpreter an unknown multiple more difficult.

It is important to keep certain parts well structured and it is a bad idea to design them by trial and error. I'd say you need at least a glimpse of an understanding what a logical proof is. Otherwise you will produce one garbage after another.

Just as variables are stored in what's refered to as environemnt so are functions. There is no real difference between single values and lists as single values are just lists with one atom or element. Functions are just stored as lists too.

However certain functions are not stored as lists because they had to be there before the evaluate function or the interpreter was there.

Thus there are functions stored in an associative array called replEnv in eval.js and functions coded in the evaluate function itself.

The latter modify the list structure or add functions and variables to the environment. I will call them “special functions” for now although the term might not be correct.

The “special functions” or list and enviroment modifiers

A variable or function just gets replaced by a return value when evaluated or when processed in the evaluate function.

There are however certain let's say “special functions” or modifiers which do more than that.

Currently the interpreter supports the following

  • define
  • if
  • car
  • cdr
  • cons
  • quote

-- define

... modifies the environment and defines a new function or variable. Currently you have to make use of “quote” in case you want the variable to store the unevaluated list.

So I see I have to give a sample

(define x (+ 1 2)) will do just the same as (define x (3)) . 

(define x (quote (+ 1 2))) will store (+ 1 2) in x. 

-- quote

... is different from the other modifiers in that I even put it out of the recursive or in reality iterative looping.

It is thus even more special than “define” and all others.

“quote” prevents its argument or parameter from being evaluated and thus it is the key of defining functions.

Although the current implementation does not include “lambda” it is somehow quirky possible to define functions which can have parameters and add an equivalent of a local environment.

This was mentioned by an Austrian on the web but he did not explain how , so I started to ask questions and think about it.

-- if

It ruined the shortness of my evalList function and currently has to even work as a substitute by the more elaborate “cond” in my implementation.

“if” replaces itself by value depending whether or not a statement is true or not.

It has three arguments. The first I may call condition or statement which is evaluated to a BOOLEAN value --- thus only either true or false.

In case the first argument evaluates to true the whole thing is replaced by the second argument. In case it is false it gets replaced by the third argument.

It is important that no code of the if-arguments gets executed before it is decided whether or not the condition or statement is true or false.

-- car, cdr, cons

A list can be regarded as a family of mice biting each other in the tail – probably going for a walk that way.

“car” returns the first element or head of the list. In our mouse family example this would probably be mama mouse. In Lisp jargon this is described as returning the head.

“cdr” returns all the elements attached to the first element. In Lisp jargon this is called the tail.

“cons” is short for constructing and constructs a list from two arguments which are usually lists by themselves. Unfortunately the implementation of “cons” is still buggy by the time I am writing this.

Don't think about mice as those are list elements and I have told you that list elements can also be list of themselves.

This cries for another sample so I give you one.

You can define a list currently by

(define x ((1 2)(3 4)) )

thanks to the implementation of types , which doesn't let the code suppose 1 and 3 are functions.

(car x) should then return (1 2)

while (cdr x) should then return (3 4) .

Workarounds for lack of lambda

As I mentioned earlier “lambda” is not yet implemented so I am showing you a workaround.

Suppose you want to implement a simple function “double” which doubles the amount of a given parameter.

In Scheme and other Lisp interpreters you can do so by

(define double (lambda (x) (* 2 x)) )

which you can also do in javaScript by

var double = function (x){return 2*x;};

or by using the arrow function

var double = x => 2*x;

So , how can it be done without lambda ?

(define double (quote (* 2 (car (cdr doublex)))) )

then you attach the value you want for x with

(define doublex (cons double (quote (3 0))) )

then you execute the whole thing with

(eval (car doublex))

... unfortunately I have not yet made evaluate to be called directly , but you can do so quite easily.