Sunday, December 6, 2020

Sugar – a typed lispy language targeting webasm/wat

Sugar - a typed lispy language targeting webasm/wat

(home) 2020-12-06

Despite what I may have said in an earlier post, it's not all that nice to code directly in wat. So I've started work on a compiler. A very thin compiler. The language should feel very similar to our, beloved, wat, but save us from typing all those i32.const, local.get.

Some points up front:

  • the compiler targets WAT (not WASM). This means using WABT::wat2wasm.
  • not self hosting come on people, I don't even have malloc to play with yet, let alone a buggy, poorly implemented, partial copy of common lisp.
  • the compiler is written in common lisp (specifically, sbcl). hereinafter CL.
  • syntax checking is state-of-the-art and errors are output in a readable format (i.e. you will get a cryptic sbcl debug trace if you make the slightest mistake).
  • why sugar? It was either that or watson, watsup, watsamatta, watulookin. Also cos it's just syntactical sugar, get it?
  • internet searches are to be done using sugarlang or sugarland if you prefer.
  • there is no manual; not even mul/div or all the relation operators are written e.g <= are written yet!
  • the results are not standalone but a wasm file. The best standalone runtime I've found so far is wasm-interp. Please don't say node.js. You can say v8, maybe.
node.js/npm
That feeling when ...
People say Minimalist System
Then say "just use npm install ..."

sugar example

Here is the malloc.wat example from the previous blog post on webasm re-written in sugar. It's quite a bit more concise and convenient than the original raw wat (see below).

(previous post on wat /blog/2020-12-03-webasm-forth-with-lisp-syntax.html )

; malloc.sugar - sugar example code
(memory (import "js" "mem") 10)
(import "console" "log" (func js_log (i i32)))
(global memend i32 #x1000)
(global sbrk (mut i32) 0)
; ----- allocate some memory in wasm-space
(defun (export "malloc") (len i32) :result i32
  (locals (i32 (newbrk (+ sbrk len)) (mem sbrk)))
  (if (>= newbrk memend) (return 0))
  (set sbrk newbrk)
  mem)
; ----- write bytes in wasm-space memory using js callback
(defun (export dump_range) (start i32 len i32)
  (locals (i32 i (end (+ start len))))
  (for (i start end)
    (js_log (get.u8* i))))

And this is what you would have written if you were coding directly in wat.

(module
  (memory (import "js" "mem") 10)
  (import "console" "log" (func $js_log (param $i i32)))
  (global $memend i32  (i32.const 4096))
  (global $sbrk (mut i32)  (i32.const 0))
  ;; ----- allocate some memory in wasm-space
  (func (export "malloc") (param $len i32) (result i32)
    (local $newbrk i32)
    (local $mem i32)
    (local.set $newbrk (i32.add (global.get $sbrk) (local.get $len)))
    (if (i32.ge_u (local.get $newbrk) (global.get $memend))
      (return (i32.const 0)))
    (local.set $mem (global.get $sbrk))
    (global.set $sbrk (local.get $newbrk))
    (local.get $mem))
  ;; ----- write bytes in wasm-space memory using js callback
  (func (export "dump_range") (param $start i32) (param $len i32)
    (local $i i32)
    (local $end i32)
    (local.set $end (i32.add (local.get $start) (local.get $len)))
    (local.set $i (local.get $start))
    (block $break2 (loop $head1
      (br_if $break2 (i32.eq (local.get $i)  (local.get $end)))
      (call $js_log (i32.load8_u (local.get $i)))
      (local.set $i (i32.add  (i32.const 1) (local.get $i)))
      (br $head1)))))

using sugar

Write a your-source.sugar source file in your favorite text editor. Then run sugar on it to produce your-source.wat

This is how I use sugar (with a makefile for calling sugar, then wat2wasm): /src/sugar.mk

$ mk malloc.wasm
./sugar malloc.sugar >malloc.wat
summary: globals:2 functions:3 macros:2
wat2wasm malloc.wat

(see below for installing common lisp)

compiler

The compiler is very small and may be interesting to read, for that reason. /src/sugar

It includes a macro system, which I have used to implement inc and for but it's not yet available for use from sugar itself. Also the full CL defmacro/destructuring-bind lambda list syntax is non trivial to replicate; not to mention backquote.

compiler structure

Here is the overall compiler structure

loop:
  read a form [from source file]
  (compile form) => stdout
compile form:
  integer => (i32.const int)
  symbol => (local/global.get varname) [check in environment/scope]
  list/s-expr/cons =>
    while: (is-macro? form)
      form := (expand-macro form)
    is-special? form => (compile-special form) [defun/import/export/global/set...]
    is-function? =>
      for all args (compile arg)
      call func with args
    otherwise => error undefined function

common lisp?

Too many brackets?

Rule 1: There are no brackets, only indentation.
Rule 0: We suffer the brackets cos it gives us defmacro and that's worth a lot of suffering.

Did you ever generate code? Did you think that was cool? You'll love common lisp. Try it for 2 weeks. Money back guarantee if not satisfied.

value-if

Lisp has value-IF i.e. you can use IF anywhere you use a value. That's like C's ternary but more powerful. At first I thought WAT doesn't have it, but (if) has an optional (result type) clause, like (func) so that it's branches can return (type matched) values. So you can do clever things like this: note the (result i32) clause between if and condition.

(set x
  (if (result i32) (>= z limit)
    (comp-true-value y)
    (comp-false-value s 14 q)))

wat (select) is like the C ternary operator although it can have blocks/statement sequences inside.

select v if: select has all 3 clauses always evaluated while (if) has only the conditional and the chosen branch evaluated. The other (if) branch is not evaluated.

future directions

user accessible macros (defmacro)

Implement (defmacro ...) for sugar.

shortcut and/or

At the moment I only have the bitwise and/or operators '&' and 'bor' and next on the list is having shortcutting logical and/or operators like CL.

Ideally I'd like to have lisp-style and/or where the expression value is returned rather than just 0 or 1. Also standalone and/or so we can have things like

(and (file-existsp fname) (process-file fname))
(or (setup-completep) (error setup-failed))
Shortcut and/or actually compile to nested (if) expressions.

missing operators/functions

A whole slew of function counterparts for other types (i64,f32,f64) are missing. The compiler cannot type-infer which one (see below) so these would have explicit names like +f64, set.f64 etc.

There are many other relops missing etc. They are fairly trivial to add.

arrays

Currently I'm using pointer get*/set* but I'd like to have aref and implicit index to pointer arithmetic.

loops

Some more loops. while repeat etc. The underlying block, loop, br_if, br are exposed so writing loops is possible. It would only be the case of adding some more macros to enable:

(while condition body...)
(forever body...)

Also continue, break would be nice.

structures / defstruct

Some kind of struct might be nice to have. Even if all the fields had to be i32 initially i.e. just a vector with named, instead of numbered, fields.

better syntax/semantic checking

Even the current error messages drop you into the SBCL debugger, which can be a scary place - [ctrl-d] to exit.

type tracing

Knowing the types of values would allow the compiler to check signatures of calls of user functions and builtins. It would also allow me to infer the correct type for set rather than requiring set.f32 etc.

LET

Locals must be defined at the start of a function. This means implementing LET (scoped blocks with local variables) is not trivial. The current, single-pass, compiler would need to become far more complex to first decide how many locals are required in a function, then allocate them up-front and possibly rename them if there are clashes.

self hosting

The holy grail of compiler writers is that they rewrite their compiler in their new language. Common Lisp is so far above the level of wat that this would be a large undertaking. First I'd need to write malloc...

installing common lisp

Setting up sugar/common lisp may be trivial or hard depending on how familiar with common lisp you are :)

Installing common lisp (I use SBCL) is outside the scope of this post. (Tell me it's harder than npm ...)

I refer you to [Zach Beane's] https://www.quicklisp.org/beta/

SBCL might even be in your package manager:

$ pacman -Ss sbcl

sugar is run as a unix script. I've made a custom sbcl core since I prefer iterate over loop. Here is a very brief explanation of how to make such a core, starting with plain, vanilla sbcl.

$ sbcl
* (ql:quickload '#:iterate)
* (sb-ext:save-lisp-and-die "sbcl-iterate")
$ sudo mv sbcl-iterate /usr/share/sbcl/sbcl-iterate

source

references

Tags: sugar sugarlang sugarland wat wasm wabt common-lisp compiler (home)



from Hacker News https://ift.tt/2VL45vs

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.