This repository has been archived by the owner on Aug 25, 2022. It is now read-only.

Generate smaller, simpler code #104

Closed
wants to merge 9 commits

Conversation

yallop
Member

yallop commented Feb 15, 2017

This PR makes a few changes to the code generated by functoria to make it smaller and simpler. The aims are readability and simplicity, not performance, but there might be some small performance improvements.

The typical space savings are around 40-45%. For example,

  • the main.ml file for conduit_server drops from 199 lines to 115 lines
  • the main.ml file for console drops from 80 lines to 43 lines

The code is simplified/shrunk in three main ways:

  1. First, there's a distinction between effectful and non-effectful initialization code in the return type of connect. (This is perhaps a first step in the direction of generating code at a slightly higher level than string concatenation.)

    The implementer of connect tags the generated code with Val or Eff according to whether it's a simple value or a possibly-effectful computation, and the caller of connect uses the tag to determine whether the value can be safely inlined. For effectful computations the generated code follows the previous scheme: a top-level binding is emitted

    let name = lazy (connect_code ())

    which is used in connect code for other modules:

    name >>= fun _name ->
    ...
    return (f ... _name ...)

    For effect-free computations the top-level binding is omitted and the value is instead inlined directly:

    ...
    return (f ... connect_value ...)
  2. Second, the code that forces lazy values and the code that executes the resulting Lwt computations are combined, avoiding an intermediate binding. Before:

    let __foo = Lazy.force foo in
    __foo >>= fun _foo ->
    ...

    After:

    Lazy.force foo >>= fun _foo ->
    ...

    (This changes evaluation order slightly, so it would be a good idea for someone to confirm that the change is safe.)

  3. Finally, the layout is now a little more compact. For example, short expressions now occupy a single line. Before:

    let foo = lazy (
      M.connect ()
     )

    After:

    let foo = lazy ( M.connect ())
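The Val/Eff split described in point 1 can be sketched as follows. This is an illustration only: the names (fragment, Val, Eff, splice) are hypothetical and not the PR's actual API.

```ocaml
(* Hypothetical sketch of the tagged connect result from point 1. *)
type fragment =
  | Val of string   (* pure expression: safe to inline at each use site *)
  | Eff of string   (* possibly effectful: needs a top-level lazy binding *)

(* The caller inspects the tag: Eff fragments get a top-level
   [let name = lazy (...)] binding; Val fragments are inlined directly. *)
let splice ~name = function
  | Val expr -> (None, expr)
  | Eff expr -> (Some (Printf.sprintf "let %s = lazy (%s)" name expr), name)

let () =
  assert (splice ~name:"clock1" (Val "Clock.v") = (None, "Clock.v"));
  assert (splice ~name:"console1" (Eff "Console.connect ()")
          = (Some "let console1 = lazy (Console.connect ())", "console1"))
```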

@yallop
Member Author

yallop commented Feb 15, 2017

It may be wise to leave this unmerged until Mirage 3 is safely released.

@Drup
Member

Drup commented Feb 15, 2017

Just so that I can get a quick look before reinstalling half my mirage switch: Could you pastebin before/after for some medium-sized unikernel ?

@yallop
Member Author

yallop commented Feb 15, 2017

Here's a before/after gist for clock.

@Drup
Member

Drup commented Feb 15, 2017

Ok, Change 2 is incorrect because it doesn't launch the sub-connects concurrently; it makes them sequential.

Here is the original version. First, all the tasks are forced, which makes them execute concurrently; then they are lwt-bound.

let f11 = lazy (
  let __time1 = Lazy.force time1 in
  let __pclock1 = Lazy.force pclock1 in
  let __mclock1 = Lazy.force mclock1 in
  __time1 >>= fun _time1 ->
  __pclock1 >>= fun _pclock1 ->
  __mclock1 >>= fun _mclock1 ->
  Unikernel1.start _time1 _pclock1 _mclock1
)

Here is the new version: the tasks are forced and lwt-bound immediately, in sequential order.

let f11 = lazy (
  Lazy.force pclock1 >>= fun _pclock1 ->
  Lazy.force mclock1 >>= fun _mclock1 ->
  Unikernel1.start () _pclock1 _mclock1)
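The ordering difference can be shown with a toy model in which forcing a lazy value "launches" a device and >>= only consumes its result. This is an illustrative stand-in for Lwt, not real Lwt code; the event log records the launch/wait order under the two schemes.

```ocaml
(* Toy model: forcing a lazy "launches" a device (logged as "start");
   [>>=] only consumes a result (logged as "wait"). Illustration only. *)
let events = ref []
let note e = events := e :: !events

type 'a dev = { name : string; value : 'a }

let device name value = lazy (note ("start " ^ name); { name; value })
let ( >>= ) d k = note ("wait " ^ d.name); k d.value

(* New (sequential) scheme: each device is launched only after the
   previous one has been waited on. *)
let () =
  events := [];
  let p = device "pclock" 1 and m = device "mclock" 2 in
  let (_ : int) = Lazy.force p >>= fun x -> Lazy.force m >>= fun y -> x + y in
  assert (List.rev !events
          = ["start pclock"; "wait pclock"; "start mclock"; "wait mclock"])

(* Original scheme: both devices are launched before either is waited on. *)
let () =
  events := [];
  let p = device "pclock" 1 and m = device "mclock" 2 in
  let dp = Lazy.force p in
  let dm = Lazy.force m in
  let (_ : int) = dp >>= fun x -> dm >>= fun y -> x + y in
  assert (List.rev !events
          = ["start pclock"; "start mclock"; "wait pclock"; "wait mclock"])
```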

@Drup
Member

Drup commented Feb 15, 2017

(To see why this is important more concretely: consider the case where you have two main devices that both launch servers listening on different ports. The "main" Mirage device will connect on both. It must do so concurrently)

@yallop
Member Author

yallop commented Feb 15, 2017

Thanks for taking a look at this, @Drup.

Could you be even more concrete? e.g. with a pointer to some code.

The code that functoria currently generates looks quite sequential already, since the Lwt computations are sequenced with >>=. But perhaps there's some additional concurrency introduced when the lazy values are forced.

@Drup
Member

Drup commented Feb 15, 2017

@yallop >>= does not launch the lwt computation on the left-hand side; it only waits for it. The actual promise is launched when we Lazy.force it. I'll see if I can come up with a unikernel that exercises this feature.

@yallop
Member Author

yallop commented Feb 15, 2017

It's ok; I see what you mean. I'll take another look at combining forcing and binding.

@yallop
Member Author

yallop commented Feb 15, 2017

Having looked at this a bit more, I think an applicative approach could work well here.

We could generate something like this

pure Unikernel1.start <$> lazy () <$> pclock1 <$> mclock1

in place of the current

let f11 = lazy (
  let __time1 = Lazy.force time1 in
  let __pclock1 = Lazy.force pclock1 in
  let __mclock1 = Lazy.force mclock1 in
  __time1 >>= fun _time1 ->
  __pclock1 >>= fun _pclock1 ->
  __mclock1 >>= fun _mclock1 ->
  Unikernel1.start _time1 _pclock1 _mclock1
)

With some care it should be possible to arrange to run the applicative so that all the laziness is forced before blocking.

@Drup
Member

Drup commented Feb 15, 2017

@yallop I agree. The code is made to precisely emulate the applicative behavior. :)
The problem is how to make that easy to define at the device level.

Currently, functoria asks the dsl to provide monad-like operators, but that monad doesn't include the laziness aspect. I guess we could touch that a bit?

@avsm
Member

avsm commented Feb 27, 2017

I think an applicative approach could work well here.

That would be lovely alongside other code, like the cmdliner exposure, which has similar idioms.

@yallop
Member Author

yallop commented Mar 7, 2017

@Drup: is there a reason why the "main" code forces values sequentially? Is it important to preserve that behaviour?

@talex5
Contributor

talex5 commented Mar 7, 2017

@yallop I think that is the list of special init tasks. The sequence seems to be:

  • Parse arguments
  • Set up logging (depends on the arguments)
  • Set up everything else (may use logging)

@yallop
Member Author

yallop commented Mar 7, 2017

Thanks, @talex5. Do you think it would be worth making the dependencies explicit in the data flow? i.e. returning a value from each task that can be used as input to subsequent tasks?

@talex5
Contributor

talex5 commented Mar 7, 2017

I originally made a PR in which init tasks were just normal jobs with dependencies, but it was felt that this list-based system was simpler and good enough.

@Drup
Member

Drup commented Mar 7, 2017

The problem with making that part of the data flow explicit is that suddenly everyone must depend on the key device. People must remember to declare a dependency on the key device (and, spoiler, they won't). The graph becomes mostly unreadable, you get some other annoying issues, and we don't even really gain anything.

From an abstract point of view, I would really like it; it would be much more elegant. Currently, there is an ugly hack to find the key device in the graph and do the init properly. It's a very practical approach. If you have a concrete proposal that doesn't make everything else worse, I'm all ears.

@yallop
Member Author

yallop commented Mar 7, 2017

Thanks, @Drup. That all makes sense. Surfacing the dependencies is valuable, but I can see how it could lead to a tangle. I'll give it some thought.

In the meantime, I'll pull some of the less controversial changes here into a separate pull request.

@yallop
Member Author

yallop commented Apr 19, 2017

Here's an outline of a possible approach that leaves the device code ignorant of laziness. The idea is to use an explicit representation of the forced computation to ensure that all forcing of lazy values occurs before all use of Lwt.(>>=).

First, the code for each device is a closed function expression:

(fun x y z ->
 Foo.connect x y z)

A function, lift, turns these expressions into delayed computations (where t is abstract):

val lift : 'a -> 'a t Lazy.t

A second function, <$>, applies the lifted function to delayed Lwt computations:

val ( <$> ) : ('a -> 'b) t Lazy.t -> 'a Lwt.t Lazy.t -> 'b t Lazy.t

Finally, run runs a fully applied computation:

val run : 'a Lwt.t t Lazy.t -> 'a

And here's a simple implementation:

type _ t =
  | V : 'a -> 'a t
  | App : ('a -> 'b) t * 'a Lwt.t -> 'b t

let lift v = lazy (V v)

let rec run' : 'b 'c. 'b t -> ('b -> 'c Lwt.t) -> 'c Lwt.t =
  fun e k -> match e with
    | V v -> k v
    | App (f, v) -> run' f (fun g -> v >>= fun w -> k (g w))

let run v = Lwt_main.run (run' (Lazy.force v) (fun x -> x))

let (<$>) f x = lazy (App (Lazy.force f, Lazy.force x))

You might then write code like this:

let p = lazy (P.connect ())
let m = lazy (M.connect ())

let d = run (lift connect <$> p <$> m <$> m)

The tricky part is ensuring that the thunks are forced before any binding takes place. Representing the computation using t, V and App takes care of that, since that structure and its evaluation function run' know only about Lwt, not about lazy. It's not applicative, exactly, but it's applicative-style.
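As a sanity check of that evaluation order, here is a self-contained version of the sketch with a trivial stub in place of Lwt ('a Lwt.t = 'a, and bind records when it runs); this stub is an assumption for illustration, not real Lwt. The assertions check that every Lazy.force happens before any >>=.

```ocaml
let trace = ref []

(* Trivial stub standing in for Lwt: ['a t = 'a], and bind records when
   it runs so the evaluation order can be checked below. *)
module Lwt = struct
  type 'a t = 'a
  let ( >>= ) x f = trace := `Bind :: !trace; f x
  let main_run x = x
end
open Lwt

type _ t =
  | V : 'a -> 'a t
  | App : ('a -> 'b) t * 'a Lwt.t -> 'b t

let lift v = lazy (V v)
let ( <$> ) f x = lazy (App (Lazy.force f, Lazy.force x))

let rec run' : type b c. b t -> (b -> c Lwt.t) -> c Lwt.t =
  fun e k -> match e with
    | V v -> k v
    | App (f, v) -> run' f (fun g -> v >>= fun w -> k (g w))

let run v = Lwt.main_run (run' (Lazy.force v) (fun x -> x))

(* A device that records when it is forced ("launched"). *)
let dev v = lazy (trace := `Start :: !trace; v)

let () =
  let connect x y = x + y in
  assert (run (lift connect <$> dev 1 <$> dev 2) = 3);
  (* every force ([`Start]) precedes every bind ([`Bind]) *)
  let rec ok bind_seen = function
    | [] -> true
    | `Bind :: r -> ok true r
    | `Start :: r -> (not bind_seen) && ok bind_seen r
  in
  assert (ok false (List.rev !trace))
```

Note that `type b c.` is used for run' here so that GADT matching and the polymorphic recursion both type-check in a standalone file.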

What do you think, @Drup?

@samoht
Member

samoht commented Jul 4, 2017

ping @Drup

@Drup
Member

Drup commented Jul 4, 2017

Sorry for the delay!

@yallop I believe this is a good idea! Your API looks good. I wonder how generic we can make it.
One alternative API would be a multi-argument apply (with a heterogeneous list) and a heterogeneous Lwt.join. This is pretty much the same anyway (the <$> are the conses), but it makes it a bit clearer that we want join semantics.
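The heterogeneous-list reading might look something like this sketch, with 'a Lwt.t again stubbed as 'a; hlist, force_all, and apply are illustrative names, not a proposed API.

```ocaml
type 'a lwt = 'a   (* stand-in for 'a Lwt.t; illustration only *)

let trace = ref []

(* A device that records its name when forced ("launched"). *)
let dev name v = lazy (trace := name :: !trace; v)

(* Heterogeneous list of delayed computations. *)
type _ hlist =
  | Nil  : unit hlist
  | Cons : 'a lwt Lazy.t * 'b hlist -> ('a * 'b) hlist

(* The heterogeneous "join": force (launch) every element, left to
   right, before any result is consumed. *)
let rec force_all : type a. a hlist -> a = function
  | Nil -> ()
  | Cons (x, xs) -> let v = Lazy.force x in (v, force_all xs)

(* Multi-argument apply: a whole <$> chain collapses into one call. *)
let apply f args = f (force_all args)

let () =
  let connect (p, (m, ())) = p + m in
  assert (apply connect (Cons (dev "pclock" 1, Cons (dev "mclock" 2, Nil))) = 3);
  assert (List.rev !trace = ["pclock"; "mclock"])
```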

@avsm
Member

avsm commented Feb 2, 2018

This one seems relevant to resurrect again :-)

@Drup
Member

Drup commented Feb 2, 2018

Indeed!
@yallop Do you want to take a shot at implementing your last proposal? We could also take that occasion to revisit what we need from the underlying datatype. Currently it's a monad, but maybe we don't need that much.

@hannesm
Member

hannesm commented Dec 22, 2018

Sorry for being late to the party, but I'm wondering what the status of this PR is? It looks like this would be a good improvement for functoria.

AFAICT @yallop's most recent suggestion from #104 (comment), to use an explicit representation of the forced computation, is not yet implemented -- @yallop, any chance you have some time to develop this code and rebase this PR on master? I can rebase the accompanying PR in mirage if you like.

@Drup
Member

Drup commented Dec 22, 2018

Actually, the last set of patches implements the idea of separating values and computations. The code looks decent at a glance, but it breaks every single device in existence. :/

@hannesm
Member

hannesm commented Dec 22, 2018

@Drup I searched through GitHub for mentions of base_configurable (which catches most definitions of custom devices), and most were in mirage/mirage and mirage/mirage-skeleton, a few distributed around other example unikernels and your lua. I think it is manageable to break the API and update all clients.

@yallop
Member Author

yallop commented Feb 18, 2019

I've rebased against master, and I plan to revisit this code to bring it into line with the design discussed above.

@hannesm
Member

hannesm commented Feb 28, 2019

Thanks @yallop, if you get around to doing that, that'd be great. As mentioned earlier, I'm happy to rebase your mirage/mirage#790 PR once this is settled :D

@samoht
Member

samoht commented Dec 5, 2019

It would be nice to merge this PR at some point, as simpler code is always nicer :-)

@yallop do you think you could update these patches to bring them in line with the above discussion?

@samoht
Member

samoht commented Oct 7, 2021

The functoria codebase has changed a little bit, but this is still a very much wanted code-generator simplification. @yallop if you are still interested in implementing these changes, I'll be super happy to merge them :-)

@samoht
Member

samoht commented Oct 9, 2021

Actually this repository is not active anymore, so please feel free to move your patch to mirage/mirage !

samoht closed this Oct 9, 2021

6 participants