Skip to content
sarahkf edited this page Jul 26, 2016 · 20 revisions

Tuples

Updated 6-21-2016 to use {a;b} syntax

Updated 11-29-2015 to use {a,b} syntax.

Updated 5-28-2015 to use (a;b) syntax.

The currently-proposed t;0 syntax for access is ambiguous, need a better alternative. We don't want to use t.0 because that's too close to conflicting with likely syntax errors concerning number literals like .5, which need a special syntax error.

Open Questions

Require spaces after semi-colons?

Change tuple access to not need a dot?

Summary

Tuples are useful for a whole bunch of things, and until now we've tried to see if we can get by without explicit tupling in the language. We've used anonymous objects for returning small collections of things, and several of us have ended up just adding a "Pair" datatype that we've used in isolated cases. Also, constructor syntax for key-value constructors (like string-dict) just don't work out in a typed context as well without some way to group keys and values.

Good tuple syntax would help us clean up a lot of these use cases. This is (the start of) a tuple proposal for Pyret.

Basics

Parenthesized, semicolon-separated sequences are for tuples: Braced, comma-separated sequences are for tuples:

t = {"fst", "snd", 3}

The type syntax for tuples uses the same notation:

{Number, String, (Boolean -> Boolean)}

Destructuring can be done by assignment:

check:
  {a; b; c} = {1; 2; 3}
  a is 1
  b is 2
  c is 3
end

Or by explicit access:

check:
  tup = {"a"; "b"; "c"}
  tup{0} is "a"
  tup{1} is "b"
  tup{2} is "c"
end

That's about it; lots more discussion is below.

Uses

Use as key-value pairs

This would allow us to change the string-dict constructor to:

sd = [string-dict: {"a"; 5}, {"b"; 6}]

If we want special-er key-value syntax in constructors, it could desugar to tuples, which is still easy to type-check. So

[string-dict: "a" => 5, "b" => 6]

Could desugar to:

[string-dict: {"a"; 5}, {"b"; 6}]

Getting at tuple contents

Destructuring assignment is special syntax just for tuples:

{a; b; c} = {1; 2; 3}

A nice thing about this is that it doesn't require any special integration with cases to start, but it can be added later. So if you have a list of pairs, you'd write:

fun unzip<A; B>(l :: List<{A; B}>) -> {List<A>; List<B>}:
  cases(List<(A; B)>) l:
    | empty => {empty; empty}
    | link(f, r) =>
      {v1; v2} = f
      {rest1; rest2} = unzip(r)
      {link(v1, rest1); link(v2, rest2)}
  end
where:
  unzip([list: {1; 2}, {3; 4}]) is {[list: 1; 3], [list: 2; 4]}
end

Down the road, if pattern matching in cases gets more complete, the matches can just merge up into the cases patterns. To start, I think extending let-binding to handle this case will add enough for us to get a feel for the feature.

This kind of assignment does not nest (e.g. no multiple tuple levels).

We will have to write, e.g. unzip3, unzip4, for larger tuple cases.

For arbitrary element access, we could also add special syntax that requires a number:

check:
  tup = {1; 2; 3}
  tup.{0} is 1
  tup.{1} is 2
  tup.{2} is 3
  tup.{3} raises "lookup-large-index"
end

It's a little ugly, but not terrible. Chaining could get odd:

check:
  tup = {{"skip"; {"skip"; "skip"; {"skip"; "skip"; "skip"; "this one!"}}}}
  tup.{0}.{1}.{2}.{3} is "this one!"
end

I think we do want (whether we expose concrete syntax for it or not), a binding form that does this, so we can express tuple deconstruction as a desugaring rather than as its own operation. That is,

{a; b; c} = {x; y; z}

means...

t = {x; y; z}
a = t.{0}
b = t.{1}
c = t.{2}

This avoids tying up the semantics of tuple access with binding forms.

Equality

Tuples recursively check their members for both equal-always and equal-now. Tuples have identity, so identical will report separate instantiations of tuples containing the same values as non-identical.

check:
  {1; 2; 3} is {1; 2; 3}
  {1; 2; 3} is%(equal-now) {1; 2; 3}
  {1; 2; 3} is-not%(identical) {1; 2; 3}
  t = {1; 2; 3}
  t is%(identical) t
end

Type-checking Tuples

One point of this proposal is to add enough syntax to make type-checking tuples as easy as possible:

{e1; e2; ...} :: {t1; t2; ...}

Single-element access is easy to type-check by projecting out the correct index of the tuple type (and checking for overflow). By the desugaring above, checking destructuring assignment is also simple.

One thing we could do to aid inference is allow annotations in tuple destructuring:

{a :: Number; b :: String; c :: (Number -> Number)} = {x; y; z}

This is similar to how we allow annotations in cases and for bindings.

Representing Tuples

raw-arrays already steal JavaScript array as a datatype (and they probably deserve to keep it). I think we'd want a few representations for the simple cases:

function Tuple1(v0) {
  this.v0 = v0;
}

function Tuple2(v0, v0) {
  this.v0 = this.v0;
  this.v1 = this.v1;
}

function Tuple3(v0, v1, v2) {
  this.v0 = this.v0;
  this.v1 = this.v1;
  this.v2 = this.v2;
}

And then

function TupleN(v0, v1, v2, vs) {
  this.v0 = v0;
  this.v1 = v1;
  this.v2 = v2;
  this.vs = vs;
}

For all further cases.

The compiler ought to be able to generate really fast code for the small cases (which will look like normal JS class member lookup to please the JIT), and will have an extra level of indirection (through an array) for the larger cases.

Concretely:

tup{0}  == compiles to ==>  tup.v0
tup{1}  == compiles to ==>  tup.v1 ? tup.v1 : tuple_index_error(...)
tup{2}  == compiles to ==>  tup.v2 ? tup.v2 : tuple_index_error(...)
tup{3}  == compiles to ==>  tup.vs && tup.vs[3] ? tup.vs[3] : tuple_index_error(...)

The checks are obvious to omit when code is fully type-checked.

Examples

Since these are for exploration, they may have some typos or thinkos in them; please feel free to correct on the mailing list or via an issue.

Several are pretty obvious functional patterns, which are used to feel out the syntactic choices.

zip/unzip

fun zip<A; B>(l1 :: List<A>, l2 :: List<B>) -> List<{A; B}>:
  doc: "Assumes same length for lists"
  cases(List<A>) l1:
    | empty => empty
    | link(f1, r1) =>
      link({f1; l2.first}, zip(r1, l2.rest))
  end
end

With some extensions to cases, we could write zip as:

fun zip<A, B>(l1 :: List<A>, l2 :: List<B>) -> List<{A; B}>:
  doc: "Assumes same length for lists"
  cases({List<A>, List<B>}) (l1; l2):
    | {empty; empty} => empty
    | {link(f1, r1); link(f2, r2)} =>
      link({f1; f2}, zip(r1, r2))
  end
end
fun unzip<A, B>(l :: List<{A; B}>) -> {List<A>; List<B>}:
  cases(List<(A; B)>) l:
    | empty => {empty, empty}
    | link(f, r) =>
      {v1; v2} = f
      {rest1; rest2} = unzip(r)
      {link(v1, rest1); link(v2, rest2)}
  end
where:
  unzip([list: {1; 2}, {3; 4}]) is {[list: 1, 3]; [list: 2, 4]}
end

key/value constructors

import string-dict as SD

users = [SD.string-dict
          {"joe.politz@gmail.com"; make-user()};
          {"shriram@gmail.com"; make-user()}]

for each(u from users.items()):
  {email; user-obj} = u
  when user-obj.is-new-account():
    send-welcome-email(email)
  end
end

Folding over a pair

This kind of pattern comes up all the time both in our compiler and in CS173 assignments.

fun eval-with-stores(exprs :: List<Expr>, store :: Store) -> {List<Val>; Store}:
  for fold(acc from {empty, store}, expr from exprs):
    {vals; current-store} = acc
    {val; new-store} = eval(expr, current-store)
    {link(val, vals); new-store}
  end
end

Currently, things like this are common:

fun eval(expr :: Expr, s :: Store) -> { v: Val, s: Store }:
  # an interpreter
end
fun eval-with-stores(exprs :: List<Expr>, store :: Store) -> {vs : List<Val>, s: Store}
  for fold(acc from {vs: empty, s: store}, expr from exprs):
    result = eval(expr, acc.s)
    {
      v: link(result.v, acc.vs),
      s: result.s
    }
  end
end

Which requires remembering names like "v" and "s" if it wants to be concise, or writing out long names like "value" and "store" to be more descriptive. Also note that it would be weird to name the values field in the return of eval-with-stores just "v" rather than "vs", since it needs to be descriptive. With pairs it's more natural to be anonymous.