Unpacking Option Values: A Case Study in Language Design
Language design happens at many levels of granularity. This post looks at a
fine-grained example.
I mentioned KernelF before.
It is a functional language whose purpose is to be embedded into DSLs,
according to the third pattern described in the previous post. KernelF has option types. This post is about
the design of the syntax used to unpack options.
Option Types
Option types are used to
handle null values in a type-safe way. The constant maybe in the
code below can either be an actual number value, or nothing, represented by none, depending on
the value of aBool. The if expression
then produces either none or 42. This is why
the constant is typed as an option<number> instead of just number.
val maybe : option<number> = if aBool then 42 else none
Most operators, as well as
many dot operations, are overloaded to also work with option<T> if they
are defined for T. If one of the arguments is none, then the
whole expression evaluates to none. In this sense, a none value
“bubbles” up. Note that the type system represents this: the results of the + operator
and the length call in the example below are also typed as options!
val nothing : option<number> = none
val something : option<number> = 10
val noText : option<string> = none
assert nothing + 10 equals none
assert something + 10 equals 20
assert noText.length equals none
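The bubbling behaviour can be approximated outside KernelF. The following is a minimal Python sketch, not KernelF's implementation: it assumes none maps to Python's None, and the helpers lift1 and lift2 are hypothetical names introduced only for this illustration.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def lift2(op: Callable[[T, T], R]) -> Callable[[Optional[T], Optional[T]], Optional[R]]:
    """Turn a binary operation on T into one on Optional[T] (hypothetical helper)."""
    def lifted(a: Optional[T], b: Optional[T]) -> Optional[R]:
        # If either argument is missing, the whole expression is missing.
        if a is None or b is None:
            return None
        return op(a, b)
    return lifted


def lift1(op: Callable[[T], R]) -> Callable[[Optional[T]], Optional[R]]:
    """Turn a unary operation (like a dot operation) into one on Optional[T]."""
    def lifted(a: Optional[T]) -> Optional[R]:
        return None if a is None else op(a)
    return lifted


add = lift2(lambda a, b: a + b)   # stands in for the overloaded +
length = lift1(len)               # stands in for .length

nothing: Optional[int] = None
something: Optional[int] = 10
no_text: Optional[str] = None

assert add(nothing, 10) is None
assert add(something, 10) == 20
assert length(no_text) is None
```

The return types of add and length are Optional, which mirrors the point that the result of an overloaded operation is itself an option.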
The language design issue we
address in this post is: how do you extract a value of type T from an option<T> after
testing that it actually contains a T instead of a none?
The Starting Point
We started with a first-class
concept, with some, plus an expression val that
provides access to the optioned value if it is not none. Having a
first-class concept makes analyses simple to build: a check for some is easy
to recognise because the language concept expresses it directly.
fun f(x: option<number>) = with some x => val none 10
The example above returns the
value inside the option, and 10 if the option contains a none. We also
experimented with dot expressions to access the optioned value:
fun f(x: option<number>) = with some x => x.val none 10
This second version would not
work well for complex expressions such as function calls: repeating the complex
expression before the dot is syntactically ugly and leads to errors if the
called function has side effects. We decided on the first alternative.
Naming
However, this alternative
results in a problem if several with some expressions are nested,
because val would be ambiguous. The name of the expression used to refer to
the value must be changeable. One solution would be to define a value
explicitly:
fun f(x: number, y: number) = {
  val xval = with some maybe(x) => val none 10
  with some maybe(y) => val + xval none 20
}
However, this is too verbose.
We came up with two versions of an abbreviation to define names for the tested
value:
fun f(x: number) = with some v = maybe(x) => v none 10
-- or --
fun f(x: number) = with some maybe(x) as v => v none 10
We preferred <expr> as <name> over <name> = <expr> because
it cannot be confused with an assignment (which we do not support in KernelF,
but people’s mental parser still recognises it). It is also easier from the
perspective of the user, because you can add the name (syntactically and in
terms of typing sequence) after the expression the user wants
to test. Finally, KernelF already has a facility for optionally naming things
with an as suffix. The above can then be written as:
fun f(x: number, y: number) = {
  with some maybe(x) as xval
    => with some maybe(y) as yval => xval + yval
       none 0
    none 0
}
To avoid the annoying
nesting, we allowed comma-separated tests:
fun f(x: number, y: number) =
  with some maybe(x) as xval, maybe(y) as yval
    => xval + yval none 0
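To make the semantics of the comma-separated form concrete, here is a rough Python sketch under stated assumptions. The function with_some is a hypothetical stand-in for the KernelF construct, and maybe is given an invented behaviour (negative numbers become None) purely so the example can run; the post does not define what maybe does.

```python
from typing import Callable, Optional, Sequence, TypeVar

R = TypeVar("R")


def with_some(options: Sequence[Optional[int]],
              then: Callable[..., R],
              otherwise: R) -> R:
    """Run `then` on the unpacked values only if every option holds a value."""
    if any(o is None for o in options):
        return otherwise
    # All tests passed, so each value can be handed over unpacked.
    return then(*options)


def maybe(x: int) -> Optional[int]:
    # Assumption for illustration only: negative inputs count as "no value".
    return x if x >= 0 else None


def f(x: int, y: int) -> int:
    # Mirrors: with some maybe(x) as xval, maybe(y) as yval => xval + yval none 0
    return with_some([maybe(x), maybe(y)],
                     lambda xval, yval: xval + yval,
                     0)


assert f(1, 2) == 3
assert f(-1, 2) == 0
```

As in the KernelF version, a single fallback covers all tests, which is what removes the nesting.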
Using if Expressions
The first-class concept with some turned
out to be disliked by users: it introduces new keywords for something where
users intuitively wanted to use the existing if. So we allowed the if expression
to be used, again with the same variations:
fun f(x: option<number>) = if some(x) then val else 10
fun f(x: option<number>) = if some(x) then x.val else 10
fun f(x: number) = if some(maybe(x)) then val else 10
fun f(x: number) = if some(maybe(x) as v) then v else 10
A problem with using the
existing if expression is that users can construct
arbitrarily complex expressions, such as the following:
fun f(x: option<number>) =
  if some(x) || g(x) then val else 10
In this case it cannot
(easily) be statically checked that inside the then branch, x always
has a value. To enforce this, we ensure that the some expression is
the topmost expression in the if; it cannot be combined with
others. This is trivial to check structurally and avoids the need for advanced
semantic analysis of complex expressions.
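A structural check of this kind might look as follows. This is a minimal Python sketch over an invented toy AST (the classes Some, Or, Call and If are assumptions for illustration), not the actual MPS constraint.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Some:          # some(<expr>)
    expr: Any


@dataclass
class Or:            # <left> || <right>
    left: Any
    right: Any


@dataclass
class Call:          # name(<arg>)
    name: str
    arg: Any


@dataclass
class If:            # if <cond> then <then> else <otherwise>
    cond: Any
    then: Any
    otherwise: Any


def contains_some(node: Any) -> bool:
    """Report whether a some(...) test occurs anywhere in the expression."""
    if isinstance(node, Some):
        return True
    if isinstance(node, Or):
        return contains_some(node.left) or contains_some(node.right)
    if isinstance(node, Call):
        return contains_some(node.arg)
    return False


def check_if(node: If) -> List[str]:
    """Return an error if some(...) is used but is not the topmost condition."""
    if isinstance(node.cond, Some):
        return []  # some(...) is topmost: the then branch may unpack safely
    if contains_some(node.cond):
        return ["some(...) must be the topmost expression of the if condition"]
    return []      # an ordinary Boolean condition, nothing to enforce


ok = If(Some("x"), "val", 10)
bad = If(Or(Some("x"), Call("g", "x")), "val", 10)

assert check_if(ok) == []
assert len(check_if(bad)) == 1
```

The check only looks at the shape of the tree; it never has to reason about what g(x) evaluates to.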
Options as Booleans
We had the idea of
interpreting an option type as Boolean to avoid the need to write some:
fun f(x: option<number>) = if x then val else 10
However, we discarded this
idea because we think that this much type magic is too complicated for our
target audience. Another idea was to use the name of the tested variable (if it
is a simple expression) in the then part, and type it to
the content of the option. This would allow the following syntax:
fun f(x: option<number>) = if some(x) then x else 10
This is harder to implement
because the type of x is now different depending on the location in
the source. This is not easily possible with MPS’ type system. Alternatively,
the second x could be made to be a different language
concept (which comes with a different type), but then one has to prevent the
use of the original x in the then part. This would
require all reference concepts to be aware of the mechanism; every scoping
function would have to call a filter method. While this makes language
extension a little bit harder (users have to call the filtering function), we
decided that this is worth it: since there is nothing else useful one can do with x inside the then part,
providing the “unpacked” value there makes sense.
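The filtering idea can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: Decl and filter_scope are invented names, and the real mechanism lives in MPS scoping functions, whose API is not shown in this post.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Decl:
    """A visible declaration: a name together with its type (toy model)."""
    name: str
    type: str


def filter_scope(candidates: List[Decl], unpacked: List[Decl]) -> List[Decl]:
    """Hide declarations shadowed by an unpacked value, then add the unpacked ones."""
    shadowed = {d.name for d in unpacked}
    visible = [d for d in candidates if d.name not in shadowed]
    return visible + unpacked


# Scope at the position of the then part of: if some(x) then x else 10
outer_scope = [Decl("x", "option<number>"), Decl("y", "number")]
then_scope = filter_scope(outer_scope, [Decl("x", "number")])

assert then_scope == [Decl("y", "number"), Decl("x", "number")]
```

In this model every scoping function that computes candidates would have to pass them through filter_scope, which is the extra burden on language extenders mentioned above.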
Final Design
We settled on the following
syntax. The if conforms to users’ expectations, the as avoids
confusion with assignments, and we provided the magic of “automatic unpacking”
inside the then part:
fun f(x: option<number>) = if some(x) then x else 10
fun f(x: number) = if some(maybe(x)) as v then v else 10
For multiple tested values we
now use && instead of the comma, because the && is used
in logical expressions already as a conjunction; note that other logical
operators are not supported on some tests.
fun f(x: number, y: option<number>) =
  if some(maybe(x)) as xval && some(y)
  then xval + y else 0
For the common case where one
“just” wants to get the value in the option, and an alternative otherwise, the ^: operator
provides a very concise notation:
val aNumber = maybe(x) ^: 0
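For comparison, the same "value or alternative" behaviour can be sketched in Python. The helper name or_else is hypothetical, and None again stands in for none.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def or_else(option: Optional[T], default: T) -> T:
    """Return the value inside the option, or `default` if there is none."""
    return default if option is None else option


assert or_else(42, 0) == 42
assert or_else(None, 0) == 0
# A contained 0 is a real value and must not be replaced by the default.
assert or_else(0, 5) == 0
```

The explicit None test matters: Python's shorter `option or default` would also replace legitimate values such as 0 or the empty string, which is not what unpacking an option should do.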
Why not the familiar Matching?
Many functional languages use
case matching to deal with options. The reason for this is that option types
are often implemented as algebraic data types (ADTs), and case matching is a
natural way to process them. However, KernelF does not have ADTs (because their
purpose is to build custom abstractions, which is not what KernelF is intended
to do), so users do not know about pattern matching. Also, there is no general
pattern matching syntax that could be reused.
Wrap Up
I wanted to publish this one
because it shows that language design can extend to a very fine-grained
language feature, and is not just about “domain abstractions”. It also shows
how user expectations (from some to if) have to be
balanced with implementation effort (treating options as Booleans). Oh, and
here’s the unrelated picture. Almost forgot it :-)