The Evolution of Decision Tables
A little study in language design
Many domain experts like
tables as a way to represent decisions and calculations. They perceive tables to
be more readable than “code”. The popularity of Excel is certainly at least
partly due to this effect. The OMG’s DMN also
uses tables extensively for representing decisions (so-called decision tables).
There are many forms of decision tables, and we have been using them in our
DSLs for a long time.
Their expressiveness has
evolved quite a bit, though. In this article I recount the evolution of
decision tables based on inputs from our users. It is a nice example of
language design, and a good illustration of why it is useful to be able to
evolve the language when you are dealing with business people.
Basic two-dimensional Decision Tables
Years ago, we implemented
the basic two-dimensional decision table in mbeddr, our extensible version of C optimized for embedded
programming:
This table is an expression,
so you can use it wherever C expects an expression. It has a set of Boolean
conditions as row and column headers. The conditions themselves are in principle
independent, but usually every dimension addresses one variable (spd vs. alt). So
essentially, this table represents a set of nested if statements
over two criteria.
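To make the "nested if statements" reading concrete, here is a minimal Python sketch of what such a table boils down to. The thresholds, the result values, the function name `points` and the default of 0 are all hypothetical; only the variable names spd and alt come from the example above.

```python
def points(spd: int, alt: int) -> int:
    """Hypothetical 2x2 table: rows test spd, columns test alt."""
    if spd > 100:            # row header 1
        if alt < 2000:       # column header 1
            return 10
        if alt >= 2000:      # column header 2
            return 20
    if spd <= 100:           # row header 2
        if alt < 2000:
            return 30
        if alt >= 2000:
            return 40
    return 0                 # assumed default when no cell matches
```

Because the whole table is an expression, a call such as `points(150, 1000)` can appear anywhere a value is expected.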
We have since implemented the
exact same structure for KernelF,
the functional base language we use as the basis of business DSLs. Also, in a customer
project in the healthcare domain, we have collaboratively built a version
of this table that explicitly concerns two numeric variables, which made the
syntax much less cluttered:
Generalized N-Criteria Table
This form of decision table
gives up on arranging the two criteria along the two table dimensions, and
instead lists them as separate columns. Here is a simple example that
calculates some kind of base fare based on the state (an enumeration
type) the customer lives in and whether she is a member of some kind of club (a
Boolean):
The tables are evaluated
top-down (so more specific criteria have to be mentioned further up), and an
empty field means “don’t care”. So the table calculates a fare of 1.00 if you
live in BW; the fare in BY depends on whether you
are in the club; and everybody else pays 1.20.
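The evaluation rule can be sketched in a few lines of Python. This is an illustration of the semantics, not of how KernelF implements them: rows are tried top-down, `None` stands in for an empty ("don't care") field, and the first matching row wins. The two BY fares are hypothetical values; 1.00 and 1.20 are taken from the description above.

```python
from decimal import Decimal
from enum import Enum


class State(Enum):
    BW = "BW"
    BY = "BY"
    HE = "HE"


# Each row: (state, in_club, fare). None means "don't care".
ROWS = [
    (State.BW, None,  Decimal("1.00")),
    (State.BY, True,  Decimal("0.90")),  # hypothetical value
    (State.BY, False, Decimal("1.10")),  # hypothetical value
    (None,     None,  Decimal("1.20")),  # catch-all row, must come last
]


def base_fare(state: State, in_club: bool) -> Decimal:
    """Return the fare of the first row whose non-empty cells all match."""
    for row_state, row_club, fare in ROWS:
        if row_state is not None and row_state != state:
            continue
        if row_club is not None and row_club != in_club:
            continue
        return fare
    raise ValueError("no row matched")
```

Moving the catch-all row to the top would make every lookup return 1.20, which is why more specific rows have to come first.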
Multiple Return Values
It turns out that there is often
more than one return value that depends on the criteria. So we allow multiple
result columns, which are represented as a tuple value in terms of KernelF’s
type system, as the explicit return type of the function below illustrates:
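As a rough Python analogue (not KernelF syntax), a table with two result columns is a function whose return type is a tuple. The row values and the name `fare_and_discount` are hypothetical.

```python
from decimal import Decimal
from typing import List, Optional, Tuple

# Each row: (state, in_club, base_fare, discount_percent).
# None means "don't care". All values are hypothetical.
Row = Tuple[Optional[str], Optional[bool], Decimal, int]

ROWS: List[Row] = [
    ("BW", None,  Decimal("1.00"), 0),
    ("BY", True,  Decimal("1.10"), 15),
    ("BY", False, Decimal("1.10"), 0),
    (None, None,  Decimal("1.20"), 0),
]


def fare_and_discount(state: str, in_club: bool) -> Tuple[Decimal, int]:
    """Evaluate top-down and return both result columns as one tuple."""
    for row_state, row_club, base, discount in ROWS:
        if row_state in (None, state) and row_club in (None, in_club):
            return (base, discount)
    raise ValueError("no row matched")
```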
Inline Alternatives
The columns in a decision
table are joined by logical “and”. However, sometimes you also want to express
an “or”. For example, the 15% discount might apply to Bavaria (BY) and Hesse (HE). We’ve
implemented the comma operator to express this (note that you cannot directly use
the existing || operator because the two enum literals are
not Booleans; we could have overloaded the typing rules though):
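One way to picture the comma operator, sketched in Python under the assumption that a cell is either empty, a single literal, or a list of alternatives: the cell matches if the column value equals any of the alternatives. The tuple encoding and the names `cell_matches` and `discount_percent` are illustrative choices.

```python
from typing import List, Tuple, Union

# A cell is empty (None), one literal, or several comma-separated alternatives.
Cell = Union[None, str, Tuple[str, ...]]


def cell_matches(cell: Cell, value: str) -> bool:
    if cell is None:               # empty field: don't care
        return True
    if isinstance(cell, tuple):    # "BY, HE": any alternative may match
        return value in cell
    return cell == value           # single literal: implicit equals


# Each row: (state cell, discount in percent).
ROWS: List[Tuple[Cell, int]] = [
    (("BY", "HE"), 15),
    (None, 0),
]


def discount_percent(state: str) -> int:
    for cell, discount in ROWS:
        if cell_matches(cell, state):
            return discount
    raise ValueError("no row matched")
```

The "or" stays local to one cell, while the columns of a row remain joined by "and".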
Ranges
In all examples so far you
don’t specify a comparison operator; the table implicitly compares with
“equals”. However, as we have seen in the medical example above, we often want
to compare ranges for numerical values. For example, the base fare could change
for children in Baden-Württemberg (BW):
Note that the entries in the age column
are not complete expressions; they take the column value as an implicit
argument. However, it is important to be able to write them this way and not
mention the value (age) every time because that
would make the table more verbose and error-prone.
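The "implicit argument" idea maps naturally onto partially applied comparisons. In the following Python sketch, a range cell is a predicate that still waits for the column value; the helpers `lt` and `ge`, the age limit of 14 and the child fare of 0.50 are assumptions made for illustration.

```python
from decimal import Decimal
from typing import Callable, List, Optional, Tuple

AgeCell = Optional[Callable[[int], bool]]


def lt(limit: int) -> Callable[[int], bool]:
    """Cell '< limit': the age is supplied later by the table."""
    return lambda age: age < limit


def ge(limit: int) -> Callable[[int], bool]:
    """Cell '>= limit'."""
    return lambda age: age >= limit


# Each row: (state, age cell, fare). None means "don't care".
ROWS: List[Tuple[Optional[str], AgeCell, Decimal]] = [
    ("BW", lt(14), Decimal("0.50")),  # hypothetical child fare
    ("BW", ge(14), Decimal("1.00")),
    (None, None,   Decimal("1.20")),
]


def base_fare(state: str, age: int) -> Decimal:
    for row_state, age_cell, fare in ROWS:
        state_ok = row_state is None or row_state == state
        age_ok = age_cell is None or age_cell(age)  # table passes the value
        if state_ok and age_ok:
            return fare
    raise ValueError("no row matched")
```

The rows never mention `age` themselves; the table supplies it, which mirrors the notation described above.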
Embedding the Tables into Context
In the examples we have seen
so far, the tables are used as expressions. Since KernelF is a functional
language, expressions are everywhere, and making tables expressions
allows their use almost everywhere.
Top Level Tables
Very often, though, the
tables are the only expression in a function. This works, of
course, but if you look closely, there’s a bit of duplication: the column
headers refer back to the argument. This is why we have provided a top-level
version of decision tables where this duplication is avoided because the query
columns act as parameter declarations (so they have to specify a type now):
It is surprising how much
difference this makes to many domain experts: they don’t have to explicitly
understand the concept of a function (even though, semantically, the table of
course is one).
Tuple Assignment
We use the decision tables in
a DSL where we have function-like calculations, but the result data structure
is populated by assigning to its members. Here is how the decision tables can
be used in this context:
We first assign the (tuple)
value returned by the decision table to a local value f (which is
inferred to be of type [Currency, Percentage], a tuple type) and we then
use the tuple’s native position-based indexing to assign to the result fields (base and discount are
members of the Fare record). Again: this
works, but it is verbose because of the intermediate value f and the
subsequent position-based assignment. To solve this issue we support assignment
to tuple values if all of the elements of the tuple can be assigned to (i.e.,
are lvalues):
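Python's unpacking assignment is a close analogue and shows the difference between the two styles. In this sketch, `fare_table` is a hypothetical stand-in for a decision table with two result columns; the Fare record with its members base and discount follows the description above, while the field types and values are assumptions.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Tuple


@dataclass
class Fare:
    base: Decimal = Decimal("0")
    discount: int = 0


def fare_table(state: str, in_club: bool) -> Tuple[Decimal, int]:
    """Stand-in for a decision table returning (base, discount)."""
    if state == "BY" and in_club:
        return (Decimal("1.10"), 15)
    return (Decimal("1.20"), 0)


def fill_verbose(result: Fare, state: str, in_club: bool) -> None:
    # Intermediate value plus position-based access.
    f = fare_table(state, in_club)
    result.base = f[0]
    result.discount = f[1]


def fill_compact(result: Fare, state: str, in_club: bool) -> None:
    # Assign the tuple directly to a tuple of assignable targets.
    result.base, result.discount = fare_table(state, in_club)
```

Both functions leave `result` in the same state; the compact form simply removes the intermediate value and the index arithmetic.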
One problem remains, though:
the names and types of the result columns are redundant, because both can
be automatically derived from the values we assign to. So why don’t we just put
the assignment targets into the result columns?
Now, this is a nice compact
notation. Our customers really liked that one. And justifiably so!
It’s still all Expressions!
Despite the specific
notation, we are still in the context of a “full” functional language. For
example, you can have more complex expressions in the conditions …
… and you can use local
values to factor out more complex calculations (there are literally refactoring
operations to extract local values):
We feel that this really is
combining the best of both worlds: expressive programming and “declarative”,
table-based decision making.
Collaboration with our Customers
So why and how did this
evolution happen? Basically, it is because of our customers. They get the point
about language engineering in the sense that they drive us to create less
verbose and more end-user friendly notations. It is our job then, as language
engineers, to find solutions that satisfy the users, but also retain the
integrity of the language (both KernelF and the DSL we are developing for them)
in terms of orthogonality, composability and modular implementation. For
example, it is perfectly ok to build special support into decision tables so
they can “assign to” lvalues, but we don’t want the lvalues to know something
about the decision tables. Sometimes this means that we cannot (or don’t want to)
implement our customers’ wishes to 100%.
It is a really good
collaboration if we are willing to take the customers’ needs and wishes
seriously, but they also understand our concerns about the language design and
implementation (and the resulting slightly lower than 100% wish-fulfilment rate
:-)).