DSLs vs. “Learning Languages”
There is a category of
languages that is used to teach programming to novices, often children.
Historically, the primary example is LOGO,
a language for drawing the well-known turtle graphics onto
a canvas. These days there is a lot of hype around Scratch and similar languages developed around the Blockly framework,
such as CoBlox, a Blockly-style language for programming industrial
robots.
Are learning languages (LLs)
domain-specific languages (DSLs)?
Scratch is clearly not: it is
really a general-purpose language that comes with a syntax and IDE that are
easier to learn than textual code in a text editor. But other LLs might be DSLs. For example, CoBlox
is targeted at the domain of industrial robots. LOGO might be seen as a DSL for
drawing a particular style of graphics. And all of them can potentially be
developed with “DSL development tools” aka language workbenches like MPS, Xtext, Spoofax or Rascal.
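To make the "LOGO as a DSL for a particular style of graphics" reading concrete, here is a minimal sketch of a turtle-style command interpreter in Python. It is not LOGO itself: the command names, the tuple-based program format and the choice of heading 0 as "facing east" are all assumptions made for illustration. It only computes the points the turtle visits instead of drawing them.

```python
import math


def run_turtle(program):
    """Interpret (command, argument) pairs and return the points visited.

    Hypothetical mini-language: "forward" moves, "left"/"right" turn by degrees.
    """
    x, y = 0.0, 0.0
    heading = 0.0  # degrees; 0 means facing east (an assumption of this sketch)
    points = [(x, y)]
    for command, arg in program:
        if command == "forward":
            x += arg * math.cos(math.radians(heading))
            y += arg * math.sin(math.radians(heading))
            # Round to hide floating-point noise from the trigonometry.
            points.append((round(x, 6), round(y, 6)))
        elif command == "left":
            heading = (heading + arg) % 360
        elif command == "right":
            heading = (heading - arg) % 360
        else:
            raise ValueError(f"unknown command: {command}")
    return points


# A square: four sides, turning left by 90 degrees after each one.
square = [("forward", 100), ("left", 90)] * 4
path = run_turtle(square)
```

The whole "domain" fits into three commands, which is part of what makes such a language easy to learn.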
Comparing DSLs and LLs
I want to argue that there are
important differences in the tradeoffs that govern the design of these
languages. Take a look at the following diagram. It illustrates 8 metrics that
distinguish the two kinds of languages; the further out a point is
on its metric axis, the higher that kind of language scores on that metric. The
blue line represents the Learning Languages and the red line represents DSLs. Let’s look at
the metrics in turn.
Domain Coverage
How much of the domain does
the language cover? A DSL, for example, one for payroll calculations, must be able
to cover the whole domain — our customer wants to be able to express the
complete payroll system. For an LL, it is perfectly ok to pick and choose from
a domain, typically in a way that makes the language simpler and easier to
learn.
Essential Complexity
A DSL cannot be simpler than
what is necessary to express all of the domain. For example, the recent payroll
DSL has support for temporal data and rule versioning. These concepts are not
necessarily easy to understand, but they are essential to the domain. An LL can
make relatively arbitrary choices, by picking part of the domain, that reduce
this complexity.
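To give a feel for why temporal data and rule versioning count as essential complexity, here is a small Python sketch. It does not show how the payroll DSL actually works; the `TemporalValue` class, the `tax_on` function, the dates and the numbers are all hypothetical. The idea is only that both a value (a salary) and a rule (a tax calculation) can change over time, so every lookup has to say "as of which day?".

```python
from bisect import bisect_right
from datetime import date


class TemporalValue:
    """A value that changes over time, stored as (valid_from, value) slices."""

    def __init__(self, slices):
        # Sort by start date so we can binary-search for the active slice.
        self._slices = sorted(slices, key=lambda s: s[0])
        self._starts = [s[0] for s in self._slices]

    def at(self, day):
        """Return the value valid on the given day."""
        i = bisect_right(self._starts, day) - 1
        if i < 0:
            raise LookupError(f"no value defined on {day}")
        return self._slices[i][1]


# Hypothetical temporal data: the salary changes in July 2020.
salary = TemporalValue([
    (date(2020, 1, 1), 3000),
    (date(2020, 7, 1), 3300),
])

# Hypothetical rule versioning: the rule itself is a temporal value
# whose slices are functions (integer arithmetic keeps the sketch exact).
tax_rule = TemporalValue([
    (date(2020, 1, 1), lambda gross: gross * 20 // 100),
    (date(2021, 1, 1), lambda gross: gross * 25 // 100),
])


def tax_on(day):
    """Apply the rule version valid on `day` to the salary valid on `day`."""
    return tax_rule.at(day)(salary.at(day))
```

An LL could simply leave all of this out and treat a salary as a plain number. A DSL for real payroll calculations cannot.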
Accidental Complexity
In both DSLs and LLs you
don’t want accidental complexity — after all, it is essentially unnecessary,
and so should be avoided (both points are very low to zero). In an LL, you
really strive for zero accidental complexity at all costs because it hampers
learning. In a DSL, because you have to cover all of the domain, you might have
to make a couple of compromises in elegance, orthogonality or syntax, which
drives up accidental complexity slightly.
Productivity
A productivity increase
compared to general-purpose languages, through appropriate abstractions,
notations, analyses and tools is the primary reason for developing a DSL in the
first place. For an LL this is not so important; presumably you have to think a
lot (because you are learning) while you write programs anyway. And you won’t write
lots of code, which brings us to the next point:
Scalability
Writing large programs,
potentially with selective reuse between them, is an important aspect of many
DSLs; it’s not important for LLs. For example, in an LL, you might prefer a
graphical notation because it is initially “less alien” to novices, but once you
use a language every day, as is the case for many DSLs, you will probably prefer something
more concise, such as text or tables.
Learnability
This is of course the raison
d’être for LLs, so they get full points here. For DSLs it is of course helpful
if they are easy to learn because it reduces initial “resistance” by future
users. But remember, users learn the language once, and then are supposed to be
productive!
Maintainability
… is more or less irrelevant
for LLs: once you’ve mastered a task/exercise, you’ll never modify the program
again. Or even look at it. For DSLs, depending on their use, this might be
different. There are certainly one-shot-style DSLs, for example, where you
script a piece of music or an image processing pipeline. But in most of the
cases we deal with, the programs created with the DSLs live for years and have
to be actively maintained.
Expert Users
For a DSL, you expect the
user to be an expert in the domain. Which is why they appreciate the essential
complexity of the DSL: they have to be able to express all of the domain. For
an LL, the user is not yet a programmer; in particular, for domain-specific LLs
like CoBlox, your users are certainly not experts in the robot domain — after
all, the purpose of the language is to teach them about (programming in) that
domain. They won’t understand why the language has a particular quirk that is
necessary to be able to express important but rare corner cases in the domain.
Summary
So, while both LLs and DSLs
are not the same as (serious-use) general-purpose languages, and while both
might be limited to a particular domain, the shapes of such languages
(literally, see the diagram above) are very different. LLs have to fulfil
different, but also fewer, requirements. We can’t necessarily extrapolate
experiences from one to the other.
Ideally we want a combination
in the sense that a DSL has a subset that is effectively an LL to teach the DSL
in the beginning. Maybe I will write in a future post about how we might
achieve this with limited effort.