The Language Testing Triangle
I often get the question: how
do you test a language, or more specifically, a language implementation in a
tool like MPS or Xtext? That’s a rich question, and the answer doesn’t fit
into a few lines here. I have written extensively on this, both specifically
for MPS and for the case of safety-critical
systems.
But there is an important
detail to this discussion for those cases where your DSL allows its users
to write tests for the models they create, something every self-respecting
DSL targeted at subject matter experts should do.
For example, in a healthcare
DSL, the subject matter experts describe the system basically as a state
machine, but they also have the opportunity to write scenario-based tests to
verify that their state machine works correctly. Similarly for tax calculations: the tax experts describe the calculations
as a tree, and then write tests to verify that their calculation logic is
correct. In both cases, the users do not write the tests on the level of the
generated code, using JUnit or Cucumber or whatever. Instead, the DSLs have
specific syntax for expressing tests on the abstraction level of the domain.
Here’s a screenshot from the tax example:
The question is: as a language engineer,
can you leverage this test infrastructure for testing the language itself?
Testing Models
Let’s investigate how the
subject matter expert sees the world when they write and run tests as part of
their day-to-day work. Their goal is to verify whether the model is correct.
They do this by writing tests, which they assume to be correct in
the sense that they state the correct expectations for the inputs they specify.
This assumption is usually justified because the tests are simpler than
the system/model under test: they contain specific scenarios and not
the complete algorithm (not always, of course; a test can be faulty as well).
However, there’s another
assumption here, which is that the language (with all its execution
infrastructure) works correctly. The subject matter expert does not verify
this; they simply trust it. Again, a valid assumption from their perspective. The
subject matter experts are done with testing once the model, our system under
test here, is covered sufficiently, where “sufficient” is a metric that must
be defined for the particular situation and the risks that materialize if
faults go undetected.
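To make the subject matter expert’s view concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the node classes, the `evaluate` interpreter, the `Scenario` test format, and the 20% rate with an allowance are hypothetical and not taken from the actual tax or healthcare DSLs. The model is a small calculation tree, and the tests only state expected results for specific inputs instead of restating the algorithm.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


# Hypothetical node types of a tiny calculation-tree language.
@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Input:
    name: str


@dataclass(frozen=True)
class Sub:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Max:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class PercentOf:
    rate: int          # whole percent, e.g. 20 means 20%
    base: "Expr"


Expr = Union[Const, Input, Sub, Max, PercentOf]


def evaluate(expr: Expr, inputs: Dict[str, int]) -> int:
    """The 'language implementation': a small tree-walking interpreter."""
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, Input):
        return inputs[expr.name]
    if isinstance(expr, Sub):
        return evaluate(expr.left, inputs) - evaluate(expr.right, inputs)
    if isinstance(expr, Max):
        return max(evaluate(expr.left, inputs), evaluate(expr.right, inputs))
    if isinstance(expr, PercentOf):
        return evaluate(expr.base, inputs) * expr.rate // 100
    raise TypeError(f"unknown node: {expr!r}")


# The model, as the subject matter expert would write it (system under test).
INCOME_TAX = PercentOf(
    20, Max(Const(0), Sub(Input("income"), Input("allowance")))
)


@dataclass(frozen=True)
class Scenario:
    """A domain-level test: concrete inputs plus the expected result."""
    name: str
    inputs: Dict[str, int]
    expected: int


SCENARIOS = [
    Scenario("regular income", {"income": 30000, "allowance": 10000}, 4000),
    Scenario("income below allowance", {"income": 5000, "allowance": 10000}, 0),
]


def run_scenarios(model: Expr, scenarios: List[Scenario]) -> List[str]:
    """Run all scenarios and return one message per failing scenario."""
    failures = []
    for s in scenarios:
        actual = evaluate(model, s.inputs)
        if actual != s.expected:
            failures.append(f"{s.name}: expected {s.expected}, got {actual}")
    return failures


if __name__ == "__main__":
    print(run_scenarios(INCOME_TAX, SCENARIOS))
```

If `run_scenarios` reports a failure, the subject matter expert looks for the fault in `INCOME_TAX` first, because they trust both the scenarios and `evaluate`.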
Testing the Language
How does this picture look
when the language engineer wants to test the language using that same
infrastructure? Here’s the triangle:
In this case the language
engineer trusts that the model and the tests are correct. If tests fail,
it is the language, in particular its interpreter or generator,
that is faulty. They write tests using the same testing syntax as the subject
matter experts, but they are only done when the language implementation reaches
100% coverage (or whatever else is your magic number).
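The language engineer’s “done” criterion can be sketched in the same spirit. The following Python example is a simplification under stated assumptions: the tuple-based node format, the `TracingInterpreter` class, and the idea of counting exercised language constructs are all hypothetical, and construct-level counting is only a coarse stand-in for real code coverage of an interpreter or generator. It shows how running domain-level tests reveals which parts of the language implementation have not been exercised yet.

```python
# Hypothetical language constructs; nodes are tuples: (kind, *children).
CONSTRUCTS = {"const", "input", "sub", "max", "percent"}


class TracingInterpreter:
    """Interpreter that records which language constructs it has executed."""

    def __init__(self):
        self.exercised = set()

    def evaluate(self, node, inputs):
        kind = node[0]
        self.exercised.add(kind)
        if kind == "const":
            return node[1]
        if kind == "input":
            return inputs[node[1]]
        if kind == "sub":
            return self.evaluate(node[1], inputs) - self.evaluate(node[2], inputs)
        if kind == "max":
            return max(self.evaluate(node[1], inputs),
                       self.evaluate(node[2], inputs))
        if kind == "percent":
            # ("percent", rate, base): rate is a whole percentage
            return self.evaluate(node[2], inputs) * node[1] // 100
        raise ValueError(f"unknown construct: {kind}")

    def coverage(self):
        """Fraction of known constructs executed by at least one test."""
        return len(self.exercised & CONSTRUCTS) / len(CONSTRUCTS)

    def uncovered(self):
        return CONSTRUCTS - self.exercised


# A first test model: 20% of (income - 10000). It never uses "max".
EARLY_TEST_MODEL = ("percent", 20, ("sub", ("input", "income"), ("const", 10000)))

# A corner-case model added by the language engineer to reach "max".
CORNER_CASE_MODEL = ("max", ("const", 0),
                     ("sub", ("input", "income"), ("const", 10000)))

if __name__ == "__main__":
    interp = TracingInterpreter()
    assert interp.evaluate(EARLY_TEST_MODEL, {"income": 30000}) == 4000
    print(interp.coverage(), interp.uncovered())
    assert interp.evaluate(CORNER_CASE_MODEL, {"income": 5000}) == 0
    print(interp.coverage(), interp.uncovered())
```

After the first test, `max` is still uncovered, so the language engineer is not done. The second test closes that gap, and it is written in exactly the same form as the first.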
So who writes tests?
Initially, while the language
implementation is new and unproven, the subject matter expert’s assumption that
the language is correct is not justified — if tests fail, it is likely that the
language needs fixing. This is why early in the project it is the language
engineer who writes tests. This also forces them to put in place the language
constructs to express those tests.
Later in the project, when the
language becomes more stable, the subject matter expert writes more tests, using
the same testing syntax, which by now has been developed and tuned by the language
engineer through their own test writing. Occasionally, when a test fails, the fault
will still be in the language implementation, and the subject matter expert will be
puzzled. Then the two have to talk and figure it out.
Even in the long run, the
language engineer will continue to write tests so that they can get to their
coverage goal, reaching corner cases that the subject matter experts perhaps
don’t encounter right away. There are also aspects of the language you cannot test
this way, for example, the type system or generators that produce
non-executable artifacts such as documents. For those and other infrastructural
aspects the language engineer is still in charge of testing. See the previously
mentioned paper.
Conclusion
Here’s the important thing:
all the tests written by the subject matter expert to test their models also
count towards testing the language itself! Overall, this reduces the testing
effort significantly. Because the language engineers use the same infrastructure
(the test syntax), you also don’t have to build a separate language
testing infrastructure. All in all, the approach allows test-driven development
for both the subject matter expert and the language engineer.