Domain-specific languages usually evolve much more quickly than general-purpose programming languages. Because they are tailored to a particular domain, they have to evolve with the domain. And they are developed together with users, so you will try things out and then potentially change something that didn't quite work. This is in contrast to general-purpose programming languages: because their abstractions are so universal, there are no external factors that effectively force the language to evolve. You still might want to evolve it, based on new ideas or fads in programming language design, but you can do so much more slowly.
So if you evolve a language, what do you do with existing models (aka programs)? It is usually not an option to just break them. You have to keep them valid, somehow. In this article I want to discuss the "somehow" a little bit.
The simplest way of ensuring that existing models continue to work is to make sure that your language evolves in a backward-compatible way. In other words, any new version of the language is (syntactically and semantically) a superset of the previous one. For example, adding a new, optional property to an existing concept is backward compatible; renaming or removing a concept is not. If you can do this, then the whole problem goes away. However, it also severely limits your degrees of freedom in how you evolve your language: you can never remove anything, and you cannot change the semantics of any existing construct. In practice this is not a great idea.
The next best thing you can do is deprecate things you no longer want to support -- and at the same time add something that is better in some respect. Deprecated language concepts can still be used, for a time, but will eventually be removed. Depending on your tool, you can also prevent the creation of new instances of the deprecated concept while keeping the existing ones valid, for a time.
For general-purpose languages this is hard, often impossible, because you can never reach all users of the language to tell them to move away from the deprecated concept. Java is a good example of very slow removal of deprecated things. For DSLs, however, this is more realistic, since most are used in some bounded scope -- you often can reach all users. So you can inform, ask, bribe or threaten them to stop using the deprecated concept. You can even have your domain-specific IDE report back to the mothership who still uses the old stuff after some deadline. In addition, you can make migrating away from deprecated constructs easier by providing quick fixes or refactorings that semi-automatically change an existing program into an equivalent version that does not use the deprecated concepts. The sketch below illustrates the idea.
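To make this concrete, here is a minimal sketch in Java. Everything in it is invented for illustration: "Node" is a stand-in for whatever model/AST type your tool provides, and "LegacyLoop"/"ForEachStatement" are made-up concept names. Real tools (MPS included) expose this through their own validation and quick-fix APIs.

```java
import java.util.List;
import java.util.Map;

// 'Node' is a stand-in for whatever model/AST type your tool provides.
final class Node {
    String concept;                  // e.g. "LegacyLoop"
    Map<String, String> properties;  // simplified: every property is a string
    List<Node> children;

    Node(String concept, Map<String, String> properties, List<Node> children) {
        this.concept = concept;
        this.properties = properties;
        this.children = children;
    }
}

final class DeprecationSupport {
    // concepts we no longer want, mapped to their suggested replacement
    static final Map<String, String> DEPRECATED =
            Map.of("LegacyLoop", "ForEachStatement");

    // validation: warn on every use of a deprecated concept -- the model stays valid
    static void check(Node node, List<String> warnings) {
        String replacement = DEPRECATED.get(node.concept);
        if (replacement != null) {
            warnings.add("'" + node.concept + "' is deprecated; use '"
                    + replacement + "' instead");
        }
        node.children.forEach(child -> check(child, warnings));
    }

    // "quick fix": rewrite a deprecated node into its replacement; a real fix
    // would usually also restructure properties and children, not just rename
    static void fix(Node node) {
        String replacement = DEPRECATED.get(node.concept);
        if (replacement != null) {
            node.concept = replacement;
        }
        node.children.forEach(DeprecationSupport::fix);
    }
}
```

The important property is that "check" only warns, so deprecated models remain valid, while "fix" is the opt-in, semi-automatic path out.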
This approach can work in practice. But you are still forced into stepwise backward compatibility.
Let's now move into the space where you can make genuinely incompatible language changes. Assume you live in a closed world where you, as the person who changes the language, have all instances in reach. Repository-based modeling tools can use this approach. You can also use it with git if you can make everybody check in everything by a deadline, on main, so you can then jointly touch all models (not terribly realistic; we'll get back to this).
In such a closed world, you can pull along all models in real time as you make the (incompatible) language change. If the change breaks only a few models, you can fix them manually. More realistically, you will write some kind of script that algorithmically changes all instances of the concepts you changed; see the sketch below. One step further, your language development tool can potentially observe the changes you make to the language, automatically generate the migration scripts from them, and in this way pull along all models fully automatically.
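As an illustration, here is what such a hand-written migration script might look like, reusing the hypothetical "Node" type from the sketch above. The scenario is invented: suppose the language change renames the concept "Interval" to "Range" and splits its single "bounds" property into "lower" and "upper".

```java
// A sketch of a manually written migration script; the concept and property
// names are invented, and 'Node' is the hypothetical model type from above.
final class RenameIntervalMigration {

    static void migrate(Node root) {
        if ("Interval".equals(root.concept)) {
            root.concept = "Range";
            String bounds = root.properties.remove("bounds"); // e.g. "1..10"
            if (bounds != null) {
                String[] parts = bounds.split("\\.\\."); // assumes well-formed data;
                root.properties.put("lower", parts[0]);  // a real script also needs
                root.properties.put("upper", parts[1]);  // error handling/reporting
            }
        }
        for (Node child : root.children) {
            migrate(child); // pull along the entire model, not just the top level
        }
    }
}
```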
This approach works, although the assumption of a closed world in which you have access to all models ever written is a strong one, both organisationally and from a scalability perspective. So let's get rid of that constraint.
This closed-world setting is really the same as in database schema migration. There, too, all the data is accessible -- it's right in the database you want to migrate.
In a distributed, decentralized world -- think: git -- you cannot assume that you have access to all instances of your language at any given time. They live in different repos, on different branches. So you have to migrate them to the new version whenever you see them. Or better: whenever these models see the new language.
You need a bit of infrastructure. The language has to declare which version it is at. Each language version ships with migrations that can pull programs from the previous version over to the current one. Just like in the previous case, these can be written manually or recorded automatically from the language change. Each model also has to declare which language version it conforms to. When the user opens a model in the IDE, the IDE automatically executes the chain of migrations to bring that model up to date with the current language version. The sketch below shows the bookkeeping involved.
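Here is a sketch of that bookkeeping, again with hypothetical types; a real tool such as MPS has its own representation of language versions and migration chains.

```java
import java.util.List;

// each migration knows which language version it produces and how to get there
interface Migration {
    int targetVersion();     // the language version this migration produces
    void apply(Model model); // migrates a model from targetVersion() - 1
}

// a model records which language version it currently conforms to
final class Model {
    int languageVersion;
    Node root; // the hypothetical model type from the earlier sketches

    Model(int languageVersion, Node root) {
        this.languageVersion = languageVersion;
        this.root = root;
    }
}

final class MigrationRunner {
    // conceptually what the IDE does when it opens a model: run all migrations
    // between the model's version and the language's current version, in order
    static void bringUpToDate(Model model, int currentLanguageVersion,
                              List<Migration> chain) { // sorted by targetVersion
        for (Migration m : chain) {
            if (m.targetVersion() > model.languageVersion
                    && m.targetVersion() <= currentLanguageVersion) {
                m.apply(model); // each step builds on the previous one
                model.languageVersion = m.targetVersion();
            }
        }
    }
}
```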
There's another requirement for this to work: the migration script must be able to access the data in a model whose structure potentially doesn't conform to the current language. The specific implications depend on the implementation technology; usually it means that access to the old-version data is based on reflection or goes via an M3 API. It also helps a lot if the tool supports writing test cases for your migrations, to make sure they work correctly when they are executed automatically.
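Note that the generic "Node" type used in these sketches is itself an example of such reflective access: concepts and properties are plain strings, so a migration can read data that no longer conforms to the current language structure. And a test for the hypothetical migration from above could then look like this (using JUnit 5): build a model fragment as it looked under the old language version, run the migration, and assert the new shape.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.junit.jupiter.api.Test;

class RenameIntervalMigrationTest {
    @Test
    void splitsBoundsIntoLowerAndUpper() {
        // a model fragment as it looked under the old language version
        Node old = new Node("Interval",
                new HashMap<>(Map.of("bounds", "1..10")), List.of());

        RenameIntervalMigration.migrate(old);

        assertEquals("Range", old.concept);
        assertEquals("1", old.properties.get("lower"));
        assertEquals("10", old.properties.get("upper"));
    }
}
```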
This last approach, based on a mix of generated and manually written migration scripts, is what MPS uses. If you are careful and thorough with your migration scripts, the approach works well and can help scale out your DSL to a large number of users.
There's one more caveat I have to add: migration scripts are automated. In other words, there must be a way to algorithmically decide how an old program should be moved to the new version. Consider the situation where you "split" the semantics of an existing concept into two new ones: from the old program, you could go either left or right. In this case you cannot fully automate the migration; the user has to decide. I guess this doesn't happen too often, but when it does, you're back to deprecation, a warning message and ideally a quick fix. The sketch below shows one pragmatic way of handling it.
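One pragmatic sketch of such a situation, with invented names as before: migrate the instances that are decidable from the existing data, and collect the ambiguous ones so the IDE can ask the user.

```java
import java.util.List;

final class SplitConceptMigration {
    // "OldConcept" was split into two new concepts; only some instances carry
    // enough information to decide which side they belong to
    static void migrate(Node node, List<Node> needsUserDecision) {
        if ("OldConcept".equals(node.concept)) {
            if ("true".equals(node.properties.get("strict"))) {
                node.concept = "StrictVariant"; // decidable from existing data
            } else {
                needsUserDecision.add(node);    // ambiguous: leave it, ask the user
            }
        }
        node.children.forEach(child -> migrate(child, needsUserDecision));
    }
}
```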
So which one do you use? In my practical work with MPS I proceed as follows:
In the very early stages of language development, when only very few models exist (and all are under my control), I just break things and fix them by recreating the parts of the models I broke. This works because MPS doesn't "kill" the whole model when your language has an incompatible change; it only "breaks" the instances of the concepts that changed. You can see that in the editor, and you just delete that part and recreate it correctly.
As the language grows, usually a few select people start creating models in order to validate the language. Breaking their work, even if it's just prototypical, isn't a great way of keeping them engaged. So I use the eager pull-along approach: it's feasible to tell them to check things in and let me migrate everything at 10 pm. Especially since lots of changes to languages are additive and therefore backward compatible; breaking changes are comparatively rare.
And then, as the language is rolled out to more people, into more repositories and branches, and at the latest when it gets used in production, I use the on-demand, distributed approach with migration scripts. The reason I don't use it right from the beginning is that in MPS you have to write migration scripts manually, and that is of course effort -- it makes language evolution slower. So I like to push this out as far as possible.
It's funny. People always confront me with supposed disadvantages of DSLs compared to all kinds of alternatives. One point people make in this context is that there's supposedly this problem with language evolution. Let me ask you: when was the last time you wrote migrations for your library that automatically pull along client code? Exactly. It's not possible with any of the mainstream languages and IDEs. With DSLs, if you use a decent tool, it is. Disadvantage? I don't think so.
Thanks to Niko Stotz for a couple of really good additions based on his reading of an earlier draft of this article.