I just came across the Reia programming language, which is a language designed to run on Erlang's BEAM virtual machine. Another language I have recently been toying with is Clojure, which is a really nice Lisp implemented on the JVM. Reia, Erlang, and Clojure are all deliberately and carefully designed with language support for high-level, lock-free concurrency abstractions.
I'll come back to Reia in a second, after a long-ass detour, because it's actually Reia that knocked this bit of head dirt loose, not Clojure or Erlang.
Clojure's secret sauce is it's data structures, which are specifically targeted towards concurrency. The data structures are immutable and persistent. The word "persistent" in this context means that when you "modify" it, you don't really alter it, but instead create a new value that incorporates the change. (As opposed to the other meaning of being written to disks, databases, punch cards, or whatever the kids use these days.)
This means that, conceptually, every time you change a persistent data structure, you get a whole new, slightly tweaked version. Conceptually. The important bit here is that because the data structure is also immutable, you don't really have to copy the whole thing, your implementation can share in-memory structure with the old version behind the scenes and no one will be the wiser.
Immutability also means you don't have to use locks to protect shared data from concurrent writes by multiple threads. There are no concurrent writes to protect, because there's no such thing as writing into an existing data structure. When you "update" some data in Clojure, you make a new "copy" of it with the change you wanted, and concurrent readers never notice because the original is still intact.
Now, Erlang is heavily influenced by Prolog and has single assignment variables. What this means is that once you bind a name to a value, you cannot change that binding, the name is forever (till the end of its scope) bound to that value. Essentially, name bindings are immutable. The data structures in Erlang, like Clojure, also happen to be immutable, so there's no getting around this with funny business like binding a name to a data structure and then mutating the data structure.
Reia is designed to run on BEAM, the platform of Erlang. Reia does not have single assignment variables. Curiously, this is a very controversial feature of the language, with many people accusing it of breaking the entire concurrency model of its host platform. I found it extremely odd that it would be so controversial, and haven't really seen any good arguments against it.
Clojure does not have single assignment either, and gets along just fine. The reason it's such a non-issue is manifold. Most fundamentally, bindings are thread-local in all three languages: it is simply not possible to have a data race as a result of a mutated binding. Clojure is unique among the three in that it does have non-thread-local names, but they are explicitly declared as such and have extra concurrency semantics that are enforced by the language, again preventing data races.
Additionally, data races caused by mutating data structures via aliased references are not practically possible in any of these languages, because data structures cannot be mutated at all, and (in Clojure's case) shared references must be mutated under language-enforced conditions. (By "practically" I mean "using the subset of the language not including the shoot-yourself-in-the-foot library backdoors.")
One of the more thoughtful objections to Reia's mutable bindings I've seen was the idea that mutations to a name binding could be observed by another process if you closed a fun over the name and sent it to another process. This is not actually the case as far as I can tell.
Reia's conceptual model seems to be more accurately thought of as allowing bound names to shadow themselves. All the code after each assignment operation is actually seeing a new environment where the name has a new binding that shadows the old one. In this way, when you close over a name, it captures the most recent environment in which the name was bound. Subsequent rebindings are irrelevant because they don't really mutate the binding, they shadow it. Because Reia's data structures (presumably being the same as Erlang's) are immutable, the mutable structure funny business previously described cannot be accomplished either.
In a manner of speaking, Reia has persistent bindings.
Conflating single assignment and concurrency in Erlang seems to be a pretty common thing, so perhaps I'll suggest a better theory about why Erlang is single-assignment: Pattern matching.
Erlang's "assignment" operations (both with equals and in implicit positions) are not assignments at all, but pattern matches with destructuring bind. You can use bound variable names as part of a pattern in Erlang, and the match will fail if the value of the bound variable and it's counterpart in the datum being matched against don't agree:
If Erlang did not have single assignment, the last two expressions would be ambiguous: does it mean rebind Foo in a destructuring bind? Or should it use its value as a pattern in a pattern match? With single assignment, there is no rebinding, so it is always unambiguous: unbound variables get bound in a successful match, bound variables are used as part of the pattern.
Reia disambiguates this differently. It uses a "splat" operation which can be likened to "unquote" in Lisp. This is best described with an example:
The second line shows a plain old destructuring bind. All variables in the pattern are considered L-values and rebound. The third line demonstrates the splat, which can be thought of as unquoting, or causing foo to be evaluated as an R-value. Thus it is the same as writing the fourth line, where foo's value is substituted in the pattern. And finally, the fifth line would cause a failed match, again it does not rebind foo.
I'm pretty sure this theory behind single assignment has a lot more traction than the "it has something to do with concurrency model" theory that I've not seen anyone fully unfurl yet.
Anyway this was a long-ass post, sorry about that. I had to at least attempt to correct the crappy FUD about Reia's rebindable names. I'm actually really excited that more languages are blossoming top of BEAM, it really is a remarkable platform. I always wondered how hard it would be to write a BEAM backend for PyPy...
I'll come back to Reia in a second, after a long-ass detour, because it's actually Reia that knocked this bit of head dirt loose, not Clojure or Erlang.
Immutable/persistent data structures and concurrency
Clojure's secret sauce is it's data structures, which are specifically targeted towards concurrency. The data structures are immutable and persistent. The word "persistent" in this context means that when you "modify" it, you don't really alter it, but instead create a new value that incorporates the change. (As opposed to the other meaning of being written to disks, databases, punch cards, or whatever the kids use these days.)
This means that, conceptually, every time you change a persistent data structure, you get a whole new, slightly tweaked version. Conceptually. The important bit here is that because the data structure is also immutable, you don't really have to copy the whole thing, your implementation can share in-memory structure with the old version behind the scenes and no one will be the wiser.
Immutability also means you don't have to use locks to protect shared data from concurrent writes by multiple threads. There are no concurrent writes to protect, because there's no such thing as writing into an existing data structure. When you "update" some data in Clojure, you make a new "copy" of it with the change you wanted, and concurrent readers never notice because the original is still intact.
The single-assignment spectacle
Now, Erlang is heavily influenced by Prolog and has single assignment variables. What this means is that once you bind a name to a value, you cannot change that binding, the name is forever (till the end of its scope) bound to that value. Essentially, name bindings are immutable. The data structures in Erlang, like Clojure, also happen to be immutable, so there's no getting around this with funny business like binding a name to a data structure and then mutating the data structure.
Reia is designed to run on BEAM, the platform of Erlang. Reia does not have single assignment variables. Curiously, this is a very controversial feature of the language, with many people accusing it of breaking the entire concurrency model of its host platform. I found it extremely odd that it would be so controversial, and haven't really seen any good arguments against it.
Clojure does not have single assignment either, and gets along just fine. The reason it's such a non-issue is manifold. Most fundamentally, bindings are thread-local in all three languages: it is simply not possible to have a data race as a result of a mutated binding. Clojure is unique among the three in that it does have non-thread-local names, but they are explicitly declared as such and have extra concurrency semantics that are enforced by the language, again preventing data races.
Additionally, data races caused by mutating data structures via aliased references are not practically possible in any of these languages, because data structures cannot be mutated at all, and (in Clojure's case) shared references must be mutated under language-enforced conditions. (By "practically" I mean "using the subset of the language not including the shoot-yourself-in-the-foot library backdoors.")
Self shadowing, not mutation
One of the more thoughtful objections to Reia's mutable bindings I've seen was the idea that mutations to a name binding could be observed by another process if you closed a fun over the name and sent it to another process. This is not actually the case as far as I can tell.
Reia's conceptual model seems to be more accurately thought of as allowing bound names to shadow themselves. All the code after each assignment operation is actually seeing a new environment where the name has a new binding that shadows the old one. In this way, when you close over a name, it captures the most recent environment in which the name was bound. Subsequent rebindings are irrelevant because they don't really mutate the binding, they shadow it. Because Reia's data structures (presumably being the same as Erlang's) are immutable, the mutable structure funny business previously described cannot be accomplished either.
In a manner of speaking, Reia has persistent bindings.
Why single-assignment then?
Conflating single assignment and concurrency in Erlang seems to be a pretty common thing, so perhaps I'll suggest a better theory about why Erlang is single-assignment: Pattern matching.
Erlang's "assignment" operations (both with equals and in implicit positions) are not assignments at all, but pattern matches with destructuring bind. You can use bound variable names as part of a pattern in Erlang, and the match will fail if the value of the bound variable and it's counterpart in the datum being matched against don't agree:
Foo = 42, [Foo,Bar] = [42,0], % Success, binds Bar = 0 [Foo,Bas] = [40,0], % Fail, Foo does not equal 40
If Erlang did not have single assignment, the last two expressions would be ambiguous: does it mean rebind Foo in a destructuring bind? Or should it use its value as a pattern in a pattern match? With single assignment, there is no rebinding, so it is always unambiguous: unbound variables get bound in a successful match, bound variables are used as part of the pattern.
Reia disambiguates this differently. It uses a "splat" operation which can be likened to "unquote" in Lisp. This is best described with an example:
foo = 42 [foo,bar] = [40,0] # Destructuring bind, rebinds foo and bar. [*foo,bar] = [40,2] # Successful pattern match on foo, rebinds bar, [40,bar] = [40,2] # evaluates as if you wrote this. [*foo,bar] = [42,4] # Failed pattern match.
The second line shows a plain old destructuring bind. All variables in the pattern are considered L-values and rebound. The third line demonstrates the splat, which can be thought of as unquoting, or causing foo to be evaluated as an R-value. Thus it is the same as writing the fourth line, where foo's value is substituted in the pattern. And finally, the fifth line would cause a failed match, again it does not rebind foo.
I'm pretty sure this theory behind single assignment has a lot more traction than the "it has something to do with concurrency model" theory that I've not seen anyone fully unfurl yet.
Anyway this was a long-ass post, sorry about that. I had to at least attempt to correct the crappy FUD about Reia's rebindable names. I'm actually really excited that more languages are blossoming top of BEAM, it really is a remarkable platform. I always wondered how hard it would be to write a BEAM backend for PyPy...
Current Mood:
geeky
Leave a comment


