Lean Scala
Program code is for communicating between humans, not just for instructing computers. So we strive for code to be lean. Lean code is simple and understandable. It is as concise as possible without losing clarity. It avoids lingo, over-abstraction, and obscure features. It does not mislead, that is, it expresses the meaning of a program clearly and without fault. All these properties are desirable for code just as they are desirable for technical prose.
In principle, one can write lean code in any programming language, but some languages make it easier than others. Some languages are limited in their expressiveness and therefore require either repetitive boilerplate or awkward constructions. More expressive languages might have the opposite problem: there are too many ways to express the same concepts, which makes it difficult to understand code written in an unfamiliar way.
Scala, and in particular Scala 3, is a very good notation for writing lean code. It has a smallish, quiet syntax. It has a concise and expressive type system with good type inference that gives type safety and documentation without a lot of boilerplate. Its types and its immutable-first orientation promote software that is reliable and correct by design, and provide an expressive toolkit for clean modularization. Scala’s system of implicit parameters avoids repetitive parameter passing and gives the right foundations for modeling context in all forms, including type classes, configurations, or capabilities. In my courses in EPFL’s Scala specialization on Coursera I present many examples of lean Scala code.
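As a minimal sketch of what modeling context can look like, the following Scala 3 snippet passes a configuration and a type class instance through `using` clauses and `given` definitions. The names `Config`, `Show`, `render`, `showAll`, and `contextDemo` are hypothetical and exist only for this illustration.

```scala
// Hypothetical configuration type, used only for illustration.
final case class Config(indent: Int)

// The `using` clause lets callers omit the argument when a given Config is in scope.
def render(lines: List[String])(using cfg: Config): String =
  lines.map(line => " " * cfg.indent + line).mkString("\n")

// A small type class: one abstract method, so a lambda can implement it.
trait Show[A]:
  def show(a: A): String

given Show[Int] = a => s"#$a"

// The context bound `A: Show` requires a given Show[A] at the call site.
def showAll[A: Show](xs: List[A]): List[String] =
  xs.map(summon[Show[A]].show)

// A given configuration that applies wherever it is in scope.
given Config = Config(indent = 2)

@main def contextDemo(): Unit =
  // Neither the Show[Int] instance nor the Config is passed explicitly.
  println(render(showAll(List(1, 2))))
```

The call site mentions only the data. The compiler supplies both the configuration and the type class instance from the givens in scope.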
But Scala also has its set of challenges when it comes to writing lean code. The language is both very flexible and very expressive. Its flexibility allows codebases to be written in many different styles. Its expressiveness can encourage over-abstraction. This means that trying to find your way in an unfamiliar Scala codebase is sometimes difficult. Some of the stylistic variations are unavoidable if a language is used in many different application domains. Some of them are for historical reasons, since old software tends to look different from new. But there is also the real problem of variations that arise because people prefer different styles. That is of course perfectly fine as far as the individual is concerned, but it can be a problem for teams, and overall the community would be better off if there were more common defaults.
I believe we can and should do something to address these challenges. For a start, we can identify a recommended style to write lean Scala code. I tried to come up with a list of properties of such a style and arrived at the following.
- It should follow the Principle of Least Power: Given a choice of solutions, pick the least powerful solution capable of solving your problem. This also extends to language constructs. For instance, implicit conversions or macros are very powerful constructs. One should use them only if simpler alternatives are not sufficient.
- It should be immutable first, without being dogmatic about it. That is, we generally prefer immutable data and pure functions, but also allow effects if they are well contained and described. In the near future we will probably have new type constructs that can help here.
- It should focus on Scala 3 constructs and syntax. To ease migration, Scala 3 still supports many of Scala 2’s idioms. We are in the process of deprecating those bit by bit, but the process is slow and painful. Lean Scala should jump ahead and drop all Scala 2 legacy constructs.
- It should promote the core language over embedded DSLs (domain-specific languages). Scala has a rich ecosystem of libraries and frameworks that essentially implement DSLs: fluent syntax in ScalaTest, hardware design in Chisel, query languages, and many others. These have their place, but they also contribute to language fragmentation. So we need to distinguish Lean Scala from “Scala as a host language for DSLs”.
- It should focus on direct style. Monadic effect systems shine in some areas but they are also a kind of DSL, which creates specialized ecosystems and dialects.
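To make some of these properties concrete, here is a small sketch, under stated assumptions, of code written with Scala 3 constructs, immutable data, and a preference for the least powerful construct. All names (`Level`, `Entry`, `asEntry`, `summarize`, `leanDemo`) are hypothetical and chosen only for illustration. It is one possible reading of the list above, not a definition of the style.

```scala
// Scala 3 enum instead of the Scala 2 encoding with a sealed trait and case objects.
enum Level:
  case Info, Warn, Error

// Immutable data: fields cannot be reassigned; changes go through `copy`.
final case class Entry(level: Level, text: String)

// Least power: an extension method adds a single operation to String,
// where an implicit conversion would change types behind the reader's back.
extension (s: String)
  def asEntry(level: Level): Entry = Entry(level, s.trim)

// Immutable first, without dogma: the builder is mutated locally,
// but the mutation cannot be observed outside this function.
def summarize(entries: List[Entry]): String =
  val sb = new StringBuilder
  for e <- entries if e.level != Level.Info do
    sb.append(s"[${e.level}] ${e.text}\n")
  sb.toString

@main def leanDemo(): Unit =
  val entries = List(
    " disk full ".asEntry(Level.Error),
    "started".asEntry(Level.Info)
  )
  // Direct style: plain function calls, no effect wrapper around the result.
  print(summarize(entries))
```

The mutable `StringBuilder` is a deliberate choice here: it is an effect, but one that is contained in a single function whose signature stays pure.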
No doubt that list needs to be discussed and refined. Once we have identified our goals for lean code, the next question is how to promote it. Here are some of the things we can do:
- Write technical documentation such as blog posts discussing what lean Scala style is, and how to get there.
- Work on tooling support. For instance, develop tool tips or linting rules that promote lean Scala. Where possible this should be supported by automatic rewriting from legacy code. Rewriting is particularly helpful for code that is generated by LLMs. The training set bias means that such code is more likely to use old Scala 2 patterns. Once automatic rewrites are established and more Lean Scala code exists, we can assume that LLMs will generate it by themselves when prompted. But initially they will have to be augmented by our own tooling.
- Encourage efforts to assemble and promote library stacks that are powerful and easy to use and that support lean coding styles.
Clearly, Lean Scala is just one way to write code in the language. There are situations and whole application areas where you want to deviate from the style and develop your own set of principles and techniques. But Lean Scala is a useful default, for instance for teams that don’t have special requirements, for people new to the language, or for educators who want to teach simple and modern Scala. For that reason, I hope that defining and promoting it will be useful.