Rama is a testament to the power of Clojure

It took more than ten years of full-time work for Rama to go from an idea to a production system. I shudder to think of how long it would have taken without Clojure.

Rama is a programming platform that integrates and generalizes backend development. Whereas previously backends were built with a hodgepodge of databases, application servers, queues, processing systems, deployment tools, monitoring systems, and more, Rama can build end-to-end backends at any scale on its own in a tiny fraction of the code. At its core is a new programming language implementing a new programming paradigm, at the same level as the “object-oriented”, “imperative”, “logic”, and “functional” paradigms. Rama’s Clojure API gives access to this new language directly, and Rama’s Java API is a thin wrapper around a subset of this language.

There’s a lot in Clojure’s design that’s been instrumental to developing Rama. Three things stand out in particular: its flexibility for defining abstractions, its emphasis on immutability, and its orientation around programming with plain data structures. Besides these being essential to maintaining simplicity in Rama’s implementation, Rama also embraces these principles in its approach to distributed programming and indexing.

Ability to do in libraries what requires language support in other languages

Rama’s language is Turing-complete and defined largely via Clojure macros. So it’s still Clojure, but its semantics are different in many fundamental ways. At its core, Rama generalizes the concept of a function into something called a “fragment”. Whereas a function works by taking in any number of input parameters and then returning a single value as the last thing it does, a fragment can output many times (called “emitting”), can output to multiple “output streams”, and can do more work between or after emitting. A function is just a special case of a fragment. Rama fragments compile to efficient bytecode, and fragments that happen to be functions execute just as efficiently as functions in Java or Clojure.
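To make the distinction concrete, here is a minimal sketch in plain Clojure. It is not Rama's API or implementation. It only models the idea of "emitting" by passing an explicit callback, and the names `doubled-each`, `fn->fragment`, and `collect-emits` are hypothetical, chosen for illustration. Multiple output streams are not modeled.

```clojure
;; A plain function: exactly one result, and returning is the last thing it does.
(defn double-it [x]
  (* 2 x))

;; A fragment-like operation, modeled here as a function that receives an
;; `emit` callback. It can emit any number of times and can keep working
;; between or after emits.
(defn doubled-each [emit xs]
  (doseq [x xs]
    (emit (* 2 x))))

;; A function is the special case that emits exactly once, as its final act.
(defn fn->fragment [f]
  (fn [emit & args]
    (emit (apply f args))))

;; Helper that runs a fragment-like operation and gathers everything it emits.
(defn collect-emits [fragment & args]
  (let [out (volatile! [])]
    (apply fragment (fn [v] (vswap! out conj v)) args)
    @out))

(collect-emits doubled-each [1 2 3])            ;=> [2 4 6]
(collect-emits (fn->fragment double-it) 21)     ;=> [42]
```

In this sketch the callback is passed by hand. In Rama, the dataflow language handles the equivalent plumbing.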

Even though Rama contains this new programming language implementing this new programming paradigm, it’s still Clojure. So it interoperates perfectly. Rama code can invoke Clojure code directly, and Clojure code can invoke Rama directly as well. There’s no friction between them. Rama itself is implemented in a mixture of regular Clojure code and Rama code.

Neither Rich Hickey nor John McCarthy ever envisioned this completely different programming paradigm being built within their abstractions, much less one that reformulates the basis of nearly every programming language (the function). They didn’t need to. Clojure and its Lisp predecessors are languages that put almost no limitations on your ability to form abstractions. With every other language you at least have to conform to its syntax and basic semantics, and you have limited ability to control what happens at compile-time versus runtime. Lisps give you great control over what happens at compile-time, which lets you do incredible things.
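As a tiny illustration of that compile-time control, consider the sketch below. It is far simpler than anything in Rama and is not taken from Rama's source. The macro `precompute` and the function `seconds-per-week` are hypothetical names. The macro runs ordinary Clojure code while the program is being compiled and splices the result into the program as a literal.

```clojure
;; Hypothetical macro, for illustration only: evaluates `form` once, at
;; macroexpansion (compile) time, and returns the resulting value as the
;; expansion.
(defmacro precompute [form]
  (eval form))

;; The arithmetic below happens when this function is compiled, not each
;; time it is called. The compiled body is just the constant.
(defn seconds-per-week []
  (precompute (* 7 24 60 60)))

(macroexpand '(precompute (* 7 24 60 60))) ;=> 604800
(seconds-per-week)                         ;=> 604800
```

The same mechanism, a library function that receives code as data and decides what code to produce, is what lets a library define new semantics without changes to the host language.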

Lisp programmers have struggled ever since it was invented to explain why this is so powerful and why it has a major impact on simplifying software development. So I won’t try to explain how powerful this is in general, and will instead focus on how instrumental it was for Rama. For the general case, I’ll point you to Paul Graham’s essay “Beating the Averages”, which first inspired me to learn Lisp back when I was in college. When I first read that essay I didn’t understand it completely, but I was particularly compelled by the lines “A big chunk of our code was doing things that are very hard to do in other languages. The resulting software did things our competitors’ software couldn’t do. Maybe there was some kind of connection.”

The new language at Rama’s core is an example of this. Other languages can only have multiple fundamentally different paradigms smoothly interoperating if designed and implemented at the language level. Otherwise, you have to resort to string manipulation (as is done with SQL), which is not smooth and creates a mess of complexity. No amount of abstraction can hide this complexity completely, and attempting to do so often creates new complexities (like ORMs).

With Clojure, you can do this at the library level. We required no special language support and built Rama on top of Clojure’s basic primitives.

Rama’s language is not our only example of mixing paradigms like this. Another example is Specter. Specter is a generically useful library for querying and manipulating data structures. It’s also a critical part of Rama’s API (since views in Rama, called PStates, are durable data structures of any composition), and it’s a critical part of Rama’s implementation. About 1% of the lines of code in Rama’s source and tests are Specter callsites.

You can define Specter’s abstractions in any language. What makes it special in Clojure is how performant it is. Queries and manipulations with Specter are faster than even hand-rolled Clojure code. The key to Specter’s performance is its inline caching and compilation system. Inline caching is a technique I’ve only seen used before at the language or VM level. It’s a critical part of how the JVM implements polymorphism, for example. Because of the flexibility of Clojure, and the ability to program what happens at compile-time for a Specter callsite, we’re able to utilize the technique at the library level. It’s all done completely behind the scenes, and users of Specter get an expressive and concise API that’s extremely fast.

Power of immutability and data structure orientation

Clojure is unique among Lisps in the degree to which it emphasizes immutability. Its core API is oriented to working with immutable data structures. Additionally, Clojure encourages representing program state with plain data structures and having an expressive API for working with those data structures. The quote “It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures.”, from Alan Perlis, is part of Clojure’s rationale.

These philosophies have had a major impact on Rama’s development, helping a tremendous amount in managing complexity within Rama’s implementation. The less state you have to think about in a section of code, the easier it is to reason about. When a project gets as big as Rama (190k lines of source, 220k lines of tests), with many layers of abstractions and innumerable subsystems, it’s impossible to keep even a fraction of the whole system “in your head” for reasoning. I frequently have to re-read sections of code to remind myself of the details of that particular subsystem. The dividends you get from lowering the complexity of the system, with immutability being a huge part of that, compound more and more the bigger the codebase gets.

Clojure doesn’t force immutability for every situation, which is also important. Rama tracks a lot of different kinds of state, and we find it much simpler in some cases to use mutability rather than work with state indirectly as you would through something like the State Monad in Haskell. There are also some algorithms that are much simpler to write when they use a volatile internally in the implementation. That said, the vast majority of code in Rama is written in an immutable style. When we use mutability it’s almost always isolated within a single thread. Rather than have concurrent mutability using something like an atom, we use a volatile and send events to its owning thread to interact with it.
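The pattern of a volatile owned by a single thread can be sketched roughly as follows. This is an assumption-laden illustration, not Rama's code. It uses a JDK single-thread executor as the "owning thread", and the names `start-owner`, `send-event!`, `ask`, and `stop-owner!` are hypothetical.

```clojure
(import '(java.util.concurrent Executors ExecutorService TimeUnit))

;; One worker thread exclusively owns the volatile. No other thread
;; reads or writes it directly.
(defn start-owner [initial]
  {:executor (Executors/newSingleThreadExecutor)
   :state    (volatile! initial)})

;; Queue an event. `f` runs later on the owning thread and receives the volatile.
(defn send-event! [{:keys [^ExecutorService executor state]} f]
  (.execute executor (fn [] (f state))))

;; Reading is also an event: the owning thread delivers the current value.
(defn ask [owner]
  (let [reply (promise)]
    (send-event! owner (fn [state] (deliver reply @state)))
    @reply))

(defn stop-owner! [{:keys [^ExecutorService executor]}]
  (.shutdown executor)
  (.awaitTermination executor 5 TimeUnit/SECONDS))

;; Usage: many threads may send events, but only the owner mutates.
(def owner (start-owner 0))
(dotimes [_ 1000]
  (send-event! owner (fn [state] (vswap! state inc))))
(def final-count (ask owner)) ; events run in order on one thread, so this is 1000
(stop-owner! owner)
```

Because every mutation happens on one thread in submission order, the code touching the volatile needs no compare-and-swap retries or locks, which is the simplification the paragraph above describes.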

Rama embraces and expands upon Clojure’s principles of immutability and orienting code around data structures. These principles are fundamental to Rama’s approach for expressing end-to-end backends. A lot of Rama programming revolves around materializing views (PStates), which are literally just data structures interacted with using the exact same Specter API as used to interact with in-memory data structures. This stands in stark contrast with databases, which have fixed data models and special APIs for interacting with them. Any database can be replicated in a PState in both expressivity and performance, since a data model is just a specific combination of data structures (e.g. key/value is a map, column-oriented is a map of sorted maps, document is a map of maps, etc.).
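The claim that a data model is just a combination of data structures can be illustrated with ordinary in-memory Clojure collections. The sketch below is not PState code and does not use Rama's or Specter's APIs. It only shows the shapes named above using `clojure.core`, with made-up keys and values.

```clojure
;; Key/value store: a map.
(def kv
  (assoc {} "user:1" "alice"))

;; Document store: a map of maps.
(def docs
  (assoc-in {} ["user:1" :profile :name] "Alice"))

;; Column-oriented store: a map of sorted maps (row key -> sorted columns).
(def columns
  {"row1" (sorted-map "c" 3 "a" 1 "b" 2)})

(get kv "user:1")                          ;=> "alice"
(get-in docs ["user:1" :profile :name])    ;=> "Alice"

;; Sorted inner maps make column range scans cheap.
(subseq (get columns "row1") >= "b")       ;=> (["b" 2] ["c" 3])
```

Each "data model" here differs only in which structures are nested inside which, so one general API for navigating nested data can serve all of them.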

Rama’s language extends Clojure’s immutable principles into writing distributed, fault-tolerant, and async code. There are a lot of similarities with Clojure, like anonymous operations with lexical closures, immutable local variables, and identical semantics when it comes to shadowing. Rama takes things a step further for distributed computation, doing things like scope analysis to determine which vars need to be transferred across network boundaries. Rama’s loops have similar syntax to Clojure’s and have the additional capability of being a distributed computation that hops around the cluster during loop iterations. With Rama this is all written linearly through the power of dataflow, with switching threads/nodes being an operation like anything else (called a “partitioner”).

Clojure’s principles are just sound ideas that really do make a huge impact on simplifying software development. These principles are even more relevant in distributed systems and databases, which historically have been overrun with complexity. That’s why these principles are so core to Rama and its implementation.

Conclusion

There’s a seeming contradiction here – if Clojure enables such productivity gains, then why is it still a niche language in the industry? Why aren’t those using Clojure crushing their competition so thoroughly that every programmer is now rushing to adopt Clojure to even the playing field?

I believe this is simply because Clojure does not address all aspects of software development. This is not a criticism of Clojure but a recognition of its scope. Things like durable data storage, deployment, monitoring, evolving an application’s state and logic over time, fault-tolerance, and scaling are huge costs of building end-to-end software. Oftentimes the principles of Clojure are corrupted when using a database, as the database forces you to orient your code around its data model and capabilities.

This is why we’re so excited about Rama and have worked so long on it, because Rama does address everything involved in building end-to-end backends, no matter the scale. Rama provides flexible data storage expressed in terms of data structures, has deployment and monitoring built-in, has first-class features for evolving an application and updating it, is completely fault-tolerant, and is inherently scalable. It does all this while maintaining Clojure’s great principles and functional programming roots.

If you’d like to discuss this on the Clojure Slack, we’re active in the #rama channel.
