Rama in five minutes (Clojure version)

As an engineer, you want to ship product features. However, backend engineering gets bogged down by everything around those features: adding caches, wiring background workers, adding queues, handling migrations, and coordinating deploys. The actual business logic is usually a small part of the work.

To make this concrete, imagine building a small todo app: users add todos, complete them, view their todo list, and see how many they’ve completed.

This post shows how this app is built on a traditional Postgres stack versus Rama. You’ll see how Rama eliminates all the infrastructure sprawl and glue code. If you’re using NoSQL databases or processing frameworks instead, you incur the same costs as the Postgres version, just wearing different clothes.

Building with a traditional Postgres stack

You start with two tables and an index on user_id.

CREATE TABLE todos (
  id SERIAL PRIMARY KEY,
  user_id BIGINT NOT NULL,
  text TEXT NOT NULL,
  completed_at TIMESTAMPTZ
);

CREATE INDEX ON todos(user_id);

CREATE TABLE todo_stats (
  user_id BIGINT PRIMARY KEY,
  completed_count BIGINT NOT NULL,
  total_count BIGINT NOT NULL
);

As traffic grows, reads become the bottleneck. To reduce load you add Memcached for the todo list and stats.
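On the read path this is the standard cache-aside pattern. Roughly, in Clojure (a sketch; cache-get, cache-set!, and query-todos are hypothetical helpers standing in for a memcached client and a Postgres query):

;; Cache-aside read: try Memcached first, fall back to Postgres and
;; populate the cache on a miss. cache-get, cache-set!, and query-todos
;; are hypothetical helpers for a memcached client and a SQL query.
(defn fetch-todos [user-id]
  (let [cache-key (str "todos:" user-id)]
    (or (cache-get cache-key)
        (let [todos (query-todos user-id)]
          (cache-set! cache-key todos 300) ; expire after 300 seconds
          todos))))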

As traffic increases further, writes become the bottleneck. The todo_stats writes don’t need to be synchronous, so you reduce load by moving those writes to a background worker fed by Kafka. The web server updates todos and appends to Kafka, and the worker reads from Kafka in batches to update todo_stats and Memcached efficiently.
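The write path is now split in two. Roughly (a sketch assuming next.jdbc for the SQL side; kafka-append! and cache-delete! are hypothetical helpers wrapping a Kafka producer and the cache client):

(require '[next.jdbc :as jdbc])

;; Web server: the todo update stays synchronous, while the stats
;; update is deferred by appending an event to Kafka.
(defn complete-todo! [user-id todo-id now]
  (jdbc/execute! ds ["UPDATE todos SET completed_at = ? WHERE id = ?" now todo-id])
  (kafka-append! "todo-events" {:type :complete :user-id user-id}))

;; Background worker: fold a batch of events into one aggregated
;; UPDATE per user, then invalidate that user's cached stats.
(defn process-batch! [events]
  (doseq [[user-id n] (frequencies (map :user-id events))]
    (jdbc/execute! ds
      ["UPDATE todo_stats SET completed_count = completed_count + ? WHERE user_id = ?"
       n user-id])
    (cache-delete! (str "stats:" user-id))))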

This system now has a web server, Postgres, Memcached, Kafka, and a worker.

Each has its own scaling and deployment procedures, all for a tiny, medium-scale application. At high scale, a complete rearchitecture would be needed to make everything horizontally scalable, adding even more systems into the mix.

Adding one small feature

Now your product manager asks you to let users reorder their todos.

A new column sort_key needs to be added to the todos table to create a fractional index, like so:

ALTER TABLE todos
  ADD COLUMN sort_key TEXT;

This column must be backfilled. A simple approach is to assign evenly spaced keys from the todo ID:

UPDATE todos
SET sort_key = to_char(id * 1000, 'FM999999999999')
WHERE sort_key IS NULL;

However, rewriting millions of rows in a single transaction would lock them for the duration, creating downtime that could last hours. Instead, the backfill has to run while the system is live, advancing through the table incrementally, which requires a custom background script.
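Such a script might look roughly like this (a sketch, assuming next.jdbc and a datasource ds; the batch size and pacing are illustrative):

(require '[next.jdbc :as jdbc])

;; Backfill sort_key in small batches so no single transaction touches
;; millions of rows. Loops until there is nothing left to update.
(loop []
  (let [{:next.jdbc/keys [update-count]}
        (jdbc/execute-one! ds
          ["UPDATE todos
            SET sort_key = to_char(id * 1000, 'FM999999999999')
            WHERE id IN (SELECT id FROM todos
                         WHERE sort_key IS NULL
                         LIMIT 1000)"])]
    (when (pos? update-count)
      (Thread/sleep 100) ; pace the backfill to limit load on Postgres
      (recur))))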

Once backfill completes, you can enforce the invariant and create the index:

ALTER TABLE todos
  ALTER COLUMN sort_key SET NOT NULL;

CREATE INDEX CONCURRENTLY ON todos (user_id, sort_key);

The rollout of the new feature must happen in this order:

  1. Add the sort_key column to Postgres
  2. Update the web server to initialize sort_key for new todos but otherwise ignore the column
  3. Run the backfill script
  4. Update the Postgres schema to enforce the invariant on sort_key and create the index
  5. Update the web server with the reordering feature added (including a new query to sort todos by sort_key)

This is all standard practice, but it’s significant engineering time spent coordinating schema changes, background jobs, and application code.

How you build it with Rama

Rama gives you a foundation that eliminates this glue code, infrastructure sprawl, and one-off scripts.

In the Postgres-based stack, you can’t send every write through a queue because processing that queue is asynchronous. Some operations, like adding a todo, must be visible immediately on the next read, not some indeterminate time later. So some writes go directly to the database while others go through a queue. The result is a split system where some paths are synchronous and others are asynchronous.

In Rama, every write goes through a queue called a “depot”. The key difference is that a depot append can support synchronous or asynchronous use cases. When appending you can choose to wait for only the log write to finish, or you can also wait for downstream processing to complete. This makes the depot-first model suitable for both interactive UI paths and background work, and this unified flow is the backbone of Rama’s architecture.
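Concretely, the choice is made per append with Rama’s Clojure client (a sketch; todo-depot is a client handle to the application’s depot, and the ack-level keyword is an assumption to verify against the Rama docs):

;; Wait for the append to be durably logged AND for downstream
;; processing to finish before returning (interactive UI path):
(foreign-append! todo-depot {:user-id "alice" :text "Buy milk"})

;; Wait only for the durable log write, not downstream processing
;; (background path). :append-ack is the assumed ack-level keyword.
(foreign-append! todo-depot {:user-id "alice" :text "Buy milk"} :append-ack)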

Besides depots, a Rama application (called a “module”) also includes business logic (called “topologies”) and storage (called “PStates”). Rama applications follow this flow: events enter a depot, your logic processes them, and any number of PStates are updated.

Every aspect of Rama is horizontally scalable. “PState” stands for “partitioned state” and is like a database (e.g. durable on disk), but much more flexible.

Explaining Rama’s full API would take more space than this post allows, so we’ll instead describe the implementation in broad strokes. To dive deeper, check out this blog post series, which contains line-by-line tutorials applying Rama to a wide variety of use cases.

Let’s start by building the “todos + completed_count” features, and the next section will add the ability to reorder lists. You can see the full code for this module here.

One depot is needed:

(declare-depot setup *todo-depot (hash-by :user-id))

Next is a topology that will have the business logic:

(let [s (stream-topology topologies "todos")]

This topology will have two PStates to store todo lists and completion stats:

(declare-pstate
  s
  $$todos
  {String [(fixed-keys-schema
             {:todo String
              :completed-at Long})]})
(declare-pstate s $$completed-stats {String Long})

Unlike databases, which have fixed data models, a PState can be defined as any compound data structure.
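For example, a module could declare an additional PState with nested, subindexed structures (a hypothetical sketch; map-schema, set-schema, and the :subindex? option are my reading of Rama’s schema helpers, so check the docs for the exact names):

;; Hypothetical PState mapping each user to a map from tag name to the
;; set of todo indexes carrying that tag. Subindexing (assumed option
;; name :subindex?) lets a nested collection grow arbitrarily large.
(declare-pstate
  s
  $$todo-tags
  {String (map-schema String
                      (set-schema Long {:subindex? true}))})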

Next are the event types that will be appended to the depot:

(defrecord NewTodo [user-id text])
(defrecord CompleteTodo [user-id index time-millis])

Next is the business logic. Rama’s API is more expressive than a traditional database API, so the code below will look unfamiliar if you haven’t seen Rama before. You don’t need to follow every detail. What matters is the shape: events enter the depot, the topology handles them, and the PStates are updated.

(<<sources s
  (source> *todo-depot :> *data)
  (<<subsource *data
    (case> NewTodo :> {:keys [*user-id *text]})
    (local-transform> [(keypath *user-id) NIL->VECTOR AFTER-ELEM (termval {:todo *text})]
                      $$todos)

    (case> CompleteTodo :> {:keys [*user-id *index *time-millis]})
    (local-transform> [(must *user-id *index) :completed-at (termval *time-millis)]
                      $$todos)
    (local-transform> [(keypath *user-id) (nil->val 0) (term inc)] $$completed-stats)
    ))

This specifies handlers for events of type NewTodo and CompleteTodo. A NewTodo appends to the user’s list of todos. A CompleteTodo sets :completed-at on that todo and increments the user’s counter in the $$completed-stats PState.

The web server does depot appends and PState queries using Rama’s client API, shown in a unit test for this module.
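In broad strokes, the client side looks something like this (a sketch; the cluster config and the TodosModule name are placeholders, and the module namespace is assumed to be required for the record constructors; see the linked unit test for the real setup):

(require '[com.rpl.rama :refer :all]
         '[com.rpl.rama.path :refer :all])

;; Connect to the cluster and fetch client handles to the module's
;; depot and PStates.
(def manager (open-cluster-manager {"conductor.host" "1.2.3.4"}))
(def todo-depot (foreign-depot manager (get-module-name TodosModule) "*todo-depot"))
(def todos      (foreign-pstate manager (get-module-name TodosModule) "$$todos"))
(def stats      (foreign-pstate manager (get-module-name TodosModule) "$$completed-stats"))

;; Writes are depot appends, reads are PState queries with paths.
(foreign-append! todo-depot (->NewTodo "alice" "Write blog post"))
(foreign-select-one (keypath "alice") todos) ; => the user's todo list
(foreign-select-one (keypath "alice") stats) ; => completed count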

That’s the entire backend, all specified in one 30 LOC namespace. It has great performance and scales horizontally. There are no separate caches or background workers; the backend is just the web server talking to this one Rama module.

This is ACID-compliant. Deploying, updating, and scaling this module are each one-line CLI commands. Rama replicates all state for fault tolerance, and it provides extensive monitoring of every aspect of module execution through the Rama web UI.

Adding list reordering to the Rama version

Now let’s add the same “reorder todos” feature. The full code for this updated module is at this link. First, define a new event type for the reorder action:

(defrecord ReorderTodo [user-id from-index to-index])

Then add an additional handler for this event:

(case> ReorderTodo :> {:keys [*user-id *from-index *to-index]})
(local-transform> [(must *user-id)
                   (selected? (view count) (pred> *to-index))
                   (index-nav *from-index)
                   (termval *to-index)]
                  $$todos)

That’s the entire implementation. There’s no new column, no migration, no backfill script, no index to build, and no multi-step deploy sequence. You simply extend the module definition and update the module with a one-line CLI command.

The Postgres version needed the sort_key column because the todo list was not stored as an actual list; it was simulated as one in a SQL table. In Rama, the todo list is stored directly as a list inside a PState, so reordering is just modifying that list in place. Letting you store your domain model directly in PStates spares you a ton of the complexity that comes with fitting your domain into a database’s fixed data model, and this is just a small example of that. For the same reason, you never need anything like an ORM when using Rama.

If you do need to change your PState schema, Rama has great support for that. Most migrations, including ones needing backfill, can be performed instantly even if the PState has terabytes of data.
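As a rough illustration of the shape, a schema change with a backfill is expressed by annotating the PState schema (hedged: the migrated helper and its argument order are my recollection of Rama’s migration API and may differ; consult the Rama docs for the real signature):

;; Hypothetical migration: each per-todo map gains a :priority field,
;; backfilled to 0 for existing todos. The migration is applied as data
;; is accessed rather than by rewriting the whole PState up front,
;; which is why it appears instant even with a backfill.
(declare-pstate
  s
  $$todos
  {String [(migrated
             (fixed-keys-schema {:todo String
                                 :completed-at Long
                                 :priority Long})
             "add-priority-field"
             (fn [todo-map] (assoc todo-map :priority 0)))]})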

Conclusion

The gap between the traditional approach and Rama only grows with more features or higher scale. As an application grows, traditional stacks accumulate operational work: schema changes, backfills, and deploy choreography. Adding databases or other infrastructure is common. The complexity of both development and operations keeps multiplying.

Rama avoids that complexity and sprawl entirely. Rama applications require little code because they’re almost all business logic, and deployment and scaling are built-in. The same codebase will carry you all the way to large scale, so there’s no hidden rewrite waiting for you when usage grows.

Most importantly, Rama is general purpose. For an example of this, we re-implemented the entirety of Mastodon to be Twitter-scale in only 10k lines of code. The project implements an extremely diverse feature set: timelines, social graph, search, recommendations, trends, scheduled posts, and much more.

Rama also gives you huge flexibility because depots capture every change. You can replay them later to build new derived views, and you get a complete audit log for debugging if something goes wrong.

The more your product grows, the more time and money Rama saves. It keeps the backend focused on business logic rather than the scaffolding around it.

3 thoughts on “Rama in five minutes (Clojure version)”

  1. Thanks a lot for sharing. I understand the pain points Rama is solving in comparison to the classic CRUD-based Postgres example. But despite being a die-hard fan of Clojure(Script), Datomic, event sourcing, and everything in general that could make software systems more comprehensible, there are still too many “blub paradoxes” for me to understand Rama. I really want Clojure-based projects like Rama to succeed, but I guess you’re shrinking your total addressable market to under 0.01% of the programmers/companies who are able to understand Rama and apply it to their situation. I fully get that 5 minutes are not enough to explain something that is not familiar to most of the audience. I guess most people are not even familiar with event sourcing, which Rama is a variant of (if I understand it correctly). Maybe some very basic Clojure example using maps for the events and clojure.core/reduce to calculate some (light) PState in memory would help. How do PStates compare to anything more mainstream, like btrees? Do PStates spill to disk, or are they in-memory only?

  2. PStates are durable on disk just like databases, using LSM trees under the hood. Rama does have a learning curve, but I’ve found it only takes one to two weeks for a programmer to get the hang of the basics and reach a point of reasonable productivity. You don’t need to learn all of Rama to get value out of it. With paths, for instance, you can accomplish most things with keypath, pred, and view.

    Dataflow looks different but is not as hard to learn as you may think. This post explains dataflow in terms of equivalent Clojure code: https://blog.redplanetlabs.com/2024/10/10/rama-on-clojures-terms-and-the-magic-of-continuation-passing-style/

    The referenced blog post series contains line-by-line tutorials on applying Rama to a wide variety of use cases: https://blog.redplanetlabs.com/next-level-backends-with-rama/

    rama-demo-gallery contains more heavily commented examples of using Rama: https://github.com/redplanetlabs/rama-demo-gallery

    The REST API module is the simplest, as it just does an HTTP request and then records the result in a K/V PState.
