We’ve just open-sourced Agent-o-rama, a library for building scalable and stateful LLM agents on the JVM. Agent-o-rama provides two first-class APIs, one for Java and one for Clojure, with feature parity between them.
AI tooling today is overwhelmingly centered on Python, and while the JVM ecosystem has seen growing support through libraries like LangChain4j, it lacks the kind of integrated tooling that lets developers evaluate, observe, and deploy LLM-based systems rigorously and at scale. Available tools are fragmented or complex to set up, and nothing handles the entire workflow from development to production with proper observability.
Agent-o-rama fills that gap. It brings the same ideas popularized by LangGraph and LangSmith – structured agent graphs, tracing, datasets, experiments, evaluation – but makes them native to Java and Clojure. LLMs are powerful but inherently unpredictable, so building LLM applications that are helpful, performant, and hallucinate minimally requires rigorous testing and monitoring.
Agents are defined as simple graphs of Java or Clojure functions that execute in parallel. Agent-o-rama automatically captures detailed traces and includes a web UI for offline experimentation, online evaluation, and time-series telemetry (e.g. model latency, token usage, database latency). It also supports streaming, with a simple client API to stream model calls or other outputs from nodes in real time. Agent-o-rama extends the ideas from LangGraph and LangSmith with far greater scalability, full parallel execution, and built-in high-performance data storage and deployment.
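To make the graph-of-functions idea concrete, here’s a minimal sketch of a single node in Java, modeled on the node API used in the research agent example later in this post. The surrounding agent declaration is elided, and the node names and the fetchText helper are illustrative, not part of Agent-o-rama:

// Sketch only: a node named "fetch-page" that emits to a downstream
// node "summarize". The lambda receives the AgentNode handle plus
// whatever arguments were emitted to this node.
.node("fetch-page", "summarize",
      (AgentNode agentNode, String url) -> {
        String text = fetchText(url); // hypothetical helper
        // Pass one named argument, "text", to the "summarize" node
        agentNode.emit("summarize", "text", text);
      })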
Agent-o-rama is deployed onto your own infrastructure on a Rama cluster. Rama is free to use for clusters up to two nodes and can scale to thousands with a commercial license. Every part of Agent-o-rama is built-in and requires no other dependency besides Rama. Agent-o-rama also integrates seamlessly with any other tool, such as databases, vector stores, external APIs, or anything else. Unlike hosted observability tools, all data and traces stay within your infrastructure.
Example agent
Let’s take a look at an example agent! This is a research agent from the examples/ directory in the project. In that directory you’ll find equivalent Java and Clojure versions.
You’ll need Java 21 installed and API keys for OpenAI and Tavily (Tavily’s free tier is sufficient). Put the API keys in environment variables like so:
export OPENAI_API_KEY=your_openai_key_here
export TAVILY_API_KEY=your_tavily_key_here
To run the agent, clone Agent-o-rama and follow these instructions (for Java or Clojure, whichever you prefer):
# Java instructions
cd examples/java
./run-example com.rpl.agent.research.ResearchAgentExample

# Clojure instructions
cd examples/clj
lein repl
(require '[com.rpl.agent.research-agent :as research-agent])
(research-agent/run-agent)
This runs Rama’s “in-process cluster” (IPC) and launches the research agent on it. You’ll get a prompt at the terminal to enter a research topic. The agent will generate a set of analyst personas to analyze the topic, and you’ll be prompted again whether you want to give feedback on the generated analysts. Once you tell the agent you have no more feedback, it will spend a few minutes generating the report, including using information it finds through web searches and through Wikipedia, and then the final report will be printed.
As the report is being generated or when it’s finished, you can open the Agent-o-rama UI at http://localhost:1974.
Here’s an example back and forth:
If you click on the research agent in the UI, you’ll see this:

The invoke there is what we just ran. Clicking on it brings up the trace for the invoke:

This displays the parallel execution of the agent, with orange nodes being aggregations of data computed on multiple branches. On the right are aggregated statistics of everything that happened during the agent’s execution. You can see how many tokens it used, and if it did any database reads/writes you’d see stats about those too. If the agent invokes other agents, you can see a breakdown of stats by agent as well.
Clicking on the “write-report” node brings up a detailed trace of what happened when that node executed:

This node did one LLM call, and you can see the arguments to that call, what was returned, and stats on the call in the “Operations” section. The code for this node (the model and message types are LangChain4j classes) is just this:
.node("write-report", "finish-report",
      (AgentNode agentNode, String sections, String topic) -> {
        ChatModel openai = agentNode.getAgentObject("openai");
        String instructions = String.format(REPORT_WRITER_INSTRUCTIONS, topic, sections);
        List<ChatMessage> chatMessages = Arrays.asList(
          new SystemMessage(instructions),
          new UserMessage("Write a report based upon these memos."));
        String report = openai.chat(chatMessages).aiMessage().text();
        agentNode.emit("finish-report", "report", report);
      })
(aor/node
 "write-report"
 "finish-report"
 (fn [agent-node sections topic]
   (let [openai (aor/get-agent-object agent-node "openai")
         instr (report-writer-instructions topic sections)
         text (chat-and-get-text
                openai
                [(SystemMessage. instr)
                 (UserMessage. "Write a report based upon these memos.")])]
     (aor/emit! agent-node "finish-report" "report" text))))
This code says that the node’s name is “write-report”, the node emits to the node “finish-report”, and the node’s implementation is the given function. The agentNode / agent-node argument is how you interact with the graph to return a result, emit to other nodes, or get agent objects like models, database connections, or anything else. When you emit to other nodes, you simply say what node you want to emit to and what arguments to pass to that node. Agent nodes run on virtual threads, so they can be efficiently written in a blocking style like this.
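Because nodes run on virtual threads, blocking I/O inside a node is cheap, and fanning out work is just multiple emits. Here’s a hedged sketch in the same style: the node names and the searchTavily helper are made up for illustration, and it assumes, as the parallel trace above suggests, that each emit starts an independent execution of the downstream node:

.node("gather-sources", "write-section",
      (AgentNode agentNode, String query) -> {
        // A blocking call is fine here: the node runs on a virtual
        // thread, which parks cheaply while waiting on I/O.
        List<String> results = searchTavily(query); // hypothetical helper
        for (String result : results) {
          // One emit per result; the downstream "write-section"
          // executions proceed as parallel branches of the agent graph
          agentNode.emit("write-section", "source", result);
        }
      })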
That’s most of what’s involved in programming agents with Agent-o-rama! There’s a bit more to learn about aggregation and how to declare agent objects, all of which is documented in the programming agents guide. The rest of using Agent-o-rama is creating and managing datasets, running experiments, setting up online evaluation and other actions on production runs, and analyzing agent telemetry.
Also, you can see from this code and the trace that model calls are automatically traced – this node didn’t have to record any tracing info explicitly. You can also include your own info in traces with a simple API (see this Javadoc and this Clojuredoc).
Let’s take a look at running this on a real cluster! You can quickly set up a cluster locally by following these instructions:
- Download the latest Rama release from here.
- Unpack the release somewhere.
- Run: ./rama devZookeeper &
- Run: ./rama conductor &
- Run: ./rama supervisor &
- Visit http://localhost:8888. When the page loads, the cluster is ready.
- Download the latest Agent-o-rama release from here.
- Unpack it somewhere.
- Run: ./aor --rama /path/to/rama-root &
Next, to deploy you first need to build a jar. Here’s how to build either the Java or Clojure version from the Agent-o-rama project:
# For Java version
cd examples/java
mvn clean package -Dmaven.test.skip=true

# For Clojure version
cd examples/clj
lein uberjar
The Java version will build target/java-examples-with-dependencies.jar, and the Clojure version will build target/agent-o-rama-examples-1.0.0-SNAPSHOT-standalone.jar.
Next, deploy the module by running the command for the version you built:
# Deploy the module (Java uberjar)
./rama deploy \
  --action launch \
  --jar /path/to/java-examples-with-dependencies.jar \
  --module com.rpl.agent.research.ResearchAgentModule \
  --tasks 4 \
  --threads 2 \
  --workers 1

# Deploy the module (Clojure uberjar)
./rama deploy \
  --action launch \
  --jar /path/to/agent-o-rama-examples-1.0.0-SNAPSHOT-standalone.jar \
  --module com.rpl.agent.research-agent/ResearchAgentModule \
  --tasks 4 \
  --threads 2 \
  --workers 1
Now it’s up and running! You can view the agent in the UI at http://localhost:1974 and play with it. From the agent screen you can invoke the agent with the arguments ["", {"topic": "your topic here"}]. On the trace, you’ll be able to see any human input prompts the agent makes and respond to them there.
Rama handles all of storage, deployment, and scaling. There are no other dependencies needed to run this. Setting up a production cluster is only slightly more work, and there are one-click deploys for AWS and for Azure.
Resources
Check out these resources to learn more or get involved:
- Tour of Agent-o-rama
- Quickstart
- Full documentation
- Javadoc
- Clojuredoc
- Mailing list
- #rama channel on Clojurians
Conclusion
Agent-o-rama lets developers gain the benefits of Rama without needing to learn it. Rama’s distributed programming model is powerful but has a learning curve: it introduces a rich dataflow API and uses compound data structures for indexing instead of fixed data models. Agent-o-rama abstracts away those concepts into a familiar API so developers can take advantage of Rama’s strengths for the specific domain of building LLM agents.
For those who want to learn how to program Rama directly, Agent-o-rama also serves as a great example of Rama in practice. The backend is about 15K lines of code and the frontend about 11K, yet together they form a complete, end-to-end distributed system with a diverse feature set. Along with our Twitter-scale Mastodon implementation, it shows the breadth of what can be built with Rama.
We’d love to hear what you build with Agent-o-rama. Join the rama-user mailing list or the #rama channel on the Clojurians Slack to ask questions, share feedback, or discuss ideas with others using Agent-o-rama.
If you’d like to talk directly with us about Agent-o-rama, whether to exchange ideas, get technical guidance, or explore working together on building an LLM agent, you can book a call with us.
