ReasoningBank: Enabling agents to learn from experience

April 21, 2026 admin

Distilling insights with ReasoningBank

ReasoningBank distills global reasoning patterns into high-level, structured memories. Each structured memory item contains the following:

Title: A concise identifier summarizing the core strategy.
Description: A brief summary of the memory item.
Content: The distilled reasoning steps, decision rationales, or operational insights extracted from past experiences.
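
As a rough illustration, the three fields above map naturally onto a small record type. The sketch below is an assumption for illustration, not the authors' implementation: the class name `MemoryItem`, the helper `render_for_context`, and the text layout used when placing memories into an agent's context are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryItem:
    """One memory entry; the three fields mirror the list above (class name is hypothetical)."""
    title: str        # concise identifier summarizing the core strategy
    description: str  # brief summary of the memory item
    content: str      # distilled reasoning steps, rationales, or insights


def render_for_context(items: list[MemoryItem]) -> str:
    """Format memories as plain text for an agent's prompt (this layout is an assumption)."""
    blocks = []
    for i, item in enumerate(items, start=1):
        blocks.append(
            f"Memory {i}: {item.title}\n"
            f"Summary: {item.description}\n"
            f"Insight: {item.content}"
        )
    return "\n\n".join(blocks)


example = MemoryItem(
    title="Verify page before loading more",
    description="Avoid infinite scroll traps on paginated listings.",
    content="Check the current page identifier first, then attempt to load more results.",
)
print(render_for_context([example]))
```

Keeping the title and description short makes them cheap to scan or match against at retrieval time, while the longer content field only needs to be read once an item has been selected.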

The memory workflow operates in a continuous closed loop of retrieval, extraction, and consolidation. Before acting, the agent retrieves relevant memories from ReasoningBank into its context. It then interacts with the environment and uses an LLM-as-a-judge to self-assess the resulting trajectory, extracting success insights or failure reflections. Notably, this self-judgment does not need to be perfectly accurate, as we find ReasoningBank to be quite robust to judgment noise. During extraction, the agent distills workflows and generalizable insights from the trajectory into new memories. For simplicity, we directly append these to ReasoningBank, leaving more sophisticated consolidation strategies for future work.
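
The loop described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the published system: the word-overlap retriever stands in for whatever retrieval method is actually used, and `act`, `judge`, and `extract` are hypothetical callables that would wrap the agent and LLM calls in practice. The stub functions at the bottom exist only to make the example runnable.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Memory:
    title: str
    description: str
    content: str


def retrieve(bank: list[Memory], task: str, k: int = 2) -> list[Memory]:
    """Toy retrieval by word overlap (an assumption; a real retriever would differ)."""
    words = set(task.lower().split())

    def score(m: Memory) -> int:
        return len(words & set(f"{m.title} {m.description}".lower().split()))

    ranked = sorted(bank, key=score, reverse=True)
    return [m for m in ranked[:k] if score(m) > 0]


def run_task(
    bank: list[Memory],
    task: str,
    act: Callable[[str, list[Memory]], str],
    judge: Callable[[str, str], bool],
    extract: Callable[[str, str, bool], list[Memory]],
) -> bool:
    memories = retrieve(bank, task)                 # 1. retrieval into context
    trajectory = act(task, memories)                # 2. interact with the environment
    success = judge(task, trajectory)               # 3. self-assessment (may be noisy)
    new_items = extract(task, trajectory, success)  # 4. distill success or failure lessons
    bank.extend(new_items)                          # 5. consolidation by simple append
    return success


# Stubs standing in for the agent and LLM calls (purely illustrative).
def fake_act(task: str, memories: list[Memory]) -> str:
    return f"task={task}; used {len(memories)} memories; clicked Load More repeatedly"


def fake_judge(task: str, trajectory: str) -> bool:
    return "repeatedly" not in trajectory


def fake_extract(task: str, trajectory: str, success: bool) -> list[Memory]:
    kind = "Success insight" if success else "Failure reflection"
    return [Memory(f"{kind}: {task}", f"Lesson from a past attempt at: {task}", trajectory)]


bank: list[Memory] = []
first = run_task(bank, "list all products", fake_act, fake_judge, fake_extract)
second = run_task(bank, "list all products", fake_act, fake_judge, fake_extract)
```

On the second call, the memory written by the first call is retrieved and passed to the agent, which is the point of the closed loop: each task can draw on lessons distilled from earlier ones, whether those runs were judged successes or failures.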

Crucially, unlike existing workflow memory strategies that focus only on successful runs, ReasoningBank actively analyzes failed experiences to extract counterfactual signals and pitfalls. By distilling these mistakes into preventative lessons, ReasoningBank builds strategic guardrails. For example, instead of merely learning a procedural rule like “click the ‘Load More’ button”, the agent might learn from a past failure to “always verify the current page identifier first to avoid infinite scroll traps before attempting to load more results”.
