Memory in Glean agents lets your agent keep track of what’s happened so far as it runs through its workflow. Think of memory as a “notebook” that your agent uses to remember inputs, outputs, and important information from each step.
As your agent follows its instructions, it saves the details and results from every step in its memory. This means later steps can use what you or previous steps have already provided or discovered, without asking for it again.
A token is the basic unit of text that the LLM processes. It’s roughly analogous to a word or sub‑word fragment (for example, “un”, “break”, and “able” might each be separate tokens).
When you send a prompt or receive a response, the text is first converted into a sequence of tokens, which the model then analyzes and uses to generate a response.
Tokens determine how much “space” your input and output occupy within the agent’s memory. For performance reasons, Glean agents are limited to 128,000 tokens, or roughly 96,000 words.
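The arithmetic behind that limit can be sketched in a few lines. This is a rough estimate only: it assumes the ratio of about 0.75 words per token implied by the figures above, and the helper names `estimate_tokens` and `remaining_budget` are hypothetical, not part of Glean. Real tokenizers will give different counts.

```python
import math

# Figures taken from this article: 128,000 tokens is roughly 96,000 words.
TOKEN_LIMIT = 128_000
WORDS_PER_TOKEN = 96_000 / 128_000  # 0.75, a rule of thumb rather than an exact value


def estimate_tokens(text: str) -> int:
    """Estimate a token count from a whitespace word count (heuristic only)."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


def remaining_budget(texts: list[str]) -> int:
    """Estimate how many tokens are left after storing the given texts."""
    used = sum(estimate_tokens(t) for t in texts)
    return TOKEN_LIMIT - used
```

An estimate like this is useful for judging whether a long input is likely to crowd out later steps, not for exact accounting.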
When your agent runs, each step stores its outputs and any other relevant data in memory automatically.
Any later step in the workflow can look back at this memory to use information from previous steps. For example, a name collected in an early step can be used to send an email in a later step.
If you include a Sub-agent as a step, its “Respond” step outputs are saved to memory as well, so you can use the results from the Sub-agent just like results from normal steps.
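One way to picture this behavior is as a simple key-value notebook. The sketch below is a mental model only, not Glean's implementation: the `AgentMemory` class, the step names, and `run_sub_agent` are all hypothetical. It shows a later step reusing a name collected earlier, and a Sub-agent handing back only the output of its “Respond” step.

```python
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Hypothetical notebook that maps a step name to that step's output."""
    entries: dict = field(default_factory=dict)

    def save(self, step: str, output: str) -> None:
        self.entries[step] = output

    def recall(self, step: str) -> str:
        return self.entries[step]


def run_sub_agent(task: str) -> str:
    """Stand-in for a Sub-agent: it keeps its own notes and returns only its Respond output."""
    sub_memory = AgentMemory()
    sub_memory.save("lookup", f"notes for {task}")
    sub_memory.save("respond", f"Summary of {task}")
    return sub_memory.recall("respond")


memory = AgentMemory()
memory.save("collect_name", "Priya")                  # early step
memory.save("sub_agent", run_sub_agent("Q3 report"))  # Sub-agent used as a step

# A later step reads from memory instead of asking again.
email = f"Hi {memory.recall('collect_name')}, here is the result: {memory.recall('sub_agent')}"
```

In this model the parent agent sees the Sub-agent's result like any other step output, while the Sub-agent's intermediate notes stay in its own memory.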
How Search Results and Documents Are Added to Memory
By default, “Company search” steps retrieve “snippets” of documents, meaning they pick the most relevant parts of documents and only add those to the agent’s memory.
Company search can also be configured to read entire documents. This consumes more tokens, which can cause the agent to hit the token limit if it runs many searches.
Read document steps always read the entire contents of a document into the memory. For very long documents, this may use a lot of tokens.
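To see why the choice between snippets and whole documents matters, a back-of-the-envelope calculation helps. The per-result token counts and the number of results per search below are made-up illustrative values, not figures from Glean, and `searches_until_limit` is a hypothetical helper. Only the 128,000-token limit comes from this article.

```python
TOKEN_LIMIT = 128_000  # limit stated in this article


def searches_until_limit(tokens_per_result: int, results_per_search: int,
                         limit: int = TOKEN_LIMIT) -> int:
    """How many searches fit in the budget if every result costs the same number of tokens."""
    return limit // (tokens_per_result * results_per_search)


# Illustrative assumptions: a snippet costs about 200 tokens, a whole document about 8,000,
# and each search returns 10 results.
snippet_searches = searches_until_limit(tokens_per_result=200, results_per_search=10)
full_doc_searches = searches_until_limit(tokens_per_result=8_000, results_per_search=10)
```

Under these assumed numbers, dozens of snippet-based searches fit in memory, while a single full-document search already uses most of the budget.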
To help prevent exceeding the token limit, agents will read only a portion of documents when the token limit is being approached. In these cases, you may see a warning similar to the image below:
Memory works automatically within Glean agents; you don’t have to set it up or manage it separately.
Every main agent and Sub-agent has access to the memory it needs, and returns its results to the parent agent’s memory at the end of its work.
The main agent can always use everything stored to make decisions or complete its tasks.
If you have turned off conversation history, Glean will still retain memory for up to 2 hours. This memory is only available for follow-up actions in the agent.
You can control how each step in your agent uses memory. By managing memory on a per-action basis, you decide how much information from earlier steps should be available to the AI when it carries out a specific action. This can help your agent focus only on what’s relevant, especially in complex workflows or when you want to reduce the amount of data sent for each step.
In the agent builder, select the action you want to manage.
Open the menu in the upper-right corner of the action, then select Advanced settings.
Under the Manage memory section, select your desired memory configuration.
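Conceptually, managing memory per action amounts to choosing which earlier step outputs an action is allowed to see. The sketch below illustrates that idea only. This article does not list the actual options under Manage memory, so the `visible_steps` parameter and the `memory_for_action` function are hypothetical.

```python
from typing import Optional


def memory_for_action(memory: dict, visible_steps: Optional[list] = None) -> dict:
    """Return the slice of memory an action may see.

    Hypothetical model: None means the action sees everything; otherwise it
    sees only the named steps that exist in memory.
    """
    if visible_steps is None:
        return dict(memory)
    return {step: memory[step] for step in visible_steps if step in memory}


memory = {"search": "long search results", "collect_name": "Priya"}

# An email-drafting action that only needs the recipient's name.
scoped = memory_for_action(memory, visible_steps=["collect_name"])
```

Narrowing what an action sees keeps it focused on relevant information and reduces the data sent for that step.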