Adaptive Reasoning
We’re introducing a new default mode for Glean: Adaptive Reasoning. Adaptive mode automatically adjusts how much reasoning Glean applies to each question, so users get the right balance of speed and intelligence. We recommend adaptive mode for all users. You can switch reasoning modes at any time in Glean.

Agentic search model
Adaptive reasoning is driven by Glean's agentic search model, a retrieval-optimized model that runs automatically before the frontier model on eligible queries. It quickly gathers the most relevant information from your organization, passes that evidence to the frontier model, and helps the final answer be faster, better grounded, and more efficient.
In practice, Glean finds the right context first, then answers using that context.
How adaptive reasoning works
- A user asks a question in Glean.
- The agentic search model determines whether the question would benefit from a retrieval plan. If so, it issues targeted searches in parallel across your organization's content using a controlled set of tools.
- The pre-collected evidence is passed to the frontier model (GPT, Claude, or Gemini), which reviews the initial search results, does its own reasoning, and produces a grounded, cited answer.
The agentic search model plans retrieval: it breaks down the question, chooses the right search tools to call, and decides when it has sufficient information to answer the question. It never generates user-visible text. The frontier model always runs afterward and is responsible for the final response.
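The steps above can be sketched in a few lines of Python. This is only an illustration of the two-stage shape (plan and retrieve first, then answer), not Glean's implementation: every name here (`plan_searches`, `run_search`, `frontier_answer`, `Evidence`) is hypothetical, and the planning heuristic is a toy stand-in for the model's decision.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Evidence:
    query: str
    snippets: list[str] = field(default_factory=list)


def plan_searches(question: str) -> list[str]:
    """Hypothetical planner: return targeted sub-queries, or [] to skip retrieval."""
    # Toy heuristic standing in for the agentic search model's decision.
    if len(question.split()) < 3:
        return []
    return [f"{question} policy", f"{question} recent updates"]


def run_search(sub_query: str) -> Evidence:
    """Hypothetical search tool; a real one would query indexed content."""
    return Evidence(query=sub_query, snippets=[f"result for: {sub_query}"])


def frontier_answer(question: str, evidence: list[Evidence]) -> str:
    """Stand-in for the frontier model, which always writes the final response."""
    sources = sum(len(e.snippets) for e in evidence)
    return f"Answer to {question!r} grounded in {sources} snippet(s)"


def answer(question: str) -> str:
    sub_queries = plan_searches(question)
    evidence: list[Evidence] = []
    if sub_queries:
        # Issue the targeted searches in parallel.
        with ThreadPoolExecutor(max_workers=len(sub_queries)) as pool:
            evidence = list(pool.map(run_search, sub_queries))
    # The planner never produces user-visible text; the frontier model does.
    return frontier_answer(question, evidence)
```

Note that `answer` calls `frontier_answer` on every path, including when the planner returns no sub-queries, which mirrors the statement that the frontier model always runs afterward.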
Benefits
- Faster answers: By front-loading retrieval, Glean spends less time searching and more time reasoning, delivering noticeably faster responses. In our testing, time to first token dropped by roughly 50%.
- Lower LLM costs at scale: The agentic search model reduces frontier model token consumption by handling retrieval planning more efficiently.
Performance
The agentic search model delivers measurable latency improvements with no regression in answer quality:
| Metric | Change |
|---|---|
| P25 Time to First Token | -51.0% |
| P50 Time to First Token | -51.9% |
| P75 Time to First Token | -45.8% |
| Answer quality and satisfaction | No change (neutral) |
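To make the percentages concrete, the short sketch below applies the reported relative changes to a baseline time to first token. The 4.0-second baseline is a made-up number chosen only for illustration; it is not a measured Glean figure, and the function name is hypothetical.

```python
# Relative changes in time to first token, taken from the table above.
TTFT_CHANGE = {"P25": -0.510, "P50": -0.519, "P75": -0.458}


def projected_ttft(baseline_seconds: float, percentile: str) -> float:
    """Apply the reported relative change to a baseline time to first token."""
    return baseline_seconds * (1 + TTFT_CHANGE[percentile])


# Hypothetical baseline of 4.0 s at P50, chosen only for illustration.
print(f"{projected_ttft(4.0, 'P50'):.2f} s")
```

With that assumed baseline, a -51.9% change at P50 works out to about 1.92 seconds.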
Who is affected by adaptive reasoning
Adaptive reasoning mode runs automatically for organizations that meet all of the following criteria:
| Criterion | Description |
|---|---|
| Glean-managed infrastructure | Your deployment is hosted by Glean on GCP |
| Glean Universal Model Key | You use Glean's LLM key, not a customer-managed key |
Organizations on AWS or Azure, organizations using their own LLM keys, and customers in the EU are not affected by this change.
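The eligibility rules in this section can be read as a single boolean check. The sketch below is an assumption-laden illustration: the `Deployment` type and its field names are hypothetical, and representing the EU exclusion as a `region` field is a simplification made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    # All field names are hypothetical, for illustration only.
    hosted_by_glean: bool
    cloud: str  # "gcp", "aws", or "azure"
    uses_glean_llm_key: bool
    region: str  # for example "us" or "eu" (assumed encoding)


def adaptive_reasoning_applies(d: Deployment) -> bool:
    """Return True only when every eligibility rule described above holds."""
    return (
        d.hosted_by_glean
        and d.cloud == "gcp"
        and d.uses_glean_llm_key
        and d.region != "eu"
    )
```

For example, `adaptive_reasoning_applies(Deployment(True, "gcp", True, "us"))` is `True`, while changing the cloud to `"aws"`, switching to a customer-managed key, or setting the region to `"eu"` each makes it `False`.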
Configuration
No configuration is required from admins or end users. The agentic search model runs automatically on eligible queries behind the scenes.
If a query falls outside the agentic search model's scope, the model immediately hands the query off to the frontier model with no degradation in quality.
Data flow
With the agentic search model enabled, the query processing flow for eligible organizations is:
- User query is sent to your Glean deployment.
- Your Glean deployment calls the agentic search model (hosted on Glean-managed infrastructure on Google Vertex AI), executes retrieval tool calls, and collects relevant context.
- The agentic search model's response goes back to your Glean deployment with the retrieved tool calls and context.
- Your Glean deployment calls the frontier model. The frontier model does the rest of the reasoning for the query using the pre-collected context and generates the final response.

Model information
The agentic search model is built on NVIDIA's Nemotron-3 Nano (30B-A3B) model, fine-tuned by Glean using reinforcement learning. It's hosted entirely on Glean-managed infrastructure on Vertex AI in the United States and isn't served through a third-party model provider endpoint.
Customer data isn't used to train this model. The model was trained on Glean's own internal dataset of enterprise information-seeking queries.
Security and data handling
| Property | Detail |
|---|---|
| No data persistence | Query and response content is not logged in the agentic search model serving path. |
| No model training on customer data | The model was trained exclusively on Glean's internal data. Your organization's data is never used to train or fine-tune the model. |
| Stateless information flows | Each call to the agentic search model runs independently of any other calls, reducing complexity and increasing data isolation assurance. |
| Glean-hosted and controlled | The model runs on Glean's GCP infrastructure, not a third-party endpoint. |
| US-based infrastructure | Model inference runs in Google Cloud (US region). |
| Permission enforcement | The model only retrieves content the requesting user is already authorized to access. |
| Admin controls unchanged | All existing Glean policies, including document restrictions, folder exclusions, and data source restrictions, continue to apply. |
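As a rough illustration of what permission enforcement means for retrieval, the sketch below filters search results by the requesting user before anything is returned. It is a conceptual example only and does not describe how Glean stores or checks permissions; `Document`, `allowed_users`, and `search_as_user` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_users: frozenset[str]  # hypothetical per-document access list


def search_as_user(user: str, query: str, index: list[Document]) -> list[str]:
    """Return IDs of matching documents that the user is authorized to access."""
    needle = query.lower()
    return [
        doc.doc_id
        for doc in index
        # The permission check runs for every candidate, so unauthorized
        # content never enters the evidence passed to later stages.
        if user in doc.allowed_users and needle in doc.text.lower()
    ]
```

In this sketch, two users issuing the same query can receive different results, because each result set is limited to what that user could already open.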
Updated terms
Glean's AI Terms Addendum has been updated to reflect that the Service supports Glean-hosted model deployments in addition to direct third-party LLM providers. You can review the updated terms at glean.com/legal.