Adaptive Reasoning

We’re introducing a new default mode for Glean: Adaptive Reasoning. Adaptive mode automatically adjusts how much reasoning Glean applies to each question, so users get the right balance of speed and intelligence. We recommend adaptive mode for all users. You can switch reasoning modes at any time in Glean.

Agentic search model

Adaptive reasoning is driven by Glean's agentic search model, a retrieval-optimized model that runs automatically before the frontier model on eligible queries. It quickly gathers the most relevant information from your organization, passes that evidence to the frontier model, and helps the final answer be faster, better grounded, and more efficient.

In practice, Glean finds the right context first, then answers with that context second.

How adaptive reasoning works

  1. A user asks a question in Glean.
  2. The agentic search model determines whether the question would benefit from a retrieval plan. If so, it issues targeted searches in parallel across your organization's content using a controlled set of tools.
  3. The pre-collected evidence is passed to the frontier model (GPT, Claude, or Gemini), which reasons over the retrieved evidence, does any further reasoning the question needs, and produces a grounded, cited answer.

The agentic search model plans retrieval: it breaks down the question, chooses which search tools to call, and decides when it has sufficient information to answer the question. It never generates user-visible text. The frontier model always runs afterward and is responsible for the final response.
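The steps above can be sketched roughly as follows. This is a minimal illustration, not Glean's implementation: `plan_retrieval`, `run_search`, `frontier_answer`, and the rule for deciding whether a question needs a retrieval plan are all hypothetical stand-ins.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class RetrievalPlan:
    # Hypothetical structure: the sub-queries the planner decided to run.
    searches: list[str] = field(default_factory=list)


def plan_retrieval(question: str) -> RetrievalPlan:
    """Stand-in for the agentic search model's planning step.

    Illustrative rule only: very short questions get no plan; longer
    questions are split on " and " into separate sub-queries.
    """
    if len(question.split()) < 4:
        return RetrievalPlan()  # no plan: hand off directly
    parts = [p.strip(" ?") for p in question.split(" and ")]
    return RetrievalPlan(searches=[p for p in parts if p])


async def run_search(query: str) -> str:
    """Stand-in for one retrieval tool call."""
    await asyncio.sleep(0)  # placeholder for network I/O
    return f"evidence for: {query}"


async def frontier_answer(question: str, evidence: list[str]) -> str:
    """Stand-in for the frontier model, which always writes the final text."""
    await asyncio.sleep(0)
    return f"answer to {question!r} grounded in {len(evidence)} evidence item(s)"


async def answer(question: str) -> str:
    plan = plan_retrieval(question)
    evidence: list[str] = []
    if plan.searches:
        # Targeted searches are issued in parallel, not one after another.
        evidence = list(await asyncio.gather(*(run_search(q) for q in plan.searches)))
    # The planner never produces user-visible text; the frontier model does.
    return await frontier_answer(question, evidence)
```

Calling `asyncio.run(answer("What is our PTO policy and who approves requests?"))` runs two searches concurrently before the final call, while a short question such as `"hi"` skips retrieval and goes straight to the final call.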

Benefits

  • Faster answers: By front-loading retrieval, Glean spends less time searching and more time reasoning, delivering noticeably faster responses. In our testing, time to first token dropped by roughly 46% to 52% between the 25th and 75th percentiles.
  • Lower LLM costs at scale: The agentic search model reduces frontier model token consumption by handling retrieval planning more efficiently.

Performance

The agentic search model delivers measurable latency improvements with no regression in answer quality:

Metric                             Improvement
P25 Time to First Token            -51.0%
P50 Time to First Token            -51.9%
P75 Time to First Token            -45.8%
Answer quality and satisfaction    No change (neutral)
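One common way to arrive at figures like these is to compare the same percentile of time to first token before and after a change. The sketch below shows that arithmetic only; the function name and the sample values are made up for illustration and are not Glean's measurements or methodology.

```python
from statistics import quantiles


def percentile_change(baseline: list[float], treatment: list[float]) -> dict[str, float]:
    """Percent change in P25/P50/P75 between two latency samples (seconds)."""
    # quantiles(n=4) returns the three quartile cut points: P25, P50, P75.
    base_q = quantiles(baseline, n=4, method="inclusive")
    treat_q = quantiles(treatment, n=4, method="inclusive")
    labels = ("P25", "P50", "P75")
    return {
        label: round(100.0 * (t - b) / b, 1)
        for label, b, t in zip(labels, base_q, treat_q)
    }


# Made-up samples: every treatment latency is half the baseline latency.
baseline_ttft = [2.0, 4.0, 6.0, 8.0, 10.0]
treatment_ttft = [1.0, 2.0, 3.0, 4.0, 5.0]
print(percentile_change(baseline_ttft, treatment_ttft))
# {'P25': -50.0, 'P50': -50.0, 'P75': -50.0}
```

A negative value means the percentile got smaller, that is, responses started sooner.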

Who is affected by adaptive reasoning

Adaptive reasoning mode runs automatically for organizations that meet all of the following criteria:

Criteria                        Description
Glean-managed infrastructure    Your deployment is hosted by Glean on GCP
Glean Universal Model Key       You use Glean's LLM key, not a customer-managed key

Organizations on AWS or Azure, or organizations using their own LLM keys, are not affected by this change. Customers from the EU are not affected by this change.
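The eligibility rules in this section can be read as a single boolean check. The sketch below is only an illustration of that logic: the `Deployment` type and all of its field names are hypothetical and do not correspond to any Glean API or setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    # All field names are hypothetical, chosen to mirror the criteria above.
    cloud: str                  # "gcp", "aws", or "azure"
    glean_hosted: bool          # deployment is hosted by Glean
    uses_glean_model_key: bool  # Glean's LLM key, not a customer-managed key
    region: str                 # e.g. "us" or "eu"


def adaptive_reasoning_applies(d: Deployment) -> bool:
    """True only when every stated condition holds."""
    return (
        d.glean_hosted
        and d.cloud == "gcp"
        and d.uses_glean_model_key
        and d.region != "eu"  # EU customers are not affected
    )
```

For example, `adaptive_reasoning_applies(Deployment("gcp", True, True, "us"))` is `True`, while changing the cloud to `"aws"`, the key flag to `False`, or the region to `"eu"` makes it `False`.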

Configuration

No configuration is required from admins or end users. The agentic search model runs automatically on eligible queries behind the scenes.

If a query falls outside the agentic search model's scope, the model immediately hands off to the frontier model with no degradation in quality.

Data flow

With the agentic search model enabled, the query processing flow for eligible organizations is:

  1. User query is sent to your Glean deployment.
  2. Your Glean deployment calls the agentic search model (hosted on Glean-managed infrastructure on Google Vertex AI), executes retrieval tool calls, and collects relevant context.
  3. The agentic search model's response goes back to your Glean deployment with the retrieved tool calls and context.
  4. Your Glean deployment calls the frontier model. The frontier model does the rest of the reasoning for the query using the pre-collected context and generates the final response.

Data flow diagram

Model information

The agentic search model is built on NVIDIA's Nemotron-3 Nano (30B-A3B) model, fine-tuned by Glean using reinforcement learning. It's hosted entirely on Glean-managed infrastructure on Vertex AI in the United States and isn't served through a third-party model provider endpoint.

Customer data isn't used to train this model. The model was trained on Glean's own internal dataset of enterprise information-seeking queries.

Security and data handling

Property                               Detail
No data persistence                    Query and response content is not logged in the agentic search model serving path.
No model training on customer data     The model was trained exclusively on Glean's internal data. Your organization's data is never used to train or fine-tune the model.
Stateless information flows            Each call to the agentic search model runs independently of any other calls, reducing complexity and increasing data isolation assurance.
Glean-hosted and controlled            The model runs on Glean's GCP infrastructure, not a third-party endpoint.
US-based infrastructure                Model inference runs in Google Cloud (US region).
Permission enforcement                 The model only retrieves content the requesting user is already authorized to access.
Admin controls unchanged               All existing Glean policies, including document restrictions, folder exclusions, and data source restrictions, continue to apply.
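As a rough mental model of the permission enforcement and admin control rows, retrieval can be pictured as a filter applied per requesting user. This is a simplified sketch under assumed names (`Document`, `visible_results`, `excluded_sources`); it does not describe how Glean actually stores or evaluates permissions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    # Hypothetical shape of a retrievable item.
    doc_id: str
    allowed_users: frozenset[str]  # users already authorized to read it
    source: str                    # data source the document came from


def visible_results(
    user: str,
    results: list[Document],
    excluded_sources: frozenset[str] = frozenset(),
) -> list[Document]:
    """Keep only documents the user may read and that admins have not excluded."""
    return [
        doc
        for doc in results
        if user in doc.allowed_users and doc.source not in excluded_sources
    ]
```

In this picture, a document the user cannot already open never reaches either model, and an admin exclusion on a data source removes its documents for everyone.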

Updated terms

Glean's AI Terms Addendum has been updated to reflect that the Service supports Glean-hosted model deployments in addition to direct third-party LLM providers. You can review the updated terms at glean.com/legal.