Overview
Glean can use web search in world knowledge mode to access real-time web information alongside company data. This feature is available in Glean Assistant and Agents.
Web Search in Assistant
In world knowledge mode, Glean either triggers a web search or relies on LLM data, depending on the user query. Users also see a “globe” button in the chat box that turns on web search for every message, so they can decide when a web search should be used for their request.
Examples of Requests that Would Trigger a Web Search
- What was Glean’s most recent funding round?
- What is the latest AI news?
- What are the 2025 trends for the financial services sector?
- Provide an overview of the most recent earnings report.
Web Search in Agents
Agent creators will be able to use a web search action for an agent step. This will enable the creation of agents that combine internal company knowledge with real-time web searches.
How to Enable Web Search
To enable web search, Admins need to set up one or more web search actions as follows:
- Under Admin Console > Platform > Actions, select one of the following options to add an available web search action:
  - OpenAI actions
    - Select an instance name, e.g. OpenAI Web Search.
    - Choose how you wish to use this provider, i.e. using Glean’s key or your own key. If using your own key, configure the Organization ID and OpenAI API Key that should be used with this action.
    - Publish the action to be used in chat and/or agents.
    - Save the action.
  - Brave actions
    - Select an instance name, e.g. Brave Web Search.
    - Publish the action to be used in chat and/or agents.
    - Save the action.
  - Google Gemini actions
    - Select an instance name, e.g. Google Gemini Web Search.
    - Select a Gemini Search tool.
      - Google consumer search: Uses public web results. Select this option if your application needs more up-to-date results for fast-changing topics.
      - Google enterprise search: Designed for regulated industries. It provides zero data retention and privacy- and compliance-focused results.
    - Publish the action to be used in Glean Assistant and/or Agents.
    - Save the action.
  - Microsoft Bing actions
    - Select an instance name, e.g. Microsoft Bing Web Search.
    - Select an authentication method. Customers have the option of using Glean’s Bing key or their own Bing API key. If you already have your own Bing API key and prefer to use it, you can do so in Glean. We recommend that customers use the S1 pricing tier of the Bing API. Customers who use their own Bing key may turn on analytics for that key and monitor web search usage on their own dashboard.
    - Publish the action to be used with Glean Assistant and/or Agents.
    - Save the action.
Note: Bing Search APIs will be retired from Azure Marketplace on 11 August 2025, and any existing instances of Bing Search will be retired. As a result, starting 17 July 2025, customers will not be able to add this action to Glean, and by 8 August 2025, all existing instances of this action will be migrated to Brave and OpenAI actions.
FAQ
What is the recommended web search provider for each cloud provider?
We recommend that you use OpenAI with any cloud provider and Gemini search within the Google Cloud family.
How should I compare the performance of Google enterprise search versus consumer search?
Google enterprise search is the only web provider currently supported by Glean with zero data retention, though it has a slightly longer content refresh delay, typically every few hours. If you are latency sensitive, for example running searches or agents that examine the most recent industry news, we recommend using Google consumer search.
What data is sent to a web search provider? What data is retained by a web search provider?
When a user submits a query, we construct a corresponding web search query and include the user’s work location to personalize the search results based on their location. Glean does not log any of the information sent to or received from the web search provider.
- For OpenAI: Zero Data Retention is turned on and data sent to the OpenAI API is not used to train or improve OpenAI models.
- For Brave: Zero Data Retention is turned on and data sent to the Brave API is not used to train or improve OpenAI models.
- For Google Gemini:
  - Data sent to Google will not be used to train or fine-tune any AI/ML models.
  - If the consumer search option is chosen, Google stores the web search query and the contextual information sent to it, i.e. the user’s work location, for thirty (30) days for (1) purposes of creating Grounded Results and Search Suggestions and (2) debugging and testing of systems that support Grounding with Google Search.
  - If the enterprise search option is chosen, Zero Data Retention is turned on and no data is stored. Google Terms can be found here.
- For Microsoft Bing: We have disabled Bing search analytics to further protect user privacy. Additionally, we set Bing’s safeSearch parameter to “strict” to filter out webpages with adult content (Reference Link). However, note that Bing retains user data per Microsoft’s Privacy Statement.
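To illustrate the safeSearch setting mentioned for Microsoft Bing, the sketch below builds a search request URL that always carries the strict filter. It is only an illustration: the endpoint is assumed to be the public Bing Web Search v7 endpoint, the helper name build_bing_search_url is hypothetical, and this page does not describe how Glean actually issues the request.

```python
from urllib.parse import urlencode

# Assumption: the public Bing Web Search v7 endpoint. This page does not say
# which endpoint Glean calls.
BING_ENDPOINT = "https://api.bing.microsoft.com/v7.0/search"


def build_bing_search_url(query: str) -> str:
    """Return a search URL that always requests strict adult-content filtering."""
    # "strict" is the safeSearch value quoted in the text above.
    params = {"q": query, "safeSearch": "strict"}
    return f"{BING_ENDPOINT}?{urlencode(params)}"


url = build_bing_search_url("latest AI news")
print(url)
```

Hard-coding the filter inside the helper, rather than accepting it as an argument, means no caller can accidentally send a request without it.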
How does Glean verify and protect against prompt injection attacks?
Only URLs entered by the user explicitly or returned by Bing are dereferenced. After parsing a URL, we perform an antivirus scan using ClamAV to confirm that the content is free from malware before using it for answer generation. Additionally, we leverage Google’s Web Risk Checker service to verify the safety of shared URLs.
For Google, we get the response from the Gemini API grounded with the web URLs. We don’t crawl the content of the URLs.
Similarly, for OpenAI, we get the response from the OpenAI API grounded with the web URLs. We don’t crawl the content of the URLs.
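The first rule in this answer, that only URLs explicitly entered by the user or returned by the search provider are dereferenced, can be sketched as an allowlist filter. This is a minimal illustration and not Glean’s implementation: the function names and the way URLs are pulled out of the message are assumptions, and the ClamAV and Web Risk checks are not modeled.

```python
# Hypothetical helper names; nothing here is taken from Glean's code.
PUNCTUATION = ".,;:!?()[]<>\"'"


def explicit_urls(message: str) -> set[str]:
    """Collect URLs the user typed, ignoring surrounding punctuation."""
    tokens = (token.strip(PUNCTUATION) for token in message.split())
    return {t for t in tokens if t.startswith(("http://", "https://"))}


def filter_dereferenceable(
    candidates: list[str], user_message: str, provider_urls: list[str]
) -> list[str]:
    """Keep only candidates the user typed or the search provider returned."""
    allowed = explicit_urls(user_message) | set(provider_urls)
    return [url for url in candidates if url in allowed]


candidates = [
    "https://example.com/report",        # typed by the user
    "https://news.example.org/a",        # returned by the search provider
    "https://evil.example.net/inject",   # found only inside fetched content
]
kept = filter_dereferenceable(
    candidates,
    user_message="Summarize https://example.com/report, please.",
    provider_urls=["https://news.example.org/a"],
)
print(kept)
```

A URL that appears only inside fetched page content is dropped, which is the property that limits how far injected instructions can steer further fetches.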
Does Glean web search access data behind paywalls?
Glean avoids all websites that are behind a paywall and whose robots.txt contains specific instructions telling crawlers not to fetch their data.
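As a rough illustration of how robots.txt instructions can be honored, the sketch below uses Python’s standard urllib.robotparser module on an inline robots.txt. The file contents, the example-bot user agent, and the URLs are made up for the example, and this page does not say how Glean evaluates robots.txt in practice.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site that keeps crawlers out of paid content.
ROBOTS_TXT = """\
User-agent: *
Disallow: /premium/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse in memory, no network request

# "example-bot" is a placeholder user agent, not Glean's real crawler name.
blocked = parser.can_fetch("example-bot", "https://news.example.com/premium/story")
allowed = parser.can_fetch("example-bot", "https://news.example.com/free/story")
print(blocked, allowed)
```

Here the paid path is refused and the free path is permitted, so a fetcher that checks can_fetch before every request never retrieves the disallowed content.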