This article provides instructions for customers hosted on GCP or AWS to configure Glean to use GPT models directly through their own OpenAI account for billing and capacity management.
Do not use this document if you are leveraging the Glean Key option. For the Glean Key option, Glean manages the configuration and provisioning of LLM resources transparently.

Enable access to models

Request access to the following models from the OpenAI Library:
  • GPT-5: Agentic reasoning model used in Fast and Thinking modes in Chat. This is the primary model for Glean Assistant.
  • GPT-4.1 (legacy) or GPT-4o (gpt-4o-2024-05-13) (legacy): Large model used for other, more complex tasks in Glean Assistant.
  • GPT-4.1-mini (recommended) or GPT-4o-mini: Small model used for simpler tasks such as follow-up question generation.
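Once access is granted, you can confirm that your OpenAI organization can see the required models with the OpenAI Python SDK. A minimal sketch (the `missing_models` helper is our own; the model ids match the list above):

```python
# Sketch: check that the models Glean needs are visible to your OpenAI org.
# The helper is pure logic so it can be tested without network access.

REQUIRED_MODELS = ["gpt-5", "gpt-4.1-mini"]  # add "gpt-4.1" or "gpt-4o" if selected

def missing_models(available_ids, required=REQUIRED_MODELS):
    """Return the required model ids absent from the org's model list."""
    available = set(available_ids)
    return [m for m in required if m not in available]

# Live usage (requires the `openai` package and OPENAI_API_KEY):
#   from openai import OpenAI
#   ids = [m.id for m in OpenAI().models.list()]
#   print(missing_models(ids) or "all required models available")
```

If any id comes back missing, request access for it in the OpenAI Library before continuing.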

Capacity for OpenAI Models

We strongly recommend using priority processing from OpenAI: it provides faster, more consistent performance while retaining the flexibility of a pay-as-you-go model (see OpenAI's priority processing FAQ). Check the rate and usage limits for your organization under Settings → Organization → Limits, and ensure you have at least the capacity listed below for the number of users in your organization. See OpenAI's documentation on usage tiers for details.

Capacity Requirements for the latest assistant architecture on Agentic Engine 2 using GPT-5

Users     High capacity model       Low capacity model
          TPM          RPM          TPM         RPM
500       125,000      10           5,000       5
1,000     250,000      15           5,000       5
2,500     625,000      35           10,000      10
5,000     1,245,000    65           15,000      15
10,000    2,490,000    130          30,000      30
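The table above can be expressed as a simple lookup: given a user count, pick the smallest tier that covers it. A sketch (the `minimum_capacity` helper and its behavior above 10,000 users are our own assumptions, not a Glean API):

```python
# Sketch of the capacity table as a tier lookup.
# Each row: (users, high-cap TPM, high-cap RPM, low-cap TPM, low-cap RPM).
CAPACITY_TIERS = [
    (500,    125_000,   10,  5_000,  5),
    (1_000,  250_000,   15,  5_000,  5),
    (2_500,  625_000,   35, 10_000, 10),
    (5_000,  1_245_000, 65, 15_000, 15),
    (10_000, 2_490_000, 130, 30_000, 30),
]

def minimum_capacity(users):
    """Return the minimum TPM/RPM to provision for a given user count."""
    for max_users, hi_tpm, hi_rpm, lo_tpm, lo_rpm in CAPACITY_TIERS:
        if users <= max_users:
            return {"high": {"tpm": hi_tpm, "rpm": hi_rpm},
                    "low": {"tpm": lo_tpm, "rpm": lo_rpm}}
    raise ValueError("user count exceeds the published table; size beyond 10,000 users separately")
```

For example, an organization of 800 users falls into the 1,000-user tier and should provision at least 250,000 TPM / 15 RPM for the high-capacity model.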

Select the model in Glean Workspace

  1. Go to Admin Console → Platform → LLM
  2. Click on Add LLM
  3. Select OpenAI
  4. Select:
    • GPT-5 for the agentic engine model
    • GPT-4.1 (recommended) or GPT-4o for the large model
    • GPT-4.1-mini (recommended) or GPT-4o-mini for the small model
  5. Click Validate to confirm that Glean can access the model
  6. Once validated, click Save

Verify the model used by Glean Assistant

  1. Go to Glean Assistant and select the Public Knowledge Assistant.
  2. Ask the question: Who created you?
You should get a response similar to: "I was created by OpenAI."
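The same check can be run against your OpenAI account directly with the Python SDK, factored so the client can be swapped out for testing (this assumes the Chat Completions API and the GPT-5 model selected above):

```python
# Sketch: send the verification prompt and return the reply text.
def ask_creator(client, model="gpt-5"):
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Who created you?"}],
    )
    return resp.choices[0].message.content

# Live usage (requires the `openai` package and OPENAI_API_KEY):
#   from openai import OpenAI
#   print(ask_creator(OpenAI()))  # expect a mention of OpenAI
```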

FAQ

All data is encrypted in transit between your Glean instance and your OpenAI service. Please review the Data controls in the OpenAI Platform guide. You can request Zero Data Retention and opt out of modified abuse monitoring so that your prompts and generated content are not stored on OpenAI servers or subject to human review by OpenAI employees. Note that modified abuse monitoring is required for some OpenAI features, such as data analysis.
The number of tokens we use will vary depending on the type of request (e.g. summarizing a long document will use many tokens). For requests that are retrieving an answer from the Glean search engine, the current token usage is:
  • Large Model: 19,000 input tokens + 450 output tokens
  • Small Model: 5,300 input tokens + 150 output tokens
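The per-request figures above can be turned into a rough usage estimate. A sketch, assuming one large-model call plus one small-model call per search-answer request (an illustrative simplification; real call patterns vary by feature):

```python
# Per-request token usage for search-answer requests, from the figures above.
LARGE = {"input": 19_000, "output": 450}
SMALL = {"input": 5_300, "output": 150}

def tokens_per_query():
    """Input/output tokens for one answer: one large + one small model call."""
    return {
        "input": LARGE["input"] + SMALL["input"],
        "output": LARGE["output"] + SMALL["output"],
    }

def monthly_tokens(queries_per_day, days=30):
    """Rough monthly token totals for capacity and budget planning."""
    per_query = tokens_per_query()
    return {k: v * queries_per_day * days for k, v in per_query.items()}
```

At 100 queries per day, for example, this works out to roughly 72.9M input tokens and 1.8M output tokens per month.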

Architecture Diagram

A system architecture diagram illustrating a user query being processed through a series of steps within the Customer Glean Project VPC. The process begins with a user question, which is then processed by Tool Selection & Query Planning, Glean Planner, Glean Index & Knowledge Graph, Query Execution, Governance Engine & Doc Redlisting, Intelligent Data Selector, and Answer Generation, ultimately providing an answer to the user. The diagram also shows the interaction with OpenAI.