Enable access to models
Request access to the following models from the OpenAI Library:

| Model name | How Glean uses the model |
|---|---|
| GPT-5 | Agentic Reasoning model used in Fast and Thinking Modes in Chat. This is the primary model for Glean Assistant. |
| GPT-4.1 (legacy) or GPT-4o (gpt-4o-2024-05-13) (legacy) | Large model used for other, more complex tasks in Glean Assistant |
| GPT-4.1-mini (recommended) or GPT-4o-mini | Small model used for simpler tasks such as follow-up question generation |
Capacity for OpenAI Models
We highly recommend that you use priority processing from OpenAI. Priority processing gives you faster, more consistent performance while keeping the flexibility of a pay-as-you-go model. See the OpenAI priority processing FAQ for details.

Check the OpenAI rate and usage limits for your organization under Settings → Organization → Limits. Ensure that you have at least the minimum capacity listed below for the number of users in your organization. See the OpenAI documentation on usage tiers for more detail.
Capacity Requirements for the latest assistant architecture on Agentic Engine 2 using GPT-5
| Users | High capacity model (TPM) | High capacity model (RPM) | Low capacity model (TPM) | Low capacity model (RPM) |
|---|---|---|---|---|
| 500 | 125000 | 10 | 5000 | 5 |
| 1000 | 250000 | 15 | 5000 | 5 |
| 2500 | 625000 | 35 | 10000 | 10 |
| 5000 | 1245000 | 65 | 15000 | 15 |
| 10000 | 2490000 | 130 | 30000 | 30 |
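
The table lookup can be sketched in Python as follows. This is a minimal illustration, not part of Glean or OpenAI tooling: the names `Capacity`, `TIERS`, and `minimum_capacity` are hypothetical, and the sketch assumes that a user count falling between two rows should be rounded up to the next listed row.

```python
from bisect import bisect_left
from typing import NamedTuple


class Capacity(NamedTuple):
    high_tpm: int  # tokens per minute, high capacity model
    high_rpm: int  # requests per minute, high capacity model
    low_tpm: int   # tokens per minute, low capacity model
    low_rpm: int   # requests per minute, low capacity model


# Rows copied from the capacity table, keyed by user count.
TIERS = [
    (500, Capacity(125_000, 10, 5_000, 5)),
    (1_000, Capacity(250_000, 15, 5_000, 5)),
    (2_500, Capacity(625_000, 35, 10_000, 10)),
    (5_000, Capacity(1_245_000, 65, 15_000, 15)),
    (10_000, Capacity(2_490_000, 130, 30_000, 30)),
]


def minimum_capacity(users: int) -> Capacity:
    """Return the row for the smallest listed tier that covers `users`.

    Assumption: counts between two rows round up to the next row.
    """
    if users < 1:
        raise ValueError("users must be positive")
    sizes = [size for size, _ in TIERS]
    index = bisect_left(sizes, users)
    if index == len(TIERS):
        raise ValueError("user count exceeds the largest tier in the table")
    return TIERS[index][1]
```

For example, `minimum_capacity(1200)` returns the 2,500-user row. Counts above 10,000 raise an error because the table does not cover them.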
Select the model in Glean Workspace
- Go to Admin Console → Platform → LLM
- Click on Add LLM
- Select OpenAI
- Select:
  - GPT-5 for the agentic engine model
  - GPT-4.1 (recommended) or GPT-4o for the large model
  - GPT-4.1-mini (recommended) or GPT-4o-mini for the small model
- Click Validate to ensure Glean can leverage the model
- Once validated, click Save
Verify the model used by Glean Assistant
- Go to Glean Assistant and select the Public Knowledge Assistant.
- Ask the question: Who created you?
- Confirm that the response is similar to: I was created by OpenAI
FAQ
How do you ensure data security and handle potentially harmful content?
All data is encrypted in transit between your Glean instance and your OpenAI service. Please review the Data controls in the OpenAI Platform guide. You can choose to request Zero Data Retention and opt out of modified abuse monitoring so that your prompts and generated content are not stored on OpenAI servers or subject to human review by OpenAI employees. Note that modified abuse monitoring is required for some OpenAI features, such as data analysis.
How can we estimate LLM costs?
The number of tokens we use will vary depending on the type of request (e.g. summarizing a long document will use many tokens). For requests that are retrieving an answer from the Glean search engine, the current token usage is:
- Large Model: 19,000 input tokens + 450 output tokens
- Small Model: 5,300 input tokens + 150 output tokens
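
A rough per-request estimate can be sketched from these figures. The sketch below is illustrative only: `Usage` and `request_cost` are hypothetical names, and the prices are parameters you supply from your own OpenAI pricing (in dollars per 1M tokens), since no prices are given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    input_tokens: int
    output_tokens: int


# Per-request token usage quoted above for search-backed answers.
LARGE = Usage(input_tokens=19_000, output_tokens=450)
SMALL = Usage(input_tokens=5_300, output_tokens=150)


def request_cost(usage: Usage, input_price: float, output_price: float) -> float:
    """Cost in dollars of one request.

    Prices are in dollars per 1M tokens and must be supplied by the
    caller; they are not taken from this document.
    """
    total = usage.input_tokens * input_price + usage.output_tokens * output_price
    return total / 1_000_000
```

Multiplying the result of `request_cost(LARGE, ...)` or `request_cost(SMALL, ...)` by your expected number of requests gives an estimate for that model. Actual usage varies with the type of request, so treat the result as a baseline rather than a bound.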