Enable access to foundation models in Bedrock
- Log in to the AWS Console with a user account that has permissions to subscribe to Bedrock models.
- Navigate to Amazon Bedrock → Model access.
- Choose the same region as your Glean AWS instance (or the nearest supported one).
Request access to the following models:
| Model name | How Glean uses the model |
|---|---|
| Claude Sonnet 4.5 (preferred model; `claude-sonnet-4-5-20250929`) | Agentic reasoning model used for the assistant and autonomous agents. This is the primary model for Glean Chat. |
| Claude 3.7 Sonnet | Large model used for other, more complex tasks in Glean Assistant. |
| Claude 3.5 Haiku | Small model used for simpler tasks such as follow-up question generation. |
If prompted for a use case for the models, you can state:
“Generate answers to questions about internal company documents.”
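After access is granted, you can optionally confirm from the AWS side which Anthropic models are offered in your chosen region. The boto3 sketch below is a minimal example; the region name is an assumption, so use the region of your Glean instance. Listing models only shows what the region offers; the invocation sketch later in this guide confirms that access has actually been granted.

```python
import boto3

# Assumes AWS credentials for the same account and region as your Glean
# instance; the region below is an example.
bedrock = boto3.client("bedrock", region_name="us-east-1")

# List the Anthropic foundation models offered in this region.
response = bedrock.list_foundation_models(byProvider="Anthropic")
for model in response["modelSummaries"]:
    print(model["modelId"], model.get("modelLifecycle", {}).get("status"))
```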
Ensure you have enough quota from Bedrock
For default quotas on these models with pay-as-you-go usage, please refer to the Amazon Bedrock quotas documentation. If you need more quota, you must contact your AWS account manager, as Bedrock does not currently offer a self-service method for increasing quota.
Capacity requirements
On average, Glean Assistant consumes the following per query with Claude Sonnet 4.5:
- Full input: 64.4k tokens
- Cached input: 10.3k tokens
- Output: 1.2k tokens
The following table shows the approximate Bedrock quota, in tokens per minute (TPM), required for Claude Sonnet 4.5 at different deployment sizes:

| Users | TPM |
|---|---|
| 500 | 125,000 |
| 1000 | 245,000 |
| 2500 | 615,000 |
| 5000 | 1,225,000 |
| 10000 | 2,450,000 |
| 20000 | 4,895,000 |
We strongly recommend using your deployment’s actual queries per minute (QPM) when estimating capacity, as QPM per daily active user (DAU) can vary significantly across customers; the sketch below shows the arithmetic.
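As a rough worked example, required TPM can be computed from your observed QPM and the average per-query token counts above. This is a minimal sketch that assumes both input and output tokens count toward the Bedrock TPM quota; adjust the figures to your own telemetry.

```python
# Rough TPM estimate for Claude Sonnet 4.5, assuming the average per-query
# token counts above and that input and output tokens both count toward
# the Bedrock TPM quota.

FULL_INPUT_TOKENS = 64_400  # average full input tokens per query
OUTPUT_TOKENS = 1_200       # average output tokens per query

def required_tpm(queries_per_minute: float) -> int:
    """Approximate tokens per minute needed for a given query rate."""
    return round(queries_per_minute * (FULL_INPUT_TOKENS + OUTPUT_TOKENS))

# Example: a deployment observing ~20 assistant queries per minute at peak.
print(required_tpm(20))  # 1312000
```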
Select the models in Glean
- Navigate to Admin Console → Platform → LLM.
- Click on Add LLM.
- Choose Bedrock.
- Select the models:
- Claude Sonnet 4.5 for the agentic reasoning model.
- Claude 3.7 Sonnet for the large model.
- Claude 3.5 Haiku for the small model.
- Click Validate to confirm that Glean can use the models (if validation fails, the AWS-side check sketched after this list can help isolate the problem).
- After validation, click Save.
- To use Claude Sonnet 4.5 with Glean Assistant, the agentic engine features must be enabled. Until then, the assistant will use the large and small models you have configured.
- Glean will automatically apply an IAM policy to grant its servers access to Bedrock, so no extra authentication is needed.
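If the Validate step fails, it can help to rule out a Bedrock-side issue by invoking one of the models directly with the AWS SDK. The sketch below uses the Converse API; the region and model ID are examples (some regions require a cross-region inference profile ID instead), so use the identifiers shown in your Bedrock console.

```python
import boto3

# Assumes credentials for the same AWS account and region as your Glean
# instance; the region and model ID below are examples.
runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

response = runtime.converse(
    modelId="anthropic.claude-3-5-haiku-20241022-v1:0",
    messages=[{"role": "user", "content": [{"text": "Reply with the word OK."}]}],
    inferenceConfig={"maxTokens": 16},
)
print(response["output"]["message"]["content"][0]["text"])
```

A successful response confirms that model access and quota are in place on the AWS side.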
Verify model used in Glean
- Go to Glean Chat and choose the Public Knowledge Assistant.
- Ask the question: “Who created you?”
- You should receive a response like: “I was created by the artificial intelligence company Anthropic.”
FAQ
How do you ensure data security?
All data is encrypted in transit between your Glean instance and the Amazon Bedrock service, which operate in the same AWS region. Amazon Bedrock does not use customer prompts and completions to train AWS models or share them with third parties. Model providers do not have access to Amazon Bedrock logs or customer data.
How do you handle potentially harmful content?
Please refer to the Amazon Bedrock abuse detection guide.
How can we estimate LLM costs?
Token usage varies by request type. For answers retrieved from the Glean search engine, the current token usage is:
- Claude 3.5 Sonnet v2 or Claude 3.7 Sonnet: 19,000 input tokens + 450 output tokens.
- Claude 3.5 Haiku: 5,300 input tokens + 150 output tokens.
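To turn these token counts into a dollar figure, multiply them by your expected request volume and the on-demand per-token prices for your region. The sketch below is a minimal example; the prices are placeholders, so substitute the current Amazon Bedrock pricing for the models you use.

```python
# Placeholder on-demand prices in USD per 1,000 tokens; replace with the
# current Amazon Bedrock pricing for your region and model.
PRICE_PER_1K_INPUT = 0.003
PRICE_PER_1K_OUTPUT = 0.015

def cost_per_request(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a single request."""
    return (
        (input_tokens / 1_000) * PRICE_PER_1K_INPUT
        + (output_tokens / 1_000) * PRICE_PER_1K_OUTPUT
    )

# Example: search-based answers on Claude 3.7 Sonnet, ~10,000 requests/month.
per_request = cost_per_request(19_000, 450)
print(f"~${per_request:.4f} per request, ~${per_request * 10_000:,.2f} per month")
```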