Supported Features and Limitations
The Outlook Connector supports indexing the text content of Outlook emails and related metadata, providing fast, secure searching of your mailbox. Incremental syncs and deletion handling ensure the index remains up to date. When emails are deleted from the original inbox, they are deleted from the Glean index as well.
Supported Objects/Entities
Object/Entity | Description |
---|---|
Emails | Inbox, Sent mail, and configurable mail folders (for users in the product access group). Individual messages are grouped by thread, per user. |
Supported API Endpoints/Features
- Microsoft Graph API for mailbox data (Mail.Read, Calendars.Read, and related scopes)
- Full and incremental email syncs using delta queries and webhooks
- Deletion handling (removal from Glean when deleted in Outlook)
- User/domain scoping via Azure AD product access group and allowlist
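The incremental sync and deletion handling listed above can be illustrated with a minimal sketch of one delta round. It assumes the standard Microsoft Graph delta response shape (`value`, `@odata.nextLink`, `@odata.deltaLink`, and `@removed` on deleted items). The `apply_delta` function, the in-memory `index` dictionary, and the `fetch_page` callable are hypothetical names for illustration and do not describe Glean's actual implementation.

```python
from typing import Callable, Dict, Optional

# Microsoft Graph delta endpoint for messages in one mail folder.
# The placeholders are filled in by the caller.
GRAPH_DELTA_URL = (
    "https://graph.microsoft.com/v1.0/users/{user_id}"
    "/mailFolders/{folder_id}/messages/delta"
)


def apply_delta(
    index: Dict[str, dict],
    start_url: str,
    fetch_page: Callable[[str], dict],
) -> Optional[str]:
    """Apply one delta round to `index` and return the next deltaLink.

    `fetch_page` is a hypothetical callable that performs an authenticated
    GET and returns the decoded JSON body of one page.
    """
    url: Optional[str] = start_url
    while url:
        page = fetch_page(url)
        for item in page.get("value", []):
            if "@removed" in item:
                # Deleted in Outlook: drop it from the index as well.
                index.pop(item["id"], None)
            else:
                # New or changed message: insert or overwrite.
                index[item["id"]] = item
        if "@odata.deltaLink" in page:
            # Last page of this round; store the link for the next sync.
            return page["@odata.deltaLink"]
        url = page.get("@odata.nextLink")
    return None
```

On the first (full) sync, `start_url` would be the formatted `GRAPH_DELTA_URL`; on later syncs it would be the delta link returned by the previous round, so only changes since then are transferred.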
Limitations
- Attachments: Attachments are not currently indexed; only the text content of emails is available.
- Shared/Delegated Mailboxes: Not supported. Only user mailboxes within the allowed group are indexed.
- Junk/Spam Folders: Not indexed. If an email is marked as junk after it has been crawled, it is removed from the index on the next sync.
- Volume and Time Window: By default, the connector indexes up to 12 months of mail and up to 15,000 threads per user.
- On-Premises/Legacy Exchange: Only cloud-based Microsoft 365 (Exchange Online) is supported.
- Group Conversations: Not supported.
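As a rough illustration of the default volume and time window, the sketch below selects which threads fall inside the limits. How the two limits combine is an assumption here (first filter by age, then keep the most recently active threads up to the cap), and 12 months is approximated as 365 days. The function name `select_threads` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def select_threads(
    last_activity: Dict[str, datetime],
    now: datetime,
    max_age_days: int = 365,    # assumption: "12 months" taken as 365 days
    max_threads: int = 15_000,  # default per-user thread cap from the docs
) -> List[str]:
    """Return thread IDs inside the window, most recent first, capped."""
    cutoff = now - timedelta(days=max_age_days)
    recent = [
        (timestamp, thread_id)
        for thread_id, timestamp in last_activity.items()
        if timestamp >= cutoff
    ]
    # Newest activity first, so the cap drops the oldest threads.
    recent.sort(reverse=True)
    return [thread_id for _, thread_id in recent[:max_threads]]
```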
Requirements
The Outlook Connector requires integration at the Azure tenant level and uses Microsoft Graph API with secure, certificate-based authentication. Some configuration and setup operations require tenant administrator access in Azure.
Technical Requirements
- Microsoft 365 tenant with Exchange Online mailboxes
- Glean tenant with Admin Console access
- Registered Azure AD Application for Glean with required API permissions
Credential Requirements
- Azure AD Application (client ID)
- Directory/Tenant ID (from Azure)
- Uploaded certificate/public key for authentication (X.509 certificate)
- (Optional) productAccessGroupId (Azure AD group object ID corresponding to allowed users)
- (Optional) List of allowed domains for scoping index
Permission Requirements
- Required Microsoft Graph API application permissions:
  - Mail.Read
  - Calendars.Read
  - User.Read.All
  - GroupMember.Read.All
- Consent by Azure tenant administrator
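One way to confirm that admin consent actually took effect is to inspect the `roles` claim of an app-only access token, which is where the Microsoft identity platform lists granted application permissions. The sketch below is a diagnostic aid only: it decodes the token payload without verifying the signature, and Microsoft treats Graph tokens as opaque to clients, so the format is not guaranteed. The helper names `token_roles` and `missing_permissions` are hypothetical; pass the permission list from this section as `required`.

```python
import base64
import json
from typing import Iterable, Set


def token_roles(access_token: str) -> Set[str]:
    """Return the `roles` claim of a JWT access token (signature NOT verified)."""
    payload_b64 = access_token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    return set(claims.get("roles", []))


def missing_permissions(access_token: str, required: Iterable[str]) -> Set[str]:
    """Return the required application permissions absent from the token."""
    return set(required) - token_roles(access_token)
```

An empty result suggests that every required permission has been granted; any names returned point to permissions that still need to be assigned or consented to.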
Preliminary Source/System Setup
- Register Glean as an application in Azure AD
- Generate a certificate (usually from the command line, as creating one in the Azure UI may not be supported) and upload it to the registered application
- Assign required Graph API permissions and grant admin consent
- Identify or create the Azure AD group to use as the product access group; copy the object ID for configuration
Permissions & Security
Data and Metadata Ingested:
- Outlook email threads and message metadata (subject, thread, participants, timestamps)
- No attachments or content in spam/junk are indexed
- All index scope is enforced by the Azure product access group and (optionally) domain allowlist
- Permissions on indexed emails are mapped to the mailbox owner (user) as defined in Azure AD
- No cross-user permissioning or group/delegated mailbox support
- Only application-level read permissions are used; write permissions are never requested
- Certificate-based authentication is mandatory
- Group/domain scoping restricts exposure of content
- Deletions and junk status in Outlook are synced to Glean, removing data as needed
- Attachments are excluded for security; support may be opt-in and domain-restricted once released
- Shared mailboxes, cross-mailbox thread deduplication, and delegated permissions are not supported
- All setup, deletion, and recreation require tenant-level admin rights
- Only Microsoft 365 (Exchange Online) is supported (not on-premises Exchange)
- If a user is removed from the product access group, their email content is deleted at the next sync
- All indexed data is restricted to the explicit scope of the Azure product access group and domain allowlist
- No information from deleted or excluded mailboxes is retained
- Least-privilege access model: only data needed for search is indexed and exposed
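The scoping rules above can be sketched as a small membership check. This is a minimal illustration, assuming that a mailbox is in scope only if its owner belongs to the product access group and, when an allowlist is configured, the address domain is on it. The names `in_scope` and `plan_sync` are hypothetical, and the way Glean actually combines the two controls may differ.

```python
from typing import Iterable, Optional, Set, Tuple


def in_scope(
    email: str,
    group_members: Iterable[str],
    allowed_domains: Optional[Iterable[str]] = None,
) -> bool:
    """True if the mailbox owner is in the group and passes the allowlist."""
    address = email.lower()
    members = {m.lower() for m in group_members}
    if address not in members:
        return False
    if allowed_domains:
        domain = address.rsplit("@", 1)[-1]
        return domain in {d.lower() for d in allowed_domains}
    # No allowlist configured: group membership alone decides.
    return True


def plan_sync(
    indexed: Iterable[str],
    group_members: Iterable[str],
    allowed_domains: Optional[Iterable[str]] = None,
) -> Tuple[Set[str], Set[str]]:
    """Return (mailboxes to index, already-indexed mailboxes to purge)."""
    members = {m.lower() for m in group_members}
    keep = {m for m in members if in_scope(m, members, allowed_domains)}
    # Anything indexed earlier that is no longer in scope is removed,
    # e.g. a user who left the product access group.
    purge = {m.lower() for m in indexed} - keep
    return keep, purge
```

For example, if a previously indexed user is no longer in `group_members`, their address lands in the purge set, which mirrors the rule that their email content is deleted at the next sync.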