SharePoint and OneDrive
Supported Crawling Restrictions for SharePoint
Overview
- Greenlist restrictions permit crawling only for the specified content.
- Redlist restrictions prohibit crawling for the specified content.
Restriction Type | Greenlist | Redlist | Details |
---|---|---|---|
Time-based Restrictions | ✅ | ❌ | Restrict crawling to include/exclude content created/modified/viewed after a certain date. |
Identity-based Restrictions | ✅ | ❌ | Restrict crawling to include/exclude content created/modified/viewed by specific users or a specific group (plus public content). |
Content-based Restrictions | ✅ | ❌ | Restrict crawling to include/exclude specific content, documents, messages, or objects. |
Supported Restrictions
Restriction | Greenlist | Redlist | Details |
---|---|---|---|
Date | ✅ | ❌ | Restrict crawling to only content created/modified/viewed after a specific date. |
Entra ID Group | ✅ | ❌ | Restrict crawling to only content created/modified/viewed by users in a specific Entra ID group. Note: Public content is always crawled. |
Site | ✅ | ✅ | Restrict crawling to include/exclude specific SharePoint sites. |
Sites should be provided in URL format without a trailing forward slash. For example:
For Entra ID Group restrictions (Entra ID was formerly Azure AD), provide the Object ID of the group, NOT the group name. For example:
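The two format rules above can be checked before values are submitted. The following is a minimal sketch, not part of Glean's tooling: the helper names `normalize_site_url` and `is_object_id` are hypothetical, and it assumes only that site URLs are absolute and that Object IDs use the standard GUID layout.

```python
import re
from urllib.parse import urlsplit

# Standard 8-4-4-4-12 hexadecimal GUID layout (assumed format for Object IDs).
_GUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def normalize_site_url(url: str) -> str:
    """Return the site URL with surrounding whitespace and trailing slashes removed."""
    cleaned = url.strip().rstrip("/")
    parts = urlsplit(cleaned)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute site URL: {url!r}")
    return cleaned


def is_object_id(value: str) -> bool:
    """Return True if the value looks like a GUID-style Object ID, not a group name."""
    return _GUID_RE.fullmatch(value.strip()) is not None
```

With a placeholder tenant, `normalize_site_url("https://contoso.sharepoint.com/sites/Engineering/")` returns the same URL without the trailing slash, and `is_object_id("Engineering Team")` returns `False` because a display name is not a GUID.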
Limitations
Date
- This restriction cannot currently be applied in the Admin UI.
- Time-based restrictions do NOT speed up crawling, because vendor APIs often do not return content ordered by date.
- For specific types of content, some vendor APIs do not return viewed dates older than roughly 4-5 months.
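To illustrate why a date restriction does not shorten the crawl, here is a minimal sketch of filtering an unordered listing. It is an illustration only and not the crawler's implementation: the item shape, the `filter_by_date` name, and the sample data are hypothetical, and whether the cutoff date itself is included is an assumption (this sketch excludes it).

```python
from datetime import date


def filter_by_date(items, cutoff):
    """Keep items modified after the cutoff and count how many were inspected.

    The listing has no ordering guarantee, so the loop cannot stop early:
    a recent item may appear after any number of old ones.
    """
    kept = []
    inspected = 0
    for item in items:
        inspected += 1
        if item["modified"] > cutoff:
            kept.append(item)
    return kept, inspected


# Hypothetical listing, returned in no particular date order.
items = [
    {"name": "a.docx", "modified": date(2023, 1, 10)},
    {"name": "b.docx", "modified": date(2024, 6, 1)},
    {"name": "old.xlsx", "modified": date(2022, 3, 5)},
    {"name": "c.pptx", "modified": date(2024, 2, 20)},
]

kept, inspected = filter_by_date(items, date(2024, 1, 1))
```

Only two of the four items pass the filter, yet all four are still inspected, which is why the restriction reduces what is indexed but not how long the listing takes.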
Entra ID Group
- This restriction cannot currently be applied in the Admin UI.
- Public content, despite being accessible to all users in SharePoint, will only be viewable in Glean to users in this group.
Site
- Every subsite must be explicitly specified. You cannot specify a site collection or parent site and have the crawler include/exclude all of its subsites.
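The subsite limitation amounts to exact matching rather than prefix matching. The sketch below shows that behavior under stated assumptions: `is_site_listed` is a hypothetical helper, the URLs are placeholders, and the real crawler's matching logic is not documented here.

```python
def is_site_listed(site_url, listed_sites):
    """Return True only if the site URL itself appears in the list.

    Matching is exact (ignoring a trailing slash), so listing a parent
    site does not cover the subsites beneath it.
    """
    normalized = site_url.rstrip("/")
    return normalized in {site.rstrip("/") for site in listed_sites}


# Hypothetical greenlist that names a parent site but not its subsite.
greenlist = ["https://contoso.sharepoint.com/sites/Engineering"]

parent_listed = is_site_listed(
    "https://contoso.sharepoint.com/sites/Engineering", greenlist
)
subsite_listed = is_site_listed(
    "https://contoso.sharepoint.com/sites/Engineering/Platform", greenlist
)
```

Here `parent_listed` is `True` and `subsite_listed` is `False`; to cover the subsite, its URL would have to be added to the list as a separate entry.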
Applying Restrictions
Method | Supported | Details |
---|---|---|
Admin UI | ✅ | Restrictions can be applied in the Admin UI under the connector settings. |
Glean Support | ✅ | Restrictions can be applied by Glean support on request. |
Not all restrictions can be applied in the Admin UI. Please contact Glean support to apply the restriction if it is missing from the UI.