Box
Crawling Restrictions
Glean supports the following exclusion and inclusion rules for Box, which let you manage what data Glean crawls.
Overview
- Greenlist restrictions permit Glean to only crawl and index specified content (specific include).
- Redlist restrictions permit Glean to crawl and index everything except the specified content (specific exclude).
Restriction Type | Greenlist | Redlist | Details |
---|---|---|---|
Time-based Restrictions | ✅ | ❌ | Restrict crawling to include/exclude content created/modified/viewed after a certain date. |
User-based Restrictions | ❌ | ✅ | Restrict crawling to include/exclude content created/modified/viewed by specific users or a specific group (plus public content). |
Content-based Restrictions | ❌ | ✅ | Restrict crawling to include/exclude specific content, documents, messages, or objects (see below). |
Supported Restrictions
Restriction | Greenlist | Redlist | Details |
---|---|---|---|
Date | ✅ | ❌ | Restrict crawling to only content created/modified/viewed after a specific date, given in YYYY-MM-DD format. |
User (Owner) | ❌ | ✅ | Restrict crawling to exclude content owned by specific users. |
Content (Folder) | ✅ | ✅ | Restrict crawling to include or exclude content within specific folders. When greenlisting folders, only the top-level folder ID is supported; greenlisting a nested folder is not supported. This limitation does not apply to redlisting. |
Content (File) | ❌ | ✅ | Restrict crawling to exclude specific files. |
Event Types | ❌ | ✅ | Restrict activity/content updates to specific event types (e.g. DOWNLOAD). |
Service Account (Email) | ❌ | ✅ | Restrict activity/content updates from certain service accounts (e.g. bots or services that synchronize or back up content to Box). |
When applying a folder greenlist, only the top-level folder ID is supported. Specifying a nested folder ID will not work. Redlisting is not affected by this limitation.
Applying Restrictions
Method | Supported | Details |
---|---|---|
Admin UI | ❌ | Restrictions cannot currently be applied in the Admin UI. |
Glean Support | ✅ | Restrictions can be applied by Glean Support on request. |
Format
When specifying restrictions for Owners, Folders, or Files, the ID of the owner, folder, or file within Box must be specified. For example:
- Owner IDs:
- Folder IDs:
- File IDs:
Locating User IDs
As a Box admin, navigate to the Box Content Manager.
Click on a user from the user list, and the URL will reveal their user ID. For example, in https://app.box.com/master/content/2267862105/0/0, the user ID is 2267862105.
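If you need to collect several user IDs to send to Glean Support, the URL lookup can be scripted. The following is a minimal sketch, not an official tool. It assumes every Content Manager URL follows the same /master/content/<user ID>/... pattern as the example URL, and the function name box_user_id_from_url is hypothetical.

```python
from urllib.parse import urlparse


def box_user_id_from_url(url: str) -> str:
    """Return the user ID embedded in a Box Content Manager URL.

    Assumption: the path has the form /master/content/<user_id>/...,
    matching the example URL on this page. Other URL shapes are rejected.
    """
    # Split the path into its non-empty segments.
    parts = [segment for segment in urlparse(url).path.split("/") if segment]
    looks_valid = (
        len(parts) >= 3
        and parts[:2] == ["master", "content"]
        and parts[2].isdigit()
    )
    if not looks_valid:
        raise ValueError(f"Not a recognized Content Manager URL: {url}")
    return parts[2]


if __name__ == "__main__":
    example = "https://app.box.com/master/content/2267862105/0/0"
    print(box_user_id_from_url(example))
```

Run against the example URL, this prints 2267862105. URLs that do not match the assumed pattern raise a ValueError rather than returning a guess.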