SharePoint Connector API Endpoints
Overview of the SharePoint API endpoints used by the Glean SharePoint connector
Usage Methodology
Glean uses the Microsoft Graph API and the SharePoint REST API to crawl your SharePoint and OneDrive environments.
Glean follows Microsoft's recommended best practices both for crawling and for recording incremental changes to all documents.
Authentication Endpoints
Endpoint | Permissions | URL |
---|---|---|
Token request (Graph API) Obtain and refresh an access token to interact with the Graph API using OAuth 2.0. | - | https://login.microsoftonline.com/<tenant>/oauth2/v2.0/token |
Token request (SharePoint REST API) Obtain and refresh an access token to interact with the SharePoint REST API using OAuth 2.0. | - | https://accounts.accesscontrol.windows.net/<tenant_id>/tokens/OAuth/2 |
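As an illustration of the Graph API token request above, here is a minimal sketch that builds the request URL and form body. It assumes the OAuth 2.0 client credentials grant with a client secret, which this page does not specify; the function name and the sample values are hypothetical, and nothing is sent over the network.

```python
from typing import Tuple
from urllib.parse import urlencode

# URL template taken from the Authentication Endpoints table.
GRAPH_TOKEN_URL = "https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token"


def build_graph_token_request(tenant: str, client_id: str, client_secret: str) -> Tuple[str, bytes]:
    """Return (url, form-encoded body) for a client-credentials token request.

    Assumption: the app authenticates with a client secret. A certificate-based
    setup would send a client assertion instead.
    """
    url = GRAPH_TOKEN_URL.format(tenant=tenant)
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # ".default" requests the application permissions already granted to the app.
        "scope": "https://graph.microsoft.com/.default",
    }
    return url, urlencode(form).encode("ascii")
```

The returned body would be sent as an `application/x-www-form-urlencoded` POST to the returned URL; the JSON response carries the `access_token` used as a bearer token on the Graph endpoints listed below.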
Identity Endpoints
Endpoint | Permissions | URL |
---|---|---|
List users List all the users within the tenant. | User.Read.All | https://graph.microsoft.com/v1.0/users |
List groups List all the groups within the tenant. | GroupMember.Read.All, Member.Read.Hidden | https://graph.microsoft.com/v1.0/groups |
List group members List all the members of a group. | GroupMember.Read.All, Member.Read.Hidden | https://graph.microsoft.com/v1.0/groups/<group_id>/members |
Get profilePhoto Get the profile photo of a user. | User.Read.All | https://graph.microsoft.com/v1.0/users/<user_id>/photo/$value |
Get site groups Get the default site groups and associated user memberships for a given site from the SharePoint REST API. | Sites.FullControl.All | https://<site_domain>.sharepoint.com/sites/<subsite_url>/_api/web/SiteGroups?$expand=Users |
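The Graph list endpoints above (users, groups, group members) return results in pages: each response carries a `value` array and, when more results remain, an `@odata.nextLink` URL. The sketch below shows one way to walk those pages. The `fetch_json` callable is a hypothetical stand-in for an authenticated HTTP GET that returns parsed JSON; it is not part of any Glean or Microsoft library.

```python
from typing import Any, Callable, Dict, Iterator


def iter_collection(
    first_url: str,
    fetch_json: Callable[[str], Dict[str, Any]],
) -> Iterator[Dict[str, Any]]:
    """Yield every entry of a paged Graph collection.

    fetch_json is a caller-supplied (hypothetical) function that performs an
    authenticated GET on the given URL and returns the decoded JSON body.
    """
    url = first_url
    while url:
        page = fetch_json(url)
        # Each page holds its entries under "value".
        yield from page.get("value", [])
        # The link is absent on the last page, which ends the loop.
        url = page.get("@odata.nextLink")
```

For example, `iter_collection("https://graph.microsoft.com/v1.0/users", fetch_json)` would yield one dictionary per user across all pages.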
Content Endpoints
Sites
Sites include both SharePoint site pages and the associated site metadata required for document library crawls.
Endpoint | Permissions | URL |
---|---|---|
List sites List all site collections within the tenant. | Sites.Read.All | https://graph.microsoft.com/v1.0/sites/delta |
List subsites List all the subsites within a site or subsite. | Sites.Read.All | https://graph.microsoft.com/v1.0/sites/<id>/sites |
List lists List all the lists within the site. | Sites.Read.All | https://graph.microsoft.com/v1.0/sites/<site_id>/lists |
List columns List all columns within the site (attributes of site). | Sites.Read.All | https://graph.microsoft.com/v1.0/sites/<id>/sites/<id>/columns |
List items delta List all items (metadata) from the delta endpoint. Used heavily in conjunction with the `List sites` endpoint, because `List sites` on its own only returns site collections from the main geolocation. | Sites.FullControl.All | https://graph.microsoft.com/v1.0/sites/<id>/sites/<id>/lists/<id>/items/delta |
Get site list items Get the items within a list for a site using the SharePoint REST API. The SharePoint REST API is used because some content for classic sites is not available via the Graph API. | Sites.FullControl.All | https://<site_domain>.sharepoint.com/sites/<subsite_url>/_api/web/lists('<list_id>')/items |
Get site item permissions Get the permissions for an item on the site using the SharePoint REST API. The SharePoint REST API is required for site pages / web components, as Graph API only exposes permissions for Document Library items. | Sites.FullControl.All | https://<site_domain>.sharepoint.com/sites/<subsite_url>/_api/web/lists('<list_id>')/items('<item_id>')/roleassignments |
Get page content Get the web parts on a particular page (e.g. blocks of content within text boxes, titles, etc.) using the SharePoint REST API. | Sites.FullControl.All | https://<site_domain>.sharepoint.com/sites/<subsite_url>/_api/web/GetFileById('<id>')/GetLimitedWebPartManager(scope=1)/ExportWebPart |
Drives
Drives include both OneDrive for Business (user drives) and Document Libraries on SharePoint Sites.
Endpoint | Permissions | URL |
---|---|---|
List drives List all the drives within a given site. | Files.Read.All | https://graph.microsoft.com/v1.0/sites/<site_id>/drives |
Get driveItem List all the items within a drive (change-based, as per Microsoft's scanning guidance). | Sites.FullControl.All | https://graph.microsoft.com/v1.0/drives/<drive_id>/root/delta |
Get driveItem resource Retrieve metadata for an item in a specified drive. | Files.Read.All | https://graph.microsoft.com/v1.0/drives/<drive_id>/items/<item_id> |
Download file Fetch the contents of an item to index its body. | Files.Read.All | https://graph.microsoft.com/v1.0/drives/<drive_id>/items/<item_id>/content |
Get permissions Get the permissions of a given item within a drive. | Files.Read.All | https://graph.microsoft.com/v1.0/drives/<drive_id>/items/<item_id>/permissions |
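The change-based `/root/delta` endpoint above is what makes incremental crawling possible. A delta round follows `@odata.nextLink` until the final page, which carries an `@odata.deltaLink`; starting the next round from that stored link returns only the changes made since. The following is a minimal sketch of that loop, not Glean's actual crawler. The `fetch_json` callable is a hypothetical authenticated GET helper that returns parsed JSON.

```python
from typing import Any, Callable, Dict, List, Tuple


def run_delta_round(
    start_url: str,
    fetch_json: Callable[[str], Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], str]:
    """Collect one round of drive changes and return (changes, delta_link).

    start_url is either the plain /root/delta URL (full enumeration) or the
    delta link saved from the previous round (incremental changes only).
    """
    changes: List[Dict[str, Any]] = []
    url = start_url
    while True:
        page = fetch_json(url)
        changes.extend(page.get("value", []))
        if "@odata.nextLink" in page:
            # More pages remain in this round.
            url = page["@odata.nextLink"]
            continue
        # The last page carries the link to persist for the next round.
        return changes, page["@odata.deltaLink"]
```

Items removed from the drive appear in the results with a `deleted` facet, so a consumer of `changes` can drop them from its index rather than re-fetching them.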
Activity Endpoints
Activity data is critical both to ranking search results correctly and to keeping content within Glean up to date.
Insights
The insights endpoint is used to enhance search rankings.
Endpoint | Permissions | URL |
---|---|---|
List used List recent activities performed by the user on specific items. | Sites.Read.All | https://graph.microsoft.com/v1.0/users/<user_id>/insights/used |
Reports
Glean uses the reports API endpoint to obtain site, page, user, and file usage information for SharePoint and OneDrive. This data is used to validate crawler progress and to ensure your search index is scaled correctly for the expected volume of data.
Endpoint | Permissions | URL |
---|---|---|
Get OneDrive Usage: File Count Get the total number of files across all sites and how many have been created, modified, and shared within the time period. | Reports.Read.All | https://graph.microsoft.com/v1.0/reports/getOneDriveUsageFileCounts(period='{period_value}') |
Get SharePoint Usage: Site Count Get the total number of active sites within the time period. | Reports.Read.All | https://graph.microsoft.com/v1.0/reports/getSharePointSiteUsageSiteCounts(period='{period_value}') |
Get SharePoint Usage: User Count Get the total number of active SharePoint users within the time period. | Reports.Read.All | https://graph.microsoft.com/v1.0/reports/getSharePointActivityUserCounts(period='{period_value}') |
Get SharePoint Usage: Pages Get the number of pages viewed across all sites within the time period. | Reports.Read.All | https://graph.microsoft.com/v1.0/reports/getSharePointSiteUsagePages(period='{period_value}') |
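The `{period_value}` placeholder in the report URLs above takes one of the period codes Microsoft Graph accepts for these reports: `D7`, `D30`, `D90`, or `D180`. The helper below is a small sketch that fills in the placeholder and rejects anything else; the function and constant names are illustrative only.

```python
# Period codes accepted by the Microsoft Graph usage report functions.
VALID_PERIODS = ("D7", "D30", "D90", "D180")

# Report functions listed in the table above.
REPORT_FUNCTIONS = (
    "getOneDriveUsageFileCounts",
    "getSharePointSiteUsageSiteCounts",
    "getSharePointActivityUserCounts",
    "getSharePointSiteUsagePages",
)


def report_url(function_name: str, period: str) -> str:
    """Build the Graph URL for one of the usage reports listed above."""
    if function_name not in REPORT_FUNCTIONS:
        raise ValueError(f"unknown report function: {function_name}")
    if period not in VALID_PERIODS:
        raise ValueError(f"period must be one of {VALID_PERIODS}, got {period!r}")
    return f"https://graph.microsoft.com/v1.0/reports/{function_name}(period='{period}')"
```

For instance, `report_url("getSharePointSiteUsagePages", "D30")` produces the URL for page views over the last 30 days.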
Webhooks
Webhooks allow Glean to sync changes to content in your environment as those changes occur, instead of waiting for the daily incremental crawl to complete. Examples include a document being deleted or its access permissions changing.
Endpoint | Permissions | URL |
---|---|---|
Create a webhook subscription Glean subscribes to the `driveItem` resource, which requires the `Files.ReadWrite.All` permission (the least-privileged option) to create the subscription. | Files.ReadWrite.All | https://webhook.azurewebsites.net/api/send/<client> |
Reauthorize a webhook subscription Reauthorize a subscription after timeout when a `reauthorizationRequired` challenge is received. | Files.ReadWrite.All | https://graph.microsoft.com/v1.0/subscriptions/<subscriptionsId>/reauthorize |
Without webhooks, changes within SharePoint and OneDrive can take up to 24 hours to be processed (via incremental crawling), compared to under 2 hours with webhooks. This includes any changes to document permissions.
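To make the subscription step more concrete, the sketch below builds the JSON body that Microsoft Graph expects when a `driveItem` subscription is created with a POST to `https://graph.microsoft.com/v1.0/subscriptions`. The field names are those of the Graph subscription resource. The function name, the one-day lifetime, the drive-root resource path, and the sample values are assumptions for illustration and do not describe Glean's actual configuration.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


def build_drive_subscription(
    drive_id: str,
    notification_url: str,
    client_state: str,
    now: Optional[datetime] = None,
    lifetime: timedelta = timedelta(days=1),  # assumed value, chosen for illustration
) -> Dict[str, str]:
    """Return the request body for a driveItem change subscription."""
    now = now or datetime.now(timezone.utc)
    expires = (now + lifetime).astimezone(timezone.utc)
    return {
        # Graph supports only the "updated" change type on drive roots.
        "changeType": "updated",
        "notificationUrl": notification_url,
        # Assumption: the subscription targets the root of a single drive.
        "resource": f"/drives/{drive_id}/root",
        "expirationDateTime": expires.strftime("%Y-%m-%dT%H:%M:%SZ"),
        # Opaque value echoed back in notifications so the receiver can verify them.
        "clientState": client_state,
    }
```

Because subscriptions expire, a receiver also has to renew them before `expirationDateTime` and respond to `reauthorizationRequired` lifecycle events through the reauthorize endpoint listed in the table.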