A policy is a configurable set of rules that defines how sensitive content is detected, monitored, and managed within your organization’s data ecosystem. Policies specify criteria such as:
  • The types of information to detect (e.g., predefined info types, custom terms, regular expressions)
  • The scope of data sources
  • How often the policy runs (frequency)
  • Exclusions for content that should not be reported
Each policy lets administrators establish and automate protection measures tailored to organizational needs, supporting compliance with data protection requirements and reducing the risk of unauthorized data exposure. You can manage policies through the Sensitive findings page, which provides tools for policy creation, enforcement, and continuous monitoring of violations.

Note: Policies differ from Reports. Policies are scheduled, ongoing scans with in-product triage in the Sensitive findings dashboard. Reports are ad-hoc, one-time CSV exports for offline review.

Policy scope

Your policy’s scope determines what information Glean will review. The scope can be configured using the following criteria:
  • Data sources: Specify whether the policy applies to all data sources in your organization, or only to selected repositories or platforms.
  • Time period: Choose the range of document activity (such as when a document was viewed, created, or modified) that the policy will review.
  • Permissions: Set parameters for which documents are included based on user or group access levels (e.g., documents visible to all users, specific roles, or external collaborators).

Permissions

Narrow the scope of documents to scan based on how broadly each document is shared. A document is included in the sensitive content search if it meets any one of these conditions:
  • “Visible to anyone in your organization” refers to documents that can be viewed by anyone at your company. For example, a Slack thread posted in a public channel or a Google Doc that can be searched and accessed by anyone at your company.
  • “Visible to anyone on the internet” refers to documents that can be searched and accessed by individuals outside your organization (e.g. a Google Doc that can be viewed by “Anyone on the internet with the link”).
  • “Visible to [N] people or more, internal or external to your organization” refers to documents that are accessible to at least N people. You cannot choose a value lower than 5, because documents accessible to four or fewer people generally present a lower risk, and scanning them may significantly increase processing time.
  • “Specific teammates” refers to documents that can be accessed by the specific users listed in this field.
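
As a rough illustration of the any-of logic above, the following Python sketch models the four permission options. The Doc and PermissionScope classes and their field names are hypothetical and are not part of any Glean API. Only the minimum value of 5 comes from the description above.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# From the text above: values lower than 5 cannot be chosen.
MIN_AUDIENCE = 5


@dataclass
class Doc:
    """Hypothetical document record; field names are illustrative only."""
    org_wide: bool = False      # visible to anyone in your organization
    public: bool = False        # visible to anyone on the internet
    viewer_count: int = 0       # people with access, internal or external
    viewers: Set[str] = field(default_factory=set)


@dataclass
class PermissionScope:
    """Hypothetical mirror of the four permission options."""
    include_org_wide: bool = False
    include_public: bool = False
    min_viewers: Optional[int] = None   # the N in "visible to N people or more"
    teammates: Set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        if self.min_viewers is not None and self.min_viewers < MIN_AUDIENCE:
            raise ValueError(f"min_viewers must be at least {MIN_AUDIENCE}")


def in_scope(doc: Doc, scope: PermissionScope) -> bool:
    """Return True if the document meets any one selected condition."""
    return (
        (scope.include_org_wide and doc.org_wide)
        or (scope.include_public and doc.public)
        or (scope.min_viewers is not None
            and doc.viewer_count >= scope.min_viewers)
        or bool(scope.teammates & doc.viewers)
    )
```

For example, with a scope of PermissionScope(include_public=True, min_viewers=10), a public document and a document shared with 12 people are both included, while a document shared with 3 people is not.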

Sensitive content

The sensitive content you define determines what types of information Glean’s sensitive insights feature will detect. You can specify sensitive data in multiple ways:
  • Selecting info types, either individually or from a recommended list. Info types include credit card numbers, dates of birth, Social Security numbers (SSNs), and more.
  • Entering custom terms for Glean to match. Terms are specific words or phrases that correspond to important company information, like employee IDs or job titles.
  • Defining rules using regular expressions to match specific data formats or keywords. Regular expressions help you find custom types of sensitive information that follow a flexible format, like record numbers or user IDs. These expressions use the RE2 syntax.
This configurable approach allows you to create policies that accurately identify a broad range of sensitive content, supporting any compliance and organizational needs you may have. You can also adjust your policy to exclude content that is not sensitive, but may otherwise turn up in your findings. For example, if you set a policy to report email addresses as sensitive, you may wish to exclude a sample user (sample-user@example.com).
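
Before saving a regular expression in a policy, it can help to try it on sample text. The sketch below uses Python's built-in re module, which is not RE2, so the patterns deliberately avoid features RE2 does not support (such as backreferences and lookarounds). The EMP-123456 employee ID format, the email pattern, and the find_sensitive helper are assumptions made for illustration. The excluded address is the sample user from the example above.

```python
import re

# Hypothetical employee ID format: "EMP-" followed by six digits.
EMPLOYEE_ID = re.compile(r"\bEMP-[0-9]{6}\b")

# Simplified email pattern, for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Matches to drop from the findings, as in the exclusion example above.
EXCLUDED = {"sample-user@example.com"}


def find_sensitive(text):
    """Return employee ID matches, then email matches, minus exclusions."""
    hits = EMPLOYEE_ID.findall(text) + EMAIL.findall(text)
    return [hit for hit in hits if hit.lower() not in EXCLUDED]


sample = "Contact sample-user@example.com or jane@corp.example about EMP-004217."
print(find_sensitive(sample))  # ['EMP-004217', 'jane@corp.example']
```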

Frequency

All policies run on a recurring schedule. You can adjust how often policies run depending on your needs. For the highest priority sensitive content, you will likely want to set a continuous frequency to ensure that any findings are identified and addressed promptly. For lower priority sensitive content, you may wish to run on a weekly basis.
Some events and data sources do not feed into continuous scanning. To cover them, a continuous policy also runs a periodic (weekly) scan that picks up any in-scope documents the continuous scan did not pick up.
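
One way to picture how the two scans relate is as a set difference: the weekly scan covers whatever is in scope but was not reached by continuous scanning. This is only a conceptual sketch. The function name and the document IDs are hypothetical and do not describe how Glean implements scanning.

```python
def weekly_catch_up(in_scope_docs, continuously_scanned):
    """Return the in-scope documents the continuous scan did not cover."""
    return set(in_scope_docs) - set(continuously_scanned)


# Hypothetical document IDs, for illustration only.
remaining = weekly_catch_up({"doc-1", "doc-2", "doc-3"}, {"doc-1"})
print(sorted(remaining))  # ['doc-2', 'doc-3']
```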

Create a policy

Create a policy to start generating findings.

Prerequisites

  • You must be a super admin or have the sensitive content moderator role enabled to create and view policies.
  • If you’re running Glean on AWS and want to create info type policies, your organization must have configured AWS for info type scanning with GCP.

To create a policy:
  1. Navigate to the Glean Admin console > Glean Protect > Sensitive findings page, then select the policies tab.
  2. Select the Create policy button to start creating your policy. You can create a policy either from scratch or from a template.
  3. Once in the policy creation page:
    1. Define your policy’s scope:
      1. Choose a data source, or scope your policy to all data sources in Glean.
      2. Define the time period your policy will apply to.
      3. Select the permissions, or viewership, of the documents (e.g., anyone in the organization, anyone on the internet, or specific teammates).
    2. Define sensitive content. Choose any combination of info types, specific terms, and regular expressions.
    3. Specify any terms to exclude from your policy’s search.
    4. Define the frequency with which your policy will run.
  4. Name your policy, then select the Create policy button to save it.

Archive a policy

Active policies run on a recurring basis. When you no longer need a policy, you can archive it. Archived policies stop scanning and no longer generate findings. Any content that was hidden is made available again. To archive a policy:
  1. Navigate to the Glean Admin console > Glean Protect > Sensitive findings page, then select the policies tab.
  2. Select the menu icon on the right side of the policy you wish to archive.
  3. Select the Archive option.

Restore a policy

You can restore an archived policy at any time. Once archived policies are restored, they will begin generating findings in accordance with their frequency. To restore a policy:
  1. Navigate to the Glean Admin console > Glean Protect > Sensitive findings page, then select the policies tab.
  2. Select the Restore button for the policy you wish to restore.

Note: Only the following data sources are supported for continuous policies: Aha, Airtable, Asana, Bitbucket, Box, Confluence, Egnyte, Google Chat, Google Drive, GitLab, GitHub, Google Groups, Google Sites, Greenhouse, Guru, Jira, Lessonly, Lever, Miro, Microsoft Teams, O365 OneDrive, O365 SharePoint, PagerDuty, Quip, Slack, Seismic, Trello, WordPress, and Zendesk.

Supported info types

See Supported info types.
Last updated: July 2025