If you’d like to use sensitive content reporting, please reach out to your account executive or customer success manager at Glean.


A sensitive content report identifies documents that contain sensitive information within an organization so that you can manage them. The report helps detect and mitigate the risks associated with broadly shared or externally exposed sensitive content. This document explains how to run sensitive content reports and, if you’re running Glean on AWS, how to complete the configuration required for info type scanning.

Run a sensitive content report

The parts below explain the components that make up a sensitive content report, and how you can build your own.

Prerequisites

Before you can run a sensitive content report, ensure you meet the following prerequisites:


Part 1: Identify scope of report

Select the data sources for scanning. Then, narrow the scope of content based on: 1) the date it was last viewed, edited, or modified, and 2) its level of accessibility.

Data source

Specify the set of data sources that should be included. We recommend limiting each report to a few data sources for faster processing, especially if the data source has a lot of content indexed.

Time period

Narrow down the scope of documents to scan based on when they were last viewed, created, or modified.

We recommend choosing “past year” or using the “Custom” option to find documents from a specific time period. The “All-time” option is available but may take longer to process, especially for large data sources.

Permissions

Narrow down the scope of documents to scan based on how broadly they are shared. If a document meets any one of these conditions, we include it in the sensitive content search.

  • “Visible to anyone in your organization” refers to documents that can be viewed by anyone at your company. Examples include a Slack thread posted in a public channel and a Google Doc that anyone at your company can search for and access.

  • “Visible to anyone on the internet” refers to documents that can be searched and accessed by individuals outside your organization (e.g. a Google Doc that can be viewed by “Anyone on the internet with the link”).

  • “Visible to [N] people or more, internal or external to your organization” refers to documents that are accessible to at least N people. N cannot be lower than 5, because documents accessible to four or fewer people generally present a lower risk and including them may significantly increase the processing time.
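The "any one of these conditions" rule can be pictured with a short sketch. This is an illustration only, not Glean's implementation: the `DocAccess` class, its field names, and the `in_scope` function are hypothetical, and the only facts taken from the text above are the three conditions and the minimum threshold of 5 people.

```python
from dataclasses import dataclass
from typing import Optional

MIN_PEOPLE_THRESHOLD = 5  # the text above states that lower values are not allowed


@dataclass
class DocAccess:
    """Hypothetical summary of who can see a document."""
    visible_to_org: bool
    visible_to_internet: bool
    people_with_access: int


def in_scope(doc: DocAccess, org: bool, internet: bool,
             min_people: Optional[int]) -> bool:
    """Return True if the document meets at least one selected condition."""
    if min_people is not None and min_people < MIN_PEOPLE_THRESHOLD:
        raise ValueError("threshold must be at least 5 people")
    return (
        (org and doc.visible_to_org)
        or (internet and doc.visible_to_internet)
        or (min_people is not None and doc.people_with_access >= min_people)
    )


# A document shared with 12 named people, but not org-wide or public.
doc = DocAccess(visible_to_org=False, visible_to_internet=False,
                people_with_access=12)
print(in_scope(doc, org=True, internet=True, min_people=10))    # True
print(in_scope(doc, org=True, internet=True, min_people=None))  # False
```

Because the conditions are combined with OR, selecting more of them widens the scope of the report rather than narrowing it.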


Part 2: Define sensitive content

Glean can search for three types of sensitive content: info types, terms, and regular expressions. You can specify up to 100 items in total across these categories for each report.

Info types

Sensitive info types include items such as credit card numbers, dates of birth, and Social Security numbers (SSNs). You can toggle on our “recommended info types” and add any others you’d like. The complete list of supported info types and their descriptions appears below.

Terms

Sensitive terms are specific words or phrases that match important company information, like employee IDs or job titles. When Glean scans for these terms, it ignores case differences and replaces any non-letter or non-digit characters with spaces. For example, “John Doe” matches “john doe,” “John, Doe,” and “John (Doe).”
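The matching rule described above can be approximated with a few lines of Python. This is a minimal sketch, not Glean's actual matcher: the function names are hypothetical, and collapsing repeated spaces and matching on whole words are assumptions made so that the three examples above all match.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text and replace non-letter, non-digit characters with spaces."""
    spaced = re.sub(r"[\W_]", " ", text.lower())
    # Assumption: runs of spaces are collapsed so "John, Doe" lines up with "John Doe".
    return " ".join(spaced.split())


def term_matches(term: str, text: str) -> bool:
    """Return True if the normalized term appears as whole words in the normalized text."""
    return f" {normalize(term)} " in f" {normalize(text)} "


for candidate in ["john doe", "John, Doe", "John (Doe)", "Johnny Doe"]:
    print(candidate, term_matches("John Doe", candidate))
```

Under these assumptions the first three candidates match and "Johnny Doe" does not.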

The “hide terms” toggle lets you temporarily hide certain terms while you enter or view data. This only affects what you see on your screen and does not change the actual data, reports, or how the system stores the data.

Regular expressions

Sensitive regular expressions help you find custom types of sensitive information that follow a flexible format, like record numbers or user IDs. These expressions use the RE2 syntax, which is documented on GitHub.
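Before adding a pattern to a report, it can help to try it on sample text. The sketch below uses Python's built-in `re` module, which is not RE2, so it also applies a rough heuristic that rejects two features RE2 does not support: lookaround assertions and backreferences. The `REC-` record-number format and the helper name are hypothetical.

```python
import re

# Hypothetical record-number format: "REC-" followed by exactly 8 digits.
RECORD_NUMBER = r"\bREC-[0-9]{8}\b"

# Rough heuristic for constructs RE2 rejects: (?=  (?!  (?<=  (?<!  and \1 to \9.
UNSUPPORTED = re.compile(r"\(\?<?[=!]|\\[1-9]")


def looks_re2_compatible(pattern: str) -> bool:
    """Return True if the pattern compiles and avoids lookarounds and backreferences."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return UNSUPPORTED.search(pattern) is None


print(looks_re2_compatible(RECORD_NUMBER))            # True
print(looks_re2_compatible(r"(?<=ID:)\d+"))           # False
print(bool(re.search(RECORD_NUMBER, "see REC-12345678 for details")))  # True
```

The heuristic is deliberately simple and can misjudge unusual patterns; the RE2 syntax reference remains the authority on what is accepted.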


Part 3: Run report

You can run reports either individually or on a recurring basis. Glean supports the following frequencies:

  • One-time
    One-time reports start generating immediately after creation.

  • Weekly
    Weekly reports start generating on Friday evening (Pacific Time) on a weekly basis. The specific time and cadence of these scheduled reports are not configurable.

  • (GCP Only) Continuous
    Continuous reports generate an hourly report of documents created or modified in the past hour and run until canceled. Any time period specified in the scope section of the report will be ignored. Continuous reports do not appear for download in the Glean UI. Instead, they are placed in your cloud storage.

For both one-time and weekly reports, the run time varies with the number of documents. Report generation typically takes between an hour and a day.

Note:

Continuous reports cannot scan data sources that do not support webhooks or incremental crawls. Without that support, Glean cannot perform full scans or identity crawls in real time or incrementally.

Continuous Reports Supported Data Sources

Glean supports continuous reports on the following data sources:

  • Aha

  • Airtable

  • Asana

  • Bitbucket

  • Box

  • Confluence

  • Egnyte

  • Gchat

  • Gdrive

  • Gitlab

  • Github

  • Google Groups

  • Google Sites

  • Greenhouse

  • Guru

  • Jira

  • Lessonly

  • Lever

  • Miro

  • Microsoft Teams

  • O365 Onedrive

  • O365 Sharepoint

  • Pagerduty

  • Quip

  • Slack

  • Seismic

  • Trello

  • Wordpress

  • Zendesk


Part 4: Review the report

The user who generated the report will receive an email when the report is complete. The report can be downloaded as a CSV file from the Sensitive content reporting page by the user who generated it and anyone with the “Sensitive Content Moderator” role. Reports automatically expire after 1 year.

The report includes a row for each finding with the following information:

  • Document ID (internal to Glean, but prefixed with the data source the document belongs to)

  • Hyperlink to document

  • Email of the document owner

  • Department of the document owner

  • Sharing level (externally exposed or broadly shared)

  • Reason for document being considered broadly shared

  • Number of people the document is accessible to (“public_access” if accessible anonymously)

  • Name of the container (e.g. folder) the document is in

  • Document type

  • Data source of the document (e.g. Google Drive)

  • Number of unique visitors to the document

  • Number of total visits to the document

  • Match type of the sensitive content (e.g. SSN, PASSWORD, etc)

  • Likelihood of the match (e.g. LIKELY, VERY_LIKELY, etc)

  • Text snippet that caused the match (e.g. “passw0rd123!”)

  • Timestamp
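Once downloaded, the CSV can be summarized with a short script. The sketch below counts findings per data source and match type, keeping only the higher likelihood levels. The header names `datasource`, `match_type`, and `likelihood` are assumptions for illustration, so check them against the header row of your own report; the sample rows are made up.

```python
import csv
import io
from collections import Counter


def count_findings(rows, likelihoods=("LIKELY", "VERY_LIKELY")):
    """Count findings per (data source, match type) at the given likelihood levels."""
    counts = Counter()
    for row in rows:
        if row["likelihood"] in likelihoods:
            counts[(row["datasource"], row["match_type"])] += 1
    return counts


# Made-up sample rows; the header names are assumptions.
sample = io.StringIO(
    "datasource,match_type,likelihood\n"
    "gdrive,PASSWORD,VERY_LIKELY\n"
    "gdrive,PASSWORD,LIKELY\n"
    "slack,SSN,POSSIBLE\n"
    "slack,SSN,VERY_LIKELY\n"
)
counts = count_findings(csv.DictReader(sample))
for (source, match_type), n in counts.most_common():
    print(source, match_type, n)
```

To run it against a real report, replace the sample with a file opened in text mode, for example `open("report.csv", newline="")`, where the file name is whatever you saved the download as.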

If an error occurs, we will notify the user who tried to generate the report and start an investigation. Please do not try to generate the report again, as it may not fix the issue. No further action is needed from the user at this time.


Configure AWS for Infotype Scanning

Glean sensitive content reports rely on Google’s DLP API to classify data for info type scanning. If you’re only using regex and term detection, you can skip this section. Glean customers deployed on AWS must create a GCP account, or use an existing one, to run info type scanning. You can use the Glean UI to add a GCP DLP API key, which is then stored securely within your AWS deployment and used to make API calls.

Prerequisites

  • You must be running Glean on AWS

  • You must create or have an existing GCP account and project

  • Your GCP project must link to a billing account

Configure your GCP Project and Connect the DLP Service to Glean

  1. From your GCP project, enable the DLP API using the link:
    https://console.cloud.google.com/apis/api/dlp.googleapis.com/overview?project=[project_ID]

    Replace the [project_ID] with your GCP project ID.

  2. From the service accounts page, create a service account by selecting the project, then selecting create service account.

  3. From the IAM page, grant the DLP Administrator IAM role to the service account.

  4. From the service accounts page, generate an API key and download it to your computer.

  5. Upload the API key you created in the step above to Glean’s Sensitive content reporting page.
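Before uploading, a quick local check of the downloaded key can catch a wrong file or a wrong project. The sketch below assumes the key from step 4 is a service account key in JSON format, which the steps above do not state explicitly. The function name, the project ID, and the sample contents are hypothetical.

```python
import json
from typing import List

# Fields present in a Google Cloud service account JSON key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_key(raw: str, expected_project: str) -> List[str]:
    """Return a list of problems found in the key text; an empty list means none."""
    try:
        key = json.loads(raw)
    except json.JSONDecodeError:
        return ["file is not valid JSON"]
    if not isinstance(key, dict):
        return ["file does not contain a JSON object"]

    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append("type is not service_account")
    if key.get("project_id") and key["project_id"] != expected_project:
        problems.append("key belongs to a different project")
    return problems


# Made-up key contents for illustration only.
sample = json.dumps({
    "type": "service_account",
    "project_id": "my-dlp-project",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
    "client_email": "glean-dlp@my-dlp-project.iam.gserviceaccount.com",
})
print(check_key(sample, "my-dlp-project"))  # prints []
```

The check only inspects the structure of the file. It does not confirm that the DLP API is enabled or that the role from step 3 was granted.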


Frequently Asked Questions

Q: How many reports can I generate at a time?

A: You can generate up to 5 reports at a time. This includes all reports currently being generated as well as weekly scheduled reports.

Q: How do I cancel a report?

A: To cancel a report in progress, go to Sensitive content reporting. You will see a banner with a button to “Cancel report” or “Cancel weekly report.” Canceling a weekly report will cancel the entire series, as we do not support skipping weekly runs.

Q: Can I see what configurations I used for a report?

A: Yes! Go to Sensitive content reporting and click the name of the report you want to review. You can see the parameters you set for that report. If the report has finished generating, you can also see how long it took, the total number of documents scanned, and the number of violations found.


Supported info types

InfoType | Description
ADVERTISING_ID | Identifiers used by developers to track users for advertising purposes. These include Google Play Advertising IDs, Amazon Advertising IDs, Apple’s identifierForAdvertising (IDFA), and Apple’s identifierForVendor (IDFV).
AGE | An age measured in months or years.
CREDIT_CARD_NUMBER | A credit card number is 12 to 19 digits long. They are used for payment transactions globally.
CREDIT_CARD_TRACK_NUMBER | A credit card track number is a variable length alphanumeric string. It is used to store key cardholder information.
DATE | A date. This infoType includes most date formats, including the names of common world holidays.
DATE_OF_BIRTH | A date of birth.
DOMAIN_NAME | A domain name as defined by the DNS standard.
EMAIL_ADDRESS | An email address identifies the mailbox that emails are sent to or from. The maximum length of the domain name is 255 characters, and the maximum length of the local-part is 64 characters.
ETHNIC_GROUP | A person’s ethnic group.
FEMALE_NAME | A common female name.
FIRST_NAME | A first name is defined as the first part of a PERSON_NAME.
GENDER | A person’s gender identity.
IBAN_CODE | An International Bank Account Number (IBAN) is an internationally agreed-upon method for identifying bank accounts defined by the International Organization for Standardization (ISO) 13616:2007 standard. The European Committee for Banking Standards (ECBS) created ISO 13616:2007. An IBAN consists of up to 34 alphanumeric characters, including elements such as a country code or account number.
HTTP_COOKIE | An HTTP cookie is a standard way of storing data on a per website basis. This detector will find headers containing these cookies.
ICD9_CODE | The International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) lexicon is used to assign diagnostic and procedure codes associated with inpatient, outpatient, and physician office use in the United States. The US National Center for Health Statistics (NCHS) created the ICD-9-CM lexicon. It is based on the ICD-9 lexicon, but provides for more morbidity detail. The ICD-9-CM lexicon is updated annually on October 1.
ICD10_CODE | Like ICD-9-CM codes, the International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) lexicon is a series of diagnostic codes. The World Health Organization (WHO) publishes the ICD-10-CM lexicon to describe causes of morbidity and mortality.
IMEI_HARDWARE_ID | An International Mobile Equipment Identity (IMEI) hardware identifier, used to identify mobile phones.
IP_ADDRESS | An Internet Protocol (IP) address (either IPv4 or IPv6).
LAST_NAME | A last name is defined as the last part of a PERSON_NAME.
LOCATION | A physical address or location.
MAC_ADDRESS | A media access control address (MAC address), which is an identifier for a network adapter.
MAC_ADDRESS_LOCAL | A local media access control address (MAC address), which is an identifier for a network adapter.
MALE_NAME | A common male name.
MEDICAL_TERM | Terms that commonly refer to a person’s medical condition or health.
ORGANIZATION_NAME | A name of a chain store, business or organization.
PASSPORT | A passport number that matches passport numbers for the following countries: Australia, Canada, China, France, Germany, Japan, Korea, Mexico, The Netherlands, Poland, Singapore, Spain, Sweden, Taiwan, United Kingdom, and the United States.
PERSON_NAME | A full person name, which can include first names, middle names or initials, and last names.
PHONE_NUMBER | A telephone number.
STREET_ADDRESS | A street address.
SWIFT_CODE | A SWIFT code is the same as a Bank Identifier Code (BIC). It’s a unique identification code for a particular bank. These codes are used when transferring money between banks, particularly for international wire transfers. Banks also use the codes for exchanging other messages.
TIME | A timestamp of a specific time of day.
URL | A Uniform Resource Locator (URL).
AUTH_TOKEN | An authentication token is a machine-readable way of determining whether a particular request has been authorized for a user. This detector currently identifies tokens that comply with OAuth or Bearer authentication.
BASIC_AUTH_HEADER | A basic authentication header is an HTTP header used to identify a user to a server. It is part of the HTTP specification in RFC 1945, section 11.
ENCRYPTION_KEY | An encryption key within configuration, code, or log text.
GCP_CREDENTIALS | Google Cloud service account credentials. Credentials that can be used to authenticate with Google API client libraries and service accounts.
PASSWORD | Clear text passwords in configs, code, and other text.
WEAK_PASSWORD_HASH | A weakly hashed password is a method of storing a password that is easy to reverse engineer. The presence of such hashes often indicates that a system’s security can be improved.
XSRF_TOKEN | An XSRF token is an HTTP header that is commonly used to prevent cross-site request forgery attacks. Cross-site request forgery is a type of security vulnerability that can be exploited by malicious sites.

United States

InfoType | Description
AMERICAN_BANKERS_CUSIP_ID | An American Bankers’ Committee on Uniform Security Identification Procedures (CUSIP) number is a 9-character alphanumeric code that identifies a North American financial security.
FDA_CODE | The US National Drug Code (NDC) is a unique identifier for drug products, mandated in the United States by the Food and Drug Administration (FDA).
US_ADOPTION_TAXPAYER_IDENTIFICATION_NUMBER | A United States Adoption Taxpayer Identification Number (ATIN) is a type of United States Tax Identification Number (TIN). An ATIN is issued by the Internal Revenue Service (IRS) to individuals who are in the process of legally adopting a US citizen or resident child.
US_BANK_ROUTING_MICR | The American Bankers Association (ABA) Routing Number (also called the transit number) is a nine-digit code. It’s used to identify the financial institution that’s responsible to credit or entitled to receive credit for a check or electronic transaction.
US_DEA_NUMBER | A US Drug Enforcement Administration (DEA) number is assigned to a health care provider by the US DEA. It allows the health care provider to write prescriptions for controlled substances. The DEA number is often used as a general “prescriber number” that is a unique identifier for anyone who can prescribe medication.
US_DRIVERS_LICENSE_NUMBER | A driver’s license number for the United States. Format can vary depending on the issuing state.
US_EMPLOYER_IDENTIFICATION_NUMBER | A United States Employer Identification Number (EIN) is also known as a Federal Tax Identification Number, and is used to identify a business entity.
US_HEALTHCARE_NPI | The US National Provider Identifier (NPI) is a unique 10-digit identification number issued to health care providers in the United States by the Centers for Medicare and Medicaid Services (CMS). The NPI has replaced the unique provider identification number (UPIN) as the required identifier for Medicare services. It’s also used by other payers, including commercial healthcare insurers.
US_INDIVIDUAL_TAXPAYER_IDENTIFICATION_NUMBER | A United States Individual Taxpayer Identification Number (ITIN) is a type of Tax Identification Number (TIN), issued by the Internal Revenue Service (IRS). An ITIN is a tax processing number only available for certain nonresident and resident aliens, their spouses, and dependents who cannot get a Social Security Number (SSN).
US_PASSPORT | A United States passport number.
US_PREPARER_TAXPAYER_IDENTIFICATION_NUMBER | A United States Preparer Taxpayer Identification Number (PTIN) is an identification number that all paid tax return preparers must use on US federal tax returns or claims for refund submitted to the US Internal Revenue Service (IRS).
US_SOCIAL_SECURITY_NUMBER | A United States Social Security number (SSN) is a 9-digit number issued to US citizens, permanent residents, and temporary residents. The Social Security number has effectively become the United States national identification number.
US_STATE | A United States state name.
US_TOLLFREE_PHONE_NUMBER | A US toll-free telephone number.
US_VEHICLE_IDENTIFICATION_NUMBER | A vehicle identification number (VIN) is a unique 17-digit code assigned to every on-road motor vehicle.

Argentina

InfoType | Description
ARGENTINA_DNI_NUMBER | An Argentine Documento Nacional de Identidad (DNI), or national identity card, is used as the main identity document for citizens.

Australia

InfoType | Description
AUSTRALIA_DRIVERS_LICENSE_NUMBER | An Australian driver’s license number.
AUSTRALIA_MEDICARE_NUMBER | A 9-digit Australian Medicare account number is issued to permanent residents of Australia (except for Norfolk island). The primary purpose of this number is to prove Medicare eligibility to receive subsidized care in Australia.
AUSTRALIA_PASSPORT | An Australian passport number.
AUSTRALIA_TAX_FILE_NUMBER | An Australian tax file number (TFN) is a number issued by the Australian Tax Office for taxpayer identification. Every taxpaying entity, such as an individual or an organization, is assigned a unique number.

Belgium

InfoType | Description
BELGIUM_NATIONAL_ID_CARD_NUMBER | A 12-digit Belgian national identity card number.

Brazil

InfoType | Description
BRAZIL_CPF_NUMBER | The Brazilian Cadastro de Pessoas Físicas (CPF) number, or Natural Persons Register number, is an 11-digit number used in Brazil for taxpayer identification.

Canada

InfoType | Description
CANADA_BANK_ACCOUNT | A Canadian bank account number.
CANADA_BC_PHN | The British Columbia Personal Health Number (PHN) is issued to citizens, permanent residents, temporary workers, students, and other individuals who are entitled to health care coverage in the Province of British Columbia.
CANADA_DRIVERS_LICENSE_NUMBER | A driver’s license number for each of the ten provinces in Canada.
CANADA_OHIP | The Ontario Health Insurance Plan (OHIP) number is issued to citizens, permanent residents, temporary workers, students, and other individuals who are entitled to health care coverage in the Province of Ontario.
CANADA_PASSPORT | A Canadian passport number.
CANADA_QUEBEC_HIN | The Québec Health Insurance Number (HIN) is issued to citizens, permanent residents, temporary workers, students, and other individuals who are entitled to health care coverage in the Province of Québec.
CANADA_SOCIAL_INSURANCE_NUMBER | The Canadian Social Insurance Number (SIN) is the main identifier used in Canada for citizens, permanent residents, and people on work or study visas. With a Canadian SIN and mailing address, one can apply for health care coverage, driver’s licenses, and other important services.

Chile

InfoType | Description
CHILE_CDI_NUMBER | A Chilean Cédula de Identidad (CDI), or identity card, is used as the main identity document for citizens.

China

InfoType | Description
CHINA_RESIDENT_ID_NUMBER | A Chinese resident identification number.
CHINA_PASSPORT | A Chinese passport number.

Colombia

InfoType | Description
COLOMBIA_CDC_NUMBER | A Colombian Cédula de Ciudadanía (CDC), or citizenship card, is used as the main identity document for citizens.

Denmark

InfoType | Description
DENMARK_CPR_NUMBER | A Personal Identification Number (CPR, Det Centrale Personregister) is a national ID number in Denmark. It is used with public agencies such as health care and tax authorities. Banks and insurance companies also use it as a customer number. The CPR number is required for people who reside in Denmark, pay tax or own property there.

France

InfoType | Description
FRANCE_CNI | The French Carte Nationale d’Identité Sécurisée (CNI or CNIS) is the French national identity card. It’s an official identity document consisting of a 12-digit identification number. This number is commonly used when opening bank accounts and when paying by check. It can sometimes be used instead of a passport or visa within the European Union (EU) and in some other countries.
FRANCE_NIR | The French Numéro d’Inscription au Répertoire (NIR) is a permanent personal identification number that’s also known as the French social security number for services including healthcare and pensions.
FRANCE_PASSPORT | A French passport number.
FRANCE_TAX_IDENTIFICATION_NUMBER | The French tax identification number is a government-issued ID for all individuals paying taxes in France.

Finland

InfoType | Description
FINLAND_NATIONAL_ID_NUMBER | A Finnish personal identity code, a national government identification number for Finnish citizens used on identity cards, driver’s licenses and passports.

Germany

InfoType | Description
GERMANY_DRIVERS_LICENSE_NUMBER | A German driver’s license number.
GERMANY_IDENTITY_CARD_NUMBER | The German Personalausweis, or identity card, is used as the main identity document for citizens of Germany.
GERMANY_PASSPORT | A German passport number. The format of a German passport number is 10 alphanumeric characters, chosen from numerals 0–9 and letters C, F, G, H, J, K, L, M, N, P, R, T, V, W, X, Y, Z.
GERMANY_TAXPAYER_IDENTIFICATION_NUMBER | An 11-digit German taxpayer identification number assigned to both natural-born and other legal residents of Germany for the purposes of recording tax payments.
GERMANY_SCHUFA_ID | A German Schufa identification number. Schufa Holding AG is a German credit bureau whose aim is to protect clients from credit risk.

Hong Kong

InfoType | Description
HONG_KONG_ID_NUMBER | The 香港身份證, or Hong Kong identity card (HKIC), is used as the main identity document for citizens of Hong Kong.

India

InfoType | Description
INDIA_AADHAAR_INDIVIDUAL | The Indian Aadhaar number is a 12-digit unique identity number obtained by residents of India, based on their biometric and demographic data.
INDIA_GST_INDIVIDUAL | The Indian GST identification number (GSTIN) is a unique identifier required of every business in India for taxation.
INDIA_PAN_INDIVIDUAL | The Indian Personal Permanent Account Number (PAN) is a unique 10-digit alphanumeric identifier used for identification of individuals, particularly people who pay income tax. It’s issued by the Indian Income Tax Department. The PAN is valid for the lifetime of the holder.

Indonesia

InfoType | Description
INDONESIA_NIK_NUMBER | An Indonesian Single Identity Number (Nomor Induk Kependudukan, or NIK) is the national identification number of Indonesia. The NIK is used as the basis for issuing Indonesian resident identity cards (Kartu Tanda Penduduk, or KTP), passports, driver’s licenses and other identity documents.

Italy

InfoType | Description
ITALY_FISCAL_CODE | An Italy fiscal code number is a unique 16-digit code assigned to Italian citizens as a form of identification.

Japan

InfoType | Description
JAPAN_BANK_ACCOUNT | A Japanese bank account number.
JAPAN_DRIVERS_LICENSE_NUMBER | A Japanese driver’s license number.
JAPAN_INDIVIDUAL_NUMBER | The Japanese national identification number, sometimes referred to as “My Number,” is a new national ID number as of January 2016.
JAPAN_PASSPORT | A Japanese passport number. The passport number consists of two alphabetic characters followed by seven digits.

Korea

InfoType | Description
KOREA_PASSPORT | A Korean passport number.
KOREA_RRN | A South Korean Social Security number.

Mexico

InfoType | Description
MEXICO_CURP_NUMBER | The Mexico Clave Única de Registro de Población (CURP) number, or Unique Population Registry Code or Personal Identification Code number. The CURP number is an 18-character state-issued identification number assigned by the Mexican government to citizens or residents of Mexico and used for taxpayer identification.
MEXICO_PASSPORT | A Mexican passport number.

The Netherlands

InfoType | Description
NETHERLANDS_BSN_NUMBER | A Dutch Burgerservicenummer (BSN), or Citizen’s Service Number, is a state-issued identification number that’s on driver’s licenses, passports, and international ID cards.
NETHERLANDS_PASSPORT | A Dutch passport number.

Norway

InfoType | Description
NORWAY_NI_NUMBER | Norway’s Fødselsnummer, National Identification Number, or Birth Number is assigned at birth, or on migration into the country. It is registered with the Norwegian Tax Office.

Paraguay

InfoType | Description
PARAGUAY_CIC_NUMBER | A Paraguayan Cédula de Identidad Civil (CIC), or civil identity card, is used as the main identity document for citizens.

Peru

InfoType | Description
PERU_DNI_NUMBER | A Peruvian Documento Nacional de Identidad (DNI), or national identity card, is used as the main identity document for citizens.

Poland

InfoType | Description
POLAND_PESEL_NUMBER | The PESEL number is the national identification number used in Poland. It is mandatory for all permanent residents of Poland, and for temporary residents staying there longer than 2 months. It is assigned to just one person and cannot be changed.
POLAND_NATIONAL_ID_NUMBER | The Polish identity card number is a government identification number for Polish citizens. Every citizen older than 18 years must have an identity card. The local Office of Civic Affairs issues the card, and each card has its own unique number.
POLAND_PASSPORT | A Polish passport number. A Polish passport is an international travel document for Polish citizens. It can also be used as proof of Polish citizenship.

Portugal

InfoType | Description
PORTUGAL_CDC_NUMBER | A Portuguese Cartão de cidadão (CDC), or Citizen Card, is used as the main identity, Social Security, health services, taxpayer, and voter document for citizens.

Singapore

InfoType | Description
SINGAPORE_NATIONAL_REGISTRATION_ID_NUMBER | A unique set of nine alpha-numeric characters on the Singapore National Registration Identity Card.
SINGAPORE_PASSPORT | A Singaporean passport number.

Spain

InfoType | Description
SPAIN_CIF_NUMBER | The Spanish Código de Identificación Fiscal (CIF) was the tax identification system used in Spain for legal entities until 2008. It was then replaced by the Número de Identificación Fiscal (NIF) for natural and juridical persons.
SPAIN_DNI_NUMBER | A Spain national identity number.
SPAIN_DRIVERS_LICENSE_NUMBER | A Spanish driver’s license number.
SPAIN_NIE_NUMBER | The Spanish Número de Identificación de Extranjeros (NIE) is an identification number for foreigners living or doing business in Spain. An NIE number is needed for key transactions such as opening a bank account, buying a car, or setting up a mobile phone contract.
SPAIN_NIF_NUMBER | The Spanish Número de Identificación Fiscal (NIF) is a government identification number for Spanish citizens. An NIF number is needed for key transactions such as opening a bank account, buying a car, or setting up a mobile phone contract.
SPAIN_PASSPORT | A Spanish Ordinary Passport (Pasaporte Ordinario) number. There are 4 different types of passports in Spain. This detector is for the Ordinary Passport (Pasaporte Ordinario) type, which is issued for ordinary travel, such as vacations and business trips.
SPAIN_SOCIAL_SECURITY_NUMBER | The Spanish Social Security number (Número de Afiliación a la Seguridad Social) is a 10-digit sequence that identifies a person in Spain for all interactions with the country’s Social Security system.

Sweden

InfoType | Description
SWEDEN_NATIONAL_ID_NUMBER | A Swedish Personal Identity Number (personnummer), a national government identification number for Swedish citizens.
SWEDEN_PASSPORT | A Swedish passport number.

Taiwan

InfoType | Description
TAIWAN_PASSPORT | A Taiwanese passport number.

Thailand

InfoType | Description
THAILAND_NATIONAL_ID_NUMBER | The Thai บัตรประจำตัวประชาชนไทย, or identity card, is used as the main identity document for Thai nationals.

Turkey

InfoType | Description
TURKEY_ID_NUMBER | A unique Turkish personal identification number, assigned to every citizen of Turkey.

United Kingdom

InfoType | Description
SCOTLAND_COMMUNITY_HEALTH_INDEX_NUMBER | The Scotland Community Health Index Number (CHI number) is a 10-digit sequence used to uniquely identify a patient within National Health Service Scotland (NHS Scotland).
UK_DRIVERS_LICENSE_NUMBER | A driver’s license number for the United Kingdom of Great Britain and Northern Ireland (UK).
UK_NATIONAL_HEALTH_SERVICE_NUMBER | A National Health Service (NHS) number is the unique number allocated to a registered user of the three public health services in England, Wales, and the Isle of Man.
UK_NATIONAL_INSURANCE_NUMBER | The National Insurance number (NINO) is a number used in the United Kingdom (UK) in the administration of the National Insurance or social security system. It identifies people, and is also used for some purposes in the UK tax system. The number is sometimes referred to as NI No or NINO.
UK_PASSPORT | A United Kingdom (UK) passport number.
UK_TAXPAYER_REFERENCE | A United Kingdom (UK) Unique Taxpayer Reference (UTR) number. This number, comprised of a string of 10 decimal digits, is an identifier used by the UK government to manage the taxation system. Unlike other identifiers, such as the passport number or social insurance number, the UTR is not listed on official identity cards.

Uruguay

InfoType | Description
URUGUAY_CDI_NUMBER | A Uruguayan Cédula de Identidad (CDI), or identity card, is used as the main identity document for citizens.

Venezuela

InfoType | Description
VENEZUELA_CDI_NUMBER | A Venezuelan Cédula de Identidad (CDI), or national identity card, is used as the main identity document for citizens.

Data sources that do not support continuous scans

  • Azure

  • BambooHR

  • Brightspot

  • Coda

  • Docebo

  • Fifteen Five

  • Figma

  • Freshservice

  • Gmail

  • Gong

  • Highspot

  • Lattice

  • Looker

  • LumApps

  • Monday.com

  • Notion

  • Okta

  • Pingboard

  • People Data API

  • Push API for Content

  • Slack Enterprise

  • Salesforce

  • ServiceNow

  • Simpplr

  • SmartSheet

  • Stack Overflow

  • Tableau

  • Web pages (Internet/Intranet)

  • Yammer