Overview

Glean provides our customers the ability to deploy Glean software inside their own Google Cloud Platform (GCP) project. This deployment requires your GCP admin to:

  1. Create a new GCP project.
  2. Associate a valid billing account.
  3. Enable applicable GCP APIs.
  4. Request the required quota increases from GCP.
  5. Create a Service Account with Project Owner role and associate a JSON account key.
  6. Notify Glean of the GCP zone selected, the Project Name, Project ID, Project Number, and the service account JSON key.

After you complete the steps above, Glean’s systems will automatically build and deploy the required compute, workflows, and software into your GCP project.

At that point, Glean will advise you that your tenant is ready, and your admins can proceed with the setup process in our Getting Started guide.

This document covers the steps your GCP admins must complete to prepare a GCP project for your Glean build.


1. Select a GCP Region

You must first select a supported GCP region for Glean to build your environment in.

You must notify Glean of the specific GCP zone selected within that region, e.g. asia-northeast1-a.

The region selected cannot be changed once your tenant has been built. Changing the region requires a complete rebuild of your tenant.
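
Because Glean asks for a zone while quotas are requested per region, it can help to derive the region from the zone name so the two stay consistent. The following is a minimal sketch, assuming zone names follow the shape of the examples in this document (a region name followed by a single-letter suffix). The function name region_of is hypothetical and not part of any Glean or GCP tooling.

```python
import re

# Assumed zone shape, based on examples such as "asia-northeast1-a":
# lowercase words joined by hyphens, a number, then "-<letter>".
_ZONE_RE = re.compile(r"^(?P<region>[a-z]+(?:-[a-z]+)*\d+)-(?P<suffix>[a-z])$")


def region_of(zone: str) -> str:
    """Return the region part of a zone name, e.g. 'us-central1-a' -> 'us-central1'."""
    match = _ZONE_RE.match(zone)
    if match is None:
        raise ValueError(f"not a zone name: {zone!r}")
    return match.group("region")


if __name__ == "__main__":
    print(region_of("asia-northeast1-a"))
```

Passing a bare region such as us-central1 raises ValueError, which is a quick way to catch a region being sent where a zone is expected.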

2. Create the GCP Project

  1. Go to the Manage resources page in the GCP console and click Create Project.

  2. In the New Project window that appears, add a project name, organization, and location.

    • For the project name, the preferred format is glean-{customer name} or glean-{customer name}-{prod/sandbox}
    • E.g. glean-company or glean-company-prod
  3. Make sure that your project is created under the same organization as your Google Workspace account, and not “No Organization”.

Glean is not able to proceed with the build if the project is created under “No Organization”. If you are unsure of how to resolve this, please contact your GCP account team or GCP support.

  4. Save the Project ID (which is directly below the Project name) and Project Number.

  5. Click Create.

  6. Notify Glean of the following information:

    a. Project name, e.g. glean-company → This was set in Step 2 above.

    b. Project ID, e.g. glean-company → This was saved in Step 4 above.

    c. Project number, e.g. 715000000000 → This was saved in Step 4 above.

    d. Region and Zone where you want to deploy Glean, e.g. us-central1-a
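
The naming convention and the details to send to Glean can be sketched as a small helper. This is an illustration only: the function names and dictionary keys are hypothetical, and the restriction to the prod and sandbox suffixes is an assumption drawn from the preferred format described above.

```python
from typing import Dict, Optional


def glean_project_name(customer: str, environment: Optional[str] = None) -> str:
    """Build a name in the preferred format glean-{customer} or glean-{customer}-{env}."""
    # Assumption: only the two suffixes named in this guide are allowed.
    if environment is not None and environment not in ("prod", "sandbox"):
        raise ValueError("environment must be 'prod' or 'sandbox'")
    parts = ["glean", customer.strip()]
    if environment is not None:
        parts.append(environment)
    return "-".join(parts)


def project_details(name: str, project_id: str, project_number: str, zone: str) -> Dict[str, str]:
    """Collect the four values this guide asks you to send to Glean (key names are hypothetical)."""
    return {
        "project_name": name,
        "project_id": project_id,
        "project_number": project_number,
        "zone": zone,
    }


if __name__ == "__main__":
    name = glean_project_name("company", "prod")
    print(project_details(name, "glean-company", "715000000000", "us-central1-a"))
```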

3. Configure Billing

  1. Go to Billing in the GCP console.

  2. Click Link a billing account to set up billing for this project.

Ensure that the billing account has a corporate credit card attached to it. Using the “free trial billing tier” will not work.

4. Enable APIs

Glean requires the following GCP APIs to be enabled for the deployment to succeed. To enable each API on the project, open its URL below, replacing [PROJECT_ID] at the end of the URL with your Project ID.

API Name | URL
Cloud Resource Manager API (cloudresourcemanager.googleapis.com) | https://console.cloud.google.com/apis/api/cloudresourcemanager.googleapis.com/overview?project=[PROJECT_ID]
Service Usage API (serviceusage.googleapis.com) | https://console.developers.google.com/apis/api/serviceusage.googleapis.com/overview?project=[PROJECT_ID]
Compute Engine API (compute.googleapis.com) | https://console.developers.google.com/apis/api/compute.googleapis.com/overview?project=[PROJECT_ID]
Cloud SQL Admin API (sqladmin.googleapis.com) | https://console.developers.google.com/apis/api/sqladmin.googleapis.com/overview?project=[PROJECT_ID]
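
The URL substitution can be scripted to avoid copy-and-paste mistakes. The sketch below fills the Project ID into the four URLs from the table. The URL templates are taken from this guide, while the names API_OVERVIEW_URLS and api_enable_urls are hypothetical.

```python
from typing import Dict

# URL templates copied from the table in this guide; {project_id} stands in
# for the [PROJECT_ID] placeholder.
API_OVERVIEW_URLS: Dict[str, str] = {
    "cloudresourcemanager.googleapis.com":
        "https://console.cloud.google.com/apis/api/cloudresourcemanager.googleapis.com/overview?project={project_id}",
    "serviceusage.googleapis.com":
        "https://console.developers.google.com/apis/api/serviceusage.googleapis.com/overview?project={project_id}",
    "compute.googleapis.com":
        "https://console.developers.google.com/apis/api/compute.googleapis.com/overview?project={project_id}",
    "sqladmin.googleapis.com":
        "https://console.developers.google.com/apis/api/sqladmin.googleapis.com/overview?project={project_id}",
}


def api_enable_urls(project_id: str) -> Dict[str, str]:
    """Return the console URL for each required API with the Project ID filled in."""
    return {
        api: template.format(project_id=project_id)
        for api, template in API_OVERVIEW_URLS.items()
    }


if __name__ == "__main__":
    for api, url in api_enable_urls("glean-company").items():
        print(api, url)
```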

5. Request Quota Changes

Search for [Quotas] in the search box of the GCP Console and navigate to All Quotas, under IAM & Admin.

For each of the quotas in the table below, request a quota change by completing the following:

  1. Click on the required quota.
  2. Select Edit Quotas.
  3. Enter the value specified by Glean for the quota.
  4. Click Submit Request.

Please note that some quota requests will require filing a ticket with GCP support. Response time is typically within 2 days.

You must ensure that the region/location specified in each quota request matches the GCP Region and Zone that you wish to deploy in. For more information, see Supported GCP Regions.

Quota Type | Service | Metric | Location | New Value | Justification
All Quotas | Compute Engine API | CPUs | us-central1 (or primary deployment region) | 110 | The Glean search system deploys 20+ instances of crawler services on nodes, as well as multiple nodes of the Elastic index service in Kubernetes cluster and so this quota is needed. Without this quota the system cannot be deployed to the project. We generally run on less than 50% of this quota, and go beyond 50% during Elastic index rolling deployments.
All Quotas | Compute Engine API | N2 CPUs | us-central1 (or primary deployment region) | 110 | The Glean search system deploys 20+ instances of crawler services on nodes, as well as multiple nodes of the Elastic index service in Kubernetes cluster and so this quota is needed. Without this quota the system cannot be deployed to the project. We generally run on less than 50% of this quota, and go beyond 50% during Elastic index rolling deployments.
All Quotas | Compute Engine API | N2D CPUs | us-central1 (or primary deployment region) | 110 | The Glean search system deploys 20+ instances of crawler services on nodes, as well as multiple nodes of the Elastic index service in Kubernetes cluster and so this quota is needed. Without this quota the system cannot be deployed to the project. We generally run on less than 50% of this quota, and go beyond 50% during Elastic index rolling deployments.
All Quotas | Compute Engine API | T2D CPUs | us-central1 (or primary deployment region) | 128 | The Glean search system runs batch Dataflow pipelines to generate training data, compute statistics, and perform model inference. Without this quota, these pipelines cannot efficiently run.
All Quotas | Compute Engine API | VM Instances | us-central1 (or primary deployment region) | 240 | The Glean search system deploys Dataflow jobs and Kubernetes cluster, which create VM instances when jobs are launched.
All Quotas | Compute Engine API | NVIDIA T4 GPUs | us-central1 (or primary deployment region) | 4 | The Glean search system runs batch Dataflow pipelines to generate training data, compute statistics, and perform model inference. Without this quota, these pipelines cannot efficiently run.
All Quotas | Vertex AI API | Custom model training TPU V2 Cores | us-central1 (or primary deployment region) | 8 | The Glean search system trains a custom AI language model on the corpus, enabling features such as semantic search, synonyms, and more. We use these TPU accelerators to power the training.
All Quotas | Vertex AI API | Custom model training Nvidia V100 GPUs per region | us-central1 (or primary deployment region) | 8 | The Glean search system trains a custom AI language model on the corpus, enabling features such as semantic search, synonyms, and more. When there are no TPUs available, we use these GPU accelerators to power the training.
All Quotas | Vertex AI API | Custom model training Nvidia T4 GPUs per region | us-central1 (or primary deployment region) | 4 | The Glean search system trains a custom AI language model on the corpus, enabling features such as semantic search, synonyms, and more. When there are no TPUs or V100’s available, we use these GPU accelerators to power the training.
All Quotas | Compute Engine API | Persistent Disk Standard | us-central1 (or primary deployment region) | 10TB | The Glean search system stores millions of enterprise documents in Cloud SQL and in a search index with persistent storage. Due to the number and size of documents stored we need the quota to be increased.
All Quotas | Compute Engine API | In-use IP addresses | us-central1 (or primary deployment region) | 20 | The Glean search system deploys 20-25 flex instances of crawler services on Kubernetes Engine, and each flex instance requires its own IP address.
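
To track which requests are still outstanding, the required values can be kept in a small lookup and compared against the limits currently shown in the console. This is a minimal sketch: the required values come from the table above, but the names REQUIRED_QUOTAS and quotas_to_request are hypothetical, and the current limits are assumed to be entered by hand in the same units as the table (Persistent Disk Standard in TB).

```python
from typing import Dict, Tuple

QuotaKey = Tuple[str, str]  # (service, metric)

# Required values from the quota table in this guide.
REQUIRED_QUOTAS: Dict[QuotaKey, int] = {
    ("Compute Engine API", "CPUs"): 110,
    ("Compute Engine API", "N2 CPUs"): 110,
    ("Compute Engine API", "N2D CPUs"): 110,
    ("Compute Engine API", "T2D CPUs"): 128,
    ("Compute Engine API", "VM Instances"): 240,
    ("Compute Engine API", "NVIDIA T4 GPUs"): 4,
    ("Vertex AI API", "Custom model training TPU V2 Cores"): 8,
    ("Vertex AI API", "Custom model training Nvidia V100 GPUs per region"): 8,
    ("Vertex AI API", "Custom model training Nvidia T4 GPUs per region"): 4,
    ("Compute Engine API", "Persistent Disk Standard"): 10,  # in TB
    ("Compute Engine API", "In-use IP addresses"): 20,
}


def quotas_to_request(current_limits: Dict[QuotaKey, int]) -> Dict[QuotaKey, int]:
    """Return the quotas whose current limit is below the required value.

    A quota missing from current_limits is treated as 0 and therefore reported.
    """
    return {
        key: required
        for key, required in REQUIRED_QUOTAS.items()
        if current_limits.get(key, 0) < required
    }


if __name__ == "__main__":
    # Hypothetical current limits, for illustration only.
    current = {("Compute Engine API", "CPUs"): 24}
    for (service, metric), required in quotas_to_request(current).items():
        print(f"{service} / {metric}: request {required}")
```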

6. Create a Service Account

The service account is used to allow Glean’s systems to access the project and perform the build. You will create the service account and provide Glean with the private JSON key required to use it.

  1. Go to the Service Accounts page in the GCP console and click Select a Project.

  2. Click Create Service Account. Enter the service account name (glean-admin), ID, and description (optional), then click Create.

  3. Click the Select a role dropdown to make your service account an Owner of the project. Click Continue.

  4. Ignore the Grant users access to this service account option. It is not required.

  5. Click Create Key. In the panel that appears, select the key type JSON, then Create. This will save a private JSON key to your computer.
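
Before sending the key to Glean, you may want to confirm locally that the downloaded file is a service account key for the right project. The sketch below is an optional pre-check, not Glean's own validation. It assumes the standard fields of a GCP service account key file (type, project_id, private_key, client_email), and the function name check_service_account_key is hypothetical.

```python
import json
from typing import List

# Fields assumed to be present in a GCP service account JSON key.
REQUIRED_KEY_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(key_text: str, expected_project_id: str) -> List[str]:
    """Return a list of problems found in the key text; an empty list means none found."""
    try:
        key = json.loads(key_text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(key, dict):
        return ["key file does not contain a JSON object"]

    problems = [f"missing field: {name}" for name in REQUIRED_KEY_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected key type: {key['type']}")
    if key.get("project_id") and key["project_id"] != expected_project_id:
        problems.append(f"key belongs to project {key['project_id']}, expected {expected_project_id}")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration; never print or commit a real private key.
    sample = json.dumps({
        "type": "service_account",
        "project_id": "glean-company",
        "private_key": "PLACEHOLDER",
        "client_email": "glean-admin@glean-company.iam.gserviceaccount.com",
    })
    print(check_service_account_key(sample, "glean-company"))
```

Treat the key file as a secret: the check reads it locally and should not log its contents.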

7. Upload the Service Account Key to the Glean Admin UI

  1. If you haven’t already, follow the instructions from the Access the Admin UI section of the Getting Started guide.

  2. On the page titled Create a Google Cloud Platform project, click the box under Step 2 to upload the private JSON key to Glean.

  3. Click Save. Glean will now use the JSON key to validate that all the steps above have been performed correctly.

If the save fails, you will be presented with a red error message detailing the issues to correct. The key must be saved correctly before the build of your Glean tenant can proceed.

Troubleshooting

For Error Codes and troubleshooting steps, please see the Troubleshooting section.

FAQ