Monitoring and debugging access in customer-hosted deployments

In customer-hosted deployments, customers often want to understand exactly what access Glean needs for monitoring, incident investigation, and operational support.

Glean's access model is centered on two areas:

  • Operational log visibility for debugging and production support
  • Read-only IAM access: on GCP, read-only roles granted to the alert-monitoring@glean.com alias used by Glean's on-call team; on AWS, the glean-viewer IAM role

Important

For operational supportability, customers should ensure that the agreed debugging access path remains available for incident response and production support.

Glean may be unable to investigate issues if a customer removes or blocks Glean's access to the deployment entirely.

System logs

The following system logs are available in a customer-hosted deployment:

| Log type | Retention | How to access | Notes |
| --- | --- | --- | --- |
| Non-PII logs | 400 days | Viewable by Glean employees for debugging purposes | Available in the Stackdriver or Cloud Logging console for GCP and CloudWatch Log Groups for AWS |
| PII logs | 30 days | Restricted to the AWS/GCP project admins | Stored in the glean_sensitive_logs_bigquery and audit_logs BigQuery tables for GCP. For AWS, stored in similarly named CloudWatch Log Groups and S3 buckets. |

PII in this context includes information such as employee email addresses or permission group names. Glean doesn't log the content stored in the document body.

Note

In rare debugging scenarios, Glean employees can look up specific log entries using dedicated debugging APIs. All such access is audit-logged, requires justification, and must be authorized by a small set of Glean engineering leaders.

User activity logs

User activity logs are available for searches and actions performed by a customer's employees in Glean.

| Log location | Retention | Contents | How to access |
| --- | --- | --- | --- |
| scio-<projectid>-query-endpoint-access bucket | 270 days | Logs for all search queries, including user identity and query | Not accessible to Glean employees unless the customer has allowed access for debugging |
| scio-<projectid>-search-query, scio-<projectid>-search-result, scio-<projectid>-search-result-feedback buckets | 270 days | Queries, returned results, clicks, and views | Primarily used by ranking pipelines to improve search |

For AWS deployments, replace <projectid> in the log locations above with the AWS account ID.
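
As a small illustration of the naming pattern above, the following Python sketch expands the bucket names for a given deployment. The function name `activity_log_buckets` and the example identifier are hypothetical; only the `scio-<projectid>-...` pattern and the four suffixes come from the table.

```python
# Suffixes taken from the user activity log table above.
ACTIVITY_LOG_SUFFIXES = (
    "query-endpoint-access",
    "search-query",
    "search-result",
    "search-result-feedback",
)


def activity_log_buckets(deployment_id: str) -> dict[str, str]:
    """Map each log suffix to its bucket name.

    deployment_id is the GCP project ID, or the AWS account ID
    for AWS deployments.
    """
    return {
        suffix: f"scio-{deployment_id}-{suffix}"
        for suffix in ACTIVITY_LOG_SUFFIXES
    }


if __name__ == "__main__":
    # "example-project" is a placeholder identifier.
    for name in activity_log_buckets("example-project").values():
        print(name)
```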

Error reporting

Error reporting counts, analyzes, and aggregates crashes in the running cloud services. The stack traces from these crashes are visible to Glean employees and are used to help diagnose and fix production issues.

GCP: IAM roles for alert-monitoring@glean.com

Glean provides a fully managed SaaS experience even when the deployment is hosted in a GCP project owned by the customer. To monitor and manage the system efficiently, Glean requests that alert-monitoring@glean.com be granted specific read-only IAM permissions.

If a customer doesn't allow standard access to alert-monitoring@glean.com, the recommended approach is to create a customer-managed principal and apply the same bindings to that principal. This preserves the intended read-only debugging model while allowing the customer to manage approval and access workflows within their own environment.

These permissions don't provide access to customer data stored in Cloud SQL, Kubernetes, or Cloud Storage. They also don't provide access to PII logs, which aren't sent to the Stackdriver Logging console.

Custom roles

| Role | Expanded permissions | Purpose |
| --- | --- | --- |
| roles/glean_cost_reader_v1 | billing.resourceCosts.get | Allows the on-call team to access billing information for the project and monitor costs associated with cloud services and usage |
| roles/glean_dataflow_oncall_v1 | dataflow.jobs.list, dataflow.metrics.get, dataflow.jobs.get, dataflow.messages.list | Allows the on-call team to view Dataflow jobs, monitor job status, access performance metrics, and review job-related messages |
| roles/glean_pubsub_reader_v1 | pubsub.schemas.get, pubsub.schemas.list, pubsub.subscriptions.get, pubsub.subscriptions.list, pubsub.topics.get, pubsub.topics.list, resourcemanager.projects.get, serviceusage.quotas.get, serviceusage.services.get, serviceusage.services.list | Grants read-only visibility into Pub/Sub topics, subscriptions, schemas, quota state, and related service metadata |
| roles/glean_sql_oncall_v1 | cloudsql.instances.get, cloudsql.instances.list | Allows the on-call team to inspect Cloud SQL instance status and configuration |
| Custom Cloud Trace read-only role | cloudtrace.insights.get, cloudtrace.insights.list, cloudtrace.stats.get, cloudtrace.tasks.get, cloudtrace.tasks.list, cloudtrace.traces.get, cloudtrace.traces.list | Enables trace inspection, performance analysis, and debugging using Cloud Trace |
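
A customer reviewing these custom roles may want to confirm that every permission is a read operation. The Python sketch below is one way to do that, assuming that a permission counts as read-only when its final segment is `get` or `list`. The helper name `non_read_only` and the `cloud_trace_read_only` key are hypothetical (the table gives no role ID for the Cloud Trace role); the permission lists are copied from the table.

```python
# Permissions copied from the custom roles table above.
CUSTOM_ROLES = {
    "glean_cost_reader_v1": ["billing.resourceCosts.get"],
    "glean_dataflow_oncall_v1": [
        "dataflow.jobs.list", "dataflow.metrics.get",
        "dataflow.jobs.get", "dataflow.messages.list",
    ],
    "glean_pubsub_reader_v1": [
        "pubsub.schemas.get", "pubsub.schemas.list",
        "pubsub.subscriptions.get", "pubsub.subscriptions.list",
        "pubsub.topics.get", "pubsub.topics.list",
        "resourcemanager.projects.get", "serviceusage.quotas.get",
        "serviceusage.services.get", "serviceusage.services.list",
    ],
    "glean_sql_oncall_v1": [
        "cloudsql.instances.get", "cloudsql.instances.list",
    ],
    # Placeholder key: the table does not name this role's ID.
    "cloud_trace_read_only": [
        "cloudtrace.insights.get", "cloudtrace.insights.list",
        "cloudtrace.stats.get", "cloudtrace.tasks.get",
        "cloudtrace.tasks.list", "cloudtrace.traces.get",
        "cloudtrace.traces.list",
    ],
}

# Assumption: only these trailing verbs are treated as read operations.
READ_ONLY_VERBS = {"get", "list"}


def non_read_only(permissions: list[str]) -> list[str]:
    """Return the permissions whose final segment is not a read verb."""
    return [
        p for p in permissions
        if p.rsplit(".", 1)[-1] not in READ_ONLY_VERBS
    ]


if __name__ == "__main__":
    for role, permissions in CUSTOM_ROLES.items():
        flagged = non_read_only(permissions)
        print(role, "OK" if not flagged else f"review: {flagged}")
```

The verb check is a heuristic for a quick review, not a substitute for reading the permission descriptions in the Google Cloud IAM reference.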

Predefined roles

| Role | Purpose |
| --- | --- |
| roles/aiplatform.viewer | Read-only access to AI Platform resources for monitoring models and training jobs |
| roles/cloudbuild.builds.viewer | View Cloud Build logs for debugging deployment errors |
| roles/cloudfunctions.viewer | View Cloud Functions configuration and metadata |
| roles/cloudscheduler.viewer | View scheduled job configuration and status |
| roles/cloudtasks.viewer | View Cloud Tasks status and configuration |
| roles/compute.viewer | View VM instance status and configuration in Compute Engine |
| roles/container.viewer | Monitor Kubernetes Engine and other container resources |
| roles/errorreporting.viewer | View application errors reported in Error Reporting |
| roles/logging.viewer | View non-PII logs in the Stackdriver Logging console |
| roles/ml.viewer | Monitor machine learning resources on AI Platform, including models and job status |
| roles/monitoring.viewer | View monitoring data, dashboards, and alerts in Google Cloud Monitoring |
| roles/run.viewer | View Google Cloud Run service configuration and status |
| roles/servicehealth.viewer | Monitor the health and status of Google Cloud services |
| roles/workflows.viewer | View Google Cloud Workflows configuration and execution history |
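
Whether the bindings go to alert-monitoring@glean.com or to a customer-managed principal, the same role list applies. The Python sketch below generates one `gcloud projects add-iam-policy-binding` command per predefined role for a principal supplied by the caller. The function name, the project ID, and the example member are hypothetical. This document does not say whether the alias is bound with a `user:` or `group:` prefix, so the full member string is passed in unchanged.

```python
# Predefined roles copied from the table above.
PREDEFINED_ROLES = (
    "roles/aiplatform.viewer",
    "roles/cloudbuild.builds.viewer",
    "roles/cloudfunctions.viewer",
    "roles/cloudscheduler.viewer",
    "roles/cloudtasks.viewer",
    "roles/compute.viewer",
    "roles/container.viewer",
    "roles/errorreporting.viewer",
    "roles/logging.viewer",
    "roles/ml.viewer",
    "roles/monitoring.viewer",
    "roles/run.viewer",
    "roles/servicehealth.viewer",
    "roles/workflows.viewer",
)


def binding_commands(project_id: str, member: str) -> list[str]:
    """Build one gcloud command per predefined role for the given member.

    member must already include its IAM prefix, for example
    "group:..." or "user:...".
    """
    return [
        f"gcloud projects add-iam-policy-binding {project_id} "
        f"--member={member} --role={role}"
        for role in PREDEFINED_ROLES
    ]


if __name__ == "__main__":
    # Placeholder project and customer-managed group.
    for cmd in binding_commands("example-project",
                                "group:glean-debug@customer.example"):
        print(cmd)
```

Printing the commands rather than running them leaves room for review before anything is applied. The custom roles would need separate bindings once they exist in the project.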

Privileged Access Manager (PAM)

For customers who want time-bound elevated access, Google Cloud's Privileged Access Manager (PAM) provides a more controlled just-in-time model than temporary group membership alone.

PAM uses entitlements and grants to let approved Glean engineers request short-lived access to specific roles, with built-in justification, optional customer approval workflows, notifications, audit logs, and automatic revocation at the end of the grant window. Customers can still use Google Groups to control who may request or approve access, while PAM ensures elevated roles are granted only to the individual requester for a limited duration.

Common best practices include:

  • Use narrowly scoped roles
  • Set short TTLs
  • Create a separate break-glass entitlement for emergency cases
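
The best practices above can be expressed as a simple review check. The Python sketch below is a conceptual model only: it does not call the PAM API. The `Entitlement` class, the four-hour TTL threshold, and the choice of `roles/owner` and `roles/editor` as "too broad" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta

# Assumption: basic roles are treated as too broad for an entitlement.
BROAD_ROLES = {"roles/owner", "roles/editor"}
# Assumption: an illustrative upper bound for a "short" TTL.
MAX_TTL = timedelta(hours=4)


@dataclass(frozen=True)
class Entitlement:
    """Hypothetical stand-in for a PAM entitlement under review."""
    name: str
    roles: tuple[str, ...]
    max_duration: timedelta
    break_glass: bool = False


def review(entitlements: list[Entitlement]) -> list[str]:
    """Return findings for entitlements that miss the best practices."""
    findings = []
    for e in entitlements:
        broad = BROAD_ROLES.intersection(e.roles)
        if broad:
            findings.append(f"{e.name}: broad role(s) {sorted(broad)}")
        if e.max_duration > MAX_TTL:
            findings.append(f"{e.name}: TTL exceeds {MAX_TTL}")
    if not any(e.break_glass for e in entitlements):
        findings.append("no separate break-glass entitlement")
    return findings


if __name__ == "__main__":
    proposed = [
        Entitlement("debug-logs", ("roles/logging.viewer",),
                    timedelta(hours=1)),
        Entitlement("emergency", ("roles/container.viewer",),
                    timedelta(hours=2), break_glass=True),
    ]
    print(review(proposed) or "no findings")
```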

AWS: IAM roles for Glean

For AWS deployments, Glean uses its standard glean-viewer role, which provides view-only, non-sensitive access for observing infrastructure configuration. Customers with stricter access controls can keep the same glean-viewer permissions while gating role assumption through their internal approval mechanism.
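
This document does not describe how a customer's approval mechanism is implemented. As one possible sketch, an approval workflow could rewrite the role's trust policy so that assumption is allowed only until an approved window ends, using the `DateLessThan` condition operator with the `aws:CurrentTime` key. The function name and the principal ARN below are hypothetical placeholders, not Glean's actual values.

```python
import json
from datetime import datetime, timedelta, timezone


def time_boxed_trust_policy(principal_arn: str,
                            approved_at: datetime,
                            ttl: timedelta) -> dict:
    """Build a trust policy that permits sts:AssumeRole until approved_at + ttl.

    approved_at must be timezone-aware; the expiry is rendered in UTC.
    """
    expires = (approved_at + ttl).astimezone(timezone.utc)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
                "Condition": {
                    "DateLessThan": {
                        "aws:CurrentTime": expires.strftime(
                            "%Y-%m-%dT%H:%M:%SZ")
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = time_boxed_trust_policy(
        # Placeholder ARN; the real principal is not given in this document.
        "arn:aws:iam::111122223333:role/example-support-role",
        datetime.now(timezone.utc),
        timedelta(hours=2),
    )
    print(json.dumps(policy, indent=2))
```

The trust policy is checked when the role is assumed, so a session that starts inside the window can outlive it, up to the role's maximum session duration. The permissions attached to glean-viewer are unchanged by this approach.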

Anonymized logs sent to Glean's central server

For analytics purposes, Glean sends anonymized non-PII logs from the customer project to Glean's central server.

| Aspect | Details |
| --- | --- |
| Data sent | Anonymized non-PII logs |
| Sanitization | User IDs, document URLs, query terms, and other PII are scrubbed and hashed before export |
| Export mechanism | A GCP log sink exports the anonymized logs from the customer project to a BigQuery table in a locked-down Glean-managed GCP project |
| Purpose | Correlating actions within search sessions and supporting analytics without exposing user, query, or document details |

The logs are anonymized at creation time through a sanitization process in Glean code before they're exported.
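
Glean's actual sanitization code is not described here. The Python sketch below only illustrates the general idea of hashing sensitive fields so that events in a session can still be correlated without exposing the underlying values. The field names, the `sanitize` function, and the use of keyed HMAC-SHA256 are all assumptions.

```python
import hashlib
import hmac

# Assumption: illustrative names for the sensitive fields.
SENSITIVE_FIELDS = ("user_id", "document_url", "query")


def sanitize(entry: dict, key: bytes) -> dict:
    """Return a copy of entry with sensitive fields replaced by keyed hashes.

    The same input and key always produce the same hash, so records can
    be correlated without revealing the original value.
    """
    clean = dict(entry)
    for field in SENSITIVE_FIELDS:
        if field in clean:
            digest = hmac.new(key, str(clean[field]).encode("utf-8"),
                              hashlib.sha256)
            clean[field] = digest.hexdigest()
    return clean


if __name__ == "__main__":
    key = b"example-secret-key"  # placeholder; keep real keys out of code
    event = {"user_id": "alice@example.com",
             "query": "quarterly plan",
             "action": "click"}
    print(sanitize(event, key))
```

A keyed hash is used in this sketch rather than a plain digest because low-entropy values such as email addresses can otherwise be recovered by hashing guesses.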
