This feature is available for both cloud-prem deployments and SaaS customers who need to connect to on-premises or private data sources.
Overview
Glean on GCP can connect to your on-premises or cloud-based data sources using private networking. This allows Glean to securely crawl private services within your network without exposing them to the public internet. Glean uses a transit VPC to minimize IP address conflicts with your network and reduce infrastructure exposure.
Connectivity methods
| Method | Use when | Deployment speed | Tooling maturity |
|---|---|---|---|
| Shared VPC | Your data sources are on GCP and you want centralized network management | Fast | Mature |
| VPC peering | Your data sources are on GCP and you need direct VPC-to-VPC connection | Fast | Mature |
| Site-to-site VPN | Your data sources are on AWS, Azure, or on-premises | Fast | Mature |
| Private Service Connect | You need service-level isolation and fine-grained access control | Slow | Newer |
Shared VPC
Connect Glean to your data sources using GCP’s Shared VPC architecture, where a host project shares VPC network resources with service projects.
How it works
Your Shared VPC host project shares network resources with Glean’s service project. Glean’s resources are deployed in a service project attached to your Shared VPC, allowing direct access to data sources within the shared network.
What you need to provide
- Subnet ID for Glean to use in the shared VPC
  - Format: projects/{project}/regions/{region}/subnetworks/{subnet-name}
  - Example: projects/connectivity-vpn-57503/regions/asia-east1/subnetworks/glean-subnet
- Subnet CIDR range (if you cannot provide subnet viewing permissions)
  - Must not overlap with 10.1.0.0/16
- IAM permissions: Grant the roles/compute.networkUser role to Glean’s deployer service account
  - Service account format: cloud-build@<glean-project>.iam.gserviceaccount.com
  - This allows Glean to attach network interfaces to the shared subnet
- Data source hostnames/IPs that Glean needs to access
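Before you send these values to Glean, it can help to check them locally. The following is a minimal Python sketch, not Glean tooling: the function names are hypothetical, the regular expression assumes only the subnet ID format listed above (any characters except "/" in each segment), and `example-glean-project` is a placeholder for the project name Glean gives you.

```python
import re

# Pattern for the subnet ID format listed above. Assumption: each segment
# may contain any character except "/".
SUBNET_ID_PATTERN = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/regions/(?P<region>[^/]+)"
    r"/subnetworks/(?P<subnet>[^/]+)$"
)


def parse_subnet_id(subnet_id: str) -> dict:
    """Split a subnet ID into its project, region, and subnet name."""
    match = SUBNET_ID_PATTERN.match(subnet_id)
    if match is None:
        raise ValueError(f"unexpected subnet ID format: {subnet_id!r}")
    return match.groupdict()


def deployer_service_account(glean_project: str) -> str:
    """Build the deployer service account email from the format above."""
    return f"cloud-build@{glean_project}.iam.gserviceaccount.com"


if __name__ == "__main__":
    example = (
        "projects/connectivity-vpn-57503/regions/asia-east1"
        "/subnetworks/glean-subnet"
    )
    print(parse_subnet_id(example))
    # "example-glean-project" is a placeholder, not a real Glean project.
    print(deployer_service_account("example-glean-project"))
```

A subnet ID that fails to parse usually means a segment is missing or misspelled, which is easier to catch here than during deployment.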
Configuration notes
- Requires Shared VPC Admin role to attach service projects
- Firewall rules configured at the host project level
- Best performance and lowest latency
- Cost-effective (no gateway fees)
- Most popular choice
- Provides centralized network management and security policies
- Limited to GCP-to-GCP connectivity
VPC peering
Direct network connection between your GCP VPC and Glean’s transit VPC.
How it works
Your VPC peers with Glean’s transit VPC, which peers with Glean’s default VPC. Traffic flows privately through these peering connections for both crawler access and webhook delivery.
What you need to provide
- GCP project ID
- VPC network name (format: projects/{project}/global/networks/{network})
- VPC CIDR ranges (must not overlap with 10.1.0.0/16)
- Data source hostnames/IPs that Glean needs to access
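The overlap requirement can be checked before you request peering. This is a minimal sketch using Python's standard `ipaddress` module; the function name and the sample ranges are hypothetical, and only the 10.1.0.0/16 range comes from this document.

```python
import ipaddress

# Glean's default VPC range, as stated in this document.
GLEAN_DEFAULT_VPC = ipaddress.ip_network("10.1.0.0/16")


def find_conflicting_ranges(vpc_cidrs):
    """Return the CIDR ranges that overlap Glean's default VPC range."""
    conflicts = []
    for cidr in vpc_cidrs:
        network = ipaddress.ip_network(cidr)
        if network.overlaps(GLEAN_DEFAULT_VPC):
            conflicts.append(cidr)
    return conflicts


if __name__ == "__main__":
    # Sample ranges for illustration only.
    ranges = ["10.0.0.0/8", "10.1.128.0/20", "192.168.0.0/24", "10.2.0.0/16"]
    print(find_conflicting_ranges(ranges))
```

Note that a broad range such as 10.0.0.0/8 counts as a conflict because it contains 10.1.0.0/16, even though it does not start at the same address.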
Configuration notes
- Requires firewall rules to allow traffic from Glean’s transit VPC CIDR (provided by Glean)
- Best performance and lowest latency
- Cost-effective (no gateway fees)
- Limited to GCP-to-GCP connectivity
Site-to-site VPN
Encrypted IPsec tunnel between your network and Glean’s GCP environment.
How it works
An IPsec VPN tunnel connects your VPN gateway to Glean’s Cloud VPN gateway. Glean uses a /29 CIDR range (provided by you) for the transit VPC. All traffic is encrypted in transit.
What you need to provide
- VPN gateway public IP address
- IKE version (v1 or v2) - IKE v2 recommended
- Pre-shared key (generate a strong 32+ character key)
- Dedicated /29 CIDR range for Glean’s transit VPC (e.g., 10.99.0.0/29)
  - Must not overlap with your networks or 10.1.0.0/16
- Routes to advertise (networks where data sources reside)
- Data source hostnames/IPs that Glean needs to access
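Two of the items above lend themselves to a quick local check: the /29 transit range and the pre-shared key. The sketch below is illustrative only. The function names are hypothetical, the key generator is one possible approach rather than a Glean requirement, and some VPN devices restrict which characters a pre-shared key may contain, so confirm the URL-safe alphabet used here works with yours.

```python
import ipaddress
import secrets

# Glean's default VPC range, as stated in this document.
GLEAN_DEFAULT_VPC = ipaddress.ip_network("10.1.0.0/16")


def validate_transit_range(transit_cidr, your_networks):
    """Return a list of problems with the proposed transit range.

    An empty list means the range is a /29 that overlaps neither
    10.1.0.0/16 nor any of the networks passed in.
    """
    transit = ipaddress.ip_network(transit_cidr)
    problems = []
    if transit.prefixlen != 29:
        problems.append(
            f"{transit_cidr} is a /{transit.prefixlen}, expected a /29"
        )
    if transit.overlaps(GLEAN_DEFAULT_VPC):
        problems.append(f"{transit_cidr} overlaps 10.1.0.0/16")
    for cidr in your_networks:
        if transit.overlaps(ipaddress.ip_network(cidr)):
            problems.append(f"{transit_cidr} overlaps your network {cidr}")
    return problems


def generate_pre_shared_key(length=48):
    """Generate a random pre-shared key of exactly `length` characters."""
    if length < 32:
        raise ValueError("use at least 32 characters")
    # token_urlsafe(n) encodes n random bytes as at least n characters,
    # so trimming to `length` always yields a full-length key.
    return secrets.token_urlsafe(length)[:length]


if __name__ == "__main__":
    # The networks below are sample values for illustration only.
    print(validate_transit_range("10.99.0.0/29", ["10.20.0.0/16"]))
    print(len(generate_pre_shared_key()))
```

Share the generated key with Glean through a secure channel rather than email or chat.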
Configuration notes
- Works with any cloud provider (AWS, Azure, GCP) or on-premises datacenter
- Higher latency than VPC peering due to VPN gateway hop
- VPN gateway and data transfer costs apply
- Supports static routing or BGP (GCP-to-GCP only)
Private Service Connect
Service-level isolation using GCP’s Private Service Connect producer/consumer model.
How it works
PSC requires two separate configurations:
For webhooks (Customer → Glean):
- Glean creates a PSC producer (publishes Internal LB)
- You create a PSC consumer endpoint in your VPC
- Your data sources send webhooks to <deployment>-internal-psc.glean.com
For crawlers (Glean → Customer):
- You create a PSC producer (publish data sources via Internal LB)
- Glean creates PSC consumer endpoint(s)
- Glean crawlers access your data sources through consumer endpoints
What you need to provide
For webhook setup:
- GCP project ID
- Preferred region for PSC endpoint
For crawler setup:
- Service attachment ID after creating your PSC producer
  - Format: projects/{project}/regions/{region}/serviceAttachments/{name}
- Glean project ID added to trusted consumers (Glean provides this)
Configuration steps
Webhook setup (Your network → Glean)
After receiving Glean’s service attachment ID:
1. Reserve a static internal IP in your VPC (see GCP documentation)
2. Navigate to VPC Network → Private Service Connect → Connected Endpoints
3. Click “Connect Endpoint” and enter Glean’s service attachment ID
4. Configure Cloud DNS private zone: <deployment>-internal-psc.glean.com → your consumer IP
   - Replace <deployment> with your specific deployment identifier provided by Glean.
5. Test connectivity: curl https://<deployment>-internal-psc.glean.com/health
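If you want to script the connectivity test, for example from a host inside your VPC, the same check can be written in Python. This is a minimal sketch: the function names are hypothetical, `acme` is a placeholder deployment identifier, and it assumes a healthy endpoint answers with an HTTP 2xx status, which this document does not specify.

```python
import urllib.error
import urllib.request


def health_url(deployment: str) -> str:
    """Build the health check URL from your deployment identifier."""
    return f"https://{deployment}-internal-psc.glean.com/health"


def check_health(url: str, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers with an HTTP 2xx status.

    Assumption: a 2xx response means the PSC endpoint is reachable.
    DNS failures, timeouts, refused connections, and HTTP error
    statuses all return False.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        return False


if __name__ == "__main__":
    # "acme" is a placeholder; use the identifier provided by Glean.
    url = health_url("acme")
    print(url, "reachable" if check_health(url) else "not reachable")
```

A False result does not say which step failed. Check DNS resolution of the hostname first, then the consumer endpoint status in the GCP console.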
Crawler setup (Glean → Your data sources)
1. Create a target service pointing to your data sources (if one does not already exist). Supported target types include:
   - Internal passthrough Network Load Balancer
   - Regional internal Application Load Balancer
   - Cross-region internal Application Load Balancer
   - Internal protocol forwarding
   - Regional internal proxy Network Load Balancer
   - Secure Web Proxy instance
2. Allocate a /24 subnet for PSC NAT (e.g., 10.100.250.0/24)
3. Navigate to VPC Network → Private Service Connect → Published Services
4. Create service attachment:
   - Target: Your target service (load balancer or forwarding rule from step 1)
   - Subnet: The /24 subnet allocated in step 2
   - Add Glean’s project ID to trusted projects
5. Provide service attachment ID to Glean
6. Configure private DNS zone: Glean sets up a private DNS zone to resolve your data source domain to the consumer IP
   - Example: *.sharepoint.com → Consumer IP address (IP2)
   - This ensures Glean’s crawlers resolve to the private endpoint instead of public IPs
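When you agree on the DNS pattern with Glean, it helps to know which hostnames a wildcard entry such as *.sharepoint.com would cover. The sketch below is a simplified illustration, not how Cloud DNS is implemented: the function name is hypothetical, `contoso` is a placeholder tenant name, and the matching ignores more specific records that could take precedence in a real zone.

```python
def matches_zone_pattern(hostname: str, pattern: str) -> bool:
    """Return True if hostname is covered by a pattern like '*.example.com'.

    Simplified matching: a wildcard pattern covers any name below the
    domain but not the bare domain itself. A pattern without a wildcard
    must match exactly.
    """
    hostname = hostname.lower().rstrip(".")
    pattern = pattern.lower().rstrip(".")
    if pattern.startswith("*."):
        suffix = pattern[1:]  # e.g. ".sharepoint.com"
        return hostname.endswith(suffix) and len(hostname) > len(suffix)
    return hostname == pattern


if __name__ == "__main__":
    # "contoso" is a placeholder tenant name for illustration.
    for name in ("contoso.sharepoint.com", "sharepoint.com", "example.org"):
        print(name, matches_zone_pattern(name, "*.sharepoint.com"))
```

Hostnames that do not match the pattern keep resolving to public IPs, so list every data source domain that crawlers should reach privately.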
Configuration notes
- Requires GCP-to-GCP connectivity
- Fine-grained access control via project allowlisting
- More manual configuration than peering/VPN
- Service-level isolation without exposing entire VPC
Security & network details
Encryption & isolation
- VPC peering: Traffic uses GCP’s internal network encryption
- VPN: IPsec tunnel encryption with IKE v1/v2
- PSC: Traffic stays within Google’s private network
Access control
- VPC peering/VPN: Firewall rules control connectivity
- PSC: Project allowlists provide explicit trust model
Glean network ranges
Reserved CIDR:
- 10.1.0.0/16 - Glean default VPC (do not use this range)
- VPC peering: Avoid overlap with 10.1.0.0/16
- VPN: Allocate /29 for transit VPC (must not overlap with 10.1.0.0/16 or your networks)
- PSC: Allocate /24 for PSC NAT
Firewall ports
Ensure firewall rules allow Glean to access:
- Port 443 (HTTPS) - Most data sources
- Port 80, 8080 (HTTP) - Some internal applications
- Custom ports - Work with Glean to identify specific requirements
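To confirm that a data source is listening on the expected ports, you can run a simple TCP check from a host inside your network. This is a minimal sketch with hypothetical function names and a placeholder hostname. It only shows reachability from the machine where it runs, so it does not prove that your firewall rules admit traffic from Glean's ranges.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def check_ports(host: str, ports=(443, 80, 8080)) -> dict:
    """Check each port listed in this section and return port -> bool."""
    return {port: port_is_reachable(host, port) for port in ports}


if __name__ == "__main__":
    # "wiki.internal.example" is a placeholder data source hostname.
    print(check_ports("wiki.internal.example"))
```

Pass your own tuple of ports to `check_ports` for data sources that use the custom ports you identified with Glean.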
Implementation process
1. Choose your method
Use the comparison table above to pick a method based on where your data sources are hosted.
2. Prepare information
Gather the required information listed in your chosen method section above.
3. Contact Glean
Reach out to your Glean Customer Success or Solutions Engineering team with:
- Chosen connectivity method
- Required information from step 2
- Timeline and compliance requirements
- Technical point of contact (name, email, role)
4. Deploy & validate
Glean will:
- Configure Glean infrastructure (VPN gateway, peering request, or PSC producer)
- Provide connection details (IP addresses, service attachment IDs, etc.)
- Coordinate connectivity testing
- Enable data source crawlers after validation
- Monitor initial crawl
Support
- Technical Documentation: GCP Cloud Prem FAQ
- Network Issues: Contact your Glean Solutions Engineer
- Security Questions: security@glean.com
- General Support: https://support.glean.com/hc/en-us