The Glean Knowledge Graph is the foundation of Glean’s enterprise search platform. It maintains a real-time model of your enterprise’s indexed information and uses the relationships between content, people, and activities within your organization to return personalized, contextually relevant search results.

Core Architecture

The Knowledge Graph is built on three fundamental pillars that work together to create a comprehensive understanding of your enterprise data:

  • Content: Documents, messages, tickets, and other content types across your organization
  • People: User identities, roles, teams, and organizational relationships
  • Activity: User interactions, document history, and engagement patterns
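
To make the relationship between the three pillars concrete, here is a minimal sketch in Python. The class names, fields, and the `titles_touched_by_team` helper are hypothetical illustrations, not Glean's actual data model or API. The sketch only shows how activity records can connect people to content so that a question like "what has this team worked on?" becomes answerable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Content:
    id: str
    title: str
    kind: str  # e.g. "document", "message", "ticket"


@dataclass(frozen=True)
class Person:
    id: str
    name: str
    team: str


@dataclass(frozen=True)
class Activity:
    # An activity record links one person to one piece of content.
    person_id: str
    content_id: str
    action: str  # e.g. "view", "edit", "comment"


def titles_touched_by_team(team, people, activities, content):
    """Return sorted titles of content that members of `team` interacted with."""
    member_ids = {p.id for p in people if p.team == team}
    by_id = {c.id: c for c in content}
    touched = {a.content_id for a in activities if a.person_id in member_ids}
    return sorted(by_id[cid].title for cid in touched if cid in by_id)


# Hypothetical sample data, for illustration only.
people = [Person("p1", "Ana", "Support"), Person("p2", "Raj", "Sales")]
content = [
    Content("c1", "Refund policy", "document"),
    Content("c2", "Q3 pipeline", "document"),
]
activities = [Activity("p1", "c1", "edit"), Activity("p2", "c2", "view")]

support_titles = titles_touched_by_team("Support", people, activities, content)
print(support_titles)
```

Keeping activity as its own record type, rather than a field on content or people, is one way to let the same person-to-content link carry different actions over time.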

Content Integration

Glean provides over 100 connectors, each designed for a specific application’s data model and API endpoints.

The content crawler performs comprehensive indexing that includes:

  • Full content analysis (titles, body copy, comments, media)
  • Metadata extraction (creator, creation time, update history, file type, folder structure)
  • Permissions management
  • Customizable search weights
  • Faceted search capabilities
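
The items in this list can be pictured as fields on an indexed record. The following Python sketch is an assumption-laden illustration: `IndexedDocument`, its fields, and the `search` function are hypothetical and do not reflect Glean's real index schema or ranking. It shows permissions filtering results per user, a search weight ordering the hits, and file type serving as a facet.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class IndexedDocument:
    title: str
    body: str
    creator: str      # metadata
    file_type: str    # metadata, also used as a facet
    allowed_users: set = field(default_factory=set)  # permissions
    weight: float = 1.0  # customizable search weight


def search(docs, query, user):
    """Return (hits, facets) for documents the user may see that match the query."""
    q = query.lower()
    hits = [
        d for d in docs
        if user in d.allowed_users
        and (q in d.title.lower() or q in d.body.lower())
    ]
    # Higher-weighted documents come first.
    hits.sort(key=lambda d: d.weight, reverse=True)
    # Facet counts are computed only over documents the user can access.
    facets = dict(Counter(d.file_type for d in hits))
    return hits, facets


# Hypothetical sample data, for illustration only.
docs = [
    IndexedDocument("Onboarding guide", "Welcome to the team", "raj", "doc", {"ana", "raj"}, 2.0),
    IndexedDocument("Onboarding checklist", "Laptop, badge, accounts", "ana", "sheet", {"ana"}, 1.0),
    IndexedDocument("Onboarding budget", "Confidential figures", "raj", "sheet", {"raj"}, 3.0),
]

hits, facets = search(docs, "onboarding", "ana")
print([d.title for d in hits], facets)
```

In this sketch the permission check runs before ranking and facet counting, so a restricted document never influences what another user sees.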

People Intelligence

One in ten enterprise searches is people-related, so comprehensive people data is crucial for effective enterprise search.

The People pillar of the Knowledge Graph provides:

  1. Unified Identity: Creates a consolidated view of each person across all connected applications.
  2. Organizational Context: Maps relationships between roles, teams, tenure, and location.
  3. Collaboration Insights: Identifies close collaborators and recent project involvement.
  4. Customizable Profiles: Flexible data model that can be tailored to your organization’s needs.
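
As an illustration of the unified identity idea, the sketch below merges per-application accounts into one profile per person. It assumes a normalized email address is the join key and that the first value seen for an attribute wins. Both are assumptions made for this example, and `unify_identities` is a hypothetical helper, not a Glean API.

```python
from collections import defaultdict


def unify_identities(accounts):
    """Merge per-app accounts into one profile per person.

    Assumption: a lowercased email address identifies the same person
    across applications.
    """
    profiles = defaultdict(lambda: {"handles": {}, "attributes": {}})
    for account in accounts:
        key = account["email"].strip().lower()
        profile = profiles[key]
        profile["handles"][account["app"]] = account["handle"]
        for name, value in account.get("attributes", {}).items():
            # Keep the first value seen; a real system would need a
            # deliberate conflict-resolution policy.
            profile["attributes"].setdefault(name, value)
    return dict(profiles)


# Hypothetical sample data, for illustration only.
accounts = [
    {"app": "slack", "handle": "@ana", "email": "Ana@example.com",
     "attributes": {"team": "Support"}},
    {"app": "jira", "handle": "ana.s", "email": "ana@example.com",
     "attributes": {"team": "Customer Support", "location": "Lisbon"}},
]

profiles = unify_identities(accounts)
print(profiles["ana@example.com"])
```

The free-form `attributes` dictionary loosely mirrors the customizable profile idea: new fields can be added without changing the merge logic.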

Activity Tracking

Activity data is collected securely from connected applications to enhance search personalization and relevance. It comes from multiple sources:

  • Teams
  • Slack
  • Email
  • Plugins
  • Chrome extension

Data Usage and Privacy

The activity information serves two primary purposes:

  • Individual Personalization: Learns each user’s patterns to improve their personal search results while maintaining strict user privacy. Individual data remains isolated to each user.
  • Collective Intelligence: Enhances results for user groups through aggregated insights. Privacy thresholds ensure that data collection occurs only across multiple users.
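
One way to picture a privacy threshold is as a minimum number of distinct users required before an aggregated signal is used at all. The sketch below is a generic illustration of that idea. The function name and the `min_users=3` default are assumptions for this example, since the actual thresholds Glean applies are not stated here.

```python
from collections import defaultdict


def aggregate_signals(events, min_users=3):
    """Count distinct users per document, keeping only documents that
    meet the privacy threshold.

    `events` is an iterable of (user_id, doc_id) pairs. `min_users` is a
    hypothetical threshold chosen for illustration.
    """
    users_by_doc = defaultdict(set)
    for user_id, doc_id in events:
        users_by_doc[doc_id].add(user_id)
    return {
        doc_id: len(users)
        for doc_id, users in users_by_doc.items()
        if len(users) >= min_users
    }


# Hypothetical sample data, for illustration only.
events = [
    ("u1", "d1"), ("u2", "d1"), ("u3", "d1"),  # three distinct users
    ("u1", "d2"), ("u1", "d2"),                # one user, repeated
]

signals = aggregate_signals(events)
print(signals)
```

Counting distinct users rather than raw events means that repeated activity by a single person cannot push a document over the threshold.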

Activity data never leaves your exclusive GCP project and is subject to strict data protection rules to ensure privacy.