The WordPress connector lets Glean index content from WordPress sites so that users can search public pages, posts, comments, tags, and categories directly within Glean. Once configured, the connector brings WordPress content, including selected metadata, into Glean for enterprise search. The integration is intended primarily for published, public-facing content.

Supported Features and Limitations

The WordPress connector crawls standard content exposed through the WordPress REST API. Only published, publicly accessible content is indexed. Permission enforcement for restricted content is not supported in the current version: all indexed content is visible to every Glean user in the organization.

Supported Objects/Entities

  • Pages (published, public)
  • Posts (published, public)
  • Comments (attached to posts/pages)
  • Tags
  • Categories
  • Custom post types (requires manual endpoint configuration)

Supported API Endpoints/Features

  • /pages: Retrieve and index page content and metadata
  • /posts: Retrieve and index post content and metadata
  • /comments: Index comments on pages and posts
  • /tags: Index tag metadata associated with posts/pages
  • /categories: Index category metadata

Custom post types can be included, but require manual entry of their endpoints in the Glean configuration.
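
As a rough illustration of how the endpoints above map onto request URLs, the sketch below builds one paginated collection URL per endpoint. It assumes the default WordPress REST route prefix `/wp-json/wp/v2` and the standard `per_page` and `page` query parameters. The function name `collection_urls` is hypothetical, and the sketch assumes custom post types are registered under the same prefix. This is not a description of how Glean issues its requests internally.

```python
from urllib.parse import urlencode

# Default route prefix of the core WordPress REST API (assumed unchanged on the site).
API_PREFIX = "/wp-json/wp/v2"

# Collections named in the endpoint list above.
STANDARD_ENDPOINTS = ["pages", "posts", "comments", "tags", "categories"]


def collection_urls(hostname, custom_endpoints=(), per_page=100, page=1):
    """Return one paginated collection URL per endpoint, custom ones last."""
    query = urlencode({"per_page": per_page, "page": page})
    urls = []
    for endpoint in [*STANDARD_ENDPOINTS, *custom_endpoints]:
        # Accept "/books", "books/" or "books" for manually entered endpoints.
        path = endpoint.strip("/")
        urls.append(f"https://{hostname}{API_PREFIX}/{path}?{query}")
    return urls


if __name__ == "__main__":
    # "books" stands in for a hypothetical custom post type endpoint.
    for url in collection_urls("mycompany.com", custom_endpoints=["books"]):
        print(url)
```

A custom post type registered under a different REST namespace would need its full route rather than just the final path segment.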

Limitations

  • Only published, non-password-protected, and non-private content is crawled. Drafts, password-protected, and private pages/posts are not indexed.
  • No permission propagation: all indexed content is available to all Glean users in the organization, regardless of WordPress role-based access controls.
  • Nested categories are not fully supported for faceting; only directly assigned categories are indexed.
  • No author attribution: Although the WordPress API can return an author ID, the connector does not attach or surface author information in indexed documents. Author names, emails, or IDs are not recorded or displayed.
  • Activity data (such as view counts) is not supported in the current version.
  • Advanced features such as popularity-based ranking, private content handling, and per-user visibility are not included in the current connector.
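
The first limitation can be pictured as a filter over REST API items. The sketch below assumes the standard shape of a WordPress post or page object, in which `status` is `"publish"` for published items and `content.protected` is true for password-protected ones. The helper name `is_crawlable` is hypothetical, and the connector's actual filtering logic may differ.

```python
def is_crawlable(item):
    """True for published items whose content is not password-protected."""
    # Drafts and private items carry a status other than "publish".
    if item.get("status") != "publish":
        return False
    # Password-protected items are published but flagged on the content object.
    content = item.get("content") or {}
    return not content.get("protected", False)


if __name__ == "__main__":
    # Illustrative sample data, not real API output.
    sample = [
        {"id": 1, "status": "publish", "content": {"protected": False}},
        {"id": 2, "status": "publish", "content": {"protected": True}},
        {"id": 3, "status": "private", "content": {"protected": False}},
        {"id": 4, "status": "draft", "content": {"protected": False}},
    ]
    print([item["id"] for item in sample if is_crawlable(item)])
```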

Requirements

This section outlines the requirements to deploy and operate the WordPress connector with Glean.

Technical Requirements

  • The WordPress site must be reachable over the internet, either publicly or through allow-listed IP ranges.
  • The WordPress REST API must be enabled and reachable.
  • WordPress 5.6 or later is required, because Application Passwords (used for API authentication) were introduced in that release.
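
One way to sanity-check these requirements before configuring the connector is to inspect the REST API index document served at `/wp-json/`. The sketch below assumes the index has already been fetched and parsed into a dictionary, and that it follows the usual layout: a `namespaces` list and an `authentication` object that includes an `application-passwords` entry when the feature is available. The function name `check_rest_index` is hypothetical.

```python
def check_rest_index(index):
    """Return a list of problems found in a parsed /wp-json/ index document."""
    problems = []
    # The core content routes live in the "wp/v2" namespace.
    if "wp/v2" not in index.get("namespaces", []):
        problems.append("core wp/v2 namespace not advertised")
    # Sites without Application Passwords typically report an empty value here.
    if "application-passwords" not in index.get("authentication", {}):
        problems.append("Application Passwords not advertised")
    return problems


if __name__ == "__main__":
    # Illustrative index fragment, not real output from a site.
    sample_index = {
        "namespaces": ["oembed/1.0", "wp/v2"],
        "authentication": {"application-passwords": {"endpoints": {}}},
    }
    print(check_rest_index(sample_index) or "index looks usable")
```

An empty result only shows that the site advertises the needed features. It does not confirm that a particular credential works.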

Credential Requirements

  • Administrator account credentials on the WordPress site to generate an Application Password.
  • Application Password and corresponding WordPress username must be provided to Glean for API access.
  • Target WordPress site hostname is required for connector configuration.

Permission Requirements

  • The connector requires a WordPress administrator account to generate the Application Password. Only administrators or users with sufficient privileges can create these credentials.
  • Since the connector only crawls public content, no additional permission mapping or user role scoping is performed.

Preliminary Source/System Setup

  • An Application Password must be created in WordPress for the account that Glean will use.
  • (Optional) If custom post types should be indexed, identify their REST API endpoints in advance.

External References

  • WordPress REST API reference: developer.wordpress.org/rest-api/reference/
  • Application Passwords (WordPress 5.6+): make.wordpress.org/core/2020/11/05/application-passwords-integration-guide/

Configuration and Setup Instructions

The WordPress connector is set up through the Glean Admin Console. All primary configuration happens in the console, using information gathered from the WordPress admin portal.

Prerequisites

  • Running WordPress site (v5.6+)
  • WordPress admin access
  • Glean Admin Console access

Authentication and Credentials

  • In WordPress, log in with an administrator account. Navigate to the user’s profile and generate a new Application Password for Glean use.
  • Record both the newly created Application Password and the associated username.
  • In Glean’s Admin Console, enter the hostname of the WordPress site, the admin username, and the Application Password in the relevant connector fields.
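
For context, WordPress Application Passwords are presented to the REST API through HTTP Basic authentication. The sketch below shows how a username and Application Password combine into an `Authorization` header. The helper name `basic_auth_header` and the sample values are hypothetical. Glean builds this header itself once the credentials are entered, so the code is only useful for testing credentials by hand.

```python
import base64


def basic_auth_header(username, application_password):
    """Build an HTTP Basic Authorization header from WordPress credentials."""
    raw = f"{username}:{application_password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # Placeholder credentials; never hard-code real ones.
    print(basic_auth_header("glean-admin", "abcd efgh ijkl mnop"))
```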

Step-by-Step Setup

  1. Log in to your WordPress site as an administrator and navigate to your user profile.
  2. Create a new Application Password specifically for Glean integration.
  3. Note the Application Password and username; store this information securely.
  4. Open Glean’s Admin Console and select “Add Connector” > “WordPress.”
  5. Enter the following in the configuration:
    • WordPress site hostname (e.g., mycompany.com)
    • Application Password (from Step 3)
    • Username (from Step 3)
  6. (Optional) To crawl custom post types, add the REST API endpoints for those custom posts in the Glean connector settings.
  7. Save your configuration and trigger a test crawl to confirm that authentication and setup succeed.
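
If the test in the final step fails, the credentials can be checked independently of Glean. The sketch below builds (without sending) a request to the WordPress `/wp-json/wp/v2/users/me` route, which returns the authenticated user's record, and maps common HTTP status codes to likely causes. The function names and the status interpretations are assumptions for illustration. `auth_headers` is expected to hold an `Authorization` header of the form `Basic <base64 of username:application password>`.

```python
import urllib.request


def build_auth_check(hostname, auth_headers):
    """Build (not send) a GET request for the authenticated user's record."""
    return urllib.request.Request(
        f"https://{hostname}/wp-json/wp/v2/users/me",
        headers=dict(auth_headers),
        method="GET",
    )


def describe_status(status):
    """Map an HTTP status code to a likely cause (rough guidance only)."""
    if status == 200:
        return "credentials accepted"
    if status in (401, 403):
        return "credentials rejected or insufficient privileges"
    if status == 404:
        return "route not found; check the hostname and that the REST API is enabled"
    return f"unexpected HTTP status {status}"


if __name__ == "__main__":
    # Placeholder header value; substitute a real Basic token to test a site.
    request = build_auth_check("mycompany.com", {"Authorization": "Basic PLACEHOLDER"})
    print(request.get_method(), request.full_url)
    print(describe_status(401))
```

To send the request, pass it to `urllib.request.urlopen`. A failed call raises `urllib.error.HTTPError`, whose `code` attribute can be given to `describe_status`.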