This article explains the size limits and processing rules for item content, metadata, and permissions that Glean’s crawler and indexer apply to all datasources.
Is Content Over the Size Limit still Searchable via Glean?
Do these Limits Apply to All Datasources?
When the Content is Less Than the Size Limit, is All of the Content Indexed?
If the Crawler Defaults to Converting and Stripping Content, what other Options are Available?
Will Chat (AI Assistant) and the Summarize Capabilities on the Search Engine Results Page be Affected for Non-Indexed Content?
These Limits Seem Small. What’s the Reason?
Is this a Hard Limit that can Never Change?
What Types of Content are Affected?