File Upload

Overview

File Upload enables users to upload and analyze local files directly in Glean, Public Knowledge, and Apps. This feature allows users to query, summarize, and generate content from uploaded files. The upload limits vary based on the model used in your Glean instance.

info

File content and metadata are stored in the chat sessions of users and are retained while the chat session is present in the history.

Supported File Formats

Key Features

File Upload

Users can upload up to 5 files directly from their local computer, each up to 64 MB in size. These are the limits for 128K models; smaller models have lower limits (see Upload Limits).

Real-Time Querying

Query the text content of uploaded files immediately after upload for instant analysis.

Document Metadata

The chat UI displays document metadata including title and file type for easy reference.

File Management

1. File Deletion

Users can delete uploaded files before submitting their first query. After query submission, files can only be removed by deleting the chat session history.

2. Retention Policy

File content and metadata are retained while the chat session is present in history.

API Support

Developer platform customers can use the API for file uploads. Refer to the API documentation for details.

Security and Privacy

Security Features

  • Files are parsed and scanned for malware before storage
  • Malware-infected files trigger upload errors

Access Controls

  • Uploaded files can only be downloaded by the original uploader or by users who have access to the associated shared chat session
  • Files are not publicly accessible, even if the chat session was previously marked as public
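
The access rule above can be read as a simple predicate. The following is a minimal sketch for illustration only: `ChatSession`, `shared_with`, `was_public`, and `can_download` are hypothetical names, not part of any Glean API, and "has access to the shared chat session" is modeled as plain set membership.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ChatSession:
    """Hypothetical model of a chat session that holds uploaded files."""
    uploader: str
    shared_with: set[str] = field(default_factory=set)
    was_public: bool = False  # deliberately ignored by can_download


def can_download(session: ChatSession, user: Optional[str]) -> bool:
    """Allow the uploader and users the session is shared with; nobody else."""
    if user is None:  # anonymous requests are never allowed
        return False
    return user == session.uploader or user in session.shared_with
```

Note that `was_public` never influences the result, which mirrors the statement that files stay private even if the session was previously marked as public.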

Upload Limits

The maximum file size and the number of files allowed per session depend on your model:

| Model | File Limit | Size Limit |
|---|---|---|
| 128K Models | 5 files | 64 MB |
| 32K Models | 4 files | 32 MB |
| 8K Models | 2 files | 16 MB |

info

Minimum file size for upload is 1 KB
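
As an illustration of how these limits combine, here is a minimal pre-flight check a client could run before attempting an upload. It is a sketch under stated assumptions: the model keys (`"128k"`, `"32k"`, `"8k"`), the function name `check_uploads`, the use of binary units (1 MB = 1024 × 1024 bytes), and the inclusive bounds are all assumptions, not documented Glean behavior.

```python
from dataclasses import dataclass

KB = 1024          # assumption: binary units
MB = 1024 * 1024


@dataclass(frozen=True)
class UploadLimit:
    max_files: int
    max_size_bytes: int


# Values taken from the Upload Limits table; the keys are hypothetical labels.
LIMITS = {
    "128k": UploadLimit(5, 64 * MB),
    "32k": UploadLimit(4, 32 * MB),
    "8k": UploadLimit(2, 16 * MB),
}
MIN_SIZE_BYTES = 1 * KB


def check_uploads(model: str, file_sizes: list[int]) -> list[str]:
    """Return a list of problems; an empty list means the batch fits."""
    limit = LIMITS[model]
    problems = []
    if len(file_sizes) > limit.max_files:
        problems.append(f"too many files: {len(file_sizes)} > {limit.max_files}")
    for i, size in enumerate(file_sizes):
        if size < MIN_SIZE_BYTES:
            problems.append(f"file {i} is below the 1 KB minimum")
        elif size > limit.max_size_bytes:
            problems.append(f"file {i} exceeds {limit.max_size_bytes // MB} MB")
    return problems
```

For example, `check_uploads("8k", [17 * MB])` reports that the single file exceeds the 16 MB limit for 8K models.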

Analyze archive files

You can upload and analyze .zip archives directly in Glean. This is useful for working with bundled documents, code packages, or compressed datasets. Drag and drop a .zip file into the composer, then ask questions about its contents. Other archive formats are also accepted: .tar, .tar.gz / .tgz, .tar.bz2, .gz, and .bz2.

Unlike other uploads, archives aren't parsed and indexed up front. Instead, Glean opens them in the Agent Sandbox — a virtual computer with a file system, shell, and code interpreter — and inspects or extracts only the files needed to answer your question.

Requirements

Archive analysis runs in the Agent Sandbox, so both of the following must be true:

  • Agent Sandbox is turned on for your organization.
  • The chat is in Thinking mode. You can't add archives in Fast mode, and switching a chat to Fast mode removes any staged archive files.
info

Archive analysis runs in the Agent Sandbox and may be subject to usage-based pricing.

Archive limits

| Limit | Default |
|---|---|
| Archive (compressed) file size | 64 MB |
| Total uncompressed contents | 256 MB |

Archives count toward the per-session file limit for your model, shown in Upload Limits. To protect against zip bombs, Glean also rejects archives with an unusually high compression ratio.
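
To make the three checks concrete (compressed size, total uncompressed size, and compression ratio), here is a minimal sketch for .zip files using Python's standard `zipfile` module. The two size limits come from the table above; the ratio threshold `MAX_RATIO` is a hypothetical value, since the actual threshold is not documented, and `check_zip_limits` is an illustrative name rather than Glean's implementation.

```python
import io
import zipfile

MB = 1024 * 1024                   # assumption: binary units
MAX_ARCHIVE_BYTES = 64 * MB        # default from the Archive limits table
MAX_UNCOMPRESSED_BYTES = 256 * MB  # default from the Archive limits table
MAX_RATIO = 100.0                  # hypothetical; the real threshold is not documented


def check_zip_limits(data: bytes) -> list[str]:
    """Return a list of limit violations for a .zip held in memory."""
    problems = []
    if len(data) > MAX_ARCHIVE_BYTES:
        problems.append("archive exceeds compressed size limit")
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
    # Sizes are read from the archive's own headers, without extracting.
    total = sum(info.file_size for info in infos)
    packed = sum(info.compress_size for info in infos)
    if total > MAX_UNCOMPRESSED_BYTES:
        problems.append("contents exceed uncompressed size limit")
    if packed > 0 and total / packed > MAX_RATIO:
        problems.append("compression ratio is suspiciously high")
    return problems
```

Because the declared sizes in a zip header can be forged, a production check would also count bytes while extracting and stop once a limit is crossed; this sketch only shows the header-based first pass.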

Security restrictions

Glean validates every archive before processing and rejects it if it contains any of the following:

  • Executable or script files (for example, .exe, .dll, .so, .sh, .bat, or .msi)
  • Nested archives, that is, another archive inside the archive
  • Password-protected or encrypted entries
  • Symbolic or hard links
  • Entries with unsafe paths, such as a path that points outside the archive

Glean scans archives for malware, the same as for other file uploads.
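
The restrictions listed above can be sketched as a per-entry check over a .zip file. This is an illustration using Python's standard `zipfile` module, not Glean's validator: the function names are hypothetical, the blocked extensions are only the examples named in the list (the full set is not documented), and hard links are not covered because the zip format does not represent them (for tar archives, `tarfile.TarInfo.issym()` and `islnk()` would play that role).

```python
import stat
import zipfile
from pathlib import PurePosixPath

# Only the example extensions from the list above; the full set is an unknown.
BLOCKED_EXTENSIONS = {".exe", ".dll", ".so", ".sh", ".bat", ".msi"}
ARCHIVE_SUFFIXES = (".zip", ".tar", ".tar.gz", ".tgz", ".tar.bz2", ".gz", ".bz2")


def entry_problems(info: zipfile.ZipInfo) -> list[str]:
    """Return the reasons a single zip entry would be rejected."""
    name = info.filename.replace("\\", "/")
    path = PurePosixPath(name)
    problems = []
    if path.suffix.lower() in BLOCKED_EXTENSIONS:
        problems.append("executable or script file")
    if name.lower().endswith(ARCHIVE_SUFFIXES):
        problems.append("nested archive")
    if info.flag_bits & 0x1:  # bit 0 of the general-purpose flags marks encryption
        problems.append("encrypted entry")
    if stat.S_ISLNK(info.external_attr >> 16):  # Unix mode lives in the high 16 bits
        problems.append("symbolic link")
    if path.is_absolute() or ".." in path.parts:
        problems.append("unsafe path")
    return problems


def validate_zip(source) -> dict[str, list[str]]:
    """Map each offending entry name to its problems; empty dict means clean."""
    with zipfile.ZipFile(source) as zf:
        report = {info.filename: entry_problems(info) for info in zf.infolist()}
    return {name: probs for name, probs in report.items() if probs}
```

`validate_zip` accepts a file path or a file-like object. Any non-empty result would correspond to the archive being rejected as a whole, since a single offending entry is enough.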

Known Limitations

warning
  • Multimedia files (video and audio) are not supported. Files outside of the supported file types above cannot be uploaded.
  • Custom data retention policies for file uploads are not supported beyond your configured chat history retention. To delete files and metadata sooner, ask users to disable chat session history or to manually delete chat sessions.
  • Optical Character Recognition (OCR) must be enabled for scanned PDFs to work properly. Contact Glean support if you experience issues.
  • For analytical queries (row counts, filtered counts, aggregations) on large spreadsheets referenced from a connected source such as SharePoint or OneDrive, uploading the file directly typically produces more accurate results than referencing the file by URL. See Data Analysis: Overview for guidance.

Enabling File Upload

To enable file upload for all users:

  1. Navigate to the Glean section in workspace Settings
  2. Locate the File Upload toggle
  3. Enable the feature
info

File Upload is disabled by default for existing customers (as of GA on 9/24) but enabled by default for new customers.

[Figure: File Upload Settings]

FAQ

See also

Future Development