# ai_security.ai_security

The `ai_security.ai_security` BigQuery table contains enriched records of AI security violations. These records include debug metadata, workflow context, raw content, and validation/model metadata.
You can use this table for various purposes, including:
- Root-cause analysis and triage
- Dashboards and automated alerts
- Offline machine learning or data aggregation work
## Quick Investigation Checklist

Before you begin an investigation, ensure you have the following:

- Required data: the `event_id`, `Run ID`, or a specific timestamp range from the Findings dashboard.
- Permissions: the `bigquery.dataViewer` and `bigquery.jobUser` roles needed to access and query the data.
- Query best practices: use saved views or pivoted queries to minimize manual errors.
- Data security: mask or redact any sensitive fields before sharing query results outside of the security team.
## How to access the data

### Required Privileges

To access and query the data, a user typically needs the following Identity and Access Management (IAM) roles:

- `roles/bigquery.dataViewer`: to read the table data.
- `roles/bigquery.jobUser`: to run queries.
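If a teammate is missing access, the roles above can be granted with `gcloud`; a sketch, where the project id and member address are placeholders:

```shell
# Grant read access to the table data (placeholder member address).
gcloud projects add-iam-policy-binding YOUR_PROJECT \
  --member="user:analyst@example.com" \
  --role="roles/bigquery.dataViewer"

# Allow the same user to run query jobs.
gcloud projects add-iam-policy-binding YOUR_PROJECT \
  --member="user:analyst@example.com" \
  --role="roles/bigquery.jobUser"
```

For finer scoping, the same roles can instead be granted at the dataset level via dataset access controls rather than project-wide.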
### GCP Console (web UI)

- Sign in to the Google Cloud Console and select the project that owns the dataset.
- Open BigQuery → Explorer panel → find `your_project` → dataset `ai_security` → table `ai_security`.
- Click the table to view its schema, details, and preview rows.
- Use the Query editor to run SQL; review results in Table / JSON view; use the Visualization tab to build simple charts.
- Save frequently used diagnostic queries with the Console’s Saved Queries feature for reuse and sharing.
### CLI (Cloud Shell or local with gcloud + bq)

- Authenticate and set the project.
- Run a simple query with `bq`.
- Output JSON for programmatic parsing.
- Save queries as stored procedures in BigQuery and call them from the CLI for automation.
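The first three steps can be sketched as follows (`YOUR_PROJECT` is a placeholder for the project that owns the dataset):

```shell
# Authenticate and point subsequent commands at the right project.
gcloud auth login
gcloud config set project YOUR_PROJECT

# Run a simple query with bq (Standard SQL).
bq query --use_legacy_sql=false \
  'SELECT timestamp, insertId
   FROM `YOUR_PROJECT.ai_security.ai_security`
   ORDER BY timestamp DESC
   LIMIT 10'

# Output JSON for programmatic parsing.
bq query --use_legacy_sql=false --format=prettyjson \
  'SELECT COUNT(*) AS total
   FROM `YOUR_PROJECT.ai_security.ai_security`'
```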
## Relevant columns (top-level BigQuery fields)

These fields are present on each exported log row and give resource / ingestion context:

- `logName`: log stream name.
- `resource.type`: GCP resource type (e.g., `k8s_container`).
- `resource.labels.pod_name`: pod name (k8s).
- `resource.labels.location`: region / zone.
- `resource.labels.namespace_name`: K8s namespace.
- `resource.labels.cluster_name`: cluster name.
- `resource.labels.project_id`: GCP project id.
- `timestamp`: event timestamp (when it occurred).
- `receiveTimestamp`: when the log was ingested.
- `insertId`: unique insert id (for deduplication).
- `labels.commit_hash`, `labels.branch`, `labels.full_version`: build/version metadata.
## Payload structure (jsonPayload.ai_security)

`jsonPayload.ai_security` contains a JSON representation of the `AiSecurityLogEntry` proto. Important fields and their meaning:
- `event_id` (string): globally unique event id.
- `event_type` (enum): e.g., `VIOLATION`.
- `event_description` (string): human-readable description.
- `user_id` (string): user that triggered the event.
- `session_info`: object with `tab_id` and `session_tracking_token`.
- `action` (enum): enforcement taken (`BLOCK_REQUEST` / `ALLOW_REQUEST`).
- `content_raw` (string): raw content that caused the event (user prompt or retrieved content).
- `content_metadata` (repeated `{key,value}`): context keys such as `RESOURCE_NAME`, `RESOURCE_ID`, `RESOURCE_URL`, `AGENT_NAME`, `RUN_ID`, `CHAT_SESSION_ID`, `AGENT_ID`, `SOURCE`.
- `validation_metadata` (repeated `{key,value}`): model prediction / validation debugging key-values.
- Other context fields may exist, e.g. workflow entries, LLM call details, agent spans.
## Common SQL patterns & examples

Notes: `UNNEST()` is used to flatten the repeated metadata arrays. Replace `YOUR_PROJECT` with your project id.
### Get latest 100 violations with key content & resource id
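A sketch of such a query, assuming the payload is exported as a nested STRUCT so fields are addressable with dot syntax (adapt to `JSON_VALUE()` if your export uses a JSON column):

```sql
SELECT
  timestamp,
  jsonPayload.ai_security.event_id AS event_id,
  jsonPayload.ai_security.action AS action,
  jsonPayload.ai_security.content_raw AS content_raw,
  -- Pull one metadata key out of the repeated {key,value} array.
  (SELECT value
   FROM UNNEST(jsonPayload.ai_security.content_metadata)
   WHERE key = 'RESOURCE_ID'
   LIMIT 1) AS resource_id
FROM `YOUR_PROJECT.ai_security.ai_security`
WHERE jsonPayload.ai_security.event_type = 'VIOLATION'
ORDER BY timestamp DESC
LIMIT 100;
```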
### Extract a single row’s full JSON for deep debugging
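One way to dump an entire row, assuming you already have the `event_id` from the Findings dashboard (the id below is a placeholder):

```sql
-- TO_JSON_STRING with pretty-printing returns the whole row as one JSON blob.
SELECT TO_JSON_STRING(t, true) AS full_row
FROM `YOUR_PROJECT.ai_security.ai_security` AS t
WHERE t.jsonPayload.ai_security.event_id = 'YOUR_EVENT_ID'
LIMIT 1;
```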
### Pivot `content_metadata` into columns (common keys)
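A sketch of the pivot pattern using `MAX(IF(...))` over the unnested array; the key names follow the list in the payload section above, and the STRUCT-style field access is an assumption about the export schema:

```sql
SELECT
  jsonPayload.ai_security.event_id AS event_id,
  -- Each MAX(IF(...)) turns one metadata key into its own column.
  MAX(IF(m.key = 'RESOURCE_NAME', m.value, NULL)) AS resource_name,
  MAX(IF(m.key = 'RESOURCE_ID',   m.value, NULL)) AS resource_id,
  MAX(IF(m.key = 'AGENT_NAME',    m.value, NULL)) AS agent_name,
  MAX(IF(m.key = 'RUN_ID',        m.value, NULL)) AS run_id
FROM `YOUR_PROJECT.ai_security.ai_security` AS t,
  UNNEST(t.jsonPayload.ai_security.content_metadata) AS m
WHERE DATE(t.timestamp) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
GROUP BY event_id;
```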
### Count of violations by action type (daily)
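A daily rollup sketch, again assuming STRUCT-style access to the payload:

```sql
SELECT
  DATE(timestamp) AS day,
  jsonPayload.ai_security.action AS action,
  COUNT(*) AS violations
FROM `YOUR_PROJECT.ai_security.ai_security`
WHERE jsonPayload.ai_security.event_type = 'VIOLATION'
GROUP BY day, action
ORDER BY day DESC, violations DESC;
```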
### Top validation metadata keys/values (for model debugging)
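A frequency count over the unnested `validation_metadata` array; the 30-day window is an arbitrary choice:

```sql
SELECT
  v.key,
  v.value,
  COUNT(*) AS occurrences
FROM `YOUR_PROJECT.ai_security.ai_security` AS t,
  UNNEST(t.jsonPayload.ai_security.validation_metadata) AS v
WHERE DATE(t.timestamp) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
GROUP BY v.key, v.value
ORDER BY occurrences DESC
LIMIT 50;
```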
## Useful investigative workflows

- Triage a single finding
  - Start with the `event_id` from the Findings dashboard or the `Run ID`.
  - Query the `event_id` in BigQuery to fetch `content_raw`, `content_metadata`, and `validation_metadata`.
  - Inspect LLM call traces / `llm_call` and `agent_span` fields (if present) to see prompt / response context.
- Find similar incidents
  - Use `content_metadata.RESOURCE_ID` or normalized `content_raw` hashes to group similar violations.
  - Search by `validation_metadata` keys (e.g., model label or confidence buckets) to identify common false positives.
- Root cause of skipped users / digest generation issues
  - Combine `digest` entries with workflow/compiler logs (`workflow`, `workflow_compiler`) in the exported fields to see enqueue vs. execution differences.
- Automated daily rollups
  - Create scheduled queries that aggregate violations by agent, action, and resource; write results to an `ai_security_reporting` dataset for dashboards.
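The "normalized `content_raw` hashes" grouping can be sketched with `FARM_FINGERPRINT`; lowercase-plus-trim is one cheap normalization choice, and the STRUCT-style field access is an assumption about the export schema:

```sql
SELECT
  -- Cheap normalization before fingerprinting: lowercase + trim.
  FARM_FINGERPRINT(LOWER(TRIM(jsonPayload.ai_security.content_raw))) AS content_hash,
  COUNT(*) AS occurrences,
  ANY_VALUE(jsonPayload.ai_security.event_id) AS sample_event_id,
  MIN(timestamp) AS first_seen,
  MAX(timestamp) AS last_seen
FROM `YOUR_PROJECT.ai_security.ai_security`
WHERE jsonPayload.ai_security.event_type = 'VIOLATION'
GROUP BY content_hash
HAVING occurrences > 1
ORDER BY occurrences DESC
LIMIT 50;
```

Clusters with high `occurrences` are good candidates for a shared root cause or a common false positive.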
## Best practices & operational recommendations

- Minimize scanned bytes: project only the fields you need (avoid `SELECT *`). Use `UNNEST()` carefully and filter early.
- Protect PII: `content_raw` may contain sensitive user content. Limit access via IAM and consider creating sanitized views that mask or redact `content_raw` before sharing with wider teams.
- Stored procedures & saved queries: convert complex investigation SQL into stored procedures or saved queries for repeatable triage.
- Alerting: for high-severity events (e.g., many `BLOCK_REQUEST` actions in a short window), schedule queries to write a metric table and tie it to Cloud Monitoring or a Cloud Function that publishes alerts.
- Retention & export: define retention in BigQuery or set up periodic exports for offline ML if long-term analysis is required.
## Example: create a reusable view for violation triage
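A sketch of such a view; the view name is a suggestion, the STRUCT-style field access is an assumption about the export schema, and `content_raw` is hashed rather than exposed so the view is safer to share beyond the security team:

```sql
CREATE OR REPLACE VIEW `YOUR_PROJECT.ai_security.violation_triage` AS
SELECT
  timestamp,
  jsonPayload.ai_security.event_id AS event_id,
  jsonPayload.ai_security.action AS action,
  jsonPayload.ai_security.user_id AS user_id,
  -- Mask raw content: keep only a hash, usable for grouping but not readable.
  SHA256(jsonPayload.ai_security.content_raw) AS content_raw_hash,
  (SELECT value
   FROM UNNEST(jsonPayload.ai_security.content_metadata)
   WHERE key = 'RESOURCE_ID'
   LIMIT 1) AS resource_id,
  (SELECT value
   FROM UNNEST(jsonPayload.ai_security.content_metadata)
   WHERE key = 'AGENT_NAME'
   LIMIT 1) AS agent_name
FROM `YOUR_PROJECT.ai_security.ai_security`
WHERE jsonPayload.ai_security.event_type = 'VIOLATION';
```

Grant `roles/bigquery.dataViewer` on the view (or its dataset) instead of on the raw table to keep `content_raw` restricted.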
## Troubleshooting tips

- If queries are slow or expensive: add time filters, use partition pruning, cluster by high-cardinality fields, and avoid scanning `content_raw` unnecessarily.
- For reproducible investigations: capture the exact SQL and a result snapshot (e.g., export query output to Cloud Storage) to preserve context for later audits.