The `ai_security.ai_security` BigQuery table contains enriched records of AI security violations. These records include debug metadata, workflow context, raw content, and validation/model metadata.
You can use this table for various purposes, including:
- Root-cause analysis and triage
- Dashboards and automated alerts
- Offline machine learning or data aggregation work
Quick Investigation Checklist
Before you begin an investigation, ensure you have the following:
- Required data: the `event_id`, `Run ID`, or a specific timestamp range from the Findings dashboard.
- Permissions: The necessary bigquery.dataViewer and bigquery.jobUser permissions to access and query the data.
- Query Best Practices: Use saved views or pivoted queries to minimize manual errors.
- Data Security: Mask or redact any sensitive fields before sharing query results outside of the security team.
How to access the data
Required Privileges
To access and query the data, a user typically needs the following Identity and Access Management (IAM) roles:
- `roles/bigquery.dataViewer` — to read the table data.
- `roles/bigquery.jobUser` — to run queries.
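If you administer access, the two roles can be granted at the project level with `gcloud`. A minimal sketch; the project id and member email are placeholders:

```shell
# Grant read access to the table data (placeholder project/member).
gcloud projects add-iam-policy-binding YOUR_PROJECT \
  --member='user:analyst@example.com' \
  --role='roles/bigquery.dataViewer'

# Grant permission to run query jobs.
gcloud projects add-iam-policy-binding YOUR_PROJECT \
  --member='user:analyst@example.com' \
  --role='roles/bigquery.jobUser'
```

For tighter scoping, the same roles can instead be granted on the dataset rather than the whole project.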
GCP Console (web UI)
- Sign in to Google Cloud Console and select the project that owns the dataset.
- Open BigQuery → Explorer panel → find `your_project` → dataset `ai_security` → table `ai_security`.
- Click the table to view its schema, details, and preview rows.
- Use the Query editor to run SQL; review results in Table / JSON view; use the Visualization tab to build simple charts.
- Save frequently used diagnostic queries with the Console’s Saved Queries feature for reuse and sharing.
CLI (Cloud Shell or local with gcloud + bq)
- Authenticate and set the project:
- Run a simple query with `bq`:
- Output JSON for programmatic parsing:
- Save queries as stored procedures in BigQuery and call them from the CLI for automation.
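The CLI steps above can be sketched as follows. This assumes the `gcloud` and `bq` tools are installed and that `YOUR_PROJECT` is replaced with your actual project id:

```shell
# Authenticate and set the project.
gcloud auth login
gcloud config set project YOUR_PROJECT

# Run a simple query with bq (Standard SQL).
bq query --use_legacy_sql=false \
  'SELECT timestamp, jsonPayload.ai_security.event_id
   FROM `YOUR_PROJECT.ai_security.ai_security`
   ORDER BY timestamp DESC
   LIMIT 10'

# Output JSON for programmatic parsing.
bq --format=prettyjson query --use_legacy_sql=false \
  'SELECT COUNT(*) AS n FROM `YOUR_PROJECT.ai_security.ai_security`'
```

`--format=prettyjson` makes the output easy to pipe into tools such as `jq` for scripted triage.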
Relevant columns (top-level BigQuery fields)
These fields are present on each exported log row and give resource/ingestion context:
- `logName` — log stream name.
- `resource.type` — GCP resource type (e.g., `k8s_container`).
- `resource.labels.pod_name` — pod name (k8s).
- `resource.labels.location` — region/zone.
- `resource.labels.namespace_name` — k8s namespace.
- `resource.labels.cluster_name` — cluster name.
- `resource.labels.project_id` — GCP project id.
- `timestamp` — event timestamp (when it occurred).
- `receiveTimestamp` — when the log was ingested.
- `insertId` — unique insert id (for deduplication).
- `labels.commit_hash`, `labels.branch`, `labels.full_version` — build/version metadata.
Payload structure (jsonPayload.ai_security)
jsonPayload.ai_security contains a JSON representation of the AiSecurityLogEntry proto. Important fields and their meaning:
- `event_id` (string) — globally unique event id.
- `event_type` (enum) — e.g., `VIOLATION`.
- `event_description` (string) — human-readable description.
- `user_id` (string) — user that triggered the event.
- `session_info` — object with `tab_id` and `session_tracking_token`.
- `action` (enum) — enforcement taken (`BLOCK_REQUEST` / `ALLOW_REQUEST`).
- `content_raw` (string) — raw content that caused the event (user prompt or retrieved content).
- `content_metadata` (repeated `{key, value}`) — context keys such as:
  - `RESOURCE_NAME`, `RESOURCE_ID`, `RESOURCE_URL`
  - `AGENT_NAME`, `RUN_ID`, `CHAT_SESSION_ID`, `AGENT_ID`, `SOURCE`
- `validation_metadata` (repeated `{key, value}`) — model prediction / validation debugging key-values.
- Other context fields may exist — e.g., workflow entries, LLM call details, agent spans.
Common SQL patterns & examples
Notes: `UNNEST()` is used to flatten the repeated metadata arrays. Replace `YOUR_PROJECT` with your actual project id.
Get latest 100 violations with key content & resource id
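A sketch of this query, assuming the payload paths described above (`jsonPayload.ai_security.*`); adjust field names if your export schema differs:

```sql
-- Latest 100 violations with raw content and the RESOURCE_ID metadata key.
SELECT
  timestamp,
  jsonPayload.ai_security.event_id AS event_id,
  jsonPayload.ai_security.action AS action,
  jsonPayload.ai_security.content_raw AS content_raw,
  (SELECT m.value
   FROM UNNEST(jsonPayload.ai_security.content_metadata) AS m
   WHERE m.key = 'RESOURCE_ID') AS resource_id
FROM `YOUR_PROJECT.ai_security.ai_security`
WHERE jsonPayload.ai_security.event_type = 'VIOLATION'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
ORDER BY timestamp DESC
LIMIT 100;
```

The time filter keeps scanned bytes down; widen the interval only when needed.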
Extract a single row’s full JSON for deep debugging
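A sketch using `TO_JSON_STRING` to dump the entire row as pretty-printed JSON; `EVENT_ID_HERE` is a placeholder for the id taken from the Findings dashboard:

```sql
-- Full JSON of one row, for deep debugging of a single finding.
SELECT TO_JSON_STRING(t, true) AS full_row_json
FROM `YOUR_PROJECT.ai_security.ai_security` AS t
WHERE jsonPayload.ai_security.event_id = 'EVENT_ID_HERE'
LIMIT 1;
```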
Pivot content_metadata into columns (common keys)
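One way to sketch the pivot is a conditional-aggregation pattern over the unnested key/value pairs; the keys listed are the common ones named earlier in this document:

```sql
-- Pivot common content_metadata keys into one column per key.
SELECT
  jsonPayload.ai_security.event_id AS event_id,
  MAX(IF(m.key = 'RESOURCE_ID', m.value, NULL)) AS resource_id,
  MAX(IF(m.key = 'AGENT_NAME', m.value, NULL)) AS agent_name,
  MAX(IF(m.key = 'RUN_ID', m.value, NULL)) AS run_id,
  MAX(IF(m.key = 'CHAT_SESSION_ID', m.value, NULL)) AS chat_session_id
FROM `YOUR_PROJECT.ai_security.ai_security`,
  UNNEST(jsonPayload.ai_security.content_metadata) AS m
GROUP BY event_id;
```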
Count of violations by action type (daily)
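A sketch of the daily rollup, again assuming the payload paths shown above:

```sql
-- Daily violation counts broken down by enforcement action.
SELECT
  DATE(timestamp) AS day,
  jsonPayload.ai_security.action AS action,
  COUNT(*) AS violations
FROM `YOUR_PROJECT.ai_security.ai_security`
WHERE jsonPayload.ai_security.event_type = 'VIOLATION'
GROUP BY day, action
ORDER BY day DESC, violations DESC;
```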
Top validation metadata keys/values (for model debugging)
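A sketch that unnests `validation_metadata` and counts the most frequent key/value pairs, useful for spotting recurring model labels or confidence buckets:

```sql
-- Most frequent validation_metadata key/value pairs over the last 30 days.
SELECT
  v.key,
  v.value,
  COUNT(*) AS occurrences
FROM `YOUR_PROJECT.ai_security.ai_security`,
  UNNEST(jsonPayload.ai_security.validation_metadata) AS v
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY v.key, v.value
ORDER BY occurrences DESC
LIMIT 50;
```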
Useful investigative workflows
- Triage a single finding
  - Start with the `event_id` from the Findings dashboard or the `Run ID`.
  - Query the `event_id` in BigQuery to fetch `content_raw`, `content_metadata`, and `validation_metadata`.
  - Inspect LLM call traces / `llm_call` and `agent_span` fields (if present) to see prompt/response context.
- Find similar incidents
  - Use `content_metadata.RESOURCE_ID` or normalized `content_raw` hashes to group similar violations.
  - Search by `validation_metadata` keys (e.g., model label or confidence buckets) to identify common false positives.
- Root-cause skipped users / digest generation issues
  - Combine `digest` entries with workflow/compiler logs (`workflow`, `workflow_compiler`) in the exported fields to compare enqueue vs. execution differences.
- Automated daily rollups
  - Create scheduled queries that aggregate violations by agent, action, and resource; write the results to an `ai_security_reporting` dataset for dashboards.
Best practices & operational recommendations
- Minimize scanned bytes: project only the fields you need (avoid `SELECT *`). Use `UNNEST()` carefully and filter early.
- Protect PII: `content_raw` may contain sensitive user content. Limit access via IAM and consider creating sanitized views that mask or redact `content_raw` before sharing with wider teams.
- Stored procedures & saved queries: convert complex investigation SQL into stored procedures or saved queries for repeatable triage.
- Alerting: for high-severity events (e.g., many `BLOCK_REQUEST` actions in a short window), schedule queries that write to a metric table and tie it to Cloud Monitoring or a Cloud Function that publishes alerts.
- Retention & export: define retention in BigQuery, or set up periodic exports for offline ML if long-term analysis is required.
Example: create a reusable view for violation triage
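A sketch of such a view, assuming the payload paths described above and an existing `ai_security_reporting` dataset (both are placeholders to adapt):

```sql
-- Reusable triage view: one row per violation with common metadata pivoted out.
-- Note: content_raw is intentionally excluded to keep the view shareable;
-- add it back only for views restricted to the security team.
CREATE OR REPLACE VIEW `YOUR_PROJECT.ai_security_reporting.violation_triage` AS
SELECT
  timestamp,
  jsonPayload.ai_security.event_id AS event_id,
  jsonPayload.ai_security.action AS action,
  jsonPayload.ai_security.user_id AS user_id,
  (SELECT m.value
   FROM UNNEST(jsonPayload.ai_security.content_metadata) AS m
   WHERE m.key = 'RESOURCE_ID') AS resource_id,
  (SELECT m.value
   FROM UNNEST(jsonPayload.ai_security.content_metadata) AS m
   WHERE m.key = 'AGENT_NAME') AS agent_name,
  (SELECT m.value
   FROM UNNEST(jsonPayload.ai_security.content_metadata) AS m
   WHERE m.key = 'RUN_ID') AS run_id
FROM `YOUR_PROJECT.ai_security.ai_security`
WHERE jsonPayload.ai_security.event_type = 'VIOLATION';
```

Querying the view instead of the raw table keeps triage SQL short and avoids scanning `content_raw` by default.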
Troubleshooting tips
- If queries are slow or expensive: add time filters, use partition pruning, cluster by high-cardinality fields, and avoid scanning `content_raw` unnecessarily.
- For reproducible investigations: capture the exact SQL and a result snapshot (e.g., export query output to Cloud Storage) to preserve context for later audits.