SQL queries in Braintrust provide a precise, standard syntax for querying Braintrust logs, experiments, and datasets. Use SQL to:
Filter and search for relevant logs and experiments. Use WHERE clauses to filter individual records and HAVING clauses to filter aggregated results after grouping.
Create consistent, reusable queries for monitoring.
Build automated reporting and analysis pipelines.
Write complex queries to analyze model performance.
Braintrust supports two syntax styles: standard SQL syntax, and the legacy BTQL syntax with pipe-delimited clauses. The parser automatically detects which style you’re using. SQL syntax is recommended for all new queries.
To test SQL with autocomplete, validation, and a table of results, use the SQL sandbox in your project. In the sandbox, you can use Loop to generate and optimize queries from natural language. Example queries:
“Find the most common errors in logs over the last week”
“What are the highest scoring rows in my experiment”
“Show me error distribution over time”
“List all traces where latency exceeded 60 seconds”
Loop automatically populates the sandbox with the generated query, runs it, and provides a text summary of the results along with suggestions for additional queries. Once you have a query in the sandbox, ask Loop to refine it:
“Update the query to show error distribution over time”
“Add a filter to only show errors from specific models”
“Group by user instead”
When your query has errors, Loop can help fix them. Select Fix with Loop next to the error message in the sandbox. Loop analyzes the issue type and context to provide targeted fixes for:
Syntax errors
Schema validation issues
Field name corrections
If a project_logs() query is missing a range filter on created, _xact_id, _pagination_key, or a specific root_span_id/id, the sandbox proactively warns you so you don’t have to wait for a timeout to discover the issue.
Access SQL programmatically with the Braintrust API:
```bash
curl -X POST https://api.braintrust.dev/btql \
  -H "Authorization: Bearer <YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"query": "SELECT * FROM project_logs('"'<YOUR_PROJECT_ID>'"') WHERE tags INCLUDES '"'triage'"' AND created > now() - interval 7 day"}'
```
The API accepts these parameters:
query (required): your SQL query string
fmt: response format (json or parquet, defaults to json)
tz_offset: timezone offset in minutes for time-based operations
audit_log: include audit log data
version: an _xact_id string to query data as it existed at a specific point in time (useful for recovering deleted rows). Supported for experiment and dataset sources; not supported for project_logs.
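The curl call above can also be mirrored in Python. This is a sketch using only the standard library; `btql_request` is a hypothetical helper name, and only the `query`, `fmt`, and `tz_offset` parameters are wired up (the `audit_log` and `version` parameters would follow the same pattern).

```python
import json
import urllib.request

def btql_request(api_key, query, fmt="json", tz_offset=None):
    """Build (but do not send) a POST request for the /btql endpoint.

    Sketch only: error handling and the audit_log/version parameters
    are omitted.
    """
    body = {"query": query, "fmt": fmt}
    if tz_offset is not None:
        body["tz_offset"] = tz_offset
    return urllib.request.Request(
        "https://api.braintrust.dev/btql",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending it would be: urllib.request.urlopen(btql_request(key, query))
```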
For correct day boundaries, set tz_offset to match your timezone. For example, use 480 for US Pacific Standard Time.
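Assuming tz_offset is the number of minutes behind UTC (the 480-for-Pacific example implies this sign convention), you can derive it from an IANA timezone name. `tz_offset_minutes` is a hypothetical helper, not part of any Braintrust SDK:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def tz_offset_minutes(tz_name: str) -> int:
    """Minutes behind UTC for the given IANA timezone, matching the
    sign convention of the Pacific example (UTC-8 -> 480)."""
    offset = datetime.now(ZoneInfo(tz_name)).utcoffset()
    return int(-offset.total_seconds() // 60)
```

Note that zones with daylight saving time shift by 60 minutes seasonally (e.g. US Pacific is 480 in winter and 420 in summer).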
SQL queries follow a familiar structure that lets you define what data you want, how you want it returned, and how to analyze it. This example returns logs from the last 7 days from a project where Factuality is greater than 0.8, sorts by created date descending, and limits the results to 100.

```sql
SELECT *
FROM project_logs('<PROJECT_ID>', shape => 'spans')
WHERE scores.Factuality > 0.8
  AND created > now() - interval 7 day
ORDER BY created DESC
LIMIT 100
```
`SELECT` / `select:`: choose which fields to retrieve
`FROM` / `from:`: specify the data source. Takes an optional designator for the shape of the data: `spans`, `traces`, or `summary`. Defaults to `spans` if not specified.
`WHERE` / `filter:`: define conditions to filter the data
`GROUP BY` / `dimensions:`: group rows for aggregation
`HAVING` / `final_filter:`: filter aggregated results
`sample:` (BTQL-only): randomly sample a subset of the filtered data (rate- or count-based)
`ORDER BY` / `sort:`: set the order of results (`ASC`/`DESC` or `asc`/`desc`)
`LIMIT` / `limit:`: control result size
`OFFSET '<CURSOR_TOKEN>'` / `cursor:`: pagination token from the previous query response (SQL `OFFSET` must be a string literal cursor token; use with cursor-compatible sorts such as `_pagination_key` or `_xact_id`)
Always include a range filter (created, _xact_id, or _pagination_key) or scope to a specific root_span_id/id in every project_logs() query. Without one, queries scan your entire project history and will be slow or time out on large datasets.
Braintrust also supports BTQL, an alternative pipe-delimited clause syntax. The parser automatically detects whether your query is SQL or BTQL:
SQL queries start with SELECT, WITH, etc. followed by whitespace
BTQL queries use clause syntax like select:, filter:, etc.
| SQL Clause | BTQL Clause |
| --- | --- |
| `SELECT ...` | `select: ...` |
| `FROM table('id', shape => 'traces')` | `from: table('id') traces` |
| `WHERE ...` | `filter: ...` |
| `GROUP BY ...` | `dimensions: ...` |
| `HAVING ...` | `final_filter: ...` |
| `ORDER BY ...` | `sort: ...` |
| `LIMIT n` | `limit: n` |
| `OFFSET '<CURSOR_TOKEN>'` | `cursor: '<CURSOR_TOKEN>'` |
SQL syntax specifies the shape with a named parameter (e.g., FROM experiment('id', shape => 'traces')), while BTQL uses a trailing token (e.g., from: experiment('id') traces). Table aliases (e.g., AS t) are reserved for future use.
Full-text search: Use the MATCH infix operator for full-text search:
WHERE input MATCH 'search term' → filter: input MATCH 'search term'
Multiple columns require OR: WHERE input MATCH 'x' OR output MATCH 'x' → filter: input MATCH 'x' OR output MATCH 'x'
Unsupported SQL features: The SQL parser does not support JOIN, subqueries, UNION/INTERSECT/EXCEPT, or window functions. For PIVOT, only IN (ANY) is supported (explicit value lists, subqueries, and ORDER BY are not supported). Use BTQL’s native syntax for queries that would require these features.
When retrieving records with SQL, you can use either a plain SELECT or SELECT ... GROUP BY. Most tools work with either method, but you must use GROUP BY if you want to use aggregate functions. Both retrieval methods work with all data shapes (spans, traces, and summary). Using GROUP BY with the summary shape enables trace-level aggregations.
SELECT in SQL lets you choose specific fields, compute values, or use * to retrieve every field.
```sql
-- Get specific fields
SELECT
  metadata.model AS model,
  scores.Factuality AS score,
  created AS timestamp
FROM project_logs('my-project-id')
WHERE created > now() - interval 7 day
```
Implicit aliasing: Multi-part identifiers like metadata.model automatically create implicit aliases using their last component (e.g., model), which you can use in WHERE, ORDER BY, and GROUP BY clauses when unambiguous. See Field access for details.
SQL allows you to transform data directly in the SELECT clause. This query returns metadata.model, whether metrics.tokens is greater than 1000, and a quality indicator of either “high” or “low” depending on whether or not the Factuality score is greater than 0.8.
```sql
SELECT
  -- Simple field access
  metadata.model,
  -- Computed values
  metrics.tokens > 1000 AS is_long_response,
  -- Conditional logic
  (scores.Factuality > 0.8 ? "high" : "low") AS quality
FROM project_logs('my-project-id')
WHERE created > now() - interval 7 day
```
You can also use functions in the SELECT clause to transform values and create meaningful aliases for your results. This query extracts the day the log was created, the hour, and a Factuality score rounded to 2 decimal places.
```sql
SELECT
  -- Date/time functions
  day(created) AS date,
  hour(created) AS hour,
  -- Numeric calculations
  round(scores.Factuality, 2) AS rounded_score
FROM project_logs('my-project-id')
WHERE created > now() - interval 7 day
```
Instead of a simple SELECT, you can use SELECT ... GROUP BY to group and aggregate data. This query returns a row for each distinct model and day, with the total number of calls, the average Factuality score, and the 95th percentile latency.

```sql
-- Analyze model performance over time
SELECT
  metadata.model AS model,
  day(created) AS date,
  count(1) AS total_calls,
  avg(scores.Factuality) AS avg_score,
  percentile(latency, 0.95) AS p95_latency
FROM project_logs('my-project-id')
WHERE created > now() - interval 7 day
GROUP BY metadata.model, day(created)
```
The available aggregate functions are:
count(expr): number of rows
count_distinct(expr): number of distinct values
sum(expr): sum of numeric values
avg(expr): mean (average) of numeric values
min(expr): minimum value
max(expr): maximum value
any_value(expr): an arbitrary non-null value from the group for the given expression
percentile(expr, p): a percentile where p is between 0 and 1
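To make the `p` convention concrete, a nearest-rank percentile over a list behaves like the sketch below. This is illustrative only: the exact interpolation method Braintrust uses isn't specified here, so edge results may differ slightly.

```python
def percentile(values, p):
    """Nearest-rank percentile with p between 0 and 1 (illustrative)."""
    xs = sorted(values)
    if not xs:
        return None
    # p = 0 -> minimum, p = 1 -> maximum, p = 0.5 -> median-ish
    k = round(p * (len(xs) - 1))
    return xs[k]
```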
LIMIT works with GROUP BY queries to restrict the number of grouped results returned. When combined with ORDER BY, rows are sorted before limiting. See LIMIT for examples.
HAVING filters the results after aggregation, letting you narrow down grouped data based on aggregate values. Use HAVING with GROUP BY when you need to filter by aggregated metrics like counts, averages, or sums. This query returns models with high average scores:

```sql
-- Find models with strong performance
SELECT
  metadata.model AS model,
  avg(scores.Factuality) AS avg_score,
  count(1) AS total_calls
FROM project_logs('my-project-id')
WHERE created > now() - interval 7 day
GROUP BY metadata.model
HAVING avg(scores.Factuality) > 0.8
  AND count(1) > 100
ORDER BY avg_score DESC
```
You can combine WHERE and HAVING to filter both before and after aggregation. This query filters individual logs before grouping, then filters the aggregated results:
```sql
-- Track high-performing production models
SELECT
  metadata.model AS model,
  day(created) AS date,
  avg(scores.Factuality) AS avg_score,
  count(1) AS call_count
FROM project_logs('my-project-id')
WHERE metadata.is_production = true
  AND created > now() - interval 7 day
GROUP BY metadata.model, day(created)
HAVING avg(scores.Factuality) > 0.7
  AND count(1) >= 10
ORDER BY date DESC, avg_score DESC
```
HAVING supports the same operators and aggregate functions as other clauses. You can reference aggregated values by their alias or by repeating the aggregate expression.
The FROM clause identifies where the records come from. This can be an identifier like project_logs or a function call like experiment("id"). You can add an optional shape parameter to the FROM clause that defines how the data is returned. The options are spans (default), traces, and summary.
spans returns individual spans that match the filter criteria. This example returns 10 LLM call spans that took more than 0.1 seconds to produce the first token.

```sql
SELECT *
FROM project_logs('my-project-id', shape => 'spans')
WHERE span_attributes.type = 'llm'
  AND metrics.time_to_first_token > 0.1
  AND created > now() - interval 7 day
LIMIT 10
```
The response is an array of spans. Check out the Extend traces page for more details on span structure.
traces returns all spans from traces that contain at least one matching span. This is useful when you want to see the full context of a specific event or behavior, for example if you want to see all spans in traces where an error occurred. This example returns all spans for a specific trace where one span in the trace had an error.

```sql
SELECT *
FROM project_logs('my-project-id', shape => 'traces')
WHERE root_span_id = 'trace-id'
  AND error IS NOT NULL
```
The response is an array of spans. Check out the Extend traces page for more details on span structure.
summary provides trace-level views of your data by aggregating metrics across all spans in a trace. This shape is useful for analyzing overall performance and comparing results across experiments. The summary shape can be used in two ways:
Individual trace summaries (using SELECT): Returns one row per trace with aggregated span metrics. Use this to see trace-level details. Example: “What are the details of traces with errors?”
Aggregated trace analytics (using GROUP BY): Groups multiple traces and computes statistics. Use this to analyze patterns across many traces. Example: “What’s the average cost per model per day?”
Use SELECT with the summary shape to retrieve individual traces with aggregated metrics. This is useful for inspecting specific trace details, debugging issues, or exporting trace-level data. This example returns 10 summary rows from the project logs for ‘my-project-id’:
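A query matching that description might look like the following sketch (the created filter is included per the range-filter guidance for project_logs() queries):

```sql
SELECT *
FROM project_logs('my-project-id', shape => 'summary')
WHERE created > now() - interval 7 day
LIMIT 10
```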
Summary rows include some aggregated metrics and some preview fields that show data from the root span of the trace. The following fields are aggregated metrics across all spans in the trace.
scores: an object with all scores averaged across all spans
metrics: an object with aggregated metrics across all spans
prompt_tokens: total number of prompt tokens used
completion_tokens: total number of completion tokens used
prompt_cached_tokens: total number of cached prompt tokens used
prompt_cache_creation_tokens: total number of tokens used to create cache entries
total_tokens: total number of tokens used (prompt + completion)
estimated_cost: total estimated cost of the trace in US dollars (prompt + completion costs)
llm_calls: total number of LLM calls
tool_calls: total number of tool calls
errors: total number of errors (LLM + tool errors)
llm_errors: total number of LLM errors
tool_errors: total number of tool errors
start: Unix timestamp of the first span start time
end: Unix timestamp of the last span end time
duration: maximum duration of any span in seconds. Note: this is not the total trace duration.
llm_duration: sum of all durations across LLM spans in seconds
time_to_first_token: the average time to first token across LLM spans in seconds
span_type_info: an object with span type info. Some fields in this object are aggregated across all spans and some reflect attributes from the root span.
cached: true only if all LLM spans were cached
has_error: true if any span had an error
Root span preview fields include input, output, expected, error, and metadata.
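As a mental model, a few of the aggregations above can be sketched over a list of span dicts. This is illustrative Python following the field semantics listed (e.g. duration is the maximum single-span duration, not total trace time), not Braintrust's implementation:

```python
def summarize_trace(spans):
    """Aggregate a subset of summary-shape fields from span dicts (sketch)."""
    metrics = [s.get("metrics", {}) for s in spans]
    return {
        "total_tokens": sum(m.get("total_tokens", 0) for m in metrics),
        "errors": sum(1 for s in spans if s.get("error") is not None),
        # Max duration of any single span, not end - start for the trace
        "duration": max((m.get("duration", 0) for m in metrics), default=0),
        "start": min((m["start"] for m in metrics if "start" in m), default=None),
        "end": max((m["end"] for m in metrics if "end" in m), default=None),
        "has_error": any(s.get("error") is not None for s in spans),
    }
```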
Use GROUP BY with the summary shape to group and aggregate traces. This is useful for analyzing patterns, monitoring performance trends, and comparing metrics across models or time periods. This example shows how to group traces by model to track performance over time:

```sql
SELECT
  metadata.model AS model,
  day(created) AS date,
  count(1) AS trace_count,
  avg(scores.Factuality) AS avg_score,
  avg(metrics.estimated_cost) AS avg_cost
FROM project_logs('my-project-id', shape => 'summary')
WHERE created > now() - interval 7 day
GROUP BY 1, 2
ORDER BY date DESC
```
The WHERE clause lets you specify conditions to narrow down results. It supports a wide range of operators and functions, including complex conditions. This example WHERE clause only retrieves data where:
Factuality score is greater than 0.8
model is “gpt-4”
tag list includes “triage”
input contains the word “question” (case-insensitive)
created date is later than January 1, 2024
more than 1000 tokens were used or the data being traced was made in production
```sql
SELECT *
FROM project_logs('my-project-id')
WHERE
  -- Simple comparisons
  scores.Factuality > 0.8
  AND metadata.model = 'gpt-4'
  -- Array operations
  AND tags INCLUDES 'triage'
  -- Text search (case-insensitive)
  AND input ILIKE '%question%'
  -- Date ranges
  AND created > '2024-01-01'
  -- Complex conditions
  AND (
    metrics.tokens > 1000
    OR metadata.is_production = true
  )
```
By default, each returned trace includes at least one span that matches all filter conditions. Use ANY_SPAN() to find traces where different spans match different conditions. This returns complete traces where at least one span matches the specified condition. Single span filters work with the traces and summary data shapes.
Wrap any filter expression with ANY_SPAN() to mark it as a single span filter. This returns complete traces where at least one span matches the condition. This WHERE example returns traces where one span has an error and another span is an LLM call.
WHERE ANY_SPAN(error IS NOT NULL) AND ANY_SPAN(span_attributes.type = 'llm')
By default, ANY_SPAN() matches against all spans in a trace. To restrict matching to only root spans, add is_root to the condition: ANY_SPAN(is_root AND error IS NOT NULL).
While ANY_SPAN() helps you find traces you care about, matching spans filters let you filter spans within the traces you’ve already found. Use FILTER_SPANS() to return only specific spans that match your criteria, rather than entire traces. This is analogous to trace_filter in BTQL. Matching spans filters work with the traces and summary data shapes. On the spans shape, FILTER_SPANS() acts as a no-op wrapper.
Use FILTER_SPANS() to return only the spans that match the filter condition from within traces. While ANY_SPAN() finds traces where at least one span matches, FILTER_SPANS() returns only the matching spans from those traces.
```sql
-- Return only score spans from traces
WHERE FILTER_SPANS(span_attributes.type = 'score')

-- Return only spans where Factuality equals 1
WHERE FILTER_SPANS(scores.Factuality = 1)
```
You can combine FILTER_SPANS() with other filter conditions:
```sql
-- Return score spans from traces with duration over 1 second
WHERE FILTER_SPANS(span_attributes.type = 'score')
  AND metrics.duration > 1
```
SQL supports the % wildcard for pattern matching with LIKE (case-sensitive) and ILIKE (case-insensitive). The % wildcard matches any sequence of zero or more characters.

```sql
-- Match any input containing "question"
WHERE input ILIKE '%question%'

-- Match inputs starting with "How"
WHERE input LIKE 'How%'

-- Match emails ending with specific domains
WHERE metadata.email ILIKE '%@braintrust.com'

-- Escape literal wildcards with backslash
WHERE metadata.description LIKE '%50\% off%' -- Matches "50% off"
```
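To sanity-check a LIKE pattern locally before running a query, you can translate it to a regular expression. `like_to_regex` is a hypothetical helper (not part of Braintrust) that handles the % wildcard and backslash escapes described above:

```python
import re

def like_to_regex(pattern, case_insensitive=False):
    """Translate a SQL LIKE pattern (% wildcard, backslash escape) to a regex."""
    parts = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "\\" and i + 1 < len(pattern):
            # Escaped character matches literally (e.g. \% matches "%")
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        parts.append(".*" if c == "%" else re.escape(c))
        i += 1
    flags = re.IGNORECASE if case_insensitive else 0
    return re.compile("^" + "".join(parts) + "$", flags)
```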
SQL supports intervals for time-based operations. This query returns all project logs from ‘my-project-id’ that were created in the last day.

```sql
SELECT *
FROM project_logs('my-project-id')
WHERE created > now() - interval 1 day
```
This query returns all project logs from ‘my-project-id’ that were created within the last hour.

```sql
SELECT *
FROM project_logs('my-project-id')
WHERE created > now() - interval 1 hour
  AND created < now()
```
This query returns all project logs from ‘my-project-id’ that were created within the last month but more than a week ago.

```sql
-- Examples with different units
SELECT *
FROM project_logs('my-project-id')
WHERE created < now() - interval 7 day   -- More than a week ago
  AND created > now() - interval 1 month -- Within the last month
```
The ORDER BY clause determines the order of results, either DESC (descending) or ASC (ascending). You can sort by a single field, multiple fields, or computed values.

```sql
-- Sort by single field
ORDER BY created DESC

-- Sort by multiple fields
ORDER BY scores.Factuality DESC, created ASC

-- Sort by computed values
ORDER BY len(tags) DESC
```
PIVOT and UNPIVOT are advanced operations that transform your results for easier analysis and comparison. Both SQL and BTQL syntax support these operations.
PIVOT transforms rows into columns, which makes comparisons easier by creating a column for each distinct value. This is useful when comparing metrics across different categories, models, or time periods. Structure:

```sql
SELECT <non-pivoted columns>, <pivoted columns>
FROM <table>
PIVOT(
  <AggregateFunction>(<ColumnToBeAggregated>)
  FOR <PivotColumn> IN (ANY)
)
```
Requirements:
The pivot column must be a single identifier (e.g., metadata.model)
Must include at least one aggregate measure (e.g., SUM(value), AVG(score))
Only IN (ANY) is supported (explicit value lists, subqueries, ORDER BY, and DEFAULT ON NULL are not supported)
SELECT list must include the pivot column, all measures, and all GROUP BY columns (or use SELECT *)
Pivot columns are automatically named by combining the pivot value and measure name. For example, if you pivot metadata.model with a model named “gpt-4” for measure avg_score, the column becomes gpt-4_avg_score. When using aliases, the alias replaces the measure name in the output column. Single aggregate - pivot one metric across categories:

```sql
-- Compare total score values across all scores
SELECT score, SUM(value)
FROM project_logs('my-project-id')
UNPIVOT (value FOR score IN (scores))
PIVOT(SUM(value) FOR score IN (ANY))
WHERE created > now() - interval 7 day
```
Multiple aggregates - pivot multiple metrics at once:
```sql
-- Compare model performance metrics across models
SELECT
  day(created) AS date,
  metadata.model,
  AVG(scores.Factuality),
  COUNT(1)
FROM project_logs('my-project-id')
WHERE metadata.model IN ('gpt-4', 'gpt-3.5-turbo')
  AND created > now() - interval 7 day
GROUP BY day(created), metadata.model
PIVOT(
  AVG(scores.Factuality),
  COUNT(1)
  FOR metadata.model IN (ANY)
)
```
With aliases - name your pivoted columns:
```sql
-- Use custom column names
SELECT score, total
FROM project_logs('my-project-id')
UNPIVOT (value FOR score IN (scores))
PIVOT(SUM(value) AS total FOR score IN (ANY))
WHERE created > now() - interval 7 day
```
With grouping - combine PIVOT with GROUP BY for multi-dimensional analysis:
```sql
-- Analyze scores by both model and date
SELECT
  day(created) AS date,
  metadata.model,
  AVG(scores.Factuality)
FROM project_logs('my-project-id')
WHERE created > now() - interval 7 day
GROUP BY day(created), metadata.model
PIVOT(AVG(scores.Factuality) FOR metadata.model IN (ANY))
```
Using SELECT * - automatically includes all required columns:
```sql
-- SELECT * includes pivot column, measures, and GROUP BY columns automatically
SELECT *
FROM project_logs('my-project-id')
UNPIVOT (value FOR score IN (scores))
PIVOT(SUM(value) FOR score IN (ANY))
WHERE created > now() - interval 7 day
```
UNPIVOT transforms columns into rows, which is useful when you need to analyze arbitrary scores and metrics without specifying each field name in advance. This is helpful when working with dynamic sets of metrics or when you want to normalize data for aggregation. Key-value unpivot - transforms an object into rows with key-value pairs:

```sql
-- Convert scores object into rows with score names and values
SELECT id, score, value
FROM project_logs('my-project-id')
UNPIVOT (value FOR score IN (scores))
WHERE created > now() - interval 7 day
```
When using key-value unpivot, the source column must be an object (e.g., scores). When using array unpivot with _, the source column must be an array (e.g., tags).
Array unpivot - expands arrays by using _ as the name column:
```sql
-- Expand tags array into individual rows
SELECT id, tag
FROM project_logs('my-project-id')
UNPIVOT (tag FOR _ IN (tags))
WHERE created > now() - interval 7 day
```
Array of objects unpivot - expands arrays of objects and allows accessing nested fields:
```sql
-- Expand classifications array and group by nested field
SELECT classification.id, COUNT(1) AS count
FROM project_logs('my-project-id')
UNPIVOT (classification FOR _ IN (classifications.Issues))
WHERE created >= NOW() - INTERVAL 3 DAY
GROUP BY classification.id
```
This pattern is useful for analyzing classifications where each log may have multiple topic classifications, and you want to aggregate by specific properties of those classifications. Multiple unpivots - chain multiple UNPIVOT operations to expand multiple columns:

```sql
-- Expand both scores and tags
SELECT score, tag
FROM project_logs('my-project-id')
UNPIVOT (value FOR score IN (scores))
UNPIVOT (tag FOR _ IN (tags))
WHERE created > now() - interval 7 day
```
With aggregations - use UNPIVOT with GROUP BY to aggregate across unpivoted rows:
```sql
-- Calculate average value for each score across all logs
SELECT
  score,
  AVG(value) AS avg_value,
  COUNT(1) AS count
FROM project_logs('my-project-id')
UNPIVOT (value FOR score IN (scores))
WHERE created > now() - interval 7 day
GROUP BY score
ORDER BY avg_value DESC
```
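Key-value unpivot behaves like the pure-Python sketch below: each key of the source object becomes its own row, with the remaining columns copied through. `unpivot_object` is an illustrative helper, not part of any Braintrust SDK:

```python
def unpivot_object(rows, source, key_col, value_col):
    """Expand an object column into one row per (key, value) pair,
    mirroring UNPIVOT (value FOR score IN (scores))."""
    out = []
    for row in rows:
        for k, v in (row.get(source) or {}).items():
            new = {c: row[c] for c in row if c != source}
            new[key_col] = k
            new[value_col] = v
            out.append(new)
    return out
```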
The LIMIT clause controls the size of the result in number of records.
```sql
-- Basic limit
LIMIT 100
```
When using LIMIT with GROUP BY, it restricts the number of grouped results returned. This is useful for getting top-N results after aggregation. When combined with ORDER BY, rows are sorted before limiting.
```sql
-- Get top 5 models by usage
SELECT
  metadata.model AS model,
  count(1) AS total_calls
FROM project_logs('my-project-id')
WHERE created > now() - interval 7 day
GROUP BY metadata.model
ORDER BY total_calls DESC
LIMIT 5
```
```sql
-- Limit aggregated results without sorting
SELECT
  metadata.category AS category,
  avg(metrics.duration) AS avg_duration
FROM project_logs('my-project-id')
WHERE created > now() - interval 7 day
GROUP BY metadata.category
LIMIT 10
```
Cursors are supported in both SQL and BTQL queries. Cursors are automatically returned in query responses. After an initial query, pass the returned cursor token in the follow-on query. When a cursor has reached the end of the result set, the data array will be empty, and no cursor token will be returned. In SQL syntax, pass cursor tokens using OFFSET '<CURSOR_TOKEN>'. Numeric offsets are not supported. For cursor pagination, use cursor-compatible sorts such as _pagination_key (recommended) or _xact_id.
```sql
-- Pagination with cursor token

-- Page 1 (first 100 results)
SELECT *
FROM project_logs('<PROJECT_ID>')
WHERE created > now() - interval 7 day
ORDER BY _pagination_key DESC
LIMIT 100

-- Page 2 (next 100 results)
SELECT *
FROM project_logs('<PROJECT_ID>')
WHERE created > now() - interval 7 day
ORDER BY _pagination_key DESC
LIMIT 100
OFFSET '<CURSOR_TOKEN>' -- From previous query response
```
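The cursor loop generalizes to any client. In this sketch, `fetch_page(cursor)` stands in for whatever issues the query (e.g. a POST to /btql with the cursor spliced into OFFSET) and is assumed to return a dict with a `data` list and an optional `cursor` token, mirroring the response behavior described above:

```python
def paginate(fetch_page):
    """Yield rows across pages until the data array comes back empty
    or no cursor token is returned."""
    cursor = None
    while True:
        resp = fetch_page(cursor)
        rows = resp.get("data", [])
        if not rows:
            break
        yield from rows
        cursor = resp.get("cursor")
        if cursor is None:
            break
```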
You can use the following operators in your SQL queries.
```sql
-- Comparison operators
=   -- Equal to (alias for 'eq')
!=  -- Not equal to (alias for 'ne', can also use '<>')
>   -- Greater than (alias for 'gt')
<   -- Less than (alias for 'lt')
>=  -- Greater than or equal (alias for 'ge')
<=  -- Less than or equal (alias for 'le')
IN  -- Check if value exists in a list of values

-- Null operators
IS NULL      -- Check if value is null
IS NOT NULL  -- Check if value is not null
ISNULL       -- Unary operator to check if null
ISNOTNULL    -- Unary operator to check if not null

-- Text matching
LIKE       -- Case-sensitive pattern matching (supports % wildcard)
NOT LIKE   -- Negated case-sensitive pattern matching
ILIKE      -- Case-insensitive pattern matching (supports % wildcard)
NOT ILIKE  -- Negated case-insensitive pattern matching
MATCH      -- Full-word semantic search (faster but requires exact word matches, e.g. 'apple' won't match 'app')
NOT MATCH  -- Negated full-word semantic search

-- Array operators
INCLUDES      -- Check if array/object contains value (alias: CONTAINS)
NOT INCLUDES  -- Check if array/object does not contain value

-- Logical operators
AND  -- Both conditions must be true
OR   -- Either condition must be true
NOT  -- Unary operator to negate condition

-- Arithmetic operators
+   -- Addition (alias: add)
-   -- Subtraction (alias: sub)
*   -- Multiplication (alias: mul)
/   -- Division (alias: div)
%   -- Modulo (alias: mod)
-x  -- Unary negation (alias: neg)
```
You can use the following functions in SELECT, WHERE, GROUP BY clauses, and aggregate measures.
```sql
-- Date/time functions
second(timestamp)  -- Extract second from timestamp
minute(timestamp)  -- Extract minute from timestamp
hour(timestamp)    -- Extract hour from timestamp
day(timestamp)     -- Extract day from timestamp
week(timestamp)    -- Extract week from timestamp
month(timestamp)   -- Extract month from timestamp
year(timestamp)    -- Extract year from timestamp
date_trunc(interval, timestamp) -- Truncate timestamp to specified interval
                                -- Intervals: 'second', 'minute', 'hour', 'day', 'week', 'month', 'year'
current_timestamp() -- Get current timestamp (alias: now())
current_date()      -- Get current date

-- String functions
lower(text) -- Convert text to lowercase
upper(text) -- Convert text to uppercase
concat(text1, text2, ...) -- Concatenate strings

-- Array functions
len(array)             -- Get length of array
contains(array, value) -- Check if array contains value (alias: includes)

-- JSON functions
json_extract(json_str, path) -- Extract value from JSON string using a path expression

-- Null handling functions
coalesce(val1, val2, ...) -- Return first non-null value
nullif(val1, val2)        -- Return null if val1 equals val2
least(val1, val2, ...)    -- Return smallest non-null value
greatest(val1, val2, ...) -- Return largest non-null value

-- Type conversion
round(number, precision) -- Round to specified precision

-- Cast functions
to_string(value)   -- Cast value to string
to_boolean(value)  -- Cast value to boolean
to_integer(value)  -- Cast value to integer
to_number(value)   -- Cast value to number
to_date(value)     -- Cast value to date
to_datetime(value) -- Cast value to datetime
to_interval(value) -- Cast value to interval

-- Aggregate functions (only in measures/with GROUP BY)
count(expr)          -- Count number of rows
count_distinct(expr) -- Count number of distinct values
sum(expr)            -- Sum numeric values
avg(expr)            -- Calculate mean of numeric values
min(expr)            -- Find minimum value
max(expr)            -- Find maximum value
percentile(expr, p)  -- Calculate percentile (p between 0 and 1)
```
SQL provides flexible ways to access nested data in arrays and objects:
```sql
-- Object field access
metadata.model        -- Access nested object field e.g. {"metadata": {"model": "value"}}
metadata."field name" -- Access field with spaces e.g. {"metadata": {"field name": "value"}}
metadata."field-name" -- Access field with hyphens e.g. {"metadata": {"field-name": "value"}}
metadata."field.name" -- Access field with dots e.g. {"metadata": {"field.name": "value"}}

-- Array access (0-based indexing)
tags[0]  -- First element
tags[-1] -- Last element

-- Combined array and object access
metadata.models[0].name  -- Field in first array element
responses[-1].tokens     -- Field in last array element
spans[0].children[-1].id -- Nested array traversal

-- Facets access
facets.task           -- Task facet value
facets.sentiment      -- Sentiment facet value
facets."custom-facet" -- Custom facet (quotes for hyphens)

-- Classifications access (array of objects)
classifications.Task[0].label             -- Classification label
classifications.Task[0].metadata.distance -- Distance metric
classifications.Sentiment[0].label        -- Sentiment classification
```
Array indices are 0-based, and negative indices count from the end (-1 is the last element).
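The access semantics map directly onto Python's own indexing, where string keys index objects and integers (including negative ones) index arrays. `get_path` is an illustrative helper for walking a record the same way:

```python
def get_path(record, *keys):
    """Follow a mix of object keys (str) and array indices (int);
    return None if any step is missing, like a null field in SQL."""
    cur = record
    for k in keys:
        try:
            cur = cur[k]
        except (KeyError, IndexError, TypeError):
            return None
    return cur
```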
When you have JSON data stored as a string field (rather than as native SQL objects), use the json_extract function to access values within it. The path parameter is treated as a literal string key name:
```sql
-- Extract from JSON string fields
json_extract(metadata.config, 'api_key') -- Extract the "api_key" field
json_extract(metadata.config, 'user_id') -- Extract the "user_id" field
json_extract(output, 'result')           -- Extract the "result" field
```
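Conceptually this is "parse the string, then read one key", as in the Python analogue below. It's a sketch of the literal-key behavior described above; how the real function handles malformed JSON or deeper paths isn't specified here:

```python
import json

def json_extract(json_str, key):
    """Parse a JSON string and pull one top-level key; None if the
    string isn't valid JSON or the key is absent (illustrative)."""
    try:
        doc = json.loads(json_str)
    except (TypeError, ValueError):
        return None
    return doc.get(key) if isinstance(doc, dict) else None
```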
When you reference multi-part identifiers (e.g., metadata.category), SQL automatically creates an implicit alias using the last component of the path (e.g., category). This allows you to use the short form in your queries when unambiguous.
```sql
-- Use short form in WHERE clause
SELECT metadata.category, metadata.model
FROM project_logs('my-project-id')
WHERE category = 'support'
  AND model = 'gpt-4'
  AND created > now() - interval 7 day

-- Use short form in ORDER BY
SELECT metadata.user.name
FROM project_logs('my-project-id')
WHERE created > now() - interval 7 day
ORDER BY name

-- Use short form in GROUP BY
SELECT metadata.model, COUNT(*) AS cnt
FROM project_logs('my-project-id')
WHERE created > now() - interval 7 day
GROUP BY model
```
Important notes about implicit aliasing:
Ambiguity prevention: If multiple fields share the same last component (e.g., metadata.name and user.name), the short form name becomes ambiguous and cannot be used. You must use the full path instead.
Top-level field priority: Top-level fields take precedence over nested fields. If you have both id and metadata.id, the short form id refers to the top-level field.
Explicit aliases override: When you provide an explicit alias (e.g., metadata.category AS cat), the implicit alias is disabled and you must use either the explicit alias or the full path.
Duplicate alias detection: SQL will detect and reject queries with duplicate aliases in the SELECT list, whether explicit or implicit. For example, SELECT id, user.number AS id will raise an error.
Examples of ambiguous references:
```sql
-- ERROR: Cannot use 'name' - ambiguous between two fields
SELECT metadata.name, user.name
FROM project_logs('my-project-id')
WHERE created > now() - interval 7 day
ORDER BY name -- Error: ambiguous

-- CORRECT: Use full path when ambiguous
SELECT metadata.name, user.name
FROM project_logs('my-project-id')
WHERE created > now() - interval 7 day
ORDER BY metadata.name
```
Freeing up short forms with explicit aliases: When one field uses an explicit alias, its short form becomes available for other fields:

```sql
-- 'user_name' is the explicit alias, 'name' now refers to metadata.name
SELECT user.name AS user_name, metadata.name
FROM project_logs('my-project-id')
WHERE name = 'configuration' -- Refers to metadata.name
  AND created > now() - interval 7 day
```
SQL supports conditional logic using the ternary operator (? :):
```sql
-- Basic conditions
SELECT
  (scores.Factuality > 0.8 ? "high" : "low") AS quality,
  (error IS NOT NULL ? -1 : metrics.tokens) AS valid_tokens
FROM project_logs('my-project-id')
WHERE created > now() - interval 7 day
```
```sql
-- Use in calculations
SELECT
  (metadata.model = "gpt-4" ? metrics.tokens * 2 : metrics.tokens) AS adjusted_tokens,
  (error IS NULL ? metrics.latency : 0) AS valid_latency
FROM project_logs('my-project-id')
WHERE created > now() - interval 7 day
```
This query helps you monitor token consumption across your application.
```sql
SELECT
  day(created) AS time,
  sum(metrics.total_tokens) AS total_tokens,
  sum(metrics.prompt_tokens) AS input_tokens,
  sum(metrics.completion_tokens) AS output_tokens
FROM project_logs('<YOUR_PROJECT_ID>')
WHERE created > '<ISO_8601_TIME>'
GROUP BY 1
ORDER BY time ASC
```
Track model performance across different versions and configurations.
```sql
-- Compare factuality scores across models
SELECT
  metadata.model AS model,
  day(created) AS date,
  avg(scores.Factuality) AS avg_factuality,
  percentile(scores.Factuality, 0.05) AS p05_factuality,
  percentile(scores.Factuality, 0.95) AS p95_factuality,
  count(1) AS total_calls
FROM project_logs('<PROJECT_ID>')
WHERE created > '2024-01-01'
GROUP BY 1, 2
ORDER BY date DESC, model ASC
```
```sql
-- Find potentially problematic responses
SELECT *
FROM project_logs('<PROJECT_ID>')
WHERE scores.Factuality < 0.5
  AND metadata.is_production = true
  AND created > now() - interval 1 day
ORDER BY scores.Factuality ASC
LIMIT 100
```
```sql
-- Compare performance across specific models
SELECT *
FROM project_logs('<PROJECT_ID>')
WHERE metadata.model IN ('gpt-4', 'gpt-4-turbo', 'claude-3-opus')
  AND scores.Factuality IS NOT NULL
  AND created > now() - interval 7 day
ORDER BY scores.Factuality DESC
LIMIT 500
```
Identify and investigate errors in your application.
```sql
-- Error rate by model
SELECT
  metadata.model AS model,
  hour(created) AS hour,
  count(1) AS total,
  count(error) AS errors,
  count(error) / count(1) AS error_rate
FROM project_logs('<PROJECT_ID>')
WHERE created > now() - interval 1 day
GROUP BY 1, 2
ORDER BY error_rate DESC
```
```sql
-- Find common error patterns
SELECT
  error.type AS error_type,
  metadata.model AS model,
  count(1) AS error_count,
  avg(metrics.latency) AS avg_latency
FROM project_logs('<PROJECT_ID>')
WHERE error IS NOT NULL
  AND created > now() - interval 7 day
GROUP BY 1, 2
ORDER BY error_count DESC
```
```sql
-- Exclude known error types from analysis
SELECT *
FROM project_logs('<PROJECT_ID>')
WHERE error IS NOT NULL
  AND error.type NOT IN ('rate_limit', 'timeout', 'network_error')
  AND metadata.is_production = true
  AND created > now() - interval 1 day
ORDER BY created DESC
LIMIT 100
```
```sql
-- Track p95 latency by endpoint
SELECT
  metadata.endpoint AS endpoint,
  hour(created) AS hour,
  percentile(metrics.latency, 0.95) AS p95_latency,
  percentile(metrics.latency, 0.50) AS median_latency,
  count(1) AS requests
FROM project_logs('<PROJECT_ID>')
WHERE created > now() - interval 1 day
GROUP BY 1, 2
ORDER BY hour DESC, p95_latency DESC
```
```sql
-- Find slow requests
SELECT
  metadata.endpoint,
  metrics.latency,
  metrics.tokens,
  input,
  created
FROM project_logs('<PROJECT_ID>')
WHERE metrics.latency > 5000 -- Requests over 5 seconds
  AND created > now() - interval 1 hour
ORDER BY metrics.latency DESC
LIMIT 20
```
```sql
-- Track prompt token efficiency
SELECT
  metadata.prompt_template AS template,
  day(created) AS date,
  avg(metrics.prompt_tokens) AS avg_prompt_tokens,
  avg(metrics.completion_tokens) AS avg_completion_tokens,
  avg(metrics.completion_tokens) / avg(metrics.prompt_tokens) AS token_efficiency,
  avg(scores.Factuality) AS avg_factuality
FROM project_logs('<PROJECT_ID>')
WHERE created > now() - interval 7 day
GROUP BY 1, 2
ORDER BY date DESC, token_efficiency DESC
```
```sql
-- Find similar prompts
SELECT *
FROM project_logs('<PROJECT_ID>')
WHERE input MATCH 'explain the concept of recursion'
  AND scores.Factuality > 0.8
  AND created > now() - interval 7 day
ORDER BY created DESC
LIMIT 10
```
```sql
-- Monitor feedback patterns
SELECT
  tags[0] AS primary_tag,
  metadata.model AS model,
  count(1) AS feedback_count,
  avg(scores.Factuality > 0.8 ? 1 : 0) AS high_quality_rate
FROM project_logs('<PROJECT_ID>')
WHERE tags INCLUDES 'feedback'
  AND created > now() - interval 30 day
GROUP BY 1, 2
ORDER BY feedback_count DESC
```
```sql
-- Track issue resolution
SELECT
  created,
  tags,
  metadata.model,
  scores.Factuality,
  response
FROM project_logs('<PROJECT_ID>')
WHERE tags INCLUDES 'needs-review'
  AND NOT tags INCLUDES 'resolved'
  AND created > now() - interval 1 day
ORDER BY scores.Factuality ASC
```
A common pattern is filtering traces by both tags and scores when automated scorers apply scores at the span level while tags exist on root spans. Use separate ANY_SPAN() clauses to match traces where any span contains the tag AND any span contains the score.

Each ANY_SPAN() clause evaluates independently across all spans in a trace. The conditions don't need to match on the same span: the first ANY_SPAN() finds traces where at least one span has the tag, while the second finds traces where at least one span has the score.
```sql
-- Find traces with a specific tag and score
SELECT *
FROM project_logs('<PROJECT_ID>', shape => 'traces')
WHERE ANY_SPAN(tags INCLUDES 'content-change')
  AND ANY_SPAN(scores."email sent" IS NOT NULL)
  AND created > now() - interval 7 day
LIMIT 100
```
For score names with spaces or special characters, use double quotes in SQL syntax:
```sql
-- Filter by tag and score with spaces in name
SELECT *
FROM project_logs('<PROJECT_ID>', shape => 'summary')
WHERE ANY_SPAN(tags INCLUDES 'production')
  AND ANY_SPAN(scores."Email Draft vs Sent Similarity" IS NOT NULL)
  AND created > now() - interval 7 day
ORDER BY created DESC
LIMIT 50
```
Use single-span filters with aggregations to analyze traces based on span-level conditions. This is useful for understanding patterns across complex, multi-step operations.
```sql
-- Find traces with both errors and LLM spans, grouped by error type
SELECT
  error.type AS error_type,
  metadata.model AS model,
  count(1) AS trace_count,
  avg(metrics.estimated_cost) AS avg_cost
FROM project_logs('<PROJECT_ID>', shape => 'traces')
WHERE ANY_SPAN(error IS NOT NULL)
  AND ANY_SPAN(span_attributes.type = 'llm')
  AND created > now() - interval 7 day
GROUP BY error.type, metadata.model
ORDER BY trace_count DESC
```
```sql
-- Analyze average cost by day for traces with errors and LLM calls
SELECT
  day(created) AS date,
  avg(metrics.estimated_cost) AS avg_cost,
  count(1) AS error_count
FROM project_logs('<PROJECT_ID>', shape => 'summary')
WHERE ANY_SPAN(error IS NOT NULL)
  AND ANY_SPAN(span_attributes.type = 'llm')
  AND created > now() - interval 7 day
GROUP BY day(created)
ORDER BY date DESC
```
Use FILTER_SPANS() to analyze only the spans that match specific criteria:
```sql
-- Analyze score distribution for only scorer spans
SELECT
  span_attributes.name AS scorer_name,
  avg(scores.Factuality) AS avg_score,
  count(1) AS span_count
FROM project_logs('<PROJECT_ID>', shape => 'traces')
WHERE FILTER_SPANS(span_attributes.type = 'score')
  AND created > now() - interval 7 day
GROUP BY span_attributes.name
ORDER BY avg_score DESC
```
Use json_extract to extract values from a JSON string using a key name. This is useful when you have JSON data stored as a string field and need to access specific values within it. The path parameter is treated as a literal key name (not a path expression with traversal).
```sql
-- Extract a simple field
SELECT
  id,
  json_extract(metadata.config, 'api_key') AS api_key
FROM project_logs('my-project-id')
WHERE created > now() - interval 7 day
```
```sql
-- Extract fields with special characters in the key name
SELECT
  id,
  json_extract(metadata.settings, 'user.preferences.theme') AS theme_key
FROM project_logs('my-project-id')
WHERE created > now() - interval 7 day
-- Note: This extracts a key literally named "user.preferences.theme", not a nested path
```
```sql
-- Extract and filter
SELECT *
FROM project_logs('my-project-id')
WHERE json_extract(metadata.config, 'environment') = 'production'
  AND json_extract(metadata.config, 'version') > 2.0
  AND created > now() - interval 7 day
```
json_extract returns null for invalid JSON or missing keys rather than raising an error, making it safe to use in filters and aggregations. The path parameter is a literal key name, not a path expression - characters like dots, brackets, etc. are treated as part of the key name itself.
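Because json_extract returns null rather than raising, you can measure how many records contain a parseable key in a single query, relying on count(expr) skipping nulls (the same pattern as count(error) in the error-rate example above). A minimal sketch; the 'environment' key name is illustrative:

```sql
-- Rows with invalid JSON or a missing key yield null, which count() skips
SELECT
  count(1) AS total,
  count(json_extract(metadata.config, 'environment')) AS with_environment
FROM project_logs('my-project-id')
WHERE created > now() - interval 7 day
```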
Classifications are categorical labels assigned by topic maps (e.g., classifications.Task[0].label). Each classification includes .label (the category), .metadata (with distance metrics), and .source information. Classifications are generated by topics automations.

Filter and analyze logs by topic classifications to understand patterns in your data.
```sql
SELECT
  classifications.Task[0].label as topic,
  count(*) as count
FROM project_logs('my-project-id')
WHERE classifications.Task IS NOT NULL
  AND created > now() - interval 7 day
GROUP BY topic
ORDER BY count DESC
```
To analyze all classifications in an array rather than just the first element, use UNPIVOT to expand the array. See Array of objects unpivot for examples.
Filter by specific topic and distance threshold:
```sql
SELECT
  id,
  facets.task,
  classifications.Task[0].label as topic
FROM project_logs('my-project-id')
WHERE classifications.Task[0].label = 'Creating datasets'
  AND classifications.Task[0].metadata.distance < 0.5
  AND created > now() - interval 7 day
ORDER BY classifications.Task[0].metadata.distance ASC
LIMIT 50
```
Facets are AI-extracted attributes that summarize logs (e.g., facets.task, facets.sentiment). They're generated by topics automations.

Query logs by facet values to identify patterns and issues.
```sql
SELECT
  id,
  created,
  facets.task,
  facets.sentiment
FROM project_logs('my-project-id')
WHERE facets.sentiment IN ('NEGATIVE', 'MIXED')
  AND created > now() - interval 7 day
ORDER BY created DESC
LIMIT 100
```
Analyze relationships between facets and classifications to gain deeper insights.
```sql
SELECT
  facets.sentiment,
  classifications.Task[0].label as topic,
  count(*) as count,
  avg(metrics.total_tokens) as avg_tokens
FROM project_logs('my-project-id', shape => 'traces')
WHERE facets.sentiment IS NOT NULL
  AND classifications.Task IS NOT NULL
  AND created > now() - interval 7 day
GROUP BY facets.sentiment, topic
ORDER BY count DESC
LIMIT 20
```
Analyze topic distribution with distance metrics:
```sql
SELECT
  classifications.Task[0].label as topic,
  count(*) as occurrences,
  avg(classifications.Task[0].metadata.distance) as avg_distance
FROM project_logs('my-project-id', shape => 'traces')
WHERE classifications.Task IS NOT NULL
  AND created > now() - interval 7 day
GROUP BY topic
HAVING count(*) > 10
ORDER BY occurrences DESC
```