Act on findings from Topics

Patterns surfaced by Topics are most useful when you act on them. This guide covers three workflows: building evaluation datasets from classified logs, scoring logs based on topic classifications, and assigning logs for human review.

Build datasets from topics

Filter logs by topic to build targeted evaluation datasets.

Go to Logs and click Filter.
Select Topics and choose the topic you want to filter by. Alternatively, click SQL and enter a filter clause such as the one below; see the SQL reference for more query patterns.
classifications.Task.label = "Dataset creation"
Select the logs you want to include.
Click + Dataset and choose an existing dataset or create a new one.
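
To sanity-check which rows a topic filter would select before you build a dataset, you can reproduce the match offline. The sketch below is illustrative only. It assumes exported log rows are plain dictionaries whose classifications field has the shape the scorer example in this guide reads (a facet name mapped to a list of objects with a label), and it assumes a row matches when any classification for the facet carries the label. The helper names and sample rows are hypothetical and are not part of the Braintrust SDK.

```python
from typing import Any, Dict, List


def has_topic(log: Dict[str, Any], facet: str, label: str) -> bool:
    """Return True if any classification under `facet` carries `label`."""
    items = (log.get("classifications") or {}).get(facet) or []
    return any(item.get("label") == label for item in items)


def filter_by_topic(
    logs: List[Dict[str, Any]], facet: str, label: str
) -> List[Dict[str, Any]]:
    """Keep only the rows whose facet matches the label."""
    return [log for log in logs if has_topic(log, facet, label)]


# Hypothetical exported rows; the real export format may differ.
logs = [
    {"id": "log-1", "classifications": {"Task": [{"label": "Dataset creation"}]}},
    {"id": "log-2", "classifications": {"Task": [{"label": "Error investigation"}]}},
    {"id": "log-3", "classifications": {}},
]

selected_ids = [log["id"] for log in filter_by_topic(logs, "Task", "Dataset creation")]
```

Rows with no classifications are skipped rather than raising an error, which is usually what you want when only part of your logs has been classified.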

Common use cases:

“Error investigation” tasks → test your error handling.
Negative sentiment interactions → improve responses.
“Pricing questions” → evaluate your pricing explanations.

See Promote traces from logs for more on curating datasets from production traces.

Score logs based on topics

Create scorers that flag logs with negative sentiment, penalize specific issue types, or alert when certain topics appear together. The following example scorer flags negative checkout experiences:

TypeScript (topic_scorer.ts)

import braintrust from "braintrust";
import { z } from "zod";

const project = braintrust.projects.create({ name: "my-project" });

project.scorers.create({
  name: "Checkout experience",
  slug: "checkout-experience",
  description: "Flag traces with negative checkout experiences",
  parameters: z.object({
    trace: z.any(),
  }),
  handler: async ({ trace }) => {
    if (!trace) return { score: null };

    const spans = await trace.getSpans();
    const rootSpan = spans.find((s) => s.span_id === s.root_span_id);
    if (!rootSpan) return { score: null };

    const classifications = rootSpan.classifications || {};
    // Use the first classification per facet; fall back to {} when the list is missing or empty.
    const taskClassification = classifications.Task?.[0] ?? {};
    const sentimentClassification = classifications.Sentiment?.[0] ?? {};

    if (
      taskClassification.label === "Checkout Flow" &&
      sentimentClassification.label === "NEGATIVE"
    ) {
      return {
        score: 0,
        metadata: { reason: "Negative sentiment during checkout" },
      };
    }

    return { score: 1 };
  },
});

Save the code to a file and push it:

bt functions push topic_scorer.ts

Python (topic_scorer.py)

import braintrust
from pydantic import BaseModel

project = braintrust.projects.create(name="my-project")

class TraceParams(BaseModel):
    trace: dict

async def checkout_experience_scorer(trace):
    if not trace:
        return {"score": None}

    spans = await trace.get_spans()
    root_span = next(
        (s for s in spans if s.get("span_id") == s.get("root_span_id")),
        None
    )
    if not root_span:
        return {"score": None}

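    classifications = root_span.get("classifications") or {}
    # Use the first classification per facet; fall back to {} when the list is missing or empty.
    task_classification = (classifications.get("Task") or [{}])[0]
    sentiment_classification = (classifications.get("Sentiment") or [{}])[0]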

    if (
        task_classification.get("label") == "Checkout Flow"
        and sentiment_classification.get("label") == "NEGATIVE"
    ):
        return {
            "score": 0,
            "metadata": {"reason": "Negative sentiment during checkout"},
        }

    return {"score": 1}

project.scorers.create(
    name="Checkout experience",
    slug="checkout-experience",
    description="Flag traces with negative checkout experiences",
    parameters=TraceParams,
    handler=checkout_experience_scorer,
)

Save the code to a file and push it:

bt functions push topic_scorer.py

Then configure the automation:

Go to Settings > Automations and click + Rule.
Select your scorer, set Scope to Trace, configure the sampling rate, and click Create rule.

See Score online and Trace-level scorers for more details.
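
The decision logic in a topic scorer can live in a plain function, so you can unit test it without a live trace. The following is a minimal sketch under stated assumptions, not the SDK's API: score_classifications, first_label, and FLAGGED_PAIRS are hypothetical names, and the input is assumed to be the root span's classifications dictionary as read in the example scorer. It generalizes the checkout check to a set of (Task, Sentiment) label pairs that are flagged when they appear together.

```python
from typing import Any, Dict, Optional, Set, Tuple

# Hypothetical (Task label, Sentiment label) pairs that should score 0.
FLAGGED_PAIRS: Set[Tuple[str, str]] = {
    ("Checkout Flow", "NEGATIVE"),
}


def first_label(classifications: Dict[str, Any], facet: str) -> Optional[str]:
    """Return the label of the first classification for a facet, or None."""
    items = classifications.get(facet) or []
    return items[0].get("label") if items else None


def score_classifications(
    classifications: Dict[str, Any],
    flagged_pairs: Set[Tuple[str, str]] = FLAGGED_PAIRS,
) -> Dict[str, Any]:
    """Score 0 when the Task and Sentiment labels form a flagged pair, else 1."""
    task = first_label(classifications, "Task")
    sentiment = first_label(classifications, "Sentiment")
    if (task, sentiment) in flagged_pairs:
        return {
            "score": 0,
            "metadata": {"reason": f"{sentiment} sentiment during {task}"},
        }
    return {"score": 1}
```

A handler could call score_classifications(root_span.get("classifications") or {}) after locating the root span, and new topic combinations can be flagged by adding pairs to the set instead of editing the conditional.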

Assign topics for review

Assign logs matching specific topics for human review.

Go to Logs and click Filter.
Select Topics and choose the topic you want to filter by. Alternatively, click SQL and enter a filter clause such as the one below; see the SQL reference for more query patterns.
classifications.Task.label = "Dataset creation"
Select the logs you want to assign.
Select Assign and choose a team member.

Team members receive email notifications when rows are assigned to them.

See Add human feedback for more on human review.

Next steps

Manage Topics to tune the pipeline that produces these classifications.
Custom facets to define new dimensions to act on.
