# Deploy Braintrust

> Deploy a self-hosted Braintrust data plane on AWS, GCP, or Azure

<Tabs>
  <Tab title="AWS - Lambda" icon="https://img.logo.dev/aws.amazon.com?token=pk_BdcHD9e5SCW3j1rnJkNyMQ">
    Deploy the Braintrust data plane in your AWS account using the Braintrust [Terraform module](https://github.com/braintrustdata/terraform-aws-braintrust-data-plane). This is the recommended way to self-host Braintrust on AWS.

    Braintrust recommends deploying in a dedicated AWS account. AWS enforces account-level [Lambda concurrency limits](https://docs.aws.amazon.com/lambda/latest/dg/lambda-concurrency.html), and since Braintrust's API runs on Lambda, sharing an account with other workloads can lead to throttling and service disruptions. A dedicated account also aligns with AWS best practices for workload isolation and security.

    <Tip>
      To test infrastructure provisioning before committing to production-sized resources, use the [sandbox example](https://github.com/braintrustdata/terraform-aws-braintrust-data-plane/tree/main/examples/braintrust-data-plane-sandbox). It uses minimal instance sizes and has deletion protection disabled for easy teardown. It is not suitable for performance or load testing.
    </Tip>

    ## 1. Configure the Terraform module

    The Braintrust [Terraform module](https://github.com/braintrustdata/terraform-aws-braintrust-data-plane) contains all the necessary resources for a self-hosted Braintrust data plane.

    1. Copy the entire contents of the [`examples/braintrust-data-plane`](https://github.com/braintrustdata/terraform-aws-braintrust-data-plane/tree/main/examples/braintrust-data-plane) directory from the [terraform-aws-braintrust-data-plane](https://github.com/braintrustdata/terraform-aws-braintrust-data-plane) repository into your own repository.

    2. In `provider.tf`, configure your AWS account and region.

       Supported regions: `us-east-1`, `us-east-2`, `us-west-2`, `eu-west-1`, `ca-central-1`, `ap-southeast-2`, and `sa-east-1`. If you require support for a different region, contact Braintrust.

    3. In `terraform.tf`, set up your remote backend (typically S3 and DynamoDB).

    4. In `main.tf`, customize the Braintrust deployment settings.

       The defaults are suitable for a large production-sized deployment. Adjust them based on your needs, but keep in mind the [hardware requirements](/admin/self-hosting/index#hardware-requirements).

           <Warning>
             Each deployment must have a unique `deployment_name` within the same AWS account (max 18 characters). The default is `"braintrust"`; change it if you run multiple deployments. Resource names (IAM roles, RDS instances, S3 buckets) are prefixed with this value and will collide if duplicated.
           </Warning>

           <Note>
             Brainstore instances require instance types with **local NVMe storage** for caching (e.g., `c8gd`, `c5d`, `m5d`, `i3`, `i4i` families). Generic instance types without local storage (`t3`, `m5`, `c5`) are not supported and will fail at plan time.
           </Note>
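
    For steps 2 and 3, the provider and backend blocks might look like the following minimal sketch. The bucket, state key, and lock table names are hypothetical placeholders, and the example files you copied may already define these blocks, so adapt the existing ones rather than adding duplicates.

    ```hcl theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    # provider.tf
    provider "aws" {
      region = "us-east-1" # one of the supported regions listed above
    }

    # terraform.tf
    terraform {
      backend "s3" {
        bucket         = "my-terraform-state"           # hypothetical state bucket
        key            = "braintrust/terraform.tfstate" # hypothetical state key
        region         = "us-east-1"
        dynamodb_table = "my-terraform-locks"           # hypothetical lock table
        encrypt        = true
      }
    }
    ```

    The state bucket and lock table must exist before you run `terraform init`.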

    ## 2. Initialize AWS account

    If you're using a new AWS account, run the [`create-service-linked-roles.sh`](https://github.com/braintrustdata/terraform-aws-braintrust-data-plane/blob/main/scripts/create-service-linked-roles.sh) script to create all necessary IAM service-linked roles for the deployment:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    ./scripts/create-service-linked-roles.sh
    ```

    ## 3. Configure Brainstore license

    Your deployment includes Brainstore, a high-performance query engine for real-time trace ingestion.
    Brainstore requires a license key.

    1. Go to **<Icon icon="settings-2" /> Settings** > [**<Icon icon="lock" /> Data plane**](https://www.braintrust.dev/app/~/configuration/org/api-url).

           <Note>
             Only [organization owners](/admin/access-control#built-in-permission-groups) can access this page. If you don't see your data plane configuration, [contact Braintrust](mailto:support@braintrust.dev) to enable self-hosting.
           </Note>

    2. Copy your Brainstore license key.

    3. Pass the key to Terraform. The recommended approach is to store the license key in AWS Secrets Manager and reference it using a Terraform data source:

       ```hcl theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
       data "aws_secretsmanager_secret_version" "brainstore_license" {
         secret_id = "braintrust/brainstore-license-key"
       }
       ```

       Then pass `data.aws_secretsmanager_secret_version.brainstore_license.secret_string` as the `brainstore_license_key` value in the module.

       Alternatively, you can pass the key without storing it in Secrets Manager:

       * Set `TF_VAR_brainstore_license_key=your-key` in your environment.
       * Pass it via command line: `terraform apply -var 'brainstore_license_key=your-key'`.
       * Add it to an uncommitted `terraform.tfvars` or `.auto.tfvars` file.

           <Warning>
             Do not commit the license key to your git repository.
           </Warning>
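
    Putting the recommended approach together, the data source and the module block might look like the sketch below. It assumes the secret already exists in Secrets Manager under the ID used in the data source above and that the module block otherwise matches your `main.tf`.

    ```hcl theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    # Read the license key from an existing Secrets Manager secret
    data "aws_secretsmanager_secret_version" "brainstore_license" {
      secret_id = "braintrust/brainstore-license-key"
    }

    module "braintrust-data-plane" {
      source = "github.com/braintrustdata/terraform-aws-braintrust-data-plane"

      # Pass the secret value to the module instead of hardcoding it
      brainstore_license_key = data.aws_secretsmanager_secret_version.brainstore_license.secret_string

      # ... other configuration ...
    }
    ```

    Terraform records data source values in state, so the license key still ends up in the state file. Restrict access to your remote backend accordingly.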

    ## 4. Deploy the module

    Initialize and apply the Terraform configuration:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    terraform init
    terraform apply
    ```

    <Note>
      The first `terraform apply` may fail with transient errors such as ASG health check timeouts (while instances are still booting) or Lambda rate limits. Re-running `terraform apply` resolves these.
    </Note>

    This will create all necessary AWS resources including:

    * Two isolated VPCs:
      * **Main VPC**: Hosts Braintrust services (API, database, Redis, Brainstore)
      * **Quarantine VPC**: Runs user-defined functions (scorers, tools) in network isolation. It contains about 30 Lambda functions across multiple runtimes and is required for most production use cases.
    * Lambda functions for the Braintrust API
    * Public CloudFront endpoint and API Gateway
    * EC2 Auto Scaling group for Brainstore
    * PostgreSQL database, Redis cache, and S3 buckets
    * KMS key for encryption

    ## 5. Get your API URL

    After the deployment completes, get your API URL from the Terraform outputs:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    terraform output
    ```

    You should see output similar to:

    ```
    api_url = "https://dx6atff6gocr6.cloudfront.net"
    ```

    Save this URL. You'll need it to configure your Braintrust organization.

    ## 6. Configure your organization

    Connect your Braintrust organization to your newly deployed data plane.

    <Warning>
      Changing your live organization's API URL can disrupt access for existing users. If you are testing, create a new Braintrust organization for your data plane instead of updating your live environment.
    </Warning>

    1. Go to **<Icon icon="settings-2" /> Settings** > [**<Icon icon="lock" /> Data plane**](https://www.braintrust.dev/app/~/configuration/org/api-url).

           <Note>
             Only [organization owners](/admin/access-control#built-in-permission-groups) can access this page.
           </Note>

    2. In the **API URL** area, select **Edit**.

    3. Enter the API URL from the previous step.

    4. Leave the other fields blank.

    5. If your deployment is accessed through a VPN or is otherwise on a private network (not accessible from the public internet), enable **Data plane is on a private network**. This enables Chrome's Local Network Access permission handling, which is required for browser access to private network resources. When enabled, Chrome will prompt users to grant permission for the Braintrust UI to access your self-hosted data plane. See [Grant browser permissions](/admin/self-hosting/advanced#grant-browser-permissions) for details.

    6. Select **Save**.

    The UI will automatically test the connection to your new data plane. Verify that the ping to each endpoint is successful.

    ## Debug issues

    If you encounter issues, you can use the [`dump-logs.sh`](https://github.com/braintrustdata/terraform-aws-braintrust-data-plane/blob/main/scripts/dump-logs.sh) script to collect logs:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    ./scripts/dump-logs.sh <deployment_name> [--minutes N] [--service <svc1,svc2,...|all>]
    ```

    For example, to dump 60 minutes of logs for the `bt-sandbox` deployment, run:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    ./scripts/dump-logs.sh bt-sandbox --minutes 60
    ```

    This will save logs for all services to a `logs-<deployment_name>` directory, which you can share with the Braintrust team for debugging.

    ## Customize the deployment

    ### Use an existing VPC

    To deploy into an existing VPC instead of creating a new one, set `create_vpc = false` and provide your VPC and subnet IDs:

    ```hcl theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    module "braintrust-data-plane" {
      source = "github.com/braintrustdata/terraform-aws-braintrust-data-plane"

      create_vpc = false

      existing_vpc_id              = "vpc-xxxxxxxxx"
      existing_private_subnet_1_id = "subnet-xxxxxxxxx"
      existing_private_subnet_2_id = "subnet-xxxxxxxxx"
      existing_private_subnet_3_id = "subnet-xxxxxxxxx"
      existing_public_subnet_1_id  = "subnet-xxxxxxxxx"

      # ... other configuration ...
    }
    ```

    Your existing VPC must have:

    * At least 3 private subnets across different availability zones
    * At least 1 public subnet
    * Internet and NAT gateways with properly configured route tables

    The module manages its own security groups. To also use an existing quarantine VPC, set `existing_quarantine_vpc_id` and the corresponding `existing_quarantine_private_subnet_*_id` variables.

    ### Use custom tags

    To apply custom tags to all resources, pass the `custom_tags` parameter to the Braintrust module:

    ```hcl theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    module "braintrust-data-plane" {
      source = "github.com/braintrustdata/terraform-aws-braintrust-data-plane"

      custom_tags = {
        Environment = "production"
        Team        = "ml-platform"
        CostCenter  = "engineering"
      }

      # ... other configuration ...
    }
    ```

    These tags are applied to all resources, including Brainstore EC2 instances, volumes, and ENIs. Separately, the `deployment_name` variable prefixes resource names and adds a `BraintrustDeploymentName` tag to all resources.

    <Note>
      Use the `custom_tags` parameter instead of the AWS provider's `default_tags` configuration. Due to a Terraform limitation, `default_tags` are not applied to resources that use launch templates, such as Brainstore instances.
    </Note>

    ### Redis instance sizing

    <Warning>
      **Important for AWS**: Avoid using burstable Redis instances (t-family instances like `cache.t4g.micro`) in production. These instances use CPU credits that can be exhausted during high-load periods, leading to performance throttling.

      Instead, use non-burstable instances like `cache.r7g.large`, `cache.r6g.medium`, or `cache.r5.large` for predictable performance. Even if these instances seem oversized initially, they provide consistent performance without the risk of CPU credit exhaustion.
    </Warning>

    ### VPC connectivity

    To connect Braintrust's VPC to other internal resources (like an LLM gateway), use one of the following approaches:

    * Create a VPC Endpoint Service for your internal resource, then create a VPC Interface Endpoint inside the Braintrust "Quarantine" VPC
    * Set up VPC peering with the Braintrust "Quarantine" VPC

    ### Lambda memory limits

    The API Handler and AI Proxy Lambda functions default to 10240 MB of memory (the Lambda maximum). You can lower these values to reduce cost or to fit tighter account memory quotas, though Braintrust recommends keeping the defaults for production workloads.

    ```hcl theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    module "braintrust-data-plane" {
      source = "github.com/braintrustdata/terraform-aws-braintrust-data-plane"

      api_handler_memory_limit = 10240  # default; Lambda accepts 128–10240 MB
      ai_proxy_memory_limit    = 10240  # default; Lambda accepts 128–10240 MB

      # ... other configuration ...
    }
    ```

    ### WAL footer version

    The `brainstore_wal_footer_version` variable controls the WAL footer format written by Brainstore. It defaults to `""` (unset) and should not be changed outside of a planned upgrade sequence.

    <Warning>
      Do not set `brainstore_wal_footer_version` without following the [upgrade guide](/admin/self-hosting/upgrade/v2). Setting it at the same time as a version bump can cause Brainstore nodes still rolling out to fail to read the new WAL format.
    </Warning>

    See [Enable efficient WAL format](/admin/self-hosting/upgrade/v2#enable-efficient-wal-format-aws) in the v2.0 upgrade guide for the correct migration steps.

    ### KMS encryption

    When `kms_key_arn` is configured, all managed S3 buckets (Brainstore, code-bundle, and Lambda responses) enforce `blocked_encryption_types = ["NONE"]`, which prevents unencrypted object uploads. This policy is applied automatically as of v4.5.0. If you upgrade from an earlier version, expect this change to appear in your `terraform plan`.
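
    As a minimal sketch, setting the key in the module block might look like this. The ARN is a hypothetical placeholder, and the example assumes `kms_key_arn` takes the ARN of a KMS key you already manage.

    ```hcl theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    module "braintrust-data-plane" {
      source = "github.com/braintrustdata/terraform-aws-braintrust-data-plane"

      # Hypothetical ARN; replace with the ARN of your own KMS key
      kms_key_arn = "arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"

      # ... other configuration ...
    }
    ```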

    ### AI Proxy CORS headers

    As of v4.5.0, the `x-bt-use-gateway` header is included in the AI Proxy Lambda function URL CORS allowed headers. Browser clients can send this header without triggering a CORS preflight rejection.
  </Tab>

  <Tab title="GCP" icon="https://img.logo.dev/cloud.google.com?token=pk_BdcHD9e5SCW3j1rnJkNyMQ">
    Deploy the Braintrust data plane in your GCP project using the Braintrust [Terraform module](https://github.com/braintrustdata/terraform-google-braintrust-data-plane) and [Helm chart](https://github.com/braintrustdata/helm). This is the recommended way to self-host Braintrust on GCP.

    ## 1. Configure the Terraform module

    The Braintrust [Terraform module](https://github.com/braintrustdata/terraform-google-braintrust-data-plane) contains all the necessary resources for a self-hosted Braintrust data plane. A dedicated Google Cloud project for your Braintrust deployment is recommended but not required.

    1. Copy the entire contents of the [`examples/braintrust-data-plane`](https://github.com/braintrustdata/terraform-google-braintrust-data-plane/tree/main/examples/braintrust-data-plane) directory from the [terraform-google-braintrust-data-plane](https://github.com/braintrustdata/terraform-google-braintrust-data-plane) repository into your own repository.

    2. In `provider.tf`, configure your Google Cloud project and region.

    3. In `backend.tf`, set up your remote backend (typically a GCS bucket).

    4. In `main.tf`, customize the Braintrust deployment settings.

       The defaults are suitable for a large production-sized deployment. Adjust them based on your needs, but keep in mind the [hardware requirements](/admin/self-hosting/index#hardware-requirements).
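
    For steps 2 and 3, the provider and backend blocks might look like the following minimal sketch. The project ID, region, bucket name, and prefix are hypothetical placeholders, and the example files you copied may already define these blocks, so adapt the existing ones rather than adding duplicates.

    ```hcl theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    # provider.tf
    provider "google" {
      project = "your-project-id" # hypothetical project ID
      region  = "us-central1"     # hypothetical region
    }

    # backend.tf
    terraform {
      backend "gcs" {
        bucket = "my-terraform-state" # hypothetical state bucket
        prefix = "braintrust"         # hypothetical path prefix for state objects
      }
    }
    ```

    The GCS bucket must exist before you run `terraform init`.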

    ## 2. Enable Google Cloud APIs

    Before deploying, enable the required Google Cloud services from Cloud Shell:

    1. Set the project to deploy Braintrust into:

       ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
       gcloud config set project "your-project-id"
       ```

    2. Enable the required services:

       ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
       gcloud services enable storage-api.googleapis.com \
         storage-component.googleapis.com \
         storage.googleapis.com \
         redis.googleapis.com \
         secretmanager.googleapis.com \
         servicenetworking.googleapis.com \
         logging.googleapis.com \
         monitoring.googleapis.com \
         oslogin.googleapis.com \
         dns.googleapis.com \
         cloudresourcemanager.googleapis.com \
         compute.googleapis.com \
         cloudkms.googleapis.com \
         autoscaling.googleapis.com \
         iam.googleapis.com \
         iamcredentials.googleapis.com \
         vpcaccess.googleapis.com \
         sts.googleapis.com \
         container.googleapis.com \
         sqladmin.googleapis.com \
         artifactregistry.googleapis.com
       ```

    Allow approximately 5 minutes for services to activate.

    ## 3. Deploy the Terraform module

    Initialize and apply the Terraform configuration:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    terraform init
    terraform apply
    ```

    This will create all necessary GCP resources including:

    * GKE cluster for running Braintrust services
    * Cloud SQL PostgreSQL database
    * Cloud Memorystore Redis cache
    * Cloud Storage buckets
    * VPC network and subnets
    * Cloud KMS key for encryption

    ## 4. Set up Kubernetes

    After the Terraform deployment completes, connect to your GKE cluster:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    gcloud auth login
    gcloud config set project "<project-id>"
    gcloud container clusters get-credentials <deployment_name>-gke-autopilot --region <region>
    ```

    Verify cluster connectivity:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    kubectl cluster-info
    ```

    Create the namespace for Braintrust:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    kubectl create namespace braintrust
    ```

    ## 5. Create Kubernetes secrets

    Create the required Kubernetes secrets for your deployment. The secrets needed depend on your GCS authentication method:

    * **API**: Can use either native GCS authentication (recommended) or S3 compatibility mode with HMAC keys (legacy)
    * **Brainstore**: Always uses native GCS authentication via Workload Identity

    See [GCS authentication options](#gcs-authentication-options) below for the specific commands.

    ## 6. Deploy Helm chart

    Create a `helm-values.yaml` file for your deployment. Refer to the [Helm chart documentation](https://github.com/braintrustdata/helm) for configuration options.

    Deploy the Braintrust Helm chart to your cluster:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    helm install braintrust \
      oci://public.ecr.aws/braintrust/helm/braintrust \
      --namespace braintrust \
      --version <version> \
      --values helm-values.yaml
    ```

    See all Helm chart releases: [GitHub Releases](https://github.com/braintrustdata/helm/releases)

    ## 7. Configure Ingress (HTTPS)

    The data plane requires a publicly reachable HTTPS endpoint with a valid TLS certificate. The Helm chart deploys the API as a ClusterIP service, so you must provide your own ingress that terminates TLS and routes traffic to the `braintrust-api` service on port 8000.

    Common approaches:

    * GCP Application Load Balancer with a Google-managed certificate (requires a custom domain)
    * GKE Gateway API with cert-manager and Let's Encrypt
    * Cloud Run NGINX proxy with a VPC Connector for SSL termination (no custom domain required)
    * Istio/ASM Gateway: the Helm chart includes native VirtualService support (see `virtualService` in `values.yaml`)
    * Any reverse proxy or load balancer that terminates TLS and forwards HTTP to the API service

    After configuring your ingress, save the resulting HTTPS URL. You'll need it to configure your Braintrust organization.

    If your ingress uses a private or self-signed certificate, pass the CA bundle to the `bt` CLI using the `--ca-cert <PATH>` flag or the `BRAINTRUST_CA_CERT` environment variable so that `bt` commands can connect to your data plane.

    ## 8. Configure your organization

    Connect your Braintrust organization to your newly deployed data plane.

    <Warning>
      Changing your live organization's API URL can disrupt access for existing users. If you are testing, create a new Braintrust organization for your data plane instead of updating your live environment.
    </Warning>

    1. Go to **<Icon icon="settings-2" /> Settings** > [**<Icon icon="lock" /> Data plane**](https://www.braintrust.dev/app/~/configuration/org/api-url).

           <Note>
             Only [organization owners](/admin/access-control#built-in-permission-groups) can access this page.
           </Note>

    2. In the **API URL** area, select **Edit**.

    3. Enter the HTTPS URL of your ingress from the previous step.

    4. Leave the other fields blank.

    5. If your deployment is accessed through a VPN or is otherwise on a private network (not accessible from the public internet), enable **Data plane is on a private network**. This enables Chrome's Local Network Access permission handling, which is required for browser access to private network resources. When enabled, Chrome will prompt users to grant permission for the Braintrust UI to access your self-hosted data plane. See [Grant browser permissions](/admin/self-hosting/advanced#grant-browser-permissions) for details.

    6. Select **Save**.

    The UI will automatically test the connection to your new data plane. Verify that the ping to each endpoint is successful.

    ## GCS authentication options

    Braintrust services use different authentication methods for Google Cloud Storage:

    * **API**: Can use either native GCS authentication (recommended) or S3 compatibility mode with HMAC keys (legacy)
    * **Brainstore**: Always uses native GCS authentication via Workload Identity

    ### Workload Identity setup

    The Terraform module automatically configures Workload Identity for your GKE cluster and creates two service accounts, one for the Braintrust API and one for Brainstore. The table below lists their IAM grants.

    The module also creates two GCS buckets:

    * **Brainstore bucket** (`<deployment_name>-brainstore-*`): Brainstore data storage.
    * **API bucket** (`<deployment_name>-api-*`): Contains two storage paths: `code-bundle/` (API layer writes) and `brainstore-cache/` (ephemeral Brainstore cache, automatically deleted after 1 day by a GCS lifecycle rule).

    <Note>
      The `brainstore-cache/` objects are ephemeral and managed automatically. Operators do not need to manage or back up this data.
    </Note>

    | Service account | Role                                   | Resource                      | Purpose                                         |
    | --------------- | -------------------------------------- | ----------------------------- | ----------------------------------------------- |
    | Braintrust API  | `roles/storage.objectAdmin`            | Brainstore bucket, API bucket | Read/write objects                              |
    | Braintrust API  | `roles/storage.legacyBucketReader`     | Brainstore bucket, API bucket | List bucket contents                            |
    | Braintrust API  | `roles/iam.serviceAccountTokenCreator` | Itself                        | Generate short-lived tokens and signed GCS URLs |
    | Brainstore      | `roles/storage.objectAdmin`            | Brainstore bucket, API bucket | Read/write objects                              |
    | Brainstore      | `roles/storage.legacyBucketReader`     | Brainstore bucket, API bucket | List bucket contents                            |

    ### API authentication configuration

    Choose one of the following authentication methods for the API service:

    <AccordionGroup>
      <Accordion title="Native GCS auth (recommended)">
        Native GCS authentication uses the `@google-cloud/storage` SDK and Workload Identity. This is the recommended approach because it removes the need to manage service account keys.

        **Requirements:**

        * Helm chart version 3.1.0 or later
        * Workload Identity configured (automatic with Terraform module)

        **Kubernetes secrets:**

        For native GCS authentication, create secrets without GCS credentials:

        ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
        kubectl create secret generic braintrust-secrets \
          --from-literal=REDIS_URL="<redis_url>" \
          --from-literal=PG_URL="<pg_url>" \
          --from-literal=FUNCTION_SECRET_KEY="<randomly_generated_secret>" \
          --from-literal=BRAINSTORE_LICENSE_KEY="<brainstore_license_key>" \
          --namespace=braintrust
        ```

        <Note>
          Refer to the Terraform outputs for the connection strings. The Brainstore license key can be found at **<Icon icon="settings-2" /> Settings** > [**<Icon icon="lock" /> Data plane**](https://www.braintrust.dev/app/~/configuration/org/api-url). Only [organization owners](/admin/access-control#built-in-permission-groups) can access this page.
        </Note>

        **Helm configuration:**

        In your Helm values file, enable native GCS authentication and configure the Google service account:

        ```yaml theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
        api:
          serviceAccount:
            googleServiceAccount: "prod-braintrust@your-project-id.iam.gserviceaccount.com"
          enableGcsAuth: true
        ```

        <Note>
          The Terraform module outputs the service account email as `braintrust_service_account`. Run `terraform output braintrust_service_account` to get the full email address.

          The `enableGcsAuth` setting defaults to `false` for backwards compatibility. Contact Braintrust if you want to enable native GCS authentication by default.
        </Note>
      </Accordion>

      <Accordion title="S3 compatibility mode (legacy)">
        S3 compatibility mode uses HMAC keys to access GCS through the S3-compatible API. This is the legacy authentication method.

        **Kubernetes secrets:**

        For S3 compatibility mode, include HMAC credentials:

        ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
        kubectl create secret generic braintrust-secrets \
          --from-literal=REDIS_URL="<redis_url>" \
          --from-literal=PG_URL="<pg_url>" \
          --from-literal=GCS_ACCESS_KEY_ID="<braintrust_hmac_access_id>" \
          --from-literal=GCS_SECRET_ACCESS_KEY="<braintrust_hmac_secret>" \
          --from-literal=FUNCTION_SECRET_KEY="<randomly_generated_secret>" \
          --from-literal=BRAINSTORE_LICENSE_KEY="<brainstore_license_key>" \
          --namespace=braintrust
        ```

        <Note>
          Refer to the Terraform outputs for the connection strings. The Brainstore license key can be found at **<Icon icon="settings-2" /> Settings** > [**<Icon icon="lock" /> Data plane**](https://www.braintrust.dev/app/~/configuration/org/api-url). Only [organization owners](/admin/access-control#built-in-permission-groups) can access this page.
        </Note>

        **Helm configuration:**

        In your Helm values file, ensure `enableGcsAuth` is not set or set to `false`. Do not configure `googleServiceAccount` when using HMAC keys:

        ```yaml theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
        api:
          enableGcsAuth: false  # or omit this line
        ```

        <Note>
          Helm chart v5.0.1+ automatically sets `AWS_REQUEST_CHECKSUM_CALCULATION` and `AWS_RESPONSE_CHECKSUM_VALIDATION` to `WHEN_REQUIRED` when `enableGcsAuth` is disabled. This ensures AWS SDK compatibility with the GCS S3-compatible endpoint. No additional configuration is required.
        </Note>
      </Accordion>
    </AccordionGroup>

    ### Tune GCS retry behavior (optional)

    When `api.enableGcsAuth: true`, the API service uses the `@google-cloud/storage` SDK and applies the SDK's default retry policy for transient GCS errors. To override these defaults, set any of the following environment variables on the API container. If none are set, the SDK defaults apply. See [Cloud Storage retry strategy](https://docs.cloud.google.com/storage/docs/retry-strategy) for the upstream defaults and the Node.js client behavior.

    | Environment variable                       | Type              | Description                                                   |
    | ------------------------------------------ | ----------------- | ------------------------------------------------------------- |
    | `GCS_RETRY_OPTIONS_AUTO_RETRY`             | boolean           | Enable or disable automatic retries.                          |
    | `GCS_RETRY_OPTIONS_MAX_RETRIES`            | integer           | Maximum number of retry attempts per request.                 |
    | `GCS_RETRY_OPTIONS_RETRY_DELAY_MULTIPLIER` | float             | Exponential backoff multiplier between retries.               |
    | `GCS_RETRY_OPTIONS_TOTAL_TIMEOUT`          | integer (seconds) | Total time allowed across all retries for a request.          |
    | `GCS_RETRY_OPTIONS_MAX_RETRY_DELAY`        | integer (seconds) | Maximum delay between consecutive retries.                    |
    | `GCS_RETRY_OPTIONS_IDEMPOTENCY_STRATEGY`   | string            | One of `retry-always`, `retry-conditional`, or `retry-never`. |

    Set these through `api.extraEnvVars` in your Helm values file:

    ```yaml theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    api:
      enableGcsAuth: true
      extraEnvVars:
        - name: GCS_RETRY_OPTIONS_MAX_RETRIES
          value: "5"
        - name: GCS_RETRY_OPTIONS_TOTAL_TIMEOUT
          value: "120"
        - name: GCS_RETRY_OPTIONS_IDEMPOTENCY_STRATEGY
          value: "retry-conditional"
    ```

    <Note>
      Requires data plane v2.0.0 or later. These environment variables are read only when `api.enableGcsAuth: true` and have no effect in S3 compatibility mode. They apply to the API service only. Brainstore manages its GCS client independently.
    </Note>

    ### Brainstore configuration

    Brainstore always uses native GCS authentication. In your Helm values file, configure the Google service account for Workload Identity:

    ```yaml theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    brainstore:
      serviceAccount:
        googleServiceAccount: "prod-brainstore@your-project-id.iam.gserviceaccount.com"
    ```

    <Note>
      The Terraform module outputs the service account email as `brainstore_service_account`. Run `terraform output brainstore_service_account` to get the full email address.
    </Note>

    ## Customize the deployment

    ### Service account impersonation

    If Brainstore needs to access GCS buckets or other GCP resources in another project that are restricted to a specific service account identity, use `brainstore_impersonation_targets` to grant the Brainstore Kubernetes service account the ability to impersonate one or more Google Cloud service accounts.

    ```hcl theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    module "braintrust_data_plane" {
      source = "github.com/braintrustdata/terraform-google-braintrust-data-plane"
      # ...other variables...
      brainstore_impersonation_targets = [
        "projects/my-other-project/serviceAccounts/data-access@my-other-project.iam.gserviceaccount.com"
      ]
    }
    ```

    This grants `roles/iam.serviceAccountTokenCreator` on each target service account to the Brainstore Kubernetes service account, enabling Brainstore to generate short-lived tokens and act as those accounts. Values must use the full resource name format `projects/{project_id}/serviceAccounts/{service_account_email}`, not bare email addresses. The default is `[]` (no impersonation).

    <Note>
      This variable is only needed if you are not separately granting the Brainstore service account IAM access to the target accounts. The Terraform executor must have `roles/iam.serviceAccountAdmin` or `roles/resourcemanager.projectIamAdmin` on each target service account (or equivalent project-level permissions). Target service accounts must also exist before running `terraform apply`.
    </Note>

    <h3 id="tag-resources-with-custom-labels-gcp">
      Custom labels
    </h3>

    To apply user-defined GCP labels to all resources created by the module, use the `custom_labels` variable:

    ```hcl theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    module "braintrust_data_plane" {
      source = "github.com/braintrustdata/terraform-google-braintrust-data-plane"

      custom_labels = {
        environment = "production"
        team        = "platform"
        cost-center = "ai-infra"
      }

      # ... other configuration ...
    }
    ```

    Labels are applied to Cloud SQL, Memorystore Redis, Cloud Storage buckets, the GKE cluster, and the KMS key. The module's built-in `braintrustdeploymentname` label is always preserved when merged with your custom labels.

    <Warning>
      Keys must start with a lowercase letter and contain only lowercase letters, numbers, underscores, or dashes (max 63 characters). Values must contain only lowercase letters, numbers, underscores, or dashes, do not require a leading letter, and may be empty (max 63 characters). Maximum 63 custom labels.
    </Warning>
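
    The key and value rules above can be expressed as two regular expressions. This is a rough sketch for pre-checking labels before `terraform plan`; the function names are hypothetical, and the patterns encode only the rules stated in the warning.

    ```bash
    # Keys: a lowercase letter first, then lowercase letters, digits,
    # underscores, or dashes; 1 to 63 characters in total.
    is_valid_label_key() {
      printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z][a-z0-9_-]{0,62}$'
    }

    # Values: same character set, no leading-letter rule, may be empty;
    # 0 to 63 characters in total.
    is_valid_label_value() {
      printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9_-]{0,63}$'
    }
    ```

    With these helpers, `is_valid_label_key "cost-center"` succeeds, while `is_valid_label_key "9team"` fails because the key starts with a digit.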

    ### Private Service Access range

    When the module creates the VPC (`create_vpc = true`), Cloud SQL and Memorystore connect over a Private Service Access range. By default, the module allocates a `/16` range and lets Google pick the starting address. To avoid overlap with existing VPCs, peering connections, or corporate networks, use `private_service_access_prefix_length` and `private_service_access_address`:

    ```hcl theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    module "braintrust_data_plane" {
      source = "github.com/braintrustdata/terraform-google-braintrust-data-plane"

      private_service_access_prefix_length = 16
      private_service_access_address       = "10.10.0.0"

      # ... other configuration ...
    }
    ```

    * `private_service_access_prefix_length` accepts a value from 8 through 24 (default `16`). Lower prefix lengths create larger ranges. Smaller ranges work, but reduce future expansion headroom, so choose the size with your Braintrust architecture team.
    * `private_service_access_address` sets the starting address of the range. If unset, Google selects an available range.
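
    To get a feel for how the prefix length translates into range size, the sketch below computes the total number of addresses in a range. `psa_range_size` is a hypothetical helper; it enforces the 8 through 24 bounds described above and is only an arithmetic aid, not a sizing recommendation.

    ```bash
    # Print the total number of IPv4 addresses in a range of the given
    # prefix length. Rejects values outside the 8-24 range the module accepts.
    psa_range_size() {
      case "$1" in
        ''|*[!0-9]*)
          echo "prefix length must be an integer" >&2
          return 1
          ;;
      esac
      if [ "$1" -lt 8 ] || [ "$1" -gt 24 ]; then
        echo "prefix length must be between 8 and 24" >&2
        return 1
      fi
      echo $((1 << (32 - $1)))
    }
    ```

    `psa_range_size 16` prints `65536` and `psa_range_size 24` prints `256`, so each step down in prefix length doubles the range.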

    <Warning>
      Choose these values before your first deployment. Changing the Private Service Access range later can require rebuilding Cloud SQL, Memorystore, and other dependent resources.
    </Warning>

    ### GKE Pod and Service IP ranges

    To control the secondary IP ranges that the GKE cluster uses for Pods and Services, set either a CIDR block or the name of a pre-existing secondary range. By default, GKE allocates these ranges automatically.

    ```hcl theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    module "braintrust_data_plane" {
      source = "github.com/braintrustdata/terraform-google-braintrust-data-plane"

      # Option A: let the module create the ranges from CIDR blocks
      gke_pods_ipv4_cidr_block     = "10.20.0.0/20"
      gke_services_ipv4_cidr_block = "10.30.0.0/22"

      # Option B: reference secondary ranges that already exist on the subnet
      # gke_pods_secondary_range_name     = "braintrust-pods"
      # gke_services_secondary_range_name = "braintrust-services"

      # ... other configuration ...
    }
    ```

    * The CIDR variables accept a full CIDR (`"10.20.0.0/20"`) or a netmask size only (`"/20"`).
    * The secondary range name variables reference ranges that already exist on the subnet. They do not create the ranges.
    * For each range type (Pods and Services), set the CIDR block or the secondary range name, but not both. The module rejects setting both forms for the same range type.
    * Most deployments should leave the Service range unset unless they intentionally need a custom Service CIDR.
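
    The either-or rule can be checked before running Terraform. The sketch below is an illustration only: `check_gke_range` is a hypothetical helper that takes a range type, a CIDR value, and a secondary range name, and its format check is deliberately loose (it does not validate octet values).

    ```bash
    # Arguments: range type label, CIDR block value, secondary range name.
    # Pass an empty string for a value that is not set.
    check_gke_range() {
      if [ -n "$2" ] && [ -n "$3" ]; then
        echo "$1: set the CIDR block or the secondary range name, not both" >&2
        return 1
      fi
      # Accept a full CIDR ("10.20.0.0/20") or a netmask size only ("/20").
      if [ -n "$2" ] && ! printf '%s\n' "$2" | grep -Eq '^([0-9]{1,3}(\.[0-9]{1,3}){3})?/[0-9]{1,2}$'; then
        echo "$1: expected a full CIDR or a netmask size such as /20" >&2
        return 1
      fi
    }
    ```

    For example, `check_gke_range pods "10.20.0.0/20" "braintrust-pods"` fails because both forms are set for the Pod range.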

    <Warning>
      Choose these values before your first deployment. Changing GKE secondary ranges later requires recreating the cluster.
    </Warning>
  </Tab>

  <Tab title="Azure" icon="https://img.logo.dev/azure.microsoft.com?token=pk_BdcHD9e5SCW3j1rnJkNyMQ">
    Deploy the Braintrust data plane in your Azure subscription using the Braintrust [Terraform module](https://github.com/braintrustdata/terraform-azure-braintrust-data-plane) and [Helm chart](https://github.com/braintrustdata/helm). This is the recommended way to self-host Braintrust on Azure.

    <Note>
      **Requirements**: Terraform >= 1.10.0 and the azurerm provider \~> 4.0. If you have an existing deployment using azurerm 3.x, run `terraform init -upgrade` before applying and review the [azurerm v4 upgrade guide](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/guides/4.0-upgrade-guide).
    </Note>

    ## 1. Configure the Terraform module

    The Braintrust [Terraform module](https://github.com/braintrustdata/terraform-azure-braintrust-data-plane) contains all the necessary resources for a self-hosted Braintrust data plane. A dedicated Azure subscription for your Braintrust deployment is recommended but not required.

    1. Copy the entire contents of the [`examples/default`](https://github.com/braintrustdata/terraform-azure-braintrust-data-plane/tree/main/examples/default) directory from the [terraform-azure-braintrust-data-plane](https://github.com/braintrustdata/terraform-azure-braintrust-data-plane) repository into your own repository.

    2. In `provider.tf`, configure your Azure subscription and tenant details.

    3. In `terraform.tf`, set up your remote backend (typically Azure Blob Storage).

    4. In `main.tf`, customize the Braintrust deployment settings.

       The defaults are suitable for a large production-sized deployment. Adjust them based on your needs, but keep in mind the [hardware requirements](/admin/self-hosting/index#hardware-requirements).

       The module provisions two AKS node pools:

       * `brainstore` pool (`aks_brainstore_pool_vm_size`): runs Brainstore pods. Must be a VM SKU with local NVMe SSD (e.g. `Standard_D32ds_v6`). The Azure Container Storage extension is automatically installed to configure RAID0 across the local disks.
       * `services` pool (`aks_services_pool_vm_size`): runs API and other application pods. Does not require local SSD (e.g. `Standard_D16s_v6`).

    5. Initially set `enable_front_door = false` in `main.tf`. You'll enable this later after configuring the load balancer.

    ## 2. Configure Brainstore license

    Your deployment requires a Brainstore license key.

    1. Go to **<Icon icon="settings-2" /> Settings** > [**<Icon icon="lock" /> Data plane**](https://www.braintrust.dev/app/~/configuration/org/api-url).

           <Note>
             Only [organization owners](/admin/access-control#built-in-permission-groups) can access this page. If you don't see your data plane configuration, [contact Braintrust](mailto:support@braintrust.dev) to enable self-hosting.
           </Note>

    2. Copy your Brainstore license.

    3. Pass the key to Terraform. Do not commit it to your git repository. Recommended options:
       * Set `TF_VAR_brainstore_license_key=your-key` in your environment before running `terraform apply`.
       * Pass it on the command line: `terraform apply -var 'brainstore_license_key=your-key'`.
       * Add it to an uncommitted `terraform.tfvars` or `.auto.tfvars` file.
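
    If you use the environment variable option, a small guard can stop a run before Terraform starts when the variable is missing. This is a minimal sketch; `require_license_key` is a hypothetical helper, and it does not apply if you pass the key with `-var` or a `.tfvars` file.

    ```bash
    # Fail early when the license key is not present in the environment.
    require_license_key() {
      if [ -z "${TF_VAR_brainstore_license_key:-}" ]; then
        echo "TF_VAR_brainstore_license_key is not set" >&2
        return 1
      fi
    }
    ```

    A typical use is `require_license_key && terraform apply`.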

    ## 3. Deploy the base infrastructure

    Initialize and apply the Terraform configuration:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    terraform init
    terraform apply
    ```

    This creates the required Azure resources, including:

    * AKS cluster for running Braintrust services
    * Azure Database for PostgreSQL
    * Azure Cache for Redis
    * Azure Storage Account
    * Virtual Network
    * Azure Key Vault for encryption and secrets

    This deployment typically takes 15-20 minutes.

    ## 4. Restart PostgreSQL

    The Terraform module configures PostgreSQL extensions (`pg_cron`, `pg_partman`) and sets `cron.database_name` to the Braintrust database. These are static parameters that require a server restart to take effect. The Terraform provider is configured to not restart the server automatically, since automatic restarts on configuration changes could cause unintended downtime in production.

    After your first `terraform apply`, restart the PostgreSQL server before proceeding:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    az postgres flexible-server restart \
      --resource-group <resource-group-name> \
      --name <postgres-database-server-name>
    ```

    You can find the resource group and server name in your Terraform outputs.

    <Note>
      This step is only required on the initial deployment. Subsequent `terraform apply` runs do not require a restart unless you modify the PostgreSQL extension configuration.
    </Note>
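
    To review a setting after the restart, one option is to read the parameter back with the Azure CLI. The sketch below wraps `az postgres flexible-server parameter show` in a hypothetical `pg_param` helper; the helper name and argument order are assumptions for illustration.

    ```bash
    # Print the JSON description of one PostgreSQL server parameter.
    # Arguments: resource group, server name, parameter name.
    pg_param() {
      az postgres flexible-server parameter show \
        --resource-group "$1" \
        --server-name "$2" \
        --name "$3"
    }
    ```

    For example, `pg_param <resource-group-name> <postgres-database-server-name> cron.database_name` prints the parameter's JSON, including its current `value`.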

    ## 5. Connect to AKS cluster

    After the Terraform deployment completes, connect to your AKS cluster. Replace the resource group and cluster names if yours differ from the ones shown:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    az aks get-credentials --resource-group braintrust --name braintrust-aks
    ```

    Verify cluster connectivity:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    kubectl cluster-info
    ```

    ## 6. Deploy Helm chart

    Create a `helm-values.yaml` file for your deployment. Populate it with values from your Terraform outputs:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    terraform output workload_identity_client_id
    terraform output azure_tenant_id
    terraform output key_vault_name
    terraform output storage_account_name
    ```

    Use those values to fill in your `helm-values.yaml`:

    ```yaml theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    cloud: "azure"

    global:
      orgName: "<your org name on Braintrust>"

    azure:
      tenantId: "<azure_tenant_id>"
      enableAzureContainerStorageDriver: true
      enableAzureKeyVaultDriver: true
      keyVaultCSIclientID: "<workload_identity_client_id>"
      keyVaultName: "<key_vault_name>"

    objectStorage:
      azure:
        storageAccountName: "<storage_account_name>"

    api:
      annotations:
        service:
          service.beta.kubernetes.io/azure-load-balancer-internal: "true"
      service:
        type: LoadBalancer
    ```

    Refer to the [Helm chart documentation](https://github.com/braintrustdata/helm) for the full list of configuration options.
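
    Before filling in the file, you may want to confirm that all four Terraform outputs are available. The following sketch assumes it is run from the directory that holds your Terraform state; `check_helm_inputs` is a hypothetical helper built on `terraform output -raw`.

    ```bash
    # Print each Terraform output used in helm-values.yaml as name=value.
    # Returns non-zero if any of them is missing or empty.
    check_helm_inputs() {
      missing=0
      for name in workload_identity_client_id azure_tenant_id key_vault_name storage_account_name; do
        value="$(terraform output -raw "$name" 2>/dev/null)"
        if [ -z "$value" ]; then
          echo "missing Terraform output: $name" >&2
          missing=1
        else
          echo "$name=$value"
        fi
      done
      return "$missing"
    }
    ```

    A non-zero exit status usually means `terraform apply` has not completed or the command was run from a different directory.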

    Deploy the Braintrust [Helm chart](https://github.com/braintrustdata/helm) to your cluster:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    helm install braintrust \
      oci://public.ecr.aws/braintrust/helm/braintrust \
      --namespace braintrust \
      --create-namespace \
      --version <version> \
      --values helm-values.yaml
    ```

    See all Helm chart releases: [GitHub Releases](https://github.com/braintrustdata/helm/releases)

    ## 7. Enable Front Door

    1. Retrieve the load balancer IP address and frontend configuration:

       ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
       lb_ip_address=$(kubectl get service braintrust-api -n braintrust \
         -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

       az network lb list \
         --query "[?frontendIPConfigurations[?privateIPAddress=='$lb_ip_address']].{Name:name, ResourceGroup:resourceGroup, FrontendIPConfig:frontendIPConfigurations[0].name, Id:frontendIPConfigurations[0].id}" \
         -o table
       ```

    2. Update `main.tf` with the values from the previous step and apply:

       ```hcl theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
       enable_front_door = true
       front_door_api_backend_address = "<LB_IPAddress>"
       front_door_load_balancer_frontend_ip_config_id = "<LB_FrontendIPConfigId>"
       ```

       ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
       terraform apply
       ```

    3. In the Azure Portal, find the private link service named `<deployment>-aks-api-pls` and manually approve it.

           <Warning>
             This manual approval step is an Azure platform requirement — Front Door cannot automatically approve private link connections to resources in a different subscription or tenant. The deployment will appear to succeed but Front Door traffic will not flow until the connection is approved.
           </Warning>

           <Warning>
             Front Door deployment takes up to 45 minutes after the Terraform apply completes. Wait for the deployment to finish before proceeding.
           </Warning>
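
    Step 1 assumes the `braintrust-api` Service already has an IP address. Right after `helm install`, the address may not be assigned yet, so a polling loop can help. This is a sketch only: `wait_for_lb_ip` is a hypothetical helper, and its default attempt count and interval are arbitrary choices.

    ```bash
    # Poll until the braintrust-api Service reports a load balancer IP.
    # Arguments: max attempts (default 30), seconds between attempts (default 10).
    wait_for_lb_ip() {
      attempts="${1:-30}"
      interval="${2:-10}"
      i=0
      while [ "$i" -lt "$attempts" ]; do
        ip="$(kubectl get service braintrust-api -n braintrust \
          -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null)"
        if [ -n "$ip" ]; then
          echo "$ip"
          return 0
        fi
        i=$((i + 1))
        sleep "$interval"
      done
      echo "load balancer IP not assigned after $attempts attempts" >&2
      return 1
    }
    ```

    You could then capture the address with `lb_ip_address="$(wait_for_lb_ip)"` before running the `az network lb list` query.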

    ## 8. Get your API URL

    After the Front Door deployment completes, get your API URL from the Terraform outputs:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    terraform output
    ```

    Test the endpoint:

    ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    curl https://<endpoint-hostname>/
    ```

    You should receive a 200 OK response. Save this URL. You'll need it to configure your Braintrust organization.
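
    For scripted checks, the status code can be read directly instead of inspecting the response by eye. The helpers below are a minimal sketch with hypothetical names (`http_status`, `endpoint_ready`); the 10-second timeout is an arbitrary choice.

    ```bash
    # Print the HTTP status code returned by a URL.
    # curl reports 000 when no response is received.
    http_status() {
      curl --silent --output /dev/null --write-out '%{http_code}' --max-time 10 "$1"
    }

    # Succeed only when the URL answers with HTTP 200.
    endpoint_ready() {
      [ "$(http_status "$1")" = "200" ]
    }
    ```

    For example, `endpoint_ready "https://<endpoint-hostname>/" && echo "data plane is reachable"`.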

    ## 9. Configure your organization

    Connect your Braintrust organization to your newly deployed data plane.

    <Warning>
      Changing your live organization's API URL can disrupt access for existing users. If you are testing, create a new Braintrust organization for your data plane instead of updating your live environment.
    </Warning>

    1. Go to **<Icon icon="settings-2" /> Settings** > [**<Icon icon="lock" /> Data plane**](https://www.braintrust.dev/app/~/configuration/org/api-url).

           <Note>
             Only [organization owners](/admin/access-control#built-in-permission-groups) can access this page.
           </Note>

    2. In the **API URL** area, select **Edit**.

    3. Enter the API URL from the previous step.

    4. Leave the other fields blank.

    5. If your deployment is accessed through a VPN or is otherwise on a private network (not accessible from the public internet), enable **Data plane is on a private network**. This enables Chrome's Local Network Access permission handling, which is required for browser access to private network resources. When enabled, Chrome will prompt users to grant permission for the Braintrust UI to access your self-hosted data plane. See [Grant browser permissions](/admin/self-hosting/advanced#grant-browser-permissions) for details.

    6. Select **Save**.

    The UI will automatically test the connection to your new data plane. Verify that the ping to each endpoint is successful.

    ## Upgrade from v1.0.0

    The resource group Terraform resource address changed from `azurerm_resource_group.main` to `azurerm_resource_group.main[0]`. A `moved` block in `moved.tf` handles this automatically — no manual state manipulation is required. Before applying, run `terraform plan` and confirm the resource group shows as moved rather than destroyed and recreated. If you see a destroy planned, do not apply without investigating.
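
    The plan review can be partly automated. The sketch below scans plan text for the resource group being destroyed or replaced. It assumes Terraform's usual plan wording ("will be destroyed", "must be replaced"), which may vary between versions, and `resource_group_plan_is_safe` is a hypothetical helper, so treat it as a supplement to reading the plan yourself.

    ```bash
    # Reads `terraform plan -no-color` output on stdin.
    # Fails if the resource group is planned for destruction or replacement.
    resource_group_plan_is_safe() {
      ! grep -Eq 'azurerm_resource_group\.main(\[0\])? (will be destroyed|must be replaced)'
    }
    ```

    For example, `terraform plan -no-color | resource_group_plan_is_safe || echo "investigate before applying"`.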

    ## Upgrade from v0.9.0

    <Warning>
      Upgrading from Terraform Azure module v0.9.0 to v1.0.0 involves several breaking changes:

      * **Node pool variables renamed**: Replace `aks_user_pool_vm_size` with `aks_brainstore_pool_vm_size` and `aks_services_pool_vm_size`, and `aks_user_pool_max_count` with `aks_brainstore_pool_max_count` and `aks_services_pool_max_count`. The existing `user` node pool will be destroyed and replaced by two new pools during the upgrade — drain and reschedule workloads before applying.
      * **azurerm provider upgrade**: Run `terraform init -upgrade` to update to azurerm \~> 4.0 before applying. Review the [azurerm v4 upgrade guide](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/guides/4.0-upgrade-guide) for any state migrations needed (e.g. the `enable_rbac_authorization` → `rbac_authorization_enabled` rename on Key Vault resources).
      * **`brainstore_license_key` now required**: The module will fail at plan time if this variable is not set. See the [Configure Brainstore license](#2-configure-brainstore-license) step above.
    </Warning>
  </Tab>
</Tabs>

## Next steps

* [Upgrade your deployment](/admin/self-hosting/upgrade) — learn how to keep your data plane up to date
* [Advanced configuration](/admin/self-hosting/advanced) — configure telemetry, network and URL security, rate limiting, and other options
