Databricks

In private preview

This guide explains how to create a service account (metadata access only) for Revefi on Databricks.

Step 1: Create a new Databricks access token
Revefi connects to your Databricks workspace via an access token. You can create a new service principal for Revefi and then generate an access token for that service principal, or you can use an access token for an existing user in your workspace if that's easier. Both approaches are described below.

[Option 1]: Generate an access token for a new service principal (Preferred)

  1. Create a service principal for Revefi
    Use the guide to create a service principal in your Databricks account and add it to your workspace. Save the application ID.

  2. Grant token usage to service principal in workspace
    Use the guide to grant the service principal created above permission to use access tokens.

  3. Generate an access token for service principal
    Use the guide to generate an access token for the new service principal. (A scripted alternative is sketched after this list.)
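
If you prefer to script this step instead of using the UI, the sketch below uses the Token Management REST API (POST /api/2.0/token-management/on-behalf-of/tokens) to mint a token for the service principal. It assumes you run it with a workspace admin token; the hostname, admin token, lifetime, and comment are placeholders, not values from the guide.

import requests

hostname = "<workspace-hostname>"      # e.g. dbc-xxxxxxxx-xxxx.cloud.databricks.com
admin_token = "<admin-pat>"            # a workspace admin's token
application_id = "<application-id>"    # the application ID saved in step 1 above

resp = requests.post(
    f"https://{hostname}/api/2.0/token-management/on-behalf-of/tokens",
    headers={"Authorization": f"Bearer {admin_token}"},
    json={
        "application_id": application_id,
        "comment": "Revefi service principal token",
        "lifetime_seconds": 7776000,  # placeholder: 90 days; follow your own rotation policy
    },
)
resp.raise_for_status()
print(resp.json()["token_value"])  # this is the access token to give to Revefi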

[Option 2]: Generate a personal access token for your user
1. Use the guide to generate a personal access token for an existing user.

Step 2: Enable System tables on Unity Catalog

Use the guide to enable the information_schema, access, workflow, compute, query, and billing system schemas if they are not enabled already. This requires the metastore ID, which can be found in the Databricks workspace under Catalog -> select any catalog -> Details.
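
If you prefer to enable the schemas programmatically, the sketch below calls the system schemas REST API (PUT /api/2.0/unity-catalog/metastores/{metastore_id}/systemschemas/{schema_name}) for each schema listed above; information_schema is generally available by default. The hostname, token, and metastore ID are placeholders, and the call requires the appropriate admin privileges.

import requests

hostname = "<workspace-hostname>"
token = "<admin-pat>"
metastore_id = "<metastore-id>"  # Catalog -> select any catalog -> Details

for schema in ["access", "workflow", "compute", "query", "billing"]:
    r = requests.put(
        f"https://{hostname}/api/2.0/unity-catalog/metastores/{metastore_id}/systemschemas/{schema}",
        headers={"Authorization": f"Bearer {token}"},
    )
    print(schema, r.status_code, r.text)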

Step 3: Grant Unity Catalog data permission to the service principal
Run these commands on each catalog that Revefi should have access to.

GRANT USE_CATALOG ON CATALOG <catalog_name> TO `<application_id>`;  
GRANT USE_SCHEMA ON CATALOG <catalog_name> TO `<application_id>`;  
GRANT SELECT ON CATALOG <catalog_name> TO `<application_id>`;

Revefi also needs access to the system catalog. Use the commands above to grant access to the system catalog as well. Note that access to the system catalog can only be granted by a metastore admin.
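
If you have several catalogs to cover, one option is to run the grants from a notebook. Below is a minimal sketch, assuming it runs as a user with the required privileges (a metastore admin for the system catalog) and that <application_id> is the service principal's application ID; the catalog names are placeholders.

application_id = "<application_id>"
catalogs = ["<catalog_name_1>", "<catalog_name_2>", "system"]  # replace with your catalogs

for catalog in catalogs:
    for privilege in ["USE_CATALOG", "USE_SCHEMA", "SELECT"]:
        # Same grants as the commands above, applied to each catalog in turn
        spark.sql(f"GRANT {privilege} ON CATALOG {catalog} TO `{application_id}`")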

Step 4: Create a Databricks SQL Warehouse
Use the guide to create a new SQL Warehouse for Revefi (Serverless is preferred). Use the 'Permissions' button to give the new service principal 'Can use' permissions on this warehouse.
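
If you would rather create the warehouse via the REST API, the sketch below uses POST /api/2.0/sql/warehouses with illustrative settings for a small serverless warehouse; the hostname, token, name, and sizing are assumptions to adjust. The 'Can use' grant for the service principal is still done via the Permissions button as described above.

import requests

hostname = "<workspace-hostname>"
token = "<admin-pat>"

resp = requests.post(
    f"https://{hostname}/api/2.0/sql/warehouses",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "name": "revefi-warehouse",      # illustrative name
        "cluster_size": "2X-Small",
        "max_num_clusters": 1,
        "auto_stop_mins": 10,
        "enable_serverless_compute": True,
    },
)
resp.raise_for_status()
print(resp.json()["id"])  # warehouse id; its connection details are used in Step 6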

Step 5: Create Revefi jobs in your Databricks account
Since Databricks currently only provides the full job run history via APIs, you need one Databricks job in your account that periodically copies the API data into a table (within your Databricks account). Databricks expects to expose this information via a system table soon, so the job can be removed once that happens. Ensure that the job runs successfully and populates the corresponding tables, as described below.

Create a new catalog and grant access to the service principal

Use the guide to create a new catalog called revefi to hold these tables and grant the access below.

GRANT USE_CATALOG ON CATALOG revefi TO `<application_id>`;  
GRANT USE_SCHEMA ON CATALOG revefi TO `<application_id>`;  
GRANT SELECT ON CATALOG revefi TO `<application_id>`;

Create an access token for API access

Use the guide to create an access token for API access. This token is only needed by the job that copies the data; Revefi doesn't need API access directly. Also ensure that the user has ALL PRIVILEGES on the revefi catalog so the Databricks job can write data to the tables in this catalog.
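
For reference, the write grant mentioned above can be issued from a notebook as follows; <job_runner> is a placeholder for the user (or service principal) that owns and runs the copy job.

# Grant write access on the revefi catalog to the identity that runs the copy job
spark.sql("GRANT ALL PRIVILEGES ON CATALOG revefi TO `<job_runner>`")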

It is recommended to use a dedicated cluster rather than a shared cluster for this job.

Databricks Job to copy over Job Run History
Use the Python script below to set up an hourly job that copies the data from the Job Runs API into a table called revefi.default.JobRunHistory.

Replace hostname and accessToken with the corresponding values.
Pass ["{{job.start_time.timestamp_ms}}"] as a parameter to the job.
Create a trigger to schedule the job every hour. (A sketch of the corresponding Jobs API request follows the script.)

import json  
import requests  
import sys  
from functools import reduce  
from pyspark.sql import DataFrame  
from pyspark.sql import functions as F  
from pyspark.sql.types import StructType, StructField, LongType, StringType, IntegerType, BooleanType, ArrayType

schema = StructType([  
  StructField("cleanup_duration", LongType(), True),  
  StructField("creator_user_name", StringType(), True),  
  StructField("end_time", LongType(), True),  
  StructField("execution_duration", LongType(), True),  
  StructField("format", StringType(), True),  
  StructField("job_id", LongType(), True),  
  StructField("number_in_job", LongType(), True),  
  StructField("original_attempt_run_id", LongType(), True),  
  StructField("run_duration", LongType(), True),  
  StructField("run_id", LongType(), True),  
  StructField("run_name", StringType(), True),  
  StructField("run_page_url", StringType(), True),  
  StructField("run_type", StringType(), True),  
  StructField("setup_duration", LongType(), True),  
  StructField("start_time", LongType(), True),  
  StructField("state", StructType([  
         StructField("life_cycle_state", StringType(), True),  
         StructField("result_state", StringType(), True),  
         StructField("state_message", StringType(), True),  
         StructField("user_cancelled_or_timedout", BooleanType(), True),  
         StructField("queue_reason", StringType(), True)  
     ]), True),  
  StructField("trigger", StringType(), True),  
  StructField("schedule", StructType([  
         StructField("quartz_cron_expression", StringType(), True),  
         StructField("timezone_id", StringType(), True),  
         StructField("pause_status", StringType(), True)  
     ]), True),  
  StructField("tasks", ArrayType(  
    StructType([  
         StructField("task_key", StringType(), True),  
         StructField("description", StringType(), True),  
         StructField("spark_python_task", StructType([  
           StructField("python_file", StringType(), True),  
           StructField("parameters", ArrayType(StringType()), True),  
           StructField("source", StringType(), True)  
       ]), True)  
     ])  
  ), True)  
  ])

# Convert the JSON list of runs into a DataFrame with the schema above

def to_df(result_json):
    return spark.createDataFrame(result_json, schema)

hostname = "<Hostname>"  
token = "<accessToken>"

header={'Authorization': f'Bearer {token}', 'Content-Type': 'application/json'}  
url = f"https://{hostname}/api/2.1/jobs/runs/list?expand_tasks=true"

# Resume from the latest start_time already in the table; on the first run,
# backfill the last 15 days (86400000 ms per day)
start_time_from = None
if spark.sql("select 1 from revefi.information_schema.tables where table_name = 'jobrunhistory'").collect() != []:
    start_time_from = spark.sql("select max(start_time) from revefi.default.JobRunHistory").collect()[0][0]

if start_time_from is None:
    start_time_from = int(sys.argv[1]) - (86400000 * 15)
print(start_time_from)

# Pull runs from just after the last ingested start_time, up to one hour
# before this run's start time (sys.argv[1] is {{job.start_time.timestamp_ms}})
payload = {
    "start_time_from": start_time_from + 1,
    "start_time_to": int(sys.argv[1]) - 3600000,
}

print("Starting script")  
data = json.dumps(payload)  
lst_df = []  
has_next_page = True  
while has_next_page:  
 print("Calling api")  
 response = requests.get(url=url, data=data, headers=header)  
 has_next_page = response.json()['has_more']  
 if 'runs' in response.json():  
   print("Received response")  
   df = to_df(response.json()['runs'])

# Check if columns are missing in the first DataFrame

   if lst_df and len(df.columns) > len(lst_df[0].columns):  
       missing_columns = set(list(df.columns)) - set(list(lst_df[0].columns))  
       for col in missing_columns:  
           lst_df[0] = lst_df[0].withColumn(col, F.lit(None).cast("string"))

   lst_df.append(df)

 if has_next_page and 'next_page_token' in response.json():  
   print("Next page")  
   next_page_token = response.json()['next_page_token']  
   payload = {  
   "start_time_from": start_time_from + 1,  
   "start_time_to": int(sys.argv[1]) - 3600000,  
   "page_token": next_page_token  
   }  
   data = json.dumps(payload)

print("Received full response")

# Union all the page DataFrames into one and append to the history table

if lst_df:
    print("Writing data")
    final_df = reduce(DataFrame.unionAll, lst_df)
    final_df.write.mode("append").option("mergeSchema", "true").saveAsTable("revefi.default.JobRunHistory")

print("Completed script")

Step 6: Add Databricks as a connection in Revefi
Finally, you can add your Databricks connection in Revefi. On the Connections page, click 'Add connection' and select Databricks as the source. The HostName, Port, and HTTP Path fields come from the SQL Warehouse created in Step 4; the access token comes from Step 1.