turbot/databricks
steampipe plugin install databricks

Table: databricks_workspace - Query Databricks Workspaces using SQL

A Databricks Workspace is an environment for accessing all of your Databricks assets. The workspace organizes objects (notebooks, libraries, and experiments) into folders and provides access to data objects and computational resources including clusters, jobs, and models. Each workspace is associated with a Databricks account and includes a number of features for collaborative work.

Table Usage Guide

The databricks_workspace table provides insights into the objects stored in a Databricks workspace. As a data engineer or data scientist, explore object-level details through this table, including each object's type, path, and language. Utilize it to uncover information such as when objects were created or last modified, how large they are, and which account they belong to.

Examples

Basic info

Explore which objects were created within the Databricks workspace for a specific user. This can help in understanding the user's activities and the resources they've used.

select
  object_id,
  created_at,
  language,
  object_type,
  path,
  size,
  account_id
from
  databricks_workspace
where
  path = '/Users/user@turbot.com/NotebookDev';

List all objects in workspace created in the past 7 days

Explore the most recent additions to your workspace by identifying objects that have been created within the past week. This is particularly useful for keeping track of recent changes and additions, ensuring you stay updated on the most current workspace content.

PostgreSQL:

select
  object_id,
  created_at,
  language,
  object_type,
  path,
  size,
  account_id
from
  databricks_workspace
where
  created_at >= now() - interval '7' day;

SQLite:

select
  object_id,
  created_at,
  language,
  object_type,
  path,
  size,
  account_id
from
  databricks_workspace
where
  created_at >= datetime('now', '-7 day');

List all objects in workspace modified in the past 30 days

Explore which items in your workspace have been updated in the past month. This can be useful for tracking recent changes and understanding the current state of your workspace.

PostgreSQL:

select
  object_id,
  modified_at,
  language,
  object_type,
  path,
  size,
  account_id
from
  databricks_workspace
where
  modified_at >= now() - interval '30' day;

SQLite:

select
  object_id,
  modified_at,
  language,
  object_type,
  path,
  size,
  account_id
from
  databricks_workspace
where
  modified_at >= datetime('now', '-30 day');

List total objects per type in workspace

Explore the distribution of different object types within your workspace to understand the composition and organization of your data. This can assist in managing resources and identifying potential areas for optimization or reorganization.

select
  object_type,
  count(*) as total_objects
from
  databricks_workspace
group by
  object_type;

List total notebook objects per language in workspace

Analyze the distribution of notebook objects across different programming languages in your workspace. This could be useful to understand the most commonly used languages and guide future training or tool development.

select
  language,
  count(*) as total_notebooks
from
  databricks_workspace
where
  object_type = 'NOTEBOOK'
group by
  language;
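
List largest objects in workspace

Identify the largest items stored in the workspace. As an illustrative sketch, the size column (reported in bytes) can be used to surface unusually large notebooks or files; the 10-row limit is an arbitrary choice, and the same syntax runs on both PostgreSQL and SQLite.

select
  object_id,
  object_type,
  path,
  size
from
  databricks_workspace
where
  size is not null
order by
  size desc
limit 10;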

Schema for databricks_workspace

Name        | Type                     | Operators | Description
_ctx        | jsonb                    |           | Steampipe context in JSON form, e.g. connection_name.
account_id  | text                     |           | The Databricks Account ID in which the resource is located.
created_at  | timestamp with time zone |           | The creation time of the workspace.
language    | text                     |           | The language of the workspace.
modified_at | timestamp with time zone |           | The last modified time of the workspace.
object_id   | bigint                   |           | Unique identifier for the object.
object_type | text                     |           | The type of the object in workspace.
path        | text                     | =         | The absolute path of the workspace.
size        | bigint                   |           | The file size in bytes.
title       | text                     |           | The title of the resource.
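
Because the _ctx column carries the Steampipe context as JSON, it can be unpacked to show which connection returned each row when more than one Databricks connection is configured. The query below is a minimal sketch using the PostgreSQL jsonb ->> operator:

select
  path,
  object_type,
  _ctx ->> 'connection_name' as connection_name
from
  databricks_workspace;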

Export

This table is available as a standalone Exporter CLI. Steampipe exporters are stand-alone binaries that allow you to extract data using Steampipe plugins without a database.

You can download the tarball for your platform from the Releases page, but it is simplest to install the exporter with the steampipe_export_installer.sh script:

/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- databricks

You can pass the configuration to the command with the --config argument:

steampipe_export_databricks --config '<your_config>' databricks_workspace