dbt Setup
We use dbt to build our data warehouse. You can view every model on OSO here: https://models.opensource.observer.
This guide walks you through setting up dbt (Data Build Tool) for OSO development.
Prerequisites
- Python >=3.11
- Python uv >= 0.6
- git
- A GitHub account
- BigQuery access
- gcloud CLI
Installing gcloud CLI
For macOS users:
brew install --cask google-cloud-sdk
For other platforms, follow the official instructions.
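As a rough sanity check before installing, the prerequisites listed above can be probed from Python. This is a minimal sketch, not part of the OSO tooling: the function names are hypothetical, it only checks that `uv`, `git`, and `gcloud` are on your PATH, and it does not verify the uv version, your GitHub account, or BigQuery access.

```python
import shutil
import sys

# Command-line tools named in the prerequisites list (assumed executable names).
REQUIRED_TOOLS = ("uv", "git", "gcloud")
MIN_PYTHON = (3, 11)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the interpreter meets the minimum major.minor version."""
    return tuple(version_info[:2]) >= minimum


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    if not python_is_supported():
        found = sys.version.split()[0]
        print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, found {found}")
    for tool in missing_tools():
        print(f"Missing from PATH: {tool}")
```

If the script prints nothing, the interpreter version is new enough and all three tools were found.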
Installation
- Follow the installation instructions in our monorepo README.
- Activate the virtual environment:
source .venv/bin/activate
- Verify dbt is installed:
which dbt
- Authenticate with gcloud:
gcloud auth application-default login
- Run the setup wizard:
uv sync && uv run oso_lets_go
The wizard will create a GCP project and BigQuery dataset if needed, copy a subset of OSO data for development, and configure your dbt profile.
Configuration
dbt Profile Setup
Create or edit ~/.dbt/profiles.yml:
opensource_observer:
  outputs:
    production:
      type: bigquery
      dataset: oso
      job_execution_time_seconds: 300
      job_retries: 1
      location: US
      method: oauth
      project: opensource-observer
      threads: 32
    playground:
      type: bigquery
      dataset: oso_playground
      job_execution_time_seconds: 300
      job_retries: 1
      location: US
      method: oauth
      project: opensource-observer
      threads: 32
  target: playground
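To make the nesting of the profile explicit, here is a small sketch that writes the same settings as the nested mapping a YAML parser would produce, then resolves which dataset the default target points at. The helper names (`bigquery_output`, `default_destination`) are hypothetical and only illustrate the structure; dbt itself reads the YAML file directly.

```python
def bigquery_output(dataset):
    """Build one output entry; only the dataset differs between the two targets."""
    return {
        "type": "bigquery",
        "dataset": dataset,
        "job_execution_time_seconds": 300,
        "job_retries": 1,
        "location": "US",
        "method": "oauth",
        "project": "opensource-observer",
        "threads": 32,
    }


# Same layout as the profile: outputs are keyed by target name,
# and "target" sits beside "outputs", not inside it.
PROFILE = {
    "opensource_observer": {
        "outputs": {
            "production": bigquery_output("oso"),
            "playground": bigquery_output("oso_playground"),
        },
        "target": "playground",
    }
}


def default_destination(profile, name="opensource_observer"):
    """Return 'project.dataset' for the profile's default target."""
    entry = profile[name]
    target = entry["target"]
    if target not in entry["outputs"]:
        raise KeyError(f"target {target!r} is not defined under outputs")
    output = entry["outputs"][target]
    return f"{output['project']}.{output['dataset']}"


if __name__ == "__main__":
    print(default_destination(PROFILE))  # opensource-observer.oso_playground
```

Because `target` is set to `playground`, commands that do not name a target resolve to the playground dataset, not to production.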
VS Code Setup
- Install the Power User for dbt core extension
- Get your virtual environment path:
echo 'import sys; print(sys.prefix)' | uv run -
- In VS Code:
  - Open the command palette
  - Select "Python: Select Interpreter"
  - Choose "Enter interpreter path..."
  - Enter the virtual environment path printed in the previous step
Running dbt
Basic usage:
dbt run
Target specific model:
dbt run --select {model_name}
By default, this writes to the opensource-observer.oso_playground dataset.
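If you invoke dbt from a script, the two commands above can be assembled programmatically. The sketch below is an illustration only: `dbt_run_command` is a hypothetical helper, `my_model` is a placeholder model name, and the optional `--target` flag is dbt's standard way to override the profile's default target.

```python
import shlex


def dbt_run_command(model=None, target=None):
    """Build the argument list for `dbt run`, optionally narrowed to one model."""
    cmd = ["dbt", "run"]
    if model:
        cmd += ["--select", model]
    if target:
        # Overrides the default target from profiles.yml for this run only.
        cmd += ["--target", target]
    return cmd


if __name__ == "__main__":
    # "my_model" is a placeholder; substitute a real model name.
    print(shlex.join(dbt_run_command("my_model", "playground")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from inside the activated virtual environment; building a list instead of a shell string avoids quoting issues with model selectors.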
For more details on working with dbt models, see our Data Models Guide.