Data Engineering
dbt

Python · SQL · Open Source · Free tier

The leading data transformation tool for analytics engineers. dbt lets you write SQL SELECT statements and handles materialization, testing, documentation, and lineage. It transformed how data teams work.
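As a sketch of that core idea (source, table, and column names here are illustrative): a dbt model is just a SELECT statement in a .sql file, and dbt generates the DDL to materialize it as a view or table in your warehouse.

```sql
-- models/staging/stg_customers.sql (illustrative names)
-- dbt wraps this SELECT in the CREATE VIEW / CREATE TABLE
-- statements implied by the model's materialization config
SELECT
    id            AS customer_id,
    lower(email)  AS email,
    created_at
FROM {{ source('raw', 'customers') }}
```

Running `dbt run` builds this model (and everything upstream of it) in dependency order.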

License: Apache 2.0

Language: Python / SQL

Trust Score: 52 / 100 (Limited)

Why dbt?

You need to transform raw data in your warehouse into clean, tested models

Your team writes SQL and wants software engineering practices (tests, version control)

You're working with Snowflake, BigQuery, Redshift, or DuckDB
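The "tests" half of those software engineering practices is declarative: dbt ships built-in schema tests (`unique`, `not_null`, and others) that you attach to columns in YAML. A minimal sketch, with a hypothetical model name:

```yaml
# models/staging/schema.yml (model and column names are illustrative)
version: 2

models:
  - name: stg_customers
    columns:
      - name: customer_id
        tests:
          - unique
          - not_null
      - name: email
        tests:
          - not_null
```

`dbt test` compiles each declared test into a SQL query and fails if it returns any rows.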

Signal Breakdown

What drives the Trust Score

PyPI downloads: 4.1M / mo
Commits (90d): 389
GitHub stars: 9.4k ★
Stack Overflow: 4.2k questions
Community: High

Weighted Trust Score: 52 / 100

Download Trend

[Chart: PyPI downloads, last 12 months]

Tradeoffs & Caveats

Know before you commit

You need real-time streaming transformations (dbt is batch)

Your team doesn't know SQL — dbt won't abstract that away

You need full orchestration — pair dbt with Airflow or Dagster
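Short of a full orchestrator, a common lightweight pattern is running dbt on a schedule from CI. A sketch using a GitHub Actions cron workflow (the file name, schedule, adapter, and secret name are all assumptions, not part of dbt itself):

```yaml
# .github/workflows/dbt_nightly.yml (illustrative)
name: dbt nightly run
on:
  schedule:
    - cron: "0 6 * * *"   # 06:00 UTC daily

jobs:
  run-dbt:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install dbt-core dbt-snowflake
      # dbt build = run models + run tests in dependency order
      - run: dbt build --profiles-dir .
        env:
          SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
```

This covers simple time-based scheduling; cross-system dependencies, retries, and backfills are where Airflow or Dagster earn their keep.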

Pricing

Free tier & paid plans

Free tier

dbt Core: open source, free · dbt Cloud: 1 seat free

Paid

Team: $100/mo (5 seats)

Most teams start on dbt Core (free)

Alternative Tools

Other options worth considering

Apache Airflow: 93 (Excellent)

The leading workflow orchestration platform for data pipelines. Airflow lets you define, schedule, and monitor complex DAG-based pipelines in Python. The standard for data engineers and ML pipeline orchestration.

Often Used Together

Complementary tools that pair well with dbt

Snowflake (Data Engineering): 80 (Strong)

Apache Airflow (Data Engineering): 93 (Excellent)

Supabase (Database & Cache): 95 (Excellent)

Tableau (Business Intelligence): 78 (Good)

Power BI (Business Intelligence): 90 (Excellent)

Learning Resources

Docs, videos, tutorials, and courses

Get Started

Repository and installation options

View on GitHub

github.com/dbt-labs/dbt-core

pip: pip install dbt-core dbt-postgres

Quick Start

Copy and adapt to get going fast

# Install and initialize a dbt project
pip install dbt-core dbt-snowflake
dbt init my_project
cd my_project

# Run all models
dbt run

# Run tests
dbt test

# Generate and serve docs
dbt docs generate && dbt docs serve

Code Examples

Common usage patterns

Incremental model

Only process new rows on each dbt run

-- models/marts/fct_events.sql
{{
  config(
    materialized='incremental',
    unique_key='event_id',
    on_schema_change='append_new_columns'
  )
}}

SELECT event_id, user_id, event_type, occurred_at
FROM {{ source('raw', 'events') }}

{% if is_incremental() %}
  WHERE occurred_at > (SELECT MAX(occurred_at) FROM {{ this }})
{% endif %}

Macros

Reuse SQL logic with dbt macros

-- macros/cents_to_dollars.sql
{% macro cents_to_dollars(column_name) %}
    ({{ column_name }} / 100.0)
{% endmacro %}

-- Usage in a model
SELECT
    order_id,
    {{ cents_to_dollars('amount_cents') }} AS total_amount
FROM {{ source('raw', 'orders') }}
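Model references with ref()

Build the dependency graph between models

A minimal sketch with illustrative model names: `{{ ref() }}` is how one model selects from another, and it is what gives dbt its lineage graph and run ordering (the referenced models here assume staging models like those in dbt's conventional project layout).

```sql
-- models/marts/dim_customers.sql (illustrative names)
-- ref() resolves to the built relation and registers a dependency,
-- so dbt runs stg_customers and stg_orders before this model
SELECT
    c.customer_id,
    c.email,
    COUNT(o.order_id) AS lifetime_orders
FROM {{ ref('stg_customers') }} AS c
LEFT JOIN {{ ref('stg_orders') }} AS o
    ON o.customer_id = c.customer_id
GROUP BY 1, 2
```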

Sources with freshness check

Alert when upstream data is stale

# sources.yml
version: 2

sources:
  - name: raw
    database: ANALYTICS
    schema: RAW
    freshness:
      warn_after: { count: 12, period: hour }
      error_after: { count: 24, period: hour }
    loaded_at_field: _loaded_at
    tables:
      - name: orders
      - name: customers

# Check freshness with: dbt source freshness

Community Notes

Real experiences from developers who've used this tool