dbt (data build tool) commands cheat sheet

Introduction

dbt (data build tool) is a transformation tool. It doesn’t extract or load data, but it is very good at transforming data that is already loaded into your warehouse.
Here I have combined the most useful dbt commands into one overview.

dbt generic commands

dbt init project_name – creates the skeleton of a new dbt project in a folder called project_name.
dbt deps – installs the dbt packages listed in the packages.yml file.
dbt clean – a utility command that deletes all folders listed under clean-targets in dbt_project.yml. It is typically used to remove /dbt_modules (populated when you run deps) and /target (populated when models are run).
dbt run – regular run. Runs all models in dependency order.
dbt run --full-refresh – rebuilds incremental models from scratch.
dbt test – runs custom data tests and schema tests.
dbt seed – loads the CSV files in the data-paths directory into the data warehouse. Also, see the seeds section of this guide.
dbt compile – compiles all models. You won’t need to run this regularly, because dbt compiles models whenever you run them.
dbt snapshot – executes all the snapshots defined in your project.
dbt debug – checks that your connection, config files, and dbt dependencies are in good order.
dbt run --threads 2 – runs all models using 2 threads, overriding the threads setting in profiles.yml.
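
The generic commands above are often chained into one build. A minimal sketch, assuming a helper function named build_project and a DBT variable (both hypothetical, not part of dbt) that lets you swap the dbt executable for a stand-in to preview the steps:

```shell
# DBT is a hypothetical override; it defaults to the real dbt executable.
DBT="${DBT:-dbt}"

# Install packages, load seeds, build models, then test.
# Each step runs only if the previous one succeeded.
build_project() {
  "$DBT" deps &&
  "$DBT" seed &&
  "$DBT" run &&
  "$DBT" test
}
```

Call build_project from the project root. Because the steps are joined with &&, a failing seed or run stops the chain before the tests start.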

dbt model specifying commands

Specifying models can save you a lot of time by only running/testing the models that you think are relevant.

However, there is a risk that you’ll leave out a dependency that needs to run, so it’s a good idea to understand the syntax thoroughly:

Running based on the model name

dbt run --models modelname – will only run modelname
dbt run --models +modelname – will run modelname and all parents
dbt run --models modelname+ – will run modelname and all children
dbt run --models +modelname+ – will run modelname, and all parents and children
dbt run --models @modelname – will run modelname, all parents, all children, and all parents of all children
dbt run --exclude modelname – will run all models except modelname

Running based on the folder name

dbt run --models folder – will run all models in a folder
dbt run --models folder.subfolder – will run all models in the subfolder
dbt run --models +folder.subfolder – will run all models in the subfolder and all parents
dbt run --models folder.subfolder+ – will run all models in the subfolder and all children
dbt run --models +folder.subfolder+ – will run all models in the subfolder, all parents, all children
dbt run --models @folder.subfolder – will run all models in the subfolder, all parents, all children, AND all parents of all children
dbt run --exclude folder – will run all models except those in the folder

Running based on tag

dbt run --models tag:tagname – will run only tagged models.
dbt run --models +tag:tagname – will run tagged models and all parents.
dbt run --models tag:tagname+ – will run tagged models and all children.
dbt run --models +tag:tagname+ – will run tagged models and all parents and children.
dbt run --models @tag:tagname – will run tagged models, all parents, all children, AND all parents of all children.
dbt run --exclude tag:tagname – will run all models except the tagged models.

Here, --models can be shortened to -m. In dbt 0.21 and later the flag is named --select (-s), and --models is still accepted as an alias.
dbt test accepts all of the selection syntax shown above for dbt run.
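
Since dbt run and dbt test share the same selection syntax, one selector can drive both. A minimal sketch, assuming a hypothetical helper named run_and_test and a hypothetical DBT override variable (neither is part of dbt):

```shell
# DBT is a hypothetical override; it defaults to the real dbt executable.
DBT="${DBT:-dbt}"

# Run a selection and, only if the run succeeds, test the same selection.
run_and_test() {
  selector="$1"
  "$DBT" run --models "$selector" &&
  "$DBT" test --models "$selector"
}
```

For example, run_and_test +orders would build a model named orders (a made-up name) with all of its parents, then test the same set.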

Multiple model inputs in dbt command

dbt run --models modelname+ folder @tag:tagname modelname – any number of space-separated selectors can be specified like this.

dbt run --exclude modelname folder tag:tagname modelname – any number of selectors can be excluded like this.
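
--models and --exclude can also be combined in a single command. A minimal sketch, assuming a hypothetical helper named run_without_tag and a hypothetical DBT override variable; the model and tag names are placeholders:

```shell
# DBT is a hypothetical override; it defaults to the real dbt executable.
DBT="${DBT:-dbt}"

# Build one model plus all of its parents, skipping anything with the given tag.
run_without_tag() {
  model="$1"
  tag="$2"
  "$DBT" run --models "+$model" --exclude "tag:$tag"
}
```

For example, run_without_tag orders nightly expands to dbt run --models +orders --exclude tag:nightly.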

Special commands

help command
The --help flag shows the available options and subcommands for a command.
ex: dbt run --help, dbt docs --help

dbt source
It provides subcommands that are helpful when working with source data.
dbt source snapshot-freshness – queries all the source tables defined in the project and determines how fresh each one is. In dbt 0.21 and later this command is named dbt source freshness.
 
dbt docs
dbt docs generate – a very powerful command that generates documentation for the models in your project based on the config files.
dbt docs serve --port 8001 – serves the generated docs locally on port 8001 so you can open them in your browser.
The docs site shows more information about each model, its dependencies, and a DAG diagram.
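
The two docs commands are normally run back to back, because serve displays whatever generate last produced. A minimal sketch, assuming a hypothetical helper named preview_docs and a hypothetical DBT override variable:

```shell
# DBT is a hypothetical override; it defaults to the real dbt executable.
DBT="${DBT:-dbt}"

# Regenerate the docs, then serve them locally.
# The port defaults to 8001, matching the example in this section.
preview_docs() {
  port="${1:-8001}"
  "$DBT" docs generate &&
  "$DBT" docs serve --port "$port"
}
```

Call preview_docs with no argument for port 8001, or pass another port such as preview_docs 8080.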

Treat warnings as errors
dbt --warn-error run – dbt sometimes shows warnings, for example about deprecated methods or configurations. This flag makes dbt treat those warnings as errors.

Failing fast
dbt run --fail-fast (or -x) – makes dbt exit immediately if a single model fails to build. If other models are in progress when the first model fails, dbt terminates the connections for those still-running models.
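
--warn-error and --fail-fast can be combined for a stricter run, for example in a CI job. A minimal sketch, assuming a hypothetical helper named strict_run and a hypothetical DBT override variable:

```shell
# DBT is a hypothetical override; it defaults to the real dbt executable.
DBT="${DBT:-dbt}"

# Treat warnings as errors and stop at the first failing model.
# Any extra arguments (such as a model selection) are passed through to dbt run.
strict_run() {
  "$DBT" --warn-error run --fail-fast "$@"
}
```

Note the placement: --warn-error goes before the run subcommand, while --fail-fast goes after it.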

Enable or Disable Colorized Logs
dbt --use-colors run – enables colored terminal logs (this is the default)
dbt --no-use-colors run – disables the colored terminal logs (green/red)

list resources (CLI only)
dbt ls (or dbt list) – lists the resources in the dbt project, such as models and sources
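
The listing can be narrowed to one kind of resource with the --resource-type option. A minimal sketch, assuming a hypothetical helper named list_resources and a hypothetical DBT override variable:

```shell
# DBT is a hypothetical override; it defaults to the real dbt executable.
DBT="${DBT:-dbt}"

# List resources of a single type; defaults to models when no type is given.
list_resources() {
  "$DBT" ls --resource-type "${1:-model}"
}
```

For example, list_resources source lists only the sources defined in the project.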
