What is dbt?
You know that feeling when you’re trying to do something and you just have the right tool for the job, the perfect tool? It just works.
What’s more, you can use that same tool to do a dozen different things, and it doesn’t need a 50-page instruction manual and three YouTube videos to decipher. That’s dbt.
dbt is a data transformation tool that you can use either from its cloud platform or from a command line interface. It organises and structures your transformation steps so they come together at just the right time, ultimately creating beautifully orchestrated, highly modular and reusable data flows.
Why is dbt so powerful?
The reason dbt is so powerful is that it abstracts the actual orchestration of your pipeline scripts away from the developer and handles any resulting DDL (data definition language), allowing them to focus instead on capturing the business logic with a series of select statements.
So all you’ve gotta do is write a few select statements and dbt creates the views, tables and other materialisations required. Neat, huh?
But that’s not it, oh no.
As Data Engineers and Data Analytics consultants, we love dbt because it makes our job easier and leaves a much cleaner footprint for the next developer who picks up our work once we’re done. It also integrates seamlessly with Git, so it’s perfect for teamwork!
We’re so sure that dbt is the right tool for us, and for you, that we’ve even built our own immersive training course to show more people how great it is. This Biztory-led course is designed to take you through setting up dbt as you might in a professional setting and get you started on your first project. It is the first of its kind in this space! If you’re interested in learning how to supercharge your transformations with dbt, check it out here!
dbt: The basics
dbt is incredibly easy to get started with if you’ve done any work in the data industry before, as it uses SQL as a base (the flavour depends on your underlying data platform). As mentioned above, it abstracts the orchestration, leaving you to write just the SQL SELECT statements that encapsulate the business logic for the data you’re after.
When you run ‘dbt run’, dbt gathers the models you’ve built and parses them to determine the order in which they should be run, so your pipeline flows correctly, and then runs them all for you!
This vastly simplifies model deployment and saves you a headache when a model is changed or removed, because dbt works out the order all over again on the next run.
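To make the ordering idea concrete, here is a rough sketch in Python. It is not dbt’s actual code, just an illustration of the principle: find the ref calls in each model, treat them as dependencies, and sort the models so that each one runs after the models it depends on. The model names and SQL are made up for the example, and the pattern assumes single-quoted names inside ref.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical models (name -> SQL text), invented for this illustration.
MODELS = {
    "revenue_report": "select * from {{ ref('customer_orders') }}",
    "customer_orders": (
        "select * from {{ ref('stg_orders') }} o "
        "join {{ ref('stg_customers') }} c on o.customer_id = c.id"
    ),
    "stg_orders": "select * from raw.orders",
    "stg_customers": "select * from raw.customers",
}

# Matches {{ ref('model_name') }} and captures model_name.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*'([^']+)'\s*\)\s*\}\}")


def find_refs(sql):
    """Return the set of model names referenced through ref() in the SQL."""
    return set(REF_PATTERN.findall(sql))


def run_order(models):
    """Order model names so each model comes after everything it refs."""
    # TopologicalSorter expects a mapping of node -> its predecessors.
    graph = {name: find_refs(sql) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


order = run_order(MODELS)
print(order)
```

Add, rename or remove a model in MODELS and call run_order again: the order is recomputed from scratch, which is the same reason a changed model doesn’t leave you untangling a hand-written run script.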
The DL on the DDL
dbt handles DDL for you - and it’s really cool. It achieves this by using Jinja macros to handle the references between your data sources and models.
You define your data sources in one place and call them through the source macro wherever they’re needed. This saves you a heap of typing and also helps dbt build out its lineage graph, or DAG (directed acyclic graph).
We just write the select queries that describe the model we’re building, and dbt generates the DDL needed to build it.
And the best part is, once you’ve built a model to capture some logic, you can refer back to it from your other models with the ref macro, in much the same way as you reference your sources. So your models are fully reusable. Magic!
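As a rough illustration of what “handling the DDL” means, the Python sketch below swaps each ref call for a concrete relation name and wraps the resulting select in a create statement. Everything here is an assumption made for the example: the schema name, the model names and the exact shape of the statement. The SQL dbt really generates depends on your data platform and on how the model is configured.

```python
import re

# Matches {{ ref('model_name') }} and captures model_name.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*'([^']+)'\s*\)\s*\}\}")


def compile_model(sql, schema):
    """Replace every {{ ref('name') }} with the relation name schema.name."""
    return REF_PATTERN.sub(lambda match: f"{schema}.{match.group(1)}", sql)


def build_ddl(name, sql, schema="analytics", materialized="view"):
    """Wrap a model's select statement in a simplified create statement."""
    if materialized not in ("view", "table"):
        raise ValueError(f"unsupported materialisation: {materialized}")
    compiled = compile_model(sql, schema)
    return f"create {materialized} {schema}.{name} as (\n{compiled}\n);"


# Hypothetical model that builds on two other (hypothetical) models.
model_sql = (
    "select * from {{ ref('stg_orders') }} o "
    "join {{ ref('stg_customers') }} c on o.customer_id = c.id"
)

ddl = build_ddl("customer_orders", model_sql)
print(ddl)
```

The point of the sketch is the division of labour: the model file only ever contains the select, while the object names and the create statement around it are filled in at build time.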
It does the boring bits for you!
Let’s be honest, testing and documentation can be boring and tedious. That’s why dbt handles a lot of both for you. Much of the documentation can be inferred from the setup of your project and the features and tests you’ve configured, but you can also add your own descriptions in your project setup. Similarly, you can specify tests to run against your models, and choose whether a failing test is reported as a warning or an error. With a quick CLI command (‘dbt docs generate’ or ‘dbt test’) you can update the documentation or see the results of your tests - it couldn’t be easier!
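If you’re wondering how tests and severity fit together, here is a small Python sketch of the idea, using an in-memory SQLite database as a stand-in for your data platform. A test is treated as a query that returns the rows that break a rule, and the severity decides whether any failures are reported as a warning or as a failure. The table, test names and result format are all invented for the example, and this is not how dbt itself is implemented.

```python
import sqlite3


def run_test(conn, name, failing_rows_sql, severity="error"):
    """Run a query that selects rule-breaking rows and report the outcome."""
    failures = len(conn.execute(failing_rows_sql).fetchall())
    if failures == 0:
        status = "pass"
    elif severity == "warn":
        status = "warn"  # reported, but does not count as a failed run
    else:
        status = "fail"
    return {"test": name, "status": status, "failures": failures}


# Hypothetical data: one missing customer_id and one duplicated id.
conn = sqlite3.connect(":memory:")
conn.execute("create table orders (id integer, customer_id integer)")
conn.executemany(
    "insert into orders values (?, ?)", [(1, 10), (2, None), (2, 11)]
)

results = [
    run_test(
        conn,
        "not_null_orders_customer_id",
        "select * from orders where customer_id is null",
        severity="warn",
    ),
    run_test(
        conn,
        "unique_orders_id",
        "select id from orders group by id having count(*) > 1",
        severity="error",
    ),
]

for result in results:
    print(result)
```

Here the missing customer_id only raises a warning, while the duplicated id counts as a failure, which is the kind of distinction the severity setting lets you draw.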
Conclusion: to dbt or not to dbt?
In summary, dbt is packed with features that make your life easier: seeds that let you refer to CSVs in your models, custom macros that let you add elegant logic to your model calculations, and heaps more.
And it does all this whilst being highly extensible and highly configurable, and it automates many of the smaller (but still important!) processes that add overhead to your projects.
We love it because it saves us time, makes our projects much easier to manage, scale, reuse and collaborate on, and allows us to focus our transformation efforts on capturing the business logic.
Let us give you a hand getting started so you too can see how powerful dbt is - join us on the 9th November for our dbt live public training course. See you there!