
Is ELT better than ETL?

Written by Bjorn Cornelis
Tuesday 9 March, 2021

To be honest, I wouldn’t blame you for reading the title twice in order to spot the difference. You did so, didn’t you?

Before we answer this question, let’s have a quick look at some base concepts first.

The base concepts

ETL or ELT describes the process of moving data from one or multiple source systems into a (centralised) target data system. That central data system enables users, other processes, or applications to reuse the combined data for reporting, analysis, and mining, or to feed it into other applications that require data.

Both concepts, ETL and ELT, consist of the same three operations.

E = Extract / T = Transform / L = Load

The order of the letters determines the sequence in which the operations occur in that process.

ETL
  1. Extract the data from the source system(s)

  2. Transform the data.

  3. Load the data into the target system.

ELT, in turn, performs the exact same operations, but in a different order.

ELT
  1. Extract the data.

  2. Load the data into the target system.

  3. Transform the data.
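To make the difference in ordering concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the order records, the `extract` and `transform` helpers, and the plain dictionary standing in for the target system. It is not how any particular tool implements ETL or ELT.

```python
Row = dict


def extract() -> list[Row]:
    # Hypothetical source system: a few raw order records.
    return [
        {"order_id": 1, "country": "BE", "amount": 120.0},
        {"order_id": 2, "country": "BE", "amount": 80.0},
        {"order_id": 3, "country": "NL", "amount": 50.0},
    ]


def transform(rows: list[Row]) -> list[Row]:
    # Example transformation: total amount per country.
    totals: dict[str, float] = {}
    for row in rows:
        totals[row["country"]] = totals.get(row["country"], 0.0) + row["amount"]
    return [{"country": c, "total": t} for c, t in sorted(totals.items())]


def run_etl(target: dict) -> None:
    # Extract -> Transform -> Load: only the transformed result reaches the target.
    target["sales_by_country"] = transform(extract())


def run_elt(target: dict) -> None:
    # Extract -> Load -> Transform: the raw rows land first,
    # and the transformation runs afterwards on what was loaded.
    target["raw_orders"] = extract()
    target["sales_by_country"] = transform(target["raw_orders"])


etl_target: dict = {}
elt_target: dict = {}
run_etl(etl_target)
run_elt(elt_target)
```

Both targets end up with the same `sales_by_country` result, but only `elt_target` still holds the raw orders.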

ETL has been the standard and proven way of centralising data over the last few decades.

So why change a winning team?

In the ETL process, the Transform step comes before the Load step. That means the owner of the process must decide upfront which transformations will be applied. As a result, the data is already prepared, cleansed, filtered, enriched, or aggregated before it is loaded into the centralised data system.

The main reasons for doing these transformation steps upfront are (1) storage capacity and (2) the processing power of the ETL engine. This approach allowed us to populate our centralised data system in an optimised format and within an acceptable timeframe.

How ETL works
Nevertheless, this sensible approach comes with a downside. Because we reduce the data stored in our centralised data system, we potentially lose some of our data and information along the way: data that seems insignificant now but can be critical in tomorrow’s analysis.
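A small, hypothetical illustration of that loss, sketched in Python with made-up order records: once only the per-country totals are stored, a question that comes up later (here, "which single order was the largest?") can no longer be answered from the target.

```python
# Hypothetical raw data as it exists in the source system.
source_orders = [
    {"order_id": 1, "country": "BE", "amount": 120.0},
    {"order_id": 2, "country": "BE", "amount": 80.0},
    {"order_id": 3, "country": "NL", "amount": 50.0},
]

# ETL-style target: the data is aggregated before it is loaded.
totals: dict[str, float] = {}
for order in source_orders:
    totals[order["country"]] = totals.get(order["country"], 0.0) + order["amount"]
stored_in_target = [{"country": c, "total": t} for c, t in sorted(totals.items())]

# Tomorrow's question needs the individual orders,
# which never made it into the target.
target_has_order_detail = all("order_id" in row for row in stored_in_target)

# With the raw rows available, the question is trivial to answer.
largest_order = max(source_orders, key=lambda row: row["amount"])
```

Here `target_has_order_detail` is `False`: the order-level detail exists only in the source, if it is still kept there at all.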

With great data comes great analytics

During the past decade we’ve seen data analytics move from standard reporting into ad-hoc analytics. Data is no longer used to feed predefined reports but is consumed by a variety of tools, technologies, and users. Scorecards, dashboards, metrics, self-service analytics, predictive analytics and so on… they all want a piece of your data and they all want it in their own unpredictable way.

So how on earth will we be able to decide which selection of data to store, what kind of enrichment to apply, or what our transformations will look like? We simply can’t.

Enter the cloud

But what if we don’t have to make that decision upfront? What if we don’t have to worry about storage capacity or the processing power to consume this massive and diverse pool of data? What if we can just push all of our data into that centralised data system and worry about the consumption later?

Cloud data systems are the answer to today’s biggest data challenges, for both storage volume and processing power. Because cloud platforms scale on demand, both resources become virtually unlimited, which makes the cloud perhaps the only affordable option left in today’s data landscape.

So instead of bargaining about which data makes it to the centralised data system and which doesn’t, we try to store as much as possible. The more data, the better the insights; the better the insights, the better the decisions.

Enter ELT

Without being aware of it, we are applying ELT. We Extract and Load the data first, and only later Transform it, depending on its purpose and consumer(s).

ELT process
Besides that critical aspect of data completeness, ELT comes with a few other advantages as well.

  • Minimal impact on the Extract and Load steps

    • As we don’t need to worry about any transformations, we can grab the data in its raw form and push it directly to the centralised data system. Whenever a new source appears, we set up a separate flow without impacting the existing ones.

  • The same data can be used for different purposes without replication.

    • The same data in our centralised data system can be used in both an aggregated view for high performance reporting and in its most granular form for predictive modeling.

  • Other advantages

    • Use the power of the centralised data system to transform the data.

    • Loading data into the central data system takes less time, because the transformation step is no longer part of it.

    • Loading and transformation are decoupled.

    • Failed transformations do not break the data pipeline.
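The advantages above can be sketched with Python’s built-in `sqlite3` module, using an in-memory SQLite database as a stand-in for the centralised data system. The table, view, and column names are made up for this example, and a real cloud data warehouse will behave differently in the details.

```python
import sqlite3

# In-memory SQLite database standing in for the centralised data system.
conn = sqlite3.connect(":memory:")

# Extract + Load: the raw rows go in untouched.
conn.execute("CREATE TABLE raw_orders (order_id INTEGER, country TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, "BE", 120.0), (2, "BE", 80.0), (3, "NL", 50.0)],
)
conn.commit()

# Transform: done later, inside the target, using its own SQL engine.
conn.execute(
    "CREATE VIEW sales_by_country AS "
    "SELECT country, SUM(amount) AS total FROM raw_orders GROUP BY country"
)

# The same loaded data serves an aggregated view for reporting...
aggregated = conn.execute(
    "SELECT country, total FROM sales_by_country ORDER BY country"
).fetchall()

# ...and its most granular form for detailed analysis.
granular = conn.execute(
    "SELECT order_id, amount FROM raw_orders ORDER BY order_id"
).fetchall()

# A failing transformation (it references a column that does not exist)
# raises an error, but the data that was already loaded is unaffected.
try:
    conn.execute("CREATE TABLE broken AS SELECT missing_column FROM raw_orders")
    transformation_failed = False
except sqlite3.OperationalError:
    transformation_failed = True

rows_after_failure = conn.execute("SELECT COUNT(*) FROM raw_orders").fetchone()[0]
```

Because loading and transformation are separate statements, the broken transformation can be fixed and rerun without extracting or loading the data again.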

 


Now, is ELT better than ETL?

To cycle back to the question that started this data journey: is ELT better than ETL?

I think it is fair to conclude that ELT is the approach to moving data into a centralised data system that better fits the data and analytics challenges we face nowadays, and that makes optimal use of the technical solutions out there.

Want to learn more about ELT? Don't hesitate to check out Fivetran.


- Bjorn