14 February, 2023
Head of Marketing at Biztory
A lot has been written and said about the “Modern Data Stack”. But what is The Modern Data Stack, anyway? Buzzword bingo? Or actually worth investigating and implementing?
As businesses strive to be more data-driven, efforts to share data and collaborate between various business divisions are increasing. A Modern Data Stack infrastructure can help you do just that.
So, what does a modern data stack infrastructure look like? And how can you get started implementing one in your organisation?
Grab yourself another cup of coffee. We’re getting started…
A Modern Data Stack is a collection of tools used for collecting, storing, processing, and analysing data. The aim of a Modern Data Stack is to help your organisation save time, effort, and money.
Prioritising flexible self-service analytics, governed data on trusted platforms, and speed to insight, the MDS enables businesses to integrate cloud-based data sources with legacy and on-premises solutions, empowering end users with data, with minimal configuration.
And while the tools and data sources within your tech stack will vary from business to business, the overall structure will remain the same. More on that later!
First, let’s take a look at how the MDS differs from a legacy toolstack.
The biggest difference between a legacy toolstack and the Modern Data Stack, and the one everyone agrees on, is that the Modern Data Stack is cloud-based and requires little technical configuration by users.
In a legacy data stack, systems were often designed for a specific purpose, such as a specific type of data or a specific use case, and were not as flexible or adaptable as modern systems.
In the end, it all comes down to the same promise: a Modern Data Stack lowers the technical barrier to data integration, transformation and visualisation for end users. It promises greater scalability, accessibility and best-of-breed capabilities - saving you time, money, and a whole lot of headaches.
Ahaa… The big question now, of course, is: what characteristics make your data stack… “Modern”?
Great question. Here’s what we would say a Modern Data Stack Infrastructure looks like:
Modern, of course, is a relative way of describing your data stack - given the speed at which these technologies evolve. But if there were a list of common characteristics, these would definitely be on it:
As you can see in the image above, your data warehouse sits in a central position. Modern Data Stack tools operate directly on the data in your data warehouse or lakehouse. Your data warehouse often becomes your “single source of truth”, as it is the place where data silos are broken down and where access, control, and governance are maximised.
Most technologies of a Modern Data Stack are cloud-based SaaS offerings. They can be tested via free trials and require little to no knowledge of configuration. Most underlying maintenance is handled by the technology itself, allowing you to focus on outcomes, rather than fixing problems in the software.
Another aspect of Modern Data Stack tools is that they’re highly scalable. Because they’re often cloud-based, most of the tools within this tech stack are designed for horizontal scalability.
Scalability can refer to different dimensions such as:
You don’t want to be locked in to specific technologies or vendors. You want to find and use the tools that are best for your specific needs and business environment. Modern Data Stack tools can therefore be swapped out for other tools that offer the same or similar functionality.
You don’t need to know code to manage and use Modern Data Stack tools. The goal is to enable all end users to not only easily use these tools, but also manage them without in-depth technical knowledge or skills.
Knowing what makes a Modern Data Stack “modern” is one thing. Understanding the business value that comes with it is another.
Let’s dig deeper…
Building a Modern Data Stack adds value to your business overall. If you’re looking for reasons why you should act sooner rather than later, here are some main points to consider:
Reason #1: Work more efficiently
We’ve said it before. The Modern Data Stack promises to save you time, money, and effort by leveraging systems that are designed with a much better standard for usability, manageability, and general human efficiency.
Whether it is managing databases in your cloud warehouse, or building reports and dashboards in your data visualisation tool, with Modern Data Stack technologies it takes significantly less time to get the job done.
Reason #2: Automation & re-usability
Your data sources can change, disrupting your analytics workflows. A Modern Data Stack allows you to automate data replication more easily. Beyond that, in today’s world, no tool exists on its own.
Through APIs, you can make the technologies within your data stack work fairly easily with other tools. This makes each of them more of a building block in a larger whole than a standalone tool.
Reason #3: Move to operational analytics
A modern business wants fast, reliable answers to its questions, so it can make crucial business decisions based on facts rather than gut feeling alone. Modern Data Stacks are fast to set up, and do not need a large IT team to support them.
As an MDS can integrate with a wide variety of first- and third-party data sources, you’re able to start generating actionable insights in a matter of days, rather than weeks or months.
Now, let’s take a look at how you can build a Modern Data Stack one step at a time.
A Modern Data Stack consists of different components that work together to provide you with an end-to-end data solution. The different components of a Modern Data Stack are:
You want all your data centralised in one place. A modern cloud-based data warehouse like Snowflake or Google BigQuery should be able to store an accurate, up-to-date replica of all the data in your business systems.
To move data from the source to your data warehouse, you need data pipelines. Building a data pipeline can be pretty complex and requires both knowledge of the data source and some engineering skills.
Using a tool like Fivetran can automate this process, so you have more time to focus on data modelling, analysis and reporting.
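To make the idea of a data pipeline more concrete, here is a minimal sketch of the "load" step in Python. It is not how Fivetran works internally. It assumes an in-memory SQLite database as a stand-in for a cloud warehouse, and the table name raw_customers, the function load_customers, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical rows, standing in for what a source system might return.
SOURCE_ROWS = [
    {"id": 1, "name": "Acme Ltd", "country": "BE"},
    {"id": 2, "name": "Globex", "country": "NL"},
]


def load_customers(conn, rows):
    """Replicate source rows into a raw table, replacing earlier copies by id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_customers ("
        "id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO raw_customers (id, name, country) "
        "VALUES (:id, :name, :country)",
        rows,
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM raw_customers").fetchone()[0]


# SQLite in memory is an assumption used here in place of a real warehouse.
conn = sqlite3.connect(":memory:")
row_count = load_customers(conn, SOURCE_ROWS)
# Running the load again replaces rows instead of duplicating them.
row_count_after_rerun = load_customers(conn, SOURCE_ROWS)
print(row_count, row_count_after_rerun)
```

Keying the load on the record id means the job can be re-run on a schedule without creating duplicates, which is one of the details a managed ingestion tool takes care of for you.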
This is an important step. Here, you will determine how you will clean and transform raw data to prepare it for analysis. Whether we like it or not, your data is probably a bit messy.
The same data might be flowing around and is often duplicated among different systems. The customer data in your CRM, for example, could also live in your accounting system - and there will probably be some small differences in the data between these systems.
Which system contains “the truth”?
Secondly, you’ll need to think about guidelines for naming data, documenting lineage, etc.
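The CRM versus accounting example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer IDs, field names, and sample records are made up, and the helper functions normalise and find_mismatches are hypothetical. The sketch does not decide which system holds “the truth”; it only surfaces where the two disagree so that someone can make that decision.

```python
def normalise(value):
    """Trim and lowercase so purely cosmetic differences are not reported."""
    return value.strip().lower()


def find_mismatches(crm, accounting, fields):
    """Return (customer_id, field, crm_value, accounting_value) per real difference."""
    mismatches = []
    # Only compare customers that exist in both systems.
    for customer_id in sorted(crm.keys() & accounting.keys()):
        for field in fields:
            crm_value = crm[customer_id][field]
            acc_value = accounting[customer_id][field]
            if normalise(crm_value) != normalise(acc_value):
                mismatches.append((customer_id, field, crm_value, acc_value))
    return mismatches


# Hypothetical sample data for the two systems.
crm_customers = {
    "C001": {"name": "Acme Ltd", "email": "billing@acme.example"},
    "C002": {"name": "Globex", "email": "info@globex.example"},
}
accounting_customers = {
    "C001": {"name": "ACME Ltd ", "email": "billing@acme.example"},
    "C002": {"name": "Globex", "email": "finance@globex.example"},
}

mismatches = find_mismatches(crm_customers, accounting_customers, ["name", "email"])
for row in mismatches:
    print(row)
```

Here the differing capitalisation and trailing space in the first customer's name are treated as noise, while the second customer's differing email address is reported as a genuine conflict.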
Before you can start with data visualisations and analysis - you need to store your data safely in a single location like a data warehouse, or a data lake for unstructured data.
The way data is stored and organised significantly affects data access and influences how easily different departments can share the data in a governed and secure way.
In that view, a data warehouse is a great solution to remove data silos and drive data clarity through a single source of truth.
Once your data is collected, stored securely in a warehouse, and prepared for analysis - it’s time to start visualising your data. There are numerous ways of visualising your data for analysis and providing decision support for your managers and leaders.
The main goal of a data viz is to present insights and other useful information about data in a way that is easy to understand. How to do that? Read our top 10 tips on how to make your dashboards look great here.
Okay, now that you understand the different stages of the Modern Data Stack and how to build a Modern Analytics flow - let’s take a look at some of the technologies that work together really well in an MDS infrastructure.
Let’s start off with an important note: there’s no such thing as a one-size-fits-all approach here. It’s important to assess your own situation, business needs and requirements first.
Based on that, you’re better able to select the right technology for the job. Nonetheless, here are some tools you can consider when building your own Modern Analytics stack…
Data Collection & Ingestion: Fivetran
Fivetran is a smart and easy way to load data into your warehouse without much hassle. It offers seamless connectivity to external apps and databases with pre-built schemas that help your team shift the focus to analytics instead of data engineering.
Read more about Fivetran here.
Data Preparation: dbt
dbt is a hot topic in the world of analytics right now. It offers a unique data engineering solution that helps teams work directly within the warehouse to produce trusted datasets for reporting, ML modelling, and operational workflows.
Combining SQL with software engineering best practices, dbt enables your team to rapidly increase data quality with confident testing and modular data modelling.
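To give a feel for what “confident testing” can mean in practice, here is a rough sketch in Python. It does not use dbt or its API. It only imitates the general idea behind uniqueness and not-null checks: each check is a SQL query that selects the offending rows, and the check passes when nothing comes back. The table, the check names, and the sample data are assumptions, and SQLite stands in for the warehouse.

```python
import sqlite3


def failing_rows(conn, query):
    """Run a check query; an empty result means the check passes."""
    return conn.execute(query).fetchall()


# SQLite in memory is an assumption used here in place of a real warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@example.com"), (2, None), (2, "c@example.com")],
)

# Hypothetical checks: each query selects the rows that violate a rule.
checks = {
    "id_is_unique": "SELECT id FROM customers GROUP BY id HAVING COUNT(*) > 1",
    "email_is_not_null": "SELECT id FROM customers WHERE email IS NULL",
}

results = {name: failing_rows(conn, sql) for name, sql in checks.items()}
for name, rows in results.items():
    status = "PASS" if not rows else "FAIL"
    print(name, status, rows)
```

With the sample data, both checks fail on customer id 2, which is the kind of issue you want to catch before it reaches a dashboard.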
Data Storage: Snowflake
Snowflake is a unique cloud data solution. The platform enables teams to remove data silos and consolidate data warehouses, data marts, and data lakes into a single source of truth.
Data visualisation: Tableau
Tableau is a market-leading self-service analytics tool. It’s known for taking any kind of data from almost any system and turning it into actionable insights with speed and ease. It’s as simple as dragging and dropping.
Tableau is designed with a user-first approach, which drives an intuitive user experience that is focused on data analysis, rather than learning the software itself.
You can see the tool in action yourself here.