15 February, 2023
min reading time
I help you get insights from your data. Easier. Faster.
Learn how to use Tableau Prep to schedule, monitor and clean your data for analysis.
Let's get started!
Tableau Prep Builder is a powerful data preparation tool that allows users to quickly and easily clean, reshape, and combine data from a variety of sources. With its intuitive interface and a wide range of built-in functions, Tableau Prep makes it easy for users to perform complex data transformations without the need for coding or scripting.
The workspace of Tableau prep builder is very visual - allowing you to easily keep track of changes made throughout the flow, and identify and fix data quality issues at each step. There are three main sections of the tableau prep workspace - the flow pane (1), the profile pane (2) and the data grid (3).
A flow is a series of steps that you take to clean and reshape your data, and shows a visual representation of what is happening along the way. The profile pane allows you to see a summarized view of your data, and provides you the ability to click on a value to highlight the other fields . Clicking somewhere here will also filter the row level data grid below which is handy if you want to drill down and investigate certain values.. The value 0.7 in the 4th column (Discount) has been clicked in the example above and you can see how the other columns are highlighted for where this value exists, and the data grid below is also filtered for this value.
At any point in your flow, you can add an output step to write the data to a file, database, published data source on Tableau Server/Tableau Cloud, or to CRM Analytics. This allows you to easily share your cleaned data with others and collaborate on your data analysis. You can also directly preview your results in tableau desktop by right clicking any step and selecting “Preview in Tableau Desktop”.
You can import whole flows, or a certain group of steps from a flow, into another flow. This can help to save time and ensure consistency in your data cleaning process.
If you would like to share your flow then you can publish it to Tableau Server/Tableau cloud and allow others to run it, edit it, or make a copy of it. This allows for easy collaboration.
Tableau prep conductor is part of the Data Management Add-On for Tableau Server, and it allows you to schedule Tableau Prep flows (which have been published to Tableau Server/Tableau Cloud) to run automatically at specific intervals such as hourly, daily, weekly or monthly. This ensures that your data is always up-to-date and ready for analysis, saving you time and effort.
Tableau Prep Conductor also allows you to monitor the performance of your flows. You can view detailed flow run information, including start and end times, status, and any error messages. This allows you to quickly identify and resolve any issues that may arise during the flow execution.
Tableau Prep supports a wide range of data sources, including Excel spreadsheets, CSV files, Tableau data sources, databases, and cloud-based data sources. When you open Tableau Prep, it’ll ask if you want to connect to data or open a previous flow. When you select the “Connect to Data” option, you choose your data source, and enter the connection details if required. After successfully connecting to your desired data source, Tableau will then take you to the workspace. From there, you can start building your flow to clean your data, or you can add more data connections via the connection pane.
Start building your flow
Add another data connection
Tableau Prep offers several data cleaning options including:
Filtering: Remove unwanted rows from your data with a calculation, filter selected values, filter on a range of values, filter using a wildcard match, or filter null values. You can also easily click on a value in the profile pane or data pane and select “keep only” or “exclude”, just like in a visualization in Tableau Desktop.
Automatically grouping values based on spelling |
Result |
Configuring your join
If you want to drill into the rows that didn’t join then you can click the bar to see the unmatched rows.
Viewing the unmatched rows
Reviewing union results
These options can be used in combination to clean and transform your data to meet your needs.
Once you’re happy with your flow, you can publish it to Tableau Server so that it can be shared with your colleagues.
As a further step, and only If you have the data management add-on for Tableau Server, you can automate your flows to run on a schedule.
Flow schedules are added in the same way as for extract refreshes and subscriptions - they are on a server level, and are set up independently from your flows to begin with, You can then select one of these predefined schedules on which to run your flow (or string of flows).
You can either select the data output steps that you want to include, or set it to include them all.
You can also select the option to send you an email each time the flow has been completed. This can be configured so that the email goes to certain groups or users and you can change the subject, add a message, and also add the data if you want.
If your flow relies on the output from another flow, you can also define this in the setup by choosing the “linked task” option. You can choose which flows should be run together and then define the running order. This means that the next flow won’t start until the previous one is finished. You can also define what to do if one of the flows fails - whether it be to stop all proceeding flows, issue data quality warnings, or send you an email.
I help you get insights from your data. Easier. Faster.
Read more articles of this author