Setting Up Environment and Database
This is the second blog of our 4-series Snowflake 101 blog. Missed the first part? Don't hesitate to read it here.
Now that you have become more familiar with Snowflake and what makes it a revolutionary tool in the data market, we can continue to look at how you can set up the environment. If you haven’t yet seen it, check out our first blog post (Snowflake 101: Why Should I Use Snowflake?) in this series to learn more about Snowflake’s basic concepts and capabilities.
So now you know why you should use Snowflake as a solution, then the next question that pops up is, “How can I start leveraging and using it?”. Here we are going to talk about how to create an account, set up your environment, and create your first Snowflake database.
Create your Snowflake Account and Start Discovering for Free
Snowflake offers a 30-days free account with $400 worth of credits that you can use to play around to query your data. Make sure to choose the Enterprise or Business Critical edition for this trial to take advantage of the most features. Next, for the purpose of this tutorial, we are going to choose AWS as our cloud platform, but you can definitely use Azure or Google Cloud Platform (GCP) too if you are already utilising those. Once signed in, you’ll have a unique URL containing your account number, followed by the zone in which Snowflake has been installed. Using this link, you will be directed to this web-interface login prompt, and voila! You are ready to conquer the Snowflake world.
Setting up your Snowflake Environment
The first thing that you will need to do is to create an environment for your specific project. By default, Snowflake generates a warehouse called compute_wh when you sign in. But we are going to create our own environment.
This virtual warehouse environment is required to query data from Snowflake as well as loading data into Snowflake. A warehouse is a cluster of servers that provides compute resources. We use warehouses to execute compute queries and perform all the operations including bulk data loading. Multiple warehouses allow us to distribute resource-intensive tasks such as querying and data loading. We can also isolate our query workloads from others and resize our warehouse at any time to improve performance.
There are two ways to create a warehouse, from the UI or with SQL Code. Both will give the same output, except that you might want to consider using code in case you need to create multiple objects all at once.
From the UI: Warehouses -> Create
Via worksheet, you can run this code:
CREATE WAREHOUSE SNOWFLAKE_WH
WAREHOUSE_SIZE = XSMALL
AUTO_SUSPEND = 60
AUTO_RESUME = TRUE
INITIALLY_SUSPENDED = TRUE
STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 300
STATEMENT_TIMEOUT_IN_SECONDS = 600;
Here we are creating a SNOWFLAKE_WH virtual environment with the smallest size, auto suspends after 60 seconds of idle time, Automatic Resume once you again execute any query on the Snowflake warehouse, and the last 3 lines of codes for further idle timeout setting.
It is important to configure these options wisely based on your needs. In this tutorial, performance is not our main concern since we are going to use rather simple data. Therefore we are carrying this set-up to optimize cost instead. In case you have big data and want to focus on performance, you can resize the warehouse-size to, for example, Large to 4X-Large. Other scenarios would include considering both cost and performance or prioritise one of them, regardless, any choice that you make here depends on the specific use cases.
Create your first database
Similar to creating a warehouse, you can also create a database via the user interface as well as using code. It is important to note that using a number as a first character or space within the name is prohibited.
From the UI: Databases -> Create
Via worksheet, run this code:
CREATE DATABASE Retail_DB_Sales
COMMENT = 'Retail Sales Info';
Create schemas within a database
Inside a database, we can create multiple schemas and each schema can contain multiple tables and views. To check the existing schema inside of your database, you can simply click the database name -> Schemas. Each time you create a new database, PUBLIC schema (that can be used to create other objects) and INFORMATION_SCHEMA (that contains the database’s metadata) are automatically generated. To create a new schema, you can use both the UI and the code provided below:
1. From the UI: Databases -> Click Retail_DB_Sales -> Schemas -> Create -> Check Management Access (in case you want the security objects to be managed by the owner of the individual object instead of the schema owner)
2. Via worksheet, run this code:
CREATE SCHEMA Sales_Data;
CREATE SCHEMA Sales_Views;
CREATE SCHEMA Sales_Stage;
USE WAREHOUSE SNOWFLAKE_WH;
USE DATABASE Retail_DB_Sales;
USE SCHEMA Sales_Data;
The second batch of codes is used to make sure that we correctly set up the context to the warehouse, database, and schema we made before. You can also set this up using the console on the top right corner.
Your environment is all set, now you can start to play around with your data and are ready to discover the Snowflake interface even further.
Make sure to look out fo the third part of this Snowflake 101 blog post!
Join the Data Jam
92% of companies fail to scale their analytics, which likely includes you. We have studied the patterns in hundreds of client engagements and cracked the code for a modern data stack that guarantees success.
We'll uncover this in the most original webinar you've attended this year.