What is Snowflake?
Snowflake is a cloud data platform that was built in and for the cloud rather than adapted from on-premises software. It can serve as both a data warehouse and a data lake, so you no longer need to maintain a separate data lake alongside your data warehouse or data marts. With Snowflake, you can build your data architecture within a single platform.
Snowflake was born from the idea of combining the capabilities of a traditional data warehouse or data lake with the elasticity and scalability of the cloud, while reducing worries about cost, performance, and the complexity of managing the system. It lets you scale up and down based on need while still meeting your performance requirements. And when we talk about performance, we mean really fast: depending on the workload, results can come back 15-20x faster than with previous solutions that were already pretty fast. That sounds interesting for any data-driven company!
Video
In the video "Reinvent the Data Warehouse", the Snowflake team explains why their solution is faster and costs less.
How does Snowflake work?
The power of Snowflake lies in its architecture. Snowflake comprises three layers:
- Centralised Database Storage
- Query Processing
- Cloud Services
A | Centralised Database Storage
Also known as the storage layer, this is where Snowflake stores all the data, using a hybrid columnar format. Unlike traditional row-oriented systems, which store each record as a complete row, Snowflake reorganises the data into compressed blocks (micro-partitions) in which values are stored column by column. A query can therefore read only the columns and blocks it needs, which makes processing much faster than fetching entire rows.
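To make the difference between row and column layouts concrete, here is a minimal Python sketch. It is a toy model and not Snowflake's actual storage format: the table, the function names, and the "values read" counter are all assumptions made up for illustration, and compression is left out.

```python
# Toy table: each row is (order_id, country, amount). All values are made up.
ROWS = [
    (1, "BE", 120.0),
    (2, "NL", 80.0),
    (3, "BE", 45.5),
    (4, "UK", 300.0),
]


def to_columns(rows, names):
    """Pivot a list of row tuples into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def total_amount_row_store(rows):
    """Row layout: every field of every row is visited to read one field."""
    values_read = 0
    total = 0.0
    for row in rows:
        values_read += len(row)  # the whole row is fetched
        total += row[2]          # but only 'amount' is needed
    return total, values_read


def total_amount_column_store(columns):
    """Column layout: only the 'amount' column is read."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


columns = to_columns(ROWS, ["order_id", "country", "amount"])
print(total_amount_row_store(ROWS))        # (545.5, 12)
print(total_amount_column_store(columns))  # (545.5, 4)
```

Both functions return the same total, but the column layout touches 4 values instead of 12. On a wide table with many columns, that gap grows with every column the query does not need.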
B | Query Processing
This layer consists of multiple virtual warehouses that are responsible for all query processing tasks. It is the muscle of the whole Snowflake system and is what allows you to perform massively parallel processing.
Imagine you have petabytes of data. With these compute clusters, the work is divided into chunks of data that are processed in parallel, so your queries finish much faster. You can also scale up and down based on your needs and use the auto-suspend feature so that you only pay for what you use. Managing these virtual warehouses requires a Snowflake role with the right privileges. Learn more about the different roles and their permissions here.
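The divide-and-combine idea behind parallel processing can be sketched in a few lines of Python. This is only an illustration: worker threads stand in for the compute nodes of a virtual warehouse, and the function names and chunk size are hypothetical, not part of Snowflake.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, chunk_size):
    """Cut a list into consecutive chunks of at most chunk_size items."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def partial_sum(chunk):
    """Work done independently by one worker on its own chunk."""
    return sum(chunk)


def parallel_total(values, chunk_size=3, workers=4):
    """Scatter chunks to workers, then gather and combine the partial results."""
    chunks = split_into_chunks(values, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    return sum(partials)


amounts = list(range(1, 11))    # 1..10, a stand-in for a much larger table
print(parallel_total(amounts))  # 55
```

Each worker only ever sees its own chunk, which is why adding workers helps: the chunks can be handled independently and the small partial results are cheap to combine at the end.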
C | Cloud Services
If the previous layer is the muscle, then the Cloud Services layer is the brain of the whole system. It handles authentication and access control and automates common administration tasks such as security, query optimisation, and metadata and infrastructure management.
Why is it the real deal?
You understand the powerful architecture now. So what makes it the real deal and why should you consider Snowflake as a solution?
- Performance and speed: From multiple virtual warehouses to automatic query optimisation, cluster tuning, and micro-partitions, the whole Snowflake architecture is built for fast query processing.
- On-Demand Pricing: Snowflake offers on-demand pricing, meaning that you only pay for the amount of data you store and the compute time you use. Unlike with a traditional data warehouse, you can easily configure how long a warehouse may sit idle before it is suspended, so you don’t pay while it is inactive.
- Zero Administration Cost: With features like auto-scaling, auto-suspend, and data sharing, you don’t need to worry about the administrative cost that normally comes with other solutions. Unlike a traditional data warehouse, Snowflake is delivered as SaaS and requires no hardware (virtual or physical) and no software installation. All ongoing maintenance, management, and tuning is handled directly by Snowflake.
- User-friendly UX: The Snowflake interface is user-friendly for users both with and without coding experience. Snowflake uses ANSI SQL, a language familiar to general users.
- Compatible: You can query large datasets from various BI tools like Tableau or Einstein Analytics/Tableau CRM. It also provides support for many programming languages such as Python, R, Java, .NET, Go, C, Node.js, etc.
- Scalability: You don’t need to worry about system failures or delays caused by many queries competing for resources. Thanks to the multi-cluster architecture, queries running in one virtual warehouse don’t affect those in another. Snowflake is cloud-agnostic and is distributed across availability zones of the platform it runs on (AWS, Azure, or Google Cloud Platform).
- Easy Data Sharing: The architecture allows seamless data sharing between providers and consumers. This isn’t limited to Snowflake users: you can also share your data with recipients who are not Snowflake customers.
- Security and Encryption: By default, Snowflake encrypts all customer data at no additional cost. With end-to-end encryption, only the customer and the runtime components can read the data. With client-side encryption, the user encrypts the data before loading it into Snowflake, so the cloud storage only ever holds the encrypted version.
- Data Processing: With SQL as the single language used in Snowflake, users can blend, analyse, and transform data without having to learn a new language to use the service.
- Support for a Variety of File Formats: Snowflake supports both structured data such as CSV and TSV and semi-structured data including JSON, XML, Parquet, Avro, and ORC.
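To see how auto-suspend affects the compute part of on-demand pricing, consider the rough Python sketch below. It is a simplified model under stated assumptions: the query times, the credit rate, and the price per credit are invented numbers, and real Snowflake billing has additional rules that are not modelled here.

```python
def billed_seconds(queries, auto_suspend_s):
    """Seconds a warehouse is running for a list of (start, end) query times.

    Toy model: the warehouse resumes when a query starts and suspends once it
    has been idle for auto_suspend_s seconds. Times are in seconds.
    """
    total = 0
    run_start = None
    run_end = None  # moment the warehouse would suspend if nothing else arrives
    for start, end in sorted(queries):
        if run_start is None:
            run_start, run_end = start, end + auto_suspend_s
        elif start <= run_end:
            # Warehouse is still running: this query extends the current run.
            run_end = max(run_end, end + auto_suspend_s)
        else:
            # Warehouse was suspended in between: close the run, start a new one.
            total += run_end - run_start
            run_start, run_end = start, end + auto_suspend_s
    if run_start is not None:
        total += run_end - run_start
    return total


def compute_cost(seconds, credits_per_hour, price_per_credit):
    """Cost of the running time; both rates are hypothetical inputs."""
    return seconds / 3600 * credits_per_hour * price_per_credit


# Two queries close together in the morning, one more two hours later.
queries = [(0, 600), (700, 900), (7200, 7500)]

running = billed_seconds(queries, auto_suspend_s=300)
print(running)                          # 1800
print(compute_cost(running, 2, 3.0))    # 3.0
print(compute_cost(7800, 2, 3.0))       # 13.0, if the warehouse never suspended
```

In this made-up scenario the warehouse is billed for 30 minutes instead of the 130 minutes it would have been running without auto-suspend, because the long idle gap between the second and third query is not paid for.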
Being a data-driven company means that you should be ready to implement the best solutions when it comes to data storage, data integration, advanced analytics, and business intelligence. Thus all the capabilities mentioned above are what make Snowflake a great data warehouse/lake.
So the next question is, are you ready to bring a great data culture and practice into your organisation?
Read the other Snowflake 101 blog posts:
- Blog | Snowflake 101: Why is Snowflake Great? (1/4)
- Blog | Snowflake 101: Setting Up Environment and Database (2/4)
- Blog | Snowflake 101: Loading Data from Local (3/4)
- Blog | Snowflake 101: Loading Data from Cloud using AWS (4/4)
Build a data-driven organization with Snowflake.
A powerful data cloud thanks to an architecture and technology that enables today’s data-driven organizations.
Want to try out Snowflake? We've got you covered! Sign up for a Snowflake trial today and receive $400 worth of free usage when you test drive Snowflake. Don't hesitate to reach out to us if you need assistance with setting up your Snowflake trial. We'll get one of our bright minds to help you with it.
Issye Margaretha
Analytics Consultant
Biztory