Why is Snowflake Great?
You are looking for a tool that makes your IT infrastructure stronger than ever. Perhaps you’re already in the cloud with one of the leading cloud providers (AWS, Microsoft Azure, or Google Cloud Platform), yet you feel that certain things can be improved and you are ready for a big move. Or maybe you simply want to learn more about Snowflake because you’ve heard a lot of good things about it.
In any case, the Biztory blog has got you covered! In this 101 series, we are going to talk about all the basic concepts you need to know about Snowflake. We will also do a hands-on practice to show you how to start experimenting with Snowflake for FREE! But before all that, let’s talk about Snowflake’s architecture and why it is different from other data platforms you have seen before.
What is Snowflake?
Snowflake is a cloud data platform that was built from the ground up IN and FOR the cloud. It can be used as both a data warehouse and a data lake, which means you no longer need a separate data lake and data warehouse/data marts. With Snowflake, you can build your data architecture within a single platform.
Snowflake was born from the idea of combining the capabilities of a traditional data warehouse/lake with the elasticity and scalability of the cloud, without the usual worries about cost, performance, or the complexity of managing the system. It lets you scale up and down based on need while still meeting your performance requirements. And when we talk about performance, we mean really fast: results can come back 15-20x faster than with previous solutions that were already pretty fast (see how Snowflake reinvents the data warehouse). If you can have a much faster solution at a lower cost, who could resist?
How does Snowflake work?
The power of Snowflake lies in its architecture. Snowflake comprises three layers:
Centralised Database Storage
Also known as the storage layer, this is where Snowflake stores all of its data in a hybrid columnar format. Rather than storing data row by row, Snowflake reorganises it into compressed, columnar blocks (micro-partitions). Because a query only has to read the columns it needs, processing is much faster than fetching entire rows.
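To make the difference between row and columnar layouts more tangible, here is a minimal Python sketch. It is a toy illustration only and does not reflect Snowflake's internal storage format; the table, the column names, and the helper functions are all made up for this example.

```python
# Toy comparison of row-oriented and column-oriented layouts.
# NOT Snowflake's internal format; all names and data are hypothetical.
import zlib

rows = [
    {"order_id": i, "country": "BE" if i % 2 else "NL", "amount": 10 + i % 5}
    for i in range(1000)
]

# Row layout: the fields of each record are stored together.
row_store = [(r["order_id"], r["country"], r["amount"]) for r in rows]

# Column layout: all values of one column are stored together.
column_store = {
    "order_id": [r["order_id"] for r in rows],
    "country": [r["country"] for r in rows],
    "amount": [r["amount"] for r in rows],
}


def total_amount_row_layout(store):
    # Has to walk over every full record, even though one field is needed.
    return sum(record[2] for record in store)


def total_amount_column_layout(store):
    # Reads only the single column the query asks for.
    return sum(store["amount"])


def compressed_size(values):
    # Similar values sitting next to each other tend to compress well.
    return len(zlib.compress(",".join(map(str, values)).encode()))
```

Both functions return the same total, but the columnar version never touches `order_id` or `country`. Calling `compressed_size(column_store["country"])` also shows how well a column of repetitive values compresses.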
Query Processing
This layer consists of multiple virtual warehouses that are responsible for all query processing tasks. It is the muscle of the whole Snowflake system and allows you to perform massively parallel processing. Imagine you have petabytes of data: with multiple compute clusters, the work can be divided into chunks that are processed in parallel, so your queries finish much faster. You can also scale up and down based on your needs and use the auto-suspend feature so that you only pay for what you use. Managing these virtual warehouses requires a Snowflake role with the appropriate permissions.
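The idea of dividing a large job into chunks and processing them in parallel can be sketched in a few lines of Python. This is only a structural illustration under our own assumptions: the function names, worker count, and chunk size are hypothetical, and Python threads stand in for what would be separate compute nodes in a real system, so no actual speed-up should be expected here.

```python
# Structural sketch of "split the work into chunks, process in parallel,
# then combine". Not Snowflake code; all names are hypothetical.
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, chunk_size):
    # Cut the input into consecutive slices of at most chunk_size items.
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def parallel_sum(values, workers=4, chunk_size=250):
    chunks = split_into_chunks(values, chunk_size)
    # Each worker aggregates one chunk independently of the others.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, chunks))
    # A final step combines the partial results into one answer.
    return sum(partial_sums)
```

For example, `parallel_sum(list(range(1000)))` splits the input into four chunks of 250 values, sums each chunk separately, and then adds up the four partial sums.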
Cloud Services
If the previous layer is the muscle, then the Cloud Services layer is the brain of the whole system. It handles authentication and access control, and it automates common administration tasks such as security, query optimisation, and metadata and infrastructure management.
Why is it the real deal?
You understand the powerful architecture now. So what makes it the real deal and why should you consider Snowflake as a solution?
- Performance and speed: From multiple virtual warehouses and automatic query optimisation to cluster tuning and micro-partitions, the whole Snowflake architecture is built to allow faster query processing.
- On-Demand Pricing: Snowflake offers on-demand pricing, meaning that you only pay for the amount of data you store and the compute time you use. Unlike a traditional data warehouse, Snowflake also lets you configure an idle timeout (auto-suspend) so you don’t pay while a warehouse is inactive.
- Zero Administration Cost: With features like warehouse auto-scaling, auto-suspend, and data sharing, you don’t need to worry about the administrative cost that normally comes with other solutions. Unlike a traditional data warehouse, Snowflake is delivered as SaaS and requires no hardware (virtual or physical) and no software installation. All ongoing maintenance, management, and tuning is handled directly by Snowflake.
- User-friendly UX: The Snowflake interface is user-friendly for users both with and without coding experience. It uses standard ANSI SQL to support general users.
- Compatible: You can query large datasets from various BI tools like Tableau or Einstein Analytics/Tableau CRM. It also provides support for many programming languages such as Python, R, Java, .NET, Go, C, Node.js, etc.
- Scalability: Thanks to the multi-cluster architecture, you don’t need to worry about system failures or delays caused by many queries competing for resources: queries running on one virtual warehouse won’t affect those on another. Snowflake is also cloud-agnostic and is distributed across availability zones on the platforms it runs on (AWS, Azure, Google Cloud Platform).
- Easy Data Sharing: The architecture allows seamless data sharing between providers and consumers. This isn’t limited to Snowflake users: you can share your data with any recipient, even if they’re not a Snowflake customer.
- Security and Encryption: By default, Snowflake encrypts all customer data at no additional cost. End-to-end encryption means that only the customer and the runtime components can read the data. With client-side encryption, the user encrypts data before loading it into Snowflake, so the cloud storage only ever holds the encrypted version.
- Data Processing: Because SQL is the single language used in Snowflake, users can blend, analyse, and transform data without having to learn a new language to use the service.
- Support for a Variety of File Formats: Snowflake supports both structured data such as CSV and TSV and semi-structured data including JSON, XML, Parquet, Avro, and ORC.
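The on-demand pricing point in the list above can be made concrete with a rough back-of-the-envelope calculation. The Python sketch below compares a warehouse that runs around the clock with one that is suspended whenever it is idle. Every number in it (credits per hour, price per credit, storage price, hours of use) is a hypothetical placeholder chosen for easy arithmetic, not Snowflake's actual price list, and the function name is our own.

```python
# Back-of-the-envelope cost comparison. All rates and figures below are
# hypothetical placeholders, not real Snowflake prices.

def estimate_monthly_cost(storage_tb, busy_hours, credits_per_hour,
                          price_per_credit, storage_price_per_tb):
    # Compute is billed only for the hours the warehouse is running.
    compute_cost = busy_hours * credits_per_hour * price_per_credit
    # Storage is billed on the amount of data stored.
    storage_cost = storage_tb * storage_price_per_tb
    return compute_cost + storage_cost


# Assumed rates for the example (placeholders).
RATES = dict(credits_per_hour=2, price_per_credit=3.0, storage_price_per_tb=25.0)

# Warehouse left running 24 hours a day for 30 days.
always_on = estimate_monthly_cost(storage_tb=4, busy_hours=24 * 30, **RATES)

# Warehouse that is only active 6 hours a day and suspended the rest.
with_auto_suspend = estimate_monthly_cost(storage_tb=4, busy_hours=6 * 30, **RATES)

savings = always_on - with_auto_suspend
```

With these assumed figures the storage cost is identical in both cases; the whole difference comes from the hours during which the warehouse is not running.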
Being a data-driven company means being ready to implement the best solutions for data storage, data integration, advanced analytics, and business intelligence. All the capabilities mentioned above are what make Snowflake a great data warehouse/lake.
So the next question is: are you ready to bring a great data culture and practice into your organisation? Don’t hesitate to contact us, and make sure to look out for the second part of this Snowflake 101 blog post!