Introduction to Snowflake

Snowflake has been talked about a lot lately and is often compared with other cloud-based data solutions. In this blog post we will look at the offerings Snowflake provides and at its architecture.

Initially, Snowflake started with data warehousing as its main offering. Today, it supports six different workloads:

  • Data Warehouse
  • Data Lake
  • Data Engineering
  • Data Exchange
  • Data Applications
  • Data Science

Running all these workloads on a single platform has significant advantages, among them easier and more reliable data governance and data observability.

Moreover, Snowflake is a complete SaaS offering that can run on AWS, Azure, or GCP in the backend, which also gives you the option of going multi-cloud in your disaster recovery setup. There is no hardware or software to install, configure, or manage, and ongoing maintenance, management, upgrades, and tuning are handled by Snowflake.


Snowflake's architecture consists of three layers, each of which can scale independently. As can be seen from the diagram, Snowflake keeps storage and compute in separate layers, which lets us scale them separately, unlike earlier cloud-based data warehouses that combined storage and compute.

Services layer

The cloud services layer is a collection of services that coordinate activities across Snowflake. It also runs on compute instances that Snowflake provisions from the cloud provider. Its functions include authenticating users, managing sessions, managing the processing layer, and coordinating access to data storage.

Storage layer

Snowflake offers fast access to practically unlimited storage. It uses the cloud provider’s storage and follows a hybrid of shared-disk and shared-nothing architectures. The shared-disk part comes from the storage layer being common to all warehouses, and the shared-nothing part comes from the local SSD of each warehouse.

Processing layer

The processing layer consists of multiple virtual warehouses that access the same data in the storage layer without any resource contention.
To create a warehouse, we only need to specify its name and size. All provisioning and configuration of the underlying compute resources is handled by Snowflake, and this provisioning is fast compared to competing platforms. We can have warehouses of different sizes for different workloads and resize them whenever required, which also takes only seconds.
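
As a rough illustration of how little is needed to create and resize a warehouse, here is a minimal Python sketch that only builds the SQL text. The helper names, the warehouse name etl_wh, and the shortened list of sizes are assumptions made for this example. The statements themselves follow Snowflake's CREATE WAREHOUSE and ALTER WAREHOUSE syntax, and nothing here connects to a Snowflake account.

```python
# Hypothetical helpers for illustration: they build SQL strings only.
# VALID_SIZES is a deliberately short subset of the sizes Snowflake offers.
VALID_SIZES = {"XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE"}


def _check(name: str, size: str) -> str:
    """Validate the inputs and return the size in upper case."""
    if not name.isidentifier():
        raise ValueError(f"invalid warehouse name: {name!r}")
    size = size.upper()
    if size not in VALID_SIZES:
        raise ValueError(f"unsupported warehouse size: {size}")
    return size


def create_warehouse_sql(name: str, size: str = "XSMALL") -> str:
    """Build a CREATE WAREHOUSE statement from a name and a size."""
    size = _check(name, size)
    return f"CREATE WAREHOUSE IF NOT EXISTS {name} WITH WAREHOUSE_SIZE = '{size}'"


def resize_warehouse_sql(name: str, size: str) -> str:
    """Build an ALTER WAREHOUSE statement that changes the size."""
    size = _check(name, size)
    return f"ALTER WAREHOUSE {name} SET WAREHOUSE_SIZE = '{size}'"


if __name__ == "__main__":
    print(create_warehouse_sql("etl_wh", "small"))
    print(resize_warehouse_sql("etl_wh", "large"))
```

The generated statements can be pasted into a worksheet or sent through any Snowflake client. The name check is intentionally strict so that arbitrary text cannot end up inside the statement.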


Snowflake offers both horizontal and vertical scaling. To scale vertically, a warehouse can be resized up or down at runtime. To scale horizontally, we can specify the minimum and maximum number of clusters for a warehouse, so that it adds clusters when the load on the system increases. Either can be configured through the UI or with SQL scripts. Different scaling policies are also available to control how horizontal scaling behaves.
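
To make the horizontal scaling settings concrete, the following Python sketch builds the SQL that sets the cluster range and scaling policy on an existing warehouse. The function name and the warehouse name bi_wh are hypothetical. MIN_CLUSTER_COUNT, MAX_CLUSTER_COUNT, and SCALING_POLICY (STANDARD or ECONOMY) are the ALTER WAREHOUSE properties Snowflake documents for multi-cluster warehouses. The sketch only produces text and does not run anything against Snowflake.

```python
# Hypothetical helper for illustration: it builds a SQL string only.
SCALING_POLICIES = ("STANDARD", "ECONOMY")


def multi_cluster_sql(name: str, min_clusters: int, max_clusters: int,
                      policy: str = "STANDARD") -> str:
    """Build an ALTER WAREHOUSE statement that sets the horizontal scaling range."""
    if not name.isidentifier():
        raise ValueError(f"invalid warehouse name: {name!r}")
    if not 1 <= min_clusters <= max_clusters:
        raise ValueError("need 1 <= min_clusters <= max_clusters")
    policy = policy.upper()
    if policy not in SCALING_POLICIES:
        raise ValueError(f"unknown scaling policy: {policy}")
    return (
        f"ALTER WAREHOUSE {name} SET "
        f"MIN_CLUSTER_COUNT = {min_clusters} "
        f"MAX_CLUSTER_COUNT = {max_clusters} "
        f"SCALING_POLICY = '{policy}'"
    )


if __name__ == "__main__":
    # Let the warehouse grow from one to three clusters under load.
    print(multi_cluster_sql("bi_wh", 1, 3))
```

Setting the minimum and maximum to the same value would keep the cluster count fixed, while a wider range lets Snowflake add and remove clusters according to the chosen policy.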

Next in this series of blog posts, we will see how Snowflake can be used as a data lake solution and how it differs from traditional cloud-based data lake solutions.


