Snowflake Architecture – Scotty's Tech Hangout!

At the storage level, there exists cloud storage that includes both shared-disk (for storing persistent data) as well as shared-nothing (for massively parallel processing or MPP of queries with portions of data stored locally) entities. Ingested cloud data is optimized before storing in a columnar format. The data ingestion, compression, and storage are fully managed by Snowflake; as a matter of fact, this stored data is not directly accessible to users and can only be accessed via SQL queries.

Next up is the query processing level, this is where the SQL queries are executed. All the SQL queries are part of a particular cluster that consists of several compute nodes (this is customizable) and are executed in a dedicated, MPP environment. These dedicated MPPs are also known as virtual data warehouses. It is not uncommon for a firm to have separate virtual data warehouses for individual business units like sales, marketing, finance, etc. This setup is more costly but it ensures data integrity and maximum performance.

Finally, we have cloud services. As mentioned in the boxes, these are a bunch of services that help tie together the different units of Snowflake ranging from access control/data security to infrastructure and storage management. Know more about Snowflake Data Warehouse architecture here.

Categories

Snowflake Resources

Snowflake Learning

Courses I’m Taking

Snowflake Competitive

Back to Basics

Recent Posts