Snowflake’s Autoprogettazione: Data Design Through Practice
Is it pretentious to reference the Italian furniture designer Enzo Mari when talking about data design? Probably. But he’s also the inspiration for that bench I want to build and all the architecture ideas I have for the house. His philosophy, at least for Autoprogettazione, was to publish simple DIY furniture plans: things you can build with only a few 2x4s, nails, a hammer, and a plan. Anyone can build something and maybe even call it art, but this is about teaching design through practice. Regardless, I love the intersection of art and utility. Sometimes I even get a weird satisfaction from brutalist architecture or websites. Hard pass on the modern box design, though.
My own contribution to design through practice is this:
- Grab your 2x4s, nails, and hammer.
- Throw them in the trash, because we’re programmers and bits-and-bytes engineers.
- Pick the object storage that provides the most utility and stage your data to it.
  - Maybe even use some Hive-style partitioning.
- Allow Snowflake access to your object storage via a storage integration.
- Create an external table and stage.
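- Set up auto refresh.
  - On AWS, use the first option in the Snowflake auto-refresh docs: a new S3 event notification sent straight to Snowflake’s SQS queue, not the SNS route.
  - My past attempts with SNS ran afoul of unconfirmed subscriptions, which is especially problematic during development. You should be able to tear down and rebuild without surprises.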
- Sit back, sip your coffee, and enjoy the best of all architectures.
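The steps above can be sketched roughly like this. It is a minimal sketch, assuming AWS S3 as the object storage and Parquet files. The names `events_int`, `events_stage`, `events_ext`, the `event_date` partition, and the helpers `hive_key` and `render_ddl` are all made up for illustration. The Python only builds strings: a Hive-style object key, and the three Snowflake statements in the order you would run them.

```python
from datetime import date


def hive_key(prefix, partitions, filename):
    """Build an object key with Hive-style name=value partition folders."""
    folders = [f"{name}={value}" for name, value in partitions.items()]
    return "/".join([prefix.strip("/"), *folders, filename])


def render_ddl(bucket, prefix, role_arn):
    """Return Snowflake statements for the steps above, in run order.

    All object names here are hypothetical placeholders.
    """
    location = f"s3://{bucket}/{prefix.strip('/')}/"
    integration = f"""
CREATE STORAGE INTEGRATION events_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = '{role_arn}'
  STORAGE_ALLOWED_LOCATIONS = ('{location}');"""
    stage = f"""
CREATE STAGE events_stage
  URL = '{location}'
  STORAGE_INTEGRATION = events_int;"""
    # The partition column is parsed out of the Hive-style folder name.
    table = """
CREATE EXTERNAL TABLE events_ext (
  event_date DATE AS TO_DATE(
    SPLIT_PART(SPLIT_PART(METADATA$FILENAME, 'event_date=', 2), '/', 1))
)
  PARTITION BY (event_date)
  LOCATION = @events_stage
  AUTO_REFRESH = TRUE
  FILE_FORMAT = (TYPE = PARQUET);"""
    return [s.strip() for s in (integration, stage, table)]


if __name__ == "__main__":
    print(hive_key("events", {"event_date": date(2024, 1, 15)}, "part-0000.parquet"))
    for statement in render_ddl(
        "my-bucket", "events", "arn:aws:iam::123456789012:role/snowflake-read"
    ):
        print(statement, end="\n\n")
```

Once the external table exists, `SHOW EXTERNAL TABLES` should list a `notification_channel` for it. That is the queue you point the bucket’s event notification at for auto refresh. Check the Snowflake docs in the resources for the exact steps on your cloud.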
So why do I recommend this? In my experience, it opens up possibilities for many types of data consumers and producers. Suppose you have applications that process your data: machine learning, AI, customer identity resolution, whatever you dream of. None of them needs Snowflake credentials, roles, an active warehouse, or even SQL anymore. What you have at your hands now is raw utility. My favorite is to apply Lambda functions. These applications and services have carte blanche over your data, while your analyst team consumes that same data naturally inside the warehouse. I see this as a very simple solution to an otherwise complex problem, one that gets asked about a lot in the data world: how to enable all users or consumers of data.
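To make the Lambda idea concrete, here is a minimal sketch of a handler that reacts to new files without touching the warehouse. It assumes AWS Lambda triggered by S3 event notifications and the Hive-style keys described earlier. The function names and the sample key are made up, and a real handler would go on to read the object with whatever client it prefers.

```python
from urllib.parse import unquote_plus


def parse_partitions(key):
    """Extract Hive-style name=value folder segments from an object key."""
    folders = key.split("/")[:-1]  # drop the file name
    return dict(folder.split("=", 1) for folder in folders if "=" in folder)


def handler(event, context):
    """Lambda entry point: list the new objects and their partitions.

    No Snowflake credentials, warehouse, or SQL are involved here.
    """
    seen = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in S3 notifications ("=" becomes "%3D").
        key = unquote_plus(record["s3"]["object"]["key"])
        seen.append(
            {"bucket": bucket, "key": key, "partitions": parse_partitions(key)}
        )
    return seen
```

The same file that triggers this handler also shows up in the external table after auto refresh, so the analysts and the service are working from one copy of the data.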
There are many other databases where this design pattern works too. Any database that allows you to read from object storage is fair game for this style; for example, Redshift.
Resources:
- https://socks-studio.com/2016/04/18/critical-understanding-through-practice-autoprogettazione-by-enzo-mari-1974/
  - I make no claims about this link. Just look at the pictures.
- https://en.wikipedia.org/wiki/Enzo_Mari
- https://brutalistwebsites.com
  - Same here: I make no claims about the links on this site. Some of them redirect to ads, etc.
- https://www.architecturaldigest.com/gallery/most-beautiful-brutalist-buildings-world
- https://dispatchesmag.com/reappraisal-enzo-mari/
- https://delta.io/blog/pros-cons-hive-style-partionining/
- https://docs.snowflake.com/en/sql-reference/sql/create-storage-integration
- https://docs.snowflake.com/en/sql-reference/sql/create-external-table
- https://docs.snowflake.com/en/user-guide/tables-external-auto
- https://aws.amazon.com/about-aws/whats-new/2023/05/amazon-sns-automatic-deletion-unconfirmed-subscriptions/