End-to-end: generating semantic models for Snowflake Cortex Analyst/Intelligence in two weeks
We often get asked how an end-to-end flow would look like for the generation of semantic models through Solid. In this post, we'll take you through that process.
Snowflake’s 12,000 customers were told last June that with Snowflake Intelligence, they can finally chat with their data. This came with impeccable timing - business teams around the world are increasing their pressure on data teams to deliver this capability. “We want ChatGPT with our data” is heard on Zoom/Teams meetings across the globe.
The foundational models recently got better, with Claude and ChatGPT scoring high marks in Text2SQL (those are the models used by Snowflake’s Cortex Analyst, too). The challenge now became semantics - how to instruct the AI on the best SQL statement for a given business question, and how to interpret the data coming out of that SQL statement’s execution.
In other words - how to translate between a business question and a SQL query, and then back.
The solution: semantic models. Or in Snowflake’s terms - “Semantic Views”.
Building and testing the semantic models and views can take months, while maintaining them is an even bigger undertaking. Data teams don’t have the bandwidth to do this.
We previously shared here how we think about this problem. In today’s post, we will share how the process actually looks and feels like, when doing it through the Solid platform.
If you prefer, instead of reading this post, you can just view this video:
Step one: connect and learn
Imagine a company called Acme Technologies. Named after the Warner Bros’ fictional ACME Corporation, Acme is actually a technology company. It has several thousand employees and a variety of products, including OrionIQ, an AI analytics, forecasting, and scoring engine.
This is a high-level overview of how Solid connects into Acme’s environment:
Solid requires just the following:
A service account with metadata access into Snowflake
An Explorer user in Tableau
A read only user in JIRA (with access to only data engineering and analytics projects)
Limited access to Slack - only to channels where analytics work is discussed
Any unstructured content the data team at Acme has about Acme’s business
The platform then proceeds to learn the business language for Acme (product names, acronyms, how questions are asked and so forth) and create a glossary. This is a critical stage in the process and ensures the resulting content matches the way business users ask questions.
Solid will then proceed to map the relationships between all the assets (tables, views, columns, SQL queries, BI dashboards/reports, metrics, etc) and give a quality score to each. After that, Solid will automatically document all of these assets, with very detailed, high-quality documentation.
This is the foundation for Solid’s ability to generate the correct semantic models/views on demand.
Step two: generate semantic model on demand
With Solid’s Semantic Modeling capability, the data or analytics engineer, can now easily create their first model. For example a model that focuses on win rate and revenue won per customer tier.
Now, we’ll click Next and give Solid thirty seconds to pick the right assets for the model. We can now see that Solid picked the correct tables (with their dimensions and facts), joins/relationships, metrics and verified SQL queries. These are all high quality assets, that have passed our rigorous filtering mechanisms (after all, there’s a lot of garbage in our Acme’s data warehouse).
We can edit this model at this time if we’d like, or at a later time as well.
Step three: test the model/view and certify
With a click of a button, Solid can push the semantic view out to Snowflake for testing. At this point, Solid can automatically test the view (by running relevant business questions through Cortex Analyst and evaluating the results), or the analytics engineer can do so manually.
Once the model passes these tests, it can be handed to a small group of beta tests - business users who are interested in evaluating the new chat capability. These users will input questions they’d want answers to. At this stage, it’s important to have these users aware of what is included in the model (and what is not).
Any issues identified during these test phases will be addressed by editing the model within the Solid platform, and publishing it again. Once the series of tests has completed, the engineer will mark the model as Certified in the Solid platform, and we’re good to go. The model can now be used by business users to answer their questions.
Scale and maintain
Now that the first semantic model was created end-to-end, and quite easily, through the Solid platform, there are a few avenues to pursue:
Adding more models - that first model is likely to get people excited. The potential for chatting with your data will open their eyes and the demand will roll in for more data to be made accessible through chat. The analytics engineer, or, in many cases, analysts residing within business departments, will be able to use the Solid platform to create, test and publish more models. The beauty of the Solid platform here is that different people can work on their own models, democratizing the ability to make the chat capability a reality.
Updating models as the underlying data structure changes - Solid will continue to monitor the underlying data structure. Every day, Solid will fetch an update of the metadata in store and evaluate changes in it. New columns and tables will be automatically documented, and semantic models will be re-evaluated to see if they are impacted by these changes. If they are, the owner of the model will be alerted and a proposed update will be shown to them for approval.
Monitoring of model usage - in the future, Solid will allow Snowflake users to easily monitor the usage of each model and identify areas for improvement. Snowflake does log the usage of semantic views and Solid will be able to retrieve this and analyze it. Solid will also provide statistics on how often certain models are used and by who.
With these capabilities, Snowflake users can scale their adoption of Cortex Analyst and Intelligence, bringing trustworthy and accurate AI into their organization.
If you’re interested in trying Solid and run a full end-to-end flow for the creation of one semantic model, reach out to us today.