Optimizing advertisement campaign outcomes with the Snowflake Data Cloud and Refract

4 min read

Global advertisers use the concept of “Sales Lift” to estimate the increase in sales resulting from a specific advertisement and to assess the effectiveness of their advertising efforts. This approach enables advertisers to make informed decisions about allocating advertising budgets, selecting ad types, and targeting their ads to maximize impact. However, it is crucial to account for confounding variables and their impact on the success or failure of these ad campaigns.
Confounding variables – demographic factors like age or income – have the potential to influence both advertising exposure and sales figures. Failing to account for these variables may lead advertisers to attribute an increase in sales to the wrong cause.
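
To make the effect of a confounder concrete, the sketch below computes a relative sales lift in one common formulation (the purchase rate of users who saw the ad compared with the rate of users who did not), first over all users and then inside each age band. The function name, the age bands, and every count are invented for illustration and are not taken from the dataset used in this solution.

```python
def sales_lift(exposed_buyers, exposed_users, control_buyers, control_users):
    """Relative lift: (exposed rate - control rate) / control rate."""
    exposed_rate = exposed_buyers / exposed_users
    control_rate = control_buyers / control_users
    return (exposed_rate - control_rate) / control_rate


# Invented counts per age band:
# (exposed_buyers, exposed_users, control_buyers, control_users)
segments = {
    "25-34": (30, 200, 120, 800),
    "35-54": (240, 800, 60, 200),
}

# Naive lift pools all users and ignores the age band.
totals = [sum(column) for column in zip(*segments.values())]
naive = sales_lift(*totals)

# Stratified lift compares exposed and control users inside each age band.
per_band = {band: sales_lift(*counts) for band, counts in segments.items()}

print(f"naive lift: {naive:.0%}")
for band, lift in per_band.items():
    print(f"lift within {band}: {lift:.0%}")
```

On these invented counts the pooled lift is 50%, while the lift inside each age band is zero: the older band both sees the ad more often and buys more often, so the apparent lift comes from age rather than from the ad.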

In this use case, we demonstrate how the Snowflake platform, along with Refract, can be used for data analysis and model training. By understanding how certain confounding variables can predict whether a user will purchase a product, advertisers can make more accurate predictions.

Refract by Fosfor is built to enable new-age data-to-decision teams; it accelerates and automates the entire Data Science, Artificial Intelligence (AI), and Machine Learning (ML) life cycles, from data discovery to modelling, deployment, and monitoring. Refract addresses the needs and challenges of different user personas by fostering collaboration and making data consumption for decision-making simple and effortless.

Additionally, the Refract platform is employed for hosting and post-production monitoring of the model. To put the insights in context for business users, we have developed a Streamlit app that allows them to grasp the intricacies of the data and appreciate the benefits. We have tried to keep the solution simple by not predicting the sales lift, but rather just predicting whether the user is going to make the purchase given some confounding parameters. We have also ensured that the solution is visually intuitive, so that even non-technical users can understand the data.

Note: This solution is inspired by and built on top of another article written by our strategic technology partner, Snowflake. The article offers an excellent explanation of the dataset that was built and showcases some of the fantastic Snowflake features that were used to build the solution.

Business problem:

Most businesses look to calculate what the sales lift would be, or try to understand the probability of sales conversion for a particular user after that user has seen a targeted advertisement. To keep the problem statement simple, we used only a few of the confounding parameters (age band, marital status, whether they saw the ad, etc.) to predict whether the user would make the purchase. A few other “non-confounding” parameters were also used to make the model more robust, such as the channel and form of advertisement and other demographic data (like employment details, etc.). A market mix model could perhaps also be used in a future version of this solution.

Refract by Fosfor:

In this example, data manipulation and modeling were executed on Snowflake to provide data security, while Refract provides an integrated platform to track and consume the model performance through a Streamlit app. This app provides a direct UI to consume the model in a what-if type of analysis and to understand the data using dynamic visualizations.

Key benefits of the proposed solution:

Powerful, flexible data science pipelines built through Snowpark for Python
Quick visualizations of attrition facts & figures through Streamlit (quicker turnaround for everyday visualizations)
Interactive storytelling with quick app-style deployment of notebooks

The image below represents the solution workflow and illustrates how the Streamlit app is connected and integrated on Refract.

Image 1: Refract solution workflow

The data:

To understand and predict user purchase patterns, we mapped user demographics such as marital status, age band, type of household, etc.
We also have details on whether they are already a customer of the brand.
We tracked whether the user has watched the ad.
We also mapped which users purchased the product, given the other confounding variables.

The process:

Step 1: We used Refract’s integrated Jupyter notebooks to connect to Snowflake and start a session. The Snowpark API requires Python 3.8, and Refract gives users the option to freely choose the desired version of Python. Refract has pre-created templates catering to a wide range of use cases.
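
As a minimal sketch of this step, a Snowpark session can be opened from a dictionary of connection parameters. The `SNOWFLAKE_*` environment variable names and the helper names are assumptions made for this example; `Session.builder.configs(...).create()` is the Snowpark call that opens the session and needs the `snowflake-snowpark-python` package.

```python
import os

# Connection settings understood by the Snowflake connector.
CONNECTION_KEYS = ("account", "user", "password", "role", "warehouse", "database", "schema")


def connection_parameters_from_env(env=None):
    """Collect connection settings from SNOWFLAKE_* variables (assumed naming)."""
    env = os.environ if env is None else env
    params = {}
    for key in CONNECTION_KEYS:
        variable = f"SNOWFLAKE_{key.upper()}"
        if variable in env:
            params[key] = env[variable]
    return params


def create_session(params):
    """Open a Snowpark session; imported here so the helper above has no dependency."""
    from snowflake.snowpark import Session

    return Session.builder.configs(params).create()
```

In a notebook, `session = create_session(connection_parameters_from_env())` would then give the session object used in the following steps.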

Step 2: We created a data pre-processing pipeline for simple manipulations like missing value imputation, data scaling, and one-hot encoding, followed by a model training pipeline consisting of a random forest algorithm and a grid search for finding the best parameters. This entire code is written in a separate .py file in the form of Python functions, which is then sent to the Snowflake stage that you will be using. The training function is then registered as a stored procedure using Snowpark SQL.
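
The original code listing is not reproduced here, so the following is only a sketch of what such a training file and upload step could look like with scikit-learn and Snowpark. The column names, the parameter grid, the stage name `@ml_models`, and the function names are all hypothetical; the pipeline mirrors the steps named above (imputation, scaling, one-hot encoding, random forest, grid search).

```python
# Hypothetical column names; replace them with the columns of your own table.
NUMERIC_COLS = ["HOUSEHOLD_SIZE"]
CATEGORICAL_COLS = ["AGE_BAND", "MARITAL_STATUS", "SAW_AD"]

# Grid searched over the random forest step (named "model" in the pipeline).
PARAM_GRID = {"model__n_estimators": [100, 300], "model__max_depth": [5, 10, None]}


def build_training_search(cv=3):
    """Grid search over: impute -> scale / one-hot encode -> random forest."""
    # Imported inside the function so the packages are resolved where the code runs.
    from sklearn.compose import ColumnTransformer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.impute import SimpleImputer
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import OneHotEncoder, StandardScaler

    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    preprocess = ColumnTransformer([
        ("num", numeric, NUMERIC_COLS),
        ("cat", categorical, CATEGORICAL_COLS),
    ])
    pipeline = Pipeline([
        ("preprocess", preprocess),
        ("model", RandomForestClassifier(random_state=42)),
    ])
    return GridSearchCV(pipeline, PARAM_GRID, cv=cv, scoring="accuracy")


def upload_to_stage(session, local_path, stage="@ml_models"):
    """Copy a local file to a Snowflake stage through an open Snowpark session."""
    # auto_compress=False keeps the .py file importable as-is from the stage.
    return session.file.put(local_path, stage, auto_compress=False, overwrite=True)
```

Calling `build_training_search().fit(X, y)` on a pandas DataFrame with those columns would run the search, and `upload_to_stage(session, "train.py")` would place the file on the stage.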

Using the above code, you can send any file to your Snowflake stage.

Image 2: Code for registering the model training pipelines as a SPROC (stored procedure) in Snowflake.

The above piece of code will register the model training pipelines as a stored procedure in Snowflake, where the training will happen. Remember to list the Python modules you will be using in your training pipeline, and if your pipeline has multiple functions, specify the main function as the handler of the SPROC.
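
The registration described above might be written along these lines. This is a sketch rather than the original listing: the procedure name, its two arguments, the stage path, and the handler `train.main` are hypothetical, while the clauses (`LANGUAGE PYTHON`, `RUNTIME_VERSION`, `PACKAGES`, `IMPORTS`, `HANDLER`) follow Snowflake's syntax for Python stored procedures.

```python
def create_procedure_sql(name, stage_file, handler, packages, runtime="3.8"):
    """Build a CREATE PROCEDURE statement for a Python handler stored on a stage."""
    package_list = ", ".join(f"'{package}'" for package in packages)
    return "\n".join([
        f"CREATE OR REPLACE PROCEDURE {name}(table_name STRING, excluded_columns ARRAY)",
        "RETURNS STRING",
        "LANGUAGE PYTHON",
        f"RUNTIME_VERSION = '{runtime}'",
        f"PACKAGES = ({package_list})",   # modules used by the training pipeline
        f"IMPORTS = ('{stage_file}')",    # the training file uploaded to the stage
        f"HANDLER = '{handler}'",         # main function: <file name>.<function name>
    ])


sql = create_procedure_sql(
    name="train_purchase_model",
    stage_file="@ml_models/train.py",
    handler="train.main",
    packages=["snowflake-snowpark-python", "scikit-learn", "pandas"],
)
print(sql)
# With an open Snowpark session: session.sql(sql).collect()
```

Building the statement as a string keeps the package list and the handler visible in one place; the handler function is expected to take the Snowpark session as its first parameter.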

Image 3: Code for triggering the stored procedure

Using the above code, you can trigger the training stored procedure and specify the variables that you don’t want to use in your training.
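
A hedged sketch of the trigger step: Snowpark's `Session.call` invokes a stored procedure by name and forwards the remaining arguments. The procedure name, the table name, and the idea of passing the excluded columns as a list are assumptions for this example; whether a Python list is accepted for an ARRAY argument may depend on the installed Snowpark version.

```python
def run_training(session, table_name, excluded_columns, procedure="train_purchase_model"):
    """Call the training stored procedure and return whatever it returns."""
    # Session.call passes the remaining positional arguments to the procedure.
    return session.call(procedure, table_name, excluded_columns)
```

For example, `run_training(session, "AD_CAMPAIGN_DATA", ["USER_ID"])` would train on every column of the hypothetical table except `USER_ID`.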

Step 3: As in step 2, you can create a separate file defining your prediction functions, send that file to your Snowflake stage, and then register the functions in Snowflake using the code below.

Image 4: Code for registering the prediction functions as a UDF (User Defined Function) in Snowflake.

Here, we register the prediction file as a Snowpark UDF, with the same imports and handler.
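
One way to express this registration is Snowpark's `session.udf.register_from_file`, sketched below. The file name `predict.py`, the function name `predict`, the UDF name `predict_purchase`, the stage, and the model file are hypothetical. Because no explicit types are passed, this sketch assumes the function in `predict.py` carries Python type hints.

```python
def udf_registration_kwargs(stage="@ml_models"):
    """Keyword arguments for session.udf.register_from_file (all names hypothetical)."""
    return {
        "file_path": f"{stage}/predict.py",    # prediction file already on the stage
        "func_name": "predict",                # handler function inside predict.py
        "name": "predict_purchase",            # name of the UDF in Snowflake
        "is_permanent": True,
        "stage_location": stage,
        "imports": [f"{stage}/model.joblib"],  # trained model shipped with the UDF
        "packages": ["scikit-learn", "pandas", "joblib"],
        "replace": True,
    }


def register_prediction_udf(session, stage="@ml_models"):
    """Register the prediction function as a permanent UDF."""
    return session.udf.register_from_file(**udf_registration_kwargs(stage))
```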

Image 5: Code for triggering the UDF

We can call the UDF and trigger the prediction on an entire Snowflake table (sdf), and we can write back and save the predictions in a Snowflake table.
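
The sketch below shows one way to score a whole table and save the result, using a SQL query through the Snowpark session rather than the original DataFrame code. It assumes a UDF named `predict_purchase` that takes a single ARRAY of feature values; the table and column names are hypothetical and are inserted into the query unquoted.

```python
def scoring_query(source_table, feature_cols, udf_name="predict_purchase"):
    """SELECT that appends a PREDICTION column computed by the UDF for every row."""
    features = ", ".join(feature_cols)
    return (
        f"SELECT *, {udf_name}(ARRAY_CONSTRUCT({features})) AS PREDICTION "
        f"FROM {source_table}"
    )


def score_table(session, source_table, target_table, feature_cols):
    """Run the UDF over the whole table and save the result as a new table."""
    scored = session.sql(scoring_query(source_table, feature_cols))
    scored.write.mode("overwrite").save_as_table(target_table)
    return scored
```

The same result can be reached with the DataFrame API (`with_column` together with `call_udf`); the SQL form is used here only because it keeps the example short.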

Step 4: Once the model training and predictions were done, we brought the model back to Refract along with all the files necessary for scoring, and registered the model in Refract for monitoring and consumption.

Refract ML is a module that makes the registration of models seamless – just provide the model file and some metadata around it, and you will be able to use Refract’s one-click-deployment and post deployment monitoring features.

Registering models on Refract comes with a lot of benefits:

You get the visual representation of all the build-time metrics
Your model decision making is explained
Automatic API creation for basic consumption of the model
Model monitoring can be scheduled to look out for any possible data drift
Multiple versions can be deployed and compared
Option to choose the compute power while deploying the model, giving you better control on resource utilization

Step 5: Once the model is deployed on Refract, you will get the link to the API, where you can provide the input payload and get the predictions. For small-scale testing, though, you can directly consume the model in Refract itself.
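
As an illustration of consuming such an API, the sketch below builds a JSON POST request with the Python standard library. The URL, the bearer-token header, and the `{"payload": [...]}` body shape are placeholders chosen for this example; the actual request format is defined by the deployed Refract endpoint.

```python
import json
import urllib.request


def build_prediction_request(api_url, token, record):
    """Build (but do not send) a JSON POST request for one input record."""
    body = json.dumps({"payload": [record]}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_prediction_request(
    "https://example.invalid/predict",  # placeholder URL
    "my-token",                         # placeholder credential
    {"AGE_BAND": "35-54", "MARITAL_STATUS": "MARRIED", "SAW_AD": 1},
)
# To send it: urllib.request.urlopen(request).read()
```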

Campaign performance measurement through a digital application

We have built and deployed a Streamlit-based app, which gives users both a technical and a visual understanding of the data, so that anyone can see the relationships between the variables and assess the likelihood of a user making a purchase.

Image 6: Data Profile tab – shows snippets of the actual data and the technical data profile.

The Data Profile tab of the app shows a snippet of the data, so that you can understand the data you are using. Below the snippet, you get the technical profile of all the variables present in the data – the distributions, cardinality, missing values, correlations, etc. This type of data profiling helps you better understand the data and decide what kind of pre-processing is needed for better modeling.
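
Two of the profile measures mentioned above, missing values and cardinality, can be sketched in plain Python as follows. This is not the code behind the app; the function name and the sample records are made up for illustration.

```python
def profile(rows):
    """Per-column missing-value count and cardinality for a list of dict records."""
    columns = sorted({key for row in rows for key in row})
    report = {}
    for column in columns:
        values = [row.get(column) for row in rows]
        present = [value for value in values if value is not None]
        report[column] = {
            "missing": len(values) - len(present),  # None or absent key
            "cardinality": len(set(present)),       # number of distinct values
        }
    return report


sample = [
    {"AGE_BAND": "25-34", "MARITAL_STATUS": "SINGLE"},
    {"AGE_BAND": "35-54", "MARITAL_STATUS": None},
    {"AGE_BAND": "35-54"},
]
print(profile(sample))
```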

Image 7: Know your Data tab – Visual representation of the data in form of graphs to better show the relationships between the variables.

The Know your Data tab provides a visual representation of all the variables, using a graph type suited to each variable’s data type.
From the variables’ distributions in the above charts, and the age band distribution in particular, it is evident that either the data under consideration was a little biased or the study was focused on middle-aged people.

This app also enables the user to consume the trained model in a what-if style of analysis, to predict whether a user will make a purchase given certain confounding parameters.

Image 8: Model and Predictions tab – it provides information about the model performance and gives the user a what-if analysis setup.

On the left-hand side of the UI (as seen in the image above), you can change a given parameter and see its effect on the predicted purchase for a user. Along with this, we get the important features, the confusion matrix, and model metrics such as accuracy, precision, and recall. Having all the information in a single place enables users to better understand the data and the model, to come up with probable purchase scenarios, and to understand the relationships between the confounding variables.
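
For readers less familiar with these metrics, the sketch below derives accuracy, precision, and recall from the four cells of a binary confusion matrix. The counts in the example are invented and are not the results of the model described here.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Of the users predicted to purchase, the share who actually purchased.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Of the users who actually purchased, the share the model found.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
print({name: round(value, 3) for name, value in metrics.items()})
```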

Conclusion

This solution was built on a very distinctive ad campaign measurement dataset. Information such as the type and channel of advertisement can also be used to better understand the effects of advertising channels in these scenarios. Confounding-variable models like this one, combined with sales optimization algorithms (MMM models), can therefore help in estimating a sales lift.
Want to learn more? Contact our team at Refract by Fosfor to set up your free consultation today!

Author

Ayush Kumar Singh

Specialist – Data Scientist, Fosfor

Ayush Kumar Singh has 6+ years of experience in executing data driven solutions. He is proficient in Machine Learning and deep learning and is adept at identifying patterns and extracting valuable insights. He has a remarkable track record of delivering end-to-end Data Science projects.
