As a non-techie delving into digital analytics, I often come across the notion that SQL, Python, R and other scripting languages are the main requirement to enter this space. No doubt, these are important. But in this post I would like to make a case for web analytics folks to better understand the cloud.
The fact is that I have moderate skills in SQL, JavaScript, Java, HTML and CSS, and beginner-level skills in R and Python. But as I move forward in my career, I've chosen to prioritize learning cloud architecture (becoming AWS Certified and currently studying Google Cloud Platform) over scripting language proficiency. I often find myself explaining the reasons behind this choice, and sometimes not doing my best at it.
Cue this post explaining why I decided to prioritize cloud architecture for web analytics.
The drivers behind this decision are mostly from experience, and a bit from where I feel the industry is going.
If we go back five years in time, I would have prioritized SQL and JavaScript. But in these past years I was lucky to lead the digital analytics transformation for ASICS (four years ago) and experience working as a marketing manager in Amazon. Both experiences changed what I would use to become impactful.
My time at ASICS was particularly influential. When I assumed the role, one of my priorities was to democratize data. That is, to allow non-specialists to be able to leverage digital data without requiring significant help.
As part of the strategy I pulled the organization's analytics structure from very decentralized to a more balanced Center of Excellence model to ensure more consistent and better data. Doing this gave me room to implement guardrails for cleaner data, but also meant I was on the hook to make our digital data available to leadership, marketers, merchandisers, operations and other non-analysts in a way that was more useful than before.
Most of these folks didn't know SQL or any other querying language, we didn't have the resources in the Global team to handle all the reporting requests, and we had a lot of visibility from senior leadership.
We had our challenges but eventually delivered. To make this happen we had to create and run scripts, build ETL pipelines, use databases and so on. Since our developer and infrastructure teams were topped out, we ended up approving the use of Google Cloud Platform for some things, Azure for others… It worked. At a point it became evident that we were in (and from then on would need) the cloud.
I won't lie, it was a steep learning curve. I had help along the way from many people (like David Vallejo, who helped with all sorts of strategy and technical skills, and James Standen and the Nmodal team, who advised on ETL, infrastructure and other processes that I believe are still in place today) and I'm sure I will need advice moving forward.
The ASICS experience was likely the start of my cloud exploration for analytics, and I'm happy we went through these challenges. Consider the following areas where I believe the cloud makes a difference in most larger businesses.
Cloud for Collecting Data and Automating Processes
The digital analytics space is lucky to have tools like Adobe Analytics and Google Analytics. They do a great job at providing a ton of insights, but are often just a piece of a bigger puzzle that includes multiple data sources and reporting needs. Many companies end up investing significant time and resources to manually address this complexity. The result often involves many spreadsheets and lots of frustration.
Consider the diagram below, a scenario from my time at ASICS Corporation. It's not that complex, but we needed the cloud to build this solution. It replaced hours of people manually combining data, a process that was also prone to error.
This is one of many processes we had in place involving senior leadership. All of them were made possible by having access to virtual machines and databases in the cloud, which meant not having to stress our internal developers.
Being able to combine our data sources and build our datasets and reporting in an automated fashion, reasonably priced, while skipping the internal development sprints was awesome.
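To give a flavour of what "combining datasets in an automated fashion" looks like in practice, here is a minimal sketch of the kind of join a cloud-hosted script can run on a schedule. The column names and numbers are invented for illustration, not the actual ASICS schema.

```python
# Sketch: join a daily web analytics export with a daily sales export into
# one reporting dataset. In the cloud this runs unattended on a schedule,
# replacing manual spreadsheet merges.
import pandas as pd

def combine_reports(analytics: pd.DataFrame, sales: pd.DataFrame) -> pd.DataFrame:
    """Join daily web metrics with daily sales on date and country."""
    merged = analytics.merge(sales, on=["date", "country"], how="left")
    merged["conversion_rate"] = merged["orders"] / merged["sessions"]
    return merged

# Stand-ins for the exports that would normally land in cloud storage.
analytics = pd.DataFrame({
    "date": ["2018-01-01", "2018-01-01"],
    "country": ["JP", "US"],
    "sessions": [1000, 800],
})
sales = pd.DataFrame({
    "date": ["2018-01-01", "2018-01-01"],
    "country": ["JP", "US"],
    "orders": [50, 32],
})
report = combine_reports(analytics, sales)
print(report[["country", "conversion_rate"]])
```

A script like this, plus a database and a scheduler, is already most of a small reporting pipeline, and none of it requires an internal development sprint.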
Cloud for Building Your Own Tools
Notice that in the previous diagram Analytics Canvas, one of my favourite ETL tools, was central to this process. As much as I love it, there are many cases where the tool wouldn't work (e.g. streaming data). In those cases, teams need to either look for a new tool or invest in building processes that can be owned and customized by the organization.
Take for example, recommendation engines. Many organizations use software like Certona or Dynamic Yield to add recommended products to an ecommerce site. In many cases, outsourcing this capability makes sense. But in others, it makes sense to bring this in-house and evolve it to much more than a recommendation engine for products.
These tools are not cheap, and it is worth debating whether to invest in internal resources to build a tool or to buy the tool itself.
Most tools out there use the cloud in the background anyway. Consider this templated data flow I found at draw.io.
Granted, I just came across this diagram and wouldn't be able to implement this system myself. I'm also sure there is a lot that R and Python libraries can do in this regard. But with intermediate cloud knowledge, I feel I would know who to look for, what to ask and how to make an educated decision should this scenario come up. This ability, I believe, is critical to building an analytics vision.
Cloud for Experimenting
What should we do when we want to test a new idea or approach before committing major resources? How about serving content depending on external actions (like this Nike World Cup case study from 2014); or stitching together two fundamentally different tracking models, like apps (screenviews) and websites (pageviews); or experimenting with machine learning?
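That stitching scenario is less mysterious than it sounds: much of the work is normalizing both hit types into one shared schema before loading them into a common table. A toy sketch, with field names that are illustrative rather than any vendor's actual schema:

```python
# Sketch: normalize app screenviews and web pageviews into a single "view"
# event shape, so both can land in the same table for cross-platform analysis.
def normalize_web(hit: dict) -> dict:
    return {
        "timestamp": hit["time"],
        "user_id": hit["client_id"],
        "platform": "web",
        "view_name": hit["page_path"],    # e.g. "/products/gel-kayano"
    }

def normalize_app(hit: dict) -> dict:
    return {
        "timestamp": hit["time"],
        "user_id": hit["device_id"],
        "platform": "app",
        "view_name": hit["screen_name"],  # e.g. "ProductDetail"
    }

web_hit = {"time": "2018-06-01T10:00:00", "client_id": "u1", "page_path": "/home"}
app_hit = {"time": "2018-06-01T10:05:00", "device_id": "u1", "screen_name": "Home"}
events = [normalize_web(web_hit), normalize_app(app_hit)]
print(events)
```

The hard part in real life is agreeing on the mapping (and stitching user IDs), not the code, which is exactly the kind of thing cheap cloud infrastructure lets you prototype.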
Going beyond what’s out of the box often needs some trial and error. Um, a lot of trial and error.
A small example of this came during my time at ASICS. On one occasion we wanted to automate reporting from an ecommerce marketplace platform that did not provide an API. After lots of testing, we ended up quickly provisioning a VM in GCP running a script that would simulate a user to get us this data on a daily basis. KABOOM!
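For the curious, the core of that "simulate a user" approach is unglamorous: a scheduled script fetches the report page and parses the figures out of the HTML. The markup below is invented for illustration; the real script also handled login and ran daily via cron on the VM.

```python
# Sketch: pull report figures out of a fetched HTML page using only the
# standard library. The table structure here is hypothetical.
from html.parser import HTMLParser

class SalesTableParser(HTMLParser):
    """Collect the text of every <td> cell on the page."""
    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.cells.append(data.strip())

def parse_sales(html: str) -> list:
    parser = SalesTableParser()
    parser.feed(html)
    return parser.cells

# In production the HTML came from the fetched report page.
html = "<table><tr><td>2018-06-01</td><td>123 units</td></tr></table>"
daily = parse_sales(html)
print(daily)
```

None of this is sophisticated, which is the point: a small VM and a cron job turned a manual daily chore into an automated feed.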
Or what if we wanted to play around with machine learning or AI to understand potential uses? GCP Machine Learning Engine and AWS machine learning services are still a mystery to many. Yet there they are, waiting to be taken for a ride.
Take a look at the site datalayer (available data points) on this AE.com product page on the web (I seem to shop online every two years).
They transact over $1bn per year, so I'm going to assume they have enough users for a machine learning algorithm to go to work. Why not experiment, then, by brainstorming some hypotheses, sending data to either AWS or GCP, and taking their algorithms for a ride?
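Before any cloud ML service can go to work, the datalayer has to be shaped into features. Here is a toy sketch of that step, turning a product-view datalayer into a flat numeric vector that could then be posted to an AWS or GCP prediction endpoint. All field names are my assumptions, not AE.com's actual schema.

```python
# Sketch: flatten a hypothetical product-view datalayer into a numeric
# feature vector [price, on_sale_flag, category_id] for an ML experiment.
def to_features(datalayer: dict) -> list:
    categories = {"jeans": 0, "tops": 1, "shoes": 2}
    return [
        float(datalayer["price"]),
        1.0 if datalayer["list_price"] > datalayer["price"] else 0.0,  # on sale?
        float(categories.get(datalayer["category"], -1)),              # unknown -> -1
    ]

view = {"price": 39.95, "list_price": 49.95, "category": "jeans"}
features = to_features(view)
print(features)
```

Trivial, yes, but this kind of plumbing is most of an ML experiment, and it is well within reach of an analytics team with cloud access.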
Or take the data already being sent by YOOX's mobile app for iPhone (a JSON object).
I'm sure it would not be too much of an effort to send a copy of some of this data to the cloud and try to better understand which products to recommend, or to advise merchandisers and buyers when thinking about inventory.
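One low-effort way to get that copy into the cloud is to buffer the app payloads as newline-delimited JSON, a format that BigQuery's `bq load` (and most cloud warehouses) accept directly. The event fields below are illustrative, not YOOX's actual payload.

```python
# Sketch: serialize a batch of app events as newline-delimited JSON,
# ready to stage in cloud storage and load into a warehouse table.
import json

def to_ndjson(events: list) -> str:
    """One JSON object per line, keys sorted for stable output."""
    return "\n".join(json.dumps(e, sort_keys=True) for e in events)

events = [
    {"event": "product_view", "sku": "ABC123", "platform": "ios"},
    {"event": "add_to_cart", "sku": "ABC123", "platform": "ios"},
]
batch = to_ndjson(events)
# Then, roughly: write `batch` to a file in Cloud Storage and run
#   bq load --source_format=NEWLINE_DELIMITED_JSON dataset.app_events gs://...
print(batch)
```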
I'm sure experienced data scientists are able to point out all the details I'm missing. But that's the point. Knowing the cloud and its offerings can help leaders brainstorm a more effective and exciting vision for analytics, relying on the team to guide the details.
Yes, the Cloud Is Awesome, But Not for All
I understand every organization has its own challenges and sometimes going to the cloud is impossible. But I still believe that being familiar with what the cloud can do will help teams have key conversations about current possibilities.
Most relevant for me, since we’re talking about analytics, I like any vehicle that can help bring data to many people across teams – particularly in bigger organizations. Yes, ideally we also have the specialized analytics folks that can leverage R and Python libraries to make magic. But if I find myself in a leadership position in the future, I’m sure I’ll first consider all possible ways to democratize data.