Dry run of Sam Torres' talk for brightonSEO San Diego 2023: Building data that grows with you.
Find the deck here: https://speakerdeck.com/samtorres/building-data-that-grows-with-you
Hey, everyone, and thank you so much for attending my talk today all about how to build data that's going to grow with you. My name is Sam Torres and I am the Chief Digital Officer at Gray Dot Company.
So we are a boutique agency that specializes in technical and strategic SEO, and all things data. So when we're talking about data, we are talking about things like using digital analytics to help inform other parts of your business, to implementing sophisticated and very complex tracking systems across multiple platforms.
And then, of course, the reporting itself, and how do you actually get insights out of the data and have it talk to you.
But before we get started, let's talk about why you might even be here or be interested in this talk today. And I think it's all about this - data-driven decisions.
Now, if anyone else is feeling triggered by this, like I am... and I'm even going to call myself out, because we use this phrase on our own website pretty often. There's a reason for that. But we've been using data to inform our marketing decisions for decades at this point. So why is it something we're still talking about? Right?
If you think about the Mad Men era, they started building personas for trying to convince housewives to buy certain brands or certain items within the grocery store. Great job, by the way.
So why is it something that in 2023, we're still talking about? And honestly, it's because of what digital has done for us.
With digital, we now have access to data faster and easier than ever, and more data than ever before.
But one of the things I do think that that is causing, it makes me think about - so I'm from Atlanta, and we have horrible traffic and horrible road conditions. And I think part of the reason for that is because after the '96 Olympics, Atlanta as a city just blossomed. It boomed.
We had a ton of people moving here, including myself. And so suddenly, you know, the infrastructure of the city couldn't support the growth that it had.
And I think the same thing is happening with digital: we're getting access to so much data and so much information, but our systems and everything around them can't keep up. And it has caused all of us to look a little bit like SpongeBob, right?
You've got data in so many different places, you've got different logins you got to keep track of, you've got so many different spreadsheets, and not everybody has access to the same data. Not to mention now you've probably heard, like all these Python scripts that you could do, or something of that nature.
So you're suddenly thinking that oh, yeah, data science, I got to learn that as part of my job. And then not to mention, the executive team is still probably focusing on things that you wish they wouldn't, or it's just all vanity metrics.
So my goal is to help you with this part, and maybe get back one of those arms so you can drink your latte or your tea. Because, oh by the way, this is just one part of your job. This is just the reporting part, this isn't getting into technical optimization, or content creation, or any of that. It's just the reporting.
So the way we're going to do that today is we're going to talk through scaling your data, your platforms, and your insights and communications around this.
Since we're talking about scalability, let's go ahead and talk about like, what even is scalability? So this is a definition I pulled off of Google. So super, super formal. But it's the ability of a system to perform well under an expanding workload.
So I think a lot of times when we think about scalability, we're always thinking it's just growth. And it's going to be more than that. It's...it's growth that is sustainable, it's growth that is reliable. So all of that's really going to come into play as far as what kind of scalability we are talking about, right? We do want growth, but we also want to make sure that what we're doing we can trust.
Alright, also anyone who knows me knows that I love transparency, maybe to a fault. So I just want to be really honest about - what are you not going to get out of this?
If you're expecting a dashboard to rule them all. Sorry all, it ain't gonna happen. No boo, it's not, no... it doesn't exist. And I'm going to try to convince you, or at least give you the reasons that I think that such a thing doesn't exist.
Because to me, it's like saying that any marketing messaging, should...you'd only have one message for any possible customer. That's not how that works. It's just not how 'people' work.
So let's get started and let's talk about how do we scale our data. So of course, when we're talking about data, and remember, the scalability is, it's not just growth, it's also can you trust it?
So the biggest thing being - your data has got to be valid. If you can't trust it, you're dead in the water, you do nothing else. So if you can't trust your data, you need to stay here.
So what are the ways that I recommend how to do that? Well, so one of the things I do like about GA4 better than Universal is its debug mode - makes it really easy to be able to test the data and see how it's going. This QR code will take you to an article written by Analytics Mania, and Julius has done all the work of how to use debug mode, how to get the most out of it.
And I would also say that if you are being judged on things like conversions for your website, come up with regular tests, or a schedule for regular testing, of when you're going to test kind of that happy path or the path to conversion on your website. Because if it's something you're being graded on, you should absolutely have insight into it, and ideally also input. But this is also a good way that you can make sure that you can still trust the data that you have, and everything is good to go.
Now, the next thing, your data - it should be compliant. So privacy is a growing concern and the prevalence of personally identifiable information or PII. These are things that people are really caring about.
And while, yes, definitely the legislative bodies are taking some time to catch up with the rest of the internet, it's still going to move in that direction. So GDPR, you probably know, CCPA, which is the California Consumer Privacy Act, and then PIPEDA, which was a recent set of regulations and rules in Canada. Find out which ones of these apply to you.
And one of the things I think that could catch up a lot of companies is that, you know, we're a US and Canada-based company. So I might think, oh, GDPR doesn't apply to me.
No, that's not true if my customers and my users are in the EU. So think... So you have to think about - it's not just about your company, but it's also where are your customers, where are your users coming from, then those things might apply.
And obviously, I'm not a lawyer, so please make sure you address this with your legal team. Because, like I say, first you need to find out what applies to you.
But then how you implement that, or how you protect people's privacy or implement whatever you need to to stay compliant? That's going to matter.
So this is a cookie banner from The Guardian. Regardless of how you feel about the Guardian, I actually love this one, because they do go ahead and actually give you some short summation of what your rights are, and I think that's pretty cool. But just know that the technical implementation matters.
Why? Because you need to understand - is data getting collected after the person agrees? Or does it stop after they disagree? You need to understand where the data is getting tracked. When does it get kicked in? And how can these things be affecting the data that you're collecting?
Also, how can you make sure that if someone does ask for removal of their data that you can, you know, be able to support that without losing everything else?
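To make those questions concrete, here is a minimal sketch of consent-gated collection, assuming a simple in-memory store. The class and method names are hypothetical and not taken from any real consent platform or analytics tool; the point is only to show events being dropped until a user agrees, and one user's data being removed without touching anyone else's.

```python
class ConsentGatedStore:
    """Hypothetical event store that only records users who have consented."""

    def __init__(self):
        self._consent = {}  # user_id -> True/False (assumed consent flag)
        self.events = []    # collected events, one dict per event

    def set_consent(self, user_id, granted):
        """Remember whether this user agreed to tracking."""
        self._consent[user_id] = granted

    def track(self, user_id, event_name):
        """Store the event only if the user has agreed; otherwise drop it."""
        if not self._consent.get(user_id, False):
            return False  # nothing is stored before consent or after a refusal
        self.events.append({"user_id": user_id, "event": event_name})
        return True

    def delete_user(self, user_id):
        """Remove one user's events and leave everyone else's data intact."""
        before = len(self.events)
        self.events = [e for e in self.events if e["user_id"] != user_id]
        self._consent.pop(user_id, None)
        return before - len(self.events)


store = ConsentGatedStore()
store.track("user-a", "page_view")      # dropped: no consent yet
store.set_consent("user-a", True)
store.set_consent("user-b", True)
store.track("user-a", "page_view")      # stored
store.track("user-b", "purchase")       # stored
removed = store.delete_user("user-a")   # removal request for one user
```

In a real setup the same questions apply to the tag manager and the warehouse: where the consent flag is checked, and whether a single user's rows can be found and deleted.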
So we know data is going to be valid and compliant. But what about...one of the things that we're really passionate about at Gray Dot Company is, we think that data should be planned. But it also needs to remain flexible, right?
In SEO, everything's changing all the time - our measurement programs should be keeping up as well. So we do something called a measurement plan. And like I say, we're super passionate about these, we love them. And it's always going to include three things. First, the technical requirements and stakeholders.
I'm listing out what are all the different technologies that we need to have. Who are the different people that are going to be accessing this information? What are the use cases? Right?
I think about… there's a lot of times where suddenly an ecosystem for your tracking can get really complicated. So for example, what technical requirements and what platforms - you might have...your eCommerce site is hosted on Shopify, but you've got your blog on WordPress, and the paid media team uses Unbounce for their landing pages.
So suddenly, you're talking about three different platforms that you want to be able to work together. And you haven't even gotten past just what platforms are we working on?
Then we also talk about events and definitions. So what are the audiences that we care about? What are the conversions and the micro conversions that we're tracking? How do we really define those?
Because what that's going to do for us is give us one source of truth and something that every team member can look to, as far as what's going on. How do we define these things? Also think about, you know, you don't have to define bounce rate, 14 more times or engagement rate, you just give them this document.
And then lastly, we also include the implementation documentation. So how is it implemented? What tools are you using to implement? What are the filters or requirements that you're putting into place, whether it's for keeping that PII safe, right, or also just making sure that your tracking is working, because I guarantee you at some point, it's going to break unexpectedly.
And if you already have all this documentation made, then it is that much easier to troubleshoot, and figure out where it broke.
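As one example of the kind of filter worth documenting, here is a rough sketch of scrubbing email addresses out of page URLs before they get sent to an analytics tool. The parameter list and the REDACTED placeholder are assumptions for illustration, and the email pattern is deliberately simple rather than complete.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Simple email pattern; good enough for a sketch, not a full validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical list of query parameters that tend to carry PII.
PII_PARAMS = {"email", "phone", "name"}


def redact_url(url):
    """Return the URL with likely PII replaced by the word REDACTED."""
    parts = urlsplit(url)
    cleaned = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        # Redact known PII parameters and any value that looks like an email.
        if key.lower() in PII_PARAMS or EMAIL_RE.search(value):
            value = "REDACTED"
        cleaned.append((key, value))
    # Emails sometimes show up in the path as well, e.g. /u/<email>/orders.
    path = EMAIL_RE.sub("REDACTED", parts.path)
    return urlunsplit(
        (parts.scheme, parts.netloc, path, urlencode(cleaned), parts.fragment)
    )


print(redact_url("https://example.com/thanks?email=sam%40example.com&plan=pro"))
```

Writing the rule down next to the implementation notes also makes it quicker to check, when tracking breaks, whether a filter like this is the cause.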
Now with a flexible...keep in mind that your data should plan to change. Like I said, we change all the time. And so with this measurement plan, it should be a living organism, a living document. Keep it up to date. Maybe set up a quarterly check-in of 'Do you need to update anything on there?'
And then that also provides you a time to check and re-check your assumptions. So are there things that you're taking for granted, or that maybe need to be further defined when somebody pushes on them?
These are the types of things that that measurement plan can really help you figure out and then solve for. So this QR code will take you to an article that we've written. Like I said, we're super passionate about this. So it's all about how do you create your own measurement plan and we then kick you off with a template.
Alright, so that's about how you keep your data scalable. So it's really all about, you know - how do you grow it in a way that still has meaning for you? So now let's talk about the platforms. Also, what the heck do I mean by a platform?
So when I say platform, I mean your dashboarding tools, or where is it being stored?
So dashboard...of course, the big one being Looker Studio, formerly Data Studio. And then you've got Tableau, Databox, Portfolio, Domo - there are a ton of dashboarding tools out there.
And then warehousing and storage. So this might be something that previously wasn't really on your radar to worry about. But just so you know, with GA4 - the longest that you can have GA4 retain your website data is 14 months. So if you're wanting to see something from two years ago, or pre-COVID, right?
First off...well if it's pre-COVID you would have had to migrate it from Universal. But you also are going to have to have the data sent to these warehouses.
And that's another thing about GA4, that makes it really nice. So you can automatically send the data to BigQuery. We're going to talk a little bit more about the nuances with that. But you may use something like a Redshift, or a Snowflake.
And if you have an in-house server or an IT team that is taking care of that storage for you - just know some of the things we talk about may not apply. But hopefully, I'm giving you some ideas or language around how to be able to talk about what you need, what your needs are for getting access to that data in the long term.
Alright, so first, let's talk about dashboards. One of my biggest pet peeves is the fact that dashboards take forever to load. And I think it's because often we're trying to do too much with a single dashboard.
But I don't know if you've ever had the same experience that I've had, where you're on a phone with a client and you're trying to load Looker Studio and it's like, 'I promise, the data is there. We didn't lose it. It's coming eventually. Oh God, I hope so. Do you have any Thanksgiving plans? Like what's going on?'
Right? It's just a really awkward moment.
And part of that is because your dashboard is trying to do too much. So what are some things that you can do for getting more out of them?
Well, first off, what I really like to do is I like to make sure that my dashboards are properly themed around the type of data that they're talking about, by the types of questions that they can answer, and the audience who cares.
So really thinking about creating reports for each of those stakeholders. And you know, not...maybe not each individual person on the marketing team, but certainly the marketing manager versus the executive team. They have very different needs and very different wants.
And then another thing you could do is, you can do the heavy lifting of calculation somewhere else. Now, what do I mean by calculations? So something that happens pretty often, just think about with Looker Studio - you go ahead and you're making a filter, and you want it to be organic search only. Because, you know, we're SEOs.
And oh, by the way, my product is really only available in the United States. So I'm gonna limit to only US customers. And I also want to exclude anybody who was on my careers page, because that shows they're probably job seekers and not really applying to this audience that I'm trying to analyze. And then I also want to exclude any of my current customers.
So you've already added four filters. And we haven't even gotten into the data that actually matters yet. So these are things that you can do - you can use Google Sheets, if you upload a CSV, you can do that. I do that one less, because that's harder to automate. But with Google Sheets, you can run all these queries, and there are extensions that can help you, or you can use the Query Builders yourselves. Or you can do them in BigQuery.
And this way, your data loads a lot faster into Looker Studio, because you're not asking Looker Studio to make all of those calculations on the fly every time you load the dashboard.
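Here is a small sketch of that idea in Python, assuming made-up session rows: the four filters run once, ahead of time, and the dashboard would only read the small daily table that comes out. The field names and sample rows are invented for illustration and are not a real export schema.

```python
from collections import Counter

# Made-up session rows; the field names are assumptions for this sketch.
SESSIONS = [
    {"date": "2023-11-01", "channel": "Organic Search", "country": "United States",
     "visited_careers": False, "is_customer": False},
    {"date": "2023-11-01", "channel": "Paid Search", "country": "United States",
     "visited_careers": False, "is_customer": False},
    {"date": "2023-11-01", "channel": "Organic Search", "country": "Canada",
     "visited_careers": False, "is_customer": False},
    {"date": "2023-11-02", "channel": "Organic Search", "country": "United States",
     "visited_careers": True, "is_customer": False},
    {"date": "2023-11-02", "channel": "Organic Search", "country": "United States",
     "visited_careers": False, "is_customer": True},
    {"date": "2023-11-02", "channel": "Organic Search", "country": "United States",
     "visited_careers": False, "is_customer": False},
]


def is_target_session(row):
    """Apply the four filters from the example in one place."""
    return (
        row["channel"] == "Organic Search"        # organic search only
        and row["country"] == "United States"     # US visitors only
        and not row["visited_careers"]            # drop likely job seekers
        and not row["is_customer"]                # drop current customers
    )


def daily_sessions(rows):
    """Count qualifying sessions per day, sorted by date."""
    counts = Counter(row["date"] for row in rows if is_target_session(row))
    return dict(sorted(counts.items()))


summary = daily_sessions(SESSIONS)
print(summary)
```

The same logic could live in a Google Sheets query or a BigQuery scheduled query instead; the design choice is just that the filtering happens before the dashboard loads, not during.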
But before you get a little bit crazy, and try to do too much, just know that platforms have limitations. And I want to give you some of the caveats or gotchas.
So first, Google Sheets - it's definitely got some limitations, right? We just talked about CSVs. If you do it that way, you have to upload them individually, you're not going to get automated data.
With Google Sheets, you can. But if you have too many rows, and I believe the formal limitation is 5 million cells - I've definitely had times where my sheets just stopped responding. So I would say with that, specialize or focus your data.
So this isn't necessarily going to help with the 'too many spreadsheets' problem that SpongeBob was having. But ideally, what's happened is in your measurement plan, you've documented what data points or sets are important to you, how you're going to manipulate that data into getting to what actually matters to you. And so you have kind of this system already figured out of what matters and what doesn't.
So suddenly, it's becoming a lot more manageable. So maybe you don't have less spreadsheets, but you can manage them. It's become scalable.
And then with BigQuery, depending on what you do, you can end up with really high costs. So for many of our clients - and we have everything from Fortune 100 to small startups. For the majority of our clients, we can keep them under $10 to $20 a month with their BigQuery needs. Some of our very large clients, we can still keep them under like $40 a month. So you can do this relatively cheaply. But if you are not doing it properly, you could suddenly find yourself with a $1,500, $2,000 bill.
And that happens because people are setting up the connection directly from Looker Studio into BigQuery. So now you're having BigQuery make all those transactions.
And the way that BigQuery's cost is calculated is based on the number of transactions or calculations. So what I like to do is actually aggregate the results.
If you've seen what the table from GA4 to BigQuery looks like, it's showing you basically every single event and event param, like...every single interaction that happens on your website. Whether it was loading a page, somebody clicked on something, right? It gets really, really nuanced.
So what you can do is run queries at the end of every day that will aggregate that data into tables that you actually care about.
So this QR code will take you to a piece written by Analytics Canvas that I think is fantastic. It kind of walks you through how to do those things. Plus, they offer some tools to help you build SQL queries so that you can build the tables and the aggregated data sets that matter to you while keeping those costs low.
Because again, if I do it once, like, I can run a whole lot of transformations on data, and it costs like 60 cents. So just know that there's a lot of power there. But you have to know how to properly use it. So definitely recommend looking into this, if this is of interest to you. Plus be on the lookout for more coming from us about it.
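A minimal sketch of what an end-of-day rollup query could look like, built as a string in Python. It assumes the standard GA4 export layout (one `events_YYYYMMDD` table per day with `event_date`, `event_name`, and `user_pseudo_id` columns); the project and dataset names are placeholders, and the function name is hypothetical. Running the query and saving the result to a summary table is left to whatever BigQuery client or scheduled query you already use.

```python
from datetime import date


def build_daily_rollup_sql(project, dataset, day):
    """Build a query that collapses one day of raw events into a few rows."""
    suffix = day.strftime("%Y%m%d")  # the export names tables events_YYYYMMDD
    return f"""
SELECT
  event_date,
  event_name,
  COUNT(*) AS event_count,
  COUNT(DISTINCT user_pseudo_id) AS users
FROM `{project}.{dataset}.events_{suffix}`
GROUP BY event_date, event_name
""".strip()


# Placeholder project and dataset names, not real ones.
sql = build_daily_rollup_sql("my-project", "analytics_123456", date(2023, 11, 1))
print(sql)
```

The dashboard would then connect to the small aggregated table, so the raw event table is queried once a day instead of on every dashboard load.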
Alright, so that's the platforms that you're kind of using to deal with your data. But now let's talk about how you scale the insights.
And I like to talk about this from two different ways. First, it's how do you get the data to better communicate to you what's going on so that you don't have to spend as much time trying to dig through and find the nuggets. But then the flip side is also how do you then take that and better communicate to the other stakeholders or other team members or your clients in a way that's more meaningful for them?
So first, let's talk about how do you get more from it? And honestly, it's going to be all about what kind of visualizations you are using. So definitely focus on trends over element quantity (we're going to talk more about what that means), and then also benefit from color psychology.
So first, let's talk about trends. In SEO, everything we do is trends. And I feel like a lot of dashboards that I've seen are focusing on single points of time, which really don't make sense for SEO because we are in a race. It is based on what ours... what our competitors are doing. What's Google doing, right?
There's no just single point in time, that really makes sense usually for the decisions that we're trying to make.
So some of the things I like to say - instead of pie charts, use stacked area charts. These help you see a trend. So for example, I can see you know this...whatever time period we're looking at, I might want to look at it and say, 'Hey, what's going on here, because I saw a huge dip.'
Whereas a pie chart is just showing you a single point in time, it doesn't really give you the context to tell you what's important.
One of my favorite examples there is when people are talking about Share of Voice. We had 20% branded, or...or even branded versus non-branded search: 20% of our search terms were branded.
That doesn't tell you much of anything at all, because now you're wondering like, 'Okay, well 20% of how many searches and what was it last month? And you know, is their brand more popular or less popular?' Like what's...there's just a lot of nuance that is subtly missed there.
And then here also, if you're ever using what I call value pairs (I know, it's the big values), make sure you're noting what the time period is (at Gray Dot, we actually do month over month and year over year), and then use sparklines. Sparklines can be extremely helpful for helping you understand what's a good number versus what's not.
So with the example here of negative 36%. That seems really bad. But I can tell by the sparkline that, no, maybe that's just the seasonality, so we're probably doing okay.
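To show why both comparisons matter, here is a short sketch that computes month over month and year over year change for one metric. The numbers are invented so that the month-over-month figure matches the negative 36% example; the function names are hypothetical.

```python
from typing import Optional


def pct_change(current: float, previous: Optional[float]) -> Optional[float]:
    """Percent change from previous to current, or None if it can't be computed."""
    if previous is None or previous == 0:
        return None
    return (current - previous) * 100 / previous


def compare_month(monthly: dict, year: int, month: int) -> dict:
    """Month-over-month and year-over-year change for the given month."""
    current = monthly[(year, month)]
    prev_key = (year, month - 1) if month > 1 else (year - 1, 12)
    return {
        "mom": pct_change(current, monthly.get(prev_key)),
        "yoy": pct_change(current, monthly.get((year - 1, month))),
    }


# Invented monthly session counts keyed by (year, month).
monthly_sessions = {(2022, 11): 500, (2023, 10): 1000, (2023, 11): 640}
result = compare_month(monthly_sessions, 2023, 11)
print(result)  # {'mom': -36.0, 'yoy': 28.0}
```

With these made-up numbers, the month looks bad against October but is up against the same month last year, which is the kind of seasonality a sparkline makes visible at a glance.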
And then leaning on the psychology of color. I'm not saying you need to become a designer, good lord knows we have enough on our plates. But I do say go ahead and start learning about what color can do for how it makes people feel. And so this QR code will take you to an article written by Toptal - they've done really all the hard work for you.
But I will say my biggest plea is: stop using red text. Stop it. There's a couple different reasons.
First, it's not accessible. Red text is hard to see even for people with perfect vision. It blurs on a screen, so it's just really difficult. The other thing, for your accessibility, is colorblindness: the most common form is not being able to see the difference between red and green, and it mostly affects men.
So already you're kind of shifting the conversation and you're making it in a place where people can't... can't even understand what they're looking at. The other piece - it... I've seen it happen way too many times, where there is a sea of green metrics. And people want to focus on just the one single red. So that's something that definitely, you know, use something else.
At Gray Dot, we typically use grays or oranges. So I'm not saying to hide the data, right? Still show that decrease, we want to be really transparent in our communications, but stop putting... putting so much focus on it that it's taking away your power in telling the story.
Alright, so now you know how to get your data to like spill the tea for you, if you will.
So how do you get to be better at communicating what the data is telling you to the other people that...that it matters to. It's all gonna be about customizing your comms. So in the same way that we customize our marketing messages and our branding to our different personas, develop personas for your different internal audiences.
So as I mentioned before, with executive teams, they only want a one-page report, they do not have time to care. The marketing manager, he wants a 40-page report. So those are the things that matter.
And also think about what are the things that they care about. So what are their KPIs? What are they being graded on? So for example - is there data from the website that you could be sharing with the product team to help them better develop the products or prioritize the roadmap? Right?
And so one of the things we'll do there is say, 'Hey, a lot of the, you know...90% of the people who actually bought this product came in from this blog post.' That is really useful information for that product team so that they understand the costs, how people are coming in, how it's relating to them.
And then...it can also help them figure out, you know, do they need to do more customer service? Do they need to expand on that feature even more, things like that.
So the same way that we customize our communications and our messaging for our external audiences, right? Our prospects, our leads, and future customers - we need to do the same internally. Whether it's internal stakeholders across different teams, or with your clients in an agency environment. Customize those comms. One message does not work for all.
All right, and that is a little bit about what I know about how to scale your reporting and make sure that you can really get your arms around your data.