The Google API Leak Should Change How Marketers and Publishers Do SEO

Update: We are diving deep into the Google API leak and what it means for marketers in our next episode of SparkToro Office Hours on June 27. Join Michael King, Founder & CEO, iPullRank, and Rand Fishkin, Co-founder & CEO, SparkToro, to learn more. RSVP for the free webinar.

Recently, my good friend Wil Reynolds wrote a LinkedIn post questioning whether the recent Google Search API leak had actionable takeaways that would be applicable to customer campaigns. Will Critchlow (another close friend and CEO of SearchPilot) wrote on LinkedIn:

“So far, I’m not changing any tactics off the back of this new information (I’d be very interested to hear from anyone who is!).”

That shocked me, because, even as someone who’s not thinking much about SEO as a key marketing channel, there are a number of different choices I’m planning to make about our marketing strategy and tactics to hopefully set us up for better potential search opportunities. So, for those who might have read my breakdown and Mike’s superb deep-dive on Monday/Tuesday, but haven’t kept close tabs on the story around Google’s Search API leaks, here’s more detail in a fast-moving, 11-minute video (after all, I’m talking about SEO again, and it is Friday 😎 ).

Transcript, links, and visuals:

Well, friends, it’s been an incredibly exciting week after the Google Search API leak, and I’m gonna try and walk you through what’s going on and what’s happened in just seven (oops… eleven) minutes.

So here we go. First off, in brief, this is two thousand five hundred and sixty-nine documents that were leaked with fourteen thousand and fourteen different attributes, features in the API that you can see, browse, and check out if you’d like. They’re hosted on HexDocs right now. There are a number of folks who’ve downloaded them and uploaded them to other places.

It was confirmed this week to The Verge and other outlets that this does come from Google. It is real. This is their internal API documentation. They certainly did not intend for it to be public.

It includes notable details and things that tie into other things we know about Google from, for example, the DOJ evidence, white papers they’ve written, patent applications, all that kind of stuff. How much of it is about Google’s actual web search engine? Because there’s a lot of stuff in there. Well, Mike King analyzed that for us along with a bunch of other things, and there you go.

Eight thousand of the fourteen thousand features are web search related specifically. Yes, there’s stuff in there about YouTube and Maps and local and all that kind of thing, but this is by and large mostly the search API. So let’s break down what Google said when they confirmed this leak.

Google said to The Verge, “We would caution against making inaccurate assumptions about Search based on out-of-context, outdated, or incomplete information.” Well, that is generic and it sounds like they fed it into an AI, but okay. Basically, as long as we only make accurate assumptions, we’re all good. Okay.

Thanks, Google.

And given that this leak is not out of context, it is not outdated, and it is not incomplete information, I think we’re fine to derive a lot of value from it.

“We’ve shared extensive information about how Search works and the types of factors that…” Okay. Whatever. I don’t know. Okay.

What were the biggest initial discoveries in the leak?

Well, first off, we knew from the Department of Justice trial, and in fact some other leaks inside of Google, that they considered Navboost, which is based on click data, to be an extremely important, perhaps the most important, ranking factor in the search engine. And we now know, thanks to this leak, that that comes from Chrome, Google Chrome. So this is Paul Haahr’s resume, which was shared as part of the DOJ trial.

And you can see Navboost. This is already one of Google’s strongest ranking signals. There’s actually another document from the DOJ trial saying Navboost is our most important ranking signal. It was in an email between Googlers.

We know that comes from Chrome.

Google quality rater feedback. So quality raters are the folks like Cyrus Shepard, who secretly worked as a search quality rater at Google. No idea how he got hired, but pretty remarkable that he did.

We know that their input is directly callable in the search API. How much it’s weighted, how frequently it’s used, for what queries it’s used, we don’t know. But the fact that it’s in there is shocking, quite shocking.

Whitelists were also surprising to a lot of folks, although they shouldn’t be. For example, let’s say, you know, there were lots of folks during the COVID pandemic who were trying to spread misinformation that literally would have killed hundreds of thousands or millions of people and, of course, sadly, millions of people did die.

But preventing that sort of misinformation and disinformation by suppressing sources that are malicious or that knowingly spread bad info, that’s obviously Google’s responsibility. Glad to see them doing it.

What are the biggest new discoveries in the leak since Monday’s post came out? First off, toxic backlinks exist. Look at this. There’s a “bad backlinks penalized” feature that can be called. And it mentions the Mustang Ascorer. Mustang, by the way, is Amit Singhal’s ranking system, which has been used at Google for years and years and was mentioned by some of the other engineers there. In any case, the fact that this exists is, I think, quite validating for a lot of folks in the search marketing field, and SEOs in particular.

Google might be limiting the number of sites of a particular type that appear in the results. Mike King talked about this in particular on a Search Engine Land blog post, which I’m gonna send you to. But they specifically call out blogs, and identifying commercial and travel types of sites. And so it’s quite possible that Google would say, oh, only this many blogs can appear in the results. Only this many small personal sites could appear. Only this many commercial sites could appear. Only this many local sites could appear.
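To make that idea concrete, here is a minimal sketch of what a per-type cap on a results page could look like. This is purely illustrative: the function name, the site-type labels, and the cap values are all hypothetical, and nothing in the leaked docs confirms that Google does it this way.

```python
from collections import Counter

def cap_by_site_type(ranked_results, caps, limit=10):
    """Walk an already-ranked list top-down and skip a result once the
    quota for its site type is used up. Hypothetical illustration only."""
    used = Counter()
    page = []
    for url, site_type in ranked_results:
        quota = caps.get(site_type)  # None means this type has no cap
        if quota is not None and used[site_type] >= quota:
            continue  # quota for this type is full, so skip the result
        used[site_type] += 1
        page.append(url)
        if len(page) == limit:
            break
    return page

# Made-up ranked results as (url, site_type) pairs, best first.
ranked = [
    ("blog-a.example", "blog"),
    ("shop-a.example", "commercial"),
    ("blog-b.example", "blog"),
    ("blog-c.example", "blog"),
    ("travel-a.example", "travel"),
    ("shop-b.example", "commercial"),
]

print(cap_by_site_type(ranked, caps={"blog": 2, "commercial": 1}, limit=4))
```

With a cap of two blogs, the third blog gets skipped even though it outranks the travel site, which is the kind of effect a publisher would notice as “my blog can’t break into this results page.”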

Other discoveries include lots of modules and features that mention mentions. Basically, the word or phrase that associates with an entity, like my name, Rand Fishkin, or my company’s name, SparkToro.

And then those mentions as they are made across the web could influence search rankings similar to the way that links do. We don’t know for sure. I would say this is actually one of the areas we really need to dig into.

One of the other fascinating ones, which Andrew Ansley called out in his Search Engine Land post, is the idea that page titles have site-wide implications. Look at this. There’s a title match score for the site, not just the individual pages. This is also true for something called page2vec. There’s a site2vec. So a lot of things that I think SEOs thought historically, certainly I did when I was in the industry, were page specific are actually site-wide signals.

Okay. Big question. If Chrome clickstream data is used for rankings, does that mean paid clicks could boost organic rankings?

Okay. Over the years, a lot of people have noticed this sort of effect where they do a big PPC buy and they see a big SEO boost.

Maybe it’s possible Google is discounting the specific paid search clicks. Right? They’re ignoring those. But what about when people click those links and share them with friends or colleagues or family, or open them on other devices, or bookmark them, or email them to themselves for later?

It seems probable that those might have an impact, and thus paying for clicks could actually boost your organic rankings in some cases.

I think it’s certainly reasonable to assume that this might impact the Chrome clickstream, and it’s possible for that to have a positive impact, unintended by Google though it might be. Could Google control for this? Sure. They could.

There are ways that they could, you know, build machine learning models to filter it out and to say, well, when someone buys, usually they see this much more lift in traffic, and so we’re gonna discount that. That’s possible. I don’t know. I haven’t seen that in the documents. I haven’t seen anyone uncover that yet.

What are we actually changing in our marketing strategies based on this leak? This is a really good question. I saw Wil Reynolds on LinkedIn say nothing.

That surprised me quite a bit. I think, you know, Wil’s one of my closest friends and one of the smartest guys in search. I’m surprised he’s changing nothing. I’ll tell you some things that I am changing and thinking about changing.

One is SparkToro. We were thinking about potentially hiring some contract authors to write content for us. We haven’t invested in SEO, but we probably should in the next few years. Now I’m thinking, you know what? We should probably hire one experienced author who has lots of citations, and whose entity is sort of strong in Google’s index, instead of multiple authors.

I’m trying to balance the zero-click mentality. Right? Something Amanda and I have invested in a bunch is this idea of zero-click everything, zero-click marketing.

And that includes our emails having all of the info that you might need to consume. But now I’m thinking maybe we should actually keep it such that you have to click on an email to go to the full blog post and read it, because that Chrome clickstream data is probably really valuable to showing Google that we’re important and that we’re, you know, getting that ranking signal.

Link building and digital PR: for example, I do a lot of podcasts with, you know, very small publishers who probably have five or ten listeners only per episode. I reasoned this is good for SEO because you want a big diversity of links and mentions across the web. But you know what? It looks like Google actually is using the Chrome clickstream data to devalue links and potentially entity mentions from pages and sites that don’t get much traffic. So I should probably focus more of my attention and energy on those big sources and less on doing lots of little ones.

Creating demand for video or images by producing videos, by producing images, that could bias the search results in a direction that I want. I think that’s probably a useful thing for almost everyone.

The value of outlinking: I was really surprised to see that outlinking is exclusively connected to spam score in the documents and nothing else. Nothing in the site quality, nothing in the ranking. That surprised me. I don’t know. Maybe someone will dig deeper and find that outlinks are connected in a way, but I thought outlinking could be a very positive signal. Apparently, in these docs, it looks like it’s only a negative.

Page title matching: surprisingly important factor. I kinda thought Google had given up on this and that, you know, if you were the reference resource for something and if lots of other pages pointed to you for a topic, you could rank with a more story-like or more creative title. But that is not what we are seeing in these docs so far.

Mentions equals links?

Maybe. I think if we dig more into this, if we experiment more, I might be tempted to not worry about links at all and just go after mentions of our brand and associations with the keywords that we want.

So what are the next steps?

I really would like someone to figure out what Keto score, which is mentioned in a few of the documents, is. That is very interesting. It appears tied to entities and to mentions.

I want someone to dig into the mentions thing. I want tests. I wanna see results. Obviously, I’m not an SEO, but I know that many of you in the SEO community could dig into this and tell us. And that might really change the behavior of how, you know, digital PR folks and link builders operate.

I would also encourage anyone who’s interested in this to read Mike’s latest on Search Engine Land. It’s very comprehensive and thorough. It’s like a twenty-minute read, but it is really good.

Mike is also joining us at SparkToro for our June 27th episode of Office Hours. That’s gonna be amazing. I’m sure he’s gonna have way more. If this stuff is interesting to you, you should register for that.

And Google, I appreciate all the SEO folks saying, “Hey, you should apologize to Rand Fishkin.” No. Google, you know what you should do?

You should put Russ Jones’ kids through college, because he deserves it. And I think that would be a nice way to say you’re sorry for what you put him through during those last few years before he passed. Alright. Take care, friends.