Who Sends Traffic on the Web and How Much? New Research from Datos & SparkToro

In the era of zero-click platforms, native content, and walled gardens, does anyone still send meaningful traffic to the open web? Is Google responsible for 50% of all traffic referrals? 75%? Is social media sending half what search does? A quarter? Less? Are the major platforms sending less traffic overall than they did a year ago?

These are frustrating, hard-to-solve problems, requiring a massive dataset, an opt-in panel of real people who share the URLs they visit, a partner to provide said data, and a lot of thoughtful, careful decisions to accurately aggregate and compare referral traffic outside analytics tools (because so much of the web now sends dark traffic, i.e. without a referral string). Thanks to SparkToro’s partnership with clickstream data provider, Datos, we can finally deliver answers to these incredibly important questions.

Best I can tell from my own searching, these questions have never before been addressed at scale through a credible process. And the results are surprising indeed.

The Largest Traffic Referrers on the Web

It’s Google, and by a mile. Close to 2/3rds (63.41%) of all US web traffic referrals from the top 170 sites originated on Google.com. The second-largest individual traffic-referring domain is technically YouTube.com, but whereas Google.com hosts Google Docs, Gmail, Google Meet, and others, Microsoft splits these among a wide range of domains in the top 100 (Bing.com, Office.com, Live.com, Office365.com, Sharepoint.com, MicrosoftOnline.com, and Microsoft.com). I reasoned it was only fair to group these and compare apples to apples. Taken together, these Microsoft-owned sites are responsible for a combined 7.21% of referrals.

And for those curious, the 170th-largest traffic referrer (Pinimg.com) sent 0.003197%, suggesting that even if the next thousand sites (#171-1,170) all sent similar amounts of traffic to the web, their combined referral traffic (roughly 3.2% at most) would still be smaller than Facebook’s or YouTube’s.
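For anyone who wants to check the Pinimg.com math mentioned above, here is a minimal sketch of the upper bound. The only number taken from the study is Pinimg.com's 0.003197% share; the function name is my own, and the bound assumes every lower-ranked site sends no more than the 170th does.

```python
# Share of referrals sent by the 170th-largest referrer (Pinimg.com), in percent.
PINIMG_SHARE_PCT = 0.003197


def combined_upper_bound(share_pct: float, n_sites: int) -> float:
    """Upper bound on the combined share of n_sites that each send at most share_pct."""
    return share_pct * n_sites


# Assumption: sites ranked #171 and below each send no more than Pinimg.com.
bound = combined_upper_bound(PINIMG_SHARE_PCT, 1000)
print(f"Next 1,000 referrers combined: at most {bound:.3f}% of referrals")
```

In other words, a thousand more sites at Pinimg.com's level would add up to roughly 3.2% of referrals.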

Google’s dominance in traffic delivery won’t surprise anyone in the digital marketing world. The search giant often refers 70%+ of any given site’s traffic, vastly exceeding all social media and content sites combined. But, until now, I was never sure just how sizable that margin was, or whether dark social traffic was responsible for over/underestimates. Google’s almost 10X larger than the next largest referrer in our dataset.

Throughout this study, you’ll note that I’ve taken the liberty of grouping other sites with similar issues, including Twitter’s traffic-referring domains (Twitter.com, T.co, and Twimg.com), Cisco’s (okta.com and duo.com, which process URLs for security checks), Pinterest’s (pinterest.com and pinimg.com), etc. But, for those interested in the unfiltered/ungrouped list, I’ve made that available here.

For the ungrouped version, the data covers the full time period of the study (13 months from Jan 2023 – Jan 2024). The numbers are quite similar, though there’s variance from the grouped, January-2024-only version highlighted above.

Google’s dominance is clear. Microsoft’s sites, Facebook, YouTube, Reddit, DuckDuckGo, Yahoo!, and Twitter are the only others that send 1%+ of referral traffic. These include referrals between the noted domains – Google sending traffic to YouTube, Live.com sending traffic to Bing.com, etc. But, I think most readers may be more interested in the more nuanced view in the next section.
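The domain grouping described above can be sketched in a few lines. This is an illustration only, not the actual Datos or Excel process: the owner map covers just a subset of the groupings named in this post, and the referral counts are made-up numbers chosen to show the mechanics.

```python
from collections import defaultdict

# A subset of the owner groupings named in this post.
# Any domain not listed here stays as its own group.
OWNER_BY_DOMAIN = {
    "bing.com": "Microsoft", "office.com": "Microsoft", "live.com": "Microsoft",
    "office365.com": "Microsoft", "sharepoint.com": "Microsoft",
    "microsoftonline.com": "Microsoft", "microsoft.com": "Microsoft",
    "twitter.com": "Twitter", "t.co": "Twitter", "twimg.com": "Twitter",
    "pinterest.com": "Pinterest", "pinimg.com": "Pinterest",
}


def grouped_shares(referrals_by_domain):
    """Roll domains up to their owner and return each owner's share (%) of referrals."""
    totals = defaultdict(int)
    for domain, count in referrals_by_domain.items():
        key = domain.lower()
        totals[OWNER_BY_DOMAIN.get(key, key)] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return {owner: 100 * count / grand_total for owner, count in ranked}


# Hypothetical referral counts, NOT figures from the study.
sample = {
    "google.com": 600, "bing.com": 40, "live.com": 20,
    "twitter.com": 10, "t.co": 5, "example.org": 25,
}
shares = grouped_shares(sample)
for owner, pct in shares.items():
    print(f"{owner}: {pct:.2f}%")
```

Grouping first and computing shares second is what keeps the comparison apples to apples: Bing.com and Live.com count once, under Microsoft, the same way Gmail and Docs already count under Google.com.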

What a Traffic Referral Is and Isn’t

Before we go too much further into the findings, it’s important to understand what we mean by a “referral.” Referrals aren’t just clicks on links or ads. If you hear SparkToro mentioned on an Apple podcast and type “sparktoro.com” into your browser’s address bar, that’s still a referral, albeit a much less common way for users to navigate the web and a much more difficult one to track.

Thankfully, Datos’ panel can account for this kind of referral. Included in this study are any visits that happened in a browsing session (defined by a period of active use).
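To make the session idea concrete, here is a rough sketch of how session-based referrals could be derived from one panelist's visit log. Everything specific in it is my assumption rather than Datos' documented method: the 30-minute inactivity cutoff, the `Visit` record, and the rule that the previously visited domain in the same session counts as the referrer.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed inactivity cutoff; the post only says a session is "a period of active use".
SESSION_GAP = timedelta(minutes=30)


@dataclass
class Visit:
    timestamp: datetime
    domain: str


def session_referrals(visits, gap=SESSION_GAP):
    """Return (referrer, destination) pairs for consecutive visits in the same session."""
    pairs = []
    previous = None
    for visit in sorted(visits, key=lambda v: v.timestamp):
        in_same_session = (
            previous is not None and visit.timestamp - previous.timestamp <= gap
        )
        # Moving to a different domain inside a session counts as a referral,
        # whether the user clicked a link or typed the address.
        if in_same_session and visit.domain != previous.domain:
            pairs.append((previous.domain, visit.domain))
        previous = visit
    return pairs


# Hypothetical visit log for a single panelist.
log = [
    Visit(datetime(2024, 1, 15, 9, 0), "podcasts.apple.com"),
    Visit(datetime(2024, 1, 15, 9, 5), "sparktoro.com"),      # typed in, same session
    Visit(datetime(2024, 1, 15, 11, 0), "mail.google.com"),   # new session, no referral
]
print(session_referrals(log))
```

Under these assumptions, the typed-in visit to sparktoro.com is credited to the podcast page, while the Gmail visit two hours later starts a fresh session and is credited to no one.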

Those among you who worry over the details of URL-visiting methodology (like me!) might now be concerned that this would overinflate referrals from any and every website to the most popular navigation and consumption platforms (i.e. Google, Facebook, YouTube, Reddit, etc.). What if someone finishes listening to their podcast and clicks on their Gmail bookmark or types in YouTube to unwind with some movie trailers? Good news—we can control for that, too!

By looking at only referrals to sites outside the most-visited ones, we can see how traffic flows from the major traffic referrers to the rest of the web. Let’s explore that next.

The Largest Traffic Referrers to the “Long Tail” of the Web

The pie and bar charts above show data on all referrals. Yet, for the overwhelming majority of marketers, website builders, business owners, investors, and those who worry about monopoly power, what we really want to know is:

“Who sends traffic to the little guys?”

To explore this, I asked Datos to break the referral data into two categories: referrals to the top 170 most-visited sites on the American web and referrals to everyone else outside that top 170, what I’m calling “The Long Tail.”
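The split described above boils down to filtering on the destination. A minimal sketch, assuming referral data shaped as (referrer, destination, count) rows and a set holding the top 170 domains; the domain names and counts below are placeholders, not study data.

```python
def long_tail_shares(referrals, top_sites):
    """Share (%) of Long Tail referrals sent by each referrer.

    referrals: iterable of (referrer, destination, count) rows.
    top_sites: set of the most-visited domains; any other destination is Long Tail.
    """
    sent = {}
    for referrer, destination, count in referrals:
        if destination not in top_sites:
            sent[referrer] = sent.get(referrer, 0) + count
    total = sum(sent.values())
    if total == 0:
        return {}
    return {referrer: 100 * count / total for referrer, count in sent.items()}


# Placeholder stand-in for the top 170 list and hypothetical referral counts.
top_sites = {"google.com", "youtube.com", "reddit.com"}
rows = [
    ("google.com", "youtube.com", 500),         # top 170 to top 170: excluded
    ("google.com", "smallblog.example", 300),   # Long Tail
    ("reddit.com", "indiegame.example", 100),   # Long Tail
    ("youtube.com", "google.com", 50),          # top 170 to top 170: excluded
]
tail = long_tail_shares(rows, top_sites)
print(tail)
```

Note that a referrer's rank can change a lot once top-170-to-top-170 traffic is dropped, which is exactly why the Long Tail view is worth looking at separately.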

Only 22 sites have a 0.1% or greater share of web traffic referrals to the Long Tail, as you can see in the graph below:

To me, the most interesting, standout surprises are:

  • Reddit still makes the cut! Despite their Orwellian efforts to box out moderators, reduce outbound traffic, and sell user content as AI training data without compensation, Reddit’s still one of the few large referrers of traffic to the Long Tail.
  • YouTube is below Facebook and Reddit! This one blows me away, because YouTube is technically open to anyone, and so many millions of channels exist there. But, sadly, it’s the big sites getting most of YouTube’s referrals.
  • Cisco Okta and Duo? Both domains are connected to how Cisco provides security to many of their customers, and I suspect these domains are more about processing clicks on Cisco-monitored machines than about actually referring traffic the way most of the media, social, and search sites do. Still, an interesting look at how web traffic flows.
  • Amazon! I’m almost shocked they’re on the list at all, given Amazon’s aversion to outlinking and frequent use as a “final destination” for e-commerce purchases, watching shows, etc. It’s nice to see Amazon’s referring some traffic at least.
  • LinkedIn is another surprise. Honestly, I worried that with their intense algorithmic bias against links, and increasing efforts to make everything point to a LinkedIn page, they’d be lower down. Nice to know there’s still a shot for us little guys on the business-focused network.
  • Twitter’s not a surprise, though if they’re still there in 3 years, that would be. The site’s decline in traffic, app downloads, usage, and civility is troubling, and most data-savvy prognosticators expect Meta’s Threads to overtake them within a few years.
  • OpenAI – I can’t decide if I’m surprised they’re already here, or surprised they’re not higher. One worry I have is whether OpenAI’s references and referrals will, long-term, favor the Long Tail vs. the largest sites on the web. Hard to predict.

My fear is that, over time, we’ll see fewer referrals going to the Long Tail of the web, and more going to the big winners already at the top. So, I graphed the stats:

It’s already happening. In the 13 months of this study, the Long Tail lost ~3.2% to the top 170, equivalent to tens of millions of visitors and billions of clicks. And while the number of monthly unique visitors (among Datos’ fixed-size, stable US panel) to the top 170 grew by 0.299%, referral traffic grew by only 0.103%, suggesting that, on the whole, the most powerful, popular sites on the web are sending out a shrinking share of the traffic they receive.

Who’s Hoarding Traffic? And Who’s Generous with Referrals?

One of the best ways to compare traffic referrers is to look not only at how much traffic they send out, but at the ratio of how much they send to how much they receive. For example, Google refers nearly 100 visits to external sites per monthly unique visitor they receive. No surprise, as they’re not just a search engine but often, also, a navigation tool. Many of us no longer type domain names into our browser address bar; we type a brand name and count on Google to take us to the right place.

When applying this metric, I decided to visualize it as it applied to the Long Tail, excluding the traffic that sites in the top 170 send each other. From these, we can see both the most generous referrers (those who send a high percentage of visitors to the Long Tail) and those who hoard traffic and/or are at the end of a browsing session.

Let’s start with the traffic-hoarders:

Large medical sites and media consumption platforms are unsurprising to find here, as are utility-focused sites like UPS, FedEx, and Grammarly. More surprising were media sites like Wikihow (ruthlessly bold of them not to refer traffic given how their content is created), Investopedia, and US News (ranking top colleges? Sure. But they’ll be hog-tied to a peanut before they send ’em traffic!). GameRant, a site with video game news which, I’m ashamed to say, features prominently in my own Google Discover feed, sends a measly 79 referrals to the Long Tail for every 1,000 monthly unique visitors they receive.

Not visits. Uniques! In Datos’ panel, these are called “users” and refer to individual panelists on different devices, making the metric even more egregiously infuriating (at least, for those who believe websites making references should link to their sources).
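The metric above is just a ratio, scaled to "per 1,000 uniques" so small numbers stay readable. Here is a minimal sketch; the raw counts are hypothetical, picked only so the result lands on the 79-per-1,000 figure quoted above.

```python
def referrals_per_1000_uniques(long_tail_referrals: int, monthly_uniques: int) -> float:
    """Long Tail referrals sent per 1,000 monthly unique visitors received."""
    if monthly_uniques <= 0:
        raise ValueError("monthly_uniques must be positive")
    return 1000 * long_tail_referrals / monthly_uniques


# Hypothetical panel counts, NOT figures from the study:
# 7,900 Long Tail referrals from a site with 100,000 monthly unique visitors.
rate = referrals_per_1000_uniques(7_900, 100_000)
print(f"{rate:.0f} referrals per 1,000 monthly uniques")
```

Because the denominator is unique visitors rather than visits, a site whose visitors come back many times a month looks even stingier than a visits-based ratio would suggest.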

Moving on to the inverse—those with the highest amounts of referred traffic per unique visitor—we see search engines dominating the top of the list.

Again, this makes sense and reinforces the fact that we’ve got quality, trustworthy data. Search engines are meant to send out traffic, and they do, though (perhaps surprisingly) Bing and Yahoo are far less generous toward the Long Tail of the web than Google or DuckDuckGo.

Facebook is quite a surprise to see, given their crackdowns on linking out and the algo’s general disfavor toward content with links. Instructure is a fascinating one that I wasn’t familiar with before the study. They’re one of the most-trafficked sites in the United States, mostly by educators and students (at all levels). It’s great to see that they refer so much traffic to other sites in the Long Tail, linking out frequently from the syllabi and workbooks teachers create to the open web.

My other big surprise was seeing Fox News more generous with referral traffic than rivals CNN, NYTimes, MSN, and others in the news business.

How Are Traffic Referrals Changing Over Time?

Over the study’s 13 months, we saw plenty of fluctuation, much of it likely due to seasonality (e.g. e-commerce sites tend to refer more traffic the closer it gets to the US gift-giving holidays). I hand-picked a few of the most interesting and dramatic deltas and split them into two graphs to better visualize the changes.

First, changes among sites lower in the Top 170:

Threads grew fast in Q4 2023 and Q1 2024. Based on their recent app download data, I’d reckon that growth has considerably more headroom. Canva’s traffic grew nicely in 2023, but my guess is that some combination of new AI, localization, and/or link design features from last year is responsible for the 50% growth in referrals to the long tail.

CapitalOneShopping.com is a great example of the e-commerce seasonality trend, though the big surprise is that they didn’t fall further in Q1. The site appears to be taking off on the back of Capital One’s massive growth over the last few years (they’ve grown so big that they’re trying to take over Discover Card, a deal regulators are looking into as potentially anticompetitive).

Next, the changes among larger sites:

Reddit’s controversial API and community-moderation moves last year are almost certainly connected to the decline in referrals from the platform. Many communities that frequently linked out, and many third-party clients that encouraged outlinking, suffered in that fracas and are no longer part of Reddit. The site’s also taken action in the last few years to host more types of content itself (images, videos, etc.), leading users to post native content (which algorithmically and vote-wise performs better than off-site material) and further reducing referral traffic. Overall, Reddit referrals fell 30% over the study period.

OpenAI, meanwhile, rose 82%, though it’s still the smallest of the search/answer engines (more on that below). Twitter bounced around (most probably because of Elon’s experiments with removing, then re-adding link previews and rich visuals to external links), ending up +6% in referrals YoY despite declining overall traffic. Tumblr referrals were down 41% and StackOverflow dropped 18%.
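The percentages above are simple start-to-end changes. A minimal sketch of that calculation follows; the monthly referral counts are invented so the outputs match the -30% and +82% figures quoted above, and comparing the first month to the last is my assumption about how the deltas were taken.

```python
def percent_change(first: float, last: float) -> float:
    """Percent change from the first period's value to the last period's value."""
    if first == 0:
        raise ValueError("first must be non-zero")
    return 100 * (last - first) / first


# Hypothetical referral counts (first month, last month), NOT study data.
series = {
    "reddit.com": (1000, 700),
    "openai.com": (100, 182),
}
changes = {site: percent_change(first, last) for site, (first, last) in series.items()}
for site, change in changes.items():
    print(f"{site}: {change:+.0f}%")
```

Keep in mind that a big percentage on a small base (OpenAI) can still mean far fewer absolute referrals than a modest percentage on a large one (Reddit).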

How Do Mobile vs. Desktop Web Referrals Compare?

For those curious about how desktop vs. mobile referrals compare, Datos pulled separate values for four of the largest individual referrers. I’ve visualized those below.

Google and Facebook both send out significantly more referrals per visitor on desktop than mobile. But YouTube and Reddit are, fascinatingly, the inverse! Remember that these are using the mobile web versions, not the apps (which are even less friendly to external links).

I think this is something to study in more depth in the future, especially if we can get access to more mobile and mobile-app panelist data to analyze.

Analyzing Traffic Referrals from Social Media Sites

Last, but not least, I wanted to look at two unique categories of sites: social media and search engines. Rather than compare these against all referrals, I looked only at the share of social media referrals (below) to see how the major players compare.

YouTube and Facebook are in a dead heat, with Reddit close behind. Twitter, despite having fewer than 1/5th of Instagram’s users, sends out almost 3X the traffic. LinkedIn recently crossed 1 billion active profiles, but doesn’t disclose monthly active users that could be compared side by side with the other networks.

I found it quite surprising that Discord refers so much traffic; hopefully a lot of that is going to indie game developers (a personal passion of mine)😉😎. And perhaps equally surprising is WhatsApp, which claims more than 2B active users but sends out so little referral traffic relative to that user base (more a fault of user chat behavior than the platform). Remember that while WhatsApp on mobile is always an app (which Datos’ panel doesn’t have information about), WhatsApp desktop does exist and can be seen. It’s always possible that app users are more likely to refer traffic on mobile devices vs. desktop, but I strongly doubt it.

Traffic Referrals from the Search Engines

In search engines, it’s no surprise that this data closely matches sources like StatCounter (they show 87% Google; Datos’ panel suggests 85%).

I’ve included OpenAI because, despite not technically being a search engine, it did begin to link out much more in 2023 when asked search-style queries, adding citations and references that weren’t previously available.

Finally, if Bing looks high in this study (Bing.com by itself is closer to 5% market share in the US), don’t forget that I’ve grouped it with the other Microsoft sites that also use Bing’s search technology and send traffic of their own. As mentioned above, given that Gmail traffic is wrapped into Google.com through the Mail.Google.com subdomain and docs/drive/sheets/etc. are also there, I felt it only fair to include Live.com and Office/Office365/Microsoft/Microsoft Online referrals.

Have Questions? Want to Know More? Join Our Upcoming Webinar on Traffic Referral Data!

The Datos team and I are co-hosting a special episode of SparkToro’s Office Hours on April 16th at 11am Pacific / 2pm Eastern. I promise it’ll be not only informative and chock-full of findings from this study, but also engagingly entertaining. There will be angry-fist-shaking about the traffic-hoarders of the web, tactical tips about how to get traffic from some of these surprisingly-high traffic referrers, and possibly even some new bits of data from the study based on feedback and questions we receive over the next few weeks.

Who Sends Traffic on the Web, How Much, and What’s Changing?
April 16 @ 11am Pacific / 2pm Eastern
Featuring Eli Goodman (cofounder & CEO, Datos) and Rand Fishkin (cofounder & CEO, SparkToro)

(Yes, I still have the numbers to run more calculations and graphs. And yes, I still haven’t run out of love for doing that messy Excel work).

Data Quality and Methodology

If you dove into the charts and graphs above and wondered about the veracity of this data, you’re not alone! Asking questions about data quality, methodology, and panel selection is important and I take my responsibility seriously to present the best, most accurate data possible. So, let me expound a bit on the breadth, depth, and quality of the process we’ve used to compile this information.

Datos’ opt-in, anonymized panel comprises tens of millions of devices around the world. But, to produce the most reliable results possible, we’ve limited the information to a panel of several hundred thousand users in the United States (on desktop and mobile devices as noted) whose behavior, inferred demographics, and visit patterns are both stable (meaning they remained active on the web and opted in to the Datos panel during each of the 13 months of this study) and representative (i.e. their aggregated visit data matches public information provided by verifiable sources like Facebook and Google’s quarterly financial reports, and their inferred demographic makeup is generally aligned with that of US Internet users).

If you’re not quite familiar with clickstream panels, think of them like the digital equivalent of Nielsen TV families. Instead of a box that sits on thousands of American televisions, feeding data about what shows they watch back to CBS and NBC in 1985, it’s software installed on desktops, laptops, and mobile devices of Internet users that have opted-in to share their visits, noting, anonymizing, and aggregating the URLs they visit. Companies like Comscore, Nielsen/NetRatings, SimilarWeb, and Datos can then estimate traffic, market share, referrals, and other important metrics just as Nielsen’s TV panel helped CBS and NBC estimate how many people watched Magnum P.I. vs. MacGyver.

But, assembling a sizable, representative data panel is challenging! And the wrong selection mix can lead to problematic conclusions. So, I personally audited dozens of data points, dug into explanations for why certain numbers rose and fell (e.g. I compared the month-to-month fluctuation in Twitter’s referral traffic against the timing of Elon’s experiments with removing, then re-adding headlines to posts with external links on the platform), and, after a few weeks of back-and-forth with the Datos team, feel satisfied that what’s presented here is trustworthy.

As with any panel, a margin of error is to be expected. And, because many of the largest traffic-referring domains also have mobile apps that have unique interfaces and behaviors vs. their mobile websites, some imprecision is inevitable. However, taken in aggregate, and compared against one another, these numbers are almost certainly representative of how web traffic referrals flowed in the United States from January 2023-January 2024.


Next Steps

This, I hope, is only the beginning of a beautiful partnership between SparkToro and Datos to bring greater transparency and awareness to important questions of how people and companies behave on the web. I plan to revisit and update the numbers in this study every 1-2 years, and I’m also excited to update my previous clickstream-based studies on zero-click searches in Google, the behaviors of ChatGPT users, how much Google is favoring their own properties in the search results, and more.

Don’t forget to join us for Office Hours on April 16th if you’ve got questions or deeper interest in this study. We should have even more fascinating insights to dig into there.

Huge thanks to the entire team at Datos, and especially my friend, Eli Goodman; their head of marketing, Belinda Conde; product manager Anna Laskovaya; and the incredibly-generous-with-his-time data analyst, Kirill Tashlykov.

p.s. My next dream project is to see if I can prove how often people consume a brand’s content in one platform (e.g. a zero-click Google search, a social media post, a video on YouTube, etc.), fail to click/visit/search-for the brand(s) involved, but later end up on that brand’s website. I think that data could be invaluable to show just how powerful the zero-click-content strategy can be.