AMP Pages and Session Inflation


Post by Max Crowe, Senior Developer

Google’s AMP framework, which aims to improve the mobile web browsing experience, was originally positioned as an open-source competitor to Facebook Instant Articles and Apple News; it was ideal for news publishers but not for other types of websites. Recently, however, online retailers have begun to adopt AMP, and Google has responded by adding numerous features that make it suitable for a wider range of brands.

AMP adoption carries with it certain surprising consequences regarding analytics tracking. Brands may find unexpectedly high bounce rates when analyzing user engagement on their AMP pages. This is a consequence of a crucial part of the AMP framework: the AMP cache.

Organic AMP Listings Are Different

Google’s mobile SERP (search engine results page) includes markup that identifies which results are AMP pages. The presence of the AMP logo next to a result guarantees a fast-loading page, but this markup isn’t the only thing that makes these listings different: some special behavior kicks in when you click them. In the screenshots below, notice that the landing page contains something that resembles an address bar; this is not the browser’s built-in address bar, but rather a feature of the page itself:

[Image: noreaster-screenshots]

Although it may look as if the browser’s address is a URL on abc7ny.com, the real address bar contains a Google URL. This is because Google is serving this result from the AMP cache. Since one role of the AMP JavaScript runtime is to dynamically reconfigure an HTML document to ensure the fastest possible rendering, the DOM (Document Object Model) of a fully processed AMP page looks rather different from what its original HTML source code would suggest. Google uses the AMP cache to store and serve pages that have undergone this processing in advance, thus speeding up rendering by making it unnecessary for the JavaScript runtime to do it again.

Analysis with Firefox’s developer tools reveals what’s really going on here: in fact, the browser never left the original SERP. It is using an iframe to show the article content superimposed over the search results. The iframe’s URL reveals the address of the AMP cache, which is on the host cdn.ampproject.org:

[Image: noreaster-screenshots-2]

As Google’s Paul Bakaus points out, the AMP framework’s analytics tagging mechanism attributes the view of this “page” to the publisher’s domain, as it should. However, there are two caveats to this:

  • By default, even if visitors have been to the publisher’s origin website before, their visits to AMP pages will be tracked under a separate identity, which makes it impossible to analyze the impact that an AMP page has on the user’s engagement with the rest of the website.
  • Even if the publisher ensures that the visitor’s identity persists across all contexts, Google Analytics will not consider a view of an AMP page and a subsequent view of another page on the origin website to take place within the same visit, causing bounce rate metrics to look worse than they actually are.

Cross-Domain Analytics Tracking is Tricky

It’s common for AMP pages to be displayed on a domain other than the one belonging to the publisher’s website. This presents a difficulty for analytics platforms. They rely on the persistence of cookies to understand when a visit to a page is coming from the same person who visited another page on the same domain yesterday. By design, browsers only send cookies corresponding to the domain of the URL they are requesting; this ensures that a website can’t know whether or not you have ever visited a competitor site.
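The domain scoping described above can be illustrated with a small sketch. It is a deliberately simplified model (real browsers also consider cookie paths, expiry, and security attributes), and the cookie value and the jar structure are made up for illustration:

```python
def domain_matches(request_host, cookie_domain):
    """Simplified domain match: the exact host, or a subdomain of the cookie's domain."""
    request_host = request_host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    return request_host == cookie_domain or request_host.endswith("." + cookie_domain)


def cookies_sent(request_host, jar):
    """Collect the cookies a browser would attach to a request for request_host."""
    sent = {}
    for cookie_domain, cookies in jar.items():
        if domain_matches(request_host, cookie_domain):
            sent.update(cookies)
    return sent


# Hypothetical jar: an analytics cookie set earlier by the publisher's own site.
# The value is invented for this example.
jar = {"abc7ny.com": {"_ga": "GA1.2.1111111111.1500000000"}}

on_origin = cookies_sent("abc7ny.com", jar)             # cookie is attached
on_amp_cache = cookies_sent("cdn.ampproject.org", jar)  # nothing is attached

print(on_origin)
print(on_amp_cache)
```

Under this model, a request to the AMP cache host carries none of the publisher's cookies, so the analytics platform has nothing with which to recognize a returning visitor.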

When your browser displays an AMP document via Google’s cache, it will not send any cookies the origin domain may have set in the past. The reverse is true as well; if a person finds and reads five AMP articles on your website via organic search, and then goes to a non-AMP portion of the site, the website will think that person is coming to your site for the first time. The various constraints imposed by browsers’ cookie policies and by the AMP framework make this a difficult problem to solve, but not an impossible one: Simo Ahava and Dan Wilkerson (of Reaktor and LunaMetrics, respectively) have devised a solution for sharing the Google Analytics client ID between AMP and non-AMP contexts.

Session Inflation and Misattribution of Bounces

A flawless implementation of a scheme to share Google Analytics client IDs between AMP and non-AMP contexts will still not insulate brands from a separate problem: when a visitor arrives at a page served by the AMP cache via Google organic search, then follows a link on that page to the origin website, Google Analytics will record two sessions. The first will be attributed to organic search, while the second will be considered referral traffic from cdn.ampproject.org. It’s also likely that the first session will be considered a bounce.

Why does this happen? Google’s documentation on how Google Analytics defines a session provides the answer. Google Analytics uses a model that Google calls campaign-based expiration to govern whether user activity extends or terminates a session. In this model, a session may have only a single marketing source. When a visitor follows an organic search listing to a website, this initializes a new session, which Google Analytics will attribute to organic search from the appropriate search engine. If that user then returns to the SERP and follows a paid advertisement to the same website, the original session will end and be replaced by a new session attributable to paid search from that engine. The AMP case follows the same logic: the click from the cached page to the origin website arrives as a referral from cdn.ampproject.org, which is a different source from organic search, so the organic session ends and a new referral session begins.
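To make the single-source rule concrete, here is a minimal sketch of campaign-based expiration applied to the AMP scenario. It is a toy model, not Google Analytics' actual implementation: it ignores time-based expiration entirely, and the `Hit` and `Session` types, the page paths, and the source labels are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Hit:
    page: str
    # Traffic source carried by the hit; None means ordinary in-site navigation.
    source: Optional[str] = None


@dataclass
class Session:
    source: str
    pages: List[str] = field(default_factory=list)

    @property
    def is_bounce(self) -> bool:
        # In this sketch, a bounce is a session with a single hit.
        return len(self.pages) == 1


def sessionize(hits):
    """Start a new session whenever a hit arrives with a different source."""
    sessions = []
    for hit in hits:
        current = sessions[-1] if sessions else None
        new_source = hit.source is not None and (
            current is None or hit.source != current.source
        )
        if current is None or new_source:
            current = Session(source=hit.source or "(direct)")
            sessions.append(current)
        current.pages.append(hit.page)
    return sessions


# Hypothetical visit: organic entry on a cached AMP page, then a click through
# to the origin site, then one more page view on the origin site.
hits = [
    Hit("/amp/storm-article", "google / organic"),
    Hit("/weather", "cdn.ampproject.org / referral"),
    Hit("/weather/radar"),
]

sessions = sessionize(hits)
for s in sessions:
    print(s.source, s.pages, "bounce" if s.is_bounce else "not a bounce")
```

One visitor produces two sessions here: an organic session that counts as a bounce, and a referral session from cdn.ampproject.org that contains the rest of the visit.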

A Potential Workaround

Although we are not aware of any way to prevent Google Analytics from double-counting sessions in this scenario, the use of the event tracking feature in AMP’s analytics component could provide a means to make the necessary reporting adjustments. AMP allows Google Analytics to be configured to track an event whenever any link (optionally matching a specific CSS selector) is clicked. AMP publishers should make use of this feature to track clicks on links to the origin site as events.
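As a rough illustration, the following sketch generates a JSON configuration of the kind that an amp-analytics element with type "googleanalytics" accepts in its embedded script tag. The account ID is a placeholder, and the trigger name, the CSS selector `a.origin-link`, and the event category and action are hypothetical choices; the key names reflect our understanding of the amp-analytics format and should be checked against the current AMP documentation before use.

```python
import json


def origin_link_click_config(account_id, selector):
    """Build an amp-analytics config that sends a GA event on matching link clicks."""
    return {
        "vars": {"account": account_id},
        "triggers": {
            # Trigger name is arbitrary; chosen here for readability.
            "originLinkClick": {
                "on": "click",
                "selector": selector,  # which links to track
                "request": "event",    # send an event hit rather than a pageview
                "vars": {
                    "eventCategory": "AMP",                # hypothetical label
                    "eventAction": "origin-link-click",    # hypothetical label
                },
            }
        },
    }


# Placeholder account ID and hypothetical selector for links to the origin site.
config_json = json.dumps(
    origin_link_click_config("UA-XXXXX-Y", "a.origin-link"), indent=2
)
print(config_json)
```

Generating the configuration from a function keeps the selector and event labels in one place if several AMP templates need the same tracking.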

Tracking clicks on these links as events accomplishes two things. First, it prevents the initial session (the one attributed to organic search) from being considered a bounce, because it ensures that there is more than one interaction hit associated with the session. Second, it provides the analyst with the means to know the approximate extent of the session count inflation. Since we know that a click on an internal link following an organic entry on a page served from the AMP cache will create a new session, we can use the count of AMP internal link click events attributed to Google organic search to know roughly how many extra sessions were created as referrals from cdn.ampproject.org.
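The adjustment itself is simple arithmetic. The sketch below uses invented numbers purely for illustration, and the result is an estimate rather than an exact correction:

```python
def adjusted_sessions(reported_sessions, amp_origin_click_events):
    """Estimate the session count after removing sessions created by AMP link clicks.

    reported_sessions: total sessions as reported by Google Analytics.
    amp_origin_click_events: AMP internal link click events attributed to
        Google organic search, used as a proxy for the extra referral sessions.
    """
    if amp_origin_click_events > reported_sessions:
        raise ValueError("click events cannot exceed reported sessions")
    return reported_sessions - amp_origin_click_events


# Made-up figures: 10,000 reported sessions, 1,200 AMP origin-link click events.
estimate = adjusted_sessions(10_000, 1_200)
print(estimate)  # 8800
```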

To find out if your brand is ready for the mobile web, contact Performics today.

