However, another prominent trend is that for a significant percentage of these sites, the underlying HTML itself is not cacheable. According to httparchive, roughly 40-50% of the analyzed sites send explicit “do not cache” directives in their HTTP response headers. Our internal analysis of the Internet Retailer Top 100 websites (a collection of the most popular e-commerce websites) suggests that almost half of their web pages are not cacheable.
A page is typically marked as non-cacheable when it involves a degree of personalization, a trend that is increasingly common across a wide range of web sites. Since personalization requires the execution of some server-side business logic, such pages can incur significant delays. Dynamic pages are usually among the most interactive, media-rich (and thus latency-prone) pages on the web. Yet their non-cacheable nature conflicts with the traditional approach to speeding up the delivery of web objects, which is to cache them and serve them from local browser storage or the edge.
So, we asked ourselves – is there a systematic way to bring better performance to modern, hard-to-cache, dynamic HTML web pages? Instart Logic’s answer to this question is our new SmartSequence technology with HTML Streaming.
HTML Streaming: What it is and How it Works
HTML Streaming is a novel, principled and transparent approach to the delivery of dynamic HTML pages from our SDAD service. The basic insight is that an HTML page should not be treated as a monolithic object, but as being made up of two types of components:
- elements that change rarely across requests
- elements for which change is frequent (for example, changes across users due to personalization)
Given an HTML page, our goal is to identify and store the rarely-changing HTML elements on an Instart Logic edge server, so they can be served quickly to an end user's browser when a request arrives. We term this cacheable subset a stub, and it includes the client-side Nanovisor. The non-cacheable elements are freshly fetched from the origin, and then “patched-in” alongside the elements the browser has already received.
Take a look at the timing diagram below, and let's assume a request is triggered from the browser at time t1. It is received by the HTML Streaming service on an Instart Logic server, and if the stub for the HTML page is present, then the client immediately gets a response, which arrives by time t3. If instead the stub were not present, the request would have gone all the way to the origin and waited through the server processing delay, and the response would have arrived at the client at time t7. The difference t7 - t3 is the head start a browser gets because of HTML Streaming.
Now, after sending the stub, the Instart Logic server makes a request to the origin. When the response arrives back at the server (at time t5), the HTML Streaming service within the server compares the HTML in the response to the HTML sent out earlier in the stub. Any differences are patched by sending instructions to the client Nanovisor. If the resulting patch is found to be unsafe, then the server and the client work in conjunction to reload the page automatically before anything is shown to the end user.
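The compare-and-patch step can be illustrated with a minimal Python sketch. This is not the actual HTML Streaming implementation: it assumes the page head is represented as an ordered list of element strings, and it assumes a patch is safe only when every stub element still appears, in the same order, in the fresh origin response. The names PatchPlan and plan_patch are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchPlan:
    safe: bool       # False means the page should be reloaded instead
    inserts: tuple   # (position in fresh response, element) pairs to patch in


def plan_patch(stub, fresh):
    """Compare the stub already sent to the browser with the fresh origin HTML.

    Assumption for this sketch: the stub must be an in-order subsequence of
    the fresh response. Anything else is treated as unsafe.
    """
    inserts = []
    matched = 0  # number of stub elements found so far, in order
    for position, element in enumerate(fresh):
        if matched < len(stub) and element == stub[matched]:
            matched += 1  # already delivered with the stub
        else:
            inserts.append((position, element))  # must be patched in
    safe = matched == len(stub)
    return PatchPlan(safe=safe, inserts=tuple(inserts) if safe else ())


stub = ["<title>Shop</title>", "<script src='app.js'></script>"]
fresh = [
    "<title>Shop</title>",
    "<meta name='user' content='alice'>",  # personalized, not in the stub
    "<script src='app.js'></script>",
]
plan = plan_patch(stub, fresh)
print(plan.safe, plan.inserts)
```

In this sketch, a stub element that has disappeared from the origin response makes the plan unsafe, which corresponds to the automatic reload described above.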
The head start a browser receives when it processes the stub can result in substantial performance gains (up to 40% on key web page performance metrics such as Start Render and DOM Content Loaded).
Challenges for HTML Streaming
Our first iteration of HTML Streaming could only be applied to pages where the <HEAD> portion of the HTML was the same across all users. Now, with our new enhancements, the service can handle even the most dynamic HTML.
However, creating a subset of a dynamic HTML page that can be speculatively pre-executed and then patched to produce the full page, all transparently to the origin, exposes several challenging technical problems. We will now discuss some of these challenges.
First, to safeguard end-user privacy, any user-specific information should be identified and removed from the stub. Second, the patching should ensure the execution order of both cached and patched-in scripts is maintained as in the original page, and that the page loads correctly. Third, the stub should evolve and keep up with changes in the origin content (while retaining the above two properties). Finally, any unsafe or incorrect behavior caused by the pre-execution of the cached stub in the browser should be detectable, and corrective action initiated if it happens.
A key design goal of HTML Streaming was thus to learn a "safe" stub, that is, a subset of the HTML which satisfies the criteria outlined above, and allows us to detect unsafe pre-execution of its content, if any. Now let’s take a deeper look into how we compute a “safe” stub.
Learning a Safe Stub
The starting point for building a safe, cacheable stub is to periodically examine requests over a learning period, and identify elements in the <HEAD> which are common across requests. This, however, does not ensure safe execution. For example, consider a <HEAD> with an element of the form
<meta id="csrf-token" value="dkked32">
Assume that the value attribute of this <META> element changes across requests, and hence this element is not included in the cached stub (and will be subsequently patched-in). Now suppose there is a <SCRIPT> element, present in the stub, further down the <HEAD> which accesses this <META> element. This could lead to problems in the page load, since the accessed <META> element was not included in the stub. To deal with this issue, we virtualize the changing element. This entails removing all sensitive (changing) attributes and then using the Nanovisor to set up a watch that intercepts all access functions for this element. The watch allows us to determine whether a subsequent patching of this changing element is safe.
In addition, there are other conditions (e.g. preserving the execution order of scripts) that must also be satisfied to ensure correctness. We will not go into those in this blog post.
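The learning and virtualization steps can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Nanovisor's actual mechanism (which runs in the browser): each sampled <HEAD> is modeled as a dictionary from element id to an attribute dictionary, and the names VirtualElement, learn_stub, get_attribute and patch are hypothetical. The second sampled token value is made up for the example.

```python
class VirtualElement:
    """A stub element whose changing attributes were removed and are watched."""

    def __init__(self, stable_attrs, virtualized):
        self.attrs = dict(stable_attrs)      # safe to cache in the stub
        self.virtualized = set(virtualized)  # removed because they change
        self.early_reads = []                # watched reads made before patching

    def get_attribute(self, name):
        # The watch: record any read of a removed attribute before the patch.
        if name in self.virtualized and name not in self.attrs:
            self.early_reads.append(name)
        return self.attrs.get(name)

    def patch(self, fresh_attrs):
        """Apply fresh origin values; return True only if patching is safe."""
        safe = not self.early_reads
        if safe:
            self.attrs.update(fresh_attrs)
        return safe


def learn_stub(samples):
    """Keep elements common to all sampled heads; virtualize changing attributes."""
    stub = {}
    for element_id in samples[0]:
        if not all(element_id in head for head in samples):
            continue  # not common across requests, so left out of the stub
        versions = [head[element_id] for head in samples]
        names = set().union(*versions)  # every attribute name seen
        first = versions[0]
        stable = {
            name: value
            for name, value in first.items()
            if all(version.get(name) == value for version in versions)
        }
        stub[element_id] = VirtualElement(stable, names - set(stable))
    return stub


samples = [
    {"csrf-token": {"value": "dkked32"}, "charset": {"charset": "utf-8"}},
    {"csrf-token": {"value": "a81hq0z"}, "charset": {"charset": "utf-8"}},
]
stub = learn_stub(samples)
token = stub["csrf-token"]
print(token.attrs, token.virtualized)
```

If a script in the stub calls get_attribute("value") before the fresh value arrives, the read is recorded and patch returns False, which in this sketch stands in for the unsafe case that triggers a reload.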
Auto-tuned Learning with SmartSequence
HTML Streaming has several moving parts involved in the creation of a safe/performant stub, which have to adapt to a wide range of web sites and updates in the site content. The SmartSequence technology powers and monitors the HTML Streaming feature to ensure this adaptation is transparent to the origin and end users.
As dynamic HTML flows through the service, SmartSequence allows the system to first learn which portions of the HTML are unique and truly dynamic, and to monitor for any requests that trigger a reload, along with the reason for each reload. Based on this information, the system automatically adjusts the periods of up-front learning and even adjusts which pages the feature is active for, on a per-URL (and even per-browser) basis, all by learning from live production traffic. This process is continuous and allows the system to automatically evolve as the website or user behavior changes over time.
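One way to picture this feedback loop is the Python sketch below. It is a simplified illustration, not a description of how SmartSequence is built: the StreamingTuner class, the 5% reload-rate threshold and the 20-request learning window are all assumptions chosen for the example.

```python
from collections import defaultdict


class StreamingTuner:
    """Tracks reloads per (URL, browser) and decides where streaming stays on."""

    def __init__(self, max_reload_rate=0.05, min_samples=20):
        self.max_reload_rate = max_reload_rate  # assumed threshold
        self.min_samples = min_samples          # assumed learning window
        self.stats = defaultdict(lambda: {"requests": 0, "reloads": 0})

    def record(self, url, browser, reloaded):
        """Record one request seen in live traffic and whether it reloaded."""
        entry = self.stats[(url, browser)]
        entry["requests"] += 1
        if reloaded:
            entry["reloads"] += 1

    def is_active(self, url, browser):
        """Streaming is active only after learning, and only if reloads are rare."""
        entry = self.stats.get((url, browser))
        if entry is None or entry["requests"] < self.min_samples:
            return False  # still in the up-front learning period
        return entry["reloads"] / entry["requests"] <= self.max_reload_rate


tuner = StreamingTuner()
for _ in range(20):
    tuner.record("/home", "chrome", reloaded=False)
for i in range(20):
    tuner.record("/cart", "safari", reloaded=(i < 5))  # 25% of requests reload

print(tuner.is_active("/home", "chrome"), tuner.is_active("/cart", "safari"))
```

Keying the statistics on both URL and browser mirrors the per-URL and per-browser adjustment described above. A production system would presumably also age out old observations so that the decision can change as the site evolves.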
In summary, modern web sites are moving towards personalization for better user engagement. However, this often comes at the cost of performance due to non-cacheable dynamic HTML. At the same time, users are growing increasingly impatient and want to view content as soon as possible. Performance is thus imperative for these web sites.
Instart Logic’s HTML Streaming feature powered by SmartSequence technology is a new mechanism to accelerate dynamic web page performance and improve user experience. Evaluations of HTML Streaming applied to Internet Retailer Top 100 sites with dynamic HTML content demonstrate significant performance gains for a wide range of sites. In fact, we have observed gains greater than 20-30% on a range of metrics such as Start Render, Load Time and Speed Index, for 20-40% of the sites considered (depending on the metric). These gains hold across first and repeat views, and end user connection types (wired cable or mobile 3G).
HTML Streaming is being deployed today by several of our customers who are enjoying these great performance benefits.