Faster Web Performance Using Virtualization in the Browser

Mehrdad Reshadi

Instart Logic’s NanoVisor Architect explains dual-sided client-cloud architecture

A web page or web app references external resources and determines their absolute or relative position and presentation on the screen. This may be done statically via HTML tags and attributes, or dynamically via JavaScript. For many web applications, the user experience correlates directly with page load performance, which depends on how fast the browser can download the resources, process them, and render them on the screen. This in turn depends on the bandwidth, the latency, and the number of network connections to the origins of these resources.


There have been many optimizations in the past that focused on different elements of web browsing. Content delivery networks (CDNs) optimize the middle mile; VM optimizations run the code faster on the server or the client; front end optimizations (FEO) optimize the static content in the page. But almost all of these optimizations apply predominantly to the static, homogeneous, and predictable world of desktop browsing. Your users now live in a highly dynamic, mobile, and heterogeneous online world. Instead of wired Ethernet, we have 3G, 4G, and LTE, along with public and private Wi-Fi. Standard desktop displays are now accompanied by high-resolution Retina and 4K HD displays, all the way down to small devices with their own high-resolution displays. The browser landscape has moved away from a few slow-changing desktop browsers to many rapidly changing desktop and mobile browsers. More importantly, fairly static web pages have been replaced by highly dynamic content adjusted to user context, location, device, and profile. (I realize many of you reading this are already quite aware of this, but it helps set the stage for the points I make below.)

The current dynamism of the web has two important implications:

  • Standalone optimizations applied only to static content will not pay off as much.
  • It is too costly to both develop the content and manually optimize it for different targets and scenarios.

When it comes to optimizing content for faster delivery and rendering, three major choices stand out:

  • Automated front end optimizations run on the origin server and manipulate the structure of the page: they inline, externalize, combine, or transform resources such as CSS, JavaScript, or images. These optimizations are limited to what is statically specified in the page (for example, they apply only to tags in HTML) and cannot deal with cases where JavaScript dynamically requests and manipulates resources on the client side.
  • CDN optimizations run between the origin server and client browser and mainly reduce the latency of the requested resources by caching them as close as possible to the client device. These optimizations are applied at the granularity of individual resources, independently of how those resources might be used in a particular web page or application. They are also fairly reactive and only respond to what the client is requesting. In addition, CDNs were developed for the wired era and don’t address the new challenges of wireless networks.
  • Browser-side optimizations such as caching and prefetching try to identify individual resources in the page and reduce the latency of requesting them, either by reusing a previously downloaded copy or by making the request sooner. Once again, these optimizations are fairly static. Prefetching can only work on statically referenced resources in the page, and caching does not consider the criticality of the resource.

Time and content

We considered how the sub-elements of resources and content affect the overall performance of the page. The novelty of our approach is best explained by looking at how video streaming evolved.

Video streaming is a proven technique in the world of media delivery that has now largely replaced downloading full videos. In generic terms, media streaming overlaps the delivery and the consumption of the media. We can also think of streaming as ordering the delivery of the contents according to their time of consumption. This is easy to understand in the context of video. For example, when watching a video at 30 frames per second, every pixel in every frame can be directly associated with the time it should be shown on the display. This strong ordering is explicitly embedded in the content itself.

Unfortunately, web pages and web apps do not have an explicit notion of time in their content. But it is possible to think of the process of loading a page on the screen as a video that starts with a blank screen and gradually (frame by frame) morphs into the final presentation of the content. From this point of view, different bytes coming down the network link will be needed at different times, and ideally we would download them in the same order that they are needed for construction of the page (i.e. overlapping delivery and consumption).

To apply the concept of streaming to a web page, we need to introduce the notion of a timeline in the page and control the overall process of page load according to this timeline. When loading a web page, the browser, CDN, and resource origin servers “react” to the requests made by the page (or application). For example, if JavaScript code requests several images, the browser must download the code, execute it, request the images, and then render them. However, it could use the network better by first downloading only the part of the JS code needed for execution, along with the first parts of all images needed for their initial positioning and rendering; then, while the JS is executing, it could download the rest of the JS code and the rest of the image data. Unless this is explicitly hardcoded in the application, individual elements of the system such as the browser and CDN cannot automatically and independently identify this timeline and optimize the delivery and processing.
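To make the timeline idea concrete, here is a minimal TypeScript sketch of ordering delivery by time of consumption. It is an illustration only, not how any browser or CDN actually schedules bytes: the Chunk type, the neededAtMs estimates, and the file names are all assumptions invented for this example.

```typescript
// Hypothetical description of one byte range of a resource.
interface Chunk {
  resource: string;   // e.g. "app.js" or "hero.jpg"
  part: string;       // label for the byte range
  neededAtMs: number; // assumed estimate of when the page first needs these bytes
}

// Order chunks by the time they are needed. Array.prototype.sort is stable
// in modern engines, so chunks needed at the same time keep their input order.
function streamOrder(chunks: Chunk[]): Chunk[] {
  return chunks.slice().sort((a, b) => a.neededAtMs - b.neededAtMs);
}

// Listed in "reactive" order: each resource is fetched whole, one after another.
const chunks: Chunk[] = [
  { resource: "app.js", part: "bootstrap", neededAtMs: 0 },
  { resource: "app.js", part: "rest", neededAtMs: 900 },
  { resource: "hero.jpg", part: "header", neededAtMs: 100 },
  { resource: "hero.jpg", part: "rest", neededAtMs: 1200 },
  { resource: "thumb.jpg", part: "header", neededAtMs: 100 },
  { resource: "thumb.jpg", part: "rest", neededAtMs: 1500 },
];

const order = streamOrder(chunks).map((c) => c.resource + ":" + c.part);
console.log(order);
// app.js:bootstrap, hero.jpg:header, thumb.jpg:header,
// app.js:rest, hero.jpg:rest, thumb.jpg:rest
```

In the streamed order, the bootstrap code and both image headers arrive before the bulk of any single resource, which mirrors the example in the paragraph above.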

Nanovisor.js and AppSequencer

This is where our Web Application Streaming approach comes into the picture. It uses our client-side Nanovisor.js along with the cloud-based AppSequencer in a client/server architecture that turns the independent and reactive behaviors of the browser and legacy CDNs into a proactive process that carefully orchestrates the interactions of different entities based on a more optimal timeline. This timeline is extracted and refined as more and more users visit the same page.
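One way to picture a timeline that is "refined as more users visit" is a running average of when each resource is first needed. The TypeScript sketch below is purely illustrative and says nothing about how AppSequencer actually works: the TimelineModel class, the Visit shape, and the sample timings are hypothetical.

```typescript
// Hypothetical record of one page visit:
// resource name -> milliseconds after navigation start when it was first needed.
type Visit = Record<string, number>;

class TimelineModel {
  private stats: Record<string, { count: number; meanMs: number }> = {};

  // Fold one visit into the running mean for each resource it touched.
  observe(visit: Visit): void {
    for (const name of Object.keys(visit)) {
      const s = this.stats[name] ?? { count: 0, meanMs: 0 };
      s.count += 1;
      s.meanMs += (visit[name] - s.meanMs) / s.count; // incremental mean
      this.stats[name] = s;
    }
  }

  // Current best guess at delivery order: earliest average need first.
  order(): string[] {
    return Object.keys(this.stats).sort(
      (a, b) => this.stats[a].meanMs - this.stats[b].meanMs,
    );
  }
}

const model = new TimelineModel();
model.observe({ "app.js": 0, "hero.jpg": 200, "ads.js": 150 });
const afterOne = model.order(); // app.js, ads.js, hero.jpg

model.observe({ "app.js": 0, "hero.jpg": 100, "ads.js": 400 });
const afterTwo = model.order(); // app.js, hero.jpg, ads.js
console.log(afterOne, afterTwo);
```

After the second visit the averages are 150 ms for hero.jpg and 275 ms for ads.js, so the image moves ahead of the ad script. A real system would need far richer signals than a mean, but the sketch shows how the order can shift as observations accumulate.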

Static optimization approaches usually rely on patterns, such as the typical way web apps and content are created and organized. However, the number of possible patterns in a dynamic application quickly becomes monumental and cannot be relied on for efficient optimization. Instead of relying on patterns, our Nanovisor.js client focuses on the behavior of applications. A behavior is in fact the result of executing a pattern. The key insight here is that many patterns exhibit the same behavior when executed in the browser. For example, HTML offers many ways to create and configure an image element (via the <img> tag and various JavaScript APIs), but all of these mechanisms in the end create an image object in the Document Object Model (DOM) of the page.

To capture and control the behavior of the application, Nanovisor.js creates a thin virtualization layer between the web app and the browser. This layer allows us to intercept all browser API calls and potentially (a) change their function, (b) postpone them, or (c) use the result of some speculative action done in advance. This is closely analogous to what hardware virtualization products such as VMware, Hyper-V, and Xen do. While those approaches intercept OS or system calls, we intercept browser API calls.
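The three interception options can be sketched with a standard JavaScript Proxy. This is not the Nanovisor.js implementation, which is not described here. It is a toy model under stated assumptions: FakeBrowser stands in for real browser APIs, and virtualize, Policy, and the file names are hypothetical.

```typescript
// Stand-in for a browser API surface; a real layer would wrap DOM and network APIs.
interface FakeBrowser {
  loadImage(url: string): string | undefined;
}

type Deferred = () => unknown;

// Hypothetical policy hooks, one per option in the text.
interface Policy {
  rewriteArgs?: (method: string, args: unknown[]) => unknown[];   // (a) change the call
  shouldPostpone?: (method: string, args: unknown[]) => boolean;  // (b) postpone it
  prefetched?: Record<string, unknown>;                           // (c) speculative results
}

function virtualize<T extends object>(target: T, policy: Policy, deferred: Deferred[]): T {
  return new Proxy(target, {
    get(obj, prop, receiver) {
      const value: unknown = Reflect.get(obj, prop, receiver);
      if (typeof value !== "function") return value; // plain properties pass through
      return (...args: unknown[]) => {
        const method = String(prop);
        const finalArgs = policy.rewriteArgs ? policy.rewriteArgs(method, args) : args;
        const key = method + ":" + JSON.stringify(finalArgs);
        const cache = policy.prefetched;
        if (cache && Object.prototype.hasOwnProperty.call(cache, key)) {
          return cache[key]; // (c) reuse work done in advance
        }
        if (policy.shouldPostpone && policy.shouldPostpone(method, finalArgs)) {
          deferred.push(() => value.apply(obj, finalArgs)); // (b) run later
          return undefined;
        }
        return value.apply(obj, finalArgs); // forward the (possibly rewritten) call
      };
    },
  });
}

const log: string[] = []; // records which URLs really reach the "browser"
const browser: FakeBrowser = {
  loadImage(url) {
    log.push(url);
    return "pixels(" + url + ")";
  },
};

const deferred: Deferred[] = [];
const page = virtualize(browser, {
  rewriteArgs: (_m, args) => args.map((a) => (a === "hero.jpg" ? "hero.lowres.jpg" : a)),
  shouldPostpone: (_m, args) => args[0] === "footer.jpg",
  prefetched: { 'loadImage:["logo.png"]': "pixels(logo.png, prefetched)" },
}, deferred);

const a = page.loadImage("hero.jpg");   // rewritten to the low-res variant
const b = page.loadImage("footer.jpg"); // postponed, returns undefined for now
const c = page.loadImage("logo.png");   // served from the speculative result
const callsBefore = log.slice();        // only "hero.lowres.jpg" so far
deferred.forEach((run) => run());       // footer.jpg is finally requested
console.log(a, b, c, callsBefore, log);
```

The page code calls loadImage exactly as it would against the unwrapped object, which is the point of the layer: the application does not need to change for the calls to be rewritten, delayed, or answered early.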

The Nanovisor.js virtual layer also enables us to consider a more holistic and global view rather than focusing only on point optimizations of certain content. For example, instead of optimizing and streaming all images in a certain page, we can learn from the behavior of that page and decide on which images in the page we should apply our streaming approach.

One of our methods is to stream the images of the page so that the page loads significantly faster and becomes interactive much sooner. In my next blog post I’ll drill deeper into the image streaming technology we have implemented on top of our Nanovisor.js and AppSequencer client-server architecture. Stay tuned.