Don’t Buy the JavaScript You Don’t Use

Frank McCabe

JavaScript delivery optimization with Instart Logic.

A modern website rivals in size and complexity the word-processor software of the 1990s. Fitting a typical web page onto yesterday's floppy disks (remember those?) could easily take 10-20 of them. Of this veritable mountain of data, it is not uncommon for more than a megabyte to be program code, in the form of JavaScript. Yet our modern infrastructure is expected to load and present all of this to the user in seconds, not minutes.

There are five main kinds of resources in a typical web application: HTML, CSS, images in various formats, font files, and JavaScript. Instart Logic's platform already has many features that optimize the delivery of these resources, particularly images and HTML, which are major elements of most web pages.

Instart Logic today is introducing a new service aimed at optimizing the delivery of JavaScript.

[Figure: JavaScript streaming diagram]

Delivering JavaScript

According to the HTTP Archive, the weight of JavaScript on a typical web page has been growing at approximately 10% per year in recent years. Today, the median amount of JavaScript is nearly 300 KB, of which more than half is third-party code: code injected to perform a variety of functions, from elaborate displays to tracking users' behavior on the page.
Given the nature of program code, it is almost inevitable that not all the JavaScript that is notionally part of a web site will actually be used. The reasons range from general-purpose libraries of which only a part is needed for the particular page, to JavaScript that supports some but not all functions on the page.

This over-provisioning is the source of our new optimization: by measuring which JavaScript code is actually used, we can optimize delivery by delivering only the most commonly used code. Doing this typically reduces the size of a JavaScript file by 30-40%.

JavaScript code is not the same as image data: if we don't deliver some JavaScript code we risk breaking the web page; so a large part of the technology for JavaScript optimization focuses on transparently backfilling code that is actually needed by a user.

JavaScript on a Diet

JavaScript is a highly dynamic language. This makes many forms of traditional program analysis extremely difficult. For example, it is not possible to know what variables a program has, or even what the actual scope of a variable is, except by observing what happens at run-time.

One of the consequences of this is that it is effectively impossible to perform a static analysis of a JavaScript source to determine those portions of the code that are not needed.
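A small illustration of the problem (our own example, not Instart Logic code): below, which handler runs is decided by a string computed at run-time, so no static tool can safely prove either handler unused.

```javascript
var handlers = {
  click: function () { return "clicked"; },
  hover: function () { return "hovered"; }
};

function dispatch(eventName) {
  // Computed property access: which function is "used" depends entirely on
  // the strings that arrive while the program runs.
  return handlers[eventName]();
}

var result = dispatch("cl" + "ick"); // "hover" may never run, but a static
                                     // analyzer cannot prove that in general
```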

Instead, we take a different approach – we measure what happens as JavaScript is executed and use machine learning techniques to tailor and transform the JavaScript code.

Instrumenting JavaScript

The first step in the process is to measure which functions in a JavaScript source file are used. This involves giving every function a number:

[Figure: Instrumenting JavaScript to measure which functions in a source file are used]
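As a rough sketch of what such instrumentation might look like (the helper names `__usage` and `__record` are ours for illustration, not Instart Logic's actual implementation), each function body is prefixed with a probe keyed by the function's assigned number:

```javascript
// Records, per function number, whether that function has ever run.
var __usage = [];

function __record(id) {
  __usage[id] = true;
}

// Original:     function add(a, b) { return a + b; }
// Instrumented: the body gains a one-line usage probe.
function add(a, b) {
  __record(0); // function #0
  return a + b;
}

function formatReport(items) {
  __record(1); // function #1
  return items.join(", ");
}

var sum = add(2, 3);
// After this run, __usage[0] is true while __usage[1] is still unset:
// function #1 never ran, so it is a candidate for withholding.
```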

Instrumentation takes the form of recording whether or not a given function has ever been used. We collect this data from a small percentage of the users of the website. We then use statistical analysis on this to partition the JavaScript code into related fragments:

[Figure: Instrumentation data is collected from a small percentage of the website's users]

Once we have a partitioned file, we deliver only the first part when the browser asks for the JavaScript file. This is the foundation of the optimization: the first partition can be anywhere from 5% to 90% of the original, though typically we see a reduction of about 30-40%.
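A toy version of the partitioning step might look like the following, splitting function numbers into a "hot" first partition (delivered immediately) and a "cold" remainder (backfilled on demand). The 5% threshold is an arbitrary example, not a documented value:

```javascript
// usageRates maps each function number to the fraction of sampled sessions
// in which that function was observed to run.
function partition(usageRates, threshold) {
  var hot = [], cold = [];
  Object.keys(usageRates).forEach(function (id) {
    (usageRates[id] >= threshold ? hot : cold).push(id);
  });
  return { hot: hot, cold: cold };
}

var rates = { 0: 0.98, 1: 0.62, 2: 0.01, 3: 0.0 };
var parts = partition(rates, 0.05);
// Functions 0 and 1 go into the first partition; 2 and 3 are withheld.
```

A real system would weigh more than raw frequency (for example, which functions tend to be used together), but the shape of the decision is the same.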

Filling in Gaps

Of course, just because we don't deliver a given function to the browser does not mean that it is never needed. A key part of the overall system is ensuring transparent backfilling occurs when the partitioning algorithm mis-categorizes a function.

When we decide that a given function is not immediately needed, we don't eliminate it from the code entirely. Instead we replace it with a stub. When called for the first time, the stub retrieves the missing function definition – by accessing the Instart Logic service – and replaces itself with the real definition. This is one of the ways that JavaScript's dynamic nature helps us: without eval, this optimization would not be possible.

The complete system for optimizing JavaScript is fairly complex. This complexity arises from the transformation itself (JavaScript has many special corner cases), the fact that we have to be able to collect data from live browsers, and the handling of the 'function miss' scenarios in a robust and scalable way.

Does it Work?

There are many ways of measuring the load performance of a website – the time to first byte emphasizes the importance of network latency, the time to the document loaded event emphasizes the moment when users can start interacting with the page, and the speed index emphasizes visual progress.

Which will be most impacted for your website? That depends on exactly how you incorporate JavaScript into your website. The important factors are the overall amount of JavaScript, the point in the web page where it is loaded, how much third-party JavaScript there is, and whether the code is loaded asynchronously or not.

We have seen quite significant reductions in the amount of JavaScript delivered to the browser as a result of this optimization. However, due to the complex ways in which JavaScript can interact with the other resources in the website it can be very hard to predict how the overall numbers will be affected.

The bottom line is that we are able to deliver a significant reduction in load times for quite a variety of different websites.