What Do Compilation Strategies Have to Do with Web Content Delivery?
Instart Logic’s unique Software-Defined Application Delivery service can optimize web apps and their content during delivery via our client-cloud architecture. Controlling the end-to-end application delivery path in software gives us the ability to use a wide range of new techniques and algorithms to optimize performance.
In general, the optimization process involves:
- Reading content from a file or the network and de-serializing it into an intermediate representation (IR) in memory;
- Applying optimizations and transformations that change the IR; and
- Serializing the IR back to the desired format and writing it to disk or the network.
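As a toy illustration of these three steps (using JSON as the on-disk format, which is my own choice for the sketch, not a format the Instart Logic service is described as using):

```python
import json

def optimize(payload: bytes) -> bytes:
    # Step 1 (frontend): de-serialize the wire format into an in-memory IR
    ir = json.loads(payload)
    # Step 2 (optimizer): transform the IR -- here, drop null-valued keys
    ir = {key: value for key, value in ir.items() if value is not None}
    # Step 3 (backend): serialize the IR back out, minified for transfer
    return json.dumps(ir, separators=(",", ":")).encode()
```

For example, `optimize(b'{"a": 1, "b": null}')` returns the smaller payload `b'{"a":1}'`.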
An example of this process is reading a JPEG image file into an image representation in memory, transcoding and compressing that image, and then writing it back to JPEG or a different file format (e.g. WebP or JPEG XR).
Further examples are the transformations we offer, which we call True Fidelity Image Streaming, Dynamic HTML Streaming, and Flash Application Streaming.
This process of de-serializing content to an IR, transforming the IR, and serializing it back is exactly what compilers do. You might be more familiar with compilers in the context of programming languages, but many optimization systems follow the same process and can themselves be considered compilers. You can find such compilers at the heart of many applications: a spreadsheet equation engine, a word processor, a database query optimizer, a network router, a web browser, a web server, even the autocorrect in a simple chat application.
In this article I’ll talk about the pros and cons of different compilation strategies and where Instart Logic’s approach fits into the big picture. Since compiling programming languages is the best-known, most involved, and most complex kind of compilation, I’ll explain using that context.
Compilers in programming languages
Today, almost all programs are written in some high-level programming language that is then translated to machine code for execution, almost always by a compiler. The main job of a compiler is translating one format into another. More precisely, compilers manipulate graphs, trees, lists, and similar data structures to create new ones. These data structures may have different file representations on disk; serialization and de-serialization convert between the memory representation and the file representation.
For example, the on-disk representation of an Abstract Syntax Tree (AST) is the high-level source code itself, while a Control Data Flow Graph (CDFG) can be represented on disk as machine assembly or binary code. The compiler converts the program into an AST, transforms the AST into a CDFG, and then generates the executable binary.
A typical compiler has three main phases:
- The Frontend (or Parser), which reads the file representation and converts (de-serializes) it to a data structure (e.g. C++ to AST) in memory;
- The Optimizer, which transforms the data structure;
- The Backend (or Code Generator), which converts (serializes) the data structure to a file or binary representation.
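Python's own `ast` module (available in the standard library; `ast.unparse` needs Python 3.9+) can sketch all three phases in a few lines. The constant-folding pass below is a minimal stand-in for a real Optimizer:

```python
import ast

class FoldAdd(ast.NodeTransformer):
    """A toy Optimizer pass: fold `<constant> + <constant>` into one constant."""
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold children first, bottom-up
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)):
            return ast.copy_location(
                ast.Constant(node.left.value + node.right.value), node)
        return node

tree = ast.parse("x = (1 + 2) + 3")   # Frontend: source text -> AST
tree = FoldAdd().visit(tree)          # Optimizer: transform the AST
ast.fix_missing_locations(tree)
optimized = ast.unparse(tree)         # Backend: AST -> source text again
```

After the pass, `optimized` is simply `"x = 6"`: the Frontend, Optimizer, and Backend each did their part.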
In order to work effectively, the Optimizer and the Backend run many complex algorithms, and both benefit significantly from knowledge of:
- the way users use the application
- the capabilities of the devices that the program will run on
For example, if the Optimizer knows which code paths are more likely to be executed, it can lay them out better in memory for improved cache performance. Likewise, if the Backend knows the type of the target device, it may be able to use machine-specific instructions that speed up execution.
A program or application, on the other hand, typically has three main phases of its own:
- Development time – when programmers create, modify, or extend the application
- Delivery time – when the application is being delivered to users
- Run time – when the application actually runs on the user’s device
Different compilation strategies can be categorized based on which compilation phase is done at which application phase. This in turn determines the pros and cons of each approach.
Static compilation

In standard static compilers, all compiler phases happen at development time. The result of the compilation is an executable machine binary that is delivered to and executed on a user’s machine.
| Pros | Cons |
|------|------|
| The Optimizer is not under time constraints and can run very complex algorithms to analyze and optimize the code. | The Optimizer does not know the application's runtime behavior and must compile conservatively for all possible cases. |
| | The Backend does not know the actual target device and must rely on the least-common-denominator set of instructions and characteristics. |
Examples of static compilation include the GCC, Visual Studio, and LLVM compilers for languages such as C and C++.
Just-in-time (JIT) compilation on Virtual Machines (VMs)
Some languages parse the source syntax into a low-level format (usually known as bytecode) and then use a VM at runtime to execute this binary. In such cases, parsing happens at development time, the bytecode is delivered, and the Optimizer and Backend phases run inside the VM to convert the bytecode into executable code on the user’s machine.
| Pros | Cons |
|------|------|
| The Optimizer can observe how the user actually uses the application and make better optimization decisions. | The Optimizer runs right before execution and is therefore under time pressure to finish quickly. Often only simple, fast optimization algorithms can run, because the runtime of an algorithm must be balanced against the quality of its results. |
| The Backend knows the actual target device it is running on and can use all of its special instructions. | |
Examples of JITing and virtual machines include the JVM for Java, Dalvik on Android, and the .NET framework for C# and VB.NET.
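CPython itself follows this split: the built-in `compile()` plays the Frontend, producing bytecode, and the interpreter's VM executes that bytecode later. A minimal sketch:

```python
import dis

# "Development time": parse the source once into bytecode (the Frontend)
code = compile("a * b + 1", "<expr>", "eval")

# The bytecode is what would be shipped; dis can list its instructions
ops = [ins.opname for ins in dis.get_instructions(code)]

# "Run time": the VM executes the delivered bytecode with live values
result = eval(code, {"a": 6, "b": 7})
```

Here `result` is 43, and `ops` contains the VM's binary-arithmetic opcodes (their exact names vary across Python versions).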
Interpretation and JITing of scripting languages
While typed dynamic languages usually use the VM-plus-bytecode approach, untyped dynamic scripting languages tend to ship the source and run it directly on the user's machine. In such cases all phases of the compiler (Frontend, Optimizer, and Backend) run on the target device when the user runs the application.
| Pros | Cons |
|------|------|
| The Optimizer can observe how the user actually uses the application and make better optimization decisions. | The Frontend, Optimizer, and Backend all run right before execution and are under extreme time constraints. In some cases these approaches rely on interpretation instead of compilation and lose performance relative to typed languages, although many scripting languages increasingly mix interpretation and JITing to achieve better results. |
| The Backend knows the actual target device it is running on and can use all of its special instructions. | |
Ahead of Time (AOT) compilation
Although the dynamic compilation approaches can potentially offer better optimizations thanks to in-depth knowledge of application and device behavior, in practice they often cannot benefit from that knowledge, because little time is available at runtime to apply the related optimizations.
To address this issue, AOT approaches move optimization and code generation ahead of runtime and apply them at installation time. This approach is usually only applicable to typed languages that deliver fully-typed bytecode, rather than source, as a complete package.
| Pros | Cons |
|------|------|
| The Optimizer and Backend do not run at runtime and can afford more complex, time-consuming algorithms. | The Optimizer cannot use application behavior to make better optimization decisions. |
| The Backend knows the specific device and can use special features when available. | Some language features (such as reflection) are usually disabled in AOT mode. |
Examples of this approach include NGen for .NET framework on Windows, ART on Android, and Xamarin compiler for iOS.
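Python's byte-compilation is only a loose analogy for AOT (it stops at bytecode rather than native code), but it shows the "pay the compile cost once, ahead of run time" idea:

```python
import os
import py_compile
import tempfile

# "Installation time": compile the module once so runtime skips the parse
src = os.path.join(tempfile.mkdtemp(), "mod.py")
with open(src, "w") as f:
    f.write("ANSWER = 6 * 7\n")

cached = py_compile.compile(src)  # writes the .pyc file and returns its path
```

Every later import of the module reuses the cached `.pyc` instead of re-parsing the source.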
Profile-driven static compilation
Although dynamic approaches can rely on user behavior to better optimize the application, they usually face two problems:
- There is not enough time to run complex optimization algorithms at runtime; even in AOT's case, the user's patience at installation time is limited.
- They have access to only one user's usage of the application, and that knowledge is not always applicable across multiple runs of the same application by the same user.
There are cases where the Optimizer prefers statistically significant insights into application usage for better, lower-overhead optimization. For example, to better optimize the two branches of a conditional block, it helps to know the likelihood of each branch being taken. This data, along with knowledge of how the branch predictor works on the target device, helps the Optimizer generate better code. There are also cases where the Optimizer can make speculative decisions but needs to ensure the overall performance of the application still benefits.
To address these issues, profile-driven (a.k.a. profile-guided) static compilation approaches collect a usage profile of the application in different scenarios in order to optimize it better. However:
| Pros | Cons |
|------|------|
| The Optimizer and Backend can use application behavior from potentially many users to optimize the application better. | The Backend still does not know all possible target devices and cannot use all instructions available on every machine. |
| | Since the profile-based decisions are hard-coded into the application, if user behavior changes over time, the application cannot adapt. |
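To make the branch-likelihood idea concrete, here is a hypothetical sketch (the names `specialize`, `handlers`, and `profile` are mine, purely for illustration) of baking a collected profile into a dispatch order, so the statistically hottest branch is tested first:

```python
def specialize(handlers, profile):
    # Hard-code the branch order from observed counts: hottest first,
    # mirroring how a profile-guided compiler lays out likely branches.
    order = sorted(handlers, key=lambda tag: profile.get(tag, 0), reverse=True)

    def dispatch(tag, value):
        for candidate in order:          # most frequent tag checked first
            if candidate == tag:
                return handlers[candidate](value)
        raise KeyError(tag)

    return dispatch, order

handlers = {"inc": lambda v: v + 1, "dec": lambda v: v - 1}
profile = {"inc": 9_500, "dec": 500}     # counts gathered from profiling runs
dispatch, order = specialize(handlers, profile)
```

The common case (`"inc"`) now succeeds on the first comparison; like real profile-guided compilation, the ordering is frozen at build time, which is exactly the adaptability limitation noted above.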
Optimization during delivery
In the world of the web and rapidly-changing connected applications, the application is constantly delivered to all users. This opens the door for an interesting new way of running compilers and optimizers during delivery time.
This is almost like having the profile-driven compilation approach, with the difference that the optimizations can be constantly updated based on the set of observed devices and user behaviors. As such:
| Pros | Cons |
|------|------|
| The Optimizer runs offline on powerful servers (in the cloud), so there is no major time constraint and it can run complex algorithms as it sees fit. | |
| The Optimizer can collect many different users' behaviors and quickly adapt to the changes it observes in the data. | |
| The Optimizer and Backend can generate a different version for each device and then deliver the right version to that device. | |
Depending on the type of optimization, the input and output formats may be the same or different. For example, in the case of HTML Streaming, the input is the HTML page being served to the user. Our service monitors the contents of the HTML pages and extracts the common parts of the page, which it then caches and sends to users faster. In other words, the service parses the HTML, creates a model of the page, modifies this model, and then generates modified HTML to be served to the user (frontend + optimization + backend).
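A deliberately naive sketch of the "common parts" idea, treating the longest shared prefix of observed responses as the cacheable fragment (a real implementation works on a parsed model of the page, not raw text):

```python
import os

def split_common(pages):
    # The longest prefix shared by all observed responses can be cached and
    # streamed immediately; each page's remainder must come from the origin.
    prefix = os.path.commonprefix(pages)
    return prefix, [page[len(prefix):] for page in pages]

pages = [
    "<html><head><title>Home</title></head>...",
    "<html><head><title>Cart</title></head>...",
]
prefix, dynamic = split_common(pages)
```

Here `prefix` is `"<html><head><title>"`, which can be sent to the browser before the origin server has even produced the page-specific remainder.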
In the case of Image Streaming, the input format might be a bitmap, which the Instart Logic service converts to progressive JPEG and then streams. In this case, input images are decoded, modified in memory, and then encoded into the proper format (again, frontend + optimization + backend).
Instart Logic's compilation strategy is cloud-based, implicitly crowd-sourced, adaptive based on data analytics and machine learning, and targets the web and mobile. When was the last time you worked on a project that touches on so many cool and hot topics?