The main problem plaguing all web applications (and even simple websites) is the network connection. It may be slow and unreliable or cut out at any time. Nowhere near all users are connected by high-speed fibre.
And even with a perfect connection, there are of course limits. For example, browsers have to decide whether to request the files required for a webpage sequentially or in parallel. Parallel connections have the advantage that downloads are faster, but they also have overhead with each connection.
Modern browsers use sophisticated heuristics and newer protocols such as HTTP/2 or HTTP/3 (based on "QUIC") to squeeze the last little bit of performance out of the connection. As a developer, however, one rarely has any influence over this.
The much bigger lever is reducing the number of necessary requests from the outset. There are two techniques to achieve this, which can be combined with one another:
- Aggressive caching
- The consolidation of multiple resources
So, how can the term “bundling” be understood?
Some resources that are distributed across multiple individual files can be combined without causing problems. As an example: multiple CSS files, embedded via <link> in the HTML code, can be consolidated into a single file by simply concatenating them. Instead of multiple GET requests to the server, only one is then needed.
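This naive form of bundling can be sketched with standard shell tools (the file names are made up; real bundlers additionally rewrite @import statements and url() references):

```shell
# Two hypothetical style sheets standing in for a real project:
printf 'body { margin: 0; }\n' > reset.css
printf 'main { display: grid; }\n' > layout.css

# "Bundling" by simple concatenation into a single file:
cat reset.css layout.css > bundle.css
```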
Assume the entire application code lands in a frontend.js file that is loaded via a <script> tag. When such a tag appears in the <head> of the HTML file, the browser pauses the rendering and downloads the referenced code instead. Meanwhile, the browser window remains blank.
For completeness, it should be mentioned that one can delay the processing of the script tag until shortly before the end of the document using the defer attribute. Files loaded with type="module" are automatically loaded with a delay; the defer attribute is redundant there.
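Both variants in HTML (reusing the frontend.js file from the example above):

```html
<!-- Downloaded in parallel, executed only after parsing is complete: -->
<script src="frontend.js" defer></script>

<!-- Module scripts are deferred automatically; no defer needed: -->
<script src="frontend.js" type="module"></script>
```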
There are various options here, depending on the architecture. With a classic single-page application, the HTML document is largely empty and the actual content is created by JavaScript. One therefore has no choice but to wait until the code – including the framework – arrives in the browser.
If one chooses server-side rendering instead, one can for example move the <script> tag to the end of the HTML code (i.e., just before the closing </body> tag).
In addition, by analyzing import paths, bundlers also take on the translation of the Node world into the browser world.
By convention, packages installed by npm land in the node_modules folder. If one then imports React via import React from "react", under the hood the file node_modules/react/index.js is loaded (to put it in simple terms).
This mechanism can be configured individually for each npm package in its package.json.
This is however alien to the browser: it will attempt a GET request to fetch the react resource, leading to a 404.
An alternative approach to this problem are so-called import maps, with which a URL remapping for imported modules can be defined. As of October 2021, however, they lack broad browser support, and they don't help in all situations faced during frontend development.
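An import map is declared directly in the HTML. The following sketch maps the bare specifier react to a concrete URL (the path is illustrative):

```html
<script type="importmap">
  {
    "imports": {
      "react": "/node_modules/react/index.js"
    }
  }
</script>
<script type="module">
  import React from "react"; // now resolves via the import map
</script>
```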
A user often loads a webpage more than once. Especially in the case of server-side rendering, the web server delivers multiple HTML pages with a substantial overlapping of resources – the subpages usually all use the same style sheet.
Caching therefore must be considered to reduce the loading times of repeated requests. This could be the subject of an entire article of its own, so I can only give a brief introduction to the browser cache here.
Responding to a GET request, a web server can determine the desired caching behaviour. Most web servers do this automatically.
When the browser requests a resource, the server replies not only with its content but also delivers specific metadata, such as modification date and hash:
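Such a response could look as follows (the header values are illustrative):

```http
HTTP/1.1 200 OK
Content-Type: text/css
Last-Modified: Fri, 01 Oct 2021 10:00:00 GMT
ETag: "33a64df5"
```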
The content including metadata then lands in the browser cache. If the user requests the resource again, for example when following a link, the browser sends specific headers to the server:
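These conditional headers echo the metadata received earlier (values again illustrative):

```http
GET /styles.css HTTP/1.1
If-Modified-Since: Fri, 01 Oct 2021 10:00:00 GMT
If-None-Match: "33a64df5"
```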
This way, the browser indicates to the server that it already has the content with this hash.
If the resource on the server has not been modified, the server replies with the status code 304 Not Modified. This in turn signals to the browser that it can load the content from its cache. The server therefore does not have to send the content of the resource again and ends its reply with an empty body.
Correct caching therefore helps to keep transfer volumes small. However, bundling sometimes torpedoes this behaviour. Staying with the example of React: In addition to the framework code (approx. 132 kB in Version 17.0.2), we also have application code. Both land in the same file, which as a result is handled by the server as a single resource.
Unfortunately, both changes to the code and framework updates result in the bundle changing. In the worst case, the browser must therefore download 132 kB of unchanged React code anew with every bug fix because the web server doesn’t know exactly what has changed, but only that something, somewhere, has changed. Although this may not sound like a lot, it quickly adds up when other libraries in addition to React come into play.
The antidote to this is called “code splitting.” Instead of lumping all relevant source files including libraries into a single file, a more intelligent approach is used. Usually, external dependencies land in one file and application-specific code in another file. A bundler makes sure that the import path is correctly implemented, so that the development workflow doesn’t have to be changed. Only the configuration files have to be adjusted.
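As a sketch of what such a configuration adjustment can look like, here is a webpack 5 fragment that moves everything from node_modules into a separate chunk (the chunk name "vendor" is my choice):

```javascript
// webpack.config.js (fragment)
module.exports = {
  optimization: {
    splitChunks: {
      cacheGroups: {
        vendor: {
          test: /[\\/]node_modules[\\/]/, // everything installed via npm
          name: "vendor",                 // lands in its own vendor file
          chunks: "all",
        },
      },
    },
  },
};
```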
Since the configuration is complicated in places, code splitting can also be done manually to a certain degree.
For this purpose, many libraries have "prebundled" and compressed files available for download – in the case of React, for example, react.production.min.js.
Now, instead of writing import React from "react" in your code, it is sufficient to access the global variable React, for the provided bundle exports all functionality in this global variable.
In practice, the header of an HTML file could then look as follows:
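For React 17, for example, with the official prebundled files from the unpkg CDN (app.js stands for your own application code):

```html
<head>
  <script crossorigin src="https://unpkg.com/react@17.0.2/umd/react.production.min.js"></script>
  <script crossorigin src="https://unpkg.com/react-dom@17.0.2/umd/react-dom.production.min.js"></script>
  <script src="app.js"></script>
</head>
```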
However, I advise against this approach when further libraries are incorporated, as things quickly become confusing.
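Another building block is the dynamic import() expression, with which a module is loaded on demand at runtime. In the sketch below, a data: URL stands in for a real module file such as "./chart.js":

```javascript
// The module is fetched and evaluated only when this expression runs,
// so a bundler can split it off into a chunk of its own.
import("data:text/javascript,export const answer = 42;").then((mod) => {
  console.log(mod.answer); // logs 42
});
```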
Bundlers can recognize these dynamic imports and split the application code accordingly. To do so, browsers must support the ESM standard (which, as of October 2021, over 90% do). Alternatively, the files can be generated in a different module standard. I would advise against this, though, as this incurs greater complexity.
Logically, overly fine-grained partitioning leads to too many HTTP requests whenever a page is viewed, because, despite caching, the browser must send a request to the web server for each resource. Ideally, the browser shouldn't send this request at all if the resource has not changed on the server. But how is the browser to know? The solution here is "fingerprinting."
Instead of delivering a script via the URL /app.js, one can simply insert the corresponding hash into the file name, for example /app.48de1f9.js.
The file name and the hash – the fingerprint – must now be referenced in the HTML code.
And then the server must be configured so that with these files the following header is set:
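A common choice is:

```http
Cache-Control: public, max-age=31536000, immutable
```

A max-age of one year is the practical upper bound; immutable additionally tells the browser not to even revalidate the resource.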
This way, the server tells the browser that the files under this path never change. They can therefore be unconditionally loaded from the cache. This accelerates repeated page loading considerably, as the fastest request is one that doesn’t actually happen.
For this, the HTML header given above must be changed as follows:
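Assuming the application code lives in app.js and is now fingerprinted (the React URL stays as before):

```html
<head>
  <script crossorigin src="https://unpkg.com/react@17.0.2/umd/react.production.min.js"></script>
  <script src="/app.48de1f9.js"></script>
</head>
```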
If the content of the file changes at a later time, the hash also changes, and thus also the reference in HTML. This way, the browser recognizes that a new resource must be requested. A bundler can automatically adapt the hashes (see next section).
The rule of thumb is thus: ETag-based caching for HTML, fingerprint plus immutable for everything else.
A bundler ensures that after bundling, the names of the generated files contain the correct fingerprint.
As it is very difficult to combine multiple source files of certain asset classes, such as images and text, fingerprinting is an important performance technique for them. Paired with code splitting, it is a powerful tool.
Incidentally, it is not necessary to store the script under the file name app.48de1f9.js on the web server. One can also configure the web server so that the URL is rewritten accordingly. But beware: resources with an out-of-date fingerprint should result in a 404. Full-blown web frameworks such as Rails offer this without extensive configuration effort.
Unfortunately, fingerprinting also has an influence on the actual content of the resources.
In the previous section, I mentioned that at the very least the <script> tags have to be adjusted.
But other assets are also affected.
If for example one uses SVG files for icons, their paths are referenced in CSS:
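For example (the class and file names are made up):

```css
.icon-cart {
  background-image: url("/icons/cart.svg");
}
```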
But due to fingerprinting, the URL is now suddenly different. The CSS declaration must therefore be correspondingly adjusted by the bundler:
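The rewritten declaration might then look like this (the hash is made up):

```css
.icon-cart {
  background-image: url("/icons/cart.f9e1b3a.svg");
}
```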
At runtime, this JSX code generates an HTML list (<ul>) with multiple entries (<li>) stemming from the underlying data. But browsers cannot understand this syntax.
Cue the “transpiler” …
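To see what a transpiler does, consider a hypothetical JSX fragment and what Babel (with the classic React runtime) roughly turns it into:

```jsx
// Input – JSX, which no browser understands natively:
const list = (
  <ul>
    {items.map((item) => (
      <li key={item.id}>{item.name}</li>
    ))}
  </ul>
);
```

```javascript
// Output – plain function calls, digestible by any ES5 browser:
var list = React.createElement(
  "ul",
  null,
  items.map(function (item) {
    return React.createElement("li", { key: item.id }, item.name);
  })
);
```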
The code is now freed of all idiosyncratic features and all browsers are able to work with it, provided React has been loaded as a dependency. Right?
Internet Explorer 11, which is still used by some applications (and users), is the main culprit here. It is stuck about a decade in the past and doesn't even support the ES2015 language standard.
The technology-agnostic database Browserslist keeps tabs on browser versions and their support of language versions. To transpile, create a configuration file that looks for example as follows:
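For example, in a .browserslistrc file:

```
last 2 versions
ie 11
> 5%
```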
Babel automatically evaluates this file and configures itself so that the generated code is supported by the two most recent versions of every browser, as well as IE 11 and those with more than 5% market share. All features not supported by the applicable browsers are transpiled by Babel. Caution: The compilation does not always resemble the original. Sometimes Babel has to use every trick in the book to emulate new features.
For the unfortunate 5% of users whose browsers don't know what to do with such modern interfaces (thanks, IE!), Babel is unable to help, because as a mere transpiler it knows nothing about the programming interfaces of the browsers.
The same applies, for example, to fetch, the significantly easier-to-use successor to XMLHttpRequest.
What to do?
The solution is called polyfill, a set of very different technologies named after a brand of putty.
This doesn’t always work completely, especially when the new feature is deeply anchored in the browser.
In the case of fetch, the appropriate polyfill implements the functionality based on XMLHttpRequest and in so doing gives it a new appearance.
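The principle can be sketched as follows; real polyfills such as whatwg-fetch handle headers, errors, and edge cases far more thoroughly:

```javascript
// Install the stand-in only in a browser that lacks the native API.
// Outside a browser (e.g. in Node), this sketch is a no-op.
if (typeof window !== "undefined" && typeof window.fetch !== "function") {
  window.fetch = function (url) {
    return new Promise(function (resolve, reject) {
      var xhr = new XMLHttpRequest();
      xhr.open("GET", url);
      xhr.onload = function () {
        // Mimic a small subset of the Response interface:
        resolve({
          ok: xhr.status >= 200 && xhr.status < 300,
          status: xhr.status,
          text: function () { return Promise.resolve(xhr.responseText); },
        });
      };
      xhr.onerror = reject;
      xhr.send();
    });
  };
}
```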
"Progressive enhancement" – the principle of building pages so that they remain usable without modern features – however only works to a limited extent with single-page applications. In these cases, one is often forced to apply large amounts of putty.
Both approaches have in common that, in order to be able to use them correctly, a certain amount of preparatory work is required to determine the concrete demands of the project and thus put things on the right track for the development. If you decide to go with polyfills, bundlers can automatically inject these into the generated bundles. Sometimes the necessary polyfills can even be inferred from the source text and the Browserslist configuration.
At the same time it must be noted that the transpiling of language features, as done by Babel, usually preserves the semantics, whereas polyfills, by their nature, have to make compromises. It is therefore essential to read the documentation.
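A classic example of serving different clients differently is the delivery of modern image formats. With the <picture> element, an AVIF version can be offered alongside a JPG fallback (the file names are made up):

```html
<picture>
  <source srcset="photo.avif" type="image/avif">
  <img src="photo.jpg" alt="A photo">
</picture>
```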
A browser unable to deal with the AVIF format simply loads the JPG version as usual. But if it is able to render AVIF, it loads that version instead. As of October 2021, only about two-thirds of web users worldwide can view AVIF files. These users benefit from significantly faster loading times.
This problem can be made as complicated as you like, because in order to speed up the website on all platforms, one can also deliver differently sized images for different display sizes. As many bundlers give up at this stage at the latest, there are a few specialized third-party providers that scale, convert, and optimize images on the fly, depending on the request. With caching support, of course. That technology is called “content negotiation,” where the provider evaluates the request header of the browser in order to determine what the browser can deal with.
There is one more important job undertaken by bundlers: So-called “tree-shaking,” aka “dead code elimination.”
On the server side, it is usually unimportant how large the executable artefact is. All transitive dependencies – even if they’re not actually needed in the end – are included in the image. But as I have already made clear here, on the client side, bandwidth is precious.
Tree-shaking describes the process of the bundler removing unused source files from the generated bundle. This isn’t necessarily a complicated job, as a dependency tree is needed anyway for fingerprinting.
In theory, anyway. In practice, many bundlers even support tree-shaking below the file level, namely at the level of individual definitions. In the case of constants, for example, they have to determine whether these are "pure," i.e. free of side effects.
For instance, the following definition should not be deleted, even if x isn't accessed:
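A sketch of such an impure definition (the names are made up):

```javascript
function registerPlugin() {
  // Side effect: mutates a global registry.
  globalThis.plugins = (globalThis.plugins || []).concat("my-plugin");
  return globalThis.plugins.length;
}

// `x` may never be read, but removing this line would also remove
// the side effect – tree-shaking must leave it alone.
const x = registerPlugin();
```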
const x = 3, in contrast, is pure and can be disposed of if necessary.
If you have successfully improved the performance of a web application thanks to a bundler, the result that is executed and displayed by the browser often bears only a limited resemblance to the original. Especially the last step described above kills readability.
But there is a remedy for this as well. In debug mode, one can configure bundlers so that they create a source map. The – more or less unchanged – source text is embedded in the compiled file as a comment (typically at its end). This of course makes a mockery of all attempts at size reduction, so it should not be used in production. But during the development phase, browsers can use the source map, for example, to reconstruct accurate stack traces, which significantly simplifies the search for errors.
Concrete benchmarks are, however, indispensable. For these, one can use the web inspector provided by all modern browsers, which can display the network requests with and without cache. This quickly allows bottlenecks to be identified. Automated tools are another building block and can be used in the search for other problems, such as incorrect display on mobile devices.
I always advise viewing the newly developed page on a smartphone using mobile internet on the train. Because if it works fine then, it will work fine everywhere.