# Purpose/Description |
The purpose of this experiment was to determine if significant gains could
be had from adding the X-subresource headers to pages. If that was the case,
further experimentation would be done. |
Specifically, I tested the effect of the prefetching on a page with several
external Javascript resources, which would need to be loaded and run serially by
the browser. |
# Setup |
The experiment consisted of loading a single test page from a server, and
timing the page load times under various conditions. There were three
components: the server, the test files, and the client. |
### Server |
The server was a small HTTP server written in Python designed to serve
static content and optionally scan the pages for tags and generate X-subresource
for any external resources referenced in the document. It can also add an
artificial latency to simulate adverse network conditions --- in this case, it
simply waits a certain amount of time after receiving a GET request before
sending back the response. |
TODO(aoates): publish the modified python server. |
### Client |
The client was a standard build of Firefox 3.0.10 with a custom extension
installed. |
The extension listens for incoming requests, grabs any X-subresource headers
and initiates prefetch requests for all the referenced resources. |
TODO(aoates): publish the firefox extension. |
### Test Pages |
The test pages consist of two variations on one page. In each case, the page
loads and executes 10 external Javascript files, and collects timing information
on how long the page load as well as each external JS load takes. The individual
Javascript files print out a message to the page, and stall for a small amount
of time (around 100ms). |
In the first variation, `test5.html`, the external Javascript files are
hosted locally, alongside the page. They are therefore served by the Python
server, which cannot handle multiple parallel connections. In the second
variation, `test5_corp.html`, the Javascript files are hosted on a server which
can handle parallel connections. |
TODO(aoates): publish the actual web pages. |
# Experimental Groups |
There were four experimental groups, testing two variables: |
-
Presence of X-subresource headers in test page headers, and |
-
Location of external resources; either locally, served by the
small Python HTTP server, or on remotely, on `www.corp.google.com`
and served by Apache. |
In each case, the test page itself was served locally by the Python
webserver (which generated the headers). The server was run with the following
command: |
`./xsubserver.py --latency 300 [--no-xsubresource]`with the last parameter
varying. This introduced a "latency" of 300ms to all content served by the
Python web server; the server waited 300ms after receiving the request to send
back the response. The number 300 was chosen to approximate the latency of
downloading one of the javascript files from a typical server. |
# Trials and Results |
Each group was run several times, and a representative set of results is
presented in the table below. For each group, an overall page load time is
presented, as is the load time of each Javascript `script` element (the time
taken to load and execute the Javascript file).
|
|
|
|
|
|
|
| *Results of Experiment #1* |
|
|
External Resource Server |
|
`xsubserver.py` (local)Apache/2.0.55 (remote) |
|
X-subresource |
| Page Load: *3065 ms* | HeadersYes
| JS #1 Load: 306 ms |
| JS #2 Load: 301 ms |
| JS #3 Load: 309 ms |
| JS #4 Load: 306 ms |
| JS #5 Load: 301 ms |
| JS #6 Load: 312 ms |
| JS #7 Load: 299 ms |
| JS #8 Load: 303 ms |
| JS #9 Load: 296 ms |
| JS #10 Load: 317 ms |
| Page Load: *1366 ms* |
| JS #1 Load: 379 ms |
| JS #2 Load: 108 ms |
| JS #3 Load: 116 ms |
| JS #4 Load: 106 ms |
| JS #5 Load: 107 ms |
| JS #6 Load: 106 ms |
| JS #7 Load: 107 ms |
| JS #8 Load: 107 ms |
| JS #9 Load: 107 ms |
| JS #10 Load: 106 ms |
|
|
| Page Load: *4178 ms* | No
| JS #1 Load: 422 ms |
| JS #2 Load: 415 ms |
| JS #3 Load: 411 ms |
| JS #4 Load: 415 ms |
| JS #5 Load: 415 ms |
| JS #6 Load: 415 ms |
| JS #7 Load: 414 ms |
| JS #8 Load: 417 ms |
| JS #9 Load: 422 ms |
| JS #10 Load: 416 ms |
| Page Load: *3777 ms* |
| JS #1 Load: 487 ms |
| JS #2 Load: 353 ms |
| JS #3 Load: 342 ms |
| JS #4 Load: 342 ms |
| JS #5 Load: 352 ms |
| JS #6 Load: 356 ms |
| JS #7 Load: 352 ms |
| JS #8 Load: 347 ms |
| JS #9 Load: 484 ms |
| JS #10 Load: 351 ms |
|
| |
# Discussion |
Let's examine each set of results. |
### Python/local without prefetching |
In this trial, there was an overall page load time (from execution of the
first script tag to sending the body's `onLoad` event) of 4178 ms, with each
individual javascript element taking a little over 400 ms to load and execute.
This is exactly as expected --- with a latency of 300 ms, and an execution
duration of 100 ms, each javascript element should take around 400ms to execute.
Multiply that by 10 elements, and we get a ~4000 ms page load time. This is our
worst-case scenario. |
### Apache/remote without prefetching |
Assuming we modeled the latency accurately with our local Python webserver
(which we didn't), these should be about the same as the local results. Each
Javascript load is 50-60ms faster, however. This can be chalked up to Apache
being just plain better than our local server. |
### Python/local with prefetching |
In this case, we shave about 1 second off our overall page load time. By
examining the server output, it is clear that the page is sending off its
prefetch requests all at the start of the page load, and the files are
downloaded in the network thread. However, since the server cannot handle
parallel connections/requests, the files must be read in serial. In this case,
it looks like the prefetches are around 100ms ahead of the execution in the UI
thread. Since we can load and execute at the same time, that 100ms stacks to
give us a savings of around a second. Pretty good! |
### Apache/remote with prefetching |
In this case, since Apache can handle multiple concurrent connections, the
browser can shoot off all its requests at once. Once again, the loading is a
little behind the execution of the scripts, but since we are downloading them
all concurrently, we've finished fetching the scripts by the time the second one
starts executing. Each subsequent script therefore only takes execution time (no
downloading needed), so we get excellent savings, very close to our optimal page
load time of ~1200-1300 ms. |
### Costs |
Adding the X-subresource headers can be costly. In this case, it increased
the size of the headers on `test5.html` from 166 bytes to 747 bytes. The
appended headers were:` X-subresource: test5_1.js; type=application/x-javascript
X-subresource: test5_2.js; type=application/x-javascript X-subresource:
test5_3.js; type=application/x-javascript X-subresource: test5_4.js;
type=application/x-javascript X-subresource: test5_5.js;
type=application/x-javascript X-subresource: test5_6.js;
type=application/x-javascript X-subresource: test5_7.js;
type=application/x-javascript X-subresource: test5_8.js;
type=application/x-javascript X-subresource: test5_9.js;
type=application/x-javascript X-subresource: test5_10.js;
type=application/x-javascript ` Note that the current implementation of the
extension doesn't use any of the metadata included in the header (it just sends
off a blind request), so that could be eliminated. Additionally, these could be
concatenated into one large X-subresource header:` X-subresource: test5_1.js,
test5_2.js, test5_3.js, test5_4.js, test5_5.js, test5_6.js, test5_7.js,
test5_8.js, test5_9.js, test5_10.js ` This would bring the size of the header
down to 302 bytes. While much more reasonable, this is not insignificant.
However, as plaintext, it's a good candidate for an optimization like gzip'd
headers. |
# Conclusions |
When prefetching is enabled (as in the 1st row of results), the browser
doesn't have to block network loads on executing javascript. It can read the
prefetch headers, send off requests, and forget about them --- they'll continue
in the network thread while work is being done in the UI thread. |
The most fruitful type of sub-resource to be prefetched is probably
Javascript; when the browser encounters an external JS file, it has to block the
main/UI thread until the file is downloaded and executed. If, however, the file
has been prefetched, the time taken to download is instant savings on the
overall load time of the page. In a case like this, with multiple consecutive
scripts, the difference can be dramatic. |
# Future Work |
Some areas of future work include: |
-
Testing with a dummynet for varying network bandwidth, RTT, and
loss rate. |
-
Writing an Apache module that generates X-subresource
headers |
-
Re-writing the extension in C++ |
-
Getting it to work with images and style sheets |
-
Making it more robust in the face of strange cache settings (or,
more generally, knowing when prefetching will make the user
experience worse) |