A couple years ago, I took a road trip from Wisconsin to Washington and mostly stayed in rural hotels on the way. I expected the internet in rural areas too sparse to have cable internet to be slow, but I was still surprised that a large fraction of the web was inaccessible. Some blogs with lightweight styling were readable, as were pages by academics who hadn’t updated the styling on their website since 1995. But very few commercial websites were usable (other than Google). When I measured my connection, I found that the bandwidth was roughly comparable to what I got with a 56k modem in the 90s. The latency and packetloss were significantly worse than the average day on dialup: latency varied between 500ms and 1000ms and packetloss varied between 1% and 10%. Those numbers are comparable to what I’d see on dialup on a bad day.
Despite my connection being only a bit worse than it was in the 90s, the vast majority of the web wouldn’t load. Why shouldn’t the web work with dialup or a dialup-like connection? It would be one thing if I tried to watch youtube and read pinterest. It’s hard to serve videos and images without bandwidth. But my online interests are quite boring from a media standpoint. Pretty much everything I consume online is plain text, even if it happens to be styled with images and fancy javascript. In fact, I recently tried using w3m (a terminal-based web browser that, by default, doesn’t support css, javascript, or even images) for a week and it turns out there are only two websites I regularly visit that don’t really work in w3m (twitter and zulip, both fundamentally text based sites, at least as I use them).
More recently, I was reminded of how poorly the web works for people on slow connections when I tried to read a joelonsoftware post while using a flaky mobile connection. The HTML loaded but either one of the five CSS requests or one of the thirteen javascript requests timed out, leaving me with a broken page. Instead of seeing the article, I saw three entire pages of sidebar, menu, and ads before getting to the title because the page required some kind of layout modification to display reasonably. Pages are often designed so that they’re hard or impossible to read if some dependency fails to load. On a slow connection, it’s quite common for at least one depedency to fail. After refreshing the page twice, the page loaded as it was supposed to and I was able to read the blog post, a fairly compelling post on eliminating dependencies.
Complaining that people don’t care about performance like they used to and that we’re letting bloat slow things down for no good reason is “old man yells at cloud” territory; I probably sound like that dude who complains that his word processor, which used to take 1MB of RAM, takes 1GB of RAM. Sure, that could be trimmed down, but there’s a real cost to spending time doing optimization and even a $300 laptop comes with 2GB of RAM, so why bother? But it’s not quite the same situation – it’s not just nerds like me who care about web performance. In the U.S., AOL alone had over 2 million dialup users in 2015. Outside of the U.S., there are even more people with slow connections. I recently chatted with Ben Kuhn, who spends a fair amount of time in Africa, about his internet connection:
I’ve seen ping latencies as bad as ~45 sec and packet loss as bad as 50% on a mobile hotspot in the evenings from Jijiga, Ethiopia. (I’m here now and currently I have 150ms ping with no packet loss but it’s 10am). There are some periods of the day where it ~never gets better than 10 sec and ~10% loss. The internet has gotten a lot better in the past ~year; it used to be that bad all the time except in the early mornings.
…
Speedtest.net reports 2.6 mbps download, 0.6 mbps upload. I realized I probably shouldn’t run a speed test on my mobile data because bandwidth is really expensive.
Our server in Ethiopia is has a fiber uplink, but it frequently goes down and we fall back to a 16kbps satellite connection, though I think normal people would just stop using the Internet in that case.
If you think browsing on a 56k connection is bad, try a 16k connection from Ethiopia!
Everything we’ve seen so far is anecdotal. Let’s load some websites that programmers might frequent with a variety of simulated connections to get data on page load times. webpagetest lets us see how long it takes a web site to load (and why it takes that long) from locations all over the world. It even lets us simulate different kinds of connections as well as load sites on a variety of mobile devices. The times listed in the table below are the time until the page is “visually complete”; as measured by webpagetest, that’s the time until the above-the-fold content stops changing.
URL | Size | C | Load time in seconds | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
MB | FIOS | Cable | LTE | 3G | 2G | Dial | Bad | 😱 | |||
0 | http://bellard.org | 0.01 | 5 | 0.40 | 0.59 | 0.60 | 1.2 | 2.9 | 1.8 | 9.5 | 7.6 |
1 | http://danluu.com | 0.02 | 2 | 0.20 | 0.20 | 0.40 | 0.80 | 2.7 | 1.6 | 6.4 | 7.6 |
2 | news.ycombinator.com | 0.03 | 1 | 0.30 | 0.49 | 0.69 | 1.6 | 5.5 | 5.0 | 14 | 27 |
3 | danluu.com | 0.03 | 2 | 0.20 | 0.40 | 0.49 | 1.1 | 3.6 | 3.5 | 9.3 | 15 |
4 | http://jvns.ca | 0.14 | 7 | 0.49 | 0.69 | 1.2 | 2.9 | 10 | 19 | 29 | 108 |
5 | jvns.ca | 0.15 | 4 | 0.50 | 0.80 | 1.2 | 3.3 | 11 | 21 | 31 | 97 |
6 | fgiesen.wordpress.com | 0.37 | 12 | 1.0 | 1.1 | 1.4 | 5.0 | 16 | 66 | 68 | FAIL |
7 | google.com | 0.59 | 6 | 0.80 | 1.8 | 1.4 | 6.8 | 19 | 94 | 96 | 236 |
8 | joelonsoftware.com | 0.72 | 19 | 1.3 | 1.7 | 1.9 | 9.7 | 28 | 140 | FAIL | FAIL |
9 | bing.com | 1.3 | 12 | 1.4 | 2.9 | 3.3 | 11 | 43 | 134 | FAIL | FAIL |
10 | reddit.com | 1.3 | 26 | 7.5 | 6.9 | 7.0 | 20 | 58 | 179 | 210 | FAIL |
11 | signalvnoise.com | 2.1 | 7 | 2.0 | 3.5 | 3.7 | 16 | 47 | 173 | 218 | FAIL |
12 | amazon.com | 4.4 | 47 | 6.6 | 13 | 8.4 | 36 | 65 | 265 | 300 | FAIL |
13 | steve-yegge.blogspot.com | 9.7 | 19 | 2.2 | 3.6 | 3.3 | 12 | 36 | 206 | 188 | FAIL |
14 | blog.codinghorror.com | 23 | 24 | 6.5 | 15 | 9.5 | 83 | 235 | FAIL | FAIL | FAIL |
Each row is a website. For sites that support both plain HTTP as well as HTTPS, both were tested; URLs are HTTPS except where explicitly as HTTP. The first two columns show the amount of data transferred over the wire in MB (which includes headers, handshaking, compression, etc.) and the number of TCP connections made. The rest of the columns show the time in seconds to load the page on a variety of connections from fiber (FIOS) to less good connections. “Bad” has the bandwidth of dialup, but with 1000ms ping and 10% packetloss, which is roughly what I saw when using the internet in small rural hotels. “😱” simulates a 16kbps satellite connection from Jijiga, Ethiopia. Rows are sorted by the measured amount of data transferred.
The timeout for tests was 6 minutes; anything slower than that is listed as FAIL. Pages that failed to load are also listed as FAIL. A few things that jump out from the table are:
- A large fraction of the web is unusable on a bad connection. Even on a good (0% packetloss, no ping spike) dialup connection, some sites won’t load.
- Some sites will use a lot of data!
The web on bad connections
As commercial websites go, Google is basically as good as it gets for people on a slow connection. On dialup, the 50%-ile page load time is a minute and a half. But at least it loads – when I was on a slow, shared, satellite connection in rural Montana, virtually no commercial websites would load at all. I could view websites that only had static content via Google cache, but the live site had no hope of loading.
Some sites will use a lot of data
Although only two really big sites were tested here, there are plenty of sites that will use 10MB or 20MB of data. If you’re reading this from the U.S., maybe you don’t care, but if you’re browsing from Mauritania, Madagascar, or Vanuatu, loading codinghorror once will cost you more than 10% of the daily per capita GNI.
Page weight matters
Despite the best efforts of Maciej, the meme that page weight doesn’t matter keeps getting spread around. AFAICT, the top HN link of all time on web page optimization is to an article titled “Ludicrously Fast Page Loads - A Guide for Full-Stack Devs”. At the bottom of the page, the author links to another one of his posts, titled “Page Weight Doesn’t Matter”.
Usually, the boogeyman that gets pointed at is bandwidth: users in low-bandwidth areas (3G, developing world) are getting shafted. But the math doesn’t quite work out. Akamai puts the global connection speed average at 3.9 megabits per second.
The “ludicrously fast” guide fails to display properly on dialup or slow mobile connections because the images time out. On reddit, it also fails under load: “Ironically, that page took so long to load that I closed the window.”, “a lot of … gifs that do nothing but make your viewing experience worse”, “I didn’t even make it to the gifs; the header loaded then it just hung.”, etc.
The flaw in the “page weight doesn’t matter because average speed is fast” is that if you average the connection of someone in my apartment building (which is wired for 1Gbps internet) and someone on 56k dialup, you get an average speed of 500 Mbps. That doesn’t mean the person on dialup is actually going to be able to load a 5MB website. The average speed of 3.9 Mbps comes from a 2014 Akamai report, but it’s just an average. If you look at Akamai’s 2016 report, you can find entire countries where more than 90% of IP addresses are slower than that!
Yes, there are a lot of factors besides page weight that matter, and yes it’s possible to create a contrived page that’s very small but loads slowly, as well as a huge page that loads ok because all of the weight isn’t blocking, but total page weight is still pretty decently correlated with load time.
Since its publication, the “ludicrously fast” guide was updated with some javascript that only loads images if you scroll down far enough. That makes it look a lot better on webpagetest if you’re looking at the page size number (if webpagetest isn’t being scripted to scroll), but it’s a worse user experience for people on slow connections who want to read the page. If you’re going to read the entire page anyway, the weight increases, and you can no longer preload images by loading the site. Instead, if you’re reading, you have to stop for a few minutes at every section to wait for the images from that section to load. And that’s if you’re lucky and the javascript for loading images didn’t fail to load.
The average user fallacy
Just like many people develop with an average connection speed in mind, many people have a fixed view of who a user is. Maybe they think there are customers with a lot of money with fast connections and customers who won’t spend money on slow connections. That is, very roughly speaking, perhaps true on average, but sites don’t operate on average, they operate in particular domains. Jamie Brandon writes the following about his experience with Airbnb:
I spent three hours last night trying to book a room on airbnb through an overloaded wifi and presumably a satellite connection. OAuth seems to be particularly bad over poor connections. Facebook’s OAuth wouldn’t load at all and Google’s sent me round a ‘pick an account’ -> ‘please reenter you password’ -> ‘pick an account’ loop several times. It took so many attempts to log in that I triggered some 2fa nonsense on airbnb that also didn’t work (the confirmation link from the email led to a page that said ‘please log in to view this page’) and eventually I was just told to send an email to account.disabled@airbnb.com, who haven’t replied.
It’s particularly galling that airbnb doesn’t test this stuff, because traveling is pretty much the whole point of the site so they can’t even claim that there’s no money in servicing people with poor connections.
What about tail latency?
My original plan for this was post was to show 50%-ile, 90%-ile, 99%-ile, etc., tail load times. But the 50%-ile results are so bad that I don’t know if there’s any point to showing the other results. If you were to look at the 90%-ile results, you’d see that most pages fail to load on dialup and the “Bad” and “😱” connections are hopeless for almost all sites.
HTTP vs HTTPs
URL | Size | C | Load time in seconds | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
kB | FIOS | Cable | LTE | 3G | 2G | Dial | Bad | 😱 | |||
1 | http://danluu.com | 21.1 | 2 | 0.20 | 0.20 | 0.40 | 0.80 | 2.7 | 1.6 | 6.4 | 7.6 |
3 | https://danluu.com | 29.3 | 2 | 0.20 | 0.40 | 0.49 | 1.1 | 3.6 | 3.5 | 9.3 | 15 |
You can see that for a very small site that doesn’t load many blocking resources, HTTPS is noticeably slower than HTTP, especially on slow connections. Practically speaking, this doesn’t matter today because virtually no sites are that small, but if you design a web site as if people with slow connections actually matter, this is noticeable.
How to make pages usable on slow connections
The long version is, to really understand what’s going on, considering reading high-performance browser networking, a great book on web performance that’s avaiable for free.
The short version is that most sites are so poorly optimized that someone who has no idea what they’re doing can get a 10x improvement in page load times for a site whose job is to serve up text with the occasional image. When I started this blog in 2013, I used Octopress because Jekyll/Octopress was the most widely recommended static site generator back then. A plain blog post with one or two images took 11s to load on a cable connection because the Octopress defaults included multiple useless javascript files in the header (for never-used-by-me things like embedding flash videos and delicious integration), which blocked page rendering. Just moving those javascript includes to the footer halved page load time, and making a few other tweaks decreased page load time by another order of magnitude. At the time I made those changes, I knew nothing about web page optimization, other than what I heard during a 2-minute blurb on optimization from a 40-minute talk on how the internet works and I was able to get a 20x speedup on my blog in a few hours. You might argue that I’ve now gone too far and removed too much CSS, but I got a 20x speedup for people on fast connections before making changes that affected the site’s appearance (and the speedup on slow connections was much larger).
That’s normal. Popular themes for many different kinds of blogging software and CMSs contain anti-optimizations so blatant that any programmer, even someone with no front-end experience, can find large gains by just pointing webpagetest at their site and looking at the output.
What about browsers?
While it’s easy to blame page authors because there’s a lot of low-hanging fruit on the page side, there’s just as much low-hanging fruit on the browser side. Why does my browser open up 6 TCP connections to try to download six images at once when I’m on a slow satellite connection? That just guarantees that all six images will time out! I can sometimes get some images to load by refreshing the page a few times (and waiting ten minutes each time), but why shouldn’t the browser handle retries for me? If you think about it for a few minutes, there are a lot of optimiztions that browsers could do for people on slow connections, but because they don’t, the best current solution for users appears to be: use w3m when you can, and then switch to a browser with ad-blocking when that doesn’t work. But why should users have to use two entirely different programs, one of which has a text-based interface only computer nerds will find palatable?
Conclusion
When I was at Google, someone told me a story about a time that “they” completed a big optimization push only to find that measured page load times increased. When they dug into the data, they found that the reason load times had increased was that they got a lot more traffic from Africa after doing the optimizations. The team’s product went from being unusable for people with slow connections to usable, which caused so many users with slow connections to start using the product that load times actually increased.
Last night, at a presentation on the websockets protocol, Gary Bernhardt made the observation that the people who designed the websockets protocol did things like using a variable length field for frame length to save a few bytes. By contrast, if you look at the Alexa top 100 sites, almost all of them have a huge amount of slop in them; it’s plausible that the total bandwidth used for those 100 sites is probably greater than the total bandwidth for all websockets connections combined. Despite that, if we just look at the three top 35 sites tested in this post, two send uncompressed javascript over the wire, two redirect the bare domain to the www subdomain, and two send a lot of extraneous information by not compressing images as much as they could be compressed without sacrificing quality. If you look at twitter, which isn’t in our table but was mentioned above, they actually do an anti-optimization where, if you upload a PNG which isn’t even particularly well optimized, they’ll re-encode it as a jpeg which is larger and has visible artifacts!
“Use bcrypt” has become the mantra for a reasonable default if you’re not sure what to do when storing passwords. The web would be a nicer place if “use webpagetest” caught on in the same way. It’s not always the best tool for the job, but it sure beats the current defaults.
Appendix: experimental caveats
The above tests were done by repeatedly loading pages via a private webpagetest image in AWS west 2, on a c4.xlarge VM, with simulated connections on a first page load in Chrome with no other tabs open and nothing running on the VM other than the webpagetest software and the browser. This is unrealistic in many ways.
In relative terms, this disadvantages sites that have a large edge presence. When I was in rural Montana, I ran some tests and found that I had noticeably better latency to Google than to basically any other site. This is not reflected in the test results. Furthermore, this setup means that pages are nearly certain to be served from a CDN cache. That shouldn’t make any difference for sites like Google and Amazon, but it reduces the page load time of less-trafficked sites that aren’t “always” served out of cache. For example, when I don’t have a post trending on social media, between 55% and 75% of traffic is served out of a CDN cache, and when I do have something trending on social media, it’s more like 90% to 99%. But the test setup means that the CDN cache hit rate during the test is likely to be > 99% for my site and other blogs which aren’t so widely read that they’d normally always have a cached copy available.
All tests were run assuming a first page load, but it’s entirely reasonable for sites like Google and Amazon to assume that many or most of their assets are cached. Testing first page load times is perhaps reasonable for sites with a traffic profile like mine, where much of the traffic comes from social media referrals of people who’ve never visited the site before.
A c4.xlarge is a fairly powerful machine. Today, most page loads come from mobile and even the fastest mobile devices aren’t as fast as a c4.xlarge; most mobile devices are much slower than the fastest mobile devices. Most desktop page loads will also be from a machine that’s slower than a c4.xlarge. Although the results aren’t shown, I also ran a set of tests using a t2.micro instance: for simple sites, like mine, the difference was negligible, but for complex sites, like Amazon, page load times were as much as 2x worse. As you might expect, for any particular site, the difference got smaller as the connection got slower.
As Joey Hess pointed out, many dialup providers attempt to do compression or other tricks to reduce the effective weight of pages and none of these tests take that into account.
Firefox, IE, and Edge often have substantially different performance characteristics from Chrome. For that matter, different versions of Chrome can have different performance characteristics. I just used Chrome because it’s the most widely used desktop browser, and running this set of tests took over a full day of VM time with a single-browser.
The simulated bad connections add a constant latency and fixed (10%) packetloss. In reality, poor connections have highly variable latency with peaks that are much higher than the simulated latency and periods of much higher packetloss than can last for minutes, hours, or days.
The websites compared don’t do the same thing. Just looking at the blogs, some blogs put entire blog entries on the front page, which is more convenient in some ways, but also slower. Commercial sites are even more different – they often can’t reasonably be static sites and have to have relatively large javascrit payloads in order to work well.
Appendix: irony
The main table in this post is almost 50kB of HTML (without compression or minification); that’s larger than everything else in this post combined. That table is curiously large because I used a library (pandas) to generate the table instead of just writing a script to do it by hand, and as we know, the default settings for most libraries generate a massive amount of bloat. It didn’t even save time because every single built-in time-saving feature that I wanted to use was buggy, which forced me to write all of the heatmap/gradient/styling code myself anyway! Due to laziness, I left the pandas table generating scaffolding code, resulting in a table that looks like it’s roughly an order of magnitude larger than it needs to be.
This isn’t a criticism of pandas. Pandas is probably quite good at what it’s designed for; it’s just not designed to produce slim websites. The CSS class names are huge, which is reasonable if you want to avoid accidental name collisions for generated CSS. Almost every td
, th
, and tr
element is tagged with a redundant rowspan=1
or colspan=1
, which is reasonable for generated code if you don’t care about size. Each cell has its own CSS class, even though many cells share styling with other cells; again, this probably simplified things on the code generation. Every piece of bloat is totally reasonable. And unfortunately, there’s no tool that I know of that will take a bloated table and turn it into a slim table. A pure HTML minifier can’t change the class names because it doesn’t know that some external CSS or JS doesn’t depend on the class name. An HTML minifier could theoretically determine that different cells have the same styling and merge them, except for the aforementioned problem with potential but non-existent external depenencies, but that’s beyond the capability of the tools I know of.
For another level of ironic, consider that while I think of a 50kB table as bloat, Google’s AMP currently has > 100kB of blocking javascript that has to load before the page loads! There’s no reason for me to use AMP pages because AMP is slower than my current setup of pure HTML with a few lines of embedded CSS and the occasional image, but, as a result, I’m penalized by Google (relative to AMP pages) for not “accelerating” (deccelerating) my page with AMP.
Thanks to Leah Hanson, Jason Owen, and Lindsey Kuper for comments/corrections