While many libraries now offer versions of their sites optimized for mobile devices and smaller screens, I wanted to investigate whether these sites generally meet user expectations in terms of performance. While libraries may not have the resources of large web companies like Google and Facebook, those companies' sites still determine user expectations for libraries. Commonly visited sites establish a norm, and library sites that load slowly are likely to be perceived as broken or poorly designed.
Web performance is particularly important when considering mobile devices, which have smaller cache sizes, fewer parallel connections, less computing power, and sometimes slower connections such as 3G. Furthermore, a study from Compuware indicates that users expect sites to load more quickly on their smartphones than on a desktop or laptop computer. The same study noted that the most common complaint "encountered accessing websites or applications on your mobile phone" was slow load speeds.
To obtain a large sample of library mobile websites, I scraped URLs from the M-Libraries page of the Library Success wiki. The script which harvests URLs is included in the git repository. It takes the hrefs of all anchor elements in the Mobile interfaces (and/or OPACS) section, which includes several links to screenshots and other miscellany that I removed manually.
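The harvesting step can be sketched roughly as follows. This is a simplified stand-in for the repository's actual script, using only Python's standard library; the `HrefCollector` class and the sample markup are my own illustration, and in practice you would feed the parser the wiki page's HTML and restrict collection to the relevant section.

```python
from html.parser import HTMLParser

class HrefCollector(HTMLParser):
    """Collect the href attribute of every anchor element encountered."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.hrefs.append(value)

collector = HrefCollector()
# A stand-in for the wiki section's markup:
collector.feed('<ul><li><a href="http://m.example.edu">Example Mobile</a></li></ul>')
print(collector.hrefs)  # ['http://m.example.edu']
```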
To compare library sites against a common metric that most people would be familiar with, I chose to look up the top ten most-visited websites in the United States using Alexa. There are certainly other services which claim to know the top websites, and taking a global view would have resulted in a different set, too.
Alexa doesn't list the top sites specifically for mobile devices, so to determine which of the top ten sites have separate mobile sites I spoofed an iPhone using
curl -LIA "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0_1 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A523 Safari/8536.25" $URL
Here, $URL is the desktop URL listed in Alexa. Of the top ten sites, half used the same URL while half redirected to a mobile-specific location, e.g. m.facebook.com.
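The same spoof-and-follow check can be sketched in Python. This is an assumed equivalent of the curl command above, not the script I actually ran; note that it issues a GET rather than curl's HEAD request, and `final_url` is a name of my own invention.

```python
import urllib.request

# The same iPhone user-agent string passed to curl above.
UA = ("Mozilla/5.0 (iPhone; CPU iPhone OS 6_0_1 like Mac OS X) "
      "AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 "
      "Mobile/10A523 Safari/8536.25")

def final_url(url):
    """Fetch url with a spoofed mobile user agent, following any
    redirects, and return the URL the request ultimately lands on."""
    req = urllib.request.Request(url, headers={'User-Agent': UA})
    with urllib.request.urlopen(req) as resp:
        return resp.geturl()

# e.g. final_url('http://facebook.com') would reveal a redirect to a
# mobile-specific location like m.facebook.com.
```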
Assuming users start on the mobile site thus already introduces slight inaccuracies into a performance analysis. How a user ends up on a mobile page also affects their experience. At their worst, some of the Alexa sites perform multiple redirects before landing on the mobile URL. Library sites, too, might redirect users to the experience optimized for their device. On the whole, while redirects should only account for a couple of HTTP requests at most (and ideally only the first time a user accesses a site on a given device), they're still to be avoided and can cause a noticeable page load delay.
Once I had a list of both mobile library websites and mobile versions of the Alexa top ten, I ran heuristic performance tests using Yahoo!'s YSlow. To run the tests programmatically on hundreds of URLs, I used PhantomJS, a headless WebKit browser. To work around any user-agent sniffing sites were doing (e.g. redirecting desktop browsers to a desktop site), I set PhantomJS's user agent to the same iOS 6 iPhone string used with curl above. Since the tests are heuristic and simply check whether a site meets a particular guideline, it doesn't matter that they're run in WebKit as opposed to another rendering engine whose real performance may differ.
The specific YSlow metrics recorded were the overall YSlow grade, the total number of HTTP requests, the grade on YSlow's requests guideline, and the grade on its GZIP compression guideline.
Five of the M-Libraries sites timed out and produced no data. Losing five out of 135 total sites is acceptable attrition.
The full data set as well as all calculations mentioned below are present in this Google spreadsheet.
Overall, it was evident the best-performing library sites were competitive with the best-performing Alexa sites. The third quartiles of the overall YSlow grades for each set of sites were comparable (93.75 for the Alexa top ten versus 92 for libraries). However, as sites in each data set perform worse, the drop-off was much more significant for libraries. The bottom quartiles of overall YSlow grades were much further apart (a respectable 83 for the Alexa sites versus 62 for libraries).
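The quartile comparison above is straightforward to reproduce with Python's standard statistics module; the grades below are made-up numbers for illustration, not the study's data.

```python
from statistics import quantiles

# Hypothetical overall YSlow grades for a small set of sites.
grades = [62, 70, 78, 85, 90, 92, 95]

# quantiles(n=4) returns the three cut points [Q1, median, Q3];
# the 'inclusive' method treats the data as the whole population.
print(quantiles(grades, n=4, method='inclusive'))  # [74.0, 85.0, 91.0]
```

Comparing the Q1 (bottom-quartile) cut points of two groups, as done above, shows how far each group's weaker sites fall behind even when the Q3 cut points are similar.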
Interestingly, library websites actually fared better than the Alexa top ten in terms of using a limited number of HTTP requests. The median for mobile library sites (13.5) was a full three requests fewer than for the Alexa top ten.
This was reflected in better performance on the YSlow requests grade as well. In fact, the median score for library sites was a perfect one hundred percent. The Alexa top ten median was 90.5%.
Other than a handful of outliers, the worst of which used an unfathomable ninety-one requests to serve a mobile site, most library sites were very responsible about limiting the number of resources they served to devices. However, those outliers serving up dozens of requests were extreme enough that the average number of requests for a library site turned out to be greater than the Alexa top ten average (18.05 versus 17.4).
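A quick sketch with invented request counts shows why the median and mean can point in opposite directions here: the median ignores the outliers, while the mean is dragged upward by them.

```python
from statistics import mean, median

# Hypothetical request counts: most sites are frugal, but a couple of
# outliers (like the 91-request site mentioned above) skew the average.
requests = [10, 11, 12, 13, 14, 15, 16, 17, 60, 91]

print(median(requests))  # 14.5 -- the typical site
print(mean(requests))    # 25.9 -- pulled up by the two outliers
```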
An obvious counterpoint would be that the Alexa top ten use more requests because they're more sophisticated sites serving a greater number of user tasks, but that isn't quite fair. Library sites fulfill a number of functions, most notably search and retrieval, which puts them in the same category as Google and Yahoo! from the Alexa list, while several of the Alexa sites (Facebook, Twitter) were judged by a simplistic login page rather than the full-featured web applications behind it, which use more requests.
Where did library sites fall behind? The GZIP grade showed a distinct advantage to the Alexa top ten.
While the Alexa sites ensured that most of their text resources were compressed, libraries did not do as good a job (median 100% for Alexa versus a paltry 78% for libraries). This poor result was disappointing because GZIP is one of the easiest performance optimizations to apply; it typically involves adding a small stanza of text to a server configuration file. Once configured, the server rarely needs to be revisited or maintained.
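For example, on an Apache server with mod_deflate enabled (a common setup, though your server and the exact media types to compress may differ), the stanza can be as small as:

```apache
# Compress text responses before sending them to the client.
AddOutputFilterByType DEFLATE text/html text/css application/javascript application/json
```

An nginx equivalent would be a `gzip on;` directive plus a `gzip_types` list in the server configuration.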
Finally, I tried to run some tests of statistical significance on the two sample populations. A Mann-Whitney U test (results on a worksheet in the data set) of the two populations' overall YSlow grades seems to indicate that they are significantly different. However, it's worth noting that the sample sizes are so drastically different (130 versus 10) that the Mann-Whitney U test may not be appropriate, tending to show that the populations differ. Looking at the mean ranks of the two populations, the Alexa sites' 95.3 is far better than the M-Libraries sites' 68.6. A Kolmogorov-Smirnov test also tends to indicate that the two sample populations are dissimilar, with a P value of 0.039.
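The rank-sum computation underlying the Mann-Whitney U statistic is simple enough to sketch in pure Python. This is a minimal illustration of the statistic itself (ties receive the average rank); the spreadsheet's test would additionally derive a P value, e.g. from a normal approximation, which is omitted here.

```python
def mann_whitney_u(xs, ys):
    """Return the U statistic for sample xs versus sample ys.

    Pools the observations, assigns 1-based ranks (tied values get the
    average of their ranks), sums the ranks of xs, and subtracts the
    minimum possible rank sum n1*(n1+1)/2.
    """
    pooled = sorted(list(xs) + list(ys))
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Values pooled[i:j] are tied; their ranks are i+1 .. j,
        # so each gets the average (i + 1 + j) / 2.
        ranks[pooled[i]] = (i + 1 + j) / 2
        i = j
    r1 = sum(ranks[x] for x in xs)
    n1 = len(xs)
    return r1 - n1 * (n1 + 1) / 2

# Every grade in the first sample ranks below every grade in the second,
# so U is 0 -- maximal separation between the groups.
print(mann_whitney_u([62, 70, 78], [85, 90, 92, 95]))  # 0.0
```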
It's flawed to compare such disparate sample sizes. If a second study were to be run, scraping a comparable number of links from Alexa's list of top websites would be logical.
Mobile library sites scraped from the Library Success wiki performed decently in terms of the performance metrics used in this analysis. They were competitive in terms of overall YSlow grade and actually outperformed the Alexa top ten in terms of minimizing HTTP requests. However, the differences between the two sample populations are likely statistically significant. The number of libraries using GZIP compression to deliver fewer bytes to users was also disappointing. Libraries should research whether their mobile sites are serving compressed assets and alter their server configuration if they are not.