How to read performance articles

Monday, December 17th, 2012 at 2:30 pm

Summary: Performance articles are very good at making a point, but they are fixed in time and the shelf life of their findings can be very short. This can lead to great ideas and technologies being discarded prematurely instead of their implementations being improved or fixed. Proceed with caution.

Every few months I see articles about how someone likes some of the great new features we have in HTML5 and how simple they are to use. Without fail, there will be a follow-up post by performance specialists warning everybody against using these technologies because they are slow.

Performance is sexy, and performance is damn important. Our end users should not suffer from our mistakes. There is no doubt about that. There is also no doubt, though, that a simple API that builds on top of existing development practices will be used more than one that is more obscure but performs better.

This is the dilemma we find ourselves in. LocalStorage is damn easy but can perform terribly; WebSQL and IndexedDB are much harder to use and perform better, but are not supported consistently across browsers. Data attributes on HTML elements are an incredibly clever way to add extra information to HTML and keep it easy to maintain, but they suffer from reading from and writing to the DOM, which is always slow.

Instead of finding a middle ground, however, we write articles, give talks or post about our favourite point of view on the subject at hand. Performance articles boast lots of numbers, graphs and interactive tests that allow people to run a bit of script across several browsers, and they paint a picture of doom and gloom.

Designers who just want to use technologies, on the other hand, then write articles showing that not everything is terrible, because their own work never reaches the brute-force number of operations these test cases run to arrive at a meaningful sample.

I am bored of this, and I think we are wasting our time. What needs to happen is that performance testing and implementation lead to what we really need: improved browsers with fixes for the issues that cause delays. A lot of performance articles would be better off as comments in a bug tracking system, because there they get read by the people who can fix the issues. We also need much more feedback from implementers on why they don’t like to use a better-performing technology and what could be done to make them like it.

Right now our polarised writing causes the worst outcome: people are afraid to use technologies and browser vendors don’t see a point in fixing them because of that.

Libraries love implementers

Libraries, on the other hand, recognise the issues and fix them internally. Almost every jQuery script you see is incredibly tightly knit with the DOM: it reads and writes inside loops, fails to cache or concatenate before causing reflows, and would be an absolute nightmare if implemented in plain JavaScript. Library makers learned this and swallowed that pill: implementers like using the DOM, they just don’t like the DOM API. Storing something in an attribute makes it easy to change and means that people will not mess with your code. That’s why libraries use caching mechanisms, delayed writing and, all in all, a DOM facade that allows developers to use and abuse the DOM without the performance punishment (in many cases, not all of course).
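To make "delayed writing" a bit more concrete, here is a minimal sketch of a write queue. It is not how jQuery or any particular library does it; the names createWriteQueue, write and flush are made up for this illustration, and the scheduler is passed in so the idea does not depend on any one browser API.

```javascript
// Hypothetical sketch of "delayed writing": collect attribute writes and
// apply them in one go. createWriteQueue, write and flush are illustrative
// names, not part of any real library.
function createWriteQueue(schedule) {
  const pending = new Map(); // node -> { attributeName: value }
  let scheduled = false;

  // Apply every queued write, one setAttribute call per attribute.
  function flush() {
    scheduled = false;
    pending.forEach(function (attrs, node) {
      Object.keys(attrs).forEach(function (name) {
        node.setAttribute(name, attrs[name]);
      });
    });
    pending.clear();
  }

  return {
    write: function (node, name, value) {
      const attrs = pending.get(node) || {};
      attrs[name] = value; // a later write to the same attribute wins
      pending.set(node, attrs);
      if (!scheduled) {
        scheduled = true;
        schedule(flush); // ask the scheduler to run flush once
      }
    },
    flush: flush
  };
}

// In a browser the scheduler could be, for example:
//   const queue = createWriteQueue(function (fn) { requestAnimationFrame(fn); });
//   queue.write(div, "data-x", "200");
```

Ten writes to the same attribute in a loop then end up as a single DOM write when the queue is flushed, which is the kind of thing a facade can do without the implementer noticing.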

The same needs to happen in browsers and standards. That something is slow is not the news. We live in a world of ever changing browsers and technologies. As Jake Archibald put it:

@codepo8 agreed. Also, most “x is faster than y” advice has a very short shelf life

A lot of the performance issues of technologies are rooted in how they were defined, or in a lack of definition that led browsers to implement them differently. These are the hardest to fix. But we can only fix them when and if people use them. Without usage no browser vendor will be enticed to spend time and effort fixing the issues. After all, if nobody complains, it probably is OK.

Read with caution

So when reading about the performance of anything web related, I think it is important to consider a few things. I also think that well-written posts, books and articles should mention those instead of showing a graph and declaring “xyz considered harmful”:

  • What usage of the technology causes issues – if storing lots of large videos slows down localStorage, that doesn’t make it a “do not use” technology when all you store is a few small strings
  • What are the effects of the performance issue – if what you do delays a page load by 10 seconds, that is an issue; if the problem only occurs in certain browsers and with a certain kind of data, less so
  • Are there workarounds – what can implementers do to still use the technology and reap its rewards without causing the issues?
  • What are the catalysts – a lot of the time a performance issue does not really show until you use the technology excessively. DOM access, for example, is not a problem when the result is cached; it is when you keep reading and writing the same values
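The last point can be sketched in a few lines. This is only an illustration: the function names are made up, and node stands for any element that has a data-x attribute.

```javascript
// Illustrative only: both functions compute the same result, but the first
// goes back to the DOM on every iteration while the second reads once.
function sumWithRepeatedReads(node, times) {
  let total = 0;
  for (let i = 0; i < times; i++) {
    total += Number(node.getAttribute("data-x")); // DOM read on each pass
  }
  return total;
}

function sumWithCachedRead(node, times) {
  const x = Number(node.getAttribute("data-x")); // single DOM read
  let total = 0;
  for (let i = 0; i < times; i++) {
    total += x; // plain variable access from here on
  }
  return total;
}
```

With a handful of iterations the difference is unlikely to matter; it is the excessive, repeated access that turns it into a catalyst.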

Of course, performance experts will tell you that this is a given. People should take the numbers with a grain of salt and test them against their own implementations. Well, that is not how publishing works and this is certainly not how quoting in a 140-character medium works.

A current example

Let’s take a quick example about this:

Stoyan Stefanov’s “Efficient HTML5 data- attributes” talks about the performance of HTML5 data attributes and shows that they are slow. We have the graphs as proof, we have an interactive test to run in our browsers. And of course Twitter was full of “data attributes are slow and suck”. The interesting part of the article to me, however, was in the last two paragraphs:

Using data attributes is convenient, but if you’re doing a lot of reading and writing, your application’s performance will suffer. In such cases it’s better to create a small utility in JavaScript which will store and associate data about particular DOM nodes.

In this article, you saw a very simple Data utility. You can always make it better. For example, for convenience you can still produce markup with data attributes and the first time you use Data.get() or Data.set(), you can carry over the values from the DOM to the Data object and then use the Data object from then on.

This, to me, is the missed opportunity here. Right now data attributes perform terribly because they are tied to the DOM node: every time you read or set a dataset property, you read from or write to a DOM attribute. This doesn’t make much sense – why would you need an extra API then?

The mythical Data utility Stoyan writes about is what this article should have started with. Of course, I can see his plan to make people start playing with the idea and thus get deeper into the subject matter. This would be lovely, but it means that readers need to either do that or check the comments or follow-ups for that solution. Articles have a much shorter shelf life these days than they had in the past – it makes more sense to show a solution that fixes the issues rather than a blueprint for one. This is not a workshop.

The magic moment here is not saying that the following will be slow to read and write if you use dataset or getAttribute():

<div id="foo" data-x="200" data-y="300"></div>

It is also not that you can replace it with a much better performing script like this:

Data.set(div, {x: 200, y: 300});

They are not the same – by a long shot. The former is much easier to maintain and keeps all the data in the same spot. The latter is spreading the info into two documents and two languages – very much against the whole idea of what data attributes are there for.

An article with this title should have shown a way to turn the HTML version into a well-performing one – by, for example, looping through the document once and storing the information in a data object for lookup.
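Here is a rough sketch of what that could look like. The names Data.get() and Data.set() come from the quoted article, but the implementation below is my assumption and not Stoyan’s code; prime() is a made-up helper, and the WeakMap is just one way to associate data with nodes.

```javascript
// Hypothetical sketch: keep the data attributes in the markup, but read
// them from the DOM only once per node. Not the utility from the article.
const Data = (function () {
  // WeakMap entries do not keep removed nodes alive.
  const store = new WeakMap();

  // Copy the node's data-* values into a plain object on first use.
  function load(node) {
    let values = store.get(node);
    if (!values) {
      values = Object.assign({}, node.dataset); // the only DOM read
      store.set(node, values);
    }
    return values;
  }

  return {
    get: function (node, key) {
      return load(node)[key];
    },
    set: function (node, values) {
      Object.assign(load(node), values); // updates the cache, not the DOM
    },
    // Loop over the matching nodes once and cache their data up front.
    prime: function (root, selector) {
      root.querySelectorAll(selector).forEach(load);
    }
  };
})();

// In a browser, assuming the markup shown earlier:
//   Data.prime(document, "[data-x]");
//   Data.get(document.getElementById("foo"), "x"); // no further DOM read
```

Two things to keep in mind with a sketch like this: values carried over from the markup are strings, and once a node is cached, later changes to its attributes are not picked up. A real solution would have to decide how to deal with both.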

I am not saying that the article is bad – I think the last two paragraphs made it much more objective. What I am saying though is that it is primed to be quoted wrongly and lead to truisms that stand in the way of the underlying issue being fixed.

Performance improvements happen in several areas. The ones with the biggest impact are making browsers faster by fixing issues under the hood and making it easy for people to develop. We promised developers a lot with the HTML5 standards – this stuff should perform without implementers having to build their own workarounds. This is the main lesson we learned from the success of jQuery.

So, if you read “xyz considered harmful”, read carefully, consider the implementation parameters and don’t discard a very useful tool just because you see some numbers that right now look grim. Technology changes faster than we think and we need to measure with use cases, not lab tests.
