Christian Heilmann


Archive for August, 2010

Using HTML5 Storage to cache application interfaces

Thursday, August 26th, 2010

One of the things that gets me excited about the new features of browsers is the “HTML5” Web Storage module. I like it mostly because of its simplicity.

The idea of Web Storage is to simplify storing information for the user. I always hated using cookies because of all the domain issues. It was also a mess to check for them and then fall back to other solutions. Most of all, people are paranoid about them, and I know a lot of corporate computer users who have cookies disabled altogether.

Web Storage on the other hand is a simple object in the browser. You can set localStorage.foo to ‘bar’ and when you read localStorage.foo the next time you load the document it will be ‘bar’. That’s how easy it is. You can store around 5MB of textual data per domain – numbers get stored as strings, too. With JSON.stringify() you can easily store more complex information.
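
A minimal sketch of what that looks like (the key names here are just examples):

// store a simple value – everything is kept as a string
localStorage.name = 'Chris';

// more complex data goes in as a JSON string ...
localStorage.settings = JSON.stringify({theme: 'dark', visits: 3});

// ... and comes back out with JSON.parse() on the next load
var settings = JSON.parse(localStorage.settings || '{}');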

So I was wondering how to get the most out of that, and I realised that in a lot of cases I render simple HTML in PHP and then do some clever stuff in JavaScript to turn it into a different interface. If I wanted to retain the state of that interface I’d have to store it in a database somehow, so that the next time the user comes to the site I can re-render the last state of affairs.

With localStorage you could do that by simply caching the innerHTML of the main app (if you build it progressively):
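
The trick boils down to something like this (a rough sketch, not the code from the Gist itself – the element ID and renderApp() are placeholders for whatever your app uses):

var app = document.getElementById('app');

if (localStorage.appstate) {
  // we cached the interface on an earlier visit – restore it straight away
  app.innerHTML = localStorage.appstate;
} else {
  // first visit: build the interface progressively as usual
  renderApp(app);
}

// whenever the interface changes, cache its current state for next time
function cacheState() {
  localStorage.appstate = app.innerHTML;
}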

You can check the source of the trick in this Gist on GitHub:

If you don’t get what I am doing here, check this quick screencast:

I’ll be using that a lot – it is terribly simple and yet powerful.

view-source will teach you things that are wrong

Monday, August 23rd, 2010

Lately I find more and more people in comments fighting for the need for “view-source” in our web products and claiming it to be a “vital part of the open web” and a “great way of learning for new developers”. This ails me as it is, from my point of view, a very outdated idea of how to learn and build web sites. I am not saying that view-source is not important – I am saying that there are better ways of learning and analysing code.

When view-source was king

Back in the days when I started working on the web you learned by looking at the source code of other people’s live web sites, copying and pasting what they’d done and reverse engineering the workings to see how you could use or improve it. This is how I learnt JavaScript, as there were no free blogs, tutorials or articles out there to teach me. The books available (basically the JavaScript Reference) dealt with the technologies themselves and not their application in a browser world. JavaScript, for example, was still considered a toy language in comparison to the mighty Perl, ASP, PHP or Java.

A few years before the web I learnt Assembly Language by analysing games on my C64 and trying to get endless lives. I did this by freezing the game, checking the part of the screen that changed when I lost a life (the counter) and then hunting through the memory to find the code that altered the counter on the screen (finding the DEC). I learnt how to cheat the system – not how to write Assembly Language. That came later when I had spent a few years reverse engineering.

We don’t have time for that kind of commitment any more if we want to learn web development.

View source gives you a view and not the source

Despite the time spent, the problem with looking at a web site with view source is that you don’t see the source, but you see a view:

  • If you build web sites with performance in mind and in high-end environments you send browser-optimised views to different browsers. This means that learners would see what is optimised for Firefox and then try it out in Safari. Or worse, they see what hoops we need to jump through to make old IEs behave and consider this the best way of developing.
  • Live code is normally minified and concatenated and has comments removed. So you learn the how but not the why. Some of the things you see might be terrible hacks but necessary for the environment used to build the web site.
  • Checking generated source is even worse. Browsers add browser-specific code that the original writer never added.
  • Live web sites are normally built by committee and have a lot of things in them the developer feels dirty about: tracking codes, third party ads with ridiculous code quality, quick hacks to get the thing out the door at the agreed time and date.
  • Most web sites these days are not written by hand – if you build for the web and you don’t use the power that templating, CMSs, databases and web services offer, you are missing out on a lot of great opportunities. HTML is the end product of a methodology and a build process – not the start of it.

Open source is the new view-source

If you really want to learn about how web sites are built and how to use certain technologies, look at the source code repositories instead. GitHub, SourceForge, Google Code and all the others are full of great examples. This is where developers communicate with each other and show off the newest technologies.

As the final product is generated and not written by hand, the repository is where you will find the important comments explaining why something is the way it is.

Take my entry to the 10K competition, World Info, as an example. If you look at the source code you will see minified JS and CSS, all of it inline. I would never code that way. This is the result of a build script. I tell the world that:

World info source code message by codepo8

If you look at the source on GitHub you get step-by-step comments on how I built the solution.

Which would a new learner get more information from? This was not much more work for me – as I document where I write, I keep things up-to-date. Even more interestingly, I actually fixed a few problems and changed my code while I was documenting it – as I was forced to look at it again from a different angle after having written it.

Learn the why not only the how

The main problem with teaching people how to become good web developers is that there is a massive lack of patience. Instead of realising that knowing the syntax of a language doesn’t make you a developer we think this is all that is needed. It is like learning the grammar of a language and then trying to communicate without having the vocabulary. Or analysing the syntax of a poem without looking at the metaphors and their meaning or the historical environment it was written in. Most of what makes development and writing art and craft is lost because of the lack of patience.

W3Schools is a great example. It tells you the quickest solution and gives you something to play with. This is why it is massively successful. It is a terrible resource though as it doesn’t explain what can go wrong, when this would be a bad solution and it gives people the idea that they know everything by knowing the syntax. The PHP documentation is better as you learn in the community comments how to apply the functions and how they fail.

If you really want to learn about web development and standards then there are a few very good resources:

And far too many personal blogs to list here now. None of these are two-second lookup tasks – but once you have gone through some of them you will know the why and the how, and you will be able to see what is sensible to take on from a source view and what is probably not that good an idea.

Worldinfo – my Event Apart 10KB submission (information and documented source code)

Tuesday, August 17th, 2010

As you might know, An Event Apart in association with Microsoft is currently running a competition asking developers what they can do in under 10KB, and I thought I’d have a shot at that.

So here’s my submission: an interface to get information about any country on this planet in under 5K:

I got the idea last Thursday during Pub Standards in London when someone asked me if it is possible to get information about all the countries in the world using YQL. The main interest was not only to get the names but also the bounding box information in order to display maps with the right zoom level. And it is, all you need to do in YQL is the following:

select name,boundingBox from geo.places.children(0) where
parent_woeid=1 and placetype="country" | sort(field="name")

This gets all the children of the entry with the WOEID of 1 (Earth, that is) in the GeoPlanet dataset that are a country and sorts them alphabetically for you.
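
To use that from JavaScript you send the statement to the public YQL endpoint and ask for JSON – something along these lines (a sketch using JSONP; the callback name is just an example):

var yql = 'select name,boundingBox from geo.places.children(0) where ' +
          'parent_woeid=1 and placetype="country" | sort(field="name")';
var url = 'http://query.yahooapis.com/v1/public/yql?q=' +
          encodeURIComponent(yql) + '&format=json&callback=listCountries';

function listCountries(data) {
  // the countries and their bounding boxes end up in data.query.results
  console.log(data.query.results.place);
}

// JSONP: add the request as a script element and let YQL call us back
var s = document.createElement('script');
s.src = url;
document.body.appendChild(s);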

Each of the results comes with bounding box information which you can then use to display a map with the OpenStreetMap static image API (or any other provider). For example:

http://pafciu17.dev.openstreetmap.org/?module=map&bbox=38.804001,37.378052,48.575699,29.103001&width=500&height=250

Or as I use it:

var image = 'http://pafciu17.dev.openstreetmap.org/?module=map&bbox=' +
  bb.boundingBox.southWest.longitude + ',' +
  bb.boundingBox.northEast.latitude + ',' +
  bb.boundingBox.northEast.longitude + ',' +
  bb.boundingBox.southWest.latitude +
  '&width=500&height=250';

The last piece of the puzzle was where to get country information from, and of course the easiest source is Wikipedia. Every country page on Wikipedia has an info table about it which turned out to be too much of a pain to clean up, so all I did was scrape the first three paragraphs following this table with YQL:

select * from html where
url="http://en.wikipedia.org/wiki/Christmas_Islands"
and xpath="//table/following-sibling::p" limit 3

The rest, as they say, is history. I built the system in about two hours all in all and then spent some time cleaning it up and spicing it up:

World Info – my 10kb app compo entry (spiced up source version)

As the first loading of the data takes a long time I use HTML5 local storage to cache the country information. This means you only have to wait once and subsequently it’ll be much faster.
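
The caching itself is the same trick I described above – roughly this (the key and the loadFromAPIs() helper are made-up names, not the ones in the repository):

function getCountryInfo(woeid, callback) {
  var key = 'country-' + woeid;

  // serve from the cache if we already fetched this country once
  if (localStorage[key]) {
    callback(JSON.parse(localStorage[key]));
    return;
  }

  // otherwise hit the APIs and cache the result for the next visit
  loadFromAPIs(woeid, function (info) {
    localStorage[key] = JSON.stringify(info);
    callback(info);
  });
}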

You can download and see the source of Worldinfo on GitHub and read through the massive amount of comments I left for you.

If I were to build this as a real product I would cache the results on a server rather than hammering the APIs every time a user comes along – as the information doesn’t change much this makes much more sense. I will probably release a PHP version of that soon. For now, this is what we have.

validate() || dont()

Tuesday, August 17th, 2010

If you’ve been on Twitter this morning you’ll have witnessed a bit of a discussion between Remy Sharp and me about his web site http://doesvalidationmatter.com/ which initially just said “no”. I didn’t agree with such a sweeping statement and I still don’t – if you make a statement like that then you need to quantify it. Therefore I called it “arrogant bullshit” which was eloquent enough for 140 characters.

The dangers of influential people making sweeping statements

As Remy and I are friends we just went to IRC and had a longer discussion about what each of us was thinking, and I explained in detail the worries I have with luminaries like Remy making a one-off statement like this. I love Remy’s work and I envy his drive and his dedication to the newest technologies out there. That’s why my name is on the back of his and Bruce Lawson’s book, where I praised their pragmatism and hands-on examples of how to use HTML5 and all its cool APIs in today’s world.

The issue with a statement like “validation doesn’t matter” from someone respected and very much on the forefront of great publications right now is not that it might be wrong or right – it is that people will use it as an excuse without even understanding the subject matter: “Cool, Remy says validation doesn’t matter so let’s use table layouts and font tags – after all they work”.

The question of whether validation is of any importance in a web development (and especially markup) context is not an easy one. The reason is once again the forgiving and undefined environment we as web developers build our products in.

If I put a syntax error in a JavaScript file, it breaks. If I put a syntax error in a PHP script the page doesn’t get rendered (or throws an error). If I write invalid HTML the browser either ignores it or – what is actually worse – tries to fix it for me. This is how browsers always worked as markup is not considered code and this is also why engineers still consider web developers designers. Imagine Word simply replacing everything it considers badly written or a typo when you save and email the document without telling you – this is what browsers do for us.
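
You can watch that silent repair job happen with a couple of lines of JavaScript (a quick sketch):

// feed the parser broken markup and look at what it builds instead
var div = document.createElement('div');
div.innerHTML = '<p>one<p>two</i>'; // unclosed <p>, stray closing </i>
console.log(div.innerHTML); // "<p>one</p><p>two</p>" – fixed without a word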

Markup and validation issues

  • Right now we re-invent markup as we want to build real applications rather than simulating them (every Ajax app is a hack, if you really look at it)
  • The official validators of the W3C are too strict and outdated
  • XHTML was born dead, not because it was a bad idea but because there was no support by the main browser in use at the time
  • XHTML2 and XHTML modules reminded me of the scribblings of madmen on a prison wall – far too complex for the average mind and trying to solve issues that were either edge cases or just far in the future
  • We are used to having to write half-valid code to get things to work and we don’t get any punishment when we write invalid code.
  • Flagging up sensible extensions to the HTML namespace like WAI-ARIA as errors and disallowing a site to go live if it doesn’t validate holds us back in building better web applications.

All in all the need for validating markup seems arbitrary and part of an approach to web development that is over. The only time I cared about XHTML was when I had to use XML and XSLT to develop web sites for a certain CMS. I also learnt there that XML providers will never take your data if it doesn’t come with a schema or doesn’t validate. This was just a given.

None of the above reasons ever stopped me from writing clean code and validating my HTML. And here’s why:

Validating is part of a process

Validators are brutal and stupid machines. The results should be analysed and applied as they make sense. As Douglas Crockford says about JSLint: “it will hurt your feelings”.

Therefore validators for me are a first step to find and remove obvious flaws. When I was a moderator on IRCNet in the long long ago we never helped people who didn’t validate their code – the reason was that 90% of the problems vanished once people realised they had forgotten to close a DIV and the browser rendered something, just not what they intended. This allowed us to help far more people with real problems.

Validating markup is the “have you tried turning it off and on again” you do with computers – it will solve a lot of your issues and give you some time to re-think your ways. Furthermore, it solves a lot of issues, some of them magically.

Most of the issues with validation are not about validating or the tools – they are about people misinterpreting the results. As there is such a gap between the standards and what browsers can do, some of the errors being flagged up should be treated as exceptions and not as show-stoppers.

This has been the bane of accessibility for years – the accessibility standards and the validators based on them just spat out reports that people ticked off to comply with a standard, not to build sensible products. Some measures taken to make a site AAA compliant actually made it less accessible. The reason was not that people were such sticklers; the reason was that – at least in England – inaccessible web sites are against the law. So instead of really building accessible products, people built sites that validate and stapled the results of an automated validation report to the handover – done.

Which of course is nonsense. You cannot validate semantics and you cannot automatically test whether an alternative text for an image really makes sense. For that you need a human. The accessibility standards had that in them. They also asked for testing with users of different abilities, but all of that stopped mattering once you could show a cool report saying your site passed an automated test.

This attitude also spilled over into enterprise environments where the interface of a web app is less important than how it ties in with the business processes. There, a validation report from the W3C validator decided whether the thing went live or not – regardless of what the HTML really was. These kinds of people, and the poor developers who had to work for them, started the whole animosity towards validation.

Validators check the syntax but not the meaning or the quality of the code. They are the start of a quality process, not the end of it. By saying validation doesn’t matter at all we neuter the whole process.

Let’s educate people about validating and debugging

Instead of flat out declaring that validation is dead and we don’t need it, we should educate people on how to read and weigh validation reports. We should also help the W3C and others to write validators that produce more sensible reports and allow for exceptions and the weighing of omissions. Then we have a powerful piece of the puzzle of creating clean, meaningful code.

The siren song of browser support

The bigger confusion about the need or sense of validation is that people say that it works “in all modern browsers” and therefore it is fine to use. The issue with that is that “modern” quickly becomes “holy shit, why did we ever use this” and “nobody uses that one any longer, let’s not support it”. Another issue of course is that if you see a browser as your main support and test platform then it is also up to you to test your code in all the browsers that are “modern” and in all their permutations across different systems.

A lot of the products that now keep us from getting rid of IE6 are the ones that were built back when IE6 was “the awesome” and supporting it was “building for the future”. A lot of what we have achieved in the last years and the current communication between browser vendors was based on us fighting for standards support. Let’s be aware of the trap we fell into once, and of where thinking about standards got us, before we condemn the tool that tells us how close we have come to following a standard that should be the norm.

Remy will publish his thoughts soon, too. This is an interesting discussion. For now this was only about validating – the more interesting question is whether it is worth it to write valid code. Personally I think browsers have enough on their plate rendering the things we give them – why should they also fix what we did wrong?

Would opt-in or opt-out for Google Streetview be a better solution?

Saturday, August 14th, 2010

Seems like the whole of Europe is currently up in arms against Google Streetview showing their houses (let’s not even start with the sniffed wireless access points and their data) and, as friends at Google tell me, there are queues at the Google Germany office where people request their houses to be removed.

Should Google Streetview be opt-in?

The reason is privacy and people are worried about security. As my mom put it “Thieves only need to use that Streetview thing and see where there are nice houses to steal things” to which I replied that they could also use a technology like a bicycle or a car to do the same thing and they wouldn’t have to go far to steal things.

Now, today a friend of mine, Christian Bräunlich, had a damn good idea and put it on a mailing list:

Bestimmte Leute wollen ja ihre Haeuser nicht im Streetview haben. Mein
Vorschlag zur Loesung: es wird ein spezieller 2D-Barcode definiert. Jeder kann
sich den ausdrucken und über die Tür kleben. Dieses Haus wird dann verpixelt.
Vorteil: geringer Aufwand, erprobte Technik: das hat schon vor 6000 Jahren
funktioniert. Häuser können automatisiert verpixelt werden. Den Barcode sieht
man kaum, man könnte ja verschiedene je nach Hausfarbe, definieren.
Ich denke ja nicht, dass alle fristgerecht ihre Anträge einreichen koennen zur
Entfernung, und ausserdem bindet das doch viel Manpower bei Google.

In English:

Some people aren’t happy about seeing their houses in Streetview. Here’s my proposal for a solution: you define a 2D barcode that people can print out and display on their house. Houses with a barcode then don’t get added to Streetview. The benefits: it is simple to achieve and the technique is old and proven (it already worked 6000 years ago). You can hardly see the barcode, and you could offer several different ones according to the colour of the house. I don’t think that everyone will be able to submit their removal requests before the deadline, and besides, the manpower Google needs to respond to removal requests is another problem.

This is the opt-out idea. You could also turn that around and make it an opt-in. If you want to have your house in – display a barcode.

The only problem I can see with this is when you have houses with several tenants. The other benefit of this is that Google could offer these barcodes and send them by mail. They could also create a generator that would allow for example shops to also add their names and product offers in the barcode data and thus enhance the Streetview information even further. What do you think?