Diving into the web of data – the YQL talk at boagworld live 200
Friday, February 12th, 2010
I just finished a quick demo for the 200th episode of the Boagworld podcast, streamed live on Ustream. I thought I had an hour, but it turned out to be half an hour. My topic was YQL, and I wanted to do something like what is shown in the video that was just released (click through to watch or download the video in English or German):
The story of this is:
- We spend a lot of time thinking about building the interface, using the right semantic markup and trying to make browsers work (or expecting certain browsers in certain settings, as we gave up on that idea).
- What we should be concentrating on more is the data that drives our web sites. It is boring to have to copy and paste texts from Word documents or get a CMS to generate something that is almost, but not quite, useful HTML.
- You could say that once published as HTML the data is available, but for starters HTML4 is a bad format for storing information. Furthermore, too many pieces of software can access web sites, and the cleanest HTML you release somewhere can be messed up by somebody else with a CMS or any other means of access further down the line. Sadly, most content editing software still produces HTML that is tied to its presentation rather than to what it should structure and define.
- Having worked on the datasets provided by the UK government at data.gov.uk, I’ve realised that as a market we are nowhere near providing re-usable and easily convertible data to each other. XML was meant to be that, but got lost in the complexities of dictionaries, taxonomies and other things that you can spend days defining for English content but have to re-think anyway once you go multilingual. Most content, let’s face it, is maintained in Excel sheets and Word documents, which is OK because people should not be forced to use a system they don’t like.
- If you really think about the web as a platform and as a medium, then we should have simple ways to provide textual data or binary information (for videos and images) instead of getting bogged down in how to please the current generation of browsers.
- If you really want to be accessible to any web user (and that is anyone who can get content over HTTP), you should think about making your content available as an API. This allows people to build the interfaces needed for edge cases that you didn’t even think existed.
- YQL is a simple way to use the web as a database and mix and match data and also a very simple way to provide data in easy to digest formats – give it a go.
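To give an idea of what "the web as a database" means in practice, here is a minimal sketch of how a YQL statement turns into a plain HTTP GET URL. It only builds the URL and makes no request. The endpoint is the public YQL endpoint as I understand it to have been documented at the time; the feed URL is a placeholder, and the helper name `build_yql_url` and its `fmt` parameter are made up for this illustration.

```python
from urllib.parse import urlencode

# Public YQL endpoint as documented around the time of this post
# (assumption: not verified here, and nothing is requested from it).
YQL_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def build_yql_url(statement, fmt="json"):
    """Return the GET URL for a YQL statement (hypothetical helper).

    The statement travels URL-encoded in the `q` parameter and the
    desired output format in `format`.
    """
    return YQL_ENDPOINT + "?" + urlencode({"q": statement, "format": fmt})


# Placeholder feed URL, used purely for illustration.
statement = 'select title, link from rss where url="http://example.com/feed.xml"'
url = build_yql_url(statement)
print(url)
```

The point of the sketch is that the "query" is just text in a URL parameter, so anything that can fetch a URL can ask for the data in an easy to digest format.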
In any case, after the 2 o’clock podcast, where most of my questions were eloquently answered by Jeremy Keith and the Skype connection died in the last 5 minutes, I spent the afternoon putting together some demos for this YQL talk. YQL is really easiest to explain with examples, and I wanted the people on flaky connections to have something to play with. So if you go to:
You can see what I talked about during the podcast. People in the chat asked if this will be open source. Yes, it is: the passcode is pressing Cmd+U in Firefox, or whatever other way you choose to “view source” in your browser of choice.