webdevelopment | Christian Heilmann

Posts Tagged ‘webdevelopment’

Mess them up while they are young? Better web tutorials now!

Thursday, July 28th, 2011

I am currently tech-reviewing a book that is used in educational environments (schools, universities, corporate training) on HTML5, CSS and JavaScript, or – in short – web development. And needless to say I am pretty shocked by what is taught in schools on that subject.

It annoys me that the first materials new learners get seems to repeat a lot of mistakes of the past. Following I will show some of the issues and what we should be doing instead.

Disclaimer: these are examples I did come across in current code examples and training materials. Anyone shouting that “nobody uses that anymore” and “what kind of idiot would code like that” should please take the stairs down from their ivory tower and get their hands dirty in real, day-to-day training that happens in our schools and corporate training rooms. I am not making this up – this is the impression people get when starting on the web from a school or corporate angle rather than reading magazines, blogs or follow some known people on Twitter.

The “hello world” from hell

The thing I keep seeing as the first code example is something like this:

This is normally followed by explanations that:

this is the way to embed a script,
that the type and/or the language is important to tell the browser what kind of script it is and
that the comments are needed to stop browsers from displaying the code. The JavaScript line comment “//” in the last line is sometimes added “to make sure that your HTML validates”

If you really use an HTML5 Doctype none of them are needed. JavaScript is the only language browsers that support HTML5 and are in use will execute so the type attribute only makes sense when you want to embed a script that shouldn’t be rendered by all. Some client side templating languages do this. JScript is dead, let’s not even bother telling people about it.

As for the comments, I haven’t come across any browsers ever that display the code inside script blocks. If you really worried about backwards compatibility that goes back to ~~Lynx~~Netscape 1(as apparently Lynx is still in production) then you should also not embed your scripts in the page.

In the end, this is what we will do in any case to reap the benefits of maintaining our scripts in separate files. Embedded code in a page is only needed for high performance environments that need everything in one cached file. The other time I keep seeing embedded JS code is to apply a quick fix to a product or to add a “one off as this is a special page” and thus making maintenance harder.

Let’s do this instead: the way to write a “hello world” would be to write an index.html that explains the very needed basics of semantic HTML and embed the following:

That way people never get tempted to mix behaviour, structure and presentation and get to understand what it means to build a web product rather than a page that has everything in one file.

Of course, there can be benefits of keeping all in one document. For starters you can explain how the different things work together. In a real development environment, however, this is the odd one out and our primers should not start explaining those. It is much harder to un-learn things than it is to learn them, so why do we start with something you won’t use later on?

Let’s treat both CSS and JS like we treat images in an HTML document – an external resource. This is the biggest use case and it is the one we should get people to start with.

The curse of `document.write()`

Every time I see a document.write() or document.writeln() in materials these days I die a bit inside. I was around when we had to use that to write framesets into popup windows we just opened. The reason was that browsers didn’t support the DOM to reach content in the page or to create HTML content. I don’t want to remember those times, and neither should people who come into the world of web development for the first time.

I understand that it is the easiest way to show output in a document – to give new developers their first “wow, what I just programmed shows up on a screen” moment. I also know, however, that I would never hire anyone who uses document.write just to print out something on the screen.

Using document.write() teaches developers that JS writes out content where you add a script node – not how to interact with HTML.

Let’s do this instead: instead of writing to the document we have a few options. These are ordered in preference, starting with the least desirable:

Use alert() – which is only marginally better, but comes with a few benefits. Alerts are disruptive. They stop the execution of your code and you can for example show how the value of a variable changes through the execution of your script. The issue with them is that if you want to show a lot of values, it can become annoying to hit enter to get rid of all the alerts. In new Firefoxes for example alerts are stacked instead of stopping the execution, which means that this benefit is gone.
Start with a simple introduction to the DOM. It is really simple if you think about it:

Using the Document Object Model (DOM) we can reach elements in the page. Say for example you have a in the document. Using document.getElementById('result') in your JavaScript will give you access to this element. You can read and write to this element by using innerHTML. So if you do a document.getElementById('result').innerHTML = 'My result' you will see that the content of the paragraph is changed to “my result”. If you do an alert(document.getElementById('result').innerHTML) you will get an alert stating ‘my result’ as this is the text content of the paragraph after you changed it.

That’s not too hard, is it?
Introduce debugging tools from the very start – explain consoles in browsers. All modern browsers come with a development console that can show messages written out with console.log(). By starting with that we explain developers how to debug code before they write it and we let them see what is going on in their code. It is like showing a new trainee where the manuals and the first aid kit are before allowing them to use the power tools. Most of our job is testing our code – not writing it. In the real world we spend most of our time debugging and refactoring code – so why not start with this?

The curse of `alert()` and `prompt()`

Using alerts and prompts to print out results and get content from users is a simple way that works in any browser. However, in a real product you will hardly ever use an alert (other than for debugging) and I can safely say that I never saw the use of prompt outside of tutorials.

Hands up who ever made the mistake of adding an alert() in a massive loop or try to debug a blur() event handler using an alert and thus ending up in an endless debugging loop. Yes, it happens and it doesn’t make sense to go that way when browsers (and server-side JavaScript) have debugging tools these days that are much better in showing us what happens under the hood. A console shows us the objects and DOM elements rather than a cryptic [object object] as debugging information.

Let’s do this instead: instead of using the browser object model and alerts, confirms and prompts use the console for debugging and form fields for data entry. It is really not that hard to write a simple explanation how to access data in a form:

Forms in documents allow you to get information from the user and act on it. Say you have the following form in a page:
Please enter a number
The label element is needed to tell users what the entry field is and what they should do with it. The type attribute tells the browser what kind of entry field we expect – in this case a number. The name is needed for a backend script to get the information when the form is sent to the server and the id is needed to connect the label and the input element, the placeholder shows the user an expected value and the required attribute validates the form in the browser.

As with all elements in the document you can reach the input field using the document.getElementById() method in JavaScript. and you can set and read the value of the field with getAttribute() and setAttribute(). So if you want to use JavaScript to set the value of the entrynumber to 5, all you need to do is document.getElementById('entrynumber').setAttribute('value',5). If you want to read the value of the field you can do a console.log(document.getElementById('entrynumber').getAttribute('value'))

You can set and read the value of the field by reading or writing to the value property of the element. To set the value of the entrynumber to 5, all you need to do is document.getElementById('entrynumber').value = 5. If you want to read the value of the field you can do a console.log(document.getElementById('entrynumber').value)

(after some testing and feedback from Mathias Bynens this still is the better way to teach new developers)

Back that up with a working example you link to and you already explained how forms work, what some of the attributes do, that browsers these days have client-side validation and how to read and write form values. And it wasn’t that long and hard to do.

The myth of the short attention span

Whenever I ask people to explain the why and not just the how in tutorials I get feedback that our readers are busy and want to see the outcome as soon as possible. The general consensus is that readers have an incredibly short attention span and don’t want repetition or get long winded explanations.

That may be true, but it doesn’t mean that you need to spoon-feed information. It means that the start of your tutorial must grab their attention and you must make it interesting. If going through a tutorial and coding along is fun then people will not realise how long they spend on a certain task.

Look at gaming: Angry Birds didn’t come with a tutorial – every time you get a new type of bird there is a simple picture showing you what they do and how to activate their special skills with your finger. They could have written a long document but they didn’t need to. If you really think people have a short attention span and lose interest really fast just see how many times people try to solve the same level over and over again as it is fun to see pigs explode. The same applies to “Cut the rope”. You feel bad when the little frog frowns as the candy gets lost and its constant excitement about the impending feeding makes you want to solve the puzzle all the more. Why do we have this attention only when it comes to “wasting our time” playing games?

Our training materials should learn from that and keep our readers excited and wanting to learn more. You don’t do that with dry examples of die throwing simulators and mathematical loops calculating prime numbers. Bring emotion into your materials or show real life examples and you will have the attention of your readers. In a training, have the group discover the materials bit by bit rather than giving them a massive binder with information and examples that need to be done by then end of the day if you want to get a shiny certificate to collect dust on your wall.

Lipstick and a wig on a pig

You can not re-use old materials by spicing them up with new technology. Period. Whenever a new buzzword appears, some old books get dusted off and new features get added. This happened with DHTML, then Ajax, then Mobile Web Development and now HTML5. Replace the Netscape 3 screenshots with Chrome ones, slap on a chapter covering the new techniques and voilá – you got yourself a new seller. I am sure a lot of people get approached by publishers to “write a new version of this old book, you, know, upgrading it”. Just say no!

Unless the original version of the book has been incredibly well written – and I have yet to find one of that calibre – this is a waste of everybody’s time and borders on a scam. In the past, a big part of web development was either working around bugs in browsers or catering for a special environment. Both are things of the past and you can’t just add a theoretical new chapter to upgrade an old book that is based on dated approaches.

We still struggle to find a correct way to build things on the web and our development practices change constantly. Performance issues, security concerns and stability are constantly fixed and changed in browsers and what was a great idea in the past can easily now get your site hacked or make it perform incredibly bad.

Changing the approach of explaining code in web development

All of this inspired me to write another beginners tutorial to web development. Instead of showing all the things that are possible we should show what makes sense and gets people on their way. A solid foundation with the least necessary information is a great starting point to go out there and learn on your own terms. Our tutorials and courses should start a desire to learn more – not to ensure we got all the information out that we were asked to shove down the throats of our students.

Tags: alert, badtech, documentwrite, outdated, tutorials, webdevelopment
Posted in General | 13 Comments »

Things you can use – my talk at the Deveoper Evening with Yahoo and Opera in Oslo, Norway

Thursday, December 3rd, 2009

I am currently in (freezing) Oslo in Norway and about to leave for the NITH university for the Yahoo and Opera developer evening. Here are the slides and notes of what I am about to present tonight. ~~Audio will follow later~~Audio is now available.

~~For those who will attend tonight: stop right here, you’ll just spoil it for yourself :)~~

Slides

Things you can use (by the Yahoo Developer Network and friends)

Audio

The audio recording is available on archive.org or as a 54MB MP3 file.

Notes

Things you can use (by the Yahoo Developer Network and friends)

In the following hour or so I will be talking to you about some of the things that have been done for you that you can build on. Web development is confusing enough as it is. There is no need to make it more complex by not using what has been done for us already. But first, a few facts about me:

I am Chris and I’ve been developing web sites for about 12 years. I’ve worked with numerous frameworks and CMS and I’ve delivered various international and high-traffic sites. I’ve written several books and dozens of articles and hundreds of blog posts on the subject of web development, accessibility, performance, maintainability, internationalization and many other things I had to battle in my day to day job.

Right now I work as a Developer Evangelist, which is a pretty new job in the market and as many people were confused about it, I wrote a Developer Evangelism Handbook which is free and creative commons online, but you can also buy a hard copy on Lulu.com.

I work for the Yahoo Developer Network and you will find most of the things I will talk about today there.

Learning the basics

The good thing about these days we live in is that we have great resources on the web to learn good basics of web development. That we still learn it by viewing source and trial and error after copy and paste is human nature but also keeps us from evolving. The Opera Web Standards curriculum is an amazing resource to learn web standards based development (as is the WaSP interact) and the Yahoo Developer Network theatre is full of videos of talks and tutorials. All of this is free to use and to build upon.

Starting with a clean canvas

One thing I learnt is that in order to deliver products that work and are fun to maintain you need to work on a solid base. Web development is hindered by the fact that our deployment environment is totally unknown to us. This is why we need to even the playing field before we should go out on the pitch to play.

You achieve this by defining what you call “support” for browsers. In the case of Yahoo this is the graded browser support document. If you aim to make your web product look and work the same on every browser out there you are doing it wrong. Web design is meant to go with the flow and accustom itself to the ability of the user agent (browser in most cases). At Yahoo, we have the Graded Browser Support for this – browsers that are not capable of supporting new technologies will not get them. Web sites are not meant to look and work the same everywhere. On the contrary – the ability to accustom the interface to different user agents is what makes web development so powerful.

CSS frameworks and frontend libraries

The next step is to free ourselves from the limitations of browsers and especially their differences. This is what CSS frameworks and front end libraries like jQuery, Mootools, Dojo, YUI and many more are for. All of these have the same goal: allow you to build code that is predictable and limited to the bare necessities. We should not have to bloat our code just to make our products work with random browser implementation problems. These are a moving target as there is all kind of weirdness happening across the board. Libraries make our job predictable and allow us to use web standards without catering for browsers. If you build your code based on libraries you can fix your product for the next browser by upgrading the library. If you choose to do everything yourself – good luck.

Building interfaces that work

The next thing to consider is that an application interface is not just the look and feel. In order to make it work for everybody we’ll need to understand the ways users interact with our products. This includes simple usability (not overloading the user, not confusing the user) but also means knowing about different interaction channels – for example keyboard users. A great resource for starting the journey towards usable interfaces is the Yahoo Design pattern library. There we collected information how our end users use the web and reach a goal easily. If anything, have a look at these patterns before you start building your first widget or application. They even come with stencils for different designer tools.

Using the web

The big change in web development over the last few years was that we stopped trying to do everything ourselves. The web is full of specialized systems such as YouTube, Flickr and Google Maps that allow you to host data in specific formats and make it dead easy for you to convert and re-use that data in web formats. Using this information happens via Application Programming Interfaces or short APIs. These allow you to demand data in a certain format and get back only what you need. There are hundreds of APIs available on the web. For an idea of what is available, check out programmableweb.com.

Thinking data first

The main thing to be aware of if you want to build great products is to separate your data from your presentation. This is essential to allow for localization, internationalization and to keep your code maintainable. In the case of a mashup of different data sources this means you need to think about making it as easy as possible for you to use different APIs. The complexity of your product increases with the number of APIs you use. Every API has different ways to authenticate, expects different parameters and returns data in different formats.

Mixing the web with YQL

YQL is a solution that Yahoo built for its own needs. All of our products are built on APIs – for scalability reasons. Having to learn all these APIs and negotiating access cost us a lot of our time so we thought we’d come up with a better solution. This solution is called YQL. YQL is a SQL-style language to get data from the web. The following query would get us photos of London from flickr:

select * from geo.places where text=”london”

Or would it? Actually it would give us photos with the text “london” and not photos taken in Oslo. If we wanted that we need to use another API. The Yahoo Geo APIs allow you to define any place on earth and get it back as a “where on earth ID” or short woeid. This format is supported by flickr, so we can use these APIs together. Twitter will also soon support this.

select * from flickr.photos.search where woe_id in (

select woeid from geo.places

where text=”london”

)

This gives you some data of the photos you want to show but not all, so let’s use another API method to get that information.

select * from flickr.photos.info where photo_id in(

select id from flickr.photos.search where woe_id in(

select woeid from geo.places where text=”london”

)
	)

This is a lot of data so in order to only retrieve what we really need we can filter the data down by replacing the *:

select farm,id,secret,owner.realname,
server,title,urls.url.content
from flickr.photos.info where photo_id in(
select id from flickr.photos.search where woe_id in(
select woeid from geo.places where text=”london”
)

)

Using this query in the YQL console, choosing YQL as the output format and “flickr” as the callback will give us a valid URL to use in a browser or script:

http://query.yahooapis.com/v1/public/yql?
q=select%20*%20from%20flickr.photos.info%20where%20photo_id%20in%20(
select%20id%20from%20flickr.photos.search%20where%20woe_id%20in%20(
select%20woeid%20from%20geo.places%20where%20text%3D%22london%22
))&format=json&diagnostics=false&callback=flickr

Using this in the src attribute of a script tag and writing a few lines of Dom Scripting displays the photos.

Is this limited to Yahoo?

No, of course not. First of all you can use any data on the web as a source using the atom, csv, feed, html, json, microformats, rss and xml tables.

Scraping HTML with YQL

The HTML table is quite interesting as it allows you to get data from any HTML document, cleans it up by running it through HTML Tidy and then allows you to access parts of it using XPATH.

This allows you for example to take a simple web site like this one about TV jokes and turn it into a wiget to include into other pages with a few lines of JavaScript.

Extending YQL

You can also add your own data by providing a simple XML schema called an open table. In this schema you need to tell YQL what the data endpoint is, what parameters are expected and what gets returned.

Open tables also allow for an execute block which allows you to write JavaScript that will be executed by YQL on the server side using Rhino and has full e4x support.

Using this we can turn the earlier example of the Flickr photos returned as a list into an open table and make it much easier to get Flickr photos in the right format:

select * from flickr.photolist where text=”me” and location=”uk” and amount=20

Using the JSON-P-X output this means we can simply use innerHTML to render out photos.

If you want your tables to be available in YQL, all you need to do is to add to the open tables repository on GitHub. For more information, check out the YQL documentation.

Building with Blocks

The best thing you can do right now if you want to build a web application is using already tested and working building blocks. The Yahoo User Interface library is full of these as this is exactly how we build our own tools.

For creating layouts with CSS without having to know all the hacks that browsers need you can use the YUI grids and if you are really lazy you can even use the WYSIWYG grids builder.

Using the CSS grids and some of the YUI widgets it is very easy to build a working application that is tested across all the browsers defined in the graded browser support. Examples are this geographical Flickr search and a showcase to get information for Delhi.

One thing that is really useful about the widgets provided in YUI is that they are all driven by custom events. That way you can extend and change their functionality without having to mess around with the code. Simply write an event listener for the custom event, add your functionality and prevent the original functionality.

Another great benefit of YUI is the very detailed documentation and the hundreds of examples you can use to get you started.

Wanna get super famous?

The last thing I want to talk about today is the Yahoo Application Platform or short YAP. Using YAP you can build a web application using JavaScript, HTML and CSS and add it to the Yahoo Homepage, My Yahoo and in the future even more properties of Yahoo.

You start at and develop your application to the largest part in your Applications Dashboard. Yahoo apps have two views: a small view which is more or less static (but allows for some Ajax) and a large view which gives you full access and much wider screen space. The small view is overlayed over the Yahoo page and will show Yahoo ads next to it. The large view can be monetized by you.

One thing that can be a bit of a frustration about YAP when you go at it with a normal web development mindset is that not all CSS/HTML and JavaScript is allowed as YAP uses Caja to keep our applications secure. Therefore we’ve put together some Caja-ready code examples to get you on track.

The easiest way to build YAP apps is by using the Yahoo Markup Language (YML) and YUI as YUI was re-written to be Caja compliant.

If you want to take a look at what a YAP application looks like, check out the source of TweetTrans on GitHub. In essence it is a simple PHP API call using YQL and a YML interface to display the results using Ajax. No JavaScript involved as YML does that for us.

You can install TweetTrans by clicking the following link: http://yahoo.com/add?yapid=zKMBH94k. This is also the way to promote your own applications (simply replace the application ID with yours) until the YAP application gallery is up and running.

The more powerful way of promoting your application in Yahoo is to piggy-back on our social connections and you can do this by diving into the social graph API. The easiest way to do that is to use the social SDK also available on GitHub. Notice that the SDK will not work on a localhost – you need to run it inside the application dashboard or a Yahoo container on the homepage.

Elevator pitches

Yahoo User Interface Library – YUI

YUI is the system that Yahoo uses to build its web sites. It is constantly tested to work for the largest amount of users, free, open source and covers everything from design patterns to out-of-the-box widgets. It is modular and you can use only what you need. You can either host it yourself or get it from a network of distributed servers.

Yahoo Query Language – YQL

YQL is a web service that allows you to mash-up any data on the web from various sources with a simple SQL-style language. You can filter the data down to what you need and you can convert the data with server-side JavaScript before returning it. Data providers can use YQL to publish an API on the web on top of Yahoo’s infrastructure and cloud storage.

Yahoo Application Platform – YAP

YAP is the Yahoo Application Platform which allows you to build applications that run on the Yahoo homepage and soon other properties. You can dive into Yahoo’s social graph to promote your applications and you can create highly secure web apps as YAP uses Caja to ensure code quality.

Tags: css, javascript, mashups, security, webdevelopment, yahoo, yap, ydnoperaoslo, yql, YUI
Posted in General | 5 Comments »

Developing with the web, eh? Speaking at the domain convergence in Toronto, Canada

Tuesday, August 18th, 2009

I have just returned from Toronto, Canada where I spoke at the second Domain Convergence conference about web development with distributed data and about using the social web.

First of all I have to say that this was an eye-opening experience for me as I never even knew that there is a whole community of “Domainers” – people who buy, sell and “develop” internet domains much like other people do with real estate.

The organisers (an old friend of mine hailing back from the time when we created C64 demos together) wanted me to talk about web development with APIs and the social web as the current state of this in the domain market is very much random. The common way of creating some content for a domain is to hire a developer on rent-a-coder, get a $99 logo, write some content and put it live. The main goal for the site thus created is to make people click the ads on it – selling the domain is secondary step.

This is totally the opposite of what I consider web development but instead of condemning it I loved to get the opportunity to show another way of building web sites on a very small budget and without much technical know how. This is what I had to say:

The presentation SlideCast

Developing with the web

You can also listen to or download the audio recording at archive.org

Notes

Following are the notes that I was meant to say during the presentation. I ad-lib a lot though. Can’t help it, eh?

Developing with the web

The current financial crisis is reported to have its toll on the web business as well. Whilst that is true to a degree (declining ad sales show that) there might be a different reason though: the web moves on and so do the habits of our users.

Build it and they will come doesn’t cut it any more

Anybody who builds a web product and wonders why there are less and less comments, visits and clicks must realise that they are part of a massive network of sites and offers. People use the web and multi-task their social interactions.

Back when this was all fields

I remember that back when I started as a web developer you had one browser window. If you wanted to have several pages open you needed several windows. All of these used up memory of which we didn’t have much. As computers were slow we maybe had another chat client open or an IRC client and an email client. In any case, working and surfing were two different tasks.

The web 2.0 revolution

With web 2.0 this changed. Computers became more powerful, browsers started supporting multiple tabs and software moved from bulky desktop to lighter web-based interfaces. The web turned from a read to a read/write media and its users started actively defining their web experience by altering it rather than just reading online content.

Bring the noise!

The next wave was more and more social networks, blogging, micro-blogging and messaging systems. As humans we are hard-wired to communicate and tell the world about everything we do. Only our social standards and learned manners changed that. As neither seems to be of much significance on the web we are happy to tell the world what we do – no matter if the world is interested or not. After all, once we found something out and told the world about it it is time to go back and find the next big thing.

Oh, shiny object!

This wealth of real-time information causes an overload and makes people easily distracted. It also cheapens communication and forces us to give more and more information in shorter and more to-the-point formats.

If you can’t fight them…

Enough complaining though. The funny thing is that the distributed web we have right now is exactly what the web should be like. Web sites are not radio or TV channels – they are small data sets that tie in with a much larger network of data. So in order to be part of this we need to distribute our content.

Spreading and collecting

The way to have fun with the web of data is to distribute ourselves around the web and bring the data back to our sites.

The first step is to spread our content on the web:

upload photos to Flickr
bookmark and tag URLs atDelicious
write short and succinct news updates at Twitter
upload videos to YouTube
link addresses and set up driving instructions with Google Maps
write CVs and bios at Xing or LinkedIn

The benefits of this approach are the following:

The data is distributed over multiple servers – even if your own web site is offline (for example for maintenance) the data lives on
You reach users and tap into communities that would never have ended up on your web site.
You get tags and comments about your content from these sites. These can become keywords and guidelines for you to write very relevant copy on your main site in the future. You know what people want to hear about rather than guessing it.
Comments on these sites also mean you start a channel of communication with users of the web that happens naturally instead of sending them to a complex contact form.
You don’t need to worry about converting image or video materials into web formats – the sites that were built exactly for that purpose automatically do that for you.
You allow other people to embed your content into their products and can thus piggy-back on their success and integrity.

If you want to more about this approach, check out the Developer Evangelism Handbook where I cover this in detail in the “Using the (social) web” chapter.

Bringing the web to the site

The other thing to think about is to bring the web to the site and allow people to use the services they are happy to use right in your interface.

One of the main things I found to be terribly useful there is Chat Catcher. Chat Catcher is a wordpress plug-in or PHP script that checks various social networks for updates that link to your web site. That way you can get for example Twitter updates with links to your site as comments on your blog. This showed me just how much people talk about my posts although I didn’t get many comments. The only small annoyance is that re-tweets show up but that can be tweaked.

Another interesting small tool is Yahoo Sideline which allows you to define searches for the Twitter network, get automatic updates and answer the tweets.

Cotweet does the same thing but on a much more professional level. Cotweet was build for corporations to manage their Twitter outreach and it helps you not only to track people talking about certain topics but also to define “shifts” for several people to monitor Twitter during a certain time period. Using this in a company with offices in various parts of the world you can have a 24 hour Twitter coverage without having to log in from several locations to Twitter itself.

Another way to take advantage of the social web is to allow visitors of your web site to update their status on various social networks directly from your site. This could be as easy as adding a “tweet this” or “bookmark this in delicious” button or as complex as implementing third party widgets like on the new Yahoo homepage.

Digging into the web of data.

Bringing the social web to your site is the first step. Using information of the “web of data” in your site is much more powerful, but also a bit more complex to accomplish.

Getting to the juicy, yummy data

The web is full of nice and juicy data and there is a lot of it around. Our attempts to get to it can be quite clumsy though. What we need is a simple way to access that data. One classic way of doing that is Yahoo Pipes.

Yahoo Pipes is a visual interface to remix information from various data sources like web sites and RSS feeds. The interface reminds the user of visio or database organising tools and is very easy to use. The issue with pipes is that it is a high tech interface (users who can’t see or cannot use a mouse have no way of using it) and it is hard to change a pipe. Any change means that you need to get back to the pipes interface and visually change the structure of the pipe. The good news is that you can clone other people’s pipes and learn from that.

A much easier and more flexible way of mixing the web is YQL. The Yahoo Query Language, or short YQL is a unified interface language to the web. The syntax is as easy as very plain SQL:

select {what} from {service} where {condition}

Say for example you want kittens on your site (who doesn’t?) the following YQL statement would grab photos with the word “kitten” in the description, title or tags from Flickr.

select * from flickr.photos.search where text=”kitten”

Say you only want 5 kittens – all you need to do is add a limit command:

select * from flickr.photos.search where text=”kitten” limit 5

Nice, but where can you get this? YQL is an API in itself and has a URL endpoint:

http://query.yahooapis.com/v1/public/yql?q={query}&format={format}

Output formats are either XML or JSON. If you choose JSON you can use use the data immediately in JavaScript.

Re-mixing the web.

You can nest several YQL queries to do complex web lookups. Guess what this does:

select * from flickr.photos.info where photo_id in (

select id from flickr.photos.search where woe_id in (

select woeid from geo.places where text=’london,uk’

) and license=4

)

What this does is:

Find London, England using the places API of the Yahoo Geo platform. This is needed to make sure you really find photos that were taken in London, England and not for example of a band called London. The Geo places API returns a place as a where on earth or short woe ID.
Search Flickr photos that have this where on earth ID
Get detailed information for each photo found in Flickr.
Only return photos that have a license of 4 as this means the photos are creative commons and you are allowed to use them. This is extremely important as Flickr users can be very protective about their photos.

Display them using free widgets!

You can use systems like the Yahoo User Interface Library to display the photos you found easily in a nice interface.

Using several APIs via YQL and the YUI CSS grids you can show for example information about a certain location without having to maintain any of the content yourself. This example about Frankfurt,Germany shows what this might look like. This page automatically updates itself every time the wikipedia entry of frankfurt is changed or somebody uploads a new photo of Frankfurt, the weather changes in the Yahoo weather API or somebody adds a new event happening in Frankfurt.

All of this works using APIs

The driver behind all these things are Application Programming Interfaces or short APIs. An API is a way to get to the information that drives a web site on a programmatic level and a chance to remix that information. Say for example you want the currently trending topics on Twitter. You can either go to the Twitter homepage or you can open the following URL in a browser:

http://search.twitter.com/trends/current.json>

This will give you the currently trending topics in JSON format – something you can easily convert and use.

API issues

There are of course a few issues with APIs. First of all, there are a lot of them out there. Programmable Web, a portal trying to keep up with the release and demise of APIs currently lists over 1400 different APIs. This is great but the issue is that most of them don’t follow any of the structure of others. Authentication, input and output parameters and access limitations vary from API to API and can make it hard for us to use them. This is where YQL comes in as an easy way out.

The other issue with APIs is that they might only be experimental and can get shut down or change even without warning. Therefore you need to write very defensive code to read out APIs. Expect everything to fail and keep local copies of the data to display should for example Twitter be offline again for a day.

Conjuring content?

The generation of content-rich web products without having to write the content yourself sounds like a dream come true. It is important though to keep reminding us that we use other people’s content and that we are relying on them to deliver it.

Content issues

Whilst you can automate a lot it remains a very good idea to have a human check third party content before putting it live. All kind of things can go wrong – bad encoding can make the text look terrible, a hickup in the CMS can seed wrong content into the category you chose, texts can bee too large to display and many other factors. Therefore it is a good idea to write a validation tool to shorten text and resize images in the data you pull.

Lack of context

One usability issue is that there might be a lack of context. Earlier we showed that it is important to distinguish between a geographical location like Paris, France and a person like Paris Hilton. The same applies to content you get from the web. People are happy to share their content but can get very cross when you display it next to things they do not approve of. Remember that people publish data on the web out of the good of their heart. Don’t cheapen this by displaying information out of context.

UGC issues

One thing professional journalists and writers have been critising for a long time is that while there is a lot of immediacy in user generated content there is also a lack of quality control. Sadly enough in a lot of cases the people that are most driven to state their opinion are the least appropriate ones to do so. Comments can be harsh and make people stop believing in web2.0 as a means of telling people your thoughts. The trick is to take the good with the bad. For you this means once again that the amount of UGC data is not what you should measure – the quality is what you need to care about.

Placement

Bad placement can cheapen the whole idea of the web of data. If you for example show photos of laptops of a certain brand next to news that these models are prone to explode or die one day after warranty expiration you won’t do yourself any favour. Again, showing people’s reviews and blog posts next to bad news might also get them cross.

Legal issues

Which brings us to legal issues. Just because data is available on the web doesn’t naturally mean that you can use it. You can have a lot of free data on the web and use it to make the now more or less useless parking sites for domains more valuable to the reader. You do however also need to make sure that you are allowed to use that content and that you display it with the right attribution.

Demos and resources

Here are a few things to look at that any developer can use and build with a bit of effort.

My portfolio page is completely driven by YQL and maintained outside the server. If I want to upgrade it, I either post something on my blog, change some of the blog pages, add new bookmarks to delicious or upload new slides to SlideShare
Keyword Finder uses the Yahoo BOSS API to find related keywords to any item you enter. The keywords are the keywords that real users entered to reach the top 20 results in a Yahoo search
YSlow is a Firefox add-on that allows you to analyse why your web site is slow and what can be done to make it more nimble and thus create happier visitors.
GeoMaker is a tool build on Yahoo Placemaker to find geographical locations in texts and turn them into maps or embeddable microformats.
Blindsearch is a side-by-side comparison of Yahoo, Bing and Google search results and thus a fast way to find out how you rank on all three.
Correlator is an experiment in alternative search interfaces. Instead of getting a catch-all search result page you can filter the results by interests, locations or people.
The developer evangelism handbook is a free online book I’ve written and in this chapter I go into more detail about the subjects covered here.
Web Development Solutions is a book I’ve published with Friends of Ed dealing with the subject of web development using distributed data.

Play nice!

As one of the last things I want to remind you that it is important to play nice on the social web. It is all about giving, sharing and connecting and if you abuse the system people will complain about you and cause a stir that is very hard to counter with PR or marketing. If you however embrace the social aspect of the web you can have quite an impact as the following example shows.

Colalife.org

Colalife is a great example how using the social web can have an impact on the life of people. The organisers of colalife used the social web (Facebook and Twitter) to persuade Coca Cola to ship free medication with their drinks into third world countries. Coca Cola has world-wide distribution channels with chilled storage and the Colalife people designed a special device that would allow to store contraceptives and rehydration medication in bottle crates without taking up extra space. Coca Cola agreed and now medication is delivered to those who need it for free and the company creates a ton of good will. It was one person with an idea and a network of millions to freely tell a corporation about it that made it happen.

Thanks!

That’s all I wanted to talk about, thanks.

Extra achivement

As an extra, I am happy to announce that because of my inventiveness in ordering drinks the Radisson Hotel in Toronto now offers a new cocktail of my invention called “keylime pie” which consists of Smirnoff Vanilla, Sour Mix and Soda. This was another first for me – I never had any of my inventions featured on a cocktail bar menu!

Tags: canada, dataweb, development, domainconvergence, domainers, ontario, social media, toronto, webdevelopment, yql, yslow, YUI
Posted in General | 4 Comments »

Creating Happy Little Web Sites – my tech talk at the Guardian

Saturday, June 28th, 2008

Here’s a presentation I have given today at the Guardian office in London. In it I am covering the different great ideas I found out about developing web sites. Check the presentation here:

[slideshare id=488632&doc=happylittlewebsites-1214566328957709-8&w=425]

The Guardian have recorded my talk and will release it on the Inside Guardian blog

Tags: guardian, maintainability, performance, presentation, webdevelopment, webdevtrick
Posted in beginners, bestpractice, bestpractices, development, presentation, scripting | Comments Off on Creating Happy Little Web Sites – my tech talk at the Guardian

It’s all about APIs these days.

Friday, November 2nd, 2007

It is quite cool to see the increase of coverage of the topic of web APIs. It is also very exciting to APIs finally evolving to work across several systems, aggregate and move from a one way stream of retrieving data to an alternative entry point for applications. How cool will it be for example to write reviews for amazon on movie or book sites? Having write APIs would allow us to leverage the knowledge of people on the web at the places they hang out rather than having to lure them into using a web app.

Anyways with this new API interest I had a triple release today: There is a podcast about APIs for .net magazine together with Jeremy Keith, Paul Hammond, Drew McLellan and hosted by Paul Boag, Ajaxian is featuring my ‘hack’ of the Slideshare RSS feed and I uploaded the presentations I gave at the University Hack Day introduction at Dundee, Scotland yesterday.

Enjoy!

Tags: apis, hackday, hacks, rss, slideshare, unihackday, webdevelopment
Posted in apis, hackday, hacks, rss, slideshare, unihackday, webdevelopment | Comments Off on It’s all about APIs these days.

Posts Tagged ‘webdevelopment’

Mess them up while they are young? Better web tutorials now!

The “hello world” from hell

The curse of document.write()

The curse of alert() and prompt()

The myth of the short attention span

Lipstick and a wig on a pig

Changing the approach of explaining code in web development

Things you can use – my talk at the Deveoper Evening with Yahoo and Opera in Oslo, Norway

Slides

Audio

Notes

Things you can use (by the Yahoo Developer Network and friends)

Learning the basics

Starting with a clean canvas

CSS frameworks and frontend libraries

Building interfaces that work

Using the web

Thinking data first

Mixing the web with YQL

Is this limited to Yahoo?

Scraping HTML with YQL

Extending YQL

Building with Blocks

Wanna get super famous?

Elevator pitches

Yahoo User Interface Library – YUI

Yahoo Query Language – YQL

Yahoo Application Platform – YAP

Developing *with* the web, eh? Speaking at the domain convergence in Toronto, Canada

The presentation SlideCast

Notes

Developing with the web

Build it and they will come doesn’t cut it any more

Back when this was all fields

The web 2.0 revolution

Bring the noise!

Oh, shiny object!

If you can’t fight them…

Spreading and collecting

Bringing the web to the site

Digging into the web of data.

Getting to the juicy, yummy data

Re-mixing the web.

Display them using free widgets!

All of this works using APIs

API issues

Conjuring content?

Content issues

Lack of context

UGC issues

Placement

Legal issues

Demos and resources

Play nice!

Colalife.org

Thanks!

Extra achivement

Creating Happy Little Web Sites – my tech talk at the Guardian

It’s all about APIs these days.

The curse of `document.write()`

The curse of `alert()` and `prompt()`

Developing with the web, eh? Speaking at the domain convergence in Toronto, Canada