Christian Heilmann

You are currently browsing the Christian Heilmann blog archives for December, 2009.

Archive for December, 2009

Building a (re)search interface for Yahoo, Bing and Google with YQL

Wednesday, December 9th, 2009

If you do a lot of research using web searches can be frustrating. Different search engines have different results, you need to open things in tabs and in general it can be pretty time consuming to find what you need.

To make this a bit easier I thought it’d be cool to have an interface that searches Yahoo, Google and Bing at the same time and thus I built GooHooBi:

As explained in the screen cast the thing under the hood of GooHooBi is YQL. Instead of fussing about with all the different search APIs all I did was to play with the YQL console and put together the following YQL statement:

select * from query.multi where queries=’
select Title,Description,Url,DisplayUrl
from microsoft.bing.web(20) where query=”cat”;
select title,clickurl,abstract,dispurl
from search.web(20) where query=”cat”;
select titleNoFormatting,url,content,visibleUrl
from google.search(20) where q=”cat”

The query.multi table in YQL allows you to list a few queries which will be executed one after the other on the YQL server. The queries themselves I put together by using the different tables in the console and only selecting what I really need from each of them.

You can try this query in the YQL console and you can see the JSON output.

The rest is pretty easy. Cut this up into a parameterized string and do a cURL call:

$query = filter_input(INPUT_GET, ‘search’, FILTER_SANITIZE_SPECIAL_CHARS);

$queries[] = ‘select Title,Description,Url,DisplayUrl ‘.
‘from microsoft.bing.web(20) where query=”’.$query.’”’;
$queries[] = ‘select title,clickurl,abstract,dispurl ‘.
‘from search.web(20) where query = “’.$query.’”’;
$queries[] = ‘select titleNoFormatting,url,content,visibleUrl ‘.
‘from google.search(20) where q=”’.$query.’”’;
$url = “select * from query.multi where queries=’”.join($queries,’;’).”’”;
$api = ‘http://query.yahooapis.com/v1/public/yql?q=’.
urlencode($url).’&format=json&env=store’.
‘%3A%2F%2Fdatatables.org%2Falltableswithkeys&diagnostics=false’;

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $api);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
curl_close($ch);
$data = json_decode($output);

Then loop over the results and assemble the HTML output.

You can check the source code of GooHooBi.

In addition to this, here’s a half hour live coding screencast how to build something similar:

Building a search mashup with YQL using Google, Yahoo and Bing – live :) from Christian Heilmann on Vimeo.

The source of the code built in this screencast is also on GitHub.

Using YQL to load and convert RSS feeds really, really fast.

Tuesday, December 8th, 2009

My esteemed colleague Stoyan Stefanov is currently running an advent calendar (blog post a day) on performance. Today I have a guest slot on his blog showing how you can use YQL to retrieve five RSS feeds much faster than with any other technology.

Retrieving five RSS feeds speed comparison.

As stated at the end of the article, you could use a YQL open table with embedded JavaScript to move all of the hard conversion work to the YQL server, too.

This table does exactly that. The speed of the retrieval slows down a bit with this (as YQL needs to do another request to pull the table definition):

Retrieving five RSS feeds and converting it on the server with YQL execute by  you.

However, using this table to retrieve multiple feeds as HTML is dead easy:

$data = array(
‘http://code.flickr.com/blog/feed/rss/’,
‘http://feeds.delicious.com/v2/rss/codepo8?count=15’,
‘http://www.stevesouders.com/blog/feed/rss’,
‘http://www.yqlblog.net/blog/feed/’,
‘http://www.quirksmode.org/blog/index.xml’
);
$url =’http://query.yahooapis.com/v1/public/yql?q=’;
$query = “use ‘http://github.com/codepo8/yql-rss-speed-comparison/raw/master/rss.multi.list.xml’ as m;select * from m where feeds=”’”.implode(“’,’”,$data).”’” and html=’true’ and compact=’true’”;
$url.=urlencode($query).’&format=xml&diagnostics=false’;
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$content = curl_exec($ch);
curl_close($ch);
$content = preg_replace(‘/.*
$content = preg_replace(‘/div>.*/’,’div>‘,$content);
echo $content;

To use the open table you simply need to give it the list of RSS feeds as the feeds parameter:

use ‘http://github.com/codepo8/yql-rss-speed-comparison/raw/master/rss.multi.list.xml’ as m;
select * from m where feeds=”
‘http://code.flickr.com/blog/feed/rss/’,
‘http://feeds.delicious.com/v2/rss/codepo8?count=15’,
‘http://www.stevesouders.com/blog/feed/rss’,
‘http://www.yqlblog.net/blog/feed/’,
‘http://www.quirksmode.org/blog/index.xml’
” and html=’true’ and compact=’true’

Try it out in the YQL console.

The html parameter defines if you want to get HTML back from the table. Take it out to get a list of feeds instead.

See the results as HTML (with an HTML parameter) or as feeds (without the HTML parameter).

The compact parameter defines if you want to get descriptions back for each entry or not.

See the results as HTML (without descriptions) or as HTML with descriptions.

By using the JSON-P-X output format (xml with callback) you could easily use this in JavaScript:



If you want to compare yourself, get the source code of all the examples from GitHub.

Things you can use – my talk at the Deveoper Evening with Yahoo and Opera in Oslo, Norway

Thursday, December 3rd, 2009

I am currently in (freezing) Oslo in Norway and about to leave for the NITH university for the Yahoo and Opera developer evening. Here are the slides and notes of what I am about to present tonight. Audio will follow laterAudio is now available.

For those who will attend tonight: stop right here, you’ll just spoil it for yourself :)

Slides

Audio

The audio recording is available on archive.org or as a 54MB MP3 file.

Notes

Things you can use (by the Yahoo Developer Network and friends)

In the following hour or so I will be talking to you about some of the things that have been done for you that you can build on. Web development is confusing enough as it is. There is no need to make it more complex by not using what has been done for us already. But first, a few facts about me:

I am Chris and I’ve been developing web sites for about 12 years. I’ve worked with numerous frameworks and CMS and I’ve delivered various international and high-traffic sites. I’ve written several books and dozens of articles and hundreds of blog posts on the subject of web development, accessibility, performance, maintainability, internationalization and many other things I had to battle in my day to day job.

Right now I work as a Developer Evangelist, which is a pretty new job in the market and as many people were confused about it, I wrote a Developer Evangelism Handbook which is free and creative commons online, but you can also buy a hard copy on Lulu.com.

I work for the Yahoo Developer Network and you will find most of the things I will talk about today there.

Learning the basics

The good thing about these days we live in is that we have great resources on the web to learn good basics of web development. That we still learn it by viewing source and trial and error after copy and paste is human nature but also keeps us from evolving. The Opera Web Standards curriculum is an amazing resource to learn web standards based development (as is the WaSP interact) and the Yahoo Developer Network theatre is full of videos of talks and tutorials. All of this is free to use and to build upon.

Starting with a clean canvas

One thing I learnt is that in order to deliver products that work and are fun to maintain you need to work on a solid base. Web development is hindered by the fact that our deployment environment is totally unknown to us. This is why we need to even the playing field before we should go out on the pitch to play.

You achieve this by defining what you call “support” for browsers. In the case of Yahoo this is the graded browser support document. If you aim to make your web product look and work the same on every browser out there you are doing it wrong. Web design is meant to go with the flow and accustom itself to the ability of the user agent (browser in most cases). At Yahoo, we have the Graded Browser Support for this – browsers that are not capable of supporting new technologies will not get them. Web sites are not meant to look and work the same everywhere. On the contrary – the ability to accustom the interface to different user agents is what makes web development so powerful.

CSS frameworks and frontend libraries

The next step is to free ourselves from the limitations of browsers and especially their differences. This is what CSS frameworks and front end libraries like jQuery, Mootools, Dojo, YUI and many more are for. All of these have the same goal: allow you to build code that is predictable and limited to the bare necessities. We should not have to bloat our code just to make our products work with random browser implementation problems. These are a moving target as there is all kind of weirdness happening across the board. Libraries make our job predictable and allow us to use web standards without catering for browsers. If you build your code based on libraries you can fix your product for the next browser by upgrading the library. If you choose to do everything yourself – good luck.

Building interfaces that work

The next thing to consider is that an application interface is not just the look and feel. In order to make it work for everybody we’ll need to understand the ways users interact with our products. This includes simple usability (not overloading the user, not confusing the user) but also means knowing about different interaction channels – for example keyboard users. A great resource for starting the journey towards usable interfaces is the Yahoo Design pattern library. There we collected information how our end users use the web and reach a goal easily. If anything, have a look at these patterns before you start building your first widget or application. They even come with stencils for different designer tools.

Using the web

The big change in web development over the last few years was that we stopped trying to do everything ourselves. The web is full of specialized systems such as YouTube, Flickr and Google Maps that allow you to host data in specific formats and make it dead easy for you to convert and re-use that data in web formats. Using this information happens via Application Programming Interfaces or short APIs. These allow you to demand data in a certain format and get back only what you need. There are hundreds of APIs available on the web. For an idea of what is available, check out programmableweb.com.

Thinking data first

The main thing to be aware of if you want to build great products is to separate your data from your presentation. This is essential to allow for localization, internationalization and to keep your code maintainable. In the case of a mashup of different data sources this means you need to think about making it as easy as possible for you to use different APIs. The complexity of your product increases with the number of APIs you use. Every API has different ways to authenticate, expects different parameters and returns data in different formats.

Mixing the web with YQL

YQL is a solution that Yahoo built for its own needs. All of our products are built on APIs – for scalability reasons. Having to learn all these APIs and negotiating access cost us a lot of our time so we thought we’d come up with a better solution. This solution is called YQL. YQL is a SQL-style language to get data from the web. The following query would get us photos of London from flickr:

select * from geo.places where text=”london”

Or would it? Actually it would give us photos with the text “london” and not photos taken in Oslo. If we wanted that we need to use another API. The Yahoo Geo APIs allow you to define any place on earth and get it back as a “where on earth ID” or short woeid. This format is supported by flickr, so we can use these APIs together. Twitter will also soon support this.

select * from flickr.photos.search where woe_id in (
select woeid from geo.places
where text=”london”
)

This gives you some data of the photos you want to show but not all, so let’s use another API method to get that information.

select * from flickr.photos.info where photo_id in(
select id from flickr.photos.search where woe_id in(
select woeid from geo.places where text=”london”
)

)

This is a lot of data so in order to only retrieve what we really need we can filter the data down by replacing the *:

select farm,id,secret,owner.realname,
server,title,urls.url.content
from flickr.photos.info where photo_id in(
select id from flickr.photos.search where woe_id in(
select woeid from geo.places where text=”london”
)

)

Using this query in the YQL console, choosing YQL as the output format and “flickr” as the callback will give us a valid URL to use in a browser or script:

http://query.yahooapis.com/v1/public/yql?
q=select%20*%20from%20flickr.photos.info%20where%20photo_id%20in%20(
select%20id%20from%20flickr.photos.search%20where%20woe_id%20in%20(
select%20woeid%20from%20geo.places%20where%20text%3D%22london%22
))&format=json&diagnostics=false&callback=flickr

Using this in the src attribute of a script tag and writing a few lines of Dom Scripting displays the photos.

Is this limited to Yahoo?

No, of course not. First of all you can use any data on the web as a source using the atom, csv, feed, html, json, microformats, rss and xml tables.

Scraping HTML with YQL

The HTML table is quite interesting as it allows you to get data from any HTML document, cleans it up by running it through HTML Tidy and then allows you to access parts of it using XPATH.

This allows you for example to take a simple web site like this one about TV jokes and turn it into a wiget to include into other pages with a few lines of JavaScript.

Extending YQL

You can also add your own data by providing a simple XML schema called an open table. In this schema you need to tell YQL what the data endpoint is, what parameters are expected and what gets returned.

Open tables also allow for an execute block which allows you to write JavaScript that will be executed by YQL on the server side using Rhino and has full e4x support.

Using this we can turn the earlier example of the Flickr photos returned as a list into an open table and make it much easier to get Flickr photos in the right format:

select * from flickr.photolist where text=”me” and location=”uk” and amount=20

Using the JSON-P-X output this means we can simply use innerHTML to render out photos.

If you want your tables to be available in YQL, all you need to do is to add to the open tables repository on GitHub. For more information, check out the YQL documentation.

Building with Blocks

The best thing you can do right now if you want to build a web application is using already tested and working building blocks. The Yahoo User Interface library is full of these as this is exactly how we build our own tools.

For creating layouts with CSS without having to know all the hacks that browsers need you can use the YUI grids and if you are really lazy you can even use the WYSIWYG grids builder.

Using the CSS grids and some of the YUI widgets it is very easy to build a working application that is tested across all the browsers defined in the graded browser support. Examples are this geographical Flickr search and a showcase to get information for Delhi.

One thing that is really useful about the widgets provided in YUI is that they are all driven by custom events. That way you can extend and change their functionality without having to mess around with the code. Simply write an event listener for the custom event, add your functionality and prevent the original functionality.

Another great benefit of YUI is the very detailed documentation and the hundreds of examples you can use to get you started.

Wanna get super famous?

The last thing I want to talk about today is the Yahoo Application Platform or short YAP. Using YAP you can build a web application using JavaScript, HTML and CSS and add it to the Yahoo Homepage, My Yahoo and in the future even more properties of Yahoo.

You start at and develop your application to the largest part in your Applications Dashboard. Yahoo apps have two views: a small view which is more or less static (but allows for some Ajax) and a large view which gives you full access and much wider screen space. The small view is overlayed over the Yahoo page and will show Yahoo ads next to it. The large view can be monetized by you.

One thing that can be a bit of a frustration about YAP when you go at it with a normal web development mindset is that not all CSS/HTML and JavaScript is allowed as YAP uses Caja to keep our applications secure. Therefore we’ve put together some Caja-ready code examples to get you on track.

The easiest way to build YAP apps is by using the Yahoo Markup Language (YML) and YUI as YUI was re-written to be Caja compliant.

If you want to take a look at what a YAP application looks like, check out the source of TweetTrans on GitHub. In essence it is a simple PHP API call using YQL and a YML interface to display the results using Ajax. No JavaScript involved as YML does that for us.

You can install TweetTrans by clicking the following link: http://yahoo.com/add?yapid=zKMBH94k. This is also the way to promote your own applications (simply replace the application ID with yours) until the YAP application gallery is up and running.

The more powerful way of promoting your application in Yahoo is to piggy-back on our social connections and you can do this by diving into the social graph API. The easiest way to do that is to use the social SDK also available on GitHub. Notice that the SDK will not work on a localhost – you need to run it inside the application dashboard or a Yahoo container on the homepage.

Elevator pitches

Yahoo User Interface Library – YUI

YUI is the system that Yahoo uses to build its web sites. It is constantly tested to work for the largest amount of users, free, open source and covers everything from design patterns to out-of-the-box widgets. It is modular and you can use only what you need. You can either host it yourself or get it from a network of distributed servers.

Yahoo Query Language – YQL

YQL is a web service that allows you to mash-up any data on the web from various sources with a simple SQL-style language. You can filter the data down to what you need and you can convert the data with server-side JavaScript before returning it. Data providers can use YQL to publish an API on the web on top of Yahoo’s infrastructure and cloud storage.

Yahoo Application Platform – YAP

YAP is the Yahoo Application Platform which allows you to build applications that run on the Yahoo homepage and soon other properties. You can dive into Yahoo’s social graph to promote your applications and you can create highly secure web apps as YAP uses Caja to ensure code quality.