Christian Heilmann

Posts Tagged ‘analysis’

Analyzing the FIFA2010 worldcup with Guardian Data and YQL

Wednesday, July 14th, 2010

Breaking news: The Guardian once again involved in committing a data awesome! As before, the UK newspaper graced developers with a really cool piece of information published on the web: all the World Cup 2010 statistics as an Excel Sheet.

Now, the easiest way to play with this data is to use YQL, so I simply took a copy of the information and shared it as a CSV document on Google Docs. That way I can use it in YQL:

select * from csv where url=”http://spreadsheets.google.com/pub?
key=0AhphLklK1Ve4dEdrWC1YcjVKN0ZRbTlHQUhaWXBKdGc&single=true&gid=1&x=1&
output=csv” and columns=”surname,team,position,time,shots,passes,tackles,saves”

You can Try this out in the console and see the results here.

Using YQL to filter and sort this, you can do some interesting searches on that information. For example:

What were the German mid field Players?

select * from csv where url=”http://spreadsheets.google.com/pub?
key=0AhphLklK1Ve4dEdrWC1YcjVKN0ZRbTlHQUhaWXBKdGc&single=true&gid=1&x=1&
output=csv” and columns=”surname,team,position,time,shots,passes,tackles,saves”
and team=”Germany” and position=”Midfielder”

You can Try this out in the console and see the results here.

Using sort() and reverse() you can do rankings. For example:

Who was the goalkeeper with the most saves?

(Neuner of Germany, Kingson of Ghana and Enyeama of Nigeria in case you wonder)

select * from csv where url=”http://spreadsheets.google.com/pub?
key=0AhphLklK1Ve4dEdrWC1YcjVKN0ZRbTlHQUhaWXBKdGc&single=true&gid=1&x=1&
output=csv” and
columns=”surname,team,position,time,shots,passes,tackles,saves”
and position=”Goalkeeper” | sort(field=”saves”) | reverse()

You can Try this out in the console and see the results here.

Which was the player that spent most time on the pitch?

select * from csv where url=”http://spreadsheets.google.com/pub?
key=0AhphLklK1Ve4dEdrWC1YcjVKN0ZRbTlHQUhaWXBKdGc&single=true&gid=1&
output=csv” and
columns=”surname,team,position,time,shots,passes,tackles,saves”
| sort(field=”time”) | reverse()

You can Try this out in the console and see the results here.

Who were the players who were the least on the pitch in the German and Brazilian teams?

select * from csv where url=”http://spreadsheets.google.com/pub?
key=0AhphLklK1Ve4dEdrWC1YcjVKN0ZRbTlHQUhaWXBKdGc&single=true&gid=1&x=1&
output=csv” and columns=”surname,team,position,time,shots,passes,tackles,saves”
and team in (“Germany”,”Brazil”) | sort(field=”time”)

You can Try this out in the console and see the results here.

Using the CSV output and YQL you can do all kind of cool things with that data – as YQL also releases it as JSON it makes it easy to create interactive interfaces and visualizations, too – why don’t you have a go?

TweetEffect – find out when people followed or left you on Twitter

Saturday, January 17th, 2009

I’ve just finished uploading and fixing TweetEffect a web app that allows you to check which of your latest http://twitter.com updates resulted in people following or leaving you:

TweetEffect Screenshot

With the demise of Qwitter there was no real update mechanism to know when you lost followers, this might be a step to fill this gap. TweetEffect does not tell you who left you, only the number of people. The reason is the draconic rate limiting of Twitter. This is also the reason why the app largely runs in JavaScript and only does checks server-side when you provide a user ID on the URL. This allows for bookmarking and sharing with others.

Check out some examples:

What do you think? Anything else to add?