Christian Heilmann

Analyzing the FIFA 2010 World Cup with Guardian Data and YQL

Wednesday, July 14th, 2010 at 10:20 am

Breaking news: The Guardian once again involved in committing a data awesome! As before, the UK newspaper graced developers with a really cool piece of information published on the web: all the World Cup 2010 statistics as an Excel Sheet.

Now, the easiest way to play with this data is to use YQL, so I simply took a copy of the information and shared it as a CSV document on Google Docs. That way I can use it in YQL:

select * from csv where url="http://spreadsheets.google.com/pub?
key=0AhphLklK1Ve4dEdrWC1YcjVKN0ZRbTlHQUhaWXBKdGc&single=true&gid=1&x=1&
output=csv" and columns="surname,team,position,time,shots,passes,tackles,saves"

You can try this out in the console and see the results here.

Using YQL to filter and sort this, you can do some interesting searches on that information. For example:

Who were the German midfield players?

select * from csv where url="http://spreadsheets.google.com/pub?
key=0AhphLklK1Ve4dEdrWC1YcjVKN0ZRbTlHQUhaWXBKdGc&single=true&gid=1&x=1&
output=csv" and columns="surname,team,position,time,shots,passes,tackles,saves"
and team="Germany" and position="Midfielder"

You can try this out in the console and see the results here.
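If you want to see what that filter does without the console, here is a rough Python sketch of the same idea. It assumes the CSV has no header row (which is why the column names are supplied by hand, as in the columns="..." clause), and the three sample rows are made-up placeholders, not real World Cup statistics. The helper names read_rows and filter_rows are hypothetical.

```python
import csv
import io

# Column names copied from the columns="..." clause of the query above.
COLUMNS = ["surname", "team", "position", "time",
           "shots", "passes", "tackles", "saves"]

# Made-up placeholder rows, NOT real World Cup statistics.
SAMPLE_CSV = """\
PlayerA,Germany,Midfielder,90,2,40,3,0
PlayerB,Germany,Goalkeeper,90,0,10,0,4
PlayerC,Brazil,Midfielder,45,1,20,1,0
"""

def read_rows(text):
    """Parse headerless CSV text into dicts keyed by COLUMNS (assumed layout)."""
    return list(csv.DictReader(io.StringIO(text), fieldnames=COLUMNS))

def filter_rows(rows, **conditions):
    """Keep rows whose fields equal every given value, like chained `and`."""
    return [row for row in rows
            if all(row[key] == value for key, value in conditions.items())]

rows = read_rows(SAMPLE_CSV)
german_midfielders = filter_rows(rows, team="Germany", position="Midfielder")
print([row["surname"] for row in german_midfielders])  # ['PlayerA']
```

Every value comes back as a string, which is fine for equality checks like these but matters as soon as you start sorting.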

Using sort() and reverse() you can do rankings. For example:

Who was the goalkeeper with the most saves?

(Neuer of Germany, Kingson of Ghana and Enyeama of Nigeria, in case you wonder)

select * from csv where url="http://spreadsheets.google.com/pub?
key=0AhphLklK1Ve4dEdrWC1YcjVKN0ZRbTlHQUhaWXBKdGc&single=true&gid=1&x=1&
output=csv" and
columns="surname,team,position,time,shots,passes,tackles,saves"
and position="Goalkeeper" | sort(field="saves") | reverse()

You can try this out in the console and see the results here.
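The same ranking idea can be sketched locally in Python. The rows below are made-up placeholders, not real statistics, and rank_by is a hypothetical helper. One thing the sketch makes explicit: values parsed from CSV are strings, so it converts them to integers before sorting. Otherwise "9" would rank above "12".

```python
# Made-up placeholder rows, NOT real World Cup statistics.
rows = [
    {"surname": "KeeperA", "position": "Goalkeeper", "saves": "9"},
    {"surname": "KeeperB", "position": "Goalkeeper", "saves": "12"},
    {"surname": "PlayerC", "position": "Midfielder", "saves": "0"},
    {"surname": "KeeperD", "position": "Goalkeeper", "saves": "4"},
]

def rank_by(rows, field, descending=True):
    """Sort rows by a numeric field; convert because CSV values are strings."""
    return sorted(rows, key=lambda row: int(row[field]), reverse=descending)

# Filter first, then rank: roughly sort(field="saves") followed by reverse().
keepers = [row for row in rows if row["position"] == "Goalkeeper"]
ranking = rank_by(keepers, "saves")
print([(row["surname"], row["saves"]) for row in ranking])
# [('KeeperB', '12'), ('KeeperA', '9'), ('KeeperD', '4')]
```

Passing descending=False gives the "least time on the pitch" style of question, where you sort without reversing.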

Which player spent the most time on the pitch?

select * from csv where url="http://spreadsheets.google.com/pub?
key=0AhphLklK1Ve4dEdrWC1YcjVKN0ZRbTlHQUhaWXBKdGc&single=true&gid=1&
output=csv" and
columns="surname,team,position,time,shots,passes,tackles,saves"
| sort(field="time") | reverse()

You can try this out in the console and see the results here.

Which players on the German and Brazilian teams spent the least time on the pitch?

select * from csv where url="http://spreadsheets.google.com/pub?
key=0AhphLklK1Ve4dEdrWC1YcjVKN0ZRbTlHQUhaWXBKdGc&single=true&gid=1&x=1&
output=csv" and columns="surname,team,position,time,shots,passes,tackles,saves"
and team in ("Germany","Brazil") | sort(field="time")

You can try this out in the console and see the results here.

Using the CSV output and YQL you can do all kinds of cool things with that data. As YQL also releases it as JSON, it is easy to create interactive interfaces and visualizations, too. Why don’t you have a go?
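As a starting point for a visualization, here is a small Python sketch that boils rows down to one number per team and emits JSON a chart could consume. The rows and team names are made-up placeholders, and passes_per_team is a hypothetical helper; it only assumes rows shaped like the columns used in the queries above.

```python
import json
from collections import defaultdict

# Made-up placeholder rows, NOT real World Cup statistics.
rows = [
    {"surname": "PlayerA", "team": "TeamX", "passes": "40"},
    {"surname": "PlayerB", "team": "TeamX", "passes": "25"},
    {"surname": "PlayerC", "team": "TeamY", "passes": "30"},
]

def passes_per_team(rows):
    """Sum the passes column per team; values are strings, so convert."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["team"]] += int(row["passes"])
    return dict(totals)

# Serialize with sorted keys so the output is stable between runs.
chart_data = json.dumps(passes_per_team(rows), sort_keys=True)
print(chart_data)  # {"TeamX": 65, "TeamY": 30}
```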
