Christian Heilmann

Going a little crazy – one HTTP request RSS reader in JavaScript

Monday, December 21st, 2009 at 1:07 am

[Image: the joker and two face by ♠NiJoKeR♣]

Ok, using YQL and playing around with the console can make you go a bit too far.

A few days ago, in response to my 24 ways article on YQL, my friend Jens Grochtdreis asked me how to get the thumbnails and some other data from the Slideshare site in one YQL request. He tried multiple XPath filters until I pointed out that there is a perfectly valid RSS feed with thumbnails.

That made me wonder why we should have to care about detecting a feed at all, instead of simply using it when it is there and letting the computer do the detection for us. What I wanted to do was to take some HTML (an element with the ID feeds that holds one link per site) and turn it automatically into a list with the feed data as embedded lists.

The ungodly YQL request I came up with was the following:

select
  title,link,content.thumbnail,thumbnail,description
from feed where url in (
  select href from html where url in (
    "http://wait-till-i.com",
    "http://flickr.com/photos/codepo8",
    "http://slideshare.com/cheilmann",
    "http://youtube.com/chrisheilmann"
  ) and
  xpath="//link[contains(@type,'rss')][1]")
|unique(field="link")

What is going on here? I am using the html table to read in each of the resources I want to analyse:

select * from html where url in (
  "http://wait-till-i.com",
  "http://flickr.com/photos/codepo8",
  "http://slideshare.com/cheilmann",
  "http://youtube.com/chrisheilmann"
)

Then I use XPath to return the first link element whose type attribute contains the string rss. In YQL I select only its href attribute.

select href from html where url in (
  "http://wait-till-i.com",
  "http://flickr.com/photos/codepo8",
  "http://slideshare.com/cheilmann",
  "http://youtube.com/chrisheilmann"
) and
xpath="//link[contains(@type,'rss')][1]"

Notice the joy that is XPath syntax: positions start at 1, so [1] means the first match, even though every developer knows that 0 is the first! We then use the feed table to get the feed information from each of these hrefs as urls:

select
  title,link,content.thumbnail,thumbnail,description
from feed where url in (
  select href from html where url in (
    "http://wait-till-i.com",
    "http://flickr.com/photos/codepo8",
    "http://slideshare.com/cheilmann",
    "http://youtube.com/chrisheilmann"
  ) and
  xpath="//link[contains(@type,'rss')][1]")
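Read as plain JavaScript, the XPath filter in the inner query does roughly the following. This is only an analogy and not how YQL works internally: it assumes every link element has been reduced to a hypothetical record with type and href properties, and firstRssHref is a made-up name.

```javascript
// Hypothetical helper: mimics //link[contains(@type,'rss')][1] on plain records.
function firstRssHref(links) {
  for (var i = 0; i < links.length; i++) {
    var type = links[i].type || '';
    // contains() in XPath is a case-sensitive substring test
    if (type.indexOf('rss') !== -1) {
      // XPath position 1 corresponds to the first match, i.e. index 0 here
      return links[i].href;
    }
  }
  return null; // the page does not advertise an RSS feed
}

// Made-up sample data standing in for the link elements of one page
var links = [
  { type: 'text/css', href: '/style.css' },
  { type: 'application/rss+xml', href: '/feed/' },
  { type: 'application/rss+xml', href: '/comments/feed/' }
];
var feed = firstRssHref(links);
```

With the sample data, feed ends up as '/feed/': the stylesheet is skipped and the second RSS link is never looked at.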

The last problem was that Flickr returns each photo item several times this way, as it has a feed for the URL of the photo and one for the link to the license of the photo. Therefore we need to use unique() to get only the first of these:

select
  title,link,content.thumbnail,thumbnail,description
from feed where url in (
  select href from html where url in (
    "http://wait-till-i.com",
    "http://flickr.com/photos/codepo8",
    "http://slideshare.com/cheilmann",
    "http://youtube.com/chrisheilmann"
  ) and
  xpath="//link[contains(@type,'rss')][1]")
|unique(field="link")
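To make clear what the unique() step buys us, here is a minimal client-side sketch of the same idea. It assumes that unique keeps the first item for each distinct link, as described above; uniqueByField and the sample items are hypothetical and not part of YQL.

```javascript
// Hypothetical helper: keep only the first item for each distinct value of a field.
function uniqueByField(items, field) {
  var seen = {};
  var out = [];
  for (var i = 0; i < items.length; i++) {
    var key = String(items[i][field]);
    if (!Object.prototype.hasOwnProperty.call(seen, key)) {
      seen[key] = true;
      out.push(items[i]);
    }
  }
  return out;
}

// Made-up items: the same photo arrives twice with an identical link
var items = [
  { title: 'Photo (photo page)', link: 'http://example.com/photo/1' },
  { title: 'Photo (license)', link: 'http://example.com/photo/1' },
  { title: 'Other', link: 'http://example.com/photo/2' }
];
var deduped = uniqueByField(items, 'link');
```

Doing this inside the query rather than in the browser means the duplicates never travel over the wire.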

So, this actually does what we want: all the different requests are rolled into one HTTP request, and we only need some JavaScript to display the result. The data coming back is a mess, as it is just one flat array of items, so we need to loop over it and check the link of each item to know when to move on to the next list item.

This is very quick and dirty:

var x = document.getElementById('feeds');
var containers = [];
if(x){
  var links = x.getElementsByTagName('a');
  var urls = [];
  // remember the parent of each link as the container for its feed list
  for(var i=0,j=links.length;i<j;i++){
    containers.push(links[i].parentNode);
    urls.push(links[i].getAttribute('href'));
  }

  var yql = 'select title,link,content.thumbnail,thumbnail,'+
            'description from feed where url in (select href '+
            'from html where url in ("'+urls.join('","')+'") and'+
            ' xpath="//link[contains(@type,\'rss\')][1]")'+
            '|unique(field="link")';
  var api = 'http://query.yahooapis.com/v1/public/yql?q='+
            encodeURIComponent(yql)+'&format=json&callback=foo';
  // JSONP: the generated script calls foo() with the result
  var s = document.createElement('script');
  s.setAttribute('src',api);
  document.getElementsByTagName('head')[0].appendChild(s);
}
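The URL assembly above can also be pulled out into two small functions, which makes it easier to check the query before sending it. This is a sketch only: buildFeedQuery and buildYqlUrl are hypothetical names, the endpoint and query text are simply the ones used in this post, and the URLs are not escaped, so a double quote inside one of them would break the query.

```javascript
// Hypothetical helper: builds the YQL statement for a list of page URLs.
function buildFeedQuery(urls) {
  return 'select title,link,content.thumbnail,thumbnail,description ' +
    'from feed where url in (select href from html where url in ("' +
    urls.join('","') + '") and ' +
    'xpath="//link[contains(@type,\'rss\')][1]")' +
    '|unique(field="link")';
}

// Hypothetical helper: wraps the statement into a JSONP request URL.
function buildYqlUrl(urls, callbackName) {
  return 'http://query.yahooapis.com/v1/public/yql?q=' +
    encodeURIComponent(buildFeedQuery(urls)) +
    '&format=json&callback=' + encodeURIComponent(callbackName);
}

var api = buildYqlUrl(['http://wait-till-i.com'], 'foo');
```

The result can be used as the src of the generated script element in the same way as the api variable above.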

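function foo(o){
  var items = o.query.results.item;
  var c = 0;
  var out = '';
  // The markup strings below are a reconstruction: a list item with a
  // linked title and, where available, a thumbnail image.
  for(var i=0,j=items.length;i<j;i++){
    out += '<li><a href="'+items[i].link+'">'+items[i].title+'</a>';
    if(items[i].thumbnail || items[i].content){
      var thumb = items[i].thumbnail || items[i].content.thumbnail;
      // assumption: the thumbnail is an object with a url, or a plain string
      if(thumb){
        out += '<img src="'+(thumb.url || thumb)+'" alt="">';
      }
    } else {
      if(items[i].description.indexOf('src')!=-1){
        // fall back to the first image embedded in the description
        var thumb = items[i].description.split('src="')[1];
        thumb = thumb.split('"')[0];
        out += '<img src="'+thumb+'" alt="">';
      }
    }
    out += '</li>';
    // a different link prefix means the next item belongs to the next feed
    if((items[i+1] && items[i+1].link.substr(0,20) !=
        items[i].link.substr(0,20))){
      containers[c].innerHTML+='<ul>'+out+'</ul>';
      c++;
      out='';
    }
  }
  containers[c].innerHTML+='<ul>'+out+'</ul>';
}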
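The "check the link to know when to move on" trick is easier to see without the HTML generation around it. The following is a minimal sketch under the same assumption as the code above, namely that items of one feed arrive in one run and share the first 20 characters of their link; groupByLinkPrefix and the sample links are hypothetical.

```javascript
// Hypothetical helper: split a flat item list into runs that share a link prefix.
function groupByLinkPrefix(items, prefixLength) {
  var groups = [];
  var current = [];
  for (var i = 0; i < items.length; i++) {
    current.push(items[i]);
    var next = items[i + 1];
    // close the group at the end of the list or when the prefix changes
    if (!next || next.link.slice(0, prefixLength) !==
        items[i].link.slice(0, prefixLength)) {
      groups.push(current);
      current = [];
    }
  }
  return groups;
}

// Made-up items from two different sites
var sample = [
  { link: 'http://wait-till-i.com/2009/a' },
  { link: 'http://wait-till-i.com/2009/b' },
  { link: 'http://www.flickr.com/photos/1' }
];
var groups = groupByLinkPrefix(sample, 20);
```

This also shows the weak spot of the approach: two sites whose URLs share the first 20 characters would end up in one group, and a feed without items shifts every later group into the wrong container.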

However, the bad news is that this is pretty pointless, as the performance is terrible. That is not really surprising if you see what the YQL servers have to do and how much data gets loaded and analysed.

[Image: pointless performance]

You could of course cache the result locally and thus get the loading time down to a very small amount. However, if you go this way you might as well go fully server-side.
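As a rough idea of what such a local cache could look like, here is a minimal in-memory sketch with a time-to-live. Everything in it is an assumption for illustration: createCache is a hypothetical name, the clock is passed in so the expiry can be tried out without waiting, and in a browser you would need to back the store with something persistent for it to survive a page load.

```javascript
// Hypothetical cache: stores one value per key and forgets it after ttlMs.
function createCache(ttlMs, now) {
  var store = {};
  // default clock: current time in milliseconds
  now = now || function () { return new Date().getTime(); };
  return {
    get: function (key) {
      var entry = Object.prototype.hasOwnProperty.call(store, key) ? store[key] : null;
      if (!entry || now() - entry.time > ttlMs) {
        return null; // nothing cached, or the entry is too old
      }
      return entry.value;
    },
    set: function (key, value) {
      store[key] = { time: now(), value: value };
    }
  };
}

// Usage with a fake clock: keep a result for one second
var fakeTime = 0;
var cache = createCache(1000, function () { return fakeTime; });
cache.set('feeds', ['item one', 'item two']);
var fresh = cache.get('feeds');
fakeTime = 1001;
var stale = cache.get('feeds');
```

Here fresh holds the stored array and stale is null, as the entry is older than the allowed second by the time of the second lookup.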

I am currently working on making icant.co.uk perform much faster, so watch this space for a generic RSS displayer :)
