Christian Heilmann

Adding transcripts to presentations embedded from SlideShare using YQL

Sunday, January 11th, 2009 at 9:37 pm

I like SlideShare a lot (yeah, repetition, I know). It is a great way of spreading your presentations as it allows others to embed them into their blogs and web sites and it also allows people to download and re-use what you’ve done.

One thing I really like about SlideShare is that it creates HTML transcripts of your presentation displayed far down the page as shown in the following screenshot:

Screenshot of slideshare showing transcripts

Now, the annoying thing is that the SlideShare API - full of goodness as it is – does not provide a way to get to these transcripts in case you want to display them alongside your presentation. You also need to authenticate for detailed information which is very needed but also bad if you just want to offer a JavaScript solution.

Good thing that there is an easy way to retrieve information from any web site out there now and get it back as JSON to use in JavaScript: YQL. Using the YQL console and some XPATH I can for example extract the transcription information and get it back as XML (the slideshare URL is http://www.slideshare.net/cheilmann/shifting-gears-presentation and the XPATH //ol/li):


http://query.yahooapis.com/v1/public/yql?q=select * from html where url%3D”http%3A%2F%2Fwww.slideshare.net%2Fcheilmann%2Fshifting-gears-presentation” and xpath%3D’%2F%2Fol%2Fli’&format=xml

If you define the format as JSON and provide a callback parameter you can easily use this to write a small script to inject transcripts into SlideShare embeds:

slideshareTranscripts = function(){
var div = document.getElementsByTagName(‘div’);
var containers = {};
for(var i=0;div[i];i++){
if(div[i].id.indexOf(‘__ss’)!==-1){
var slideurl = div[i].getElementsByTagName(‘a’)[0].href.split(‘?’)[0];
containers[slideurl]={c:div[i],id};
get(slideurl);
}

}
function get(url){
var url = ‘http://query.yahooapis.com/v1/public/yql?’ +
‘format=json&callback=slideshareTranscripts.doit&q=’ +
‘select%20strong,p%20from%20html%20where%20url%3D%22’ +
encodeURIComponent(slideurl) +
‘%22%20and%20xpath%3D%27%2F%2Fol%2Fli%27&’;
var s = document.createElement(‘script’);
s.src = url;
s.type = ‘text/javascript’;
document.getElementsByTagName(‘head’)[0].appendChild(s);
}

function doit(o){
var url = decodeURIComponent(o.query.uri).split(‘”’);
var out = document.createElement(‘ol’);
var lis = o.query.results.li;
for(var i=0,j=lis.length;i var li = document.createElement(‘li’);
var strong = document.createElement(‘strong’);
var p = document.createElement(‘p’);
strong.appendChild(document.createTextNode(lis[i].strong));
p.appendChild(document.createTextNode(lis[i].p));
li.appendChild(strong);
li.appendChild(p);
out.appendChild(li);
}

containers[url[1]].c.appendChild(out);
}

return{doit:doit}
}();

You can see the script in action and download it for your own use. All you need to do is add it to the bottom of any document with one or several SlideShare embed codes in it. Here’s what the script does:

slideshareTranscripts = function(){
var div = document.getElementsByTagName(‘div’);

We get all the DIV elements in the page (there is probably a faster way using CSSQuery these days, but let’s be bullet-proof).

In order to find out which one of the DIVs is a SlideShare embed, we check for an ID that contains __ss as the embed codes have a generated ID starting with this.

What we will do with each of these is to find out what the url of the slideshow is. This is needed because of two reasons: first of all we want to retrieve the transcript and secondly we need a way of matching the returned data from YQL with the right DIV container.

As generated script nodes are not safe to load one after the other this makes sure that the right transcript gets added to the right slides.

So, what we do is check the ID of the DIV, get the first link, retrieve the url from it, store the DIV in an object called containers and use the url as the property name. This property gets a shortcut to the correct DIV as its value. All we need to do then is to get the transcript data via get():


var containers = {};
for(var i=0;div[i];i++){
if(div[i].id.indexOf(‘__ss’)!==-1){
var slideurl = div[i].getElementsByTagName(‘a’)[0].href.split(‘?’)[0];
containers[slideurl]={c:div[i],id};
get(slideurl);
}

}

The get() method is your garden variety script node generation function that calls the YQL API and defines slideshareTranscripts.doit() as the callback parameter. This means that the script will generate a script node, point it to YQL, YQL then gets the transcript information from SlideShare, turns it into JSON and calls doit() with the transcript information as a parameter.


function get(url){
var url = ‘http://query.yahooapis.com/v1/public/yql?’ +
‘format=json&callback=slideshareTranscripts.doit&q=’ +
‘select%20strong,p%20from%20html%20where%20url%3D%22’ +
encodeURIComponent(slideurl) +
‘%22%20and%20xpath%3D%27%2F%2Fol%2Fli%27&’;
var s = document.createElement(‘script’);
s.src = url;
s.type = ‘text/javascript’;
document.getElementsByTagName(‘head’)[0].appendChild(s);
}

The job of doit() is to generate the right markup and inject it into the embed codes. We create an ordered list and loop over all the li elements. We create the fitting HTML code for each of the list items and add the content as text nodes (innerHTML would try to render markup used in the slides – not a good idea).

We then retrieve the URL from the uri property and add the list as a new child node to the container div stored previously in the container object.

nt
function doit(o){
var out = document.createElement(‘ol’);
var lis = o.query.results.li;
for(var i=0,j=lis.length;i var li = document.createElement(‘li’);
var strong = document.createElement(‘strong’);
var p = document.createElement(‘p’);
strong.appendChild(document.createTextNode(lis[i].strong));
p.appendChild(document.createTextNode(lis[i].p));
li.appendChild(strong);
li.appendChild(p);
out.appendChild(li);
}

var url = decodeURIComponent(o.query.uri).split(‘”’);
containers[url[1]].c.appendChild(out);
}

return{doit:doit}
}();

That’s it. Of course you can simplify this by matching only the OL and using innerHTML to write out the list in one go, but this solutions also allows you to alter the HTML output of the list to – for example – make every http://example.com a link.

Happy transcribing!

Tags: , , , , , ,

Share on Mastodon (needs instance)

Share on Twitter

My other work: