Christian Heilmann

Yahoo BOSS keyword extraction API wrappers (JS/PHP)

Thursday, November 13th, 2008 at 3:49 pm

One of my favourite “old school” Yahoo APIs is the term extractor, a service that extracts relevant keywords from a text you give it.

Yahoo BOSS now supports this feature for indexed web sites. Normally a search just returns a list of sites, for example:

http://boss.yahooapis.com/ysearch/web/v1/donkeys?format=xml&appid={appid}

You can get the keywords for each of the pages returned by adding the (so far undocumented) view=keyterms parameter:

http://boss.yahooapis.com/ysearch/web/v1/donkeys?format=xml&view=keyterms&appid={appid}
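As a rough illustration, the two URLs above could be assembled like this. The helper name buildBossUrl and its arguments are my own invention for this sketch; only the URL pattern and the view=keyterms parameter come from the examples above.

```javascript
// Hypothetical helper (not part of BOSS or of the wrappers in this post):
// builds the web search URL shown above, optionally adding view=keyterms.
function buildBossUrl(term, appid, withKeyterms) {
  const params = ["format=xml"];
  if (withKeyterms) {
    // The (so far undocumented) parameter that adds keywords to each result.
    params.push("view=keyterms");
  }
  params.push("appid=" + encodeURIComponent(appid));
  return (
    "http://boss.yahooapis.com/ysearch/web/v1/" +
    encodeURIComponent(term) +
    "?" +
    params.join("&")
  );
}
```

Calling buildBossUrl("donkeys", appid, true) with your own application ID gives the second URL; passing false gives the first.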

This is a handy way to get a list of keywords related to a certain term.

To make this easier, I’ve written a small API wrapper in PHP and JavaScript that gets the related terms from the first ten search results and returns them as an array.
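The merging step can be pictured roughly as follows. This is a sketch under assumptions, not the code from the download: it assumes the keywords of each result have already been pulled out of the response into one array per result, and the function name collectTerms and the case-insensitive de-duplication are my own choices.

```javascript
// Sketch: merge the keyword lists of the first ten results into one array.
// Assumes termLists is an array of string arrays, one per search result.
function collectTerms(termLists) {
  const seen = new Set();
  const merged = [];
  // Only look at the first ten results, as described in the post.
  for (const list of termLists.slice(0, 10)) {
    for (const term of list) {
      const cleaned = term.trim();
      const key = cleaned.toLowerCase();
      // Skip empty strings and terms we have already collected.
      if (key !== "" && !seen.has(key)) {
        seen.add(key);
        merged.push(cleaned);
      }
    }
  }
  return merged;
}
```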

The PHP API wrapper

The PHP version takes three parameters: the mandatory term to search for, an optional callback method name to wrap around the JSON return value and an optional format parameter that can be set to HTML to return an HTML list instead of a JSON object.
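To show how the three parameters interact, here is a small JavaScript sketch of the output logic. It is not the PHP source: the function names, the case-insensitive check for "html" and the escaping are assumptions made for illustration.

```javascript
// Escape characters that would break the HTML list.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Sketch of the described behaviour: terms is the keyword array,
// callback an optional wrapper name, format an optional "html" switch.
function renderTerms(terms, callback, format) {
  if (String(format || "").toLowerCase() === "html") {
    const items = terms.map(function (t) {
      return "<li>" + escapeHtml(t) + "</li>";
    });
    return "<ul>" + items.join("") + "</ul>";
  }
  const json = JSON.stringify(terms);
  // With a callback name the JSON is wrapped so a script node can run it.
  return callback ? callback + "(" + json + ")" : json;
}
```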

The JavaScript API wrapper

The JavaScript wrapper uses dynamically generated script nodes to retrieve the data and can be used by simply calling a BOSSTERMS.get() method with a search term and the name of a callback method. The return object has a term property, the keywords as an array and a string that is an HTML list of the terms.
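The script node technique itself looks roughly like this. It is a minimal sketch rather than the BOSSTERMS code: the endpoint URL, the term and callback parameter names and the getTerms function are all placeholders I made up.

```javascript
// Hypothetical endpoint; the real wrapper's URL is not given in this post.
const ENDPOINT = "https://example.com/bossterms.php";

// Assumed parameter names "term" and "callback".
function buildRequestUrl(term, callbackName) {
  return (
    ENDPOINT +
    "?term=" + encodeURIComponent(term) +
    "&callback=" + encodeURIComponent(callbackName)
  );
}

// Adds a script node to the head; once the script loads, the server's
// response calls the global function named by callbackName.
// doc defaults to the browser's document and can be swapped for testing.
function getTerms(term, callbackName, doc) {
  doc = doc || document;
  const script = doc.createElement("script");
  script.src = buildRequestUrl(term, callbackName);
  doc.getElementsByTagName("head")[0].appendChild(script);
  return script;
}

// Usage in a browser:
// function showTerms(data) { console.log(data.term); }
// getTerms("donkeys", "showTerms");
```

The callback has to be a global function, because the returned script calls it by name.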

Get the lot

You can download the whole BOSS keyword API here. As always, it is BSD licensed, so go nuts using it :)
