We released Calais a little less than nine months ago. It’s been a fascinating process and an edifying period.

On the one hand, we’ve seen a level of interest and adoption well beyond anything we’d anticipated: 6,000 registered developers, well over 1,000,000 transactions per day and dozens of creative and inspirational applications. It’s been great.

On the other, we have been reminded that semantically enabling the web is primarily a challenge of critical mass. Publishers are waiting for semantic consumers (search engines, news aggregators and applications) before they work on adding semantic metadata to their content. Meanwhile, application developers are waiting for the publishers to act.

We know we’ll get there in the end – but it’s slower than we’d like to see.

SemanticProxy is our attempt to jumpstart the semantic consumer end of the equation. We have all the standards we need.  What we’re missing is a critical mass of semantically enhanced content.

SemanticProxy doesn’t solve that problem, but it can act as a catalyst.

SemanticProxy makes any web site – particularly news sites – behave like a semantically enabled web site. Instead of making you write the programs to fetch a page, clean the HTML, process it with Calais and then get the resulting RDF, SemanticProxy does the heavy lifting.

You hand it a URL, and SemanticProxy hands back rich semantic metadata as RDF or Microformats.

SemanticProxy follows the standards for publishing linked data on the web – a good overview of which can be found here.

The best way to experiment with it is to get a Calais API key and use the URL Builder. Copy the resulting URL, paste it into your browser and you’ll see the results.
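If you would rather assemble the request URL in code than copy it from the URL Builder, a minimal Python sketch might look like the following. The endpoint layout (base address, then key, then output format, then the page URL) and the "rdf" format name are assumptions made for illustration, and build_proxy_url is a hypothetical helper; treat whatever the URL Builder produces as the authoritative form.

```python
from urllib.parse import quote

# ASSUMPTION: illustrative endpoint layout, not confirmed by this post.
# Compare against the URL Builder output before relying on it.
ASSUMED_BASE = "http://service.semanticproxy.com/processurl"


def build_proxy_url(api_key: str, target_url: str, output_format: str = "rdf") -> str:
    """Compose a request URL from an API key, an output format and a page URL."""
    if not api_key:
        raise ValueError("an API key is required")
    # Leave ':' and '/' readable; percent-encode everything else (e.g. '?' and '=').
    encoded_target = quote(target_url, safe=":/")
    return f"{ASSUMED_BASE}/{api_key}/{output_format}/{encoded_target}"


if __name__ == "__main__":
    # "MY_KEY" is a placeholder, not a real key.
    print(build_proxy_url("MY_KEY", "http://example.com/story?id=1"))
```

Whether the target URL needs percent-encoding at all is another assumption; if the URL Builder leaves it raw, drop the quote call.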

While doing that will show you what’s going on, SemanticProxy is really meant for machines to talk to. You could construct a simple web crawler that fetches the semantic content of each page. You could build a browser plugin that exposes the underlying semantic content of a page while you’re browsing. We’re looking forward to seeing your ideas.
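To make the crawler idea concrete, here is a rough Python sketch. It assumes you already have a function that turns a page URL into a SemanticProxy request URL (for example, one that follows the pattern the URL Builder gives you); crawl_semantics, fetch_text and to_proxy_url are hypothetical names chosen for this example, not part of any published client.

```python
from typing import Callable, Dict, Iterable, Optional
from urllib.request import urlopen


def fetch_text(url: str, timeout: float = 30.0) -> str:
    """Download a URL and return the response body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl_semantics(
    page_urls: Iterable[str],
    to_proxy_url: Callable[[str], str],
    fetch: Callable[[str], str] = fetch_text,
) -> Dict[str, Optional[str]]:
    """Fetch the semantic metadata for each page, once per distinct URL.

    to_proxy_url is supplied by the caller and maps a page URL to the
    request URL for the service (ASSUMPTION: one request per page).
    Pages whose request fails are recorded as None instead of aborting.
    """
    results: Dict[str, Optional[str]] = {}
    for page_url in page_urls:
        if page_url in results:
            continue  # skip duplicates so each page costs one transaction
        try:
            results[page_url] = fetch(to_proxy_url(page_url))
        except OSError:
            # urllib's network errors (URLError, timeouts) derive from OSError.
            results[page_url] = None
    return results
```

Passing the fetch function in as a parameter keeps the loop easy to try out with a stub before pointing it at the live service. A real crawler would also want rate limiting and link discovery, which this sketch leaves out.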

It’s still in beta. We’ve optimized it for the top 30 English-language news sites, but it also works quite well on Wikipedia and other sites. Go forth and experiment.

We’ve designed SemanticProxy to scale well with demand. It runs almost entirely in the cloud.  It can handle tens of millions of transactions a day right now, and it can scale to hundreds of millions whenever we need to.

Visit SemanticProxy.com and let us know what you think. We’d appreciate feedback, ideas and critiques.
