Here’s what’s up. Over the last several months you may have read about partnerships between OpenCalais and organizations like CNET/CBSi, The Huffington Post, Magus, Associated Content, and a variety of others. And we, of course, think those are great.

But we realized we’re not doing a good enough job of sharing some of the basic ways that people are getting value out of OpenCalais. We have the great opportunity to talk to a wide range of individuals and organizations every week, and we think about this stuff – a lot – ourselves. So the purpose of this blog entry (and more to come) is simply to start sharing our thoughts on what you can do with OpenCalais. We can’t share everything (NDAs and all that) – but we can share a lot.

This post is going to cover two things: 1) What OpenCalais does at a ridiculously high level and 2) a few starter ideas for what you can do with it. Follow-on posts will cover additional ideas and we’ll try to keep them coming at a good clip.

What Does Calais Do?

Several things actually.

  • It analyzes text you send it and extracts entities (people, organizations, geographies, etc.). In many cases, it links those entities to the world of Linked Data.
  • It extracts facts – like the fact that John Doe is the CEO of Acme Corporation.
  • It extracts events – like mergers, earnings announcements, natural disasters and a bunch of others.
  • It attaches a topic to the text as a whole, much like a newspaper would (Sports, Finance, Health, etc.).
  • It creates SocialTags – our attempt to “tag” the article the way a human would to file it away somewhere.

There is a whole lot more going on in the background – but those are the basics.

Why is OpenCalais Unique?

  • First, there are a lot of entity extraction tools out there. Some are good – some not so great.  Entity extraction is fine (if a little mundane) – but the real power of understanding text comes from understanding Facts and Events. Calais is the only tool that does that well.  It may be the only tool that does that at all.
  • Second, it is high performance. We process millions and millions of transactions every day for a diverse set of users. On average they take less than 0.75 seconds to complete.
  • Third, it’s free for up to 50,000 submissions per day for commercial or non-commercial purposes. If you need more or want an SLA we have commercial options and have deals in place with clients who process millions of transactions per day.
  • Fourth, you can count on it being here. OpenCalais is provided by Thomson Reuters – the world’s largest content company. We’re not going anywhere – so you can feel confident building a business or solution on top of OpenCalais technology.

How Can I Use OpenCalais?

We’re not going to try and structure this as use cases for different groups (publishers, bloggers, museums, etc.) – rather we’re just going to talk about the general types of things you can do with OpenCalais. You’ll have to do the translation to your situation on your own. We’re also not going to put out the complete technical recipe – just the general concept.  And – no smoke and mirrors – this is all stuff you can do today.

The first round…

Triage — A simple use that saves time and money. If you’re faced with a large influx of content (say press releases) and you have a staff reviewing them for material that might be relevant to you – OpenCalais can probably help. Send the document to OpenCalais, get back the metadata and apply a few business rules. If you only care about mergers and acquisitions, then filter for that type of event and throw the rest away. We have real-world cases where this has reduced the volume of material to be processed by 60% and improved the accuracy of results by 10%.
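Here is a minimal sketch of that filter in Python. It assumes you have already sent each document to OpenCalais and flattened the response into a plain dict. The `events` key, the `"Acquisition"` and `"Merger"` labels, and the `triage` function are illustrative names, not the actual OpenCalais response format.

```python
# Hypothetical, simplified metadata records. The real OpenCalais response
# is richer and shaped differently; map it into something like this first.
documents = [
    {"id": "pr-001", "events": ["Acquisition"]},
    {"id": "pr-002", "events": ["ProductRelease"]},
    {"id": "pr-003", "events": ["Merger", "PersonCareer"]},
    {"id": "pr-004", "events": []},
]

# Business rule: the event types we care about (assumed labels).
WANTED_EVENTS = {"Acquisition", "Merger"}


def triage(docs, wanted=WANTED_EVENTS):
    """Keep only documents that mention at least one wanted event type."""
    return [doc for doc in docs if wanted.intersection(doc["events"])]


kept = triage(documents)
print([doc["id"] for doc in kept])
```

Everything that does not come back from `triage` can be dropped before a human ever looks at it.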

Workflow — Triage with a twist. Use the metadata returned by OpenCalais to route documents to the right person and/or system based on the facts and events inside the document.
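One way to picture the routing step is a lookup table from event type to destination. The sketch below assumes the same kind of flattened metadata dict as a triage filter would use; the queue names, event labels, and the `route` function are all made up for illustration.

```python
# Assumed routing table: event type -> destination queue (all names made up).
ROUTES = {
    "Acquisition": "m-and-a-desk",
    "Merger": "m-and-a-desk",
    "NaturalDisaster": "breaking-news",
    "EarningsAnnouncement": "markets-desk",
}
DEFAULT_QUEUE = "general-review"


def route(doc):
    """Return the sorted list of queues a document should be sent to."""
    queues = {ROUTES[event] for event in doc["events"] if event in ROUTES}
    # Documents with no recognized events fall through to a default queue.
    return sorted(queues) or [DEFAULT_QUEUE]


doc = {"id": "pr-010", "events": ["Merger", "EarningsAnnouncement"]}
print(route(doc))
```

A document can land in more than one queue, which is usually what you want when a single press release covers both a merger and an earnings announcement.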

Content Enhancement — There’s a whole world of Linked Data out there and OpenCalais can be your entry point. For example – take in press releases, and extract the companies mentioned in them. Use OpenCalais’ Linked Data entry points to get the SIC codes and the link to DBpedia. Access DBpedia and enhance your content with other information about the company like locations, people, products. Access GeoNames to figure out what region the company is located in. Take that enhanced content and do cool things (like triage and workflow and presentation) with it.

Alerting — Give users the ability to be alerted when certain types of content become available. Unlike simple keyword alerting, with OpenCalais + Linked Data you can construct alerts like, “Tell me when there is M&A activity for a company in the Steel industry.”
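An alert like that is really two conditions: an event type and an industry. The sketch below assumes each document has already been processed and enriched so that every company carries an `industry` value (for example, derived from Linked Data). The record shape, the labels, and the `AlertRule` and `matches` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    """A user's alert: a set of event types plus an industry."""
    event_types: frozenset
    industry: str


def matches(rule, doc):
    """True if the document has a wanted event and mentions a company in the industry."""
    has_event = bool(rule.event_types & set(doc["events"]))
    in_industry = any(c["industry"] == rule.industry for c in doc["companies"])
    return has_event and in_industry


# Hypothetical enriched records; not the real OpenCalais output format.
docs = [
    {"id": "n-1", "events": ["Acquisition"],
     "companies": [{"name": "Acme Steel", "industry": "Steel"}]},
    {"id": "n-2", "events": ["Acquisition"],
     "companies": [{"name": "Globex Foods", "industry": "Food"}]},
    {"id": "n-3", "events": ["ProductRelease"],
     "companies": [{"name": "Acme Steel", "industry": "Steel"}]},
]

steel_m_and_a = AlertRule(frozenset({"Merger", "Acquisition"}), "Steel")
print([d["id"] for d in docs if matches(steel_m_and_a, d)])
```

Note that this simple version checks both conditions at the document level. It does not verify that the steel company is the one involved in the deal; a stricter rule would need the participants of each event.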

Media Monitoring — Whether you’re a media monitoring company or do it for your own company – it just got easier. Take in a content feed (social media, press releases, news), use OpenCalais to categorize and organize it, put the results in a database and set some trigger levels.
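As a rough illustration of the "database plus trigger levels" idea, the sketch below stores one row per company mention in an in-memory SQLite database and asks which companies have reached a threshold. The table layout, the sample rows, and the threshold of 3 are assumptions chosen for the example.

```python
import sqlite3

# Hypothetical (item_id, company) pairs, as if extracted from a content feed.
rows = [
    ("item-1", "Acme Corporation"),
    ("item-2", "Acme Corporation"),
    ("item-3", "Globex"),
    ("item-4", "Acme Corporation"),
]
TRIGGER_LEVEL = 3  # assumed threshold: alert at three or more mentions

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mentions (item_id TEXT, company TEXT)")
conn.executemany("INSERT INTO mentions VALUES (?, ?)", rows)


def companies_over(conn, level):
    """Return (company, mention_count) pairs at or above the trigger level."""
    query = (
        "SELECT company, COUNT(*) FROM mentions "
        "GROUP BY company HAVING COUNT(*) >= ? ORDER BY company"
    )
    return conn.execute(query, (level,)).fetchall()


print(companies_over(conn, TRIGGER_LEVEL))
```

In practice you would probably add a timestamp column and count mentions per time window, so the trigger reflects a spike rather than a lifetime total.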

Content Harmonization — Are you managing content from diverse sources? OpenCalais can take content from multiple sources (different news feeds, different museum collections, etc.) and apply a consistent set of metadata tags to all of it. Pop it in your content management system and you can treat it all as one harmonized content asset.

Automated News Portals — Want to create a general purpose news portal? Or maybe one that deals only with baseball news? Great. Subscribe to and/or acquire some content sources, and feed them through OpenCalais. Then use the metadata to throw away what you don’t care about and to organize the rest by topic, geography, person – whatever. A great example of an off-the-shelf solution that does this is OpenPublish.
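The "throw away and organize" step can be sketched in a few lines. This assumes each article already carries a single `topic` value taken from its metadata; the sample articles and the `build_portal` function are invented for illustration.

```python
from collections import defaultdict

# Hypothetical articles with a topic already attached from the metadata.
articles = [
    {"title": "Yankees clinch division", "topic": "Sports"},
    {"title": "Steel merger announced", "topic": "Finance"},
    {"title": "Red Sox trade rumors", "topic": "Sports"},
    {"title": "New flu guidance", "topic": "Health"},
]


def build_portal(articles, keep_topics):
    """Drop articles outside keep_topics and group the rest by topic."""
    sections = defaultdict(list)
    for article in articles:
        if article["topic"] in keep_topics:
            sections[article["topic"]].append(article["title"])
    return dict(sections)


print(build_portal(articles, {"Sports", "Finance"}))
```

The same grouping works for geography or person; only the key you group on changes.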

Finer-Grained / Higher-Value Syndication — Do you have content consumers via RSS or other syndication methods? Give them a better experience by allowing them to create their own channels based on OpenCalais metadata. Create channels based on region, types of events, companies, etc. – or any combination of those and other items.
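A user-defined channel can be modeled as a set of facets, where an item must satisfy every facet the channel sets. The sketch below assumes each item carries lists under `regions`, `events`, and `companies`; those keys and the `in_channel` function are hypothetical.

```python
def in_channel(channel, item):
    """An item belongs to a channel if it matches every facet the channel sets.

    Within a facet, any one of the wanted values is enough (OR);
    across facets, all must match (AND).
    """
    return all(
        set(wanted) & set(item.get(facet, []))
        for facet, wanted in channel.items()
    )


# Hypothetical items, as if tagged from metadata.
items = [
    {"id": "a", "regions": ["Europe"], "events": ["Merger"],
     "companies": ["Acme Corporation"]},
    {"id": "b", "regions": ["Asia"], "events": ["Merger"],
     "companies": ["Globex"]},
    {"id": "c", "regions": ["Europe"], "events": ["ProductRelease"],
     "companies": ["Acme Corporation"]},
]

# A subscriber's channel: European M&A news.
channel = {"regions": ["Europe"], "events": ["Merger", "Acquisition"]}
print([item["id"] for item in items if in_channel(channel, item)])
```

Each subscriber's feed is then just the items for which `in_channel` returns true for their channel definition.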

SEO — Something we get asked about all the time – we know people are experimenting – but they’re not being very public about their experimentation. Here’s a simple idea though: make your content more search friendly. Two routes: One easy, one a little harder.

Route 1: Translate events into human readable text and get it on your page. Have a complicated article about an LBO of company x by people y? OpenCalais will identify an M&A event. Take that event and turn it into a tag like “Acquisitions” – something people might actually search for. Don’t just use it as a metatag – incorporate it into the page via navigation or whatever so Google pays attention.

Route 2: Use linked data to enhance your content. If you’re talking about a company or geography use OpenCalais Linked Data to enhance the page with additional information from DBpedia, GeoNames, the CIA World Factbook or a bunch of other sources.
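For Route 1, the translation from event types to search-friendly tags can be as simple as a lookup table. The event labels and tag wording below are assumptions for illustration; substitute whatever event types you actually receive and whatever words your readers search for.

```python
# Assumed mapping from machine event types to words people search for.
EVENT_TO_TAG = {
    "Acquisition": "Acquisitions",
    "Merger": "Mergers",
    "EarningsAnnouncement": "Earnings",
    "NaturalDisaster": "Natural Disasters",
}


def readable_tags(events):
    """Translate event types into display tags.

    Keeps first-seen order, drops duplicates, and skips event types
    that have no mapping.
    """
    tags = []
    for event in events:
        tag = EVENT_TO_TAG.get(event)
        if tag and tag not in tags:
            tags.append(tag)
    return tags


print(readable_tags(["Acquisition", "PersonCareer", "Acquisition", "Merger"]))
```

The resulting tags are what you would render into on-page navigation or topic links, not just into metatags.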

New Presentation Metaphors — With consistent metadata extraction across your content you can implement new navigation and search tools. Two examples. The Powerhouse Museum (here’s an example) tags everything in their collection and shows them as search terms. Some interesting insights emerge. Second example: Slate Magazine processes the day’s news and creates a network diagram of what’s connected to what (here it is). Pretty interesting. Our recommendation here: unless your audience is researchers, start simple and expand – it’s easy to overwhelm the average user with novel graphics, etc.

Looking Ahead

That’s just a start. We’ll keep updating this entry with new ideas as we get the time. There are about 15 additional ones on the list today that we haven’t covered here – and more will be coming in the next few weeks.

And remember – these are building blocks. Mix and match them to create something really cool, and let us know what you’re doing.

