About Open Calais

With a market-leading ontology linked to Thomson Reuters’ authorities and products, Thomson Reuters Open Calais™ offers the easiest and most accurate way to tag the people, places, companies, facts, and events in your content to increase its value, accessibility and interoperability.

How do we do it? We use Natural Language Processing (NLP) and machine learning algorithms, trained over several years by hundreds of Thomson Reuters’ editorial teams, to offer the industry’s best combination of company extraction and relevance. For the user, the process is simple: you feed unstructured text (news articles, blog postings, etc.) into the extraction engine, which examines the text and locates:

  • Entities: (Companies, people, places, products, etc.)
  • Relationships: (John Doe works for Acme Corp.)
  • Facts: (John Doe is a 42-year-old male CFO)
  • Events: (Jane Doe was appointed a board member of Acme Corp.)
  • Topics: (Story is about M&As in the Pharma industry)
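
As a rough illustration of how items like those above could be expressed as semantic metadata, the sketch below models a few of them as subject-predicate-object triples, which is the basic data model behind RDF, and prints them in an N-Triples-like form. It is only a sketch under stated assumptions: the `http://example.org/` namespace, the `Triple` class, and the predicate names (`type`, `worksFor`, `age`, `position`) are hypothetical and are not the actual Open Calais vocabulary or output.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical namespace for illustration only; not a real Open Calais URI.
EX = "http://example.org/"


@dataclass(frozen=True)
class Triple:
    """One subject-predicate-object statement, the basic unit of RDF."""
    subject: str
    predicate: str
    obj: str
    literal: bool = False  # True when obj is a plain value, not a resource

    def to_ntriples(self) -> str:
        # N-Triples writes resources in angle brackets and literals in quotes.
        # No escaping is done here, so literals must not contain quotes.
        o = f'"{self.obj}"' if self.literal else f"<{self.obj}>"
        return f"<{self.subject}> <{self.predicate}> {o} ."


def describe_example() -> List[Triple]:
    """Build triples for the John Doe / Acme Corp. examples in the list above."""
    john = EX + "person/john-doe"
    acme = EX + "company/acme-corp"
    return [
        Triple(john, EX + "type", EX + "Person"),             # entity
        Triple(acme, EX + "type", EX + "Company"),            # entity
        Triple(john, EX + "worksFor", acme),                  # relationship
        Triple(john, EX + "age", "42", literal=True),         # fact
        Triple(john, EX + "position", "CFO", literal=True),   # fact
    ]


if __name__ == "__main__":
    for triple in describe_example():
        print(triple.to_ntriples())
```

Each printed line is one statement about an entity, so a consumer can filter by predicate (for example, every `worksFor` statement) without re-reading the original text.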

Open Calais then processes the information extracted from the text and returns semantic metadata in RDF format. Here are just a few of the many advantages:

  • Contextual navigation: Pinpoint the most relevant companies, people and industries
  • More focused news: Get highly relevant, targeted news for companies and industries of interest
  • Fast processing: It takes, on average, well under a second to process a sizable news article
  • Greater intelligence: Goes far beyond classic entity identification and returns the relevant facts and events hidden within the text

Open Calais relies on the curated authorities maintained by thousands of Thomson Reuters’ data team members and also leverages the identity management provided by Thomson Reuters’ experts.