named entity extraction + disambiguation?
Posted on: Mon, 05/19/2008 - 07:24
Can Calais currently tell the difference between the concepts behind:
http://en.wikipedia.org/wiki/Cambridge%2C_Massachusetts
http://en.wikipedia.org/wiki/Cambridge
?
It seems to assign the same hash GUID to every "Cambridge".
best--
--chris sizemore

So I was playing around with OpenCalais a bit and fed it this URL, bumped up my threshold to 0.5, and got the following response for entities:
City - New York
Person - David Paterson
Person - Sheldon Silver
Person - Sheldon Smith
Currency - USD
Company - Intel
PublishedMedium - New York Magazine
So all of that is correct except for Company. Essentially, the link in question refers to Daily Intel, a weblog.
Hi, try SemanticProxy. This service tries to strip all the non-relevant content from the HTML article. At the moment there is a bug in how the XML/RDF output file is translated into HTML (http://semanticproxy.com/demo.html); we will fix this soon. However, the XML/RDF itself contains the right information (http://semanticproxy.com/usingSemanticProxy.html).
In future versions of OpenCalais we will add this HTML cleansing capability to api.opencalais.com.
Ofer
Yeah, this looks like it cleans up pretty well for the entities I posted, and it doesn't even show Daily Intel, which doesn't matter to me in this case. I'll just wait till it hits the API then.
Hi Chris -
Geo disambiguation has been released as a new feature in R3.1.
R3.1 is currently in Tech Preview (read the blog entry for more details) and will soon be rolled into production.
Try experimenting with the new functionality and let us know if it solves the identity problems.
Regards.
You're right. This is a weakness right now. Calais understands City, State, Country, etc., but doesn't understand the combination of these as a unique item. It's something we're thinking about (see other posts on address extraction) and want to make happen in the reasonably near future.
Confusing Paris, TX with Paris, France would be a startling mistake for a traveler to make.
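To make the "combination as a unique item" idea concrete, here is a minimal sketch. It is not how Calais computes its GUIDs; the function names and the choice of SHA-1 are assumptions made purely for illustration. It only shows why an ID built from the bare name collides for every "Cambridge", while an ID built from city plus region plus country does not.

```python
import hashlib


def name_only_id(name: str) -> str:
    """Hypothetical ID derived from the surface name alone."""
    return hashlib.sha1(name.strip().lower().encode("utf-8")).hexdigest()


def qualified_id(city: str, region: str, country: str) -> str:
    """Hypothetical ID derived from the city plus its region and country."""
    key = "|".join(part.strip().lower() for part in (city, region, country))
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


cambridge_us = ("Cambridge", "Massachusetts", "United States")
cambridge_uk = ("Cambridge", "Cambridgeshire", "United Kingdom")

# Name-only IDs collide: both places are just "Cambridge".
same_by_name = name_only_id(cambridge_us[0]) == name_only_id(cambridge_uk[0])

# Qualified IDs differ because the region and country are part of the key.
distinct_when_qualified = qualified_id(*cambridge_us) != qualified_id(*cambridge_uk)

print(same_by_name, distinct_when_qualified)
```

The hard part in practice is working out from the surrounding text which region and country a mention belongs to; the sketch assumes that step has already been done.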
Regards,