Slow API response
Posted on: Mon, 01/10/2011 - 09:38
I'm using the OpenCalais API to get tags/topics for URLs in a webapp I'm building on Google App Engine. GAE puts a 10 second limit on response times and I'm finding that 60-70% of links submitted to the OpenCalais API are not responding within that time limit.
Is this something that everyone is experiencing? If so, what can we do to speed up the responses?
I'm already pre-processing the sent HTML to clean up any scripts / css. I've also added the directive to not calculate relevance scores (which I was hoping would make things go faster).
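For reference, the kind of pre-processing described above could look roughly like the sketch below. It uses only Python's standard `html.parser`, and it goes one step further than removing scripts and CSS by dropping all tags and keeping just the text, which assumes plain text is an acceptable payload. The names `TextOnly` and `strip_markup` are made up for illustration.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects text content, skipping the bodies of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace so the payload stays small.
        return " ".join(" ".join(self._chunks).split())


def strip_markup(html):
    """Return the visible text of an HTML string (hypothetical helper)."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Calling `strip_markup(page_html)` before submitting keeps the request body small, although it cannot help if the delay is on the server side.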
Any ideas would be appreciated!

If the service itself is timing out, it will return a 500 Internal Timeout error - are you receiving these or does the call just vanish? What happens when you do a traceroute to the target, are there any drops?
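To tell those two cases apart on the client side, something like the sketch below might help. It assumes the call is made with Python's `urllib.request.urlopen` and only sorts the exceptions that function can raise into coarse buckets. The function name and the labels are invented for illustration and are not part of any OpenCalais client.

```python
import socket
import urllib.error


def classify_failure(exc):
    """Return a coarse label for an exception raised by urllib.request.urlopen."""
    # HTTPError is a subclass of URLError, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        # The server did answer, just not with a success status.
        return "server_error" if exc.code >= 500 else "client_error"
    if isinstance(exc, socket.timeout):
        # Nothing came back before the timeout: the call "vanished".
        return "no_response"
    if isinstance(exc, urllib.error.URLError):
        # URLError wraps lower-level failures; a wrapped timeout is still silence.
        if isinstance(exc.reason, socket.timeout):
            return "no_response"
        return "connection_error"
    return "unknown"
```

Logging the label for every failed request should show whether the failures are HTTP 500 responses from the service or requests that never get any answer.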
Hi Neal, sorry for the late reply.
Out of 600 or so attempted documents today, 217 just vanished. I only submit to Open Calais when I find indications of entities in the docs via internal means, so I am confident that they should be processable by OC.
I use the Java HTTPClient, with retries set to 5, and have tried timeouts as long as 90 seconds.
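For anyone wanting to reproduce this setup outside Java, here is a minimal Python sketch of the same idea: a fixed number of attempts, a per-attempt timeout, and a log of how each attempt ended. The `send` callable, the backoff values, and the function name are assumptions made for illustration; this is not the HTTPClient configuration described above.

```python
import time


def call_with_retries(send, attempts=5, per_try_timeout=10.0,
                      backoff=0.5, sleep=time.sleep):
    """Call send(timeout) up to `attempts` times; return (result, log).

    `log` holds one (attempt_number, outcome, seconds_elapsed) tuple per try.
    """
    log = []
    for attempt in range(1, attempts + 1):
        started = time.monotonic()
        try:
            result = send(per_try_timeout)
            log.append((attempt, "ok", time.monotonic() - started))
            return result, log
        except OSError as exc:
            # socket.timeout and urllib.error.URLError are both OSError subclasses.
            log.append((attempt, type(exc).__name__, time.monotonic() - started))
            if attempt < attempts:
                # Exponential backoff between attempts: 0.5s, 1s, 2s, ...
                sleep(backoff * 2 ** (attempt - 1))
    return None, log
```

Here `send` could be a small wrapper around `urllib.request.urlopen(request, timeout=timeout).read()`. The per-attempt log makes it easier to report exactly how many requests went unanswered and how long each one waited.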
It looks like currently traceroute is dropping out in ec2:
dswift@bunkhouse ~$ traceroute api.opencalais.com
traceroute to api.opencalais.com (75.101.136.33), 30 hops max, 40 byte packets
1 fw01.astound.net (10.67.1.1) 0.641 ms 0.252 ms 0.254 ms
2 astound-66-234-218-1.ca.astound.net (66.234.218.1) 8.310 ms 8.269 ms 8.928 ms
3 76-14-1-122.sf-cable.astound.net (76.14.1.122) 8.065 ms 15.428 ms 15.365 ms
4 76-14-1-121.sf-cable.astound.net (76.14.1.121) 15.883 ms 15.820 ms 15.746 ms
5 xe-0-2-0.mpr3.sfo3.us.above.net (64.124.146.181) 15.088 ms 15.007 ms 14.944 ms
6 xe-2-0-0.er1.sjc2.us.above.net (64.125.31.154) 18.513 ms 18.446 ms 18.288 ms
7 xe-1-1-0.cr1.sjc2.us.above.net (64.125.26.197) 18.700 ms 18.160 ms 18.144 ms
8 xe-0-1-0.mpr1.pao1.us.above.net (64.125.31.65) 18.079 ms 18.302 ms 14.629 ms
9 paix01-sfo4.amazon.com (198.32.176.36) 16.726 ms 16.578 ms 17.271 ms
10 72.21.222.194 (72.21.222.194) 98.243 ms 98.134 ms 98.101 ms
11 72.21.222.215 (72.21.222.215) 19.205 ms 17.577 ms 18.005 ms
12 72.21.222.8 (72.21.222.8) 90.497 ms 90.485 ms 89.055 ms
13 72.21.220.171 (72.21.220.171) 88.798 ms 72.21.220.169 (72.21.220.169) 88.877 ms 72.21.220.171 (72.21.220.171) 88.481 ms
14 72.21.222.147 (72.21.222.147) 92.740 ms 72.21.222.143 (72.21.222.143) 89.958 ms 72.21.222.139 (72.21.222.139) 93.163 ms
15 * 216.182.224.53 (216.182.224.53) 89.884 ms 216.182.224.49 (216.182.224.49) 89.812 ms
16 216.182.224.10 (216.182.224.10) 93.435 ms 87.930 ms 87.793 ms
17 ec2-75-101-160-37.compute-1.amazonaws.com (75.101.160.37) 87.645 ms 88.595 ms 89.164 ms
18 * * *
19 * * *
20 * * *
21 * * *
22 * * *
23 * * *
24 * * *
25 * * *
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *
Let me know if you want me to set up a traceroute monitor to give you some info on this over time.
OK, the traceroute loss is not the issue; hosts inside our vendor's API verification cloud do not respond behind the initial host, and our service does not live in EC2. What size (in K) are the docs you are submitting?
Typically less than 1K.
Examples (2) of disappearances from today:
Korea Online Discounter to Buy Malaysian Firm Wall Street Journal SEOUL—Ticket Monster Co., South Korea's largest online deals company by market share, said Tuesday it would acquire Malaysia's Integrated Methods Sdn Bhd, in the first step of what...
Apple has indicated that Steve Jobs will participate in the opening keynote address at the company’s Worldwide Developer Conference (WWDC), wh…
Hello,
Any ideas on why this is happening? I had worked with Yoni a bit on this, and was really hoping to use Open Calais in our app, but these disappearing transactions are a show-stopper.
Thanks,
David
Does it always work with curl?
I am having very similar issues.
I am sending text-only data, but more than 50% of my connection attempts actually go unanswered after 20 seconds.
I have tried increasing my connection timeout to 60 seconds, with no difference.
I have also tried retrying 10 times, with the same results.
However, sometimes during one of my post attempts, I will post the same content via a curl call, and it goes through just fine.
I am working with someone on the OpenCalais team on this issue, but if you have any further information to share, please do.
Thanks,
David