Second Life Forums Archive - Some metrics in support of outbound XML-RPC

Huns Valen

Don't PM me here.

Join date: 3 May 2003

Posts: 2,749

10-04-2005 02:31

For the first time, I started doing some statistical analysis on network latency between LL's servers and DreamHost. When I say "network" I'm not talking about the raw TCP/IP stuff - I mean the end-to-end process of a device in SL sending an email to a process at DH, having it do something with that email and sending a response over XML-RPC, and finally, the device in SL receiving the response.

Internet latency between SL and DH does not seem to be an issue. Pings are usually on the order of 10-20 milliseconds (derived by SSHing into DH and pinging sim100.agni.lindenlab.com).

However, when I measured the latency of the process described above (i.e. SL -> email -> DH -> program at DH -> XML-RPC -> SL) in a simple "ping" setup, I found some data that concern me.

3,492 "pings," taken continuously at a rate of once every 20 seconds, revealed some interesting data. I've attached a graph. Here are some highlights:

196 pings took longer than 1 minute and were discarded (i.e. not represented in the calculation of mean/standard deviation, below)
3,296 pings (94%) took a minute or less
Of those, the mean response time was 5.5852 seconds, and the standard deviation was 9.1874 seconds
Response times in the 0-10 second range, in order: 0, 15, 1531, 727, 414, 135, 49, 39, 33, 17, 13
The most common response time was 2.x seconds (1531, or 46%)
The data do not approximate a normal distribution (too much skew), so it's hard to apply the 68-95-99.7 rule. (See this Wikipedia article.)
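
The highlights above can be cross-checked with a few lines of Python. This is only a sketch built from the figures quoted in the post (the raw samples are not reproduced here), and the variable names are illustrative. It also shows one way to see the skew: one standard deviation below the mean lands on a negative response time.

```python
# Figures quoted in the post; the raw ping samples are not available here.
total_pings = 3492
discarded = 196            # took longer than 1 minute
kept = total_pings - discarded

# Counts for response times of 0.x, 1.x, ..., 10.x seconds, in order.
buckets = [0, 15, 1531, 727, 414, 135, 49, 39, 33, 17, 13]

mean_s = 5.5852
std_s = 9.1874

kept_share = kept / total_pings
mode_second = buckets.index(max(buckets))
mode_share = max(buckets) / kept
under_11s = sum(buckets)
tail = kept - under_11s    # slower than the listed buckets, but within 60 seconds

print(f"kept: {kept} ({kept_share:.0%})")
print(f"most common bucket: {mode_second}.x s ({mode_share:.0%} of kept pings)")
print(f"answered in under 11 s: {under_11s}; slower but within 60 s: {tail}")
# A response time cannot be negative, so the data cannot be symmetric about the mean.
print(f"mean - 1 sd = {mean_s - std_s:.4f} s")
```

The 323 pings in the 11-60 second tail are what pull the mean up to 5.6 seconds even though the most common result is 2.x seconds.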

The "ping" service usually takes a second or less to run, and there's a little delay while my machine sends the email and another little delay while the XML-RPC data gets back to Second Life, plus a bit of time while the script parses the response and decides what to do with it. So, a lot of the time, the response is under 5 seconds, all told. However, there are many cases where it takes much more than 5 seconds.

I think a big part of this problem is that we do not have outbound XML-RPC. We have to rely on email, and to be blunt, email sucks. It's a store-and-forward system, and your message gets to its recipient in some undefined and highly variable amount of time. LL's mail server gets a message, sends it out to DreamHost when it isn't too busy, then DreamHost delivers it to my application (again, when it isn't too busy). Maybe that process takes a second. Maybe it takes half a minute. Maybe DH's mail server is bogged and it won't arrive for five minutes. Or maybe it's LL's server that is bogged - I have no way of knowing.

With XML-RPC, communication with the outside world would be direct (that is, over a stateful, connection-oriented protocol.) Let's say you try to open an outbound channel. If there's a problem, your script should presumably get some kind of error, at which point it can notify you that there's a problem with connectivity. With email, it's "send it and hope that it gets there." Furthermore, it should be consistently quicker than email. Given reasonably low network latency between SL and the outside machine (such as, say, less than 500 milliseconds), unless some process severely nails the CPU on the outside machine, response times over five seconds should be extremely rare, at least for a service that does nothing but reply to the sender with "Hello, here is the timestamp you gave me."
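
To illustrate the difference in failure behavior, here is a minimal sketch in Python rather than LSL, since no outbound RPC call exists in LSL to show. It uses the standard library's XML-RPC modules; the function names (`echo_timestamp`, `rpc_ping`) and the local throwaway server are assumptions made for the example, not anything LL or DreamHost provides. The point is only that the caller learns about a connectivity problem right away.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


def echo_timestamp(timestamp):
    """The 'ping' service: hand back the timestamp the caller sent."""
    return f"Hello, here is the timestamp you gave me: {timestamp}"


def start_ping_server():
    """Start a throwaway XML-RPC server on a free local port."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(echo_timestamp, "ping")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def rpc_ping(url, timestamp):
    """Return (True, reply) on success or (False, reason) on failure."""
    try:
        proxy = xmlrpc.client.ServerProxy(url)
        return True, proxy.ping(timestamp)
    except (OSError, xmlrpc.client.Error) as exc:
        return False, f"{type(exc).__name__}: {exc}"


server = start_ping_server()
host, port = server.server_address
ok, reply = rpc_ping(f"http://{host}:{port}/", "2005-10-04T02:31:00")
print(ok, reply)

server.shutdown()
server.server_close()
# With the server gone, the caller gets an error instead of silence.
ok_after, reason = rpc_ping(f"http://{host}:{port}/", "2005-10-04T02:31:05")
print(ok_after, reason)
```

With email there is no equivalent of the second result: the sending script cannot tell a slow delivery from a lost one.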

_____________________

[ huns? || words || 1TBS KREW || VHI Aircraft || Whitepaper on Aircraft Physics ]

Make yourself and your objects immune to llPushObject().

Jesrad Seraph

Nonsense

Join date: 11 Dec 2004

Posts: 1,463

10-04-2005 03:09

Outbound RPC would enable many things that currently have to be done with a combination of llLoadURL(), llSetMediaURL() and llEmail(). Please, Lindens?

_____________________

Either Man can enjoy universal freedom, or Man cannot. If it is possible then everyone can act freely if they don't stop anyone else from doing same. If it is not possible, then conflict will arise anyway so punch those that try to stop you. In conclusion the only strategy that wins in all cases is that of doing what you want against all adversity, as long as you respect that right in others.

Satchmo Prototype

eSheep

Join date: 26 Aug 2004

Posts: 1,323

10-04-2005 04:34

Nice analysis. What would you do to prevent LL from becoming a denial of service or spam cluster? The saving grace of email is the rate limiting.

_____________________

----------------------------------------------------------------------------------------------------------------
The Electric Sheep Company
Satchmo Blogs: The Daily Graze
Satchmo del.icio.us

Malachi Petunia

Gentle Miscreant

Join date: 21 Sep 2003

Posts: 3,414

10-04-2005 04:53

From: someone

What would you do to prevent LL from becoming a denial of service or spam cluster? The saving grace of email is the rate limiting.

There have been various proposals that generated effectively "source quench" responses from unmodified potential DDoS targets, but despite Huns' well presented analysis, I'd be betting on Havok 3 first.

Minsk Oud

Registered User

Join date: 12 Jul 2005

Posts: 85

10-04-2005 09:13

Not sure why you would expect XML-RPC to have a significant performance advantage over e-mail, other than the closer association of requests and replies (well, that and SL's mail system having a horrible track record). A fair variety of mail transport agents support keeping SMTP connections alive, which produces behavior almost equivalent to XML-RPC/HTTP (needs one SMTP connection in each direction).

Check the e-mail headers or server logs to make sure both mail daemons are reusing connections. If not, the dropping end needs to crank up its connection cache options. Hopefully LL already has its servers configured to use completely static SMTP connections for internally relaying e-mail (from MX/lsl.secondlife.com to the sim servers and possibly vice versa).
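
As a rough sketch of what checking the headers can tell you: each mail server that handles a message prepends a Received header ending in a timestamp, so the gaps between them show which hop the time was spent on. This does not reveal whether connections are being reused, only where the delay accumulates. The message below is invented for illustration (host names and times are hypothetical), and `hop_delays` is a name made up for this example.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical message; host names and times are invented for illustration.
RAW_MESSAGE = """\
Received: from relay.example.net by mx.example.org; Tue, 04 Oct 2005 02:31:47 -0700
Received: from sim.example.com by relay.example.net; Tue, 04 Oct 2005 02:31:05 -0700
Received: from script by sim.example.com; Tue, 04 Oct 2005 02:31:02 -0700
Subject: ping

ping body
"""


def hop_delays(raw_message):
    """Return seconds spent between consecutive Received stamps, oldest hop first."""
    message = message_from_string(raw_message)
    stamps = []
    for header in message.get_all("Received", []):
        # The timestamp follows the last semicolon of a Received header.
        date_text = header.rpartition(";")[2].strip()
        stamps.append(parsedate_to_datetime(date_text))
    stamps.reverse()  # headers are prepended, so the newest comes first
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


print(hop_delays(RAW_MESSAGE))
```

In this made-up case the first hop took 3 seconds and the second took 42, which would point at the relay-to-destination leg. Clock differences between servers can distort the numbers, so treat them as a hint rather than a measurement.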

Chris

_____________________

Ignorance is fleeting, but stupidity is forever. Ego, similar.

Huns Valen

Don't PM me here.

Join date: 3 May 2003

Posts: 2,749

10-04-2005 19:19

From: Satchmo Prototype

Nice analysis. What would you do to prevent LL from becoming a denial of service or spam cluster? The saving grace of email is the rate limiting.

Email rate limits are trivial to get around.

As for DDoS, let's look at a couple scenarios:

Second Life - Write a DDoS script, some kind of navigation script, drop them in some prims and tell them to navigate to various sims and start flooding someone. Effort: Probably a day to two weeks, depending on experience and how many features you build into it. Objects are limited to something like four "generations" so you can't just send one object and expect it to infinitely spawn itself; you have to rez all the objects within four generations. Yield: Packets not more than 1KB in size, dribbling out from a few tens or maybe hundreds of sims, depending on where you start and whether you have pathfinding logic in each object. Consequences: Banned from SL. Need to use a different credit card AND a different network card AND be coming from a different subnet each time.
Hosting Provider - SSH in, write a shell script that backgrounds 100 instances of ping. Effort: I could write and test this script in perhaps five minutes. Yield: I can saturate the link with packets of any size ping will accept (e.g. 5,000, 10,000, whatever the limit is.) Consequences: Kicked off the hosting provider; go to another one and do the same thing ten minutes later. There are zillions of these places all over the world, they are a dime a dozen.

As you can see, the effort required to RPC bomb someone from SL would be gargantuan, and the payload hardly impressive, compared to using a hosting provider with shell access. Yet, you don't see shell providers saying, "No, you can't initiate outbound connections because you might use them to flood someone's machine." The vast majority of customers are using their services for legit purposes. So, they let the legit people do their biz, and handle the abuse cases as they occur.

From: Minsk Oud

Not sure why you would expect XML-RPC to have a significant performance advantage over e-mail

Opening an XML-RPC connection to an instance of Apache at DreamHost should not take more than a few hundred milliseconds at most. Sending an email requires reliance on mail servers, which are significantly slower, and more failure-prone. There are times when Uba and Oak Grove have been incapable of sending email. When that happens, the scripts don't get any acknowledgement, the emails just disappear. So we get additional single points of failure with zero added benefit. That's janky. We already have enough issues to handle as it is. With XML-RPC, the connection is stateful. It has to be opened before data can be sent, so if there is a connection problem, you can catch it right there - either by llWhateverFunction() returning FALSE, or some event returning SOME_CONSTANT_ABOUT_RPC_FAILURE, or whatever.

From: someone

Check the e-mail headers or server logs to make sure both mail daemons are reusing connections. If not, the dropping end needs to crank up its connection cache options. Hopefully LL already has its servers configured to use completely static SMTP connections for internally relaying e-mail (from MX/lsl.secondlife.com to the sim servers and possibly vice versa).

I don't control the mail servers at DreamHost. This illustrates another problem with email - you have to rely on yet another set of servers beyond your control. If RPC dies, and I check my sending script, and I check my program at DH, I can figure out where the problem is. With email, I have to work with DH's administrators directly, which is time consuming. I have done it in the past, when some of my emails were taking upwards of five minutes to be delivered. Do you know what they wanted me to do? Well, after the support request had been sitting in the queue for about a day, they responded, saying that they didn't know which of their email servers might have been failing (they are in some kind of round-robin) and they wanted me to contact LL and have them forward relevant mail server logs.

So, I would have had to put in a support request to LL, wait several days to get a response, hopefully from someone willing to get an admin to forward those logs (assuming they haven't been rotated out of existence by that time), and then wait some more time for DH to analyze the logs and figure out where the failure occurred.

Does this seem reasonable to you?

Imagine going to the store to buy groceries, swiping your credit card, and having to wait five minutes while the Verifone machine sends an email to the store's provider. I think you'd be pissed off, and so would everyone in line behind you. You don't want that transaction to take five minutes. You don't even want it to take one minute. You want it to take a few seconds. It's the same for people using devices in Second Life. Slow, unreliable services are not good for customer satisfaction.

_____________________

[ huns? || words || 1TBS KREW || VHI Aircraft || Whitepaper on Aircraft Physics ]

Make yourself and your objects immune to llPushObject().

Satchmo Prototype

eSheep

Join date: 26 Aug 2004

Posts: 1,323

10-04-2005 21:21

From: Huns Valen

Email rate limits are trivial to get around.

As for DDoS, let's look at a couple scenarios:

Second Life - Write a DDoS script, some kind of navigation script, drop them in some prims and tell them to navigate to various sims and start flooding someone. Effort: Probably a day to two weeks, depending on experience and how many features you build into it. Objects are limited to something like four "generations" so you can't just send one object and expect it to infinitely spawn itself; you have to rez all the objects within four generations. Yield: Packets not more than 1KB in size, dribbling out from a few tens or maybe hundreds of sims, depending on where you start and whether you have pathfinding logic in each object. Consequences: Banned from SL. Need to use a different credit card AND a different network card AND be coming from a different subnet each time.

Yup... but griefers will be griefers... and LL doesn't have the resources to explain to the FBI what a griefer is...

From: Huns Valen

The vast majority of customers are using their services for legit purposes. So, they let the legit people do their biz, and handle the abuse cases as they occur.

But this is what they do, they are hosting providers... LL is not, nor do they have enough sysadmins to monitor, catch and throttle every griefer. I mean look how long griefing in world sometimes goes on...

I'm not really arguing with you... Our game in the Game Dev contest offloaded all of the data to a remote server, and there was heavy use of the Email/XML-RPC loop. I would LOVE to have outgoing XML-RPC, I was just looking for a compelling answer to the Lindens' DoS fears... I've sysadmin'd, I've done security consulting and I understand that LL doesn't want to become sysadmins. So to them it's easier not to touch this one and make residents design stuff around the limitations.

_____________________

----------------------------------------------------------------------------------------------------------------
The Electric Sheep Company
Satchmo Blogs: The Daily Graze
Satchmo del.icio.us

Huns Valen

Don't PM me here.

Join date: 3 May 2003

Posts: 2,749

10-04-2005 23:32

From: Satchmo Prototype

But this is what they do, they are hosting providers... LL is not, nor do they have enough sysadmins to monitor, catch and throttle every griefer. I mean look how long griefing in world sometimes goes on...

I guess what it boils down to is that LL has x desire to present itself as a viable business platform (and e-commerce is a huge part of online business today), but there is y amount of trouble (i.e., time to develop it, possible abuse resulting from its availability) that they perceive in the way.

I just really wish they would bite the bullet on this one and give us full bi-directional RPC. What we currently have to work with is an irritating mess. Some people, myself included, have resorted to implementing full network stacks. That's on top of the TCP/IP stack we already have, just so that we can have something approaching reliable data transport.
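
The thread does not show what those home-grown stacks look like. As a minimal sketch of one piece such a layer typically needs, here is resend-until-acknowledged in Python; `send_reliably`, `make_lossy_channel`, and the fake transport are all hypothetical names invented for this example, not anyone's actual implementation.

```python
def send_reliably(send_once, payload, max_attempts=5):
    """Resend payload until send_once reports an acknowledgement.

    send_once(payload) returns True when an ack came back in time.
    Returns the number of attempts used, or raises if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        if send_once(payload):
            return attempt
    raise TimeoutError(f"no acknowledgement after {max_attempts} attempts")


def make_lossy_channel(drop_first):
    """Fake transport that silently loses the first drop_first messages."""
    state = {"sent": 0}

    def send_once(payload):
        state["sent"] += 1
        return state["sent"] > drop_first

    return send_once


# Two messages vanish, the third is acknowledged.
print(send_reliably(make_lossy_channel(drop_first=2), "ping"))
```

A real version would also need sequence numbers so the receiver can discard duplicates, which is part of why this kind of workaround gets tedious.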

_____________________

[ huns? || words || 1TBS KREW || VHI Aircraft || Whitepaper on Aircraft Physics ]

Make yourself and your objects immune to llPushObject().

Jesrad Seraph

Nonsense

Join date: 11 Dec 2004

Posts: 1,463

10-05-2005 02:34

I'm surprised some people think that email is "just sufficient right now". It's not. I can't make use of it with my current webhosting to link inworld scripts to webpages. And no I'm not getting another webhosting and learning how to set up email->script treatment with Postfix and everything else just for that, where I could fire up a single REMOTE_DATA_REQUEST.

What about the proposition to just register your domain at LL (or just set up some flag in your DNS entry that LL can read) so LL opens RPC out to this domain?

_____________________

Either Man can enjoy universal freedom, or Man cannot. If it is possible then everyone can act freely if they don't stop anyone else from doing same. If it is not possible, then conflict will arise anyway so punch those that try to stop you. In conclusion the only strategy that wins in all cases is that of doing what you want against all adversity, as long as you respect that right in others.

Maxx Monde

Registered User

Join date: 14 Nov 2003

Posts: 1,848

10-05-2005 04:27

Implementing rate-limiting by crippling your in-world services (script functions) is a half-assed way to keep things in line. What most colleges and ISPs use are packet shapers, with policies that relegate any non-typical traffic to a 'squeeze' bucket, which is typically *just* enough to allow it to slow to an absolute crawl.

The reason for that is, if you had a virus or something, or even just an aggressive P2P application, you don't want to cut it off completely, just kill the rate *enough* so it won't get hyper and start flooding your local network with scan attempts upon failure to communicate.
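
The "squeeze" idea can be sketched with a generic token bucket. This is not how Packeteer's products work internally (the thread does not say); it is a common textbook technique standing in for the concept, and the rate and capacity values are arbitrary.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, then refill at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def allow(self, cost=1.0):
        """Return True if a request of the given cost may go out now."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Fake clock so the example is deterministic.
fake_now = [0.0]
squeeze = TokenBucket(rate=0.5, capacity=2, clock=lambda: fake_now[0])

burst = [squeeze.allow() for _ in range(4)]   # four requests at t = 0
fake_now[0] = 2.0                             # two seconds later
later = squeeze.allow()
print(burst, later)
```

The traffic is never cut off entirely: the first two requests pass, the next two are refused, and one more is allowed after the bucket has had two seconds to refill.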

So I'd be looking at Packeteer and their packetshapers, if Linden Lab hasn't already. This would take most of the concern of bandwidth abuse out, and perhaps we could get actual two-way communication, instead of trying to build playing card-houses in the dark, with oven mitts on. (oops! Start over! It's just the way it is!)

Seriously, buy some packetshapers, LL, and open up the world a little.

_____________________

Opensim Tutorial - http://opensimuser.wordpress.com/2008/06/15/opensim-install-and-configuration-tutorial/

Run your own simulator on your personal machine!

Satchmo Prototype

eSheep

Join date: 26 Aug 2004

Posts: 1,323

10-05-2005 05:03

This thread rocks....

What if we had the ability to do HTTP Posts instead of XML-RPC, would that be sufficient for outgoing communications?

_____________________

----------------------------------------------------------------------------------------------------------------
The Electric Sheep Company
Satchmo Blogs: The Daily Graze
Satchmo del.icio.us

Huns Valen

Don't PM me here.

Join date: 3 May 2003

Posts: 2,749

10-05-2005 07:36

From: Maxx Monde

Implementing rate-limiting by crippling your in-world services (script functions) is a half-assed way to keep things in line. What most colleges and ISPs use are packet shapers, with policies that relegate any non-typical traffic to a 'squeeze' bucket, which is typically *just* enough to allow it to slow to an absolute crawl.

The reason for that is, if you had a virus or something, or even just an aggressive P2P application, you don't want to cut it off completely, just kill the rate *enough* so it won't get hyper and start flooding your local network with scan attempts upon failure to communicate.

So I'd be looking at Packeteer and their packetshapers, if Linden Lab hasn't already. This would take most of the concern of bandwidth abuse out, and perhaps we could get actual two-way communication, instead of trying to build playing card-houses in the dark, with oven mitts on. (oops! Start over! It's just the way it is!)

Seriously, buy some packetshapers, LL, and open up the world a little.

I wanted to point this out but didn't have a great way to articulate it at the time...

Some automated statistical analysis of outbound RPC traffic could be done to implement this. There are a few dimensions in which you could analyze the data - total outbound requests initiated by user X's scripts grid-wide, number of outbound requests from those scripts per hour or per minute, etc. - and you could set some critical region, i.e. if your requests are too far away from the average on the plus side, you get throttled. You could also limit it to so many transactions per hour times area of the parcel the script is sitting on, but the statistical analysis method would make it easier to clamp people who are A) going nuts with it or B) trying to attack some host.
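
A minimal sketch of both ideas in Python, under loud assumptions: the user names and counts are invented, the cutoff of two standard deviations is arbitrary, and `users_to_throttle` and `parcel_limit` are hypothetical names, not anything LL has proposed.

```python
from statistics import mean, pstdev


def users_to_throttle(requests_per_hour, threshold_sd=2.0):
    """Return users whose hourly count is more than threshold_sd standard
    deviations above the mean (only the 'plus side' is checked)."""
    counts = list(requests_per_hour.values())
    average = mean(counts)
    spread = pstdev(counts)
    if spread == 0:
        return []  # everyone behaves identically, nobody stands out
    cutoff = average + threshold_sd * spread
    return sorted(user for user, count in requests_per_hour.items()
                  if count > cutoff)


def parcel_limit(area_m2, per_m2_per_hour=0.5):
    """Alternative rule: allowed transactions per hour scale with parcel area."""
    return area_m2 * per_m2_per_hour


# Invented sample: six ordinary users and one script flooding a host.
sample = {"alice": 40, "bob": 55, "carol": 35, "dave": 60,
          "erin": 45, "frank": 50, "griefer": 5000}

print(users_to_throttle(sample))
print(parcel_limit(512))
```

One caveat with the mean-based version: a large enough outlier inflates both the mean and the standard deviation, so a production system would probably want a more robust baseline, such as the median or a profile built from past traffic.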

I don't think there should be a charge per packet sent. That would annoy developers and result in increased charges to people who use a service that talks over RPC.

_____________________

[ huns? || words || 1TBS KREW || VHI Aircraft || Whitepaper on Aircraft Physics ]

Make yourself and your objects immune to llPushObject().

Minsk Oud

Registered User

Join date: 12 Jul 2005

Posts: 85

10-05-2005 07:45

From: Huns Valen

Opening an XML-RPC connection to an instance of Apache at DreamHost should not take more than a few hundred milliseconds at most. Sending an email requires reliance on mail servers, which are significantly slower, and more failure-prone.

Having finally taken a peek at the e-mails I received from SL, I have to agree that the performance of the SL mail pipe is absolutely horrid. Doing a few quick tests with my own systems, I found sending a hundred messages down a connection-cached SMTP pipe was faster than establishing a hundred parallel HTTP connections (without sending any data). Did not look at round-trip time, which may well result in favoring HTTP again.

From: Huns Valen

There are times when Uba and Oak Grove have been incapable of sending email. When that happens, the scripts don't get any acknowledgement, the emails just disappear.

Which is either a problem in the mail system or a result of a limitation in the LSL API (does not report error messages or bounces).

From: Huns Valen

With XML-RPC, the connection is stateful. It has to be opened before data can be sent, so if there is a connection problem, you can catch it right there - either by llWhateverFunction() returning FALSE, or some event returning SOME_CONSTANT_ABOUT_RPC_FAILURE, or whatever.

Assuming a single persistent connection for each XML-RPC connection (honestly I would be tempted to share them, but am not aware of any client libraries that do), you would be notified of failure in the case of the target HTTP daemon being unavailable. Otherwise it should wait until the first message is sent. Just flipped through a few of the commonish clients, and many of them seem to establish a new connection to the server for each message (that was certainly the state when I initially evaluated XML-RPC/HTTP).

From: Huns Valen

I don't control the mail servers at DreamHost. This illustrates another problem with email - you have to rely on yet another set of servers beyond your control.

Very good point, I forgot to factor in the *cough* competence of the average hosting company. Also having to poll a POP or IMAP server would kill the latency, so hopefully they all provide procmail or similar. Given an even comparison I found SMTP faster, but what most people will be exposed to is not an even comparison.

In my own stuff, I actually wound up scrubbing HTTP in favor of a fully bidirectional protocol, as heavy use of publish-subscribe meant the server had to manage too many outgoing connections. Because my connections are static (known server network), the daemon is absolutely braindead and messages are virtually instant. Bulk data transfers, including incremental backups, get shipped via SMTP.

Have not been dealing with the de facto pseudo-standards, common clients, or commercial hosts in a while, so am somewhat out of touch with that end of the real world.

_____________________

Ignorance is fleeting, but stupidity is forever. Ego, similar.

Maxx Monde

Registered User

Join date: 14 Nov 2003

Posts: 1,848

10-05-2005 07:55

From: Huns Valen

I wanted to point this out but didn't have a great way to articulate it at the time...

Some automated statistical analysis of outbound RPC traffic could be done to implement this. There are a few dimensions in which you could analyze the data - total outbound requests initiated by user X's scripts grid-wide, number of outbound requests from those scripts per hour or per minute, etc. - and you could set some critical region, i.e. if your requests are too far away from the average on the plus side, you get throttled. You could also limit it to so many transactions per hour times area of the parcel the script is sitting on, but the statistical analysis method would make it easier to clamp people who are A) going nuts with it or B) trying to attack some host.

I don't think there should be a charge per packet sent. That would annoy developers and result in increased charges to people who use a service that talks over RPC.

Exactly Huns.

That is why I recommended Packeteer's stuff - I don't work for 'em, but I have used them. They have a mode where they just listen and build profiles of the traffic going across the wire, then you shape your policies based on this 'typical' flow. It is a very eye-opening process, you get to see exactly where your bursty traffic is, and who is doing it. Once this is set, you put your policies in place, and then like I said before - anything atypical is never allowed to impact your regular services.

They really should consider this, it could save a lot of trouble down the road.

_____________________

Opensim Tutorial - http://opensimuser.wordpress.com/2008/06/15/opensim-install-and-configuration-tutorial/

Run your own simulator on your personal machine!

Alexander Yeats

Registered User

Join date: 8 Sep 2005

Posts: 188

10-05-2005 08:02

From: Satchmo Prototype

But this is what they do, they are hosting providers... LL is not, nor do they have enough sysadmins to monitor,

Correct me if I am wrong, but the 1250 I paid in US cash as well as the 200 a month is still the same color green as the US cash I pay my hosting providers with each month, no?

I don't host my own SIM, LL does. I do not host my content, LL does.

So far I would put a big fat YES stamp on the fact that they do in fact provide hosting services.

Satchmo Prototype

eSheep

Join date: 26 Aug 2004

Posts: 1,323

10-05-2005 09:09

From: Alexander Yeats

Correct me if I am wrong, but the 1250 I paid in US cash as well as the 200 a month is still the same color green as the US cash I pay my hosting providers with each month, no?

Right... not that money has anything to do with it, but they do indeed host our private sims. It's just that their business model isn't centered around letting those sims contact everything and anything on the net. Of course since they build a virtual world with game engine like creation tools, they would benefit from letting us connect to everything and anything... but not as much as they benefit from stabilizing the grid, bug hunting and adding new features in world.

Don't get me wrong, when we get outgoing communications it's going to spur amazing amounts of creativity. I agree with not paying per connection because that just limits the amount of creative emergent behavior that will occur.

Furthermore... I pay that much per month to my hosting company and they don't provide me with a scriptable 3D virtual world... those bastards... I'm outraged (Satchmo dials hosting company support line).

_____________________

----------------------------------------------------------------------------------------------------------------
The Electric Sheep Company
Satchmo Blogs: The Daily Graze
Satchmo del.icio.us

Satchmo Prototype

eSheep

Join date: 26 Aug 2004

Posts: 1,323

10-05-2005 09:45

From: Maxx Monde

Exactly Huns.

That is why I recommended Packeteer's stuff - I don't work for 'em, but I have used them. They have a mode where they just listen and build profiles of the traffic going across the wire, then you shape your policies based on this 'typical' flow. It is a very eye-opening process, you get to see exactly where your bursty traffic is, and who is doing it. Once this is set, you put your policies in place, and then like I said before - anything atypical is never allowed to impact your regular services.

They really should consider this, it could save a lot of trouble down the road.

This is a good approach. Best of all, it gives the good guys the performance and bandwidth they desire.

With Packeteer's stuff can you periodically re-profile the network traffic? SL is growing really quick, and new applications like ROAM catch on quickly as well, so wouldn't they have to re-evaluate normal traffic occasionally?

_____________________

----------------------------------------------------------------------------------------------------------------
The Electric Sheep Company
Satchmo Blogs: The Daily Graze
Satchmo del.icio.us
