SecondLife Architecture

blaze Spinnaker
1/2 Serious
Join date: 12 Aug 2004
Posts: 5,898
05-11-2005 17:59
I couldn't log into SL, so I got a little bored and drew up a rough draft diagram of the SecondLife architecture from what very little I know.

Here's what I know:

Basically, when you log in you connect to a central asset server and download your inventory.

You then connect to the SIM you were last in. From there you can download prim configurations and textures, but they all come from a squid proxy cache so as to speed up retrieval.

That's about all I know, here are some questions I have and wonder if anyone can fill them out:

1. Is Oracle being used anywhere or is it all MySQL?
2. What sort of replication are they using for back end database?
3. How does cross sim border communication occur? If I am at a sim border and someone says something or rezzes something across the border, how does that get transferred to my avatar? Does it go first to my sim and then to my avatar or do I subscribe to multiple sims when I'm near a border?
4. What causes Asset Server log jams?
- Does the database simply get backed up?
- Is it a network or router problem usually?
- Is there a custom application server written that is buggy?
- Is it because of a minor code push which has a bug in it and takes everyone out?
- Is there a memory leak in a particular server everyone is accessing?
- Is it because they don't have enough bandwidth / hardware resources to handle incoming new users?
(hardware peak usage should always be at 20-30 percent MAX at all times, the other 70% should remain to handle code errors and unforeseen spikes in usage)

- Everything seems to be UDP
- Are we connected via SSL to all servers at all times? (SSL and UDP don't work together so well)

- Other than inventory download what also happens on logging in?

Anyone have any other questions?

Let me know some answers and I'll add them to the diagram.
Adam Zaius
Deus
Join date: 9 Jan 2004
Posts: 1,483
05-11-2005 18:14
Nope, you have your structure significantly wrong.

For item content (the raw data):
Asset Server ---> Squids ---> Sims ---> Client ---Saving---> Asset Server

For inventory (different to the item itself)
DB Server <---> Client

The DB server also has user profiles, sim IP addresses/coordinate lookups, etc etc. It's the DB server dying, not the Asset server.

-Adam
_____________________
Co-Founder / Lead Developer
GigasSecondServer
blaze Spinnaker
1/2 Serious
Join date: 12 Aug 2004
Posts: 5,898
05-11-2005 18:22
From: someone

For item content (the raw data):
Asset Server ---> Squids ---> Sims ---> Client ---Saving---> Asset Server

For inventory (different to the item itself)
DB Server <---> Client

The DB server also has user profiles, sim IP addresses/coordinate lookups, etc etc. It's the DB server dying, not the Asset server.

-Adam


What is "saving" supposed to mean?

So you're saying that the client isn't hitting a squid server to get texture information?

The asset server hits the squid server? That's really weird.

(included below is an updated version of the arch diagram)
I thought all texture information was stored on the asset server? Doesn't the client hit the squid server, check for texture information, and then if it isn't there it goes to the asset server (it should all be there though, at least for that sim).


OK, so the DB Server is different from the Asset Server. Didn't know that existed.

What's the DB Server based on? MySQL or Oracle?
Adam Zaius
Deus
Join date: 9 Jan 2004
Posts: 1,483
05-11-2005 18:48
Nope.

More like this. (As far as I know from various conversations between lindens, this could be inaccurate, only LL knows for certain)

-Adam
_____________________
Co-Founder / Lead Developer
GigasSecondServer
blaze Spinnaker
1/2 Serious
Join date: 12 Aug 2004
Posts: 5,898
05-11-2005 18:56
Ok, I think I can see that the squid isn't on the sims themselves.

However, does the client hit the squid directly or does it go through the sim in order to get the textures?

It would seem that the sim should tell the client what assets to download and it would hit the squid caching servers directly.
Adam Zaius
Deus
Join date: 9 Jan 2004
Posts: 1,483
05-11-2005 18:59
From: blaze Spinnaker
Ok, I think I can see that the squid isn't on the sims themselves.

However, does the client hit the squid directly or does it go through the sim in order to get the textures?

It would seem that the sim should tell the client what assets to download and it would hit the squid caching servers directly.


As far as the client is concerned, all downloads are going to come through the sims. The squids are just there so that you don't have 780 sims hitting 1 server when they need to re-download everything for themselves. I think at last count there were something like a dozen squid servers.

Basically, when downloading a texture, it's being downloaded from the sim you are in, which should have a copy of it. (This is why when the sim has a high 'pending downloads' sometimes textures take up to 30 minutes to appear)

The servers don't hit the squids unless:
- A new object is rezzed, and a copy of it is required so it can be downloaded and replicated to all clients (and so the sim knows what it is, for purposes of collision detection, etc)
- The sim has just rebooted (needs a new copy of everything in its cache). This is why assets are particularly laggy after updates, since 780 servers are hitting the asset server all at once.
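
A minimal sketch of the lookup chain described above (client asks sim, sim asks squid, squid asks asset server), assuming each tier is a simple in-memory cache. The class names, the `fetch` and `reboot` methods, and the `tex-1` identifier are all hypothetical; nothing here reflects LL's actual code.

```python
class AssetServer:
    """Stand-in for the central asset store (hypothetical interface)."""

    def __init__(self, assets):
        self.assets = dict(assets)
        self.requests = 0  # how many fetches reached this tier

    def fetch(self, asset_id):
        self.requests += 1
        return self.assets[asset_id]


class Cache:
    """A cache tier that asks its upstream only on a miss."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.store = {}
        self.requests = 0  # how many fetches reached this tier

    def fetch(self, asset_id):
        self.requests += 1
        if asset_id not in self.store:
            # Miss: pull from the next tier up and keep a copy.
            self.store[asset_id] = self.upstream.fetch(asset_id)
        return self.store[asset_id]

    def reboot(self):
        # A rebooted sim starts with an empty cache.
        self.store.clear()


asset_server = AssetServer({"tex-1": b"texture bytes"})
squid = Cache(upstream=asset_server)
sim = Cache(upstream=squid)

sim.fetch("tex-1")  # cold: sim -> squid -> asset server
sim.fetch("tex-1")  # warm: answered from the sim's own cache
```

In this toy model a sim reboot sends the next request back to the squid but not to the asset server, as long as the squid still holds a copy. That is the load the squids are meant to absorb when many sims restart at once.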
_____________________
Co-Founder / Lead Developer
GigasSecondServer
Icon Serpentine
punk in drublic
Join date: 13 Nov 2003
Posts: 858
05-11-2005 19:01
also, don't forget that the sims use their own custom developed databases which sit in RAM entirely I believe. I might've interpreted it wrong, but it's been up in the ll ops blog for a while (they really don't update that thing often)
_____________________
If you are awesome!
blaze Spinnaker
1/2 Serious
Join date: 12 Aug 2004
Posts: 5,898
05-11-2005 19:11
So every time I rez something I hit the asset server?

Ugh. I had always assumed that rezzing would be localized to the sim you are on and that we only hit the asset server when we go cross border.

I wonder if temporary on rez hits the asset server as well?

Still, I find it a little weird that the squid caches would be so busy. I wonder if the caching in the sims is too small and requires constant updates against the caching server?

The client must hit the squid caches sometimes though.. what about when you're selecting textures from your inventory?

And god knows why the login server is a problem. I mean really - what does it even DO? Update a profile? That's nothing compared to the SIMs / Asset servers.

People should be caching their names/folders for their inventory on their local hard drives and only going to the login db for an inventory refresh when their local copy is corrupted.
Adam Zaius
Deus
Join date: 9 Jan 2004
Posts: 1,483
05-11-2005 19:13
From: blaze Spinnaker
So every time I rez something I hit the asset server?

Ugh. I had always assumed that rezzing would be localized to the sim you are on and that we only hit the asset server when we go cross border.

I wonder if temporary on rez hits the asset server as well?


No, every time you rez something you hit the squids. They are a mirror of the asset server, to prevent the problems of centralisation. Rezzing plywood cubes is not a problem (those are not 'assets'); rezzing from inventory, however, is going to go to the squids if the sim has never seen that asset before. (Sims cache anything in them.)

Cross border actually doesn't hit this system at all; the neighbouring sims have communication between them and download those assets from each other.

-Adam
_____________________
Co-Founder / Lead Developer
GigasSecondServer
blaze Spinnaker
1/2 Serious
Join date: 12 Aug 2004
Posts: 5,898
05-11-2005 19:20
From: someone

No, every time you rez something you hit the squids.


You mean the sim hits the squids and then I get the info from the SIM?

From: someone

rezzing from inventory is however going to go to the squids if the sim has never seen that asset before.


Ahhh, of course. So when I rez from inventory, I tell my sim the UUID and it goes to the squid to get information about it. Every time I create a copy of something (or take it into my inventory), I save to the asset server and create a new UUID.

From: someone

Cross border actually doesn't hit this system at all; the neighbouring sims have communication between them and download those assets from each other.


Ok, I can see that..
blaze Spinnaker
1/2 Serious
Join date: 12 Aug 2004
Posts: 5,898
05-11-2005 19:22
Still, that's all cool, but it doesn't explain the problem with the login DB. If it's inventory, then why aren't they caching names/folders/UUIDs on clients with hash codes for integrity?

As far as I can see, it doesn't make any sense that the login DB should be slow or hang.
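
A rough sketch of the client-side caching idea above, assuming the server can hand out a digest of the inventory listing cheaply. The function names, the item fields, and the `fetch_full` callback are hypothetical, and SHA-256 over canonical JSON is just one possible choice of hash.

```python
import hashlib
import json


def inventory_digest(items):
    """Hash a canonical JSON form of the inventory listing."""
    canonical = json.dumps(items, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def load_inventory(local_items, server_digest, fetch_full):
    """Return (items, refetched).

    The local copy is used when its digest matches the server's digest;
    otherwise the full listing is downloaded via fetch_full().
    """
    if local_items is not None and inventory_digest(local_items) == server_digest:
        return local_items, False
    return fetch_full(), True


# Example: the server-side listing and a matching copy on the client's disk.
server_items = [{"uuid": "a1", "name": "Chair", "folder": "Objects"}]
server_digest = inventory_digest(server_items)

items, refetched = load_inventory(list(server_items), server_digest,
                                  lambda: server_items)
```

With a matching local copy the only thing sent over the wire is the digest. The full listing is downloaded only when the copy is missing or stale.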
Adam Zaius
Deus
Join date: 9 Jan 2004
Posts: 1,483
05-11-2005 19:34
Right.

The database server is another issue completely, mostly separated from inventory.

With streaming inventory (which should be caching to disk according to Kelly Linden) the biggest drain on the sims (logins and their inventory fetches) should be gone. However the real problem is that the Database server is used for absolutely everything, for example, all of the following hit it (and this represents maybe 5% of the functions the DB server performs):

- Login
- Open Inventory
-- Do any inventory browsing (checking for updates)
- Open Find Window
-- Do any searching whatsoever
- Open Map[?]
- Friends list and friends online
- Calling Cards[?]
- Teleporting
- Transferring inventory
- Buying things
- Selling things
- Paying things
- Taking assets to inventory
- Website Logins & Account Pages (except transaction history logs)
- Instant Messages (and offline IM's)

The problem is, with 28,000 users - those things begin taking a serious toll on performance, and decentralising data which is bidirectional (read and write) is extremely difficult to do, especially without having massive latency problems.

-Adam
_____________________
Co-Founder / Lead Developer
GigasSecondServer
blaze Spinnaker
1/2 Serious
Join date: 12 Aug 2004
Posts: 5,898
05-11-2005 19:37
From: someone

With streaming inventory (which should be caching to disk according to Kelly Linden) the biggest drain on the sims (logins and their inventory fetches) should be gone.


Do you mean the 'database' (for want of a better name..)?

Were we storing inventory on sims before?


Other than that, I got it. Nicely summarized. The bit about the bidirectional read write being hard to decentralize is so very true.

This is a partial answer to Apotheus' question as well, I think I'll go direct him to this thread..


Still, it seems to me that each of those functions could be split up into dedicated servers.

I guess the issue is transactional integrity.

Though, IMs don't really need transactional integrity, do they?

Hmmm, tough problem.
blaze Spinnaker
1/2 Serious
Join date: 12 Aug 2004
Posts: 5,898
05-11-2005 19:41
Still, it seems to me that each of those functions could be split up into dedicated servers.
blaze Spinnaker
1/2 Serious
Join date: 12 Aug 2004
Posts: 5,898
05-11-2005 19:49
The reason I can see this as being an answer to Apoetheus's questions is the 'database' is probably rolling back transactions as it gets overloaded and has a hard time maintaining integrity.

But yeah, what to suggest? I guess they could split the boxes up amongst some databases connected via mqseries or something, but that'll just slow transactions down.

It is a fundamental problem. It's also why PayPal and other companies have to charge a fee for financial transactions, because there is an overhead on each of those transactions in order to maintain ACID properties. Micropayments by their very nature are always hard to do.
blaze Spinnaker
1/2 Serious
Join date: 12 Aug 2004
Posts: 5,898
05-11-2005 20:03
Well, they could try websphere / weblogic and enterprise java beans. You can distribute your database and have cross database transactional integrity. I have done this before, though not at the level that SL is doing it...

Something certainly worth looking into, though.
Travis Lambert
White dog, red collar
Join date: 3 Jun 2004
Posts: 2,819
05-11-2005 20:21
Pardon my ignorance, guys - but I find this stuff really interesting :)

What the heck is a Squid? Is that like a Novell iChain box?
blaze Spinnaker
1/2 Serious
Join date: 12 Aug 2004
Posts: 5,898
05-11-2005 20:23
http://www.squid-cache.org/

It's just a cache, actually. You hit the server with an HTTP request and if the proxy doesn't have an entry it will hit another server, respond to you with the data it finds, and store it in its cache so the next time you hit the cache it doesn't have to go to the other server.

The advantage is that you can have a load balancer + a bunch of squid caches and then spread the load around without killing one server.

The disadvantage is that the cache can sometimes get dirty.. I get the sense that LL gets around this by making lots of copies of things (never dirties the cache, just adds to it)

For example, this is why every time you save a notecard you get a new UUID and probably why llwrite2notecard is so difficult.
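
A small sketch of the "never dirties the cache, just adds to it" guess, assuming assets are immutable and every save produces a fresh UUID. The class and method names are hypothetical and only illustrate the idea; how LL actually does it is not known here.

```python
import uuid


class ImmutableAssetStore:
    """Origin store where saving never overwrites: each save gets a new ID."""

    def __init__(self):
        self._assets = {}

    def save(self, data):
        asset_id = str(uuid.uuid4())
        self._assets[asset_id] = data
        return asset_id

    def fetch(self, asset_id):
        return self._assets[asset_id]


class ReadThroughCache:
    """Proxy cache: answer from the local copy, else ask the origin once."""

    def __init__(self, origin):
        self.origin = origin
        self._cached = {}

    def fetch(self, asset_id):
        if asset_id not in self._cached:
            self._cached[asset_id] = self.origin.fetch(asset_id)
        return self._cached[asset_id]


store = ImmutableAssetStore()
cache = ReadThroughCache(store)

v1 = store.save("notecard text, first draft")
cache.fetch(v1)  # the cache now holds the first draft under v1

# "Editing" the notecard stores a new asset under a new ID. The entry
# cached under v1 is still correct, so nothing needs invalidating.
v2 = store.save("notecard text, second draft")
```

Because an ID always maps to the same bytes, a cached entry can never be stale. The cost is that anything referring to the notecard has to be pointed at the new ID after each save.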
Adam Zaius
Deus
Join date: 9 Jan 2004
Posts: 1,483
05-11-2005 20:24
It's an open source proxy cache. http://www.squid-cache.org/

:)

-Adam
_____________________
Co-Founder / Lead Developer
GigasSecondServer
blaze Spinnaker
1/2 Serious
Join date: 12 Aug 2004
Posts: 5,898
05-12-2005 12:19
Proposal for clustering the login / main database:

Basically, replace the current application server, whatever it might be, with a clustered EJB Weblogic (or websphere) 9.0 application server.

Advantages

- State replication via replica aware stubs
- Failover when one ejb server goes down
- Load balancing (remove load from database to EJB servers)

I've done this before and it works well. I have done it with 2-3K people all logging in at the same time, however it was a mostly read scenario. I've run a production environment with about 1000 peak usage in high update and of course I have load tested tens of thousands doing high updates as well.

Weblogic 9.0 has come a long way. I recommend checking out http://e-docs.bea.com/wls/docs90/cluster/overview.html. At the very least you'll get some good insights on how to tackle your database problems.

And yes, transactions have ACID properties across clusters.
blaze Spinnaker
1/2 Serious
Join date: 12 Aug 2004
Posts: 5,898
07-05-2005 00:24
Anyone know what the new architecture looks like?
_____________________
Taken from the last paragraph on pg. 16 of Cory Ondrejka's paper "Changing Realities: User Creation, Communication, and Innovation in Digital Worlds":

"User-created content takes the idea of leveraging player opinions a step further by allowing them to effectively prototype new ideas and features. Developers can then measure which new concepts most improve the products and incorporate them into the game in future patches."
Jeska Linden
Administrator
Join date: 26 Jul 2004
Posts: 2,388
07-05-2005 09:50
Moved to Technical Issues for further discussion.
Buster Peel
Spat the dummy.
Join date: 7 Feb 2005
Posts: 1,242
07-05-2005 23:06
This page has the actual pictures.
Buster Peel
Spat the dummy.
Join date: 7 Feb 2005
Posts: 1,242
07-05-2005 23:35
(Slightly more useful) see page 53 or so. :)