Is this news or am I really just slow today?
edit:
The patch rolled out in this release is just a bug fix to code that was rolled out last week and is not yet active.
We have a little more than 4,000 simulator hosts in Second Life. Each one of these runs up to four regions, or up to sixteen “openspace” regions. If a host isn’t running any regions right now, or if it is running fewer than the max (4 or 16), it has one or more “spare” processes. Right now, spares poll the database every 5-6 minutes to check for regions that should be running but are down. Typically, we have a few hundred spare processes out there, so if a region is down, it comes back up right away.
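To make the numbers above concrete, here is a rough Python sketch of the arithmetic. It assumes a host runs only one kind of region (full or openspace) and that every unused slot counts as one spare process; neither detail is confirmed in the post, and the function names are made up for illustration.

```python
def spare_processes(running: int, openspace: bool = False) -> int:
    """Unused region slots on one host (assumed: one spare process per slot)."""
    capacity = 16 if openspace else 4  # per-host maximums quoted in the post
    if not 0 <= running <= capacity:
        raise ValueError("running must be between 0 and the host capacity")
    return capacity - running


def polls_per_minute(spares: int, interval_minutes: float) -> float:
    """Average database polls per minute if each spare polls once per interval."""
    return spares / interval_minutes


# A host running 3 of 4 regions has one spare slot.
one_spare = spare_processes(3)
# 300 spares (an illustrative figure for "a few hundred") polling every 5 minutes.
rate = polls_per_minute(300, 5)
```

The poll rate grows linearly with the number of spares, which is why adding spares under this scheme adds database load.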
Unfortunately, all of this “spare polling” puts a fair amount of load on the database. Ideally, we’d like to be able to bring more spare machines online without having to worry about whether or not we’re facing the odd concept of “too many spares”. That would allow us, for example, to anticipate new land being delivered by having plenty of spares available before those in land management actually deliver new islands or new mainland.
The 1.21 release includes a new central process that will manage spares and the assignment of regions to spares. No longer will every spare process out on the grid directly hit the central database. Rather, it will contact this management process and ask it (a) what to do, and (b) when to ask again. The code is already out there, but currently it is only running in “load test” mode. That is, we haven’t yet changed the way that regions are assigned to hosts. First, we want to make sure that the management server can handle the load of all of the grid out there polling it. (So far, that looks very good.)
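The exchange described above, where a spare asks the manager what to do and when to ask again, might look something like the sketch below. This is purely illustrative and not Linden Lab's code: `SpareManager`, `Reply`, `report_down`, `ask`, and the 300-second default are all hypothetical names and values, and the load test flag simply models "answer polls but assign nothing".

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class Reply:
    region: Optional[str]  # (a) what to do: region to start, or None to stay idle
    retry_after_s: int     # (b) when to ask again, in seconds


@dataclass
class SpareManager:
    """Hypothetical central process that hands down regions to spares."""
    down_regions: Deque[str] = field(default_factory=deque)
    retry_after_s: int = 300   # assumed default, roughly the old 5-minute poll
    load_test: bool = False    # in load test mode, never assign anything

    def report_down(self, region: str) -> None:
        self.down_regions.append(region)

    def ask(self, spare_id: str) -> Reply:
        # spare_id is unused here; a real manager would presumably track callers.
        if self.load_test or not self.down_regions:
            return Reply(None, self.retry_after_s)
        return Reply(self.down_regions.popleft(), self.retry_after_s)


manager = SpareManager()
manager.report_down("Region A")
first = manager.ask("spare-1")   # gets "Region A"
second = manager.ask("spare-2")  # nothing left, told to wait and ask again
```

Because the manager chooses the retry interval, it can slow polling down as the number of spares grows, which the spares cannot coordinate on their own when each one polls the database directly.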
Because of the bug in the new code, the servers in our Phoenix data center (which currently make up less than 10% of the total servers) are not yet contacting the manager in load test mode. The roll this week will allow us to complete the load test, and then turn on the functionality that will reduce the load on the database.
This isn’t the end of the road, because there are other things that hit the database, and we’re continuing to work on other avenues of reducing load on our central database.

