Rolling Restart for Server 1.30

Tristin Mikazuki
Sarah Palin ROCKS!
Join date: 9 Oct 2006
Posts: 1,012
09-17-2009 10:41
Is the pilot roll done?
AWM Mars
Scarey Dude :¬)
Join date: 10 Apr 2004
Posts: 3,398
09-17-2009 10:50
From: Cincia Singh
Did you check scripts on your estate to see if they were the problem? Did your guests come loaded down with scripts and scripted attachments that contributed to the lag? The practical top load for a sim has been, and continues to be, regardless of how high you set your estate limit, 50 avatars ... fewer in areas with a lot of shops with poorly written scripts. Toss in a lot of dancing and poseballs and I can see how 45 avatars would lag a sim. Model bots and camped alts also contribute to consuming a sim's resources depending on how they're set up. That said, I've been at events on many sims, not yours, with 45 avatars and had 20-30 fps steady except when someone TPs into the sim.

Good luck tracking down your lag, because I don't see you or any of us getting any resolution until LL upgrades to cloud servers or from Class 5 to Class Unfreakingbelieveable servers.

Now you are sounding like LL support... in fact like all support.. it's always the customer that is at fault. Truth of the matter is, SL is failing, suffering cascade failures due to the strain of forced expansion it was not designed to handle.
_____________________
*** Politeness is priceless when received, cost nothing to own or give, yet many cannot afford -

Why do you only see typo's AFTER you have clicked submit? **
http://www.wba-advertising.com
http://www.nex-core-mm.com
http://www.eml-entertainments.com
http://www.v-innovate.com
DanielRavenNest Noe
Registered User
Join date: 26 Oct 2006
Posts: 1,076
09-17-2009 12:54
From: Lil Linden

It doesn't seem practical to provide an ongoing % of completion during those two days, since we're usually pretty busy making sure that each and every region comes back online after the upgrade. I'd suggest trying to avoid scheduled events at the beginning of the day (~8am Pacific time), because that's when the restarts usually happen.


You are talking about thousands of users, and hundreds of events interrupted by a rolling restart. How would you like it if *your* live performance, wedding, or business meeting was interrupted?

Surely you can keep a browser window open to this thread and post a comment every 25% increment or so (at least on the main roll, the pilot is only 500 regions, so really don't need incremental updates on that part)

Even better would be to tell us the planned firing order of the restarts (by simhost number for example), then we can check for ourselves how far along the restarts have gotten. Even if it does not go exactly to plan, if we know in advance what the plan is, we can work around it: "OK, my region has a high simhost number, so it will be late in the restart, better move that noon meeting".
Abigail Merlin
Child av on the lose
Join date: 25 Mar 2007
Posts: 777
09-18-2009 01:22
From: someone
Rolling Restart, Friday 09/18 8-11AM PDT
Posted by Status Desk on September 17th, 2009 at 04:35 pm PDT
We are having a small additional pilot rolling restart which will follow this schedule:

»Fri. 09/18, 8-11AM PDT : a pilot roll of ~1600 regions

As with all of our server deploys, each region will be restarted once during one of the rolling restart periods. Most regions will be down no more than 5-10 minutes, although some fraction of the regions will take 20-30 minutes to upgrade. If your region stays down for more than 30 minutes, please contact support. Each region will receive warnings starting 5 minutes before that region is restarted.

From: someone

[RESOLVED] Crash issues after rolling restart
Posted by Status Desk on September 17th, 2009 at 02:13 pm PDT
[Sep 17 4:26pm PDT] Our engineers have located an issue, and a fix has been deployed to the grid. We’ve seen a significant reduction in region crashes. Please contact support if you are having further crash problems.

[Sep 17 2:16pm PDT] We are currently experiencing a higher-than-normal crash rate after our rolling restart. We are aware of the issue and are working to resolve it. Please check back frequently with the blog for further updates.

Are these 2 related, and if so, what will be the new version number? Or was the increased crash rate a misconfiguration that needs a new restart?
If unrelated, what is the additional fix being deployed, and what version number?
Zena Juran
Registered User
Join date: 21 Jul 2007
Posts: 473
09-18-2009 06:47
From: Dante Linden
There are at least two approaches to scheduling releases that we could use. One is feature-based ("our next release will have features X, Y, and Z"), and the other is time-based ("every N weeks, we'll release what's been implemented/fixed and tested so far").

The problem with the first approach is that delays in feature Z mean that features X and Y, which could be ready to deploy, don't get out onto the grid in a timely fashion.

We are using the second approach. We tell all the developers when we'll be doing releases, and if they finish their code and it gets tested in time to make it into the next release, then it can go in. If not, then the code waits until the next release following the completion of that testing. This gets features like X and Y onto the grid at a somewhat regular interval without concern over the fact that feature Z might be one or two months behind. This does mean that the feature list for a release will be more variable than if we were taking the first approach and releasing specifically to get features X, Y, and Z onto the grid.
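
A minimal sketch of the time-based rule described above: a feature ships in the first scheduled release on or after the day its testing completes. The function name and the dates used in the usage note are hypothetical, chosen only for illustration; they are not Linden Lab's actual schedule or tooling.

```python
from bisect import bisect_left
from datetime import date
from typing import Optional, Sequence


def assign_release(release_dates: Sequence[date], ready_on: date) -> Optional[date]:
    """Return the first scheduled release on or after `ready_on`.

    `release_dates` must be sorted ascending. Returns None when the
    feature is ready only after the last scheduled release.
    """
    # bisect_left finds the first release date that is >= ready_on.
    i = bisect_left(release_dates, ready_on)
    return release_dates[i] if i < len(release_dates) else None
```

For example, with hypothetical releases on the first of September, October, and November 2009, a feature whose testing finishes on 15 September waits for the October release, while features finished earlier are not held back by it.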




It does seem that "scheduled" releases take more than one attempt to make it to the grid nowadays.

Is the second approach really working? Are you getting quality features out in a timely manner?
Lil Linden
Linden Lab Employee
Join date: 12 May 2008
Posts: 81
09-18-2009 09:13
From: Abigail Merlin
Are these 2 related, and if so, what will be the new version number? Or was the increased crash rate a misconfiguration that needs a new restart?
If unrelated, what is the additional fix being deployed, and what version number?


The increased crash rate was observed when we rolled 1.30.0.132925 to the pilot regions. While those sims have a fix in place, they will need to receive 1.30.0.133657 during the final roll next week.
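
For anyone who wants to check this mechanically, here is a rough sketch that compares dotted server version strings as tuples of integers. It assumes the version is always four dot-separated numbers, as in the two builds quoted above; the helper names are made up for this example.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version like '1.30.0.132925' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def needs_final_roll(current: str, target: str = "1.30.0.133657") -> bool:
    """True if `current` is older than `target` (default: the final-roll build)."""
    # Tuples compare element by element, so the build number decides
    # the result when the first three components are equal.
    return parse_version(current) < parse_version(target)
```

Under that assumption, a pilot region still on 1.30.0.132925 is reported as needing the final roll, and one already on 1.30.0.133657 is not.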
DanielRavenNest Noe
Registered User
Join date: 26 Oct 2006
Posts: 1,076
09-18-2009 09:15
From: Zena Juran
It does seem that "scheduled" releases take more than one attempt to make it to the grid nowadays.

Is the second approach really working? Are you getting quality features out in a timely manner?


I do think their current plan of having a week between pilot and main roll is better than what they had before (a day or two). For "backend" fixes like this one seems to be, where it's not obvious to us users what is changing, we can't do a good job testing on the beta grid.

Letting it perk for a week on the production grid will let it get tested by real users under real load. Applying it to ~5% of the grid means you have enough servers switched to see how it's working, but not commit the vast majority to the pain if it goes wrong. It's especially good that the perk time covers a weekend, since that's when the peak load on the system is.

We don't know their pre-release testing process. I sure hope it's more than "it compiled", and closer to:

"Log in 40 dancing blingtards via random internet routes (ie not from a rack in the same server room) on a region with 6000 scripts"

because that's what actually will happen on some real production servers.
Lil Linden
Linden Lab Employee
Join date: 12 May 2008
Posts: 81
09-18-2009 09:16
From: Abigail Merlin
can we at least know on what day the odd and even hosts get rolled and in what order?
also can someone confirm that sims are locked to their host between the start of the pilot roll and the end of the final roll?


Yes! Even sims will be rolled first, on Wednesday, and odd sims on Thursday.

During all rolling restarts we do indeed lock all sims to their hosts.
DanielRavenNest Noe
Registered User
Join date: 26 Oct 2006
Posts: 1,076
Thank you Lil
09-18-2009 09:30
That at least narrows it down to a one day window for those of us who know how to check the host number:

Open top menu Help > About Second Life, and read the 5th line that looks like:

"sim8062.agni.lindenlab.com (216.82.37.129:13002)"

8062 would be the server ID number. The rest of that line gives the full DNS name and numerical address/port of the server.
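
If you would rather not eyeball it, the following is a small sketch that pulls the server ID out of that line and reports whether it is even or odd. It assumes the host line always has the form simNNNN.agni.lindenlab.com, as in the example above; the function names are invented for illustration.

```python
import re

# Matches the numeric ID in a host name such as sim8062.agni.lindenlab.com
HOST_PATTERN = re.compile(r"sim(\d+)\.agni\.lindenlab\.com")


def simhost_number(about_line: str) -> int:
    """Extract the server ID number from the Help > About host line."""
    match = HOST_PATTERN.search(about_line)
    if match is None:
        raise ValueError(f"no simhost name found in: {about_line!r}")
    return int(match.group(1))


def roll_group(host_number: int) -> str:
    """Return 'even' or 'odd' for the given server ID number."""
    return "even" if host_number % 2 == 0 else "odd"
```

For the example line, this gives server ID 8062, which falls in the even group.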

If you want to know if a particular sim has been updated, the next line lists the server version number:

"Second Life Server 1.27.2.129783" or similar for pre-roll, and 1.30.... for post roll.
Sindy Tsure
Will script for shoes
Join date: 18 Sep 2006
Posts: 4,103
09-18-2009 09:37
From: Lil Linden
Yes! Even sims will be rolled first, on Wednesday, and odd sims on Thursday.

During all rolling restarts we do indeed lock all sims to their hosts.

< prospero>
...but until LL does lock the sim to the host it's on, your even/oddness has a 50% chance of changing if the sim restarts. Beware relying on the host number not changing.
< /prospero>
_____________________
Sick of sims locking up every time somebody TPs in? Vote for SVC-3895!!!
- Go here: https://jira.secondlife.com/browse/SVC-3895
- If you see "if you were logged in.." on the left, click it and log in
- Click the "Vote for it" link on the left
Lil Linden
Linden Lab Employee
Join date: 12 May 2008
Posts: 81
09-18-2009 09:55
From: Sindy Tsure
< prospero>
...but until LL does lock the sim to the host it's on, your even/oddness has a 50% chance of changing if the sim restarts. Beware relying on the host number not changing.
< /prospero>


Prospero speaks the truth. Should your sim restart or crash during normal operating hours (outside a rolling restart,) it will likely be assigned to a new sim host when it comes back up. In that case, you will have a 50% chance of changing your "even/oddness".
Cincia Singh
Registered User
Join date: 26 Jun 2007
Posts: 79
09-18-2009 09:58
From: AWM Mars
Now you are sounding like LL support... in fact like all support.. it's always the customer that is at fault. Truth of the matter is, SL is failing, suffering cascade failures due to the strain of forced expansion it was not designed to handle.


I don't recall addressing you and right now the words escape me to express the depth of my respect for you, your opinions, your technical expertise, and your manners.
Ilana Debevec
Registered User
Join date: 25 May 2007
Posts: 130
09-18-2009 10:54
Well we got 1.30 yesterday as part of the 1st pilot roll and... EVERYONE has noticed a significant difference.. FOR THE BETTER. Still getting a pause when someone enters the sim, but it seems the sim recovers faster (1-3 seconds, vs 10-15). Doesn't SEEM to be degrading over time.. even sim crossings (we have 2 right next to each other, both that got 1.30) before I would literally 'ghost' 1/2 into the next sim before snapping back when crossing.. now.. on 4 out of the 10 tests I did, I just had a small pause going across, the rest I rubber-banded no more than 10m worst case.

Someone did something right?
Cincia Singh
Registered User
Join date: 26 Jun 2007
Posts: 79
09-18-2009 11:26
From: DanielRavenNest Noe
"Log in 40 dancing blingtards via random internet routes (ie not from a rack in the same server room) on a region with 6000 scripts".

OMG you have been watching through my windows at my tea parties, haven't you! ;-)
Katheryne Helendale
(loading...)
Join date: 5 Jun 2008
Posts: 2,187
09-18-2009 11:33
From: Ilana Debevec
Well we got 1.30 yesterday as part of the 1st pilot roll and... EVERYONE has noticed a significant difference.. FOR THE BETTER. Still getting a pause when someone enters the sim, but it seems the sim recovers faster (1-3 seconds, vs 10-15). Doesn't SEEM to be degrading over time.. even sim crossings (we have 2 right next to each other, both that got 1.30) before I would literally 'ghost' 1/2 into the next sim before snapping back when crossing.. now.. on 4 out of the 10 tests I did, I just had a small pause going across, the rest I rubber-banded no more than 10m worst case.

Someone did something right?
Oh, that is awesome news! Perhaps the problems that have been plaguing the grid lately will finally go away? Oh, if this is so, then I will have Dante's babies! :D

This is also the server version that's supposed to finally enable the HTTP texture pipeline, yes? Can someone confirm that this works?
_____________________
From: Debra Himmel
Of course, its all just another conspiracy, and I'm a conspiracy nut.

Need a high-quality custom or pre-fab home? Please check out my XStreetSL Marketplace at http://www.xstreetsl.com/modules.php?name=Marketplace&MerchantID=231434/ or IM me in-world.
Viktoria Dovgal
Join date: 29 Jul 2007
Posts: 3,593
09-18-2009 16:59
Hey all, in case you missed it, the restarts were pushed closer, to tonight and tomorrow morning, thanks to some fun security fix.

http://status.secondlifegrid.net/2009/09/17/post734/
Sindy Tsure
Will script for shoes
Join date: 18 Sep 2006
Posts: 4,103
09-18-2009 17:46
From: Ilana Debevec
Well we got 1.30 yesterday as part of the 1st pilot roll and... EVERYONE has noticed a significant difference.. FOR THE BETTER. Still getting a pause when someone enters the sim, but it seems the sim recovers faster (1-3 seconds, vs 10-15). Doesn't SEEM to be degrading over time.. even sim crossings (we have 2 right next to each other, both that got 1.30) before I would literally 'ghost' 1/2 into the next sim before snapping back when crossing.. now.. on 4 out of the 10 tests I did, I just had a small pause going across, the rest I rubber-banded no more than 10m worst case.

Someone did something right?

/me thinks it's too early to declare victory. My home was also in the pilot and, for the short time I was logged in yesterday, it was still stalling like it has been for the past 4 months.

This bug is really, REALLY annoying me.
Tristin Mikazuki
Sarah Palin ROCKS!
Join date: 9 Oct 2006
Posts: 1,012
09-18-2009 17:51
From: Viktoria Dovgal
Hey all, in case you missed it, the restarts were pushed closer, to tonight and tomorrow morning, thanks to some fun security fix.

http://status.secondlifegrid.net/2009/09/17/post734/



What's the one for tonight? Odd or even?
Tristin Mikazuki
Sarah Palin ROCKS!
Join date: 9 Oct 2006
Posts: 1,012
09-18-2009 18:40
uumm are they going to start at 6pm PST? or some other time zone? lol
and have they started yet? anyone know?
Tristin Mikazuki
Sarah Palin ROCKS!
Join date: 9 Oct 2006
Posts: 1,012
09-18-2009 19:46
Well yes they have started....
Lil Linden
Linden Lab Employee
Join date: 12 May 2008
Posts: 81
Rolling Restart Update
09-18-2009 19:54
We've rolled approximately 700 / 2200 regions for tonight's work. I'll update again as we get to the 2/3 mark.

Everything that gets rolled from this point on is going to be an even sim; tomorrow will include the remaining evens & all odds.
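
As a rough guide to what those figures mean, here is a tiny sketch of the progress arithmetic. It takes the approximate counts above (700 of 2200) at face value; the function names are made up for this example.

```python
def percent_done(rolled: int, total: int) -> float:
    """Percentage of regions rolled so far."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * rolled / total


def regions_at_fraction(total: int, numerator: int, denominator: int) -> int:
    """Smallest whole number of regions that reaches numerator/denominator of total."""
    # Integer ceiling division, so there is no floating-point rounding.
    return -(-total * numerator // denominator)
```

With these numbers, 700 of 2200 is a little under a third of tonight's work, and the 2/3 mark is reached at about 1467 regions.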
Tristin Mikazuki
Sarah Palin ROCKS!
Join date: 9 Oct 2006
Posts: 1,012
09-18-2009 20:17
Thank you Lil ;-)

Are you doing just the pilot sims?
Lil Linden
Linden Lab Employee
Join date: 12 May 2008
Posts: 81
09-18-2009 20:54
From: Tristin Mikazuki
Thank you Lil ;-)

Are you doing just the pilot sims?


We've done the pilot sims, plus a good chunk of the rest. Tomorrow we'll have about 3700 sims to do, mostly odds and the rest of the evens.

Aaaaaannd: we're 66% done. I'll update again when we're finished for the night.
Tristin Mikazuki
Sarah Palin ROCKS!
Join date: 9 Oct 2006
Posts: 1,012
09-18-2009 20:56
Thanks SSOOO much ;-)
Sindy Tsure
Will script for shoes
Join date: 18 Sep 2006
Posts: 4,103
09-18-2009 21:11
From: Lil Linden
We've done the pilot sims, plus a good chunk of the rest. Tomorrow we'll have about 3700 sims to do, mostly odds and the rest of the evens.

Aaaaaannd: we're 66% done. I'll update again when we're finished for the night.

So the pilot sims that got done yesterday got redone tonight?

edit: /me sends Lil a big pot of coffee.