Welcome to the Second Life Forums Archive

Rolling restart for 1.24.7 2008 Sep 30-Oct 2

Sindy Tsure
Will script for shoes
Join date: 18 Sep 2006
Posts: 4,103
10-02-2008 21:19
From: Urantia Jewell
I'm pretty sure that the ping times come after the server id. Please correct me if I'm wrong..

Nope. You on a Mac or Linux, Sharie?

From: someone
14 ae-11-11.car2.Phoenix1.Level3.net (4.69.133.34) 106.981 ms 107.560 ms 107.173 ms
15 LINDEN-RESE.car2.Phoenix1.Level3.net (63.214.168.58) 1311.870 ms 1310.940 ms 1310.973 ms


From: someone
11 81 ms 81 ms 80 ms ae-11-11.car2.Phoenix1.Level3.net [4.69.133.34]
12 81 ms 80 ms 80 ms LINDEN-RESE.car2.Phoenix1.Level3.net [63.214.168.58]

The pings are the "ms" values (highlighted in the original post). It's really just a format difference.

edit: see about 1/3rd of the way down this random google link: http://www.exit109.com/~jeremy/news/providers/traceroute.html

From: Urantia Jewell
Again, if I'm wrong and the ping times come before the router id then I apologize.

No worries.

edit again: Seems a lot happier this morning.. One timeout, right at the start, and one burp up to 106ms, but otherwise a rock solid 79-81ms.

Ping statistics for 63.214.168.58:
Packets: Sent = 542, Received = 541, Lost = 1 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 78ms, Maximum = 106ms, Average = 80ms
Argent Stonecutter
Emergency Mustelid
Join date: 20 Sep 2005
Posts: 20,263
10-03-2008 06:34
From: Urantia Jewell
I'm pretty sure that the ping times come after the router id. Please correct me if I'm wrong.
OK.

There are many traceroute programs out there, some based on the old BSD code, others written from scratch. Hundreds of programmers have hacked on them over the past 20 years, changing the way they operate and tweaking the output. The only constant is that each line contains one or more round trip times, a hop number, and the name and/or IP address of the host.
Chupacabra Decosta
Second Life Addicted
Join date: 28 May 2008
Posts: 10
10-03-2008 08:21
Another rolling?

Second Life Server 1.24.8.98496

I was caught in a sim restart. After I crashed with the sim, it came back as Second Life Server 1.24.8.98496.
Is SL rolling again?
Dex Mason
Registered User
Join date: 24 Aug 2005
Posts: 96
10-03-2008 09:01
Yup, another rolling restart of all sims. Something to do with an emergency RR to correct it.

Should be on the blog soon.
flopsie McArdle
Registered User
Join date: 26 May 2006
Posts: 3
no notice?
10-03-2008 09:19
How come we weren't told on the grid status that 1.24.8 was coming out? If people are angry about the number of rolling restarts recently, imagine how they feel when you don't even bother to say anything first. LL's commitment to keeping the community informed of what they are doing has always been spotty, but not even a short notice on the grid status this time?

;'
Sindy Tsure
Will script for shoes
Join date: 18 Sep 2006
Posts: 4,103
10-03-2008 09:26
If this really is an 'emergency' deploy, they probably figure it's more important to get it going ASAP than to update the blog/forums. For something that causes content or L$ loss, or fixes a permissions issue, I'm pretty ok with that.

Once it's moving and Prospero has a chance to breathe again, I hope he lets us know what happened.

edit: http://status.secondlifegrid.net/2008/10/03/post270/ says:
From: Status Blog
There is a secondary Rolling Restart (to server version 1.24.8) in progress. We expect this to complete within 5 hours. Please watch the Status Blog for detailed information, which will be posted after the Rolling Restart has completed.
Prospero Linden
Linden Lab Employee
Join date: 6 Aug 2007
Posts: 315
10-03-2008 09:51
Yes, the rolling restart that is going right now is not something that was planned until last night, and there will be more information as it ends and we're able to gather ourselves together.

Some other answers: re: the network times and such posted above. A good portion of the grid is in Phoenix right now. Phoenix performance is not in general slower than Dallas performance. However, any time you try to region cross or teleport from a region at one colo to another, it will be slower than doing it to a region that's in the same colo. For this reason, we try to make neighboring regions in the same colo as much as possible.

It is incorrect to say that all transactions to Phoenix go through Dallas. *Some* transactions from Dallas to Phoenix go through San Francisco, and vice versa.

However, anybody may have different performance to different data centers-- depending on the network path you go through to get there, and what's going on along all of it.

In any event, all this network stuff is wildly off topic for this thread.

From: someone

And if you don't think scripts need to be reset after a rolling restart, you are wrong. Anyone using dance ball sets has to reset them if someone stays on them when the sim goes down, or they stay invisible.


I know that people are not going to like this response, but the truth is that that's a poorly written dance ball. Something that is really going to be robust for full usage needs to be written to handle unexpected changes of state. I've got a poorly written poseball I use myself, written by me, that pops up a script error whenever somebody logs off while sitting on it. It hasn't bothered me enough to actually *fix* it, but it is true that I haven't handled all the possible error cases yet.
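
A minimal LSL sketch of what handling the logoff case might look like. It assumes the script error comes from calling llStopAnimation after the avatar has already left the region; the post does not say what actually triggers it. The animation name "dance1" and the sit target offset are hypothetical placeholders.

```lsl
// Hypothetical animation name; replace with one in the prim's inventory.
string ANIM = "dance1";
key sitter = NULL_KEY;

// Stop the animation only if the avatar is still in the region and this
// script still holds animation permission for that same avatar.
release()
{
    if (sitter != NULL_KEY
        && llGetAgentSize(sitter) != ZERO_VECTOR
        && llGetPermissionsKey() == sitter
        && (llGetPermissions() & PERMISSION_TRIGGER_ANIMATION))
    {
        llStopAnimation(ANIM);
    }
    sitter = NULL_KEY;
}

default
{
    state_entry()
    {
        // Hypothetical offset; a sit target is needed for llAvatarOnSitTarget.
        llSitTarget(<0.0, 0.0, 0.1>, ZERO_ROTATION);
    }

    on_rez(integer start_param)
    {
        llResetScript(); // start from a known state on every rez
    }

    changed(integer change)
    {
        if (change & CHANGED_LINK)
        {
            key av = llAvatarOnSitTarget();
            if (av != NULL_KEY && av != sitter)
            {
                release();
                sitter = av;
                llRequestPermissions(av, PERMISSION_TRIGGER_ANIMATION);
            }
            else if (av == NULL_KEY)
            {
                release(); // stood up, logged off, or was unseated
            }
        }
    }

    run_time_permissions(integer perm)
    {
        // Animate only if the avatar who granted permission is still seated.
        if ((perm & PERMISSION_TRIGGER_ANIMATION)
            && sitter != NULL_KEY
            && llAvatarOnSitTarget() == sitter)
        {
            llStopAnimation("sit");
            llStartAnimation(ANIM);
        }
    }
}
```

The llGetAgentSize check is the key design choice here: it returns ZERO_VECTOR when the avatar is no longer in the region, so the stop call is skipped instead of erroring.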

Yes, the memory leak rolling restart was particularly painful. It's also not characteristic of every rolling restart we have.
Sindy Tsure
Will script for shoes
Join date: 18 Sep 2006
Posts: 4,103
10-03-2008 10:02
Thanks for the update, Prospero!

From: Prospero Linden
In any event, all this network stuff is wildly off topic for this thread.

Welcome to the forums! :)

(edit: and sorry for all the network babble..)
Sweet Valentine
Cupids I&D
Join date: 18 Feb 2006
Posts: 39
10-03-2008 10:02
This is very unsatisfactory! No warning of a rolling restart. It is beyond me what you people think. We are paying customers here; if you want us to respect you, please have some for us.
Please warn us before you restart. It takes 2 mins to post to your grid status, but still at this point we are not worthy of that. It is bad enough these rolling restarts have broken so much content (my vendors are toast again and pose scripts are broken); now no warnings again. Just sick. =(((((((((
Abigail Merlin
Child av on the lose
Join date: 25 Mar 2007
Posts: 777
10-03-2008 10:05
From: Sindy Tsure
If this really is an 'emergency' deploy, they probably figure it's more important to get it going ASAP than to update the blog/forums. For something that causes content or L$ loss, or fixes a permissions issue, I'm pretty ok with that.

Once it's moving and Prospero has a chance to breathe again, I hope he lets us know what happened.


If it is an 'emergency' rollout I assume it will be a whole grid restart, especially if it is an exploit fix of showstopper level.
Been a while since we had a whole grid restart.
Ann Otoole
Registered User
Join date: 22 May 2007
Posts: 867
10-03-2008 10:06
Hopefully this is not a restart just to allow libsecondlife (sleek, traffic falsification) bots back in.
Sindy Tsure
Will script for shoes
Join date: 18 Sep 2006
Posts: 4,103
10-03-2008 10:07
From: Abigail Merlin
If it is an 'emergency' rollout I assume it will be a whole grid restart, especially if it is an exploit fix of showstopper level.
Been a while since we had a whole grid restart.

It sounds like the whole grid _is_ restarting! There's no talk of pilot rolls or even/odd hosts. It's just not restarting all at once.

You're probably right (and I'm totally guessing) that it's not a nasty exploit. I think I've only seen them pull the plug for something like that once before. I'm almost tempted to wish for it again, just so people who think the rolling restarts are so painful can see what it used to be like on Upgrade Wednesdays.. :)
Kira Cuddihy
Registered User
Join date: 29 Nov 2006
Posts: 1,375
10-03-2008 10:16
Didn't they shut down for several days once before, Sindy, when someone tried to hack SL?
Sindy Tsure
Will script for shoes
Join date: 18 Sep 2006
Posts: 4,103
10-03-2008 10:29
From: Kira Cuddihy
Didn't they shut down for several days once before, Sindy, when someone tried to hack SL?

Dunno! I don't think they've ever shut it down and kept it down that long since I've been here..
Winter Ventura
Eclectic Randomness
Join date: 18 Jul 2006
Posts: 2,579
10-03-2008 10:30
HOW DARE YOU FIX THINGS!!!

Okay I'm done.
_____________________

● Inworld Store: http://slurl.eclectic-randomness.com
● Website: http://www.eclectic-randomness.com
● Twitter: @WinterVentura
Prospero Linden
Linden Lab Employee
Join date: 6 Aug 2007
Posts: 315
10-03-2008 10:34
So, yeah, we're doing this one all in one go, as opposed to doing half one day and half the next.

We're going at the same speed, though, so it'll take 6 hours to complete today, whereas usually it takes about 3 hours on each of the "half" days.
eku Zhong
Apocalips = low prims
Join date: 27 May 2008
Posts: 752
10-03-2008 10:36
Is this an up, down, or sideways?
Because some of my sims still weren't rolled from the first time... now the one that was rolled is being rolled again???
Jahar Aabye
Registered User
Join date: 14 Mar 2007
Posts: 58
10-03-2008 10:36
Here's a quick addition for danceballs and other scripts that rely on the changed() event (especially CHANGED_LINK) that might be screwed up by a rolling restart:


integer dancing = FALSE;

default
{
    state_entry()
    {
        // Sets a 5 minute timer, slow enough that it shouldn't add much script time.
        llSetTimerEvent(300.0);
    }

    changed(integer change)
    {
        if (change & CHANGED_LINK)
        {
            key av = llAvatarOnSitTarget();

            if (av != NULL_KEY)
            {
                // Here is where all the dance stuff probably resides, just add:
                dancing = TRUE;
            }
            else
            {
                dancing = FALSE;
            }
        }
    }

    timer()
    {
        // If the ball is part of a linked set, replace 1 with the total number
        // of prims, as this is testing whether the ball is empty.
        if (dancing && llGetNumberOfPrims() == 1)
        {
            llResetScript();
        }
    }
}


What this does is two things. It adds a variable (dancing) that gets switched on when someone is using the danceball and switched off when they stand up. It also sets a timer to check every five minutes to see if the dance ball still thinks someone is dancing on it while it is empty. If that happens, the script resets itself. Note that this isn't perfect and probably would have to be adapted for your specific use, but it should still help.


I'm assuming that this rolling restart is due to some SEVERE issue for it to occur across the whole grid suddenly. Either a serious security vulnerability or a serious crash problem would be the only obvious answers, and crash problems are usually announced, so one can guess where I'm placing my bets. I'd just gotten feedback from beta testers last night for a major product I'm working on, and I was planning on doing some major debugging today, but something tells me that a grid-wide rolling restart might not be the best time to be doing script work. Especially since I've already had to rescript half of this stuff after a script revert 2 weeks ago that literally set off a seizure when I found it.
Zena Juran
Registered User
Join date: 21 Jul 2007
Posts: 473
10-03-2008 10:37
From: Prospero Linden
Yes, the rolling restart that is going right now is not something that was planned until last night, and there will be more information as it ends and we're able to gather ourselves together.

Some other answers: re: the network times and such posted above. A good portion of the grid is in Phoenix right now. Phoenix performance is not in general slower than Dallas performance. However, any time you try to region cross or teleport from a region at one colo to another, it will be slower than doing it to a region that's in the same colo. For this reason, we try to make neighboring regions in the same colo as much as possible.

It is incorrect to say that all transactions to Phoenix go through Dallas. *Some* transactions from Dallas to Phoenix go through San Francisco, and vice versa.

However, anybody may have different performance to different data centers-- depending on the network path you go through to get there, and what's going on along all of it.

In any event, all this network stuff is wildly off topic for this thread.



I know that people are not going to like this response, but the truth is that that's a poorly written dance ball. Something that is really going to be robust for full usage needs to be written to handle unexpected changes of state. I've got a poorly written poseball I use myself, written by me, that pops up a script error whenever somebody logs off while sitting on it. It hasn't bothered me enough to actually *fix* it, but it is true that I haven't handled all the possible error cases yet.

Yes, the memory leak rolling restart was particularly painful. It's also not characteristic of every rolling restart we have.



Sure would be nice to have a data center in the eastern USA. And I'm sure people from around the world would love a data center on their continent. Any plans for further expansion, Prospero?
Jahar Aabye
Registered User
Join date: 14 Mar 2007
Posts: 58
10-03-2008 10:42
Keep in mind that more datacenters means more chances of having issues teleporting between regions in different datacenters. It also increases the (admittedly rare) chances of having the communications between datacenters get cut (which has happened on occasion). On the other hand, it also means that should one datacenter go down, there are others still online, so the entire grid would be unlikely to go down in the event of a failure at one datacenter.

Still, it would add significantly to the time it takes for different regions to communicate with each other. This could be especially problematic if it did involve regions that were next to each other, since you'd have the problem of a sim potentially having to communicate with child agents who are hosted on a simulator at a different datacenter. To translate that into English: lag.
flopsie McArdle
Registered User
Join date: 26 May 2006
Posts: 3
10-03-2008 10:46
From: Sindy Tsure
If this really is an 'emergency' deploy, they probably figure it's more important to get it going ASAP than to update the blog/forums. For something that causes content or L$ loss, or fixes a permissions issue, I'm pretty ok with that.

Once it's moving and Prospero has a chance to breathe again, I hope he lets us know what happened.

edit: http://status.secondlifegrid.net/2008/10/03/post270/ says:


That notice came long after my sim restarted, and after I'd posted. How long would it have taken to post that before the restart? And I'd rather hear from the people that are in the know; we can all speculate why. My guess, which was also wrong, was that giant mutant ants attacked LL as they were getting underway, and everyone knows that it's impossible to fight off giant insects, do a rolling restart AND post to the blog.
robertltux McCallen
Registered User
Join date: 17 Nov 2007
Posts: 50
something that would maybe help
10-03-2008 10:49
Why doesn't LL create a Core Beacon object and then give it out (say, at the Linden office locations)?

Maybe a cone prim with a rotating hand on top that can be set to:

1. Green (normal color): all clear, everything's fine
2. Flashing bright green: important blog posting
3. Yellow: critical blog posting (security flaw found, major client release)
4. Orange: rolling restart day (this server is in the group)
5. Red: trouble right here in River City (half hour to RR window)
6. Red flashing: big trouble, grid-wide trouble
7. Black (should never happen): colo failure, chunks of the grid are shut down, dogs and cats sleeping together

maybe a side project for some of the programmers??
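
A rough LSL sketch of how such a beacon might map the seven levels to colors. How Linden Lab would actually feed it status is not specified anywhere in this thread; as a stand-in assumption, the owner sets the level by chatting a number on a hypothetical channel (for example "/7 4"). The color values and the one-second flash rate are also just illustrative choices.

```lsl
// Hypothetical control channel: the owner types "/7 <level>" in chat.
integer CHANNEL = 7;

// One color and one flashing flag per level 1..7, in the order listed above.
list COLORS = [<0.0, 1.0, 0.0>, <0.0, 1.0, 0.0>, <1.0, 1.0, 0.0>,
               <1.0, 0.5, 0.0>, <1.0, 0.0, 0.0>, <1.0, 0.0, 0.0>,
               <0.0, 0.0, 0.0>];
list FLASHING = [FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE];

integer level = 1;   // current status level
integer lit = TRUE;  // bright or dim phase while flashing

// Paint the prim with the current level's color, dimmed on the off phase.
show()
{
    vector color = llList2Vector(COLORS, level - 1);
    if (!lit) color = color * 0.3;
    llSetColor(color, ALL_SIDES);
}

default
{
    state_entry()
    {
        llListen(CHANNEL, "", llGetOwner(), "");
        show();
    }

    listen(integer channel, string name, key id, string message)
    {
        integer requested = (integer)message;
        if (requested >= 1 && requested <= 7)
        {
            level = requested;
            lit = TRUE;
            // Run a 1 second timer only for the flashing levels.
            if (llList2Integer(FLASHING, level - 1)) llSetTimerEvent(1.0);
            else llSetTimerEvent(0.0);
            show();
        }
    }

    timer()
    {
        lit = !lit;
        show();
    }
}
```

Keeping the colors and flash flags in parallel lists means adding or reordering levels only touches the two tables, not the event handlers.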
Ann Otoole
Registered User
Join date: 22 May 2007
Posts: 867
10-03-2008 11:00
You broke the minimap. Avatars no longer show on the minimap in sims running this new borked version.

Try again.
Triple Peccable
Registered User
Join date: 7 Jul 2007
Posts: 70
10-03-2008 11:00
From: robertltux McCallen
Why doesn't LL create a Core Beacon object and then give it out (say, at the Linden office locations)?

Maybe a cone prim with a rotating hand on top that can be set to:

1. Green (normal color): all clear, everything's fine
2. Flashing bright green: important blog posting
3. Yellow: critical blog posting (security flaw found, major client release)
4. Orange: rolling restart day (this server is in the group)
5. Red: trouble right here in River City (half hour to RR window)
6. Red flashing: big trouble, grid-wide trouble
7. Black (should never happen): colo failure, chunks of the grid are shut down, dogs and cats sleeping together

maybe a side project for some of the programmers??

Good idea! The problem is that between the blog, status page, and these forums, it is already difficult to keep the information current. What makes you think an in-world status device would get reliably updated?
Triple Peccable
Registered User
Join date: 7 Jul 2007
Posts: 70
10-03-2008 11:03
From: Ann Otoole
You broke the minimap. Avatars no longer show on the minimap in sims running this new borked version.

Try again.

My mini-map is working fine in the restarted sims (using the RC client anyway).