Is this level of sim uptime acceptable?
|
Talon Brown
Slacker Punk
Join date: 17 May 2006
Posts: 352
|
07-03-2007 10:09
I own a 512m parcel in an older, established sim called Maemilkkot. In the past I've noticed a few occasions where this sim was locked in a restart loop. I asked around and the general consensus was that it was due to griefers. However, after having one of my events there absolutely ruined a few weekends ago by one of these loops, which had no apparent cause at all, I wrote a little script that monitors Maemilkkot's status remotely from another sim. (It made no sense to place a crash monitor in a sim that was suspected to be crash-prone.) Sure enough, the monitor confirmed what I already suspected, and these events are more common than I thought. There was an episode last night, and just a few minutes ago I logged in only to see the following IMs from my monitor:
[9:34] Sim Status Monitor: [08:10] Maemilkkot is starting.
[9:34] Sim Status Monitor: [08:11] Maemilkkot is up.
[9:34] Sim Status Monitor: [08:13] Maemilkkot is starting.
[9:34] Sim Status Monitor: [08:14] Maemilkkot is up.
[9:34] Sim Status Monitor: [08:16] Maemilkkot is down.
[9:34] Sim Status Monitor: [08:17] Maemilkkot is starting.
[9:34] Sim Status Monitor: [08:18] Maemilkkot is up.
[9:34] Sim Status Monitor: [08:20] Maemilkkot is starting.
[9:34] Sim Status Monitor: [08:21] Maemilkkot is up.
[9:34] Sim Status Monitor: [08:23] Maemilkkot is crashed.
[9:34] Sim Status Monitor: [08:30] Maemilkkot is down.
[9:34] Sim Status Monitor: [08:31] Maemilkkot is up.
Note that the timestamps in brackets are when each status change was detected, and the [9:34] is the time I logged in and received them all. Now I ask, does this seem right to you? Is this type of thing occurring in other mainland sims as well, or is Maemilkkot special in some manner?
|
Rhyph Somme
Registered User
Join date: 2 Dec 2005
Posts: 263
|
07-03-2007 10:22
What I'm not quite understanding is this: if the region is going up and down, I don't see a match in the number of downs vs. ups vs. what the script is labeling as crashes, etc. 5 ups, 3 starting, 1 down, 1 crash? I have also never seen a region "kick over" and restart in less than a minute. Also, if a region crashes, scripts cease to operate, so how does a monitoring device record the fact that a region crashed, or is down, if it's indeed really down? IMO, region monitoring devices have never been successful for these reasons.
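For reference, the counts can be checked mechanically. The following is a minimal Python sketch that tallies the status words in the log from the first post. The entries are copied from that post with the [9:34] login-time prefix dropped; the `tally` helper is only an illustration and is not part of the monitor script.

```python
from collections import Counter
import re

# Status lines as quoted in the first post (login-time prefix dropped).
LOG = """\
[08:10] Maemilkkot is starting.
[08:11] Maemilkkot is up.
[08:13] Maemilkkot is starting.
[08:14] Maemilkkot is up.
[08:16] Maemilkkot is down.
[08:17] Maemilkkot is starting.
[08:18] Maemilkkot is up.
[08:20] Maemilkkot is starting.
[08:21] Maemilkkot is up.
[08:23] Maemilkkot is crashed.
[08:30] Maemilkkot is down.
[08:31] Maemilkkot is up.
"""


def tally(log: str) -> Counter:
    """Count how many times each status word appears in the log."""
    pattern = re.compile(r"\[\d{2}:\d{2}\] \S+ is (\w+)\.")
    return Counter(match.group(1) for match in pattern.finditer(log))


counts = tally(LOG)
print(dict(counts))  # {'starting': 4, 'up': 5, 'down': 2, 'crashed': 1}
```

Going by the log as posted, that is 5 "up", 4 "starting", 2 "down" and 1 "crashed", so the ups and downs still do not pair off.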
|
Talon Brown
Slacker Punk
Join date: 17 May 2006
Posts: 352
|
07-03-2007 10:32
As I said, the monitor is not in Maemilkkot, so it never goes down with the sim. As for how it obtains its data, it makes an llRequestSimulatorData() dataserver query every 10 seconds and IMs me each time the status returned by that function changes. As for why the number of "ups" doesn't match the number of "downs", I honestly can't say, as I don't know how llRequestSimulatorData works behind the scenes. I do know that this phenomenon is definitely occurring, though, as any other resident of Maemilkkot can attest if they are around when it happens.
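The poll-and-report-on-change logic described above can be sketched roughly as follows. This is Python rather than LSL so that it can be run outside Second Life, and it is not the actual monitor script. The `sample` helper, the `TIMELINE` values and the 10-second interval are assumptions made up for the illustration; they only stand in for the periodic llRequestSimulatorData() queries. The sketch also shows one possible (unconfirmed) reason the reported states need not pair off: a state that lasts less than one polling interval can fall between two polls and never be seen.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def sample(timeline: List[Tuple[int, str]], interval: int = 10) -> List[str]:
    """Sample a list of (duration_seconds, status) segments every `interval` seconds."""
    samples = []
    end = sum(duration for duration, _ in timeline)
    t = 0
    while t < end:
        elapsed = 0
        for duration, status in timeline:
            elapsed += duration
            if t < elapsed:  # t falls inside this segment
                samples.append(status)
                break
        t += interval
    return samples


def status_changes(samples: Iterable[str]) -> Iterator[str]:
    """Yield a status only when it differs from the previous sample."""
    last: Optional[str] = None
    for status in samples:
        if status != last:
            yield status
            last = status


# Hypothetical timeline: the region is "down" for only 4 seconds.
TIMELINE = [(25, "up"), (4, "down"), (30, "starting"), (30, "up")]

changes = list(status_changes(sample(TIMELINE)))
print(changes)  # ['up', 'starting', 'up']
```

In this made-up timeline the 4-second "down" falls between the polls at 20 and 30 seconds, so the monitor would report "starting" without ever reporting "down".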
|
Rhyph Somme
Registered User
Join date: 2 Dec 2005
Posts: 263
|
07-03-2007 10:58
OK, now that you have explained how you are doing this, I understand. You are in effect polling (pinging) the region's status. It is answering; why it's changing states is hard to say. I am also not that familiar with how it works behind the scenes, but I would surmise one of two things. Either your real down time is:
[9:34] Sim Status Monitor: [08:23] Maemilkkot is crashed.
[9:34] Sim Status Monitor: [08:30] Maemilkkot is down.
from 8:23 to 8:30, which would seem to me to be a realistic "faster restart", or it is really going down perhaps just before this entry:
[9:34] Sim Status Monitor: [08:10] Maemilkkot is starting.
and returning to regular service at:
[9:34] Sim Status Monitor: [08:31] Maemilkkot is up.
which is absolutely what I would call "normal" for a full restart time.
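To put numbers on the two readings, here is a small Python sketch of the arithmetic. The timestamps are the ones quoted from the log; the `minutes_between` helper is just an illustration.

```python
from datetime import datetime


def minutes_between(start: str, end: str) -> int:
    """Whole minutes between two HH:MM timestamps on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


short_window = minutes_between("08:23", "08:30")  # "crashed" to the last "down"
long_window = minutes_between("08:10", "08:31")   # first "starting" to final "up"
print(short_window, long_window)  # 7 21
```

So the first reading gives about 7 minutes of down time and the second about 21 minutes.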
Many times I have seen a region take up to a full 30 mins to come back online. I think it has something to do with hunting for an available CPU core to sit on. Maybe that's what the "up" and "starting" entries are showing: the region attempting to bring itself back online on any of a number of available cores. It's of course all speculation, but it would seem they have a pretty smart system in place for on-the-fly relocation, so that a region that has gone down moves itself to the next available core in a pool of spares. Maybe there are some sort of performance fallbacks too: if it tries to come up on a server where maybe 3 of the 4 cores are maxed out performance-wise, it goes to the next.
Is the region in fact visibly going up and down while, say, you are standing in a neighboring region? Or has this yet to be caught in the act visually, and you just have the logs to go by? You said it interrupted an event you had? Did the region just go offline for some time?
|
Ava Glasgow
Hippie surfer chick
Join date: 27 Jan 2007
Posts: 2,172
|
07-03-2007 11:21
From: Rhyph Somme
I have also never seen a region "kick over" and restart in less than a minute.
I have. I had a Linden come out and restart a buggy mainland server for me, and was floored by how quickly it came back online. From the "logging off, get out now" warnings, through going offline and turning red on the mini-map, to fully back online took less than a minute. (Actually, I saw this twice in rapid succession, because he accidentally rebooted the wrong sim the first time.)
Talon, I suggest you open a support ticket, and then follow it up with a live chat if you don't get a response within a day or so. I've had good success with the support portal, so feel free to contact me if you'd like help with that.
|
Talon Brown
Slacker Punk
Join date: 17 May 2006
Posts: 352
|
07-03-2007 11:36
From: Rhyph Somme
Is the region in fact visibly going up and down while say perhaps standing in a neighboring region? Or has this yet to be caught in the act visually and just have the logs to go by? You said it interrupted an event you had? Did the region just go offline for some time?
During the ruined event a few weekends ago, everyone there had their ability to interact with the world just stop entirely. Typing produced no chat, etc. This was followed by the minimap turning red. At that point I relogged into my home sim, and Maemilkkot was definitely down, as attempted TPs returned a "destination cannot be reached" error. A few minutes later we were able to get back in, and a few people came back, only for the same thing to happen all over again. After that, Maemilkkot was offline for a longer duration and we quickly tried, without much success, to regroup in another sim. As for last night's issue, the surrounding regions visually went offline as I was standing on my land there, and then the same thing happened. World interaction was lost, I relogged into my home sim and watched the monitor report Maemilkkot come back up and go back down a total of 4 times before it finally stabilized.
From: Ava Glasgow
Talon, I suggest you open a support ticket, and then follow it up with a live chat if you don't get a response within a day or so. I've had good success with the support portal, so feel free to contact me if you'd like help with that.
Thanks for the advice, I'll try that. I've a 4th of July event planned for later tonight and naturally I'm wary of it happening again and ruining that as well.
|
Rhyph Somme
Registered User
Join date: 2 Dec 2005
Posts: 263
|
07-03-2007 12:01
From: Ava Glasgow
I have. I had a Linden come out and restart a buggy mainland server for me, and was floored by how quickly it came back online. From the "logging off, get out now" warnings, through going offline and turning red on the mini-map, to fully back online took less than a minute. (Actually, I saw this twice in rapid succession, because he accidentally rebooted the wrong sim the first time.) Talon, I suggest you open a support ticket, and then follow it up with a live chat if you don't get a response within a day or so. I've had good success with the support portal, so feel free to contact me if you'd like help with that.
You were lucky. A warm restart will not always cause a region to hunt for a new core to reside on; it will just go down and come right back up on the same core/server. I should have mentioned that originally. And the warning windows are timed on a manually invoked restart, starting with a 2 min warning, then a 1 min, a 45 second... and so on. So for a region to go down and come back up fully in less than a minute, including warnings, as you state, should not be possible... and I'm not gonna argue semantics with you.
|
Alicia Sautereau
if (!social) hide;
Join date: 20 Feb 2007
Posts: 3,125
|
07-03-2007 12:06
From: Rhyph Somme
You were lucky and typically a warm restart will not always cause a region to hunt for a new core to reside on. It will just go down and come right back up on the same core/server. I should have referenced that originally. And the warning windows are timed on a manually invoked restart, starting with a 2 min warning, a 1 min, a 45 second... and so on. So for a region to go down and back up fully in less than a minute including warnings as you state should not be possible.... and I'm not gonna argue semantics with you.
LL can take down servers without that notice. I was once in a sim which was bugged, and a Linden showed up and hit the reboot button. We were all kicked offline immediately, without any notice besides a one-line warning that he was taking the sim offline.
|
Rhyph Somme
Registered User
Join date: 2 Dec 2005
Posts: 263
|
07-03-2007 12:08
From: Alicia Sautereau
LL can take down servers without that notice. I was once in a sim which was bugged, and a Linden showed up and hit the reboot button. We were all kicked offline immediately, without any notice besides a one-line warning that he was taking the sim offline.
Ohkay
|
Ava Glasgow
Hippie surfer chick
Join date: 27 Jan 2007
Posts: 2,172
|
07-03-2007 12:25
From: Rhyph Somme
the warning windows are timed on a manually invoked restart, starting with a 2 min warning, a 1 min, a 45 second... and so on.
In my case the first warning was either 10 or 15 seconds. Because we were standing at a corner (this was related to the 0,0,0 bug), we just stepped over into the next region. That was the first time, when he accidentally restarted the sim we were (originally) in rather than the bugged one. The second time he restarted the correct sim, but we didn't hear the warnings because we weren't in it at the time. Still, the "red on the mini-map" phase was very short, maybe 30 seconds, and then all was usable again. I guess they have the ability to skip the full warning cycle? There wasn't anyone in the sim he meant to restart, so it would have been unnecessary.
|