Second Life Forums Archive - Feature Proposal: Script Error handling and recovery

Lex Neva

wears dorky glasses

Join date: 27 Nov 2004

Posts: 1,361

04-15-2005 19:40

I recently entered a proposal that scripters may want to jump on:

From: someone

Currently, when a script has some kind of critical error, such as a stack/heap collision, no script may restart that script, even with llResetOtherScript().

If the script is in an unmodifiable object (such as a product sold to a person), the object is rendered completely unusable for eternity. You have to give them a replacement, because they cannot manually reset the script.

More annoyingly, in an object with multiple scripts, there is simply no way to find out which script had an error. The defunct script will even still have its "running" checkbox selected.

For most errors or warnings, the scripted object is forced to say things to the outside world. This can be unsightly and unwanted. There is no way for a script to know when it has had any kind of runtime error or warning, or for any other script to find this out, for that matter.

I propose some kind of error handling or recovery that can be done in LSL without user intervention. Maybe a "script_error" event, which is raised in all scripts in the prim and gives useful information about which script died and what the error was. In the case of fatal errors like a stack/heap collision, of course, the script that died won't have this event triggered.

Vote here: http://secondlife.com/vote/get_feature.php?get_id=188

Peter Newell

Registered User

Join date: 17 Feb 2006

Posts: 20

04-23-2006 10:35

This is one feature proposal we shouldn't let sit around any longer. Having a way to capture script errors is important in any programming language, even more so when dealing with production-level products.

It seems a logical extension of the language to allow a script_error event that passes in the error message as an argument. The LSL internals clearly have this somewhere, in order to display the message in the script error box and the little icon.
This event should leave all of the internal error handling in place and just notify the script that there was an error, such that it can handle itself.

Please check out the proposal and let's vote it into the language.

Peter Newell

Registered User

Join date: 17 Feb 2006

Posts: 20

script_error(string error_message) event

04-23-2006 10:57

Yeah, wouldn't it be great?!? Unfortunately there was a proposal over a year ago by Lex Neva that seemed to slip through the cracks.

You can find the proposal at http://secondlife.com/vote/index.php?get_id=188. Please give it a few votes.

Greg Hauptmann

Registered User

Join date: 30 Oct 2005

Posts: 283

04-23-2006 14:00

just voted - thanks

(whilst this is very important and seemingly simple from the requirements perspective, I wonder if this is one of those that may be very difficult for them to implement due to design/architecture - would be nice for LL to let us view their internal development comments against the proposals)

Strife Onizuka

Moonchild

Join date: 3 Mar 2004

Posts: 5,887

04-23-2006 17:38

Forum cross posting, not cool.

Moving (from script tips to feature suggestions) and merging.

_____________________

Truth is a river that is always splitting up into arms that reunite. Islanded between the arms, the inhabitants argue for a lifetime as to which is the main river.
- Cyril Connolly

Without the political will to find common ground, the continual friction of tactic and counter tactic, only creates suspicion and hatred and vengeance, and perpetuates the cycle of violence.
- James Nachtwey

Haravikk Mistral

Registered User

Join date: 8 Oct 2005

Posts: 2,482

04-24-2006 03:58

I like the idea of having the event. Surely though we could extend this slightly and simply have a script not break entirely in the first place? That is, in the event of a stack-heap collision the script will simply dump all of its variables and switch over to the script_error() event with a number identifying the error type.

If no such event exists in the code then it will give the error as normal, otherwise it will trust the event to do that.

Stack-heap collisions should surely be detectable beforehand anyway? That is, if we are at the end of the stack and we try to add a new variable, extend a list or so on, then we simply can't, so throw away all variables.

The script_error() event would probably have to be limited in what it can do: if it flushes variables then there's no way of reporting what caused the error, only what went wrong (but that's the case already anyway). All you really need in the script is the ability to reset, or give a message. Resetting probably shouldn't be mandatory though, as there are scripts which after resetting will just break again anyway.

Argent Stonecutter

Emergency Mustelid

Join date: 20 Sep 2005

Posts: 20,263

04-24-2006 08:32

From: Haravikk Mistral

I like the idea of having the event. Surely though we could extend this slightly and simply have a script not break entirely in the first place? That is, in the event of a stack-heap collision the script will simply dump all of its variables and switch over to the script_error() event with a number identifying the error type.

There are security issues in having a trap handled by another script... how about this?

In the script:

CODE


default
{
  state_entry()
  {
    llDebugScriptPin(19842001);
    //..
  }
  //..
}
//..

Then, in another script in the same prim:

CODE


default
{
  state_entry()
  {
    // Proposed call: watch scripts that registered this PIN.
    llDebugPinnedScripts(19842001);
  }
  // Proposed event: n is taken to be the number of trapped scripts.
  debug_trap(integer n)
  {
    integer i;
    for(i = 0; i < n; i++)
    {
      string reason = llDebugReason(i);
      llOwnerSay("Trapped "+llDebugName(i)+": "+reason);
      list stack = llDebugCallStack(i);
      integer stacklen = llGetListLength(stack);
      if(stacklen)
      {
        llOwnerSay("Call stack:");
        integer j;
        for(j = 0; j < stacklen; j++)
          llOwnerSay("  "+llList2String(stack, j));
      }
      // Only reset automatically when the reason mentions the heap.
      if(llSubStringIndex(llToLower(reason), "heap") != -1)
        llResetDebuggedScript(i);
    }
  }
}

Haravikk Mistral

Registered User

Join date: 8 Oct 2005

Posts: 2,482

04-24-2006 09:00

Oh no, I think maybe I've used the wrong word. I'm under the impression that your debug_trap() is basically what the suggestion is, except that it is triggered in all scripts in a prim (so if they are dependent they know the script they use has failed) rather than requiring a PIN. I used the word 'dump' for variables instead of 'delete'; I mean it would just clear them out, thus giving it the space to do its own error handling.

For example, I have script A with a big list that another script (B) accesses. The list gets too big and suffers a stack-heap collision, so the list is destroyed and the script_error() event is called in A, which resets itself.
The script_error() event is also called in B, which knows its information source is no longer valid and also resets itself.

If no script_error() events were found in the prim/object then it would give the current error as normal.

Debugging info would be really useful, but just having the ability to detect a script failure would be REALLY handy.
However, a PIN might be handy in case a script is renamed...

Although, that said, it's now got me wondering: do we want this? It just occurred to me that if a script can just reset itself, or be reset by a checker script, then it wouldn't need any checks for its memory running low. It could just rely on this to reset itself and start again with a clean, empty list, which is poor practice. Hmm, I'd still like it though. Even without debugging info you could do things like IM the free memory to yourself so you can quickly work out what caused it.
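
For comparison, the kind of low-memory check being talked about here could look roughly like the following. This is only a sketch under assumptions: MIN_FREE, gData and addEntry are made-up names, the 2048-byte threshold is arbitrary, and llGetFreeMemory() may report a low-water mark rather than the current free space, so treat the numbers as approximate.

```lsl
// Hypothetical threshold, in bytes, below which old entries are dropped.
integer MIN_FREE = 2048;

// Hypothetical growing list, standing in for the "big list" case.
list gData;

// Append an entry, discarding the oldest one first when memory looks tight.
addEntry(string entry)
{
    if (llGetFreeMemory() < MIN_FREE)
    {
        // Remove the first (oldest) element to make room.
        gData = llDeleteSubList(gData, 0, 0);
    }
    gData += [entry];
}

default
{
    touch_start(integer total_number)
    {
        addEntry(llDetectedName(0));
        llOwnerSay((string)llGetListLength(gData) + " entries, "
            + (string)llGetFreeMemory() + " bytes free");
    }
}
```

A check like this only reduces the chance of a stack/heap collision. It does nothing for a script that has already died, which is the case the proposal is about.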

Lex Neva

wears dorky glasses

Join date: 27 Nov 2004

Posts: 1,361

04-24-2006 10:46

From: Strife Onizuka

Forum cross posting, not cool.

Moving (from script tips to feature suggestions) and merging.

Whuh? Now this thread is totally confusing. For those of us who weren't following along, what happened?

Lex Neva

wears dorky glasses

Join date: 27 Nov 2004

Posts: 1,361

04-24-2006 10:59

Wow, thanks for noticing and resurrecting this thread, I'd totally forgotten about it. Poor thing languished for a year.

I'm not really sure it's a good idea to try to recover from errors well enough to call a script_error event handler. How do you recover from a divide by zero? When you have a stack/heap collision, how does it know what to free? It may not be obvious that "if a list made you hit the memory boundary, just free the list". What if memory was extremely tight, and the stack allocations required to call a touch_start event triggered the stack/heap collision? You could go willy-nilly freeing stuff until the script_error event was able to run, but then, how do you know that it won't cause another stack/heap error, resulting in an infinite-crash situation? Ugh...

That's why I wrote it how I did: a script_error event that was triggered in all of the other scripts in the prim. That way, the offending script can be safely stopped, and another script can pick up the error and reset the offending script if it decides that doing so is a good idea.

This was a year ago, and I've learned a lot since. There is a possible security risk if someone else places a script into a prim with my script, picking up error information. The llDebugScriptPin() idea above would work to avoid this situation, but I think just giving relatively terse (but still useful) information like script_error(string script_name, string error_message) would work wonders.

Interestingly, now we have the whole debug window/DEBUG_CHANNEL thing. I imagine it might be possible to listen on DEBUG_CHANNEL. There's still no way to determine what script died, though, and while I haven't tested it, I bet there's still no way to reset a script that blew up.
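
A minimal sketch of the DEBUG_CHANNEL idea, untested like the guess above. It assumes the watcher sits in a separate object within chat range of the one being watched, since a prim generally doesn't hear its own chat. It only relays whatever it hears to the owner.

```lsl
// Sketch of a watcher that relays anything heard on DEBUG_CHANNEL.
default
{
    state_entry()
    {
        // Empty name and message plus NULL_KEY mean no filtering:
        // listen to any speaker on the debug channel.
        llListen(DEBUG_CHANNEL, "", NULL_KEY, "");
    }

    listen(integer channel, string name, key id, string message)
    {
        // name and id identify the object that spoke, not the script inside it.
        llOwnerSay("Error from " + name + " (" + (string)id + "): " + message);
    }
}
```

Even if this works, the listen event only identifies the object, so it still can't say which script died, and it offers no way to reset it.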

Argent Stonecutter

Emergency Mustelid

Join date: 20 Sep 2005

Posts: 20,263

04-24-2006 11:20

If all you can do from the debugger is reset the script or hide the error messages, rather than investigate the problem and take corrective action, this would be MUCH simpler:

CODE


default
{
  state_entry()
  {
    llTrapAction([
      TRAP_HEAP,TRAP_RESET,
      TRAP_PERMISSIONS,TRAP_IGNORE
    ]);
  }
}
