Welcome to the Second Life Forums Archive

Compiler flakiness... "syntax errors" triggered by trivial code changes

Something Something
Something Estates
Join date: 26 Sep 2006
Posts: 121
02-01-2007 03:50
I'm trying to use LSL to create a non-trivial scripted object. I really would have preferred to avoid doing any scripting in SL, but in a world where every single action comes out of a script, well, I've concluded it's something you can't really do without.

The problem is bogus syntax errors that arise when you make some completely trivial and syntactically correct change to an existing working program.

You can have a working program, and simply add a dummy else clause to an if statement... and suddenly, a line of code dozens of lines away in a completely different function develops a bogus syntax error. Comment out the dummy else clause and it compiles again. Sometimes rewriting the "syntax error" line in question to use two or three lines, with temporary variables, makes the script compile again. Sometimes not. Sometimes merely rearranging the order of two global functions, cutting and pasting to put one before instead of after another, magically cures these bogus syntax errors. But then you merely create another dummy global function and the syntax errors reappear. Comment out the dummy global function, and the script compiles again.

Basically, it seems that once a script gets to a certain number of lines, any completely trivial change to a working program runs the risk of triggering a syntax error in distant unrelated code. No, the trivial additional code lines in question are not missing any curly brackets (braces) or semicolons, and they don't contain any reserved words. They are truly trivial dummy statements and easy to verify as correct by inspection. The new lines of code do not introduce any actual syntax errors. This is pure compiler brain damage.

Surely there are no database or scaling issues here. Out of all the SL residents, only a vanishingly small percentage are compiling scripts at any given time. No asset servers, no teleporting, no actual interaction with anything in-world in SL, just a compiler for heaven's sake. It can't be a memory issue, because it's a compile-time error, not a runtime error. For a small straightforward C-like language like LSL, surely a first-year computer science student could write a working non-flaky compiler as a homework assignment.

For those of you who have been writing LSL scripts for years, what's the workaround for this compiler flakiness?
Stephen Zenith
Registered User
Join date: 15 May 2006
Posts: 1,029
02-01-2007 04:23
I've not had any problems like that. However, your first problem with the dummy else arises because the compiler treats whatever statement follows the else as the else branch.

I'm assuming you're trying to do something like this,
CODE

if (condition)
{
    someAction ();
}
else

someOtherAction ();



If you want a dummy else, do something like:

CODE

if (condition)
{
    someAction ();
}
else
{
}
someOtherAction ();


Otherwise you're going to throw the whole thing off. The compiler has its flaws, but apart from a few well-documented cases (subtracting constants springs to mind), its syntax checking is ok. Its code generation and optimization are another story, however.

Without seeing more concrete examples of code you're having problems with, it would be difficult to be any more specific.
Malachi Petunia
Gentle Miscreant
Join date: 21 Sep 2003
Posts: 3,414
02-01-2007 05:56
From: Something Something
...Basically, it seems that once a script gets to a certain number of lines, any completely trivial change to a working program runs the risk of triggering a syntax error in distant unrelated code...
There is a hard limit (16kB if memory serves) for your script and data to live in. Go larger than that and you need to split it between scripts and use some IPC to communicate.

Surprisingly, statement re-ordering can change the byte-code generation and thus push you over a limit you were near already. This might apply. The compiler is notorious for confusing "out of static space" with syntax errors. Even more counterintuitive: comments take up part of that tiny VM space. Good luck.
Masakazu Kojima
ケロ
Join date: 23 Apr 2004
Posts: 232
02-01-2007 06:54
In my experience, on Windows, bogus syntax errors in complex scripts are generally caused by exceeding the max parser stack depth (150).

Basically, the parser works by pushing tokens onto a stack until it finds a pattern that it can reduce. For instance, the .y file has things like:
CODE
statement
    : // ...
    | IF '(' expression ')' statement ELSE statement

Which is the reason why just chaining too many else if's can cause a bogus syntax error. What you are experiencing is probably the same thing:
CODE
if ( something ) // IF ( expression )
{
    action(); // statement
}
else if ( other_thing ) // ELSE IF ( expression )
{
    other_action(); // statement
}
else // ELSE
{
    if ( x > y ) { // IF ( expression )
        y = x; // statement
        // The stack here looks something like,
        // STATE_DEFAULT '{' state_entry '{'
        // IF '(' expression ')' statement ELSE
        // IF '(' expression ')' statement ELSE
        // IF '(' expression ')' statement
    }
}
Every ELSE IF you add will count against the stack depth this way during all following blocks until the whole IF chain can be reduced. In this example, the stack depth is 22 before any of them are reduced. This also happens a lot with large list definitions.

If this is what you are running into, lslint will detect and report it. The easiest way to work around it is just splitting things up into more functions.

edit to add: Global variables and functions are also added to the stack like this, and cannot be reduced until the whole "globals" section is complete. So for example, if you have 5 functions, and the last one brings the stack up to 151, you can solve the problem by shuffling it to the top, where it can be processed without the other four functions counting against it.

CODE
a()
{
    if ( something )
        action(); // stack here is 9
}

b()
{
    if ( something )
        action(); // stack here is 10
}
Kidd Krasner
Registered User
Join date: 1 Jan 2007
Posts: 1,938
02-01-2007 15:56
From: Something Something
... For a small straightforward C-like language like LSL, surely a first-year computer science student could write a working non-flaky compiler as a homework assignment.

For those of you who have been writing LSL scripts for years, what's the workaround for this compiler flakiness?


Oh for the days when that was true. Back when I was in school, parsing and compilers were a standard part of the CS curriculum, though compilers would more likely be done sophomore year. These days, I'm still surprised when I read about experienced professionals who don't understand how to use regular expression pattern matching, let alone implement it.

In any event, I'm not quite sure what sort of background you're coming from. Beginners frequently come across this type of problem because of silly things, like omitting a semicolon, putting it after the closing brace, confusing a bracket for a brace, etc. Even experienced people do it (though in our case, it's due to presbyopia). When this stuff happens to me, and I can't get someone else to check the code (usually the most efficient, and well worth the Doh when the other person points out the obvious), I use a binary search to cut and paste pieces to isolate the cause. I have found compiler bugs this way, but it's usually my fault.
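
The binary-search approach described above can be sketched as a small offline helper. This is only an illustration in Python: `first_failing_chunk` and the `compiles` callback are hypothetical names, and `compiles` stands in for whatever check you actually use (pasting into the SL editor, or running lslint). It assumes the script has been cut into whole top-level pieces (globals, functions, states) and that once a prefix fails, every longer prefix fails too.

```python
def first_failing_chunk(chunks, compiles):
    """Return the index of the first chunk whose inclusion breaks the build.

    chunks   -- list of top-level pieces of the script, in order
    compiles -- hypothetical callback: takes a list of chunks, returns True/False
    Returns None if the whole script compiles.
    """
    if compiles(chunks):
        return None
    # Invariant: chunks[:lo] compiles, chunks[:hi] does not.
    # (An empty prefix is assumed to compile.)
    lo, hi = 0, len(chunks)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if compiles(chunks[:mid]):
            lo = mid
        else:
            hi = mid
    return hi - 1


if __name__ == "__main__":
    # Stand-in check: pretend any script containing "BAD" fails to compile.
    pieces = ["globals", "helper()", "BAD", "default state"]
    print(first_failing_chunk(pieces, lambda cs: "BAD" not in cs))
```

Each round halves the suspect region, so even a long script needs only a handful of trial compiles.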

As far as the parser is concerned: yes, I think an undergraduate CS major should be able to write a parser. But I don't think a CS undergraduate should be expected to write a parser that has good error handling. It's a much more difficult problem. Some parsing algorithms get hopelessly lost.
Kidd Krasner
Registered User
Join date: 1 Jan 2007
Posts: 1,938
02-01-2007 16:13
From: Masakazu Kojima
In my experience, on Windows, bogus syntax errors in complex scripts are generally caused by exceeding the max parser stack depth (150).


Why would this be OS dependent? And why would the stack depth be so low? (Yes, I understand why you may not want to use the entire code stack, but this feels too small.)

From: someone

...Every ELSE IF you add will count against the stack depth this way during all following blocks until the whole IF chain can be reduced. In this example, the stack depth is 22 before any of them are reduced. This also happens a lot with large list definitions.
I'm not sure how you're counting, but intermediate non-terminals can be reduced (with a decent parsing algorithm). See my comments inserted below.
From: someone

CODE
a()
{
    if ( something )
        action(); // stack here is 9 ** I count just 5: the decl, open brace, if, condition, and statement**
}
// ** At this point, the entire function definition should be reduced to a single item, and then everything from the start can be reduced to a decllist and thrown away. The structure is now in the tree, and no longer on the stack.
b()
{
    if ( something )
        action(); // stack here is 10
}
Masakazu Kojima
ケロ
Join date: 23 Apr 2004
Posts: 232
02-01-2007 19:38
From: Kidd Krasner
Why would this be OS dependent?
See this post. Probably different defaults on the versions of bison used to compile the production binaries.

From: Kidd Krasner
Everything else
I'm talking about the way the current SL parser works. You can download the client source and see for yourself. I'd fix it but I keep hearing that script compilation is going to be moved to the server anyway.

From: Kidd Krasner
I count just 5: the decl, open brace, if, condition, and statement
IDENTIFIER '(' ')' '{' IF '(' expression ')' statement

When the closing brace is encountered the function is reduced to a global_function, and then to a global, and none of the globals can be merged until all of them are parsed because of right recursion.

I agree it is poorly done.
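
A toy model of the right-recursion point above, written in Python rather than bison. It counts how many symbols sit on an LR parser's stack while it parses a list of n already-reduced `global` items. The rule shapes (`globals : global globals` versus `globals : globals global`) and the function names are assumptions for illustration, not copied from the SL grammar file.

```python
def max_depth_right_recursive(n_items):
    """Model `globals : global | global globals` (right recursion)."""
    stack, max_depth = [], 0
    for _ in range(n_items):
        stack.append("global")          # nothing can be merged yet, so it waits
        max_depth = max(max_depth, len(stack))
    if stack:
        stack[-1:] = ["globals"]        # globals : global   (only after the last item)
    while len(stack) > 1:
        stack[-2:] = ["globals"]        # globals : global globals
    return max_depth


def max_depth_left_recursive(n_items):
    """Model `globals : global | globals global` (left recursion)."""
    stack, max_depth = [], 0
    for _ in range(n_items):
        stack.append("global")
        max_depth = max(max_depth, len(stack))
        if len(stack) == 1:
            stack[:] = ["globals"]      # globals : global
        else:
            stack[-2:] = ["globals"]    # globals : globals global
    return max_depth


if __name__ == "__main__":
    for n in (5, 50, 200):
        print(n, max_depth_right_recursive(n), max_depth_left_recursive(n))
```

In this model the right-recursive form grows by one stack entry per global, while the left-recursive form stays flat. The real parser also keeps each function's inner tokens on the stack while that function is being parsed, which the model ignores.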
Something Something
Something Estates
Join date: 26 Sep 2006
Posts: 121
02-01-2007 22:15
Stephen: by dummy else clause, I meant "with the curly brackets". Obviously, a plain "else" keyword sitting all alone by itself is not syntactically correct.

Malachi: the memory limit you refer to applies to runtime, no? I'm referring to a compile time error (bogus "syntax errors" in syntactically correct code). Reordering statements within a function could affect the bytecode generated, but reordering entire functions shouldn't, namely, declaring one function before vs. after another, with one order resulting in a successful compile and the other order resulting in a bogus syntax error being declared in a faraway unrelated source code line.

Kidd: sure, cascading errors that can be caused by mismatched parentheses, brackets, or curly brackets (braces), or by missing semicolons. But that's not the case here: the code was correct and the syntax errors were bogus.

Masakazu: thanks for your detailed posting. Reducing the number of else if clauses and splitting up functions did help. Your post was very helpful for understanding why.
Kidd Krasner
Registered User
Join date: 1 Jan 2007
Posts: 1,938
02-11-2007 19:53
From: Masakazu Kojima

When the closing brace is encountered the function is reduced to a global_function, and then to a global, and none of the globals can be merged until all of them are parsed because of right recursion.


Right recursion? I'm not sure whether to laugh or cry.

Oh, well.
AJ DaSilva
woz ere
Join date: 15 Jun 2005
Posts: 1,993
02-25-2007 13:33
Seems like as good a place to ask as any. If anyone could help with this I'd really appreciate it; it's really starting to piss me off.

Anyone know what'd cause the compiler to throw up a syntax error at a literal string?

If I take out the one the error's reported at, it's reported at the next one. Last line before it was a string strVar = llList2String ( strList, "," );.

lslint doesn't show any errors for the script, btw.
Osgeld Barmy
Registered User
Join date: 22 Mar 2005
Posts: 3,336
02-25-2007 13:41
when it happens to me (and it tends to, quite a bit) it's usually something dumb like a missing ; off in left field, way before where the compiler puts the cursor
AJ DaSilva
woz ere
Join date: 15 Jun 2005
Posts: 1,993
02-25-2007 13:46
Ah, if only it were that simple. The error is thrown at the next literal string after the line I mentioned, however. If I change the one it's currently failing at it fails at the next one. I've been through the code with a fine-toothed comb and there's no actual syntax errors.
AJ DaSilva
woz ere
Join date: 15 Jun 2005
Posts: 1,993
02-25-2007 14:59
Don't worry, found the problem. Turns out it doesn't like using the "¬" character.
Osgeld Barmy
Registered User
Join date: 22 Mar 2005
Posts: 3,336
02-25-2007 19:05
details do help, btw just so you know even though it says lsl uses ascii it only works with the characters that are printed on your (us) keyboard, i've actually run into that before trying to make some text gfx on hovertext

i've not tried all 256 characters but the ones that i have tried either silently fail or error out

er my bad not fail but turn into dots
AJ DaSilva
woz ere
Join date: 15 Jun 2005
Posts: 1,993
02-25-2007 22:15
Dots mean the character isn't represented in the font, not the character-set. You can still use those characters in scripts; as list delimiters, for instance.
Osgeld Barmy
Registered User
Join date: 22 Mar 2005
Posts: 3,336
02-26-2007 01:12
it's still flakey as hell outside of 32-126

besides if this were truly ascii compatible shouldn't they make a font that would display all the characters, i don't claim to make apple pie and leave out the apples.... phht good ol LL heh
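
For anyone chasing a stray character like the "¬" above, a quick offline check is to list everything in the script text that falls outside the 32-126 range mentioned in this thread. This is a sketch in Python, not an SL tool: `find_suspect_chars` is a made-up name, and skipping tabs and carriage returns is an assumption on my part.

```python
def find_suspect_chars(source):
    """Return (line_number, column, character) for each character
    outside the printable ASCII range 32..126.

    Tabs and carriage returns are skipped (assumed to be harmless whitespace).
    Line and column numbers start at 1.
    """
    hits = []
    for line_no, line in enumerate(source.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in "\t\r":
                continue
            if not 32 <= ord(ch) <= 126:
                hits.append((line_no, col, ch))
    return hits


if __name__ == "__main__":
    script = 'string a = "ok";\nlist parts = llParseString2List(s, ["¬"], []);\n'
    for line_no, col, ch in find_suspect_chars(script):
        print("line %d, column %d: %r" % (line_no, col, ch))
```

A hit does not prove the compiler will choke on that character, it only narrows down where to look.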
Chaz Longstaff
Registered User
Join date: 11 Oct 2006
Posts: 685
08-04-2007 22:35
This is pretty funny. The compiler's a bit buggy, hehe. Oh well, never mind, part of the charm of SL. What software isn't buggy?

FYI I hit the problem encountered by Something Something at 9 "else ifs", though I thought I'd read 23 was the danger point.

LSLint pointed me to the right keywords to search on:

"Parser stack depth exceeded; SL will throw a syntax error here"

and those keywords brought me here.

Thanks guys for the solution. Just posting this as an FYI to anyone pulling his/her hair out in the future. Occasionally, it's not you.

I implemented a workaround solution along the lines of what's suggested here:

http://rpgstats.com/wiki/index.php?title=Ifelse

p.s. Osgeld above, re font, I wouldn't mind a font that made the 5's and 6's look a bit different at 2 in the morning! grin.