Second Life Forums Archive - Efficient Command Parsing

Jamie Marlin

Ought to be working....

Join date: 13 May 2005

Posts: 43

11-29-2005 12:37

Hey all -

I have a question about the most efficient approach for parsing commands in LSL. For the purposes of this discussion, I am defining 'efficient' as execution time / CPU cycle efficient - i.e. which approach executes fastest and (presumably) causes the least overall lag.

I can see two obvious choices: list based and string based. List based (the approach I have been using because it generates relatively clean, elegant code... and because I wanted to play with lists) looks something like:

Dump message string to a list [using llParseString2List() ]
Process list element 0 as command [ llList2String() ]
(if necessary) Dump remaining elements back into a string [using llDumpList2String() ]
and recurse

Alternatively, using strings:

Parse first word using string functions [ llSubStringIndex() and llGetSubString() ]
(if necessary) remove first word from message string and recurse

I have assumed, without knowing, that the string parsing implementation inside llParseString2List() is more efficient than doing it in an LSL script, but does anybody know for sure?
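One way to get an answer for a particular sim is to time both approaches directly. The following is only a rough sketch: the helper names firstWordList and firstWordString, the test message, and the run count are illustrative assumptions rather than anything from the thread, and the timings from llGetTime() will vary with sim load.

```lsl
// Hypothetical helper (not from the thread): first word via list parsing.
// Empty entries are discarded, so leading spaces are skipped.
string firstWordList(string message)
{
    list tokens = llParseString2List(message, [" "], []);
    return llList2String(tokens, 0);
}

// Hypothetical helper (not from the thread): first word via string functions.
string firstWordString(string message)
{
    integer space = llSubStringIndex(message, " ");
    if (space == -1) return message;  // no space: the whole message is one word
    if (space == 0) return "";        // leading space: avoid an end index of -1
    return llGetSubString(message, 0, space - 1);
}

default
{
    touch_start(integer total_number)
    {
        string msg = "Find Jamie Marlin";
        integer runs = 200;  // arbitrary sample size
        integer i;

        llResetTime();
        for (i = 0; i < runs; ++i) firstWordList(msg);
        float listTime = llGetTime();

        llResetTime();
        for (i = 0; i < runs; ++i) firstWordString(msg);
        float stringTime = llGetTime();

        llOwnerSay("list: " + (string)listTime + " s, string: " + (string)stringTime + " s");
    }
}
```

Touching the object reports both totals to the owner. Running it several times and comparing the averages is more trustworthy than any single run.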

Kenn Nilsson

AeonVox

Join date: 24 May 2005

Posts: 897

11-29-2005 18:17

I don't know the answer for sure, but in my eyes it seems that converting to a list and then back to a string is a slight bit more work than straight parsing of a string.

But then...remember that I started my reply post with "I don't know for sure..."

_____________________

--AeonVox--

Computer games don't affect kids; I mean if Pac-Man affected us as kids, we'd all be running around in darkened rooms chasing ghosts, eating magic pills, and listening to repetitive, addictive, electronic music.

Ben Bacon

Registered User

Join date: 14 Jul 2005

Posts: 809

11-30-2005 00:58

From: Kenn Nilsson

...converting to a list and then back to a string is a slight bit more work than straight parsing of a string...

agreed.

It depends on context, Jamie. If your commands are something along the lines of:
<keyword> <additional data to be read in one chunk>
e.g. "/1Find Jamie Marlin" or "/TeamShout hey everybody - back to base NOW"
then you should definitely use the substring & index approach (to avoid the to-and-fro that Kenn mentioned)

If the commands consist of a number of "tokens" that need to be looked at individually
e.g. "SetParams 12 Big Red <1.0,1.0,0.0>"
then parsing the string to a list is gonna be the best bet - in this case, though, you will not be parsing the list back to a string, but extracting each item individually with llList2String.

Obviously these are not the only two patterns (although they are the most common), but hopefully this helps you decide how to handle other cases as well.

** Note the lack of spaces in the vector in the Big Red example. If you wanted to tokenise strings that contain vectors, for example, but you did want to support extra whitespace, you'd probably find yourself creating a hybrid solution - where you substring the command and vector out, and then list parse the remainder.
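A hybrid along those lines might look like the sketch below. It is an illustration only: the sample message and variable names are assumptions. The vector is cut out with substring functions first, then the remaining tokens are parsed as a list.

```lsl
default
{
    state_entry()
    {
        // Sample input with extra whitespace inside the vector (assumed for illustration)
        string msg = "SetParams 12 Big Red < 1.0, 1.0, 0.0 >";
        vector v = ZERO_VECTOR;

        // Step 1: substring the vector out, if one is present.
        integer lt = llSubStringIndex(msg, "<");
        integer gt = llSubStringIndex(msg, ">");
        if (lt != -1 && gt > lt)
        {
            string vecText = llGetSubString(msg, lt, gt);
            // Strip all spaces so the cast does not depend on whitespace handling.
            vecText = llDumpList2String(llParseString2List(vecText, [" "], []), "");
            v = (vector)vecText;
            msg = llDeleteSubString(msg, lt, gt);
        }

        // Step 2: list-parse what is left; empty entries are discarded.
        list tokens = llParseString2List(msg, [" "], []);
        string command = llList2String(tokens, 0);
        integer count = llList2Integer(tokens, 1);
        string size = llList2String(tokens, 2);
        string colorName = llList2String(tokens, 3);

        llOwnerSay(command + " / " + (string)count + " / " + size + " / "
            + colorName + " / " + (string)v);
    }
}
```

The vector has to be removed before the split, because splitting on spaces first would break "< 1.0, 1.0, 0.0 >" into several tokens.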

Eloise Pasteur

Curious Individual

Join date: 14 Jul 2004

Posts: 1,952

11-30-2005 06:21

Ben's answer is pretty good and true.

It really does depend, which isn't too helpful I'm afraid, but it's the honest truth.

It is fair to say that lists get slower to process as they get longer, so there might be a point where it's cleaner to use the string approach, but legibility of the code might well suffer, which also ought to be considered.

I've not done any testing of it, but where I have to do this I use llParseString2List and then take the bits out of the list. It also lets you have two- or three-tier commands that you treat differently from one-tier commands. Consider a system that sends you emails when certain friends log on. That's pretty easy to write: you generate a list of friends and send the email. As it gets more complex you might write a command interface to let you add new friends, delete old ones, etc. One command might be 'delete all', one 'delete friend #16' and another 'delete friends #16 - #42'. The parsed-list approach would let you handle delete all (a one-tier command), delete friend #16 (two tiers: delete a single friend, then a friend number), and delete friends #16 - #42 (three tiers: delete friends, start number, end number) with basically the same code, and certainly without a complete rewrite.

Lists also let you neatly remove the elements you've processed, so you always look at element 0. With multiple strings you can do everything I've suggested in the lists approach, but you've got to work out which string you're looking at as you go along, or delete bits in chunks, which can be less appealing and less intuitive.
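As a rough illustration of the tiered idea, the sketch below handles all three delete forms with one parsed list, snipping element 0 as each tier is consumed. The friends data, the chat channel, the handleDelete name, and the simplified command syntax ("delete all", "delete friend 2", "delete friends 1 3", using zero-based list indices) are all assumptions made for the example.

```lsl
list friends = ["Ann", "Bob", "Cy", "Dee", "Eve"];  // hypothetical data

// Hypothetical handler: args is the token list with the leading "delete" removed.
handleDelete(list args)
{
    string tier = llToLower(llList2String(args, 0));
    args = llDeleteSubList(args, 0, 0);  // snip it, so the next item is element 0 again
    integer count = llGetListLength(friends);

    if (tier == "all")
    {
        friends = [];
    }
    else if (tier == "friend" || tier == "friends")
    {
        integer first = llList2Integer(args, 0);
        integer last = first;  // single-friend form by default
        if (tier == "friends") last = llList2Integer(args, 1);

        // Guard the range: llDeleteSubList treats start > end as an inverted range.
        if (first >= 0 && first <= last && last < count)
            friends = llDeleteSubList(friends, first, last);
    }
    llOwnerSay("Friends: " + llDumpList2String(friends, ", "));
}

default
{
    state_entry()
    {
        llListen(1, "", llGetOwner(), "");
    }

    listen(integer channel, string name, key id, string message)
    {
        list tokens = llParseString2List(message, [" "], []);
        if (llToLower(llList2String(tokens, 0)) == "delete")
            handleDelete(llDeleteSubList(tokens, 0, 0));
    }
}
```

The single and range forms share one code path, which is the point about not needing a rewrite as tiers are added.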

Jamie Marlin

Ought to be working....

Join date: 13 May 2005

Posts: 43

12-01-2005 11:31

Pretty much the answer I expected... the best approach does depend on the exact way the parser is going to be used. Personally, after long and bitter experience, I almost always go in the direction of clear, readable code over 'optimized and tricky', but I am trying to be a good citizen here.

While the best choice is situation dependent, however, for a given set of operations one approach or the other will clearly be more efficient. So, maybe the question should be:

What is the most efficient way to parse the following command string:
Title color <1,0,0> text Jamie's New Title

(Example chosen to have multiple tiers of commands and, just for fun, one of the arguments is multi-word - presumed to run to the end of the command line. We will assume that the command is space delimited)

Thraxis Epsilon

Registered User

Join date: 31 Aug 2005

Posts: 211

12-01-2005 15:36

From: Jamie Marlin

What is the most efficient way to parse the following command string:
Title color <1,0,0> text Jamie's New Title

(Example chosen to have multiple tiers of commands and, just for fun, one of the arguments is multi-word - presumed to run to the end of the command line. We will assume that the command is space delimited)

Well, I'd refactor my command to the following criteria:

/{channel} title - Clear the current title
/{channel} title <0,0,0> - set the color of the title
/{channel} title text of title - set the text for the title
/{channel} title <0,0,0> text of title - set the color and the text of the title
/{channel} title text of title <0,0,0> - set the color and the text of the title

CODE


vector color = <0,0,0>; // initial color: black
string title = "";      // no title

// Set the title and/or color without using a list
teTitleSet(string message)
{
    // Do we have a color, a title, or both?
    if (llStringLength(message) == 0)
    {
        // We have neither: clear the title
        title = "";
    }
    else
    {
        // Look for a color vector anywhere in the string.
        // llSubStringIndex returns -1 when the pattern is not found.
        integer lt = llSubStringIndex(message, "<");
        integer gt = llSubStringIndex(message, ">");
        string setTitle = message;

        // Do we have a color vector?
        if (lt != -1 && gt > lt)
        {
            // Set the title color to the color vector
            color = (vector)llGetSubString(message, lt, gt);
            // Remove the color vector and keep the rest as the title text
            setTitle = llDeleteSubString(message, lt, gt);
        }

        // Do we have a title once surrounding spaces are trimmed?
        setTitle = llStringTrim(setTitle, STRING_TRIM);
        if (setTitle != "")
        {
            // Set the title to the new title
            title = setTitle;
        }
    }
    // Set title and color
    llSetText(title, color, 1.0);
}


default
{
    state_entry()
    {
        llListen(1, "", NULL_KEY, "");
    }

    listen(integer channel, string name, key id, string message)
    {
        // Check for our command, case insensitive ("TITLE" is indices 0 to 4)
        if (llToUpper(llGetSubString(message, 0, 4)) == "TITLE")
        {
            // Send the rest of the string (if any) to the title-set function.
            // The length check avoids llGetSubString wrapping around when start > end.
            string rest = "";
            if (llStringLength(message) > 6)
                rest = llGetSubString(message, 6, -1);
            teTitleSet(rest);
        }
    }
}

But then again I did this code while bored at work so it may not be too serious of an answer

Eloise Pasteur

Curious Individual

Join date: 14 Jul 2004

Posts: 1,952

12-01-2005 23:43

I'd do it with a list separated on spaces from the input string and not keeping the nulls.

Then test that the first list element is "title" so you can do those tests. Check that the next element says "color", then take the vector from the element after it so you set the colour most quickly.

Chuck this lot and the next two elements (new & title) away, and dump the remainder to a string using spaces as the separator - it doesn't matter how many words you've got then.

Actually I'd probably do it a little differently: snip the elements one at a time, and check for "color" and "new" + "title" without assuming they're there - so I could change the colour but not the wording, the wording but not the colour, or both. But the principle is sound.

I could do exactly the same with strings, searching for the index of the spaces and snipping the parts off one at a time. As for overall efficiency: the string approach will use less memory, but I suspect you'd struggle to see any time difference.
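A minimal sketch of the list-based version for Jamie's example command follows. It assumes the keywords are "color" and "text" as in that command string, treats both as optional, and uses illustrative variable names. Note that splitting on spaces and rejoining collapses any runs of spaces inside the title text.

```lsl
default
{
    state_entry()
    {
        string msg = "Title color <1,0,0> text Jamie's New Title";
        vector color = ZERO_VECTOR;  // defaults used when a part is missing
        string text = "";

        // Split on spaces, discarding empty entries.
        list tokens = llParseString2List(msg, [" "], []);

        if (llToLower(llList2String(tokens, 0)) == "title")
        {
            tokens = llDeleteSubList(tokens, 0, 0);  // snip the command word

            if (llToLower(llList2String(tokens, 0)) == "color")
            {
                // The vector has no internal spaces, so it is a single token.
                color = (vector)llList2String(tokens, 1);
                tokens = llDeleteSubList(tokens, 0, 1);  // snip keyword and vector
            }

            if (llToLower(llList2String(tokens, 0)) == "text")
            {
                // Everything after the keyword is the title, however many words.
                text = llDumpList2String(llDeleteSubList(tokens, 0, 0), " ");
            }

            llSetText(text, color, 1.0);
        }
    }
}
```

Because each keyword is checked before it is consumed, the same code accepts "Title color <1,0,0>" or "Title text Some Words" on their own.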
