Welcome to the Second Life Forums Archive

Discussion: LindenAIML

Azrael Baphomet
Registered User
Join date: 13 Sep 2005
Posts: 93
10-12-2005 08:42
NOTE: Upon viewing the posted code, I note that the BBS software is inserting spaces in the text that don't belong there. So cutting and pasting from the window won't work. Some editing is required. IM or EMAIL me if you want the Linden AIML code without having to find and fix this in the code. Code also available at sourceforge thanks to luciftias: http://sourceforge.net/projects/lindenaiml/

The utility of using AIML as an alternative to conventional "natural language processing" algorithms for spoken language actuation of scripts, objects, and devices in Second Life is fairly obvious. With that in mind I wanted to attempt to develop a very stripped down version of the AIML interpreter in Linden Scripting Language. And by stripped down, I mean STRIPPED DOWN. This isn't a port to LSL. It's a non-XML markup that is somewhat similar to AIML in intent, incorporating a small number of AIML and AIML-like tags. Here's the code...it's messy, and I welcome attempts to clean it up and improve how it runs.

CODE
string gName="filename";
key gQueryID;
list AIMList;
list REPLYList;
string data;
string newmessage;
integer gLine;
integer handle;
integer touched=0;

integer myParse(string message)
{
float last_score=-1.0; //so every score after is an improvement over this
float score;
integer lenaimlist; //length of AIMList

integer begin; //index of message string marking beginning of matched pattern
integer end; //length of matched pattern, so ending index actually begin+end-1
integer begina; //as above
integer enda;
integer begin2;
integer end2;
integer lenclust;
integer i=0; //loop (line) number
integer starindex; //beginning index of *ed expression in message
list msglist; //string message as parsed to list, separated by " "
list clusters; //first cluster of pre * words
list cluster2;//cluster of post * words
list cl; //individual words in clusters parsed to individual list entries
list cl2; //individual words in cluster2 blah blah blah
list headmsglist;
list tailmsglist;
string last_message; //not used yet
string starstring; //string corresponding to *ed characters
integer check_line; //no longer implemented
string resp_line;
integer ind;
string string1;
integer ind2;
string unparsed_clusters_string;
list unparsed_clusters;
list starlist;
string unparsed_reply;
integer end_tag;
string parsed_reply;
list temp_reply_list;
integer reply_index=-1;
integer s_ind;
integer error_cond=0; //use later
string newmsg;
string tempmessage="";
integer j;
integer chat_channel=0;
//begin actual program
integer punc_index=llStringLength(message);
punc_index--;
if(llGetSubString(message,punc_index,punc_index)=="." || llGetSubString(message,punc_index,punc_index)=="!" || llGetSubString(message,punc_index,punc_index)=="?")
{
message=llDeleteSubString(message,punc_index,punc_index);
}
lenaimlist=llGetListLength(AIMList);
reply_index=-1;
while(reply_index==-1 && i< lenaimlist)//for(i=0;i<lenaimlist;i++)
{
clusters=[];
cluster2=[];
cl=[];
cl2=[];
//override loop: if this string occurs anywhere in input, ignore all other parsing and respond with template following override
if(llSubStringIndex(llList2String(AIMList,i),"<override>")!=-1)
{
//strip out <override> and </override>

ind=llSubStringIndex(llList2String(AIMList,i),"<override>");
string1=llDeleteSubString(llList2String(AIMList,i),ind,ind+9);
ind2=llSubStringIndex(string1,"</override>");
string override_string=llDeleteSubString(string1,ind2,ind2+10);//this now represents string to match to message
if(llSubStringIndex(message,override_string)!=-1)
{

reply_index=i;
last_score=10000; //so this will override other messages.
}
}


//find clusters/cluster2 in AIMList entry
if(llSubStringIndex(llList2String(AIMList,i),"<pattern>")!=-1)
{
//strip out <pattern> and </pattern> from every line
ind=llSubStringIndex(llList2String(AIMList,i),"<pattern>");
string1=llDeleteSubString(llList2String(AIMList,i),ind,ind+8);
ind2=llSubStringIndex(string1,"</pattern>");
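unparsed_clusters_string=llDeleteSubString(string1,ind2,ind2+9);//this now represents string to match to message (assigned to the function-level variable rather than redeclared, so the "#" check further down can see it)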
if(llSubStringIndex(unparsed_clusters_string,"*")!=-1 && llSubStringIndex(unparsed_clusters_string,"#")==-1) //parse only lines using wildcard this way. non-wildcard parse follows
{
unparsed_clusters=llParseString2List(unparsed_clusters_string,["*"],[]); //break into 2 clusters, before * and after *



lenclust=llGetListLength(unparsed_clusters);
//unparsed_clusters=llDeleteSubList(unparsed_clusters,lenclust,lenclust);//what does this do?
clusters=llList2List(unparsed_clusters,0,0); //Before * cluster
cluster2=llList2List(unparsed_clusters,1,1);//after * cluster

cl=llParseString2List( (string) clusters,[" "],[]); //breaks out individual words in clusters
cl2=llParseString2List( (string) cluster2,[" "],[]); //breaks out individual words in cluster2
msglist=llParseString2List(message,[" "],[]); //breaks up input message into list of individual words.
if(cl2==[])
cl2=["asdfasfads"]; //insert unmatchable value into cl2
if(cl==[])
cl=["hgsjgh"];

//Case of cluster *
if(llListFindList(msglist,cl)!=-1 && llListFindList(msglist,cl2)==-1)
{

begin =llListFindList(msglist,cl); //find cl in msglist
end=llGetListLength(cl);
end--;
tailmsglist=llDeleteSubList(msglist,begin,begin+end); //return only * and post * words

starindex=llListFindList(msglist,tailmsglist); //locates *
end2=llGetListLength(msglist);
end2--;
//if(end>=0 && begin>=0 && starindex >=0 && end2>=0) //so no negative indexes
// newmsg=llDumpList2String(llListReplaceList(msglist,["*"],starindex,starindex+end2)," "); //replaces phrase with * in message
//what to replace * with in reply
starlist=llList2List(msglist,starindex,starindex+end2);

//score similarity of matched pattern with message string
score= (float) llGetListLength(llListReplaceList(msglist,["*"],starindex, starindex+end2))/(float)llGetListLength(msglist);
if(llListReplaceList(msglist,["*"],starindex,starindex+end2)==cl+"*")
{
reply_index=i;
last_score=score;//this becomes last best score
starstring=llDumpList2String(starlist," ");
}
}

//Case of * cluster2
if(llListFindList(msglist,cl)==-1 && llListFindList(msglist,cl2)!=-1)
{
begin=llListFindList(msglist,cl2);
end=llGetListLength(cl2);
end--;
headmsglist=llDeleteSubList(msglist,begin, begin+end);
starindex=llListFindList(msglist,headmsglist);
end2=llGetListLength(msglist);
//if(end>=0 && begin >=0 && starindex>=0 && end2>=0)
// new
starlist=llList2List(msglist, starindex, starindex+end2);
// starstring=llDumpList2String(starlist," ");
score = (float) llGetListLength(llListReplaceList(msglist,["*"],starindex,starindex+end2))/(float)llGetListLength(msglist);
if(llListReplaceList(msglist,["*"],starindex,starindex+end2)=="*"+cl2)
{
reply_index=i;
last_score=score;
starstring=llDumpList2String(starlist," ");

}
}

//Case of clusters * cluster2
if(llListFindList(msglist,cl)!=-1 && llListFindList(msglist,cl2)!=-1)
{
//first delete clusters from message

begin=llListFindList(msglist, cl);
end=llGetListLength(cl);
end--;
headmsglist=llDeleteSubList(msglist,begin,begin+end);

//then delete cluster2 from tailmsglist
begina =llListFindList(headmsglist, cl2);
enda=llGetListLength(cl2);
enda--;
starlist=llDeleteSubList(headmsglist,begina,begina+enda);
starindex=llListFindList(msglist,starlist);
end2=llGetListLength(starlist);
end2--;
// starstring=llDumpList2String(starlist," ");
score=llGetListLength( llListReplaceList(msglist,["*"],starindex,starindex+end2))/(float)llGetListLength(msglist);

if(llListReplaceList(msglist,["*"],starindex,starindex+end2)==cl+"*"+cl2 )
{
reply_index=i;
last_score=score;
starstring=llDumpList2String(starlist," ");
}
}
}
else
{
if(unparsed_clusters_string==message)
{
reply_index=i;
last_score=100;
}
}
}
if(llSubStringIndex(unparsed_clusters_string,"#")!=-1)
{
//case of second wildcard
}
i++;

}

if(reply_index!=-1) //if a match exists
{
if(llSubStringIndex(llList2String(AIMList,reply_index+1),"<channel>")!=-1)
{
ind=llSubStringIndex(llList2String(AIMList,reply_index+1),"<channel>");
string1=llDeleteSubString(llList2String(AIMList,reply_index+1),ind,ind+8);
ind2=llSubStringIndex(string1,"</channel>");
string chat_channel_string=llDeleteSubString(string1,ind2,ind2+9);
chat_channel=(integer) chat_channel_string;

}
unparsed_reply=llList2String(REPLYList, reply_index);
end_tag=llSubStringIndex(unparsed_reply,"</template>");
end_tag--;
parsed_reply=llGetSubString(unparsed_reply,10,end_tag--); //from end of <template> to </template>
if(llSubStringIndex(parsed_reply, "<star/>")==-1)
{
if(llSubStringIndex(parsed_reply,"<srai")==-1)
{
llSay(chat_channel,parsed_reply);

reply_index=0;
}
else
{
error_cond=-1;
integer lenp=llStringLength(parsed_reply);
integer termin=lenp-8; //index of the last character before </srai>
// integer starindex2=llSubStringIndex(parsed_reply,"<star/>");
// starindex2--;
//if(starindex2!=-1)
//{
// parsed_reply=llGetSubString(parsed_reply,6,starindex) +" " +starstring;
//}
//else
//{

newmessage=llGetSubString(parsed_reply,6,termin);
// }
}
}
else // if <star/> expression exists, insert antecedent
{
if(llSubStringIndex(parsed_reply,"<srai")==-1)
{
temp_reply_list=llParseString2List(parsed_reply,[" "],[]);
s_ind=llListFindList(temp_reply_list,["<star/>"]);
list new_reply_list=llListReplaceList(temp_reply_list,[starstring],s_ind,s_ind);
parsed_reply=llDumpList2String(new_reply_list," ");

llSay(chat_channel,parsed_reply);

reply_index=0;
}
else
{
error_cond=-1;
//substitute the matched text for <star/> and keep whatever follows it, e.g. the " bot" in "<srai>Are you well <star/> bot</srai>"
integer close_tag=llSubStringIndex(parsed_reply,"</srai>");
string srai_text=llGetSubString(parsed_reply,6,close_tag-1); //text between <srai> and </srai>
newmessage=llDumpList2String(llParseStringKeepNulls(srai_text,["<star/>"],[]),starstring);

}

}
}
else
{
llSay(0,"I'm afraid I don't understand what you said...yet");
//llemail line here
}
return error_cond;
}


default
{
state_entry()
{
gQueryID=llGetNotecardLine(gName,gLine);//request first line
gLine++; //increase line count
}

dataserver(key query_id,string data)
{
if(query_id==gQueryID)
{
if(data!=EOF)
{
if( llGetSubString(data,0,3)=="<pat" || llGetSubString(data,0,3)=="<ove" || llGetSubString(data,0,3)=="<cha")
{
if(gLine==0) //for now ignore all but pattern or template lines
{
AIMList=(list) [data];
}
else
{
AIMList=AIMList + [data];
}
}
if(llGetSubString(data,0,3)=="<tem" || llGetSubString(data,0,3)=="<cha" )
{
if(gLine==0) //for now ignore all but pattern or template lines
{
REPLYList=(list) [data];
}
else
{
REPLYList=REPLYList + [data];

}

}
gQueryID=llGetNotecardLine(gName,gLine); //request next line only while lines remain; requesting again after EOF would keep re-triggering dataserver
gLine++;
}
}
}

touch_start(integer total_number)
{


if(touched==0)
{
handle=llListen(0,"",llGetOwner(),"");
llSay(0,"AIML on");
touched++;
}
else
{
llListenRemove(handle);
llSay(0,"AIML off");
touched=0;
}

}

listen(integer channel, string name, key id, string msg)
{

integer error_cond=myParse(msg);
if(error_cond==-1)
{
//message=newmessage;

myParse(newmessage);
}
}
}



I've also included a file to be incorporated as a notecard in an object containing the interpreter. This file, referred to as "filename" in the above code, contains the AIML tagged text to be used as a script for the interpreter.

<category>
<pattern>How are you *</pattern>
<template><star/> is fine</template>
</category>
<category>
<pattern>Are you well * bot</pattern>
<template>This <star/> bot is great</template>
</category>
<category>
<pattern>Who is *</pattern>
<template>I don't know, who is <star/></template>
</category>
<category>
<override>fish</override>
<template>I hate fish</template>
</category>
<category>
<pattern>Are there too many * in this sim</pattern>
<template>I don't know much about <star/></template>
</category>
<category>
<pattern>Are you OK * bot</pattern>
<template><srai>Are you well <star/> bot</srai></template>
</category>
<category>
<pattern>Reply on channel 1</pattern>
<template>OK</template>
<channel>1</channel>
</category>

A brief explication of the tags. As of this writing, the <category> tag is not used. In the future it will contain flags about topic and such like. But right now it's simply included to keep the LindenAIML file looking somewhat familiar to AIML users.

The <pattern> tags contain patterns to match user utterances.
The <template> tags contain hypothetical replies to the patterns.
The <channel> tag (not included in every entry contained in a <category></category> pair) allows the user to determine what channel the reply will occur on, so the bot can issue commands to other devices listening on those channels.

<override> represents a special case of <pattern>. In the example, if the user says anything that contains the word "fish", the reply will be that in the <template></template> tags immediately following the <override></override> pair.
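
The script above repeats the same "strip the opening and closing tag" step for <override>, <pattern> and <channel>. One way to do that in a single place is sketched below. The function name tagBody is made up for illustration and is not part of the posted script.

```lsl
// Hypothetical helper (not in the posted script): returns the text between
// <tag> and </tag> on one notecard line, or "" if the pair is missing or empty.
string tagBody(string line, string tag)
{
    string open_tag = "<" + tag + ">";
    string close_tag = "</" + tag + ">";
    integer start = llSubStringIndex(line, open_tag);
    integer stop = llSubStringIndex(line, close_tag);
    if (start == -1 || stop == -1)
        return "";
    start += llStringLength(open_tag); // first character after the opening tag
    if (stop <= start)
        return ""; // empty body; also avoids llGetSubString wrapping around when start > end
    return llGetSubString(line, start, stop - 1);
}

default
{
    state_entry()
    {
        // Says: How are you *
        llOwnerSay(tagBody("<pattern>How are you *</pattern>", "pattern"));
        // Says: 1
        llOwnerSay(tagBody("<channel>1</channel>", "channel"));
    }
}
```

With a helper like this, the <override> test reduces to checking whether tagBody(line, "override") occurs anywhere in the user's message.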

Finally, end of utterance punctuation is stripped from the user message at this point, so it is not necessary to include punctuation at the end of patterns (although intra-utterance punctuation is not stripped out).

on edit: it is important to note that LindenAIML is sensitive to the format of the AIML file. No extra spaces between tags, and line-format should be as above.
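
Some of that format sensitivity could probably be eased by cleaning up each notecard line and each chat message before matching. The following is only a sketch: it assumes llStringTrim is available in your version of LSL (the posted script does not use it), and both function names are hypothetical.

```lsl
// Hypothetical helpers, not part of the posted script.

// Trim leading/trailing whitespace so that "  <pattern>..." still starts with "<pat".
string cleanLine(string line)
{
    return llStringTrim(line, STRING_TRIM);
}

// Trim the message and drop any run of trailing ".", "!" or "?" characters.
string cleanMessage(string msg)
{
    msg = llStringTrim(msg, STRING_TRIM);
    integer last = llStringLength(msg) - 1;
    while (last >= 0 && llSubStringIndex(".!?", llGetSubString(msg, last, last)) != -1)
    {
        msg = llDeleteSubString(msg, last, last);
        --last;
    }
    return msg;
}

default
{
    state_entry()
    {
        // Says: Who is Philip
        llOwnerSay(cleanMessage("  Who is Philip?!  "));
    }
}
```

cleanLine would be applied to each line in dataserver before the "<pat" / "<tem" checks, and cleanMessage to the text passed to myParse. Capitalization is a separate issue; lowercasing both sides with llToLower is one option, at the cost of lowercasing the <star/> text as well.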

On ANOTHER edit: I also notice that there are some variables in the code that are no longer used (like score and last_score, I believe). I'll clean this kind of thing up over the next few days.

UPDATE 10/14/2005 4:32 PM EASTERN: changed for loop to while loop to increase pattern matching speed, added a feature that allows you to switch unit on and off with repeated touches. Also now only listens to owner.
PROBLEM: see note at beginning of post...
Nada Epoch
The Librarian
Join date: 4 Nov 2002
Posts: 1,423
Original Thread
10-12-2005 19:06
/15/2c/65411/1.html
_____________________
i've got nothing. ;)
Azrael Baphomet
Registered User
Join date: 13 Sep 2005
Posts: 93
Some more tags.
10-13-2005 07:49
I note that I failed to describe what a few tags do. For those of you familiar with AIML, this shouldn't be a problem. But for those who are not, here are some more:

<srai></srai> is the "syntactic reduction" tagpair. If a pattern produces a template with syntactic reduction tags enclosing the response, that response is fed to the algorithm as a new user message to parse. The upshot of this: Patterns followed by <srai> tags are regarded as equivalent to the text contained in the <srai> tagpair, which should be defined elsewhere in the file (and by convention, previously in the file). Too many of these can slow operation considerably. Be warned.
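
The listen handler in the script in the first post follows a single reduction: it calls myParse once more when the first call returns -1. If chains of <srai> entries are wanted, a bounded loop is one way to follow them without risking an endless cycle. This is a sketch under assumptions: myParse below is a stand-in for the real function, and MAX_SRAI_DEPTH is a made-up limit.

```lsl
// Sketch only: myParse here is a stand-in for the real function in the first post.
integer MAX_SRAI_DEPTH = 5; // assumed limit, tune to taste
string newmessage;

integer myParse(string message)
{
    // Stand-in behaviour: treat "Are you OK" as a reduction to "Are you well".
    if (message == "Are you OK")
    {
        newmessage = "Are you well";
        return -1; // -1 means "parse newmessage next", as in the posted script
    }
    llSay(0, "Reply to: " + message);
    return 0;
}

default
{
    state_entry()
    {
        llListen(0, "", llGetOwner(), "");
    }

    listen(integer channel, string name, key id, string msg)
    {
        integer depth = 0;
        integer status = myParse(msg);
        // Follow <srai> reductions, but give up after MAX_SRAI_DEPTH hops
        // so a circular pair of patterns cannot loop forever.
        while (status == -1 && depth < MAX_SRAI_DEPTH)
        {
            status = myParse(newmessage);
            ++depth;
        }
    }
}
```

Each hop rescans the pattern list, which is why many <srai> entries slow things down.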

The <star/> tag is an isolated tag and represents the insertion point for phrases denoted by the wildcard "*" in the pattern to match. So if I say: "I hate you bot", and the pattern to match is "I * you bot", the template reply might be "Well, I <star/> you too.", resulting in a reply: "Well, I hate you too."
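
The substitution itself can be written as a split-and-rejoin on the tag. The sketch below is not how the posted script does it (that one splits the template on spaces); fillStar is a hypothetical name, and it assumes llParseStringKeepNulls is available.

```lsl
// Hypothetical helper: replace every <star/> in a template with the matched text.
string fillStar(string template, string star)
{
    // Split on the tag (keeping empty pieces so a leading or trailing
    // <star/> is not lost), then rejoin with the star text as the glue.
    return llDumpList2String(llParseStringKeepNulls(template, ["<star/>"], []), star);
}

default
{
    state_entry()
    {
        // Says: Well, I hate you too.
        llOwnerSay(fillStar("Well, I <star/> you too.", "hate"));
    }
}
```

Because it works on the raw string, this also copes with a <star/> that touches punctuation, such as "<star/>." at the end of a template.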

Any more questions, feel free to IM me, or my bro Luciftias Neurocam.
Ziggy Puff
Registered User
Join date: 15 Jul 2005
Posts: 1,143
10-13-2005 08:14
I think this is a great idea. I just got done setting up an ALICE bot on my server and writing the LSL code to communicate with it so I have a speaking AI object. That was just for fun, I never considered using the AI to do anything more than have a conversation. Your idea could be used for a lot of things.
Azrael Baphomet
Registered User
Join date: 13 Sep 2005
Posts: 93
10-13-2005 08:21
From: Ziggy Puff
I think this is a great idea. I just got done setting up an ALICE bot on my server and writing the LSL code to communicate with it so I have a speaking AI object. That was just for fun, I never considered using the AI to do anything more than have a conversation. Your idea could be used for a lot of things.


I specifically set about working on this project because I hated all the work I would have to go to in setting up an ALICE bot on my server, just to implement a few natural language commands for me here in SL.

I'm currently using it to design a "servant" called "Dogsbody" that is capable of effecting various jobs around my property, and helping me manage my inventory.

One caveat....like everything else on SL, LindenAIML is subject to server load. So a very long AIML file, (particularly one with a lot of <srai> tags) has the potential to run quite slowly when server load is high. For most uses of LindenAIML, I don't think this is a huge issue. But Ya pays your money, Ya takes your chances!
Ziggy Puff
Registered User
Join date: 15 Jul 2005
Posts: 1,143
10-13-2005 09:05
It could also be used to do a greeter/concierge bot. New visitors to your sim could ask about the services/attractions available, where to find something, stuff like that.

A couple of non-AIML suggestions. You could use touch_start to toggle the listener on and off, that way the bot isn't processing all chat around it all the time. Probably also turn the listener off on a timer if no one has said anything for N minutes. Use some visual indicator to let people know if the bot is currently 'awake' or 'asleep'.

Also, you could open a listener just for the person who touched it. Hopefully that'll be a little less laggy, and the bot won't get confused with a normal human conversation going on around it. This becomes more relevant if the bot might be used with multiple people around it.
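
A rough sketch of those suggestions in one script follows. The 120-second timeout, the hover text and the function name goToSleep are all assumptions for illustration; the call to the AIML parser is left as a comment.

```lsl
// Sketch of the suggestions above; names and the timeout value are assumptions.
float IDLE_TIMEOUT = 120.0; // seconds of silence before the bot goes back to sleep
integer handle;
integer awake = FALSE;

goToSleep()
{
    llListenRemove(handle);
    llSetTimerEvent(0.0); // stop the idle timer
    llSetText("asleep (touch to wake)", <1.0, 0.0, 0.0>, 1.0);
    awake = FALSE;
}

default
{
    state_entry()
    {
        llSetText("asleep (touch to wake)", <1.0, 0.0, 0.0>, 1.0);
    }

    touch_start(integer total_number)
    {
        if (awake)
        {
            goToSleep();
        }
        else
        {
            // Listen only to the avatar who touched the object.
            handle = llListen(0, "", llDetectedKey(0), "");
            llSetTimerEvent(IDLE_TIMEOUT);
            llSetText("awake", <0.0, 1.0, 0.0>, 1.0);
            awake = TRUE;
        }
    }

    listen(integer channel, string name, key id, string msg)
    {
        llSetTimerEvent(IDLE_TIMEOUT); // restart the idle countdown
        // myParse(msg) would go here
    }

    timer()
    {
        goToSleep();
    }
}
```

As written, anyone can put the bot back to sleep by touching it; checking llDetectedKey(0) against the current speaker would restrict that.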
Azrael Baphomet
Registered User
Join date: 13 Sep 2005
Posts: 93
10-13-2005 09:34
From: Ziggy Puff
It could also be used to do a greeter/concierge bot. New visitors to your sim could ask about the services/attractions available, where to find something, stuff like that.

A couple of non-AIML suggestions. You could use touch_start to toggle the listener on and off, that way the bot isn't processing all chat around it all the time. Probably also turn the listener off on a timer if no one has said anything for N minutes. Use some visual indicator to let people know if the bot is currently 'awake' or 'asleep'.

Also, you could open a listener just for the person who touched it. Hopefully that'll be a little less laggy, and the bot won't get confused with a normal human conversation going on around it. This becomes more relevant if the bot might be used with multiple people around it.


Good ideas all. I'd also appreciate it if people ran this script and let me know what works, what doesn't and what they'd like to see in the future. I have a feeling the biggest complaint will be that LindenAIML is so sensitive to the format of the AIML file (extra spaces between tags and text, the one-tagpair-per-line constraint, capitalization...).
Logan Bauer
Inept Adept
Join date: 13 Jun 2004
Posts: 2,237
10-13-2005 10:46
Nice! I'll have to check this out when I get some time, very good idea! :)
Luciftias Neurocam
Ecosystem Design
Join date: 13 Oct 2005
Posts: 742
Code edits...
10-14-2005 07:53
Az....I just wanted to let you know I updated the code above so that LindenAIML stops reading the pattern list when a sufficient match is found. As it stands now the script, even after it finds a match, continues scrolling through the list. Improves performance by about 20-30% this morning.

I gave you the new code on a notecard. Edit your post up top when you can, thanks!


CBH
Azrael Baphomet
Registered User
Join date: 13 Sep 2005
Posts: 93
HELP!!! and updates
10-14-2005 13:49
Hey guys, I made the changes specified by Luciftias above. I am having a problem with posting the code, though...specifically this is happening:

parsed_reply=llGetSubString(unparsed_reply,10,end_ tag--);

See the space between "end_" and "tag"? That should be one variable. But the act of posting it splits it up. This doesn't happen consistently for every variable with a "_" in it. Any ideas?

Oh well, the script works, and if you want a copy without having to fix those strange breaks, IM me.
Ziggy Puff
Registered User
Join date: 15 Jul 2005
Posts: 1,143
10-14-2005 13:52
Put some whitespace into the line, and the forum won't insert its own whitespace. I learnt this quite recently.

CODE
parsed_reply = llGetSubString(unparsed_reply, 10, end_tag--);


And use PHP tags instead of code, it performs some syntax coloring which makes the code easier to read.
Azrael Baphomet
Registered User
Join date: 13 Sep 2005
Posts: 93
10-14-2005 13:57
From: Ziggy Puff
Put some whitespace into the line, and the forum won't insert its own whitespace. I learnt this quite recently.

CODE
parsed_reply = llGetSubString(unparsed_reply, 10, end_tag--);


And use PHP tags instead of code, it performs some syntax coloring which makes the code easier to read.


That will require considerable editing...so forgive me if I don't get it done right away. ;)

Also, I incorporated your suggestions from yesterday. Thanks!
Luciftias Neurocam
Ecosystem Design
Join date: 13 Oct 2005
Posts: 742
10-17-2005 11:43
Anyone who is interested...I've started a sourceforge project for LindenAIML. Join if you want. Or not. But the package (source and docs culled from Az' posts -thanks Az.) thus far is available at:

http://sourceforge.net/projects/lindenaiml/

I've also added a second wildcard feature, using the "#" characters for strings of the format

"string1 * string2 # string3"

The other usable case for 2 wildcards:

"* string #"

I hope to finish tonight. I don't believe that

"String * #" is doable, because the machine will have a really difficult time distinguishing it from
a simple "string *"
Azrael Baphomet
Registered User
Join date: 13 Sep 2005
Posts: 93
Try out my LindenAIML concierge
11-01-2005 19:52
At our place in Miata (12,13), I've set up a concierge in the lobby of my office building, written in LindenAIML. It's designed to answer questions about the language, but not having written a major chatbot before, I've only included about 15 patterns in the concierge's LAIML file. So I need people to ask it questions it can't answer. Patterns without matches are forwarded to my LAIML network, so I can target better replies for the bot.
Luciftias Neurocam
Ecosystem Design
Join date: 13 Oct 2005
Posts: 742
Stack-Heap Collisions in 1.7
11-07-2005 07:10
Since 1.7 was installed, I've noticed that long LindenAIML lists can cause sporadic stack-heap collisions. The fix for this is coming shortly, but for anyone who's playing with this, it works as follows: create a listener object that relays the pattern to several linked objects, each containing a non-overlapping portion of the lists. Each list-container object passes a message back to the listener object when it has found a match. The script then finishes normally.

I'm still testing this, but so far there doesn't seem to be a limit on the number of objects that contain lists and do the string parsing.
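
For anyone who wants to experiment before the fix is posted, the relay arrangement described above might look roughly like this with link messages. It is a guess at the shape, not the actual fix: the message numbers are arbitrary, and the worker does a plain exact-match lookup on stand-in data where the real one would run the LindenAIML parser on its slice of the lists.

Relay script, in the root prim:

```lsl
// Sketch only; the message numbers are arbitrary assumptions.
integer MSG_QUERY = 1000; // relay -> workers: "try to match this"
integer MSG_MATCH = 1001; // worker -> relay: "I matched, here is the reply"

default
{
    state_entry()
    {
        llListen(0, "", llGetOwner(), "");
    }

    listen(integer channel, string name, key id, string msg)
    {
        // Hand the message to every other prim in the link set.
        llMessageLinked(LINK_ALL_OTHERS, MSG_QUERY, msg, NULL_KEY);
    }

    link_message(integer sender_num, integer num, string str, key id)
    {
        if (num == MSG_MATCH)
            llSay(0, str);
    }
}
```

Worker script, one copy per child prim, each with its own share of the lists:

```lsl
integer MSG_QUERY = 1000;
integer MSG_MATCH = 1001;

// This worker's share of the patterns and replies (stand-in data).
list patterns = ["hello", "goodbye"];
list replies = ["Hi there", "See you"];

default
{
    link_message(integer sender_num, integer num, string str, key id)
    {
        if (num != MSG_QUERY) return;
        integer i = llListFindList(patterns, [str]);
        if (i != -1)
            llMessageLinked(sender_num, MSG_MATCH, llList2String(replies, i), NULL_KEY);
    }
}
```

If two workers can match the same message, the relay will speak twice, so the lists need to stay non-overlapping or the relay needs to keep only the first answer.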
Navillus Batra
Registered User
Join date: 4 Jul 2006
Posts: 22
HttpRequest AIML Interface
08-13-2006 19:37
We have written a script that interfaces via the HttpRequest protocols with an AIML-based chatbot hosted on Pandorabots' free service. To give it better 'awareness' of Second Life we have developed a set of keywords that are programmed into the AIML, but are parsed by the script to complete function calls. We have also started working on a Second Life specific AIML file, but it will always be a work in progress because of its importance.
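
For readers who have not used HTTP requests from LSL, the general shape of such a script is sketched below. This is not our script and not the Pandorabots interface: BOT_URL and the "input" form field are placeholders, and the real address and parameters have to come from the hosting service's documentation.

```lsl
// Sketch only. BOT_URL and the "input" field are placeholders, not a real service API.
string BOT_URL = "http://example.com/chatbot";
key gRequest;

default
{
    state_entry()
    {
        llListen(0, "", llGetOwner(), "");
    }

    listen(integer channel, string name, key id, string msg)
    {
        // Send the chat line to the remote bot as a form-encoded POST.
        gRequest = llHTTPRequest(BOT_URL,
            [HTTP_METHOD, "POST", HTTP_MIMETYPE, "application/x-www-form-urlencoded"],
            "input=" + llEscapeURL(msg));
    }

    http_response(key request_id, integer status, list metadata, string body)
    {
        if (request_id != gRequest) return; // not the request we sent
        if (status == 200)
            llSay(0, body);
        else
            llOwnerSay("Chatbot request failed with status " + (string)status);
    }
}
```

A real reply would normally need parsing before it is spoken, and that is also the point where keywords in the reply could be picked out and turned into function calls.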

We have set up several webpages detailing our SL Chatbot.

http://www.wetwarehacker.com/secondlifechatbot.html