Welcome to the Second Life Forums Archive

LindenAIML

Azrael Baphomet
Registered User
Join date: 13 Sep 2005
Posts: 93
10-12-2005 08:42
AIML is a fairly obvious alternative to conventional "natural language processing" algorithms for spoken-language actuation of scripts, objects, and devices in Second Life. With that in mind, I wanted to develop a very stripped-down version of the AIML interpreter in Linden Scripting Language. And by stripped down, I mean STRIPPED DOWN. This isn't a port to LSL. It's a non-XML markup that is somewhat similar to AIML in intent, incorporating a small number of AIML and AIML-like tags. Here's the code. It's messy, and I welcome attempts to clean it up and make it run better.

CODE

//LindenAIML developed by Azrael Baphomet and Luciftias Neurocam. Distributed under the terms of the GNU Public License.
//As a courtesy, please report mods to Luciftias Neurocam or Azrael Baphomet.


string gName="filename";
key gQueryID;
list AIMList;
list REPLYList;
string data;
string newmessage;
integer gLine;


integer myParse(string message)
{
float last_score=-1.0; //so every score after is an improvement over this
float score;
integer lenaimlist; //length of AIMList

integer begin; //index of message string marking beginning of matched pattern
integer end; //length of matched pattern, so ending index actually begin+end-1
integer begina; //as above
integer enda;
integer begin2;
integer end2;
integer lenclust;
integer i=0; //loop (line) number
integer starindex; //beginning index of *ed expression in message
list msglist; //string message as parsed to list, separated by " "
list clusters; //first cluster of pre * words
list cluster2;//cluster of post * words
list cl; //individual words in clusters parsed to individual list entries
list cl2; //individual words in cluster2 blah blah blah
list headmsglist;
list tailmsglist;
string last_message; //not used yet
string starstring; //string corresponding to *ed characters
integer check_line; //no longer implemented
string resp_line;
integer ind;
string string1;
integer ind2;
string unparsed_clusters_string;
list unparsed_clusters;
list starlist;
string unparsed_reply;
integer end_tag;
string parsed_reply;
list temp_reply_list;
integer reply_index=-1;
integer s_ind;
integer error_cond=0; //use later
string newmsg;
string tempmessage="";
integer j;
integer chat_channel=0;
//begin actual program
//remove punctuation
integer punc_index=llStringLength(message);
punc_index--;
if(llGetSubString(message,punc_index,punc_index)=="." || llGetSubString(message,punc_index,punc_index)=="!" || llGetSubString(message,punc_index,punc_index)=="?")
{
message=llDeleteSubString(message,punc_index,punc_index);
}
lenaimlist=llGetListLength(AIMList);
for(i=0;i<lenaimlist;i++)
{
clusters=[];
cluster2=[];
cl=[];
cl2=[];
//override loop: if this string occurs anywhere in input, ignore all other parsing and respond with template following override
if(llSubStringIndex(llList2String(AIMList,i),"<override>")!=-1)
{
//strip out <override> and </override>

ind=llSubStringIndex(llList2String(AIMList,i),"<override>");
string1=llDeleteSubString(llList2String(AIMList,i),ind,ind+9);
ind2=llSubStringIndex(string1,"</override>");
string override_string=llDeleteSubString(string1,ind2,ind2+10);//this now represents string to match to message
if(llSubStringIndex(message,override_string)!=-1)
{

reply_index=i;
last_score=10000; //so this will override other messages.
}
}


//find clusters/cluster2 in AIMList entry
if(llSubStringIndex(llList2String(AIMList,i),"<pattern>")!=-1)
{
//strip out <pattern> and </pattern> from every line
ind=llSubStringIndex(llList2String(AIMList,i),"<pattern>");
string1=llDeleteSubString(llList2String(AIMList,i),ind,ind+8);
ind2=llSubStringIndex(string1,"</pattern>");
string unparsed_clusters_string=llDeleteSubString(string1,ind2,ind2+9);//this now represents string to match to message
if(llSubStringIndex(unparsed_clusters_string,"*")!=-1 && llSubStringIndex(unparsed_clusters_string,"#")==-1) //parse only lines using wildcard this way. non-wildcard parse follows
{
//break into 2 clusters, before * and after *; KeepNulls keeps an empty entry when * starts or ends the pattern
unparsed_clusters=llParseStringKeepNulls(unparsed_clusters_string,["*"],[]);
clusters=llList2List(unparsed_clusters,0,0); //before * cluster
cluster2=llList2List(unparsed_clusters,1,1); //after * cluster
cl=llParseString2List((string)clusters,[" "],[]); //breaks out individual words in clusters
cl2=llParseString2List((string)cluster2,[" "],[]); //breaks out individual words in cluster2
msglist=llParseString2List(message,[" "],[]); //breaks up input message into list of individual words
integer lencl=llGetListLength(cl); //0 when the pattern starts with *
integer lencl2=llGetListLength(cl2); //0 when the pattern ends with *
integer lenmsg=llGetListLength(msglist);
//the * has to stand for at least one word, so the message must be longer than the literal words
if(lenmsg>lencl+lencl2)
{
integer matched=TRUE;
//== on lists compares lengths only, so compare the joined words instead
if(lencl>0 && llDumpList2String(llList2List(msglist,0,lencl-1)," ")!=llDumpList2String(cl," "))
matched=FALSE; //message does not start with the pre * words
if(lencl2>0 && llDumpList2String(llList2List(msglist,lenmsg-lencl2,lenmsg-1)," ")!=llDumpList2String(cl2," "))
matched=FALSE; //message does not end with the post * words
//score similarity of matched pattern with message string: literal words plus the *, over message length
score=(float)(lencl+lencl2+1)/(float)lenmsg;
if(matched && score>last_score)
{
reply_index=i;
last_score=score; //this becomes last best score
starlist=llList2List(msglist,lencl,lenmsg-lencl2-1); //the words the * stood for
starstring=llDumpList2String(starlist," ");
}
}
}
else
{
if(unparsed_clusters_string==message && last_score<100)
{
reply_index=i;
last_score=100; //exact matches beat any wildcard score
}
}
}
if(llSubStringIndex(unparsed_clusters_string,"#")!=-1)
{
//case of second wildcard
}
}
if(reply_index!=-1) //if a match exists
{
if(llSubStringIndex(llList2String(AIMList,reply_index+1),"<channel>")!=-1)
{
ind=llSubStringIndex(llList2String(AIMList,reply_index+1),"<channel>");
string1=llDeleteSubString(llList2String(AIMList,reply_index+1),ind,ind+8);
ind2=llSubStringIndex(string1,"</channel>");
string chat_channel_string=llDeleteSubString(string1,ind2,ind2+9);
chat_channel=(integer) chat_channel_string;

}
unparsed_reply=llList2String(REPLYList, reply_index);
end_tag=llSubStringIndex(unparsed_reply,"</template>");
end_tag--;
parsed_reply=llGetSubString(unparsed_reply,10,end_tag--); //from end of <template> to </template>
if(llSubStringIndex(parsed_reply, "<star/>")==-1)
{
if(llSubStringIndex(parsed_reply,"<srai")==-1)
{
llSay(chat_channel,parsed_reply);

reply_index=0;
}
else
{
error_cond=-1;
//strip <srai> (6 chars) and </srai> (7 chars); the inner text is re-parsed as a new message
newmessage=llGetSubString(parsed_reply,6,llStringLength(parsed_reply)-8);
}
}
else // if <star/> expression exists, insert antecedant
{
if(llSubStringIndex(parsed_reply,"<srai")==-1)
{
//substitute by substring so <star/> also works when it touches punctuation
s_ind=llSubStringIndex(parsed_reply,"<star/>");
parsed_reply=llInsertString(llDeleteSubString(parsed_reply,s_ind,s_ind+6),s_ind,starstring);

llSay(chat_channel,parsed_reply);

reply_index=0;
}
else
{
error_cond=-1;
//strip <srai> (6 chars) and </srai> (7 chars), leaving the text to re-parse
string inner=llGetSubString(parsed_reply,6,llStringLength(parsed_reply)-8);
integer starindex2=llSubStringIndex(inner,"<star/>");
if(starindex2!=-1)
{
//swap <star/> for the matched words, keeping any text after it
newmessage=llInsertString(llDeleteSubString(inner,starindex2,starindex2+6),starindex2,starstring);
}
else
{
newmessage=inner;
}
}

}
}
else
{
llSay(0,"I'm afraid I don't understand what you said...yet");
//llemail line here
}
return error_cond;
}


default
{
state_entry()
{
gQueryID=llGetNotecardLine(gName,gLine);//request first line
gLine++; //increase line count
}

dataserver(key query_id,string data)
{
if(query_id==gQueryID)
{
if(data!=EOF)
{
//keep only pattern, override, channel and template lines; ignore everything else
string tag=llGetSubString(data,0,3);
if(tag=="<pat" || tag=="<ove" || tag=="<cha")
{
AIMList=AIMList + [data];
}
if(tag=="<tem" || tag=="<cha")
{
REPLYList=REPLYList + [data]; //<channel> lines go in both lists so the indexes stay aligned
}
gQueryID=llGetNotecardLine(gName,gLine); //request next line only while lines remain, so reading stops at EOF
gLine++;
}
}
}

touch_start(integer total_number)
{
llSay(0,"Hello User");
integer handle=llListen(0,"","","");
}

listen(integer channel, string name, key id, string msg)
{
//basically a wrapper to pass message to myParse();
//msg=llToLower(msg);
//llSay(0,msg);
integer error_cond=myParse(msg);
if(error_cond==-1)
{
//message=newmessage;

myParse(newmessage);
}
}
}



I've also included a file to be incorporated as a notecard in an object containing the interpreter. This file, referred to as "filename" in the above code, contains the AIML tagged text to be used as a script for the interpreter.

<category>
<pattern>How are you *</pattern>
<template><star/> is fine</template>
</category>
<category>
<pattern>Are you well * bot</pattern>
<template>This <star/> bot is great</template>
</category>
<category>
<pattern>Who is *</pattern>
<template>I don't know, who is <star/></template>
</category>
<category>
<override>fish</override>
<template>I hate fish</template>
</category>
<category>
<pattern>Are there too many * in this sim</pattern>
<template>I don't know much about <star/></template>
</category>
<category>
<pattern>Are you OK * bot</pattern>
<template><srai>Are you well <star/> bot</srai></template>
</category>
<category>
<pattern>Reply on channel 1</pattern>
<template>OK</template>
<channel>1</channel>
</category>

A brief explication of the tags. As of this writing, the <category> tag is not used. In the future it will contain flags about topic and such like. But right now it's simply included to keep the LindenAIML file looking somewhat familiar to AIML users.

The <pattern> tags contain patterns to match user utterances.
The <template> tags contain hypothetical replies to the patterns.
The <channel> tag (not required in every <category></category> pair) lets the user determine which channel the reply is spoken on, so the bot can issue commands to other devices listening on that channel.
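
To make the channel idea concrete, here is a minimal sketch of a script for a device on the receiving end. It assumes the bot answers "OK" on channel 1, as in the sample notecard above; the colour change is only a placeholder action and is not part of LindenAIML.

```lsl
// Hypothetical receiving device (not part of LindenAIML).
// Assumes the bot says "OK" on channel 1, as in the sample notecard.
integer CHANNEL = 1;

default
{
    state_entry()
    {
        // Empty name, NULL_KEY and empty message mean: accept any speaker and any text.
        llListen(CHANNEL, "", NULL_KEY, "");
    }

    listen(integer channel, string name, key id, string msg)
    {
        if (msg == "OK")
        {
            // Placeholder action: turn green as a visible acknowledgement.
            llSetColor(<0.0, 1.0, 0.0>, ALL_SIDES);
        }
    }
}
```

The device has to be within llSay range (20 metres) of the bot to hear the reply.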

<override> represents a special case of <pattern>. In the example, if the user says anything that contains the word "fish", the reply will be that in the <template></template> tags immediately following the <override></override> pair.
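
As a rough illustration of the override check, the sketch below pulls the keyword out of an <override> line and tests whether it occurs anywhere in the message. The helper names tagContent and isOverride are made up for this example and do not appear in the script above.

```lsl
// Returns the text between <tag> and </tag> in line, or "" if the pair is missing or empty.
// tagContent is a hypothetical helper name.
string tagContent(string line, string tag)
{
    string open = "<" + tag + ">";
    string close = "</" + tag + ">";
    integer start = llSubStringIndex(line, open);
    integer stop = llSubStringIndex(line, close);
    if (start == -1 || stop == -1) return "";
    start += llStringLength(open);
    if (stop <= start) return "";
    return llGetSubString(line, start, stop - 1);
}

// TRUE when the override keyword occurs anywhere in the message.
integer isOverride(string line, string message)
{
    string keyword = tagContent(line, "override");
    if (keyword == "") return FALSE;
    return llSubStringIndex(message, keyword) != -1;
}

default
{
    state_entry()
    {
        if (isOverride("<override>fish</override>", "I like fish and chips"))
        {
            llOwnerSay("I hate fish");
        }
    }
}
```

Because this is a plain substring test, "fish" would also fire on a message containing "selfish".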

Finally, end of utterance punctuation is stripped from the user message at this point, so it is not necessary to include punctuation at the end of patterns (although intra-utterance punctuation is not stripped out).
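
A small helper along these lines would do the stripping. Note the assumption: this sketch removes every trailing ".", "!" or "?", whereas the posted script removes only a single trailing character. stripEndPunctuation is a hypothetical name.

```lsl
// Removes all trailing '.', '!' and '?' characters from message.
// (The posted script strips just one; stripping all of them is this sketch's assumption.)
string stripEndPunctuation(string message)
{
    integer last = llStringLength(message) - 1;
    while (last >= 0 && llSubStringIndex(".!?", llGetSubString(message, last, last)) != -1)
    {
        message = llDeleteSubString(message, last, last);
        last--;
    }
    return message;
}

default
{
    state_entry()
    {
        // Punctuation inside the utterance (the comma here) is left alone.
        llOwnerSay(stripEndPunctuation("Well, who is Philip?!"));
    }
}
```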

on edit: it is important to note that LindenAIML is sensitive to the format of the AIML file. No extra spaces between tags, and line-format should be as above.
Nada Epoch
The Librarian
Join date: 4 Nov 2002
Posts: 1,423
Discussion Thread
10-12-2005 19:06
/54/d3/65497/1.html
_____________________
i've got nothing. ;)
Azrael Baphomet
Registered User
Join date: 13 Sep 2005
Posts: 93
Missed a few flags
10-13-2005 07:48
I note that I failed to describe what a few tags do. For those of you familiar with AIML, this shouldn't be a problem. But for those who are not, here are some more:

<srai></srai> is the "syntactic reduction" tag pair. If a pattern's template encloses its response in <srai> tags, that response is fed back to the algorithm as a new user message to parse. The upshot: a pattern whose template is wrapped in <srai> tags is treated as equivalent to the pattern inside the <srai> pair, which should be defined elsewhere in the file (and, by convention, earlier in the file).
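
Here is a toy sketch of that reduction, using exact-match patterns only so it stays short. The PATTERNS and TEMPLATES lists, the respond function and the MAX_HOPS limit are all assumptions for illustration; the posted script re-parses a reduced message once, while this version follows a chain of <srai> templates up to MAX_HOPS times.

```lsl
// Toy data: entry i of TEMPLATES answers entry i of PATTERNS (hypothetical, exact match only).
list PATTERNS = ["Are you well", "Are you OK", "You good"];
list TEMPLATES = ["I am great", "<srai>Are you well</srai>", "<srai>Are you OK</srai>"];
integer MAX_HOPS = 5; // guard against <srai> templates that point at each other forever

string respond(string message)
{
    integer hops;
    for (hops = 0; hops < MAX_HOPS; hops++)
    {
        integer idx = llListFindList(PATTERNS, [message]);
        if (idx == -1) return "I'm afraid I don't understand what you said...yet";
        string template = llList2String(TEMPLATES, idx);
        // A template that does not start with <srai> is a final reply.
        if (llSubStringIndex(template, "<srai>") != 0) return template;
        // Strip <srai> (6 chars) and </srai> (7 chars); the rest becomes the new message.
        message = llGetSubString(template, 6, llStringLength(template) - 8);
    }
    return ""; // gave up: too many reductions
}

default
{
    state_entry()
    {
        // "You good" reduces to "Are you OK", then to "Are you well".
        llOwnerSay(respond("You good"));
    }
}
```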

The <star/> tag is a standalone tag that marks the insertion point for the phrase matched by the wildcard "*" in the pattern. So if I say "I hate you bot" and the pattern to match is "I * you bot", the template reply might be "Well, I <star/> you too.", resulting in the reply "Well, I hate you too."
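
The substitution itself can be sketched in a few lines. fillStar is a hypothetical helper name, and it assumes at most one <star/> per template, which is all the sample notecard uses.

```lsl
// Replaces the first <star/> in template with the words the wildcard matched.
// fillStar is a hypothetical name; only one <star/> per template is assumed.
string fillStar(string template, string starstring)
{
    integer pos = llSubStringIndex(template, "<star/>");
    if (pos == -1) return template; // nothing to fill in
    // "<star/>" is 7 characters, so delete pos..pos+6 and insert the match there.
    return llInsertString(llDeleteSubString(template, pos, pos + 6), pos, starstring);
}

default
{
    state_entry()
    {
        // Says: Well, I hate you too.
        llOwnerSay(fillStar("Well, I <star/> you too.", "hate"));
    }
}
```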

Any more questions, feel free to IM me, or my bro Luciftias Neurocam.