list access concurrency? - what is the best practice re updating lists?
|
Greg Hauptmann
Registered User
Join date: 30 Oct 2005
Posts: 283
|
12-13-2005 17:36
Hi, I'm looking at having a high score list which can be updated as an event occurs. What do I need to do to ensure there aren't any concurrency issues when updating the list, i.e. where there may be multiple updates occurring? E.g. is the best practice to take a copy (call it B) of the current high score list A within a function, modify it, then return the new list, which the calling code then uses to replace the global list A? What if another user had triggered an update during this, however? Is there a way to "synchronise" a portion of code so it can only be accessed by one client at a time, "blocking" the caller until it's ready? Suggestions?

EDIT - see my post at the following URL, as that's where I've posted the 2 design UML diagrams that are referred to: http://forums.secondlife.com/showpo...481&postcount=6

Cheers
Greg
|
Ziggy Puff
Registered User
Join date: 15 Jul 2005
Posts: 1,143
|
12-13-2005 18:00
What's a 'client'? How are updates triggered?

LSL is single threaded, so only one instance of a script ever runs, which means only one instance of a function executes at a time. So let's say you have a master script that is the keeper of the high scores list. It receives link messages from other prims, parses them, and uses that information to update the master list. In this case, if you're in the middle of processing a link message and you receive another link message, you won't have link_message() triggered again while the old link_message() is still executing. The first link_message() handler has to finish, and then it will be triggered again with the second message that was received. If a listen happens, or a touch_start happens, it's the same thing - they all get queued up, and every event is processed in order, to completion, before the next event is handled.

So... you don't really have a concurrent access problem in LSL. Unless you do something like copy global list A into global list B, and then another function (called from a different event) takes list B and copies it back into list A. If you do something like that, then yes, you'll mess up your data. Keep one global list, use one function to update it, and call that function from wherever you want in your script, and you'll be safe. I think 

Edit: You could run into trouble if each transaction requires multiple messages to be exchanged. In that case, you can be in the middle of processing one transaction and receive a message for another. If that's your situation, ignore everything I said  What I've done in that situation is to set a flag which says "transaction in progress". If the flag is set, and I receive some input that I know isn't part of the transaction I'm working on (different user key, or whatever - there must be some way to know whether a received message is valid or not), I usually drop the message, or return the money or whatever and llWhisper "system is busy, please try again". 
One example of something like this was a betting game/machine I wrote, where the person had to pay to register the bet, and then select the choice he wanted to bet on. That's two operations, so I had to lock the system between the two. So once the person finishes step 1 (pays the money), you set the flag, and wait for the listen/touch/whatever that signals he's finished step 2 (picked the horse he wants to bet on, whatever). Keep track of the key of the person you're currently working with. If someone else pays in the middle, return it. Also run a timer so you can recover if this person pays, never selects anything, and walks away (you don't want to be stuck in "processing, please wait" forever). Hope that gives you some ideas.
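A minimal sketch of that lock-flag pattern, assuming a pay-then-touch flow. The names (gBusy, gCustomer) and the 60-second timeout are illustrative, not from the actual betting machine:

```lsl
// "Transaction in progress" lock: one customer at a time, with a
// recovery timer in case they walk away mid-transaction.
integer gBusy = FALSE;
key     gCustomer = NULL_KEY;

default
{
    money(key id, integer amount)
    {
        if (gBusy && id != gCustomer)
        {
            // Someone else paid mid-transaction: refund and tell them.
            llGiveMoney(id, amount);
            llWhisper(0, "System is busy, please try again.");
            return;
        }
        // Step 1 complete: lock the machine for this customer.
        gBusy = TRUE;
        gCustomer = id;
        llSetTimerEvent(60.0);  // recovery timer
    }

    touch_start(integer n)
    {
        if (gBusy && llDetectedKey(0) == gCustomer)
        {
            // Step 2 complete: record the bet here, then release the lock.
            gBusy = FALSE;
            gCustomer = NULL_KEY;
            llSetTimerEvent(0.0);
        }
    }

    timer()
    {
        // Customer never finished step 2: unlock so others can play.
        gBusy = FALSE;
        gCustomer = NULL_KEY;
        llSetTimerEvent(0.0);
    }
}
```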
|
Greg Hauptmann
Registered User
Join date: 30 Oct 2005
Posts: 283
|
12-13-2005 19:32
thanks - this is exactly what I was wondering about. You don't happen to know how many requests can successfully queue up? And whether there are timeouts? (I'm just being curious, I guess)
Also, when the list is returned to another script, can I assume it is "passed by value"?
Tks
|
Kala Bijoux
Material Squirrel
Join date: 16 Nov 2004
Posts: 112
|
12-13-2005 19:42
I remember some discussion in the Wiki or this forum, but not off the top of my head. What do you mean by a timeout? I *believe* (i.e. I could be wrong) that if the queue is full, new events are silently dropped. Which could be bad if that's a money event and the amount gets debited from the avatar's account but your script never gets notified  I'd hope there are checks around that, so the debit only happens if the money event handler gets called, or something. You could test it - put in a money handler that llSays something, start a 0.1s timer in state_entry, then sit and spin in an infinite loop. Wait a while for timer events to fill up the queue, then pay the object and see what happens 

Edit for your edit: Returned to another script? You can't directly pass a variable between scripts. You'll have to encode it into a string and send it as a link message, then decode it at the receiving side and build the list again. Did you mean passed to a function? Anyway, yes, all function variables are passed by value. Which means you take up huge amounts of space on the stack each time you pass a big list into another function. This is something they don't teach you in college, but you learn when you start working with embedded systems - global variables can be a good thing 

P.S. This is Ziggy, sometimes I forget I'm logged in with my wife's account 
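A sketch of that encode/decode round trip. The channel number (42) and the "|" separator are arbitrary choices for illustration - the separator just has to be something that can never appear in the data:

```lsl
// ---- sender script (sketch) ----
list gHighScores = ["Alice", 10, "Bob", 7];

default
{
    touch_start(integer n)
    {
        // Flatten the list into a string and ship it as a link message.
        llMessageLinked(LINK_SET, 42,
            llDumpList2String(gHighScores, "|"), NULL_KEY);
    }
}

// ---- receiver script (would live in a separate script) ----
// default
// {
//     link_message(integer sender, integer num, string str, key id)
//     {
//         if (num == 42)
//         {
//             // Rebuild the list. Note every element comes back as a
//             // string, so numbers must be re-cast, e.g.
//             // (integer)llList2String(scores, 1)
//             list scores = llParseString2List(str, ["|"], []);
//         }
//     }
// }
```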
|
Greg Hauptmann
Registered User
Join date: 30 Oct 2005
Posts: 283
|
12-13-2005 21:20
actually, see my post here too if you like: /54/96/76992/1.html#post799481 - i.e. this is where I've posted the 2 design UML diagrams
|
Kala Bijoux
Material Squirrel
Join date: 16 Nov 2004
Posts: 112
|
12-13-2005 21:58
I would almost be inclined to move the data-changing operations over into the script that holds the data. That's assuming memory limits won't get in the way. Here's why - with your current model, the data handling isn't an atomic operation. The controller has to send a request out, then wait for an asynchronous response, then work on the data, and then send the update out again. So yes, you do have an issue: request A comes in for a data item, request B comes in for the same data item, request A updates the data item, request B updates the data item. You've now potentially lost whatever request A did. That depends on the update command - is it a relative operation like increment/decrement, or does the controller do the data change and send out the new value? If it's the latter, then you'll lose A's update.

If, on the other hand, you could change the protocol so the controller says "here's the data item I want changed, here's the new information, this is what needs to be done"... then the data owner script can do the whole thing in one execution thread, and return the new results. This lets the update be an atomic operation, so your data integrity is ensured - each new operation only starts when the previous operation has completed. For instance, let's say it was the person's score or kill count or something, and both operations tried to increment it. If the score at the beginning was 5, both A and B would receive 5, they'd increment it to 6, A would say "save 6", B would say "save 6", and the score would end up at 6 when it should be 7. But if A and B both said "increment this person's count and give me his current count", A would get 6 back as a result, and B would get 7. Which is probably what you want.

If you wanted to do it your way, you could implement some kind of 'locking' on the data owner script. You'll need to use some kind of key to identify the transaction. So, an example of how the data script might work...

* Script receives request A for a data item. It sends out the current value, marks the item as being in use, and saves the key associated with this transaction.
* Script receives request B for the same data item. The transaction key tells it this is a new transaction. It sees that the item is locked, so it adds request B to a list of pending requests for this item. The script at the other end sits and waits for the link message response.
* Script receives the update from transaction A and updates the data item. It unlocks the data item, sees that there's a pending request for it, and sends out the response. Since it waited, it is able to send out the new value for the item.

Can be done, but as you can see, it becomes quite a bit more complex, and you also have to handle lost messages, so you don't end up with an item locked forever.

It kinda depends on what the data is, how it can be modified, and so on. For instance... Do you have multiple readers and a single writer, or are there multiple writers? Is it required to have 'time correctness' - i.e. if a read request is received after a write request, do you need the read to receive the updated data, or is it OK if the read returns slightly old data? If there are multiple readers, do they need to always be in sync about what they're displaying? And so on. Hope that makes some sense  My programming background isn't with databases, so I'm sure there's a lot I haven't considered, and I'm also sure there are proven ways to properly solve these problems.
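A sketch of the "send the operation, not the value" protocol described above: the data script applies the change itself, so the read-modify-write is atomic within one event. The message format ("INC|name|amount") and variable names are made up for illustration:

```lsl
// Data owner script: holds the scores and performs updates itself.
list gScores;  // strided list: [name, score, name, score, ...]

default
{
    link_message(integer sender, integer num, string str, key id)
    {
        list parts = llParseString2List(str, ["|"], []);
        if (llList2String(parts, 0) == "INC")
        {
            string  name  = llList2String(parts, 1);
            integer delta = (integer)llList2String(parts, 2);

            integer i = llListFindList(gScores, [name]);
            if (i == -1)
            {
                // First score for this name.
                gScores += [name, delta];
                i = llGetListLength(gScores) - 2;
            }
            else
            {
                // Increment in place - no other event can interleave here.
                integer newScore = llList2Integer(gScores, i + 1) + delta;
                gScores = llListReplaceList(gScores, [newScore], i + 1, i + 1);
            }

            // Reply with the post-update value, all in one event.
            llMessageLinked(sender, num,
                name + "|" + (string)llList2Integer(gScores, i + 1), id);
        }
    }
}
```

With this shape, two "INC" requests for the same name always serialize: the second one sees the result of the first, so the increment-lost-update problem described above can't occur.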
|
Greg Hauptmann
Registered User
Join date: 30 Oct 2005
Posts: 283
|
12-14-2005 01:38
some great points. I guess the main design concept I was trying to put in place is separating the business rules from the data. This way I can keep making updates/improvements to the business rules without having to reset the script. (I'm assuming here that if you change a script you may have to reset it and lose its data - I haven't double checked on this, I admit.)

Assuming the business rules may depend on state was the reason for the call to get the existing state, e.g. what was the user up to in terms of his state, what did he just accomplish - then, based on the business rules, determine what to do next, then save the data back to the data script. If the running of the business rules lies in a script/function which can service only one call at a time (the rest blocked), which was what I heard in another post, hopefully there will not be scope for data consistency issues (?). Does this make sense? See where I'm coming from re the ability to keep high scores across time even when rules are being changed etc.? Comments? tks
|
Argent Stonecutter
Emergency Mustelid
Join date: 20 Sep 2005
Posts: 20,263
|
12-14-2005 05:38
Keep the data in another script, but ONLY have the business rules script accessing it, and have all transactions serialised (this is transaction #4635), atomic (sell this_product for L$1), and verified (transaction #4635 complete).
That way, if you don't get the verification, so you don't know whether the transaction went through or not, you resend it with the SAME serial number. If the server (data server or business server) gets a repeat transaction, it doesn't perform the transaction again... it just responds with the same status (transaction #4635 was completed).
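A minimal sketch of that idempotent-transaction idea on the data server side. The "serial|command" message format and list layout are my own illustration; a real script would also need to cap the completed-transaction list so it doesn't grow forever:

```lsl
// Data server: remembers completed serial numbers and their replies,
// so a resent transaction is acknowledged without being applied twice.
list gDone;  // strided list: [serial, cachedReply, ...]

default
{
    link_message(integer sender, integer num, string str, key id)
    {
        // Assumed message format: "serial|command|args"
        list parts = llParseString2List(str, ["|"], []);
        string serial = llList2String(parts, 0);

        integer i = llListFindList(gDone, [serial]);
        if (i != -1)
        {
            // Repeat of a completed transaction: just resend the old
            // reply, don't perform the transaction again.
            llMessageLinked(sender, num, llList2String(gDone, i + 1), id);
            return;
        }

        // ... perform the transaction here ...
        string reply = serial + "|OK";

        gDone += [serial, reply];
        llMessageLinked(sender, num, reply, id);
    }
}
```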
|
Ziggy Puff
Registered User
Join date: 15 Jul 2005
Posts: 1,143
|
12-14-2005 08:19
From: someone
I guess the main design concept I was trying to put in place is separating the business rules from the data. This way I can keep making updates/improvements to the business rules without having to reset the script. I'm assuming here if you change the script you may have to reset it/lose data (I haven't double checked on this however I admit)

Yes, you will, and that's good design - separating data from control - which is why I was a little hesitant about suggesting that you mix the two. If that is one of your requirements, then you're on the right track. Just make sure you implement serialization somewhere, either in the data script or the business rules script, so you don't start processing a request until you've completed processing the previous one.

Or come up with a way for the rules script to tell the data script what to do, which is generic enough that the rules can change but the set of operations required won't. For instance, if you're dealing with just numbers, you could come up with a link message protocol that lets you add/subtract/multiply/divide a saved data item by an arbitrary new value. If that is sufficient, then the rules script can change in that it takes a different decision now, but if all possible decisions can be brought down to a single add/subtract/multiply/divide operation, then your interface won't need to change, and you can still have the data script make the actual update - only the command about what update to make changes as the rules change. This is risky though, because if something comes up down the line that you hadn't thought about, you're out of luck. Which isn't very good design 
|
Greg Hauptmann
Registered User
Join date: 30 Oct 2005
Posts: 283
|
12-14-2005 13:14
Thanks guys - one thing I'm a little confused about, re both your suggestions to implement serialisation checks, is why this would be necessary if LSL is single threaded. i.e.

From: Ziggy Puff
LSL is single threaded, so only one instance of a script ever runs, which means only one instance of a function executes at a time.

If the controller script logic (which calls out to the data script multiple times, say) can be sure its function has to complete before another user's will start, then is there any possibility for contention at all? That is, if the main rules functionality can just make sure it tries to manually roll back anything which occurred during a failed transaction, wouldn't this be enough?

Actually, hang on - I just realised that I'm talking about this stuff without trying it, perhaps a bit too much. The fact is, for the data script to tell the main controller function it succeeded or failed, it can't do this without sending a separate message, which the controller script would pick up somewhere else (i.e. a listener) other than within the controller function. But then again, if one always implemented a function that returned something, this could be used to indicate success/failure. For a getData type function (which normally had to return data) you could use the convention that if the result was -9999, say, then it failed. Comments? Tks
|
Ziggy Puff
Registered User
Join date: 15 Jul 2005
Posts: 1,143
|
12-14-2005 13:51
From: someone
If the controller script logic (which calls out to the data script multiple times say) can be sure its function has to complete before another users will start, then given this is there any possibility for contention at all?

Yes, because the complete data transaction can't be handled in a single function. You have a listener that picks up the message from the remote sensor object, does some processing on it, and sends a link message to the data script. You then have a link message handler that has to wait for the response to come back. This response is asynchronous; you cannot invoke the link message handler from within the listen handler. So the listen handler has to return to the system, so that the system can then call the link message handler with the reply that came back from the data script. This return in the middle is what causes your problem - it breaks the atomicity of the operation; now it's split into two asynchronous operations. From the point of view of the data handler script, everything is synchronous, but from the point of view of the controller script, it's not. Since the controller has to wait for a response from the data script, there is a chance that a new request can come in while it's waiting for a response to the old request. Now your entire data operation isn't atomic any more. The data script still is, because it gets 2 requests and sends back 2 responses. But the controller script gets the second request while it's still processing the first request, so at that level the operations aren't serialized.

Basically, in your interactions diagram, there's a gap between "get user's status (LINK)" and "return status (LINK)" where the controller script actually has to give up control and wait for a new event to occur. The event you're waiting for is the link message response from the data script. Instead, let's say you get a new "detected event (CHAT)". Try drawing that flow out, map out the interaction between both scripts, and see what happens. The controller script sees something like this:

* Atomic operation 1: Receives "detected event (CHAT)" A. Sends out "get user's status (LINK)" A. Operation finishes, goes back to waiting for new events.
* Atomic operation 2: Receives "detected event (CHAT)" B. Sends out "get user's status (LINK)" B. Operation finishes, goes back to waiting for new events.
* Atomic operation 3: Receives "return status (LINK)" A. Runs rules/logic. Sends out "update status (LINK)" A. Sends out "update display (CHAT)" A. Operation finishes, goes back to waiting for new events.
* Atomic operation 4: Receives "return status (LINK)" B. Runs rules/logic. Sends out "update status (LINK)" B. Sends out "update display (CHAT)" B. Operation finishes, goes back to waiting for new events.

I'm repeating my old example, but let's say the data item starts out with a value of 5, and the rules/logic require it to be incremented. The problem is, operations 3 and 4 will both receive a value of 5, and they'll both send back an updated value of 6. What you want is for 3 to get 5 and send back 6, then 4 to get 6 and send back 7. If you don't force serialization in some way, that won't happen.
|
Greg Hauptmann
Registered User
Join date: 30 Oct 2005
Posts: 283
|
12-14-2005 15:55
Doh! Thanks Ziggy. You're right of course. I got confused between intra-script function calls, inter-script calls (via LINK) and inter-script calls (via CHAT). I'll blame it on not enough hands-on script work  I'll have a crack at updating my diagram and re-posting, but first, any comments on the below?

Approaches - I'm thinking there are (for me anyway) 2 synchronisation approach categories that I would use:
- Per User Synch - e.g. for updates to user-specific data
- Global - e.g. updates to a high score board

Mechanism - assuming that the order/sequence of events in the controller is significant, this would imply I need to QUEUE events I guess, then play them back in order through the controller as synchronisation allows  . I'm guessing the only way to do this is via another LIST within the controller, which I guess increases memory usage, and perhaps a timer to provide a means to keep checking for new items in the queue (or perhaps just checking the queue after every transaction completes would be better). What are your thoughts here, Ziggy (and others), re the best mechanism to implement queueing in LSL? Tks

This might sound weird, but I think I'm actually enjoying all these LSL constraints 
|
Ziggy Puff
Registered User
Join date: 15 Jul 2005
Posts: 1,143
|
12-14-2005 16:13
From: someone
intra-script function calls, inter-script calls (via LINK) and inter-script calls (via CHAT).

That's exactly it. Function calls are synchronous, chat and link messages are asynchronous. Welcome to distributed event-driven programming 

From: someone
This might sound weird, but I think I'm actually enjoying all these LSL constraints

Right. It gives you all the constraints of embedded programming, with a fraction of the tools available on any real embedded system.  Actually, I know what you're saying, it's definitely a challenge. Feels too much like my real job sometimes though 

Regarding queueing, you have the right idea. You need to be able to detect when there's a conflicting request. You need to be able to save that request. When you're done processing a request, you need to be able to check for any saved requests for that same user. If possible, you should try and organize your data in a way that these check/save/retrieve operations don't end up being searches through long lists. And if you want to go a step further, try and allow for lag/overload - assume that messages will get lost, so a request to the data script may not make it, or a response may not make it back. Your controller should have "I've waited too long on this operation, I need to bail" logic, or one user could end up stuck, with all requests piling up in the queue because the 'current' transaction is dead and you're never getting a response back, but you're still waiting for it. Lost messages are especially bad if it's the update message that gets dropped, but there isn't much you can do about that, since LSL doesn't give you any control over the event queue of a script.
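A rough sketch of that queue-plus-bail-out idea in the controller. The strided-list layout, channel numbers, and 30-second timeout are all my own illustration:

```lsl
// Controller-side queue with a watchdog timer. One transaction runs
// at a time; conflicting requests are parked in gQueue, and a dead
// transaction is abandoned after TIMEOUT so the queue can't get stuck.
float   TIMEOUT = 30.0;
integer gBusy = FALSE;
list    gQueue;        // strided list: [userKey, payload, ...]

startTransaction(key user, string payload)
{
    gBusy = TRUE;
    llMessageLinked(LINK_SET, 1, payload, user);  // ask the data script
    llSetTimerEvent(TIMEOUT);                     // start the watchdog
}

nextTransaction()
{
    gBusy = FALSE;
    llSetTimerEvent(0.0);
    if (llGetListLength(gQueue) >= 2)
    {
        // Drain the next parked request, in arrival order.
        key    user    = llList2Key(gQueue, 0);
        string payload = llList2String(gQueue, 1);
        gQueue = llDeleteSubList(gQueue, 0, 1);
        startTransaction(user, payload);
    }
}

default
{
    state_entry()
    {
        llListen(0, "", NULL_KEY, "");
    }

    listen(integer chan, string name, key id, string msg)
    {
        if (gBusy) gQueue += [id, msg];   // park the conflicting request
        else startTransaction(id, msg);
    }

    link_message(integer sender, integer num, string str, key id)
    {
        // Response from the data script: finish up, then check the queue.
        nextTransaction();
    }

    timer()
    {
        // Waited too long - assume the message was lost and bail,
        // so queued requests can still be serviced.
        nextTransaction();
    }
}
```

Checking the queue when each transaction completes (rather than polling on a timer) keeps the timer free for the timeout role, which matches the "check the queue after every transaction" option from the post above.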
|
Greg Hauptmann
Registered User
Join date: 30 Oct 2005
Posts: 283
|
12-14-2005 20:25
how's this look? I've made a couple of assumptions to minimise the number of atomic transactions in the controller - an acceptable trade-off perhaps? UML diagram (v2) attached
|
Argent Stonecutter
Emergency Mustelid
Join date: 20 Sep 2005
Posts: 20,263
|
12-14-2005 20:25
From: Greg Hauptmann Thanks guys - One thing I'm a little confused about re both your suggestions to implement serialisation checks is why this would be necessary if LSL is single threaded. Serialization lets you replay a transaction if the response from the data server is lost. If the message TO the data server was lost, the serial number will be new, the transaction will occur. If the message FROM the data server is lost, the serial number will be an old one, and it'll just acknowledge that it's already seen and performed that one.
|
Greg Hauptmann
Registered User
Join date: 30 Oct 2005
Posts: 283
|
12-14-2005 23:16
From: Argent Stonecutter
Serialization lets you replay a transaction if the response from the data server is lost. If the message TO the data server was lost, the serial number will be new, the transaction will occur. If the message FROM the data server is lost, the serial number will be an old one, and it'll just acknowledge that it's already seen and performed that one.

Argent - I guess you're suggesting, based on this, that I should make (i.e. in my sequence) the DATA and DISPLAY scripts transaction-number aware? I guess my design was based on the assumptions that (a) there could be multiple sensor events going off, so the controller script has to cater for this, but (b) I could assume that the communications to the DATA and DISPLAY scripts (via CHAT) were reliable. Can I assume this in LSL/Second Life? Or is message loss a rare occurrence? Ziggy - any comments from you too on this one?
|
Ziggy Puff
Registered User
Join date: 15 Jul 2005
Posts: 1,143
|
12-15-2005 08:14
Your diagram looks fine to me. Argent's suggestions are good too. A lot depends on how secure you want this to be. I've seen link messages get significantly delayed, to the point where different prims processed the messages at very different times, enough to make my movement code look visually wrong. I don't think I've lost any messages, but I haven't done anything that's performing updates very quickly.
Look up sliding window protocols. You could use a window size of 1 or 2, that lets the sender and receiver hold on to old requests/responses until they know that the other side has received it. This lets you re-transmit a message if it gets lost. You could combine that with Argent's transaction number idea to keep track of requests/responses. That's more code and more memory, but it gets you better guarantees against data loss. Honestly, I have no idea how much of a problem data loss will be, and whether this is way overkill or not.
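A rough stop-and-wait sketch (effectively a sliding window of size 1), combining the retransmit idea with the transaction numbers: hold on to the last message until it's acknowledged, and resend it on timeout. The channel number, "ACK|serial" format, and 5-second retransmit interval are illustrative:

```lsl
// Reliable-ish sender: keeps the last unacknowledged message and
// retransmits it until an ACK with the matching serial arrives.
string  gPending = "";   // last unacknowledged message, "" if none
integer gSerial  = 0;

sendReliable(string payload)
{
    gSerial++;
    gPending = (string)gSerial + "|" + payload;
    llMessageLinked(LINK_SET, 7, gPending, NULL_KEY);
    llSetTimerEvent(5.0);   // retransmit timer
}

default
{
    link_message(integer sender, integer num, string str, key id)
    {
        // An ack of the assumed form "ACK|serial" clears the pending slot.
        list parts = llParseString2List(str, ["|"], []);
        if (llList2String(parts, 0) == "ACK" &&
            (integer)llList2String(parts, 1) == gSerial)
        {
            gPending = "";
            llSetTimerEvent(0.0);
        }
    }

    timer()
    {
        // No ack yet: resend the same message with the SAME serial,
        // so the receiver can detect and ignore duplicates.
        if (gPending != "")
            llMessageLinked(LINK_SET, 7, gPending, NULL_KEY);
    }
}
```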
|
Argent Stonecutter
Emergency Mustelid
Join date: 20 Sep 2005
Posts: 20,263
|
12-15-2005 12:39
From: Greg Hauptmann I could assume that the communications to the DATA and DISPLAY scripts (via CHAT) were reliable. Can I assume this in LSL/2nd life? No. The chat and link message queues are limited, and can fill up, and when that happens messages will be lost. If you have to have reliable messaging you have to implement it yourself the same way TCP is implemented on top of IP.
|
Greg Hauptmann
Registered User
Join date: 30 Oct 2005
Posts: 283
|
12-15-2005 12:41
Thanks Ziggy/Argent - probably best for me to jump in now and try this stuff out. Ziggy - great pointer re "sliding windows" - got some relevant articles on the first Google hit. Re memory, good point - I could implement all these things and then not have any memory left to do anything with after the plumbing is in place 
|