Welcome to the Second Life Forums Archive

The Lag Monster Myths

Wayfinder Wishbringer
Elf Clan / ElvenMyst
Join date: 28 Oct 2004
Posts: 1,483
05-12-2005 11:53
About a month and a half ago we purchased a private island sim and began building ElvenGlen. For the first month or so, everything worked fine. We were to have our Grand Opening May 1. The sim was running well, all signs were go. Our sim FPS was averaging 600+ and everything was very smooth.

Near the end of April (I believe the date was April 27, 3pm, but not absolutely positive at this point), we suddenly noticed incredibly heavy lag, the first sign of such since we entered the sim. FPS figures had dropped through the floor (down to 25). With only 3 avatars on the sim, we could hardly move. We immediately called Linden Labs and asked for help. Some measurements were checked and we were told that we had "too many active scripts" and that we needed to reduce these.

This didn't make sense for the simple reason that we had the SAME number of scripts the days prior to then and had experienced no discernable lag at all. Further, the great majority of those "active scripts" were in fact not constantly active... existing in items such as sitting scripts and tree color-change scripts... scripts that only demanded system resources during the brief moments that they were triggered. Something just didn't figure.

We were told that whether a script is constantly active or not, its very existence and the required resources to check it demanded server time. We could buy that. The question was: how much resource did the basic existence of a script require of the server?

We began paying heavy attention to and tracing sim activities and figures. Our RunTasks was at 7.5ms-- a figure we have been told by Linden Labs is very high. As an act of cooperation we cut scripts to the bone and dropped from 754 active scripts to 580. Admittedly, lag did reduce tremendously and RunTasks dropped to 1.0!

The question was: was it because we reduced scripts, or did the lag simply vanish as mysteriously as it had arrived? We were especially curious because now with 580 active scripts, we were only running an average of 350 fps rather than the prior 600+ fps we had been running with 754 active scripts prior to the initial lag.

That evening we cut scripts further, down to 565. The next day we measured the sim with only 2 avatars on the sim, neither one running internal scripting. Not to our surprise lag had INCREASED, dropping to an average of 195 fps. RunTasks jumped from 1.0 to 3.0. Something was definitely not correlating to the number of active scripts being the primary cause of lag.
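One way to put a number on "not correlating" is a correlation coefficient. The following is only a minimal sketch: it uses the four script-count and sim-FPS readings quoted above ("600+" is entered as 600, which is an assumption), and four points are far too few to prove anything either way. The function name `pearson` is just an illustrative choice.

```python
from math import sqrt

# (active scripts, sim FPS) readings quoted in the post; "600+" entered as 600.
observations = [
    (754, 600),  # before the lag appeared
    (754, 25),   # during the first lag episode
    (580, 350),  # after the first script cut
    (565, 195),  # after the second script cut
]

def pearson(pairs):
    """Return the Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)

r = pearson(observations)
print(f"correlation between script count and sim FPS: {r:.2f}")
```

A value near -1 would support "more scripts, lower FPS"; a value near 0 would mean these readings show no clear linear relationship.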

We ran several series of tests involving other sims. One sim was nearly bare, containing very few items at all. We rezzed 200+ chairs, all containing active sit scripts. We rezzed more than a dozen trees, all containing color change scripts. We rezzed 300+ simple boxes, all with "waterfall" texture movement scripts. The results? NO MEASURABLE LAG. Neither our personal experiences nor sim figures showed any discernable shift in sim operation or lag.

We then visited sims round about, checking their ActiveScript figures along with their RunTasks and our own perceptions of lag. We visited one sim that was running *1,500* active scripts with a RunTasks figure of 9.5... and there was NO DISCERNABLE LAG, even with 8 avatars on the sim! So much for the active scripts theory!

Consider: we were told that "500 active scripts is considered to be excessive" and that 350 active scripts is the recommended amount. I'm sure that's a nice dream, but let's talk about reality here. 500 active scripts on a sim is 7 scripts PER 1024m (7.8125 to be exact, but one cannot use .8125 of a script, eh?). Are we being told that a person who owns a 1024m piece of land can only use 7 scripts? That can be taken up in a 2-seat couch, a fireplace and a door! Such figures are unrealistic in the reality of game play. If Second Life cannot support more than 500 scripts per sim, then it is not technically advanced enough to meet the needs of its users. Our studies have proven to our satisfaction that the number of active scripts is NOT the primary cause of end-user lag-- and that the "500 active scripts" claim is not founded in statistical fact.
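The per-parcel arithmetic above can be checked with a short sketch. It assumes a sim is 256 m x 256 m (65,536 square metres), so it holds 64 parcels of 1024 square metres; the function name is hypothetical.

```python
REGION_SIDE_M = 256  # assumption: one sim is 256 m x 256 m
REGION_AREA_M2 = REGION_SIDE_M * REGION_SIDE_M  # 65,536 square metres

def scripts_per_parcel(scripts_per_region, parcel_area_m2):
    """Spread a region-wide script budget evenly over parcels of a given area."""
    parcels_in_region = REGION_AREA_M2 / parcel_area_m2
    return scripts_per_region / parcels_in_region

print(scripts_per_parcel(500, 1024))  # 7.8125, the "excessive" figure per 1024 m parcel
print(scripts_per_parcel(350, 1024))  # the "recommended" figure per 1024 m parcel
```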

After all this experimenting, we have come to several conclusions and discovered several things:

* "Lag" is discernable only by one method: the resultant experience of the end user. No matter what the figures say, if the majority of users are experiencing significant lag... there is lag. Even if the "stats" indicate there should be lag... if the end users experience none-- there is none. "Lag" is an end-user game-playability factor.

* Lag cannot be measured by the experiences of one single user, as client-side situations can influence lag (type of computer, graphics card, cleanness of line, etc). It has to be recognized by a poll of several users standing in the same area-- or at several places on the sim and communicating with one another-- both of which tests we ran extensively.

* Lag cannot be measured by any of the figures provided to us in the stats box. Neither active scripts, Sim FPS, RunTasks ms, nor any other figure has consistently or even reasonably correlated to system lag.

* Lag is not caused by "too many active scripts". That is a wide-spread myth that takes the heat off Linden Labs and passes it on to the end users-- where it does not belong. Using reasonable scripts is part of our every-day game play and despite all the claims, 500+ scripts is NOT excessive, either in concept or in analysis of statistical data.

* There is little or no correlation between end user lag and the prominent reasons presented to users by Linden Labs. We have the data to prove it and such data has been sent in to Linden Labs. As of this date, no one has replied to our submission.

* There were SOME scripted items that were found to cause extreme and excessive lag, measurably so. These items are well known among experienced SL users:

- Ultra-primmed hair and other avatar devices (such as wings), especially those containing tortured torus shapes.

- The ever popular "AO" devices (that provide avatars with specific movement characteristics). We discovered even one of these could drop local activity by a significantly noticeable level, and several such avatars could drag a sim to a standstill.

- Some system-heavy scripts with heavy special effects could be a significant cause of lag. We found one particular particle generator by Ama Omega that if triggered, would lag an entire sim to absolute standstill. (Fortunately, no one seems to use these-- for obvious reasons). So yes, individual scripts can lag like a fiend. This indicates a need for greater Server-Side control of certain scripting features.

In these studies, we came up with several observations and questions:

* The #1 *discernable* cause of measurable lag-- is avatars themselves. Avatars are by far the most complex, system-resource consuming item on Second Life, hands down.

* Considering this, what good does it do to limit a 512m piece of land to 117 prims... but allow avatars to walk around with 400 prim jewelry and heavy AO scripting? Is it possible that SL has allowed avatars to become TOO complex, to the point that it damages the playability of the game?

* Second Life is in serious need of a "resource measurement" device that can be applied to individual items to determine the degree of system resources required by that particular item. This would help builders during the initial build to streamline their items to use minimal system resources.

* Second Life also needs a device to point out avatars that are using excessive system resources (we used to have those, but now with the advent of version 1.6, any item attached to an avatar cannot be examined for excessive prim or script usage).

* This aside, AVATARS ARE NOT *THE* CAUSE OF LAG ON SECOND LIFE. Note that I mentioned they were the #1 *discernable* cause of lag. It is obvious there is something server-side that is the core-cause of lag problems on Second Life. We know this because lag does not always correlate to the number of avatars on a sim. (We remember that at the first sign of serious lag on ElvenGlen-- there were only 3 avatars on the sim... and that lag continued for hours).

* We must wonder if Linden Labs is placing too many servers on access lines, crowding the bandwidth. That is the only logical conclusion we could arrive at considering the lack of any other apparent statistically-verifiable cause. While bandwidth-crowding is a common practice with website hosts-- those hosts don't charge people the kind of fees we are paying Second Life. There is a lot of difference between a $5 a month hosting fee and $200.00 a month land fees. If SL is crowding bandwidth... at the prices we are paying, such would certainly be a great disservice to the end user. Excessive profit generation at the detriment of the customer is how businesses go out of business.

* We have to wonder why Linden Labs is not more forthcoming with factual information and why we continue to be told things both we and Linden Labs know are not true (such as the "too many active scripts" claim). There are some of us out here who are actually acquainted with computer concepts and we resent it when we are told things that data has already proven to be untrue.

I present this data not to cause problems for Linden Labs (surely their programmers have enough hassles; I do not envy them) but rather to let the general community know our findings and to state to Linden Labs-- if you'll pardon me being blunt-- that we are not total idiots. We are capable of measuring what is going on and drawing rational conclusions. So level with us. We can handle bad news much better than bogus news. ;)

Last Monday on our sim, lag was so bad (with only 15 avatars in the sim and all avatar scripting and heavy prims removed)... that we had to cancel the event we were holding. Prims were actually turning randomly phantom and people were sinking through the arena floors! We were told by members attending the event that they often experience such problems throughout SL. How can we conduct our activities under such circumstances?

When that kind of thing starts happening, the game becomes unplayable and people start leaving. End users can either be enthusiastic supporters of Second Life, or ex-end-users, depending on how we're treated. So my encouragement to LindenLabs: work more with end users. Treat us with greater respect. Let us in on what's going on so we can give you the feedback you need to improve Second Life.
Waves Lightcloud
SexBall Safety Designer
Join date: 22 May 2004
Posts: 193
05-12-2005 12:59
LL you need this guy, he will help keep your azz outa hot water with us. Give your staff the needed break of what’s really going on, instead of the techno we-have-a-problem-Houston crap. Sounds like he can direct fire down range accurately so your code gunners can fire with effect instead of putting them in front of the end user firing squad trying to be PR guys when they are coders / programmers. We know you have lost control of this monster you built. Get someone like this in your works fresh from the battlefields to direct fire missions to the problem areas.

I normally get a fee for this but I’m willing to waive it, cuz I like you guyz.
Laukosargas Svarog
Angel ?
Join date: 18 Aug 2004
Posts: 1,304
05-12-2005 13:49
Well done Wayfinder. Your experience bears out suspicions I've had for a long while.
Especially about bandwidth. Many times I've been lag free in a sim with 600+ scripts and totally lagged out in a sim with only 100 or less. Way too much rubbish is getting bandied around about lag atm. I too hope LL will come clean.
Osprey Therian
I want capslocklock
Join date: 6 Jul 2004
Posts: 5,049
05-12-2005 15:14
That was a very good post, Way.
Christof Reitveld
Registered User
Join date: 27 Mar 2005
Posts: 51
05-12-2005 15:25
From what I heard, they have different computers running SIMs; they are not all the same. If you have a slower computer running your SIM, it lags more easily. I know they swap the SIMputers around sometimes; maybe they just gave you the slow one.
_____________________
Christof Reitveld
Owner of SecondPublishing
SecondPublishing

Have a Short Story or Novel you wrote that you wish to sell? Contact me.
Lee Linden
llBuildMonkey();
Join date: 31 Dec 1969
Posts: 743
05-12-2005 16:44
I'm entirely for bringing up a good serious discussion of the causes of performance issues in Second Life, and I'll do what I can to contribute everything I know here. But there are a lot of mistaken and flawed assumptions in the opening to this thread, and I really do want to set them straight.

The first, and most important of all, is that there is no single technical cause of lag. That's because the word "lag" has no discernable definition. (This is basically due to a lack of definition among residents; BELIEVE me, I'd nail one down if I could!)

Even in Wayfinder's post, he writes of a variety of totally different symptoms, with wildly different causes, and describes them all as lag. More importantly, several problems are described as "lag" with no mention of actual problems at all. It seems that "lag" has only one definition (a purely social one): that SOMETHING (among absolutely everything between the server's hard drive and the user's screen) is not performing as it should.

That's a lot to troubleshoot. It's a lot like telling a doctor "it hurts", and when asked for more information, saying "in my body."

Here's a list of just SOME of the things labelled "LAG", but described in a helpful way:
====
* The video is jerky or stuttering (low user PC framerate).
* The video pauses for a second or more (slow user PC performance)
* There's times where nothing moves including me (slow user PC performance)
* My characters don't appear at the speed I type them (very low user PC framerate, but reported by someone who doesn't actually notice low framerate)
* My character doesn't start walking for several seconds after I push forward (latency between user PC and server)
* My character won't STOP walking for several seconds after I stop pushing forward (really bad latency between user PC and server)
* After I finish my chat and press Enter, it doesn't appear for a long time (possibly latency between user PC and server, possibly server load)
* When I build and create things, they don't appear (possibly latency between user PC and server, possibly server load)
* My avatar starts and stops pretty quickly, but my actual movement speed is slow (probably server load, likely due to physics)
* My avatar walks, but quickly springs back and forth between points I was walking along (possibly data loss between user PC and server, possibly server issue)
* I can't walk, I can only rotate in place (actual partial disconnection from Second Life)
* I don't have a L$ balance (partial disconnection)
* I don't have my inventory (partial disconnection)
* I can't see the land or other people (partial disconnection)
* I can't send or receive IMs (partial disconnection)
* I got stuck and can't teleport (possibly partial disconnection, particularly depending on how you're "stuck")
* My teleports time out (possible server database high-load issue)
* Second Life quits (possibly crash, possibly interference from another program, possibly server crash)

...and so on.
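As a rough illustration of how specific descriptions narrow things down, the list above can be treated as a lookup table. This is only a sketch: the wording of the keys is paraphrased, the area labels are shorthand for the parenthetical notes, and the `triage` function is hypothetical, not any actual support tool.

```python
# Probable problem area for some of the symptoms listed above (paraphrased, lowercase keys).
SYMPTOM_AREAS = {
    "video is jerky or stuttering": "user PC framerate",
    "video pauses for a second or more": "user PC performance",
    "avatar is slow to start walking": "latency between user PC and server",
    "avatar will not stop walking": "latency between user PC and server",
    "movement speed is slow": "server load, likely physics",
    "can only rotate in place": "partial disconnection",
    "no l$ balance": "partial disconnection",
    "no inventory": "partial disconnection",
    "teleports time out": "server database load",
}

def triage(symptom):
    """Map a described symptom to a probable problem area, if it is specific enough."""
    key = symptom.strip().lower()
    return SYMPTOM_AREAS.get(key, "unknown: describe exactly what is misbehaving")

print(triage("No inventory"))
print(triage("it lags"))
```

A report of just "it lags" falls through to the default, which is the point being made: the word alone does not select a row.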

Even with the typical cause listed in parentheses, we haven't really identified an issue we can fix, as each of those problems has multiple causes. For example:

Causes of low user PC performance:
-----
Too many objects/avatars in view
Low-end video card
Incompatible video card
Old video drivers
Insufficient memory
Other programs running
Too-high Second Life Preferences
Improper bandwidth settings
Wireless network interference (even if the PC isn't wireless but something else is)
Bad internet connection response times/speeds

Causes of latency between user PC and server:
-----
Interfering programs on user PC
Wireless networking
Internet connections that aren't true broadband (Satellite for example)
High usage at ISP's central office (especially on cable)
Lost data at user ISP
Lost data at Linden ISP (possible but I haven't really seen it)
Lost data at any of the thousands of router machines between user's ISP and Linden's (too frequent)

Causes of bad server performance:
-----
Too many avatars
Too many scripted avatar attachments
Intensive scripted attachments (i.e. poorly written animation overrides)
Too many intensive scripts
Too many scripts overall
Too much physics work (possibly avs, or flying scripted birds, or sit/stand interpenetrations, etc.)
Too many child agents (avatars in another region whose view includes this region)
Any user on the region requiring frequent retransmission of data (bad modem user!)
Multiple a-lot-but-not-too-manys (i.e., 15 avatars playing a complex game is worse than 15 avatars not playing that game, or the game rezzed inworld but not being played by 15 avatars)

These causes typically have to be broken down even further in order to be effectively "solved". What's the "expected" framerate for John's mall, or Mary's club? How do you troubleshoot a user who refuses to remove the firewall that's frequently been the source of the exact problem they're having? Will removing the top five scripts in the region dramatically help, or is it just the fact that the particular combination of 600 scripts they're using adds up to a lot for the server to work on?

Hopefully, this sheds some light on the length and breadth of scenarios that run through my head when someone says "it lags, please fix it" (give or take the "please"). ;^)

-----

So, the key things in determining lag for any individual server are:
-----
1) The word "lag" isn't descriptive enough to troubleshoot; unfortunately, neither is "it's slow". The more descriptive of the actual thing that isn't behaving as you want, and exactly how it IS behaving, the better.

2) Speaking for myself, I do try to differentiate between client issues vs. server issues. Please don't take this as a sign that I doubt you, or that your computer is somehow inferior. I need to make that distinction so I can make suggestions to improve whatever's not working like it should.

3) Most of the tools and statistics I look at are available to anyone with the Ctrl-Shift-1 hotkey. The Statistics Guide at secondlife.com/help shows you exactly what I'm looking for (as I wrote that guide).

4) The single greatest thing that affects server performance is content. I can't stress this enough. The SPECIFIC content on your sim (including the attachments of your visitors) determines how well it runs. Server differences are more or less statistically insignificant given the exact same content. The single greatest way to improve server performance is to control or reduce content.

5) SimFPS means nothing. I'd wager that if you couldn't read SimFPS, you wouldn't be able to guess it at all. We may actually remove this statistic in the future, because it has no correlation to performance. I don't read the SimFPS value, mostly because it doesn't give me any information I can troubleshoot with. I can't really help look at a "low SimFPS" unless it's under 50, because a low SimFPS alone does NOT mean performance is reduced.

6) All the OTHER statistics in that window, when read as a group, do provide statistical evidence of server issues. Again, this is all covered in the Statistics Guide. Slow movement is usually captured in Agent Updates/sec or Physics FPS; slow chat shows up in Receive Msg, Retransmit, or Ping User/Server; slow-to-respond scripts show up in Run Tasks, etc. Nearly every problem that is actually caused by slow server performance will cause a significant change in some statistic.

7) Other regions can't be used to benchmark your region. This is because the content's totally different, and again, the SPECIFIC content on your sim, taken as a whole, is largely what determines your performance. If your region has 500 scripts taking 0.2ms each to run, it's not a valid comparison to look at another region with 1500 scripts each taking 0.01ms to run. The attachments on the 15 people in another sim may be less intensive than the attachments on the 8 people in your sim. The birds in your sim may be more physics-intensive than the fish in someone else's.

8) You really do have to look at the whole picture to make a reasonable conclusion about what's hindering your performance. It's not fair to blame a Ping Sim of 500 when your Basic (aka clientside) FPS is 2.0 (which directly results in the high ping). It's not fair to point out that you only have 200 scripts when your top load is actually in Run Agents (avatars and their attachments).

9) And, most of all, I can only help so much. If we've identified something that's "too high" (outside acceptable limits for usage), and the statistics report it's overwhelmingly the only thing running unacceptably, please work with us to get that feature or content to an acceptable level for the performance you want to have.
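The numbers in points 7 and 8 can be worked through in a few lines. This is a minimal sketch with hypothetical function names. It assumes per-script times simply add up, and it reads the Ping Sim remark as "a client drawing 2 frames per second needs about 500 ms per frame"; both are simplifying assumptions rather than a description of how the server or viewer actually schedules work.

```python
def total_script_time_ms(script_count, avg_ms_per_script):
    """Total script time if every script costs the same average amount."""
    return script_count * avg_ms_per_script

def frame_time_ms(client_fps):
    """Time the client spends on one frame at a given framerate."""
    return 1000.0 / client_fps

fewer_but_heavier = total_script_time_ms(500, 0.2)   # about 100 ms in total
more_but_lighter = total_script_time_ms(1500, 0.01)  # about 15 ms in total
print(fewer_but_heavier, more_but_lighter)

# At a client framerate of 2.0, one frame takes 500 ms.
print(frame_time_ms(2.0))
```

Under these assumptions the region with a third as many scripts carries several times the script load, which is why script counts alone are a poor benchmark.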

It seems quite reasonable to me that your island runs smoothly or not based on what's going on there, what you've built, what scripts it's running, who's there and what they brought with them. Maybe the Lindens are wrong (or lying as suggested), the statistics are fictional, hardware isn't limited in what it can process, and you should be able to build whatever you want as long as you find one combination of things you're doing better than someone else... but what I see everyday doesn't lead me to believe that. Plus, I'm a rotten liar. ;^)

[EDIT: Mostly formatting changes for a feeble attempt at readability]
Lee Linden
llBuildMonkey();
Join date: 31 Dec 1969
Posts: 743
05-12-2005 16:47
Christof: There are no slow servers among private islands. All private islands are on one of two separate configurations of hardware. I've tested switching a private island back and forth between these hardware configurations, using the exact same sim content and the same group of people. The difference in performance between the two hardware configurations was smaller than the variance (the largest and smallest numbers each statistic reported over a minute or two). As such, the hardware for all private islands is statistically insignificant in terms of performance, especially since content makes so very much more difference!

[EDIT: IN-significant, IN-significant! (IN-conceivable!)]
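The comparison described in this post (a difference between two configurations that is smaller than the spread of each statistic) can be sketched as follows. The readings are invented for illustration only, the function names are hypothetical, and "spread" is taken as largest minus smallest reading, following the wording above rather than a formal statistical test.

```python
def mean(samples):
    return sum(samples) / len(samples)

def spread(samples):
    """Largest minus smallest reading observed over the sampling window."""
    return max(samples) - min(samples)

def difference_is_within_noise(samples_a, samples_b):
    """True if the gap between the two averages is smaller than the larger spread."""
    difference = abs(mean(samples_a) - mean(samples_b))
    noise = max(spread(samples_a), spread(samples_b))
    return difference < noise

# Hypothetical readings of one statistic (ms) on each hardware configuration.
config_a = [2.9, 3.4, 3.1, 3.6]
config_b = [3.1, 3.6, 3.3, 3.4]
print(difference_is_within_noise(config_a, config_b))
```

With these made-up readings the averages differ by about 0.1 ms while each configuration wobbles by 0.5 ms or more, so the difference would be lost in the noise.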
Derry McTeague
Registered User
Join date: 6 Jan 2005
Posts: 81
05-12-2005 18:16
I think you hit the nail right on the head, and I'd really like to see Live Help stop saying it's the user's connection or PC. (I can only think that they do believe we're all idiots, incapable of making the same observations.) You have proven that Second Life has created a world that they aren't equipped to support, and when there is no apparent reason for any lag in a sim the only logical cause could be their servers. But it seems to me that they are intelligent enough (I hope) to have realized all this already, and I just hope they're working to correct the problem and not in denial or too egotistical to acknowledge the real problem.
Prokofy Neva
Virtualtor
Join date: 28 Sep 2004
Posts: 3,698
05-12-2005 19:15
I think Derry has experience with *exactly the same sim* where I often notice lag -- meaning my avatar moves slowly, the frames go by slowly, and I can't type and return the text without effort and pauses. That's what most people mean by lag.

And as we look at the sim and the activities on it, we don't get it. What's to cause this? Is it just a junky server (my own theory after months of testing and observations)? Or is someone "to blame"?

When we see scripts are above 300 --- above 500 -- we have to conclude it is the numbers of scripts.

From: someone
Too many scripts overall


There, you've said it, Lee. You have said it. Like all the Lindens we've ever contacted about this particular sim, and other laggy sims (the sims are lagging not us, because no, we don't have TSO open or any other programs, no our graphic cards have updated the drivers blah blah).

So try to at least entertain the idea that *this sim could have a crappy server*. Try to at least keep an open mind about this. Try to reflect on the notion that a brand-new sim, with land for sale on it that still advertises it on "New fresh and fast mature sim!" could be dragging its ass today *because the server was changed or because the server was sub-optimal*.

I'd like a review of the server itself be part of the checklist of what the Lindens do when dealing with multiple complaints of lag on a sim. I don't want to get a smack in the face about hoochie hair and sex balls -- because they aren't really there for the most part on that sim, and not the reason for its lag.

I strenuously urge you not to take away the FPS numbers visible to customers. That would be an outrage. While we recognize your points about FPS not being the final arbiter of sim performance, it's a good rule of thumb. It's a good thing to check.

When you are lagging, and you check the FPS and see it isn't down to 50 or 500 but is at 1500, you know that your computer or connection is the problem.

And conversely, when you see that FPS is drug down to 37, you realize that there is war somewhere on the sim with kids shooting scripts or bouncing or doing some other stupid-ass thing, and you can look around for it and take action.

Honestly, please *do not* deprive your paying customers of the ability to check your product's performance!
_____________________
Rent stalls and walls for $25-$50/week 25-50 prims from Ravenglass Rentals, the mall alternative.
Prokofy Neva
Virtualtor
Join date: 28 Sep 2004
Posts: 3,698
05-12-2005 19:18
From: someone
* We have to wonder why Linden Labs is not more forthcoming with factual information and why we continue to be told things both we and Linden Labs know are not true (such as the "too many active scripts" claim).


As you may know, this is the subject of many intense and active threads over in other sections, under GENERAL and also LAND AND ECONOMY.

The whole reason why I and others could even call for a pay-to-play concept where people would be charged for the excessive use of scripts is because it is Lindens, making house calls, who would say "Don't just look at the FPS, look at the number of active scripts and reduce them."

I'm glad that this feedback is getting some scrutiny now.

It helps some of the players realize that it's not just us saying this idea of "too many scripts" -- *it comes from the Lindens themselves*.
Barmovic Boffin
Registered User
Join date: 21 Jan 2005
Posts: 87
05-12-2005 19:53
I think there is some BIG factor which is far simpler than all the complexity Lee discusses, and which I have never seen explained. I have seen a BIG change in lag (let's just say in client fps) happen just from logging out, and then logging in again, as quickly as possible.

I have seen fps slowly deteriorate over an hour, and then jump back up from 2fps to 25fps at a fast-as-possible relog. I have seen a home sim where I usually get say 11fps, and then one time I tp home, and I get an unbelievable 35fps, steadily. Leave, return, back immediately to the usual 11fps. All with no rhyme or reason, over 30 second timescales (that's a fast log in).

Maybe there is a bug in the client, been there a long, long time. Something that can change from login to login. Maybe something to do with the cache.

But it's obviously different from what Wayfinder is talking about, because there a group of avatars all see the lag change together. Or is it? Lee suggests that a "bad modem user" can drag the sim down. Does that mean that a bug in the client, active in just one avatar's computer, could similarly slow everyone else down? When absolutely nothing anyone can see or detect has changed?

It's a mystery. But I definitely think something is going on completely outside the envelope Lee has described.

But how to prove it? It's all just anecdotal.
Wayfinder Wishbringer
Elf Clan / ElvenMyst
Join date: 28 Oct 2004
Posts: 1,483
05-12-2005 20:36
Lee, thanks for taking the time to reply to this thread. I know how long it took you to write that... and that fact alone is appreciated. I don't agree with much of it. LOL. But I appreciate it.

First thing I want to say is that to my knowledge, no Linden has ever directly lied to me. But I do feel at times some corporate face-saving has been done and I've been told too many times the "too many active scripts" explanation when I knew that simply was not the case. And that's what I meant when I said, "Telling us things that are not true." I mean simply that-- giving us canned explanations that our data has already shown to be false. That's not necessarily lying-- it's just not paying attention to information we've already established and provided.

First, in the definition of lag: there's a difference between definition and cause. Yeah, there are a multitude of things that COULD be causing lag. And maybe it's looking at every one of those things that's preventing seeing the overall picture. My definition of lag is: any time general game play slows down significantly for a group of people gathered in one area. And quite often, across ALL of Second Life.

Question users: How many times have you been online and heard the continual comment over and over again, everywhere you go, "Yeah, lag is really bad tonight"? How many times have you been online and noticed that the entire system seems to be running much more slowly than it was the day before, no matter where you go?

That is my definition of lag. System-wide lag has nothing to do with objects rendered, number of active scripts, number of avs in a sim, internet lines (unless it's SL's internet lines), or any other client-side or sim-specific factor.

Now yes, there could be many reasons lag happens, from Avatar scripts to badly-written particle routines. Yes, individual users could experience lag due to local problems ranging all the way from a flaky microprocessor to ram glitches to a dirty internet line.

HOWEVER, none of that explains the type of lag that I clearly defined in my messages: lag that suddenly hits from nowhere, affects more than one user, and NOTHING on the sim has changed. No new avatars visiting. No new builds. No new scripting. Suddenly the sim is running fine. Suddenly it is not.

I appreciate that content definitely produces lag. I agree 100% that a sim with 12000 prims is going to run somewhat slower than a blank sim. I agree that a sim with 500 scripts is going to run slower than a sim with no scripts. That is just common sense. However, this does not explain why a sim with 12000 prims and 500 scripts will be running along peachy fine, with no discernable lag, then suddenly drop from 500 FPS to 25 FPS for no discernable nor measurable reason and when nothing on the sim has changed to produce that lag.

That is the lag I'm talking about. That is the lag that screams server-side problems. That is the lag that needs to be discovered, targeted and eliminated.

I know that Linden Labs has its hands full. I have spoken with you and with Blue and Jill and other Lindens, and you are always nice, and polite and you do spend your time speaking with us. But I think what a lot of folks are somewhat upset about is that quite often, it doesn't seem Linden Labs LISTENS to us... when we're the ones (as one user very aptly put it) who are out here in the Front Lines. Some of us are computer professionals who can tell just by years of experience when something is going on, and we almost have a feel for where it is.

My solution: Locate the problem. Fix the problem. Make users happy. You never will 100% eliminate lag. But it is possible to correct it to the point that no one complains about it. Like one user stated (and gave me a chuckle)... you folks need to find out where to aim the weapons.
Usagi Musashi
UM ™®
Join date: 24 Oct 2004
Posts: 6,083
05-13-2005 00:54
From: Lee Linden
I'm entirely for bringing up a good serious discussion of the causes of performance issues in Second Life, and I'll do what I can to contribute everything I know here. But there are a lot of mistaken and flawed assumptions in the opening to this thread, and I really do want to set them straight.
[EDIT: Mostly formatting changes for a feeble attempt at readability]


A BIG THANK YOU LEE!!!!!! It's the best information post I've seen in a LONG TIME! :) THANK YOU THANK YOU THANK YOU!
Kris Ritter
paradoxical embolism
Join date: 31 Oct 2003
Posts: 6,627
05-13-2005 01:31
From: Lee Linden
There are no slow servers among private islands. All private islands are on one of two separate configurations of hardware. I've tested switching a private island back and forth between these hardware configurations, using the exact same sim content and the same group of people. The difference in performance between the two hardware configurations was smaller than the variance (the largest and smallest numbers each statistic reported over a minute or two). As such, the hardware difference across private islands is statistically insignificant in terms of performance, especially since content makes so very much more difference!


I wish that this were driven home a bit more, because the popular belief is that there are two distinctly different sets of hardware, one of which is vastly inferior.

Just about everyone who has approached me about buying my sim has used it as an excuse to try and knock me down a couple of hundred bucks, or outright changed their mind because "it's an old sim".

And now I'm stuck with it for another month, I guess. That'll teach me to be an early adopter and supporter of LL!
Foolish Frost
Grand Technomancer
Join date: 7 Mar 2005
Posts: 1,433
05-13-2005 03:19
Actually, Lee, the term lag does have a measurable definition among the old hacker groups I used to 'know of'...

It generally defines a state where the response back from a server (your computer sends data, and you get a response back) is noticeably slow or interferes with performance.

As far as I can tell, it came from a joke about data taking so long to return, that it was getting JetLag...

The common causes for Lag include:
- Poor network connection, due to problems at some point between the computer and the server. (Lot of hardware to check in that one.)
- Poor server performance, so that it takes a long time to create the response.

Low framerate due to excessive graphics is NOT lag, nor is poor performance due to client system/software issues. They might easily be misdiagnosed as lag, but that's what it is, a misdiagnosis.

Use of pings has been the traditional measurement of lag, as was eyeballing the amount of time it took for a 'character' in an online game to start an action when the player activates it. For example, in a really lagged game of Quake, you might not stop running for several moments after you let go of the forward key. Or you might only fire your weapon a half second after you actually attempted to.

Poor framerate, traditionally, cannot cause the above symptoms, but some other client-side problems can. Failing network cards or wiring are an example.


I'm shutting up now.
Laukosargas Svarog
Angel ?
Join date: 18 Aug 2004
Posts: 1,304
05-13-2005 04:00
From: someone
Use of pings has been the traditional measurement of lag

Indeed.

And I also want to thank Lee very much for that detailed post.

What I have noticed especially since the release of 1.6, although this could be coincidence, is massive ping times! Sims (like the one my place is in) with content that doesn't change from day to day have performance that goes from great to awful with only me or one other in the sim and with very few if any child agents. These days I often see pings of 12000ms or even more! I would normally expect to see 100 to 200ms, being that I'm in the UK. These huge ping times did not happen only a few months ago. Sure, it could be something between me and the SL servers, but I have a hunch it isn't.
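
To make a hunch like this easier to report, the ping readings can be summarized against the range the poster expects. What follows is only a minimal Python sketch, not anything taken from the Second Life client: the function name `summarize_pings`, the 200 ms ceiling (borrowed from the 100 to 200ms figure above) and the sample readings are all assumptions made for illustration.

```python
from statistics import median


def summarize_pings(samples_ms, expected_max_ms=200.0):
    """Summarize ping readings (in milliseconds) against an expected ceiling."""
    if not samples_ms:
        raise ValueError("need at least one ping sample")
    return {
        # The median is less distorted by a single huge spike than the mean.
        "median_ms": median(samples_ms),
        "worst_ms": max(samples_ms),
        # How many readings exceeded what the user would normally expect.
        "over_expected": sum(1 for s in samples_ms if s > expected_max_ms),
    }


# Hypothetical readings, not measured data.
readings = [150, 180, 12000, 160, 9000]
print(summarize_pings(readings))
```

A summary like this says how often the pings leave the normal range, which is more useful to a troubleshooter than a single worst-case number.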

It's no coincidence surely that the "lag" (for want of a better word) has gone up, it seems, with the huge increase in subscribers to SL.
eltee Statosky
Luskie
Join date: 23 Sep 2003
Posts: 1,258
05-13-2005 10:27
From: Lee Linden


5) SimFPS means nothing. I'd wager that if you couldn't read SimFPS, you wouldn't be able to guess it at all. We may actually remove this statistic in the future, because it has no correlation to performance. I don't read the SimFPS value, mostly because it doesn't give me any information I can troubleshoot with. I can't really help look at a "low SimFPS" unless it's under 50, because a low SimFPS alone does NOT mean performance is reduced.


well i agree with a lot of the rest of what lee has to say, and have more than a fair share of experience (rather accurately) determining what was causing lag in sims and would be happy to help out... but i have to make a BIG jumping arm waving additional comment to this. Don't cripple something because people misunderstand it. I understand people over-react to sim fps as it is currently displayed, and that life would be a lil easier if you didn't have to sit there and deal with a never-ending stream of people upset that it goes from 4000 to 2500, etc..

what i posit is don't display it as 1/n (n being the total ms required per server frame). with 1/n, the obvious problems for small values of n, the non-linearity, and the misunderstandings always arise... but there's still valuable information there for those of us who *DO* know how to read it properly, aka what the total frame time is... it's just currently obscured in a 1/n fraction... why not just call it 'server frame time' and display its number in ms like the other timings.. aka instead of 5000 sim fps, list it as 0.2ms server frame time... instead of 100, it would be 10ms server frame time...

It's the *same* information, but simply displayed in a method that would be a lot less misleading to people.. instead of a small change 'losing hundreds of sim fps' in a misleading way, it would instead show them their server time went from .3ms to .5ms (aka roughly 3300 sim fps to 2000 sim fps).

I mean what i'm asking is, i have done a lot to help out other people, and help LL, diagnose sim performance problems. Workin with chromal we were able to discover the hardware failings of the first generation of sims, using among other things, sim fps as a correctly re-interpreted statistic... don't take away that information, just remove the 1/n silliness and show n in all its glory, a much more realistic number that is much less open to misinterpretation.
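
The conversion being asked for is a single division. As a rough Python sketch (the function names are made up for illustration, and it assumes SimFPS is simply 1000 divided by the server frame time in milliseconds, as the post describes):

```python
def sim_fps_to_frame_ms(sim_fps):
    """Convert a SimFPS reading into server frame time in milliseconds."""
    if sim_fps <= 0:
        raise ValueError("SimFPS must be positive")
    return 1000.0 / sim_fps


def frame_ms_to_sim_fps(frame_ms):
    """Convert server frame time in milliseconds back into SimFPS."""
    if frame_ms <= 0:
        raise ValueError("frame time must be positive")
    return 1000.0 / frame_ms


# The two figures used in the post: 5000 and 100 sim fps.
for fps in (5000, 100):
    print(f"{fps} sim fps = {sim_fps_to_frame_ms(fps):.1f} ms server frame time")
```

Nothing is lost either way; the two numbers carry the same information, and only the millisecond form can be compared by simple subtraction.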

Btw if you want me to help find what specifically is causing the lag in your sims i would be happy to try and help, wayfinder.
_____________________
wash, rinse, repeat
Lee Linden
llBuildMonkey();
Join date: 31 Dec 1969
Posts: 743
05-13-2005 10:31
Whew! Lots of feedback! Okay, here we go...

Derry: The poor Live Helpers... they're out there every day trying as hard as they can armed with only their knowledge. They don't have special tools, they can't view anything about someone's sim or account, and they'll typically be told to fix gridwide problems before the Lindens hear about them! But they try their best, and they're a huge help. I know we're all proud of what they accomplish. It's got to be frustrating working with yet another person who won't describe their problem in any terms except "it lags" and "it's slow".

Prokofy: Believe me, I entertain the notion that a server might be a problem. The most effective thing I've tried in that case, however, is to simply restart it. It's easily enough done; the hardest part is kicking everyone out. That does work in a few cases where the sim's worn itself down from a memory leak or something else, and we haven't automatically detected that and restarted it already.

There's a misconception that we refuse to swap servers, through misguided policies or simple denial. That's really not the case. We've done so, often, and I've handled several myself. The problem is it doesn't really have any effect at all.

I have a chart on my desktop that plots the SimFPS of one sim across every server it was on in six months. (This covers both classes of hardware we currently use, about a 400-sim range at the time.) The result? The difference was statistically insignificant. The low and high sim numbers were effectively distributed randomly in a tight range. Given the same content, the number of avatars in the region was the only independent variable that changed SimFPS... no specific sim made anywhere near the difference that three or four people could.
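
One way to picture the kind of comparison described above is to check whether the gap between two servers' average readings is smaller than the spread each server shows on its own. This is a minimal Python sketch under stated assumptions, not the analysis Linden Lab actually ran: the function names are hypothetical, "spread" is taken to mean largest reading minus smallest, and the sample numbers are invented.

```python
from statistics import mean


def spread(samples):
    """Range of the readings: largest minus smallest."""
    return max(samples) - min(samples)


def difference_within_noise(a, b):
    """True when the gap between the two averages is smaller than
    the larger of the two sample spreads."""
    gap = abs(mean(a) - mean(b))
    noise = max(spread(a), spread(b))
    return gap < noise


# Hypothetical SimFPS readings for the same content on two hardware classes.
class2 = [480, 510, 530, 495, 520]
class3 = [500, 540, 515, 490, 535]
print(difference_within_noise(class2, class3))
```

With these invented readings the averages differ by 9 while each server swings by 50 on its own, so the hardware difference is lost in the noise. A proper significance test would need more samples and a real statistical method; this only illustrates the reasoning.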

I've swapped specific servers across a wide range of numbers. It hasn't made a difference. What makes a difference is that one script that starts looping heavily, that one flying bird that interpenetrates a building, that one person on dialup with a Bandwidth setting of 1000 mucking up the works.

Do we have the tools to pinpoint a specific piece of content that may be causing the problem, let alone a specific collection of 50 objects that are taking half the load? No. Do we currently provide the information residents want about how their content affects their performance? No, we're not really doing that at all right now, and that's something we NEED to do.

But content is still king.
Content still makes or breaks any server, no matter what number it's labelled with.
It's not logical at all to conclude that because you lack the tool to identify troublesome content, that troublesome content must not exist.
Any content that includes scripts is NOT STATIC. Scripts are dynamic; any script of any note can be a high or low load, at any time, based on what's going on, what has happened, what part of the script is running. Any script that takes any action without resident interaction, or loops any part of itself, has the potential to take a huge chunk out of performance. Scripts that DO have resident interaction tend to go all-out when interaction actually occurs, so they can be even less predictable.

We keep trumpeting that content is what makes or breaks server performance NOT because we're trying to hide "bad sims", or because we don't know about them, or because we're trying to avoid our responsibility to make what we can, right. We do so because, to be honest, the power is in your hands more than ours. Do you need better tools to benchmark your content? I'll say YES! louder than you will, and I'll push to make that happen.

But while we make the engine, you're in the driver's seat. Maybe things aren't running at Formula 1 precision, and that's something we work on every day. But, you can't ignore the role you play, as the driver, towards how enjoyable of a ride you have.
eltee Statosky
Luskie
Join date: 23 Sep 2003
Posts: 1,258
05-13-2005 10:38
From: Lee Linden
There's a misconception that we refuse to swap servers, through misguided policies or simple denial. That's really not the case. We've done so, often, and I've handled several myself. The problem is it doesn't really have any effect at all.


this is a bit misleading, it's much more true now than it was, that by and large all sims are running on second or third generation hardware... tho the difference between those two is statistically significant, it is not a power of 10 odd difference, as the difference between first and third gen was (on the same sim on the same day)...

I think it's more accurate to say now that if your sim is VERY slow, it's because of content. there are still slightly different calibers of sims, but they *all* run a well thought out, content-wise, sim very well.

(assuming of course that all class 1 sims have now been removed from the main grid, which is what i have been told)

*just to note, i'm not disagreeing with lee here, i'm just trying to agree, with slightly different words that may make what he's trying to say a lil more intuitive for most people
_____________________
wash, rinse, repeat
Lee Linden
llBuildMonkey();
Join date: 31 Dec 1969
Posts: 743
05-13-2005 10:52
From: Wayfinder Wishbringer
Question users: How many times have you been online and heard the continual comment over and over again, everywhere you go, "Yeah, lag is really bad tonight"?

I hear that every night in every game I play, no matter how good my experience is and no matter how smoothly the servers are running. Remember, you don't hear from the people running smoothly. There will always be lag complaints because so many people are using the term inappropriately.

From: Wayfinder Wishbringer
How many times have you been online and noticed that the entire system seems to be running much more slowly than it was the day before, no matter where you go?

I'm a glutton for punishment by saying anything about performance in the middle of some very real grid-wide issues (which many folks here are losing sleep to fix, believe me). But problems within Second Life are best tackled one-at-a-time; there's no sense fixing a perceived "bad server" if the problem lies in content (or with a bigger grid-wide issue, let's be honest).

From: Wayfinder Wishbringer
HOWEVER, none of that explains the type of lag that I clearly defined in my messages: lag that suddenly hits from nowhere, affects more than one user, and NOTHING on the sim has changed. No new avatars visiting. No new builds. No new scripting. Suddenly the sim is running fine. Suddenly it is not.

I'm confused; I don't see that as a clear definition at all. You've only stated that SOMETHING isn't running as it should. That's a definition, only in that it's the only sentence vague enough to encompass everything people refer to as "lag". The primary intention of my first post was to point out how a "clearly defined" sentence like this doesn't even begin to touch the surface of a problem description that can be effectively troubleshot. It is, however, a clear example of how most lag complaints are worded. This is where the challenge of my job lies. ;^)

From: Wayfinder Wishbringer
I know that Linden Labs has its hands full. I have spoken with you and with Blue and Jill and other Lindens, and you are always nice and polite, and you do spend your time speaking with us. But I think what a lot of folks are somewhat upset about is that quite often, it doesn't seem Linden Labs LISTENS to us... when we're the ones (as one user very aptly put it) who are out here in the Front Lines. Some of us are computer professionals who can tell just by years of experience when something is going on, and we almost have a feel for where it is.

I can completely empathize with this. The feeling of not being listened to... of being the person who deals with the subject every day, but not in the way the other speaker does... of being someone with years of experience directly related to the subject, but somehow treated by the other speaker as unknowledgeable, misguided, incorrect, or foolish... of knowing 100% where the conversation needs to go, but being brought back to topics that you feel are distracting at best because they're central to the almost completely unrelated viewpoints that the other person keeps trying to bring up.

It keeps conversations interesting, I guess. ;^)

From: Wayfinder Wishbringer
My solution: Locate the problem. Fix the problem. Make users happy. You never will 100% eliminate lag. But it is possible to correct it to the point that no one complains about it. Like one user stated (and gave me a chuckle)... you folks need to find out where to aim the weapons.

As long as elements of Second Life remain out of our direct influence and control (the content, the Internet, the resident's computer), there will be things that don't perform as they should. As a result, there will always be unhappy people, and there will always be complaints.

Personally, my goal is: Locate problems. Fix problems. Make users happy. I will never 100% eliminate things people aren't happy about. There are some things that cannot be corrected to the point that specific residents are happy. (Some are misguided diagnoses of the problem; some are conflicting viewpoints of what the solution is, etc.)

But I try to do all that I can, to the best that I can, every day, and be proud of the results. I know that goes for a lot of folks at Linden.
Lee Linden
llBuildMonkey();
Join date: 31 Dec 1969
Posts: 743
05-13-2005 11:04
Foolish: Correct, that is the classic definition, and the one I follow. However, I gave up long ago on promoting the "true" definition, when my day consists of fixing the "social" definition. ;^)

Laukosargas: That's why I try to promote ALL the statistics; the more numbers you read, the more you know about which causes can be eliminated. A 12000 ping on the transatlantic connection does unfortunately happen at times; a traceroute can help identify at what point in the connection it all goes downhill. There are scaling issues, to be sure, but remember that each sim only speaks to four others, no matter how big the world gets. Of course, the currently-centralized servers that hold global data are a different matter, but I know they're also a key focus for us right now.

Eltee: I'd personally lock SimFPS at 50 and never show it any higher than that. Because, to be honest, I handle a lot of support requests that involve that number changing and nothing else. There's no description of REAL performance issues, nothing is working incorrectly, it's just that someone read that number and didn't like it. I don't consider it to have much real value because by the time it actually MEANS something, all the other statistics are already visibly reporting what the EXACT problem is.

To be honest, I wouldn't be trumpeting "there's no real difference" if there were ANY difference. I'm not saying that 1vs2 is 10x difference and 2vs3 is 1.5x difference, so I'm going to ignore it. I'm saying that there's maybe a 10% difference with a 25% margin of error. Yes, that totally means that one specific case of a class3 might report 15% less SimFPS than another specific case of a class2. These are vague estimates based on my personal anecdotal evidence, not statistics we have. But, hopefully it illustrates the extent to which I mean that there is no discernible difference in hardware. It's very much about perception.

And for historical reference: The class1's were the "slow" ones, which took a noticeable performance hit when many agents showed up. We delegated nearly all of those to off-grid or Linden tasks (i.e. estates residents can't see). All private islands are on our two current hardware configurations, aka class2 and class3. The hardware is different but we haven't seen any performance change. Class2 and class3 cover every sim number from about 115 and higher, currently. A sim160 and a sim425 are the exact same hardware; again, it's about perception.

[EDIT: Included more sim class info.]
Wayfinder Wishbringer
Elf Clan / ElvenMyst
Join date: 28 Oct 2004
Posts: 1,483
05-13-2005 11:04
From: Lee Linden

But content is still king.
Content still makes or breaks any server, no matter what number it's labelled with. We do so because, to be honest, the power is in your hands more than ours. Do you need better tools to benchmark your content? I'll say YES! louder than you will, and I'll push to make that happen.
But while we make the engine, you're in the driver's seat. Maybe things aren't running at Formula 1 precision, and that's something we work on every day. But, you can't ignore the role you play, as the driver, towards how enjoyable of a ride you have.


Lee, pardon me being the Devil's Advocate here, and pardon my no-beating-around-the-bush comments, but I have to take strong exception to these statements as presented. I don't mean or want to be rude, but it's time to cut to the core.

Yes... content is important. I will be the first to admit that bad content will lag a sim to uselessness. Bad scripting, AVs with AO devices, heavily primmed torus builds, etc etc can all be responsible for poor sim performance. Agreed 100%.

That doesn't mean they ARE. Because when ALL of Second Life slows down at once... when teleports are crashing... when people are falling through floors that suddenly go phantom on them... when everywhere you go everything is running rock-bottom slow but the night before everything was working fine--- that has nothing to do with content. That has nothing to do with end users being in the "Driver's Seat". It has nothing to do with the way we are building our sims or the way we are scripting or anything else the users are doing. It has to do with SERVER PROBLEMS (and in that, I include everything from a server going bad to messed up trunk lines to whatever things might affect the internal operation of Second Life).

And this is what I and many other people are hollering about. We are tired of being told that LAG IS PRIMARILY OUR FAULT-- IT'S DUE TO OUR CONTENT-- when we already have the data to prove that's a load of hooey.

FACT: When a sim is running just fine, when content doesn't change, when no new avatars are on the sim and no new builds or scripting is being done and the sim suddenly goes from 90 to 9... that is NOT content. When we're standing around typing messages to someone, and suddenly out of nowhere those messages take 30 seconds to appear on the screen-- not just for one user but for ALL of them... that is not content.

This is our problem Lee-- users being told that it is OUR fault these slowdowns are occurring-- when it is not. Linden Labs is passing the buck rather than finding the true source of these problems and fixing them... and that is what makes users angry. And that's what prompted this post in the first place-- as well as the several messages following from users that had come to the same conclusions experienced by my group's experiments.

It's one thing to have a belief that content may be causing the problem (and certainly, in some cases it does). However, it's another thing entirely to totally ignore factual data proving otherwise. The claim of "content-fault" does not explain the current super-lag problems we've been experiencing since the onset of v1.6-- and the lag bouts that existed even prior to that.

I'm not trying to get on your case Lee-- absolutely not. You're a kewl person. I, and many others, are trying to get Linden Labs to stand up and work to find out where the REAL problem lies. Hunt down the lag monster and KILL the thing!
Lee Linden
llBuildMonkey();
Join date: 31 Dec 1969
Posts: 743
05-13-2005 11:17
From: Wayfinder Wishbringer
That doesn't mean they ARE. Because when ALL of Second Life slows down at once... when teleports are crashing... when people are falling through floors that suddenly go phantom on them... when everywhere you go everything is running rock-bottom slow but the night before everything was working fine--- that has nothing to do with content. That has nothing to do with end users being in the "Driver's Seat". It has nothing to do with the way we are building our sims or the way we are scripting or anything else the users are doing. It has to do with SERVER PROBLEMS (and in that, I include everything from a server going bad to messed up trunk lines to whatever things might affect the internal operation of Second Life).

And this is what I and many other people are hollering about. We are tired of being told that LAG IS PRIMARILY OUR FAULT-- IT'S DUE TO OUR CONTENT-- when we already have the data to prove that's a load of hooey.

As I said, there's a challenge to discussing this at all in the face of larger problems. It's very easy to fall back to those issues when discussing problems with a specific region.

I've made no attempt to hide that there are currently larger grid issues, and we are in fact burning the midnight oil to make them right and ensure they don't get worse or come back in the future. That's been discussed.

From: Wayfinder Wishbringer
FACT: When a sim is running just fine, when content doesn't change, when no new avatars are on the sim and no new builds or scripting is being done and the sim suddenly goes from 90 to 9... that is NOT content. When we're standing around typing messages to someone, and suddenly out of nowhere those messages take 30 seconds to appear on the screen-- not just for one user but for ALL of them... that is not content.

As I said, the second scripts are involved, content is dynamic, not static. Scripts do change, their performance does change, and the sim performance changes as a result. I can't effectively troubleshoot a scenario in a known sim where scripts exceed 500 and performance changes, without any other information. As I've stated a few times, the other statistics are the important ones, and will point out IF the scripts are taking too long to process, or IF someone's lagging up the sim, or IF there's a larger grid issue.

You're in control because I've pointed out the tools that are available to identify where the problem lies. If I'm in the sim, I can read them myself, but if you're asking that a past case of unsatisfactory performance be analyzed, I need you to use the tools yourself to help me. Without those tools, I can't pinpoint a cause, I can't make suggestions, I can't determine which developer or operations person I need to talk to, and I can't fix things. If I'm not given the Time(ms) values, specifically for Run Agents and Run Tasks, I can't ignore the fact that the scripts are a very likely suspect.

And again, this all applies to the specific complaint of SimFPS changing in a region, NOT larger issues that affect you throughout Second Life. Let's handle those separately when discussing this.
eltee Statosky
Luskie
Join date: 23 Sep 2003
Posts: 1,258
05-13-2005 11:22
From: Lee Linden
Eltee: I'd personally lock SimFPS at 50 and never show it any higher than that. Because, to be honest, I handle a lot of support requests that involve that number changing and nothing else.


Yeah i know how frustrating that could be but i would really hate to lose access to what is frankly a rather useful diagnostic, not the sim fps, but the actual total millisecond server frame time which is the inverse thereof...

that's why i suggest *DO* get rid of sim fps as a line item in the alt+1 statistics, as with all inverse fractional numbers 1/n it is far too often misunderstood. all you need to do tho is replace it with the actual millisecond value that was used to calculate it, and give it a new label, 'sim frame time' or some such.. it will give those of us who do understand performance metrics the same information we had before, only we can skip an extra step of math in our heads... and at the same time it will get rid of the 'OMG my sim dropped 200 fps!' complaints from people who don't understand what it means in the first place.

The real number that matters is how long the server spends processing each frame... just display that, it's far harder to misinterpret, and more meaningful as an overall debug value since it can be compared linearly, instead of fractionally. aka a sim running at 5ms is doing 4ms more work than a sim running at 1ms, not something people would get terribly excited about, and that's clearly visible, whereas a sim running at 200 sim fps looks like it's doing *800* more work than a sim running at 1000 sim fps, even tho the two comparisons are exactly identical


-edit-
also of course in 'sim fps' land that 4ms will be reported differently depending on what yer comparing to.. a sim running at 10ms versus 14 will still have 4ms difference, but now that will be about '30' sim fps, whereas in the previous illustration the exact same object, which is adding 4ms to runtasks, would make an 800 sim fps 'difference' to a sim starting at 1ms, and that causes *all* kinds of confusion about how much thing x does or does not 'lag' in the first place
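
The arithmetic in this post can be checked with a few lines of Python. This is an illustrative sketch only: the function names are invented, and it assumes SimFPS equals 1000 divided by the server frame time in milliseconds.

```python
def sim_fps(frame_ms):
    """SimFPS implied by a server frame time in milliseconds."""
    return 1000.0 / frame_ms


def fps_drop(base_ms, added_ms):
    """How far SimFPS falls when added_ms of extra work lands on a sim
    whose frames previously took base_ms."""
    return sim_fps(base_ms) - sim_fps(base_ms + added_ms)


# The same 4 ms of extra work, applied to a 1 ms sim and to a 10 ms sim.
for base_ms in (1.0, 10.0):
    print(f"{base_ms:g} ms + 4 ms: SimFPS drops by {fps_drop(base_ms, 4.0):.1f}")
```

The identical 4 ms object costs 800 sim fps on the fast sim and under 30 on the slow one, which is why the raw frame time is the easier number to reason about.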
_____________________
wash, rinse, repeat
eltee Statosky
Luskie
Join date: 23 Sep 2003
Posts: 1,258
05-13-2005 11:33
Seriously tho wayfinder, IM me in world, i have some tools and basic know how on how ta figure out what actually is causin yer specific lag situations.. it's not 100% but often it can get you a better answer, faster, than a liaison or code guru linden comin in 'cold' to the sim can. (ultimately obviously LL has better tools than we have access to, sure, but at the same time, you know yer sim, and whats in it, and can often have a good idea of why things are goin bad, before LL can spend the time to diagnose it for you)

I have a tool for locating active objects, some techniques and basic knowledge about what active objects tend to cause what kind of performance degradation that i'd be happy to share, and i'm offerin ya some free 1:1 time ta try an sort it out heh
_____________________
wash, rinse, repeat