Welcome to the Second Life Forums Archive

Constant packet loss is rendering SL unusable

Kermitt Quirk
Registered User
Join date: 4 Sep 2004
Posts: 267
11-07-2004 19:00
I originally posted this message in the Technical Issues forum under a thread called "Out of Date Intel Chipset Driver" because I suspect that error may have something to do with my problem. However, after a whole weekend I don't have a single reply, which either means no one has any ideas how to solve this, or it just got missed by everyone because of where I posted it.

So I'm posting it again under a new thread because I'm really stuck here and I'm likely to give up on SL very shortly if I can't find a solution. I've also posted this to support, but so far all I've had is an auto-reply, and from what I've read in these forums I ain't holding my breath for a response.

If anyone has any ideas whatsoever about what I can do to either solve this or help to determine what the problem is, PLEASE, PLEASE, PLEASE, PLEASE reply.

FYI... I made one other change over the weekend in an effort to fix this, but it also made no difference. I usually run my internet via a Linux (Red Hat) server, but I've removed that and now have the cable modem connected directly to my Windows PC.



Original Post follows...
-----------------------------------------------------------------------------------------------------

I've been having this exact same problem and unfortunately I don't think I can just ignore it. I've been getting constant packet loss in SL lately and this is the only thing that seems to be a possible cause. Here's the full story...

Just recently I've had to rebuild my PC because it decided it would start to freeze and not boot at seemingly random intervals. I suspect this was remnants of problems caused by a bad lightning strike I had in my area earlier in the year (which also fried some other hardware immediately at the time). So anyway, I got everything tested and determined that both the mainboard and CPU were dodgy. So I got myself a nice new mainboard and CPU, which also meant an upgrade (from 2.6GHz to 3.2GHz, among other speed increases I gained from the updated mainboard).

So now I've completely rebuilt my PC: fresh install of Windows, and all the latest drivers. Now when I go into SL, about 90% of the time the packet loss will be fluctuating around 10%. It's making SL so slow, and objects are taking so long to load, that I've pretty much got to the point where I can't be bothered going in there. The first thing I thought was that it was a network driver conflict because of the Intel Chipset error I was getting. So I updated that driver (SL still gives the error message, as others in this thread have stated) but that appeared to make no difference.

One thing that has changed in my new configuration is that my new mainboard has the network on-board so my next step was to disable that, and re-install the network card that I used with my old mainboard. Still the packet loss continues. No-one else is seeing this packet loss so I know it's a local issue and it seems to have nothing to do with how busy SL is. Every forum thread I've read about packet loss seems to indicate a slow PC but there is no way that my PC should be struggling with this stuff.

Does anyone have any ideas at all that I can try to solve this problem? I'm really running out of ideas here. Anyone.... please, please help. The SL withdrawal will start to set in any day now.

...oh, and just for good measure, here's a few specs on my PC...
ASUS mainboard
Intel P4 3.2GHz (1MB cache)
1GB RAM
GeForce FX 5200 (128MB)

...and I'm on a 512/uncapped cable connection which I have all to myself.
Catherine Omega
Geometry Ninja
Join date: 10 Jan 2003
Posts: 2,053
11-07-2004 19:33
So you don't have a wireless LAN anywhere in there?

Have you tried turning down your maximum bandwidth slider in Preferences/Network?
_____________________
Need scripting help? Visit the LSL Wiki!
Omega Point - Catherine Omega's Blog
Kermitt Quirk
Registered User
Join date: 4 Sep 2004
Posts: 267
11-08-2004 01:32
Nope... no wireless. ATM it's cable modem connected directly to a 10/100 network card in my Windows PC that's running SL.
Also, I just tried changing the bandwidth and that didn't make any difference except to make my bandwidth meter start racing into the red also (which makes sense since that's effectively trying to squeeze more data down a smaller pipe)
Thanx for the suggestions though.

I keep coming back to it being some sort of hardware conflict, since it worked fine before I replaced the mainboard and CPU. That's why I thought the Intel Chipset error was a likely culprit. Also SL seems to be the only network app that's having problems.
Morgaine Dinova
Active Carbon Unit
Join date: 25 Aug 2004
Posts: 968
11-09-2004 05:33
Kermitt, take a quick look in

Application Data->SecondLife->logs->SecondLife.log file

just in case the client is logging any issues with UDP packet loss or sequencing.

I've only ever seen one such thing in there, an out-of-sequence packet. Maybe it's logging more networking problems for you.
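A quick way to do that check is to count the suspicious lines rather than read the whole log by eye. The sketch below is only an illustration: the exact wording of SecondLife.log entries isn't shown in this thread, so the patterns and the function names (`count_network_issues`, `scan_log`) are assumptions that would need adjusting to whatever the log really contains.

```python
import re
from collections import Counter

# Hypothetical patterns: the real wording of SecondLife.log entries is not
# shown in this thread, so adjust these to match what your log contains.
PATTERNS = {
    "out_of_sequence": re.compile(r"out[- ]of[- ](sequence|order)", re.IGNORECASE),
    "packet_loss": re.compile(r"packet loss|lost packet|dropped packet", re.IGNORECASE),
}


def count_network_issues(lines):
    """Count how many lines match each of the patterns above."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def scan_log(path):
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with open(path, errors="replace") as handle:
        return count_network_issues(handle)
```

Calling `scan_log` with the path to your own SecondLife.log returns a `Counter`, so comparing a "good" session against a "bad" one is just a matter of comparing two sets of counts.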
_____________________
-- General Mousebutton API, proposal for interactive gaming
-- Mouselook camera continuity, basic UI camera improvements
Meiyo Sojourner
Barren Land Hater
Join date: 17 Jul 2004
Posts: 144
11-09-2004 05:46
Be careful tho. The log files can be misleading... apparently even for LL support. My GF recently spent a month not being able to get onto SL because everyone kept telling her that her problem was her connection.. contact her ISP etc. I studied the logs quite a bit and tried many things. But the logs all showed that the client was losing all the packets once it came around the time that the clothing and such was being downloaded. She was using a graphics card that wasn't the best in the world so we upgraded that and her mobo. The new mobo she got was pretty much the exact same she had except the new one had an AGP slot... so I feel pretty safe saying that the issue was with the graphics card. Everything is totally fine now and nothing was changed as far as her connection goes.

-Meiyo
_____________________
I was just pondering the immortal words of Socrates when he said...
"I drank what??"
Kermitt Quirk
Registered User
Join date: 4 Sep 2004
Posts: 267
11-10-2004 00:33
Well I'm definitely getting out-of-sequence packets. Plenty of 'em. Couldn't see anything in the logs that looked too private so I've attached it. It's a log of about 15 mins in SL, and I was just cruising around and trying to get some downloading happening. Maybe you can see something in there that I can't, since I really don't have anything to compare it to. Packet loss didn't seem quite so bad tonight in that it was fluctuating a bit more than usual instead of just staying solid red, but still way too slow.

As for my graphics card, that's the same one I've had since the day I joined SL. It didn't get replaced in the rebuild. It's not the flashest card in the world, but I certainly wouldn't call it old: GeForce FX 5200, 8x AGP, 128MB. I wouldn't rule out something like that being a problem, though maybe more on the software side than the hardware. When I rebuilt my PC it was a fresh install of Windows, and I d/l'd all the latest drivers for everything. I know the graphics driver would have been newer (I'm using Nvidia Detonator drivers) and I also moved DirectX up from 9.0b to 9.0c. Not to mention the fact that I also put XP SP2 on for the first time, and who knows what that could've done.
Catherine Omega
Geometry Ninja
Join date: 10 Jan 2003
Posts: 2,053
11-10-2004 00:41
So this only happened after you'd reinstalled your computer? What motherboard do you have? What version of chipset drivers are you running?
_____________________
Need scripting help? Visit the LSL Wiki!
Omega Point - Catherine Omega's Blog
Kermitt Quirk
Registered User
Join date: 4 Sep 2004
Posts: 267
11-10-2004 04:52
Yip it only happened after I reinstalled. I had to rebuild my PC and replaced just mainboard, CPU and a fresh HD. Mainboard is an ASUS P4P800 SE.
The chipset drivers you ask about, I presume you mean the Intel Chipset drivers that I referred to in my original post? They're 6.2.1.1001, which should be the latest. I d/l'd the latest version of those just after I reinstalled SL for the first time after the rebuild because it was complaining they were old.
Morgaine Dinova
Active Carbon Unit
Join date: 25 Aug 2004
Posts: 968
Does a single lost UDP packet freeze the stream?
11-10-2004 07:40
TCP provides a sliding and overlapped data/ACK window into a byte-oriented data stream. LL's UDP-based implementation presumably employs something very similar so that single lost UDP packets don't freeze all further streaming until the mishap is resolved.

If we assume that LL does this already, then multiple UDP data/ACK pairs will be in transit concurrently --- which is great. But if this is so, then the sliding window doesn't seem to be wide enough.

Does anyone know whether the client-server UDP protocol is already windowed, and if so, what the window size is?

Edit: on reflection, this wasn't a good place to post the question.
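
To make the windowing point concrete, here is a toy model, not LL's actual protocol (which nobody in this thread has confirmed). It assumes selective retransmission, that a loss is noticed within one round trip, and that a dropped packet succeeds on its retry. The function name and parameters are made up for illustration.

```python
def round_trips_needed(total_packets, window, lost_first_try=frozenset()):
    """Round trips needed to deliver every packet in a toy windowed protocol.

    Each round trip the sender transmits up to `window` of the oldest
    unacknowledged packets. A sequence number in `lost_first_try` is dropped
    on its first transmission and delivered when it is retried.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    pending = list(range(total_packets))
    already_lost = set()
    rounds = 0
    while pending:
        rounds += 1
        delivered = set()
        for seq in pending[:window]:
            if seq in lost_first_try and seq not in already_lost:
                already_lost.add(seq)  # dropped once, retried next round
            else:
                delivered.add(seq)
        pending = [seq for seq in pending if seq not in delivered]
    return rounds
```

In this model, with a window of 1 (stop-and-wait) every packet costs a full round trip, while a wider window lets the rest of the stream keep flowing around a single lost packet. On a long path with a high round-trip time, the difference in round trips translates directly into load time.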
_____________________
-- General Mousebutton API, proposal for interactive gaming
-- Mouselook camera continuity, basic UI camera improvements
Kermitt Quirk
Registered User
Join date: 4 Sep 2004
Posts: 267
11-11-2004 02:10
Strangest thing has happened today that now leads me to believe this is not a hardware problem. Today I came home from work much earlier than usual, and logged on to SL. For at least 2 hours it worked perfectly. Not a single bit of packet loss. I assumed it must've been something in the update and went about my building. Then, just after 7pm (1am SL time) it started again. From there on it was exactly the same as I've been describing.
I did a trace route to some of the Linden servers and there appears to be packet loss in Los Angeles and then again on the final hop to the Linden servers. I'm really not a network guy so I couldn't take too much from the results I was seeing though.
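
For anyone else reading traceroute results, a rough way to summarise them is sketched below. The data layout (a list of hop labels with per-probe round-trip times, `None` meaning no reply) and both function names are assumptions for illustration. One caveat I'm fairly sure of: routers often deprioritise or rate-limit the replies that traceroute relies on, so loss reported at a middle hop that does not carry through to the later hops may not be real forwarding loss.

```python
def hop_loss_percent(probes):
    """Percentage of probes with no reply; RTTs are in ms, None means lost."""
    if not probes:
        raise ValueError("no probes recorded for this hop")
    lost = sum(1 for rtt in probes if rtt is None)
    return 100.0 * lost / len(probes)


def first_persistent_loss_hop(trace, threshold=5.0):
    """Return the earliest hop of the trailing run of lossy hops, or None.

    `trace` is an ordered list of (label, probes) pairs. Only loss that
    continues all the way to the final hop is treated as persistent.
    """
    first = None
    for label, probes in reversed(trace):
        if hop_loss_percent(probes) >= threshold:
            first = label
        else:
            break
    return first
```

Feeding it a trace where one middle hop shows loss but the following hop is clean will skip that middle hop and point at the last stretch of the path instead.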

I will attempt to get into SL early again tomorrow and try to confirm if it's constantly occurring at the same time. If not I'll definitely be able to try it on the weekend. I'm afraid that it's looking like something well out of my control though, so maybe this means I have to just wait it out :(

Edit: Logged in again tonight at 10:50pm (4:50am SL time). Been in now for about 15 mins and again I'm getting no packet loss at all.

Another Edit: Remained in SL building for about an hour with no packet loss appearing. Also redid the trace route just before I logged out and it showed no packet loss in the areas I mentioned earlier.
Kermitt Quirk
Registered User
Join date: 4 Sep 2004
Posts: 267
11-12-2004 00:28
Next night I got in just after 5pm, and the packet loss started just after 6pm. An hour earlier than last night.
Echo Dragonfly
Surely You Jest
Join date: 22 Aug 2004
Posts: 325
11-12-2004 09:23
I too have been suffering high packet loss as of late. But, just out of curiosity, I disabled my firewall and logged on. Guess what, no packet loss whatsoever in 4hrs. of game time.
I had my firewall configured to allow SL to communicate any way it wanted, but evidently it was still blocking something. Anywhoo, just a thought.
Damien Phoenix
Second Life Resident
Join date: 30 Oct 2004
Posts: 5
11-13-2004 13:15
From: Morgaine Dinova
TCP provides a sliding and overlapped data/ACK window into a byte-oriented data stream. LL's UDP-based implementation presumably employs something very similar so that single lost UDP packets don't freeze all further streaming until the mishap is resolved.

If we assume that LL does this already, then multiple UDP data/ACK pairs will be in transit concurrently --- which is great. But if this is so, then the sliding window doesn't seem to be wide enough.

Does anyone know whether the client-server UDP protocol is already windowed, and if so, what the window size is?

Edit: on reflection, this wasn't a good place to post the question.


So basically you're asking, how much unacked data can be outstanding at a time?

Equally important would be loss detection speed and recovery. If they employ a simple timeout check, it would REALLY slow things down when loss occurs, assuming that there aren't a lot of multiple streams (i.e. there's a single stream and only, say, 256KB can be unacked at a time)
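
A back-of-the-envelope version of that, as a sketch: with a single stream, at most one window of data can be delivered per round trip, and every loss recovered by a plain timeout stalls the stream for the whole timeout. The 256KB window is the example figure above; the 200 ms round trip and 1 second timeout used in the usage note are made-up numbers, and the function names are hypothetical.

```python
def max_throughput_bytes_per_sec(window_bytes, rtt_seconds):
    """Upper bound on throughput: one full window delivered per round trip."""
    return window_bytes / rtt_seconds


def transfer_time_seconds(total_bytes, window_bytes, rtt_seconds,
                          loss_events=0, timeout_seconds=1.0):
    """Transfer time for a single stream that stalls on every timeout.

    Assumes the window is the only limit on speed and that each loss is
    only detected by waiting out `timeout_seconds`.
    """
    steady = total_bytes / max_throughput_bytes_per_sec(window_bytes, rtt_seconds)
    return steady + loss_events * timeout_seconds
```

For example, `transfer_time_seconds(1310720, 256 * 1024, 0.2)` is about one second with no loss, and each lost packet recovered by a one-second timeout adds another full second on top. That is why slow loss detection hurts far more than the raw loss percentage suggests.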

<Nerd mode off!>
Kermitt Quirk
Registered User
Join date: 4 Sep 2004
Posts: 267
11-14-2004 18:36
Well over the past few days the packet loss has come and gone, but hasn't been anywhere near as bad as I had been experiencing. I've even had weird things like spending a whole day in Olde London (where you would maybe expect some packet loss since it's been so busy) and getting no packet loss at all, then I go to Miru for a quiet game of bingo and wham! The packet loss is back. So I'm pretty sure it has nothing to do with how busy the sim is. Generally it still comes around the same time but that seems to have been varying more lately too.

While speaking to some of the other people in Bingo last night, some of them were saying that other people have been having general network problems around California, which would agree with the trace I did last week that showed packet loss in Los Angeles. So it sounds like there's either a dodgy server somewhere around there, or maybe someone is overloading something. For now at least it's not so bad that SL is unusable, so I guess I'll just have to put up with it.

The only remaining question then would be if Morgaine is right about the SL code not handling the lost packets as well as it should. The only other thing I could suggest is that SL just demands so much bandwidth compared to many other network apps and that's the only reason why it appears to suffer so badly.

Not sure if anyone has actually noticed but I am actually in Australia btw. The network trace I did showed about 18-19 hops to get to the Linden servers from here. I'd be very interested if anyone else in Aussie (especially Brisbane) has been seeing any similar problems, or even better, could do a trace route and see if they also can see packet loss around the California area.
Morgaine Dinova
Active Carbon Unit
Join date: 25 Aug 2004
Posts: 968
03-11-2005 08:30
From: Kermitt Quirk
... then I go to Miru for a quiet game of bingo and wham! The packet loss is back. So I'm pretty sure it has nothing to do with how busy the sim is.
You're almost certainly right about that.

While the UDP networking code server-side would be expected to discard queued packets under conditions of extreme congestion (for self-preservation), this is certainly not in the normal safe working area of the graph, and probably rings the pagers of the support staff and maybe even gets Philip out of bed. ;) The "normal" packet loss is a statistically expected property of all communications over the Internet, and as such, must be fully embraced within the design of any Internet-based service. I expect that this is so within SL code --- anything else would be simply incorrect, inherently.

From: Kermitt Quirk
Not sure if anyone has actually noticed but I am actually in Australia btw. The network trace I did showed about 18-19 hops to get to the Linden servers from here. I'd be very interested if anyone else in Aussie (especially Brisbane) has been seeing any similar problems, or even better, could do a trace route and see if they also can see packet loss around the California area.
Your point is extremely relevant. I've worked in the software departments of many large computer companies, ISPs, and application service providers (I'm freelance), and not once have I seen a test environment which tests the functionality over a long simulated IP path employing multiple hops. If SL truly intends to become global, its test environment needs this.
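
As a rough illustration of why hop count matters, here is a small sketch. It assumes every hop drops packets independently with the same probability, which real paths don't do (loss tends to cluster at a few congested links), so treat the numbers as a feel for the effect rather than a measurement. The function name and the 0.5% per-hop figure are made up.

```python
def end_to_end_loss(per_hop_loss, hops):
    """Probability a packet is lost somewhere along a path of `hops` hops.

    Assumes each hop drops packets independently with probability
    `per_hop_loss` (a simplification, see the note above).
    """
    if not 0.0 <= per_hop_loss <= 1.0:
        raise ValueError("per_hop_loss must be between 0 and 1")
    return 1.0 - (1.0 - per_hop_loss) ** hops
```

Under that assumption, `end_to_end_loss(0.005, 19)` comes out at roughly 9%, against 0.5% for a single hop, which is the kind of gap a one-hop test lab would never show.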
_____________________
-- General Mousebutton API, proposal for interactive gaming
-- Mouselook camera continuity, basic UI camera improvements