Welcome to the Second Life Forums Archive

Debug worked, but now what?

Haplo Voss
Registered User
Join date: 18 Nov 2006
Posts: 137
04-18-2007 04:33
Well, I've been poring through the posts to try and find solutions, but none of them have worked for me so far, so I thought I'd try just posting my results and see if anyone has a fresh perspective....

If I disable sound.. all is hunky dory...
If I enable sound.. it's luck of the draw... could be 5 min, could be 5 hours... then total lockup.
SL is the only program of any kind I have ever run in Linux that has ever actually locked my system
up to the point I couldn't use at least an alternate session and top or use alt-ctl-backspace, or something
that would allow me to kill it and move on....

So... I suppose it's a two part question... is there a solution anyone knows of that they have been successful with?

If not, then has anyone figured out a way to run SL "in its own space" so if it dies... it doesn't take my box with it and I can move on without hanging the system? A crash isn't so bad if you can just restart SL...
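For the "just restart SL" half of that question, a small supervisor loop is one option. This is only a sketch: the function name `run_with_restarts` and the `./secondlife` launcher path in the note below are assumptions, and a wrapper like this can only recover from the client process dying. It cannot help when a driver fault hangs the whole kernel.

```python
import subprocess
import time


def run_with_restarts(command: list[str], max_restarts: int = 5,
                      delay: float = 2.0) -> int:
    """Run `command`, relaunching it after each abnormal exit.

    Returns the number of times the command was launched. The loop stops
    on a clean exit (code 0) or once `max_restarts` relaunches are used up.
    """
    launches = 0
    while True:
        launches += 1
        exit_code = subprocess.run(command, check=False).returncode
        if exit_code == 0 or launches > max_restarts:
            return launches
        # Short pause so a client that dies instantly does not spin the CPU.
        time.sleep(delay)
```

Calling `run_with_restarts(["./secondlife"])` (hypothetical launcher path) would bring the client back after a crash, up to five times.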

----------
** SuSe Linux Enterprise Desktop 10.0 (Novell)
(I've dinked with it long enough now... it's basically the same as openSuSe as far as changing values, etc. so any advice that would be helpful.. would apply here also)
** ATI Radeon 9550 / 256MB
** AC'97 on board audio (detected as VT8233/A/8235/8237 AC97 Controller)
** AMD64 - 2GHz - Running 32-bit version of SLED - Tried this to see if 64-bit OS was an issue.
** 2GB RAM
** Realtek built in ethernet
-----------

If you need any other specs just let me know. I appreciate anything anyone can help me with.

I have tried the individual sound modules in SL configuration file - ALSA, ESD, OSS - all with identical results - ESD perhaps running the longest most often before a crash.

Thank you and Take care!
- Hap
Roslyn Korobase
Registered User
Join date: 3 Mar 2007
Posts: 23
04-18-2007 07:30
Well, a complete lock like that suggests a driver fault. I think many of us have seen sound-related crashes, but that usually just takes out SL, not the kernel. No matter how badly an app behaves, it should never crash the kernel that way; if it does, it's the fault of the driver.

I assume you cannot ctrl+alt+f1 to get to a virtual terminal, and that it's completely dead?

The only way you can debug a lockup like this is with an RS-232 serial connection to a second machine, so you can capture the kernel crash output. You need to plug in an RS-232 (null-modem) cable between the two systems, boot with a kernel parameter of "console=ttyS0" and have a terminal emulator (minicom etc.) running on the other system. When the kernel crashes you *should* see a backtrace.

Another thought: with sound enabled and SL running, is it possible to switch to a virtual terminal (ctrl+alt+f1), log in and just sit there? I can't remember if this stops SL from doing anything (if you can still hear sound then it's working; I think the X server is more intelligent than that though). If you get a hard lock you should get the backtrace on the screen (if no backtrace, you can try to force one with alt+sysrq+p). It might point you in the correct direction.
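The SysRq register dump only works if the kernel has SysRq enabled, so it is worth checking before the next hang. A rough sketch, assuming the setting lives in /proc/sys/kernel/sysrq and that bit value 8 covers debugging dumps as described in the kernel's sysrq documentation (older kernels may differ); the function names are made up for illustration:

```python
from pathlib import Path

# Bit in the sysrq setting that permits debugging dumps such as the
# register dump (the `p` command). Assumed from the kernel's sysrq
# documentation; very old kernels may only know 0 and 1.
SYSRQ_ENABLE_DUMP = 0x8


def sysrq_allows_dump(value: int) -> bool:
    """Return True if the given sysrq setting permits dump commands.

    0 disables SysRq entirely, 1 enables every function, and any other
    value is treated as a bitmask of allowed function groups.
    """
    if value == 1:
        return True
    return bool(value & SYSRQ_ENABLE_DUMP)


def read_sysrq(path: str = "/proc/sys/kernel/sysrq") -> int:
    """Read the current SysRq setting from procfs (Linux only)."""
    return int(Path(path).read_text().strip())
```

On the affected box, `sysrq_allows_dump(read_sysrq())` returning False would mean the key combo is ignored, which would explain a missing backtrace without saying anything about the driver.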

regards
Angel Sunset
Linutic
Join date: 7 Apr 2005
Posts: 636
04-18-2007 08:49
On apparently totally locked systems, I have found it was X that was locked.

Getting in with ssh or rlogin enabled me to restart X from the command line.
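A quick way to tell an X hang from a whole-machine hang is to test from a second computer whether the frozen box still accepts TCP connections on its ssh port. A minimal sketch, assuming sshd is running on the default port 22; the function name and the address in the note below are hypothetical:

```python
import socket


def ssh_reachable(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        # create_connection raises OSError on refusal, timeout, or no route.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If `ssh_reachable("192.168.1.50")` (hypothetical address) still returns True while the screen is frozen, the kernel is most likely alive and only X is stuck. A False result is weaker evidence, since a firewall or a stopped sshd gives the same answer.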

That took me a while (and a few forums) to find out :D

I agree, it sounds like a driver problem. I had a lockup issue with my system (when I still had a 6800 GS card from NVidia), and I had to force the driver to use 4x AGP instead of 8x. KDE had the lockup issue in my case, SL worked fine under fvwm. Setting AGP to 4x made KDE stable - so in my case, it was a card/driver/motherboard problem.

Since it seems to be sound related, and an ATI card, I cannot offer any useful advice though :( Maybe using the settings in xorg.conf for ATI (/263/85/175959/1.html) would help?
_____________________
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Kubuntu Intrepid 8.10, KDE, linux 2.6.27-11, X.Org 11.0, server glx vendor: NVIDIA Corporation, server glx version: 1.5.2, OpenGL vendor: NVIDIA Corporation, OpenGL renderer: GeForce 9800 GTX+/PCI/SSE2, OpenGL version: 3.0.0 NVIDIA 180.29, glu version: 1.3, NVidia GEForce 9800 GTX+ 512 MB, Intel Core 2 Duo, Mem: 3371368k , Swap: 2570360k
Haplo Voss
Registered User
Join date: 18 Nov 2006
Posts: 137
04-18-2007 15:11
Thanks for the responses ...

It is in fact a complete system hang. My first experience with one...

There is no possibility of restarting the X server because you cannot get to another session via alt+ctl+f1/f2/f3/f4/f5/f6

(And alt+ctl+backspace also does not work to restart the X server via hotkey combo obviously)

I haven't tried switching to a virtual session and waiting for it to hang however... good point. I will see what I get from there.

I am assuming it is sound related - I thought it was video at first, so I have tried my current ATI 9550, a 9600, an nvidia X300 I think it was, and an nvidia GeForce (not sure what model.. took it back because it had compatibility issues period w/ linux itself)

All video cards worked great, loved the video results, couldn't complain a bit, but the same problem and same variables occurred with each one, and in some instances when the crash occurs you will get what I call the "Burp Effect" from the sound engine - the very last sound that came from the card keeps repeating sort of like a very fast broken record - and the system locks hard.

Thanks again for the info... will see what I can do.

Also remember that if I disable all sound in the config file... I never have issues of any kind.

Take care!
Hap
Kerik Rau
Registered User
Join date: 8 Mar 2007
Posts: 54
04-18-2007 18:01
You should have saved a copy of dmesg; usually, if any critical errors are reported, they will show up there.
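Since the dmesg buffer is lost on reboot, one approach is to keep copying new kernel messages to disk while SL runs. This is only a sketch under assumptions: it polls the `dmesg` command, the helper names are invented, and the "what is new" test is a simple heuristic based on the last line seen. A hard hang can still outrun the logger, so the final messages may never reach the file.

```python
import os
import subprocess
import time


def unseen_lines(previous: list[str], current: list[str]) -> list[str]:
    """Return the lines of `current` that follow the last line seen before.

    If nothing was seen yet, or the last known line has scrolled out of
    the ring buffer, the whole current snapshot is treated as new.
    """
    if not previous:
        return list(current)
    last = previous[-1]
    # Search from the end so a repeated message matches its latest copy.
    for index in range(len(current) - 1, -1, -1):
        if current[index] == last:
            return current[index + 1:]
    return list(current)


def append_durably(path: str, lines: list[str]) -> None:
    """Append lines to `path` and force them to disk before returning."""
    if not lines:
        return
    with open(path, "a", encoding="utf-8") as log:
        log.write("\n".join(lines) + "\n")
        log.flush()
        os.fsync(log.fileno())


def watch_dmesg(path: str, interval: float = 2.0) -> None:
    """Poll `dmesg` forever, saving any new kernel messages to `path`."""
    seen: list[str] = []
    while True:
        result = subprocess.run(["dmesg"], capture_output=True,
                                text=True, check=False)
        snapshot = result.stdout.splitlines()
        append_durably(path, unseen_lines(seen, snapshot))
        seen = snapshot
        time.sleep(interval)
```

The `fsync` call is the important part: without it, lines written just before a lockup could still be sitting in the page cache when the machine dies.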
Haplo Voss
Registered User
Join date: 18 Nov 2006
Posts: 137
04-19-2007 12:39
Good idea... I am not at the machine, but when I tried to retrieve any info... it appears the lockup is so quick and complete that no error trapping is even reported. All there ever is - are general s


I am going to try a good ol' SB 128 tonight and see if it helps any at all.

Thanks for the continued suggestions. Hoping to get it straightened out.. man I don't want to have to load windows just for SL!! LOL

Take care,
John
Haplo Voss
Registered User
Join date: 18 Nov 2006
Posts: 137
04-20-2007 06:09
It was the soundcard...

I disabled on-board sound, of course, and put in yet another PCI soundcard, this one an actual, straight-from-the-factory Creative Labs 128, not a cheapy knockoff.

Set it up, started up SL, no problems since whatsoever.

Thanks again for the input and info.

-Hap
Morgaine Dinova
Active Carbon Unit
Join date: 25 Aug 2004
Posts: 968
X server woes and future
04-20-2007 08:21
As a general point, Angel Sunset is right that X server lockup is often mistaken for machine lockup (although not in this case). The problem is that X grabs all responsibility for mouse and keyboard handling from the VT when it starts up, so you can't switch back to the VT console nor anything else by k/b or mouse once X has locked ... and it does, occasionally, when you're playing with ambitious OpenGL that skirts the bleeding edge. That's a problem that's midway between Xorg and nVidia/ATI, so will probably not be resolved in the near term. And X is too complex to be entirely bug-free.

The problem is compounded by the fact that the X server (Xorg's at least) has no way of being restarted without losing the client connections to its existing instance. So, even if you realize that it's just an X hang and not a machine hang, and you ssh into the box from another one, unfortunately you have no option but to lose all existing clients when you restart Xorg, since you have to kill the old X first.

Ironically, this is often not an issue because when Xorg locks up, it is frequently so solidly wedged that you can't kill it with any signal number at all, so the only remedy is a reboot .... which of course loses the clients anyway. :P I've experienced instances of bleeding edge graphics totally turning off the graphics card even on a fully open-source Matrox driver, and the effect of this was to totally lock up the X server in a tight 100% CPU loop with no response to k/b or mouse *nor* signals, while the rest of the system was still running and accessible, if slowly. Bad.

It's not an ideal situation, but it won't be resolved until X is redesigned at its core around a multi-server process model that doesn't keep all its eggs in one basket. And for good measure, a semi-persistent model that picks up old orphaned X clients wouldn't go amiss either, although that's a tough nut to crack.

Just general observations about our ropey X situation, which in some respects harks back to MS Windows' old monolithic structure which we quite rightly condemned. Well, X needs modernizing too.

Glad to hear that you resolved your issue though, Haplo. :-)
_____________________
-- General Mousebutton API, proposal for interactive gaming
-- Mouselook camera continuity, basic UI camera improvements