Q: How many Lindens does it take to make a local cache work?

Introvert Petunia
over 2 billion posts
Join date: 11 Sep 2004
Posts: 2,065
02-22-2006 18:46
A: More than they have.

Okay, the link pointed to may be a little technical, so let me use a more mundane analogy. Take your rolodex, give the cards a really good shuffle, put them back in the rolodex, and now get me the phone number for Tony's Pizza. How long will your search take? On average, you'll have to look at half the cards to find Tony's. The larger your rolodex, the longer it will take. Unhappily, this analogy is almost exactly what the link describes.

The other part of the linked post says that if only they put all the "A"s together in the "A" section of the rolodex and so on, even if you kept them unordered within their section, it would still take you only 1/26th of the time to find Tony's Pizza.

So, if that's true, my long-standing claim that the local cache is slower than no cache at all is likely correct. Furthermore, the larger you make your cache, the worse the performance will be.
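
Here's a minimal Python sketch of that rolodex argument, with made-up names and counts rather than anything from the real cache; it compares one full linear scan against scanning a single first-letter bucket:

```python
import random
import string
import time

# Toy illustration of the rolodex argument: an unsorted list needs a linear
# scan (about N/2 comparisons on average), while bucketing by first letter
# cuts the search down to roughly N/26 entries.
N = 200_000
names = ["".join(random.choices(string.ascii_uppercase, k=8)) for _ in range(N)]
target = names[-1]

buckets = {}
for name in names:                      # "put all the A's in the A section"
    buckets.setdefault(name[0], []).append(name)

t0 = time.perf_counter()
found_linear = target in names          # shuffled rolodex: scan everything
t1 = time.perf_counter()
found_bucket = target in buckets[target[0]]  # sectioned rolodex: one bucket
t2 = time.perf_counter()

print(f"linear: {t1 - t0:.4f}s  bucketed: {t2 - t1:.4f}s  "
      f"(both found: {found_linear and found_bucket})")
```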

Hey Linden Lab, this stuff was old hat in 1973 when Knuth wrote the bible on it. I'd be happy to send you a copy gratis if you need one.
Khamon Fate
fategardens.net
Join date: 21 Nov 2003
Posts: 4,177
to correct the gender
02-22-2006 19:33
Ha ha ha, I was reading "take your Rolex" and thinking what the hell is he talking about. They could borrow my copy of Knuth, except I don't think the old tattered pages would survive the mailing.

The local cache has never worked well. I've always had a better experience clearing it every day.

A better question is "how many residents does it take to help LL build a better caching algorithm?"
_____________________
Visit the Fate Gardens Website @ fategardens.net
Iron Perth
Registered User
Join date: 9 Mar 2005
Posts: 802
02-22-2006 19:39
Great Post, Introvert.
Siggy Romulus
DILLIGAF
Join date: 22 Sep 2003
Posts: 5,711
02-22-2006 19:40
42
_____________________
The Second Life forums are living proof as to why it's illegal for people to have sex with farm animals.

From: Jesse Linden
I, for one, am highly un-helped by this thread
Adam Zaius
Deus
Join date: 9 Jan 2004
Posts: 1,483
02-22-2006 19:40
I've always wondered why LL just doesn't harness the local filesystem. It's designed /exactly/ for this kind of storage and retrieval, and in most cases does a good job.

Sure it's easier to pull things out of the cache, but GLIntercept being around kind of makes it a redundant point.
_____________________
Co-Founder / Lead Developer
GigasSecondServer
Introvert Petunia
over 2 billion posts
Join date: 11 Sep 2004
Posts: 2,065
02-23-2006 02:52
I just wanted to let you know that I'm still flabbergasted. Isn't that a fun word to say?

I expected Lee Linden to jump in here with one of his patented cardboard cutout denials. I kind of miss his "you don't know what the hell you are talking about, you ignorant customer!" abuse. Why hast thou forsaken me?
Mack Echegaray
Registered Snoozer
Join date: 15 Dec 2005
Posts: 145
02-23-2006 06:36
From: Adam Zaius
I've always wondered why LL just doesn't harness the local filesystem. It's designed /exactly/ for this kind of storage and retrieval, and in most cases does a good job.


Filesystems like NTFS/FAT/ext3 are incredibly inefficient at handling large numbers of small files. Unbelievably so, in fact. This is why so many programs invent their own mini filesystem-in-a-file (e.g. OLE Structured Storage), because the regular FS just can't cope with, say, 10,000 files of a KB or two each being handled at once.

It's this sort of problem that motivated Hans Reiser to design ReiserFS, which has as one of its unique selling points the ability to handle huge numbers of very small files, very efficiently (http://www.namesys.com/).

If it's actually true that the cache, when it's full, contains something like 26,000 assets, then it doesn't surprise me at all that this causes huge problems. Designing a filing system that can handle such large numbers of small files with high performance is extremely hard. You can't just use the operating system, because OS filing systems can't cope with it. So you have to invent your own.

It *can* be solved - Reiser4 managed it pretty well - but it's difficult. Web browsers like Firefox default the cache size to only 50 MB, and though I use it all the time, my Firefox cache only has 5 MB of data in it - but I've seen people say they have an SL cache up to a gigabyte in size!

Oh, something else that confuses me: the original post said "I have a theory why it is slow when full and fast when empty". This doesn't sound like hard facts about how the cache works. The guy himself said it was a theory. Maybe he's right! None of us know. But jumping to conclusions based on incomplete evidence is bad.

Final thing, am I the only one who is a bit tired of Introvert Petunia's constant negativity? OK, the software isn't working well for you, we get the picture. Hanging around and flaming the 'Lab probably won't magically make their team more productive. If I were you, I'd relax and go do something else for a few months. Come back in the summer and see if the new versions have better performance - if not, too bad; if so, then great!
Khamon Fate
fategardens.net
Join date: 21 Nov 2003
Posts: 4,177
02-23-2006 06:50
Adam said "GLIntercept" so this thread has to be locked. Where are the mods? Lock this thread now before anybody gets any bright ideas.
_____________________
Visit the Fate Gardens Website @ fategardens.net
Mack Echegaray
Registered Snoozer
Join date: 15 Dec 2005
Posts: 145
02-23-2006 09:31
And actually it looks like they're using Berkeley DB, which I think has pretty good performance in general ... it certainly uses some quite sophisticated algorithms.
FlipperPA Peregrine
Magically Delicious!
Join date: 14 Nov 2003
Posts: 3,703
02-23-2006 11:04
I posted a bunch of ideas I had on the topic here, including basically making the local hard drive cache double as a backup system:

/108/12/81593/1.html
_____________________
Peregrine Salon: www.PeregrineSalon.com - my consulting company
Second Blogger: www.SecondBlogger.com - free, fully integrated Second Life blogging for all avatars!
Polka Pinkdot
Potential Slacker
Join date: 4 Jan 2006
Posts: 144
02-23-2006 11:18
From: Mack Echegaray
Filesystems like NTFS/FAT/ext3 are incredibly inefficient at handling large numbers of small files. Unbelievably so, in fact. This is why so many programs invent their own mini filesystem-in-a-file (e.g. OLE Structured Storage), because the regular FS just can't cope with, say, 10,000 files of a KB or two each being handled at once.


The most common (and easiest) solution to that problem is just to create folders in your cache and load each folder up based on the asset tag or some other hash. So, if a tag looks like 32829-02392-32824-298492, for example, you would create, say, 100 subdirectories named 00 to 99; this asset would go in directory 92 based on the asset id.

With 26,000 assets, the filesystem would only have to deal with 260 assets per directory on average, which is well within the range of what modern filesystems can do (even primitive filesystems like FAT don't break a sweat at that range).

That said, I'm thinking LL probably already does this and the linear searches they were talking about were filesystem directory lookups, which I could believe. 26,000 objects in a single directory is a major performance problem for most current operating systems. However, the solution I suggested above is not exactly new; people have been doing it for ages on mail and news systems. It doesn't even require the extra complexity of a database for good performance; the filesystem itself works great in this case (especially since assets don't change and only one program accesses them at any given time, so we don't have the kind of problems you normally need a "real" database to solve).
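
A minimal Python sketch of the fan-out scheme described above; the cache root, asset id, and data are made up for illustration:

```python
import os
import shutil

# Sketch of the directory fan-out described above: file each asset into a
# subdirectory named after the last two characters of its id.
CACHE_ROOT = "sl_cache_sketch"

def bucket_for(asset_id):
    # "...-298492" -> directory "92", as in the example above. Hashing the id
    # instead would spread entries more evenly if ids aren't uniform.
    return asset_id[-2:]

def cache_path(asset_id):
    directory = os.path.join(CACHE_ROOT, bucket_for(asset_id))
    os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, asset_id)

def put(asset_id, data):
    with open(cache_path(asset_id), "wb") as f:
        f.write(data)

def get(asset_id):
    try:
        with open(cache_path(asset_id), "rb") as f:
            return f.read()
    except FileNotFoundError:
        return None

put("32829-02392-32824-298492", b"fake texture bytes")
print(get("32829-02392-32824-298492"))
shutil.rmtree(CACHE_ROOT)    # remove the demo directory again
```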
Iron Perth
Registered User
Join date: 9 Mar 2005
Posts: 802
02-23-2006 13:32
From: Mack Echegaray
And actually it looks like they're using Berkeley DB, which I think has pretty good performance in general ... it certainly uses some quite sophisticated algorithms.


Yeah, I suspect they're using something. The idea that they're doing O(N) lookups is a bit unbelievable.
Introvert Petunia
over 2 billion posts
Join date: 11 Sep 2004
Posts: 2,065
02-23-2006 14:15
From: Mack Echegaray
Final thing, am I the only one who is a bit tired of Introvert Petunia's constant negativity? OK, the software isn't working well for you, we get the picture. Hanging around and flaming the 'Lab probably won't magically make their team more productive. If I were you, I'd relax and go do something else for a few months. Come back in the summer and see if the new versions have better performance - if not, too bad; if so, then great!
Agreed in full; I say we burn the fucker! Or maybe just put him on your ignore list? Let me know if you need help managing your forum ignore list.
Khamon Fate
fategardens.net
Join date: 21 Nov 2003
Posts: 4,177
02-23-2006 14:27
I'm sorry Introvert, did you say something?
_____________________
Visit the Fate Gardens Website @ fategardens.net
Siggy Romulus
DILLIGAF
Join date: 22 Sep 2003
Posts: 5,711
02-23-2006 14:29
From: Mack Echegaray

Final thing, am I the only one who is a bit tired of Introvert Petunia's constant negativity?


Yep, must be the only one.

Although Intro's posts are usually a lil negative in nature, they generally contain info that is useful. This sets them apart from 'this sucks, bah!' posts.

They are 'this sucks - here is a reason why, maybe xyz can solve the problem' -- which is why I classify 99.9% of his posts as constructive criticism.

In a land of cheerleading I find Introvert's posts similar to an anchoring rock in an ocean of diarrhea.

If anything, I find the 'yah yah everything is wonderful and if you don't like the taste of sand in your mouths go away for a month' posts far more annoying.
_____________________
The Second Life forums are living proof as to why it's illegal for people to have sex with farm animals.

From: Jesse Linden
I, for one, am highly un-helped by this thread
Starax Statosky
Unregistered User
Join date: 23 Dec 2003
Posts: 1,099
02-23-2006 14:38
From: Siggy Romulus
Yep, must be the only one.

Although Intro's posts are usually a lil negative in nature, they generally contain info that is useful. This sets them apart from 'this sucks, bah!' posts.

They are 'this sucks - here is a reason why, maybe xyz can solve the problem' -- which is why I classify 99.9% of his posts as constructive criticism.

In a land of cheerleading I find Introvert's posts similar to an anchoring rock in an ocean of diarrhea.

If anything, I find the 'yah yah everything is wonderful and if you don't like the taste of sand in your mouths go away for a month' posts far more annoying.



That's why I learnt to love Blaze. I realized he was just filling a role. We're all just filling a role on these forums. Somebody has to be negative and somebody has to be positive. It's the way we humans work. You take Introvert away and somebody else will take his role.

In fact, if we all secretly agreed to be positive, then we'd see Mack slowly become more negative. But Mack himself is just playing the role of kicking Introvert's ass. There would've been a few out there that were ready to jump into the role; Mack just happened to be the first.

Now on with the show!!!!


YOU'RE FUCKING BASTARDS!!!
Mack Echegaray
Registered Snoozer
Join date: 15 Dec 2005
Posts: 145
02-23-2006 15:53
From: Polka Pinkdot
With 26,000 assets, the filesystem would only have to deal with 260 assets per directory on average, which is well within the range of what modern filesystems can do (even primitive filesystems like FAT don't break a sweat at that range).


I think you're forgetting the overhead of the directories themselves ....

From: someone
That said, I'm thinking LL probably already does this and the linear searches they were talking about were filesystem directory lookups, which I could believe. 26,000 objects in a single directory is a major performance problem for most current operating systems.


Well, if you look at your own cache directory, there aren't that many files in there at all. It's all stored in "data.db2.x.????" files, where the question marks are some number.
Siobhan OFlynn
Evildoer
Join date: 19 Aug 2003
Posts: 1,140
02-23-2006 16:43
You know what? Sometimes I actually feel bad for Linden Labs! It must be hell having such smart (ass) clients :p :D

I love the SL forums, I really do. Thanks for making me laugh today :)
_____________________
From: Starax Statosky
Absolute freedom is heavenly. I'm sure they don't have a police force and resmods in heaven.


From: pandastrong Fairplay
omgeveryonegetoutofmythreadrightnowican'ttakeit


From: Soleil Mirabeau
I'll miss all of you assholes. :(
Argent Stonecutter
Emergency Mustelid
Join date: 20 Sep 2005
Posts: 20,263
02-24-2006 07:07
From: Adam Zaius
I've always wondered why LL just doesn't harness the local filesystem. It's designed /exactly/ for this kind of storage and retrieval, and in most cases does a good job.
Actually, it doesn't. There are dozens of open source projects out there that started out using the local filesystem for storage and ended up replacing it with a database, because once you get hundreds of thousands of objects the local filesystem becomes a major bottleneck unless you do something like running it on a RAMdisk.

On the other hand, there *are* a lot of good open source databases they could use, including the one LL is using for the asset server itself, as well as ones like Metakit and SQLite that are designed for single-process access.
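
For illustration, a minimal sketch of that single-file-database idea using SQLite, one of the libraries Argent names; the file and table names are invented, and this is not LL's actual cache format:

```python
import sqlite3

# Sketch of a one-file asset cache backed by SQLite (which is designed for
# single-process use, as noted above).
db = sqlite3.connect("asset_cache_sketch.db")
db.execute("""CREATE TABLE IF NOT EXISTS assets (
                  id   TEXT PRIMARY KEY,   -- asset UUID
                  data BLOB NOT NULL       -- raw texture/object bytes
              )""")

def put(asset_id, blob):
    db.execute("INSERT OR REPLACE INTO assets (id, data) VALUES (?, ?)",
               (asset_id, blob))
    db.commit()

def get(asset_id):
    row = db.execute("SELECT data FROM assets WHERE id = ?",
                     (asset_id,)).fetchone()
    return row[0] if row else None

put("32829-02392-32824-298492", b"fake texture bytes")
print(len(get("32829-02392-32824-298492")), "bytes cached")
```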
Argent Stonecutter
Emergency Mustelid
Join date: 20 Sep 2005
Posts: 20,263
02-24-2006 07:10
From: Polka Pinkdot
The most common (and easiest) solution to that problem is just to create folders in your cache and load each folder up based on the asset tag or some other hash. So, if a tag looks like: 32829-02392-32824-298492 for example you would create say 100 subdirectories named 00 to 99. This asset would go in directory 92 based on the asset id.
That's what squid does. It's still got more overhead than a real database, and INN has gone to a big circular file for its data. On Windows NT it would also cause insane amounts of fragmentation.
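
A toy sketch of the circular-file idea, loosely in the spirit of what INN does; the on-disk layout, sizes, and keys here are all invented for illustration:

```python
import struct

# Toy fixed-size circular cache file: records are appended in a ring and the
# oldest entries are overwritten when the write position wraps around.
HEADER = struct.Struct("<I16s")          # record length + 16-byte key

class CircularCacheFile:
    def __init__(self, path, size=1024 * 1024):
        self.size = size
        self.index = {}                  # key -> (offset, data length)
        self.f = open(path, "w+b")
        self.f.truncate(size)
        self.pos = 0

    def put(self, key, data):
        record = HEADER.pack(len(data), key) + data
        if self.pos + len(record) > self.size:
            self.pos = 0                 # wrap around, overwriting the oldest data
        start, end = self.pos, self.pos + len(record)
        # Forget any cached entries that this write is about to overwrite.
        self.index = {k: (o, n) for k, (o, n) in self.index.items()
                      if o + HEADER.size + n <= start or o >= end}
        self.f.seek(start)
        self.f.write(record)
        self.index[key] = (start, len(data))
        self.pos = end

    def get(self, key):
        if key not in self.index:
            return None
        offset, length = self.index[key]
        self.f.seek(offset + HEADER.size)
        return self.f.read(length)

cache = CircularCacheFile("cyclic_cache_sketch.dat", size=64 * 1024)
cache.put(b"asset-0000000001", b"some texture bytes")
print(cache.get(b"asset-0000000001"))
```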
Khamon Fate
fategardens.net
Join date: 21 Nov 2003
Posts: 4,177
02-24-2006 07:20
Fragmentation is the enemy. I will note that SL runs smoother when I regularly defrag the NTFS drive where the cache is stored on my Windows machine. It consistently runs smoother on ext2 and ext3 partitions. That seems odd considering that the Linux client currently uses only 16M of video memory, so it is constantly loading and reloading textures from the cache file.
_____________________
Visit the Fate Gardens Website @ fategardens.net
Doc Nielsen
Fallen...
Join date: 13 Apr 2005
Posts: 1,059
02-24-2006 09:24
From: Introvert Petunia

I expected Lee Linden to jump in here with one of his patented cardboard cutout denials. I kind of miss his "you don't know what the hell you are talking about, you ignorant customer!" abuse. Why hast thou forsaken me?


Ah, obviously you haven't heard about the recent upgrade. Cardboard cutout denial (V0.8a), developer code name Lee Linden, has been upgraded to Cardboard cutout denial (V1.00b), developer code name Zero Linden, which features enhanced 'smart ass' technology, an improved denial module and a faster bs response for an improved customer experience.
_____________________
All very well for people to have a sig that exhorts you to 'be the change' - I wonder if it's ever occurred to them that they might be something that needs changing...?
Kyushu Tiger
Registered User
Join date: 12 Nov 2005
Posts: 92
02-24-2006 12:23
So would someone be better off using a smaller cache, or even no cache at all? I have seen conflicting reports on this...


Kyushu
Argent Stonecutter
Emergency Mustelid
Join date: 20 Sep 2005
Posts: 20,263
02-24-2006 13:50
From: Khamon Fate
Fragmentation is the enemy. I will note that SL runs smoother when I regularly defrag the NTFS drive where the cache is stored on my Windows machine. It consistently runs smoother on ext2 and ext3 partitions. That seems odd considering that the Linux client currently uses only 16M of video memory, so it is constantly loading and reloading textures from the cache file.
Just about any modern UNIX does so much better a job of managing virtual memory and file access than NT that it's no surprise to me at all.

Defragmentation? That's something you used to have to worry about back in the '80s, isn't it?
Introvert Petunia
over 2 billion posts
Join date: 11 Sep 2004
Posts: 2,065
02-24-2006 15:19
From: Mack Echegaray
And actually it looks like they're using Berkeley DB, which I think has pretty good performance in general ... it certainly uses some quite sophisticated algorithms.
Have you any reason to believe this other than the index.db2* and data.db2* file names? I just checked, and they do not appear to be in any Berkeley DB format from Berkeley DB v2 to v4.3.
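
For anyone who wants to repeat that check, here's a small sketch that sniffs for Berkeley DB magic numbers; the constants and offset are taken from db.h as I understand it (Btree 0x053162, Hash 0x061561, Queue 0x042253, at byte offset 12 of the metadata page), so treat them as assumptions:

```python
import struct
import sys

# Berkeley DB magic numbers (assumed from db.h): the metadata page stores the
# magic at byte offset 12, in the byte order of the machine that created it.
BDB_MAGICS = {0x053162: "Btree", 0x061561: "Hash", 0x042253: "Queue"}

def sniff_bdb(path):
    with open(path, "rb") as f:
        page = f.read(16)
    if len(page) < 16:
        return None
    for fmt in ("<I", ">I"):             # try little- and big-endian
        magic = struct.unpack(fmt, page[12:16])[0]
        if magic in BDB_MAGICS:
            return BDB_MAGICS[magic]
    return None

if __name__ == "__main__":
    for name in sys.argv[1:]:            # e.g. python sniff.py data.db2.x.*
        kind = sniff_bdb(name)
        print(name, "->", kind or "no recognizable Berkeley DB magic")
```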