Welcome to the Second Life Forums Archive

Search is definitely not right here... Try this at home...

Rygel Ryba
Registered User
Join date: 12 Feb 2008
Posts: 254
06-30-2009 07:29
Okay. So I was trying to run some tests to find out REALLY how much of the grid is going to be affected. The 5% number just never seemed right to me.

So, I decided I would search a few things and see the difference in results. I started with "Animation". I figured I could search it with adult off and get a number, then search it with adult on - and it would have the original number plus all the places that have been filtered by either being on adult land or by keyword filtering. Not a perfect answer but I might be able to get some ballpark numbers.

But here is the silly thing...

PG, Mature only: "Results 1 - 10 of about 6920 for animation (0.102364 seconds)"
PG, Mature, Adult: "Results 1 - 10 of about 5840 for animation (0.124952 seconds)"

How can THAT be? There are fewer results when I ADD to the possible pool and more when I restrict the search?

Something is wonked.
Elanthius Flagstaff
Registered User
Join date: 30 Apr 2006
Posts: 1,534
06-30-2009 07:32
The clue is in the word "about".
_____________________
Visit http://ninjaland.net for mainland and covenant rentals or visit our amazing land store at Steamboat (199, 56).

Also, we pay L$0.15/sqm/week for tier donated to our group and we rent pure tier to your group for L$0.25/sqm/week.

Free L$ for Everyone - http://ninjaland.net/tools/search-scumming/
Rygel Ryba
Registered User
Join date: 12 Feb 2008
Posts: 254
06-30-2009 07:33
Should still be at least equal to the one with tighter restrictions. Never less.
Marcush Nemeth
Registered User
Join date: 3 Apr 2007
Posts: 402
06-30-2009 07:37
Unless the query times out after only a few milliseconds, and any results after that are omitted. In which case, a PG search would be faster, since it'll have a shorter index list, while an Adult search takes longer, and will have more garbage between the unfiltered results. Just depends how the database was set up and how the queries are assembled.
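
A toy sketch of the cutoff idea described above. Everything here is an assumption made for illustration: the index layout, the entry fields, and a fixed budget of examined entries standing in for a time limit. It is not how the actual search engine works; it only shows how a scan that stops early can count fewer hits in the larger pool.

```python
def count_with_budget(index, term, budget):
    """Count entries containing `term`, examining at most `budget` entries."""
    hits = 0
    for examined, entry in enumerate(index):
        if examined >= budget:
            break  # stand-in for the query hitting its time limit
        if term in entry["text"]:
            hits += 1
    return hits


# Hypothetical data: every PG/Mature entry matches the term,
# but only every fourth Adult entry does.
pg_mature = [{"rating": "PG", "text": "animation shop"} for _ in range(100)]
adult = [
    {"rating": "Adult", "text": "animation club" if i % 4 == 0 else "club"}
    for i in range(100)
]

# The combined index interleaves the two lists, so the non-matching
# Adult entries sit "between" the PG/Mature matches.
combined = [entry for pair in zip(pg_mature, adult) for entry in pair]

BUDGET = 120  # hypothetical cutoff, in entries examined

restricted = count_with_budget(pg_mature, "animation", BUDGET)
unrestricted = count_with_budget(combined, "animation", BUDGET)
full_scan = count_with_budget(combined, "animation", len(combined))

print(restricted, unrestricted, full_scan)
```

With this made-up data the restricted scan finishes inside the budget, while the combined scan is cut off partway through, so it reports fewer hits even though a full scan of the combined index would find more.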
Rygel Ryba
Registered User
Join date: 12 Feb 2008
Posts: 254
06-30-2009 10:57
Ahhh. yep. That makes sense... the time is bigger on the bigger search. Cool. Convenient how they make it impossible for us to test their math, huh? lol
DanielRavenNest Noe
Registered User
Join date: 26 Oct 2006
Posts: 1,076
06-30-2009 11:21
I think Marcush has hit on the right answer. I had noticed before that for words with not a lot of results, doing each rating singly and then all three, the totals add up. But doing a search with many results, they do not.
Phil Deakins
Prim Savers = low prims
Join date: 17 Jan 2007
Posts: 9,537
06-30-2009 12:41
I don't see a timeout coming into it at all. In fact, I'm sure it doesn't. The reason for the unexpected numbers is probably along these lines...

When processing a query, the Google (web) engine goes through the database until it has a large enough number of results - called a results set. Like the GSA (Google Search Appliance), it only lists 1000 results but the results set is much larger. When Google started out, the size of the results set was about 40,000. It then sorts the results set according to some ranking factors and takes the top 1000 for display (if people want to look that deep).

The reason they use the word "about" is because the system doesn't know how many matching results there are in the database, because it stops extracting results when it has a sufficient number. The system knows how many entries it looked at to get the number of results in the results set and, from that, it calculates how many there are likely to be in the whole database. That's why the number ends in 0, except for tiny numbers of results. In short, the "about" number is only an estimate of the total number of matches in the database, extrapolated from how many entries had to be looked at to fill the results set.
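
The extrapolation described above can be sketched roughly like this. It is a minimal illustration under stated assumptions, not Google's or the GSA's actual method: the function name, the results set size of 50, and the rounding to the nearest ten are all made up for the example.

```python
def estimate_total(index, term, results_set_size):
    """Collect matches until the results set is full, then extrapolate."""
    matches = 0
    examined = 0
    for entry in index:
        examined += 1
        if term in entry:
            matches += 1
            if matches == results_set_size:
                break  # enough results collected; stop scanning
    if matches < results_set_size:
        # The whole index was scanned, so the count is exact.
        return matches
    # Scale the observed hit rate up to the whole index (assumed rule),
    # then round to the nearest ten, which is why the figure ends in 0.
    estimate = matches / examined * len(index)
    return int(round(estimate, -1))


# Hypothetical index of 1000 entries: every fourth one mentions
# "animation", and only seven mention "poseball".
index = [
    "animation" if i % 4 == 0 else ("poseball" if i % 150 == 1 else "other")
    for i in range(1000)
]

common = estimate_total(index, "animation", 50)  # extrapolated "about" figure
rare = estimate_total(index, "poseball", 50)     # exact, scan never stops early

print(common, rare)
```

In this sketch the common term gets a rounded estimate while the rare term gets an exact count, which matches the observation that only tiny result counts don't end in 0. Since the estimate depends on where the scan happens to stop, two queries over differently filtered sets of entries need not produce numbers that are consistent with each other.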

I don't know how the database works for adult and non-adult pages, but the pages do have a recently added meta tag that looks like it is used for filtering. My best guess is that the GSA simply ignores entries that have a certain value in that meta tag when processing non-adult queries. How this results in non-adult results showing a higher "about" number than when adult results are included is anyone's guess. It depends on how the GSA counts the number of examined entries - does it only count those it examined that have the right meta tag value or does it count all the entries it examined, including those that were skipped? Does it estimate the number with respect to the whole database or does it also estimate the number of entries that are likely *not* to be skipped? We can't know the answers to those questions, and it's very unlikely that LL can know the answers to them.

LL isn't preventing anyone from getting the numbers, as someone suggested.
_____________________
Prim Savers - almost 1000 items of superbly crafted, top quality, very low prim furniture, and all at amazingly low prices.

http://slurl.com/secondlife/Seymour/213/120/251/
Marcush Nemeth
Registered User
Join date: 3 Apr 2007
Posts: 402
06-30-2009 17:15
Also a very good assumption, especially since it explains the rounded numbers from bigger searches. Been a while since I did any serious database stuff, so I admit I'm not too accustomed to the Google engine. More used to the database management side of indexes, resources and building queries in SQL7, Oracle and MySQL and the like :)
Void Singer
Int vSelf = Sing(void);
Join date: 24 Sep 2005
Posts: 6,973
07-01-2009 00:03
not that I would put it past LL to do some type of imbecilic cross comparison of content types that would limit results if they somehow didn't fall in ALL categories requested, based on some ridiculous criterion.
_____________________
|
| . "Cat-Like Typing Detected"
| . This post may contain errors in logic, spelling, and
| . grammar known to the SL populace to cause confusion
|
| - Please Use PHP tags when posting scripts/code, Thanks.