Welcome to the Second Life Forums Archive

php scraping of market data - need advice

Sol Columbia
Ding! Level up
Join date: 24 Sep 2005
Posts: 91
05-30-2006 14:19
Hey all,

I've been trying for a few days to create a php script which will go to https://secondlife.com/currency/market.php and scrape the daily summary data for a project I'm working on (since that info isn't available via download). My goal is to automate this process and track it in a database. I have everything working except this scraping element and I'm frustrated after trying several tactics and spending a lot of hours trying to figure out a workable method. I'm hoping you all can help me out with some suggestions for a new direction or possibly some code since I'm at my wit's end.

My latest tactic has been to try to use php's cURL functionality. I get it to work on other pages, but I'm getting nothing when trying to get the one page I want, namely that market data. The following code is what I think would work, but does not.

CODE


<?php

$url = "https://secondlife.com/account/login.php";

$post_request = "form[type]=second-life-member&form[nextpage]=/currency/market.php&";
$post_request .= "form[persistent]=Y&";
$post_request .= "form[username]=Sol&form[lastname]=Columbia&form[password]=mypasswordhere";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,"$url");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_request);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

$data = curl_exec($ch);
curl_close($ch);

print($data);

?>




Anyhow, if anyone knows anything about what I'm doing wrong, or has any suggestions on a different track, I'd really appreciate it, and thank you very much in advance!
_____________________
Geuis Dassin
Filming Path creator
Join date: 3 May 2006
Posts: 565
05-30-2006 14:25
see if this helps

/15/d4/99525/1.html
Sol Columbia
Ding! Level up
Join date: 24 Sep 2005
Posts: 91
05-30-2006 14:31
Bleh! How the hell did I miss that? =/

Thank you much for the link, jumping into it now.
_____________________
Eddy Stryker
libsecondlife Developer
Join date: 6 Jun 2004
Posts: 353
05-30-2006 20:06
For reference, MC Seattle is a dead account I was using before I managed to recover the password to my original account Eddy Stryker. Some important bits you were missing:

curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);

HTTPS sites are SSL-encrypted, and unless you want to jump through the hoops of keeping a CA file on hand and pointing curl to it, it's easier to just skip the peer verification completely.
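
For anyone who does want to go the CA file route instead, here is a minimal sketch of what pointing curl at one might look like. The helper name verified_ssl_options and the cacert.pem location are assumptions for illustration only; the three CURLOPT_* constants are standard PHP cURL options.

```php
<?php
// Returns cURL options that keep certificate verification switched on.
// $caFile is a hypothetical path to a PEM bundle of trusted CA certificates.
function verified_ssl_options(string $caFile): array
{
    if (!is_readable($caFile)) {
        throw new InvalidArgumentException("CA bundle not readable: $caFile");
    }
    return [
        CURLOPT_SSL_VERIFYPEER => true,    // check the server's certificate chain
        CURLOPT_SSL_VERIFYHOST => 2,       // check the certificate matches the host name
        CURLOPT_CAINFO         => $caFile, // bundle used to verify the chain
    ];
}

// Usage, with an assumed bundle location next to the script:
// curl_setopt_array($ch, verified_ssl_options(dirname(__FILE__) . '/cacert.pem'));
```

This would replace the CURLOPT_SSL_VERIFYPEER line above rather than sit alongside it.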

curl_setopt($ch, CURLOPT_COOKIEJAR, dirname(__FILE__) . '/cookies.txt');

You got the follow-location part right, since the site does a redirect after you log in. But to maintain state between the login and the redirect, the site uses a login cookie, which curl needs to save. The line above will work fine (on a UNIX-based system at least).

I also included the nextpage variable in the POSTFIELDS to redirect straight to the market page so you can do the scraping without any special tricks. Now if you instead wanted to scrape the LindeX market info, you might want to redirect with

nextpage%5D=%2Fcurrency%2Fsell.php
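
To show how the pieces discussed so far fit together, here is a rough sketch. It lets http_build_query produce the encoded form[...] string (which is where sequences like %5D and %2F come from) and then sends it with the options mentioned in this thread. The function names build_login_post and fetch_after_login are made up for this example, and the form field names are simply the ones used earlier in the thread, so treat it as a starting point rather than a tested login routine.

```php
<?php
// Builds the URL-encoded POST body for the login form.
// Field names follow the form[...] keys used earlier in this thread.
function build_login_post(string $first, string $last, string $password, string $nextPage): string
{
    return http_build_query([
        'form' => [
            'type'       => 'second-life-member',
            'nextpage'   => $nextPage,
            'persistent' => 'Y',
            'username'   => $first,
            'lastname'   => $last,
            'password'   => $password,
        ],
    ]);
}

// Posts the login form and returns whatever page the site redirects to,
// or false if the request fails.
function fetch_after_login(string $postBody, string $cookieFile)
{
    $ch = curl_init('https://secondlife.com/account/login.php');
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $postBody,
        CURLOPT_RETURNTRANSFER => true,        // return the page instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow the post-login redirect
        CURLOPT_SSL_VERIFYPEER => false,       // skip peer verification, as discussed above
        CURLOPT_COOKIEJAR      => $cookieFile, // keep the login cookie across the redirect
    ]);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}

// Usage:
// $body = build_login_post('Sol', 'Columbia', 'mypasswordhere', '/currency/sell.php');
// $html = fetch_after_login($body, dirname(__FILE__) . '/cookies.txt');
```

Building the body from an array avoids hand-encoding the brackets and slashes, and it makes a misspelled field name easier to spot.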

If you've gotten to the actual scraping of the LindeX data, you'll notice it's two big tables with structure and content all mashed together in 1998-style HTML. Here's some of the code out of my C# app, inside of a function called getBuyOrders():

CODE

// Excerpt: data (the page HTML), i (an int) and openOrders are assumed to be
// declared earlier in getBuyOrders().
while (data.IndexOf("bg_dashes_w_ltblue") > 0) {
    // Cut everything up to the next row and isolate that row's markup.
    data = data.Remove(0, data.IndexOf("\t<tr>") + 5);
    string lineItem = data.Substring(0, data.IndexOf("</tr>"));

    // First cell: exchange rate.
    i = lineItem.IndexOf(">") + 2;
    string exchangeRate = lineItem.Substring(i, lineItem.IndexOf("<", i) - i);

    // No "$" means this is not an order row, so skip ahead to the next </table>.
    if (exchangeRate.IndexOf("$") < 0) {
        data = data.Remove(0, data.IndexOf("</table>"));
        continue;
    }

    exchangeRate = exchangeRate.Substring(1, exchangeRate.IndexOf(" "));

    lineItem = lineItem.Remove(0, lineItem.IndexOf("</td>") + 5);

    // Second cell: number of buyers (not used, so the parsing is commented out).
    //i = lineItem.IndexOf(">") + 1;
    //string buyers = lineItem.Substring(i, lineItem.IndexOf("<", i) - i);

    lineItem = lineItem.Remove(0, lineItem.IndexOf("</td>") + 5);

    // Third cell: volume, minus its two-character prefix and thousands separators.
    i = lineItem.IndexOf(">") + 1;
    string volume = lineItem.Substring(i, lineItem.IndexOf("<", i) - i);
    volume = volume.Remove(0, 2);
    volume = volume.Replace(",", "");

    data = data.Remove(0, data.IndexOf("</table>"));

    Linden linden = new Linden();
    linden.rate = double.Parse(exchangeRate);
    linden.volume = int.Parse(volume);
    openOrders.Add(linden);
}
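
Since the original question was about PHP, the same row-by-row idea could be sketched there roughly as follows. This is not a port of the C# above: it uses regular expressions instead of string slicing, parse_orders is a hypothetical helper name, and the row layout shown in the comment is an assumption about the markup, so the patterns would need adjusting against the real page.

```php
<?php
// Assumed (hypothetical) row layout, loosely inferred from the C# above:
//   <tr><td> $3.45 / L$1000</td><td>12</td><td>L$1,250,000</td></tr>
// Returns a list of ['rate' => float, 'volume' => int] entries.
function parse_orders(string $html): array
{
    $orders = [];
    // Grab every table row, then every <td> cell inside it.
    preg_match_all('#<tr[^>]*>(.*?)</tr>#is', $html, $rows);
    foreach ($rows[1] as $row) {
        preg_match_all('#<td[^>]*>(.*?)</td>#is', $row, $found);
        $cells = array_map(function ($cell) {
            return trim(strip_tags($cell));
        }, $found[1]);

        // Skip header or spacer rows: an order row needs three cells
        // and a "$" in the first one.
        if (count($cells) < 3 || strpos($cells[0], '$') === false) {
            continue;
        }
        // Rate: the first number that follows a "$" sign.
        if (!preg_match('/\$([0-9.]+)/', $cells[0], $m)) {
            continue;
        }
        // Volume: drop the "L$" prefix and the thousands separators.
        $volume = (int) str_replace(['L$', ','], '', $cells[2]);

        $orders[] = ['rate' => (float) $m[1], 'volume' => $volume];
    }
    return $orders;
}

// Usage:
// $orders = parse_orders($data);   // $data being the page returned by curl_exec()
```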