Welcome to the Second Life Forums Archive

llXorBase64StringsCorrect woes =(

Haravikk Mistral
Registered User
Join date: 8 Oct 2005
Posts: 2,482
04-09-2008 12:50
Okay, after an entire day of ironing out bugs with my encryption scheme, I've come across one which I'm now struggling to debug.

To start with, please assume that my algorithm for generating keys in both LSL and PHP is 100% secure. I'm using a number of tricks that basically result in an encryption key that changes with each message. No security through obscurity here; just a key long enough that by the time a brute-force attack cracks it, it won't be of any use to the would-be hacker (cracking it would take a minimum of several months; my keys are useless after 10 minutes at most).

Anyway, the issue I'm encountering now is not with the generation of the key; that works fine. The issue seems to be in decrypting the strings in PHP: sometimes they don't decrypt the whole way. That is, an arbitrary chunk at the start of the string will be decrypted correctly, while everything after that point will be gibberish.
I've verified that the issue is on the PHP end, as the message decrypts fine in LSL, and it isn't the in-transit part, as the PHP script is receiving the whole encrypted message and using the correct key to decrypt it.

Here is the base64 xor function I am using to encrypt/decrypt messages on the PHP end:

CODE
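function xorBase64($s1, $s2) {
    // Decode both operands to raw bytes; $l1 is the message length.
    $s1 = base64_decode($s1); $l1 = strlen($s1);
    $s2 = base64_decode($s2);
    // Repeat the key until it covers the whole message.
    if ($l1 > strlen($s2))
        $s2 = str_pad($s2, $l1, $s2, STR_PAD_RIGHT);
    // PHP's string XOR stops at the end of the shorter operand.
    return base64_encode($s1 ^ $s2);
}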
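
For anyone wanting to check the function in isolation, here is a minimal round-trip sketch. It assumes a non-empty key; `xorBase64Sketch`, the sample message and the sample key are made up for illustration and are not the actual scheme discussed in this thread.

```php
<?php
// Hypothetical restatement of the function above so this sketch runs on its own.
function xorBase64Sketch(string $s1, string $s2): string {
    $a = base64_decode($s1);
    $b = base64_decode($s2);
    // Repeat the key until it is at least as long as the message.
    if (strlen($a) > strlen($b)) {
        $b = str_pad($b, strlen($a), $b, STR_PAD_RIGHT);
    }
    // String XOR yields a result as long as the shorter operand (the message).
    return base64_encode($a ^ $b);
}

// Illustrative data only: a URL-encoded message and a short key.
$message = base64_encode(rawurlencode('hello world & more'));
$key     = base64_encode('k3y');

$cipher = xorBase64Sketch($message, $key);
$plain  = xorBase64Sketch($cipher, $key);

// XOR with the same key twice should give back the original message.
var_dump($plain === $message);
```

If this round trip holds for your own keys and messages, the XOR step itself is probably not where the corruption comes from.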


Any help is much appreciated!

Also possibly of note: the message being encrypted is URL encoded (successfully, no truncation). I can't imagine the standard URL characters could really cause any issues, but I had trouble with @ symbols before.
It doesn't seem to be any particular symbol, though, as the script is currently sending identical messages (with different encryption keys). Sometimes it will randomly succeed, but the failure rate is currently very high.
_____________________
Computer (Mac Pro):
2 x Quad Core 3.2ghz Xeon
10gb DDR2 800mhz FB-DIMMS
4 x 750gb, 32mb cache hard-drives (RAID-0/striped)
NVidia GeForce 8800GT (512mb)
Haravikk Mistral
Registered User
Join date: 8 Oct 2005
Posts: 2,482
04-09-2008 15:28
Urgh, I think I may have it. Apparently '/' and '+' are two characters in the base 64 character-set, two of the worst characters to feed into PHP it would seem =)

I never seem to get more than one of them at a time, and only maybe 1 in every 3 messages I send ends up with such an erroneous character, so the overhead of a search/replace shouldn't hinder me too much!
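
One way '+' can go wrong, sketched under assumptions: if Base64 text travels in a query string or form body without being escaped, URL decoding turns each '+' into a space, and PHP's `base64_decode()` in its default non-strict mode silently discards characters outside the alphabet, which shifts every bit after that point. Whether that is what happened here is not established. The helper names `base64UrlEncode` and `base64UrlDecode` are hypothetical, and the LSL side would need the matching escape (for example via `llEscapeURL`) or the same character swap.

```php
<?php
// Bytes chosen so the Base64 text contains both '+' and '/'.
$raw = "\xfb\xff\xfe";
$b64 = base64_encode($raw);            // "+//+"

// Unescaped transport: URL decoding turns '+' into ' '.
$mangled = urldecode($b64);            // " // "
$broken  = base64_decode($mangled);    // spaces dropped, data corrupted

// Option 1: percent-escape the Base64 text before sending it.
$escaped  = rawurlencode($b64);        // "%2B%2F%2F%2B"
$restored = base64_decode(urldecode($escaped));

// Option 2 (hypothetical helpers): swap to a URL-safe alphabet.
function base64UrlEncode(string $bytes): string {
    return rtrim(strtr(base64_encode($bytes), '+/', '-_'), '=');
}

function base64UrlDecode(string $text): string {
    $b64 = strtr($text, '-_', '+/');
    // Restore the '=' padding that was trimmed for transport.
    $b64 .= str_repeat('=', (4 - strlen($b64) % 4) % 4);
    return base64_decode($b64);
}

var_dump($broken === $raw);                                   // false
var_dump($restored === $raw);                                 // true
var_dump(base64UrlDecode(base64UrlEncode($raw)) === $raw);    // true
```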
Haravikk Mistral
Registered User
Join date: 8 Oct 2005
Posts: 2,482
04-10-2008 09:52
Urgh, I'm still having some issues. Namely, what are we supposed to do with equals signs?

If I understand correctly, then the equals '=' sign in a base 64 string is used for padding data that does not fit correctly in the 24-bit chunks that base 64 uses. However, an equals-sign is not within the base64 character-set, so what should happen to it when it's being processed by PHP? Can these cause issues with the PHP function I posted?
Hewee Zetkin
Registered User
Join date: 20 Jul 2006
Posts: 2,702
04-10-2008 10:51
base64_decode() should take care of the padding characters for you. The equals ('=') sign IS a part of the Base64 character set. See http://tools.ietf.org/html/rfc2045#section-6.8
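
A small sketch of how the padding behaves in PHP, using made-up one- to three-byte inputs: Base64 works on 3-byte groups, so inputs that are not a multiple of 3 bytes end in one or two '=' characters, and `base64_decode()` consumes them without extra work.

```php
<?php
// Inputs of 1, 2 and 3 bytes show the three possible padding cases.
$samples = ['a', 'ab', 'abc'];
$encoded = [];

foreach ($samples as $input) {
    $encoded[$input] = base64_encode($input);
    printf("%-3s -> %s\n", $input, $encoded[$input]);
}
// Prints:
// a   -> YQ==
// ab  -> YWI=
// abc -> YWJj

// Decoding handles the padding characters on its own.
foreach ($encoded as $input => $text) {
    var_dump(base64_decode($text) === $input);   // true each time
}
```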
Ollj Oh
Registered User
Join date: 28 Aug 2007
Posts: 522
04-10-2008 21:50
The base64() functions are UGLY: the different positions use different charsets AND there are many other poor caveats.

If you need your own set of characters, just make your own base64 integer procedure. All it takes is memory for a string of 64 letters and 2 short procedures that use string functions to find the position of a given letter in that string, or the other way around.
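
A rough sketch of that idea in PHP, not a drop-in replacement for the built-in functions: one 64-character string plus two short procedures, one mapping a number to letters and one mapping letters back to a number. The alphabet and the names `intToBase64Digits` and `base64DigitsToInt` are assumptions for illustration, and only non-negative integers are handled.

```php
<?php
// Hypothetical 64-character alphabet; any 64 distinct characters would do.
const ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_';

// Number -> letters: peel off base-64 digits from the low end.
function intToBase64Digits(int $n): string {
    if ($n < 0) {
        throw new InvalidArgumentException('only non-negative integers');
    }
    $out = '';
    do {
        $out = substr(ALPHABET, $n % 64, 1) . $out;
        $n = intdiv($n, 64);
    } while ($n > 0);
    return $out;
}

// Letters -> number: look up each letter's position in the alphabet.
function base64DigitsToInt(string $digits): int {
    $n = 0;
    for ($i = 0; $i < strlen($digits); $i++) {
        $pos = strpos(ALPHABET, $digits[$i]);
        if ($pos === false) {
            throw new InvalidArgumentException('character not in alphabet');
        }
        $n = $n * 64 + $pos;
    }
    return $n;
}

var_dump(base64DigitsToInt(intToBase64Digits(123456)) === 123456);   // true
```

The same pair of lookups could be written in LSL with its string functions, so both ends can share one alphabet.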
Hewee Zetkin
Registered User
Join date: 20 Jul 2006
Posts: 2,702
04-10-2008 21:57
Hmm. Character sets. That gets me to wondering. If PHP and LSL are using different character sets (no idea), you might run into trouble decoding characters outside the single-byte ASCII range. Could that have anything to do with it?

Also, I hear LSL is going to go to UTF-16 when MONO is deployed. That's just darn scary. How is that going to affect this kind of thing? Will they translate to a UTF-8 encoding for HTTP requests, e-mails, and Base64 encoding/decoding? :( :confused:
Haravikk Mistral
Registered User
Join date: 8 Oct 2005
Posts: 2,482
04-11-2008 05:39
I feel stupid. Base64 conversion seems to be solid, at least for the moment (there's talk of Mono changing LSL's charset!), as the base 64 range is ASCII and PHP treats strings as plain bytes internally (you have to use special functions if you want Unicode support).

Seems the problem was actually in one of my algorithms: it was calculating values JUST big enough to overflow a 32-bit signed integer, which is why the issue was so difficult to trace, as it came up very rarely. I've had no IMs from the scripts so far, and I've had a number of copies running at higher than normal speeds to try and catch it, so fingers crossed it's working. Now I just need to wait for Mono to come along so I can put all my meaningful error messages back in; a script that just says "Wait" isn't very user friendly =)
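
For anyone hitting the same thing, a minimal sketch of how the mismatch can arise, assuming a 64-bit PHP build: LSL integers are 32-bit signed, while PHP integers on such a build are 64-bit, so a value just past 2147483647 wraps on one side and not the other. `toInt32` and `fitsInt32` are hypothetical helpers, not part of the scheme in this thread.

```php
<?php
// Assumes 64-bit PHP integers; on a 32-bit build these constants become floats.

// Wrap a value the way a 32-bit signed integer would.
function toInt32(int $n): int {
    $n &= 0xFFFFFFFF;                                   // keep the low 32 bits
    return ($n & 0x80000000) ? $n - 0x100000000 : $n;   // reinterpret the sign bit
}

// Check whether a value survives a 32-bit signed integer unchanged.
function fitsInt32(int $n): bool {
    return $n >= -2147483648 && $n <= 2147483647;
}

$justTooBig = 2147483647 + 1;

var_dump(fitsInt32($justTooBig));    // false
var_dump(toInt32($justTooBig));      // int(-2147483648)
var_dump(toInt32(2147483647));       // int(2147483647)
```

Running key-generation arithmetic through a wrap like this on the PHP side is one way to keep both ends producing the same numbers.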
Hewee Zetkin
Registered User
Join date: 20 Jul 2006
Posts: 2,702
04-11-2008 11:23
Glad you found the problem. :)

You're right that Base64 itself encodes everything in a subset of the ASCII characters. However, the data that is actually encoded BY the Base64 is composed of the binary character values of the original string, which depend on the character set. That's what I'm worried about. Hopefully both ends will expect that when encoding/decoding character data, a UTF-8 character set should be used (despite whatever character sets are used natively by the development/runtime platform) and will perform translation automatically as appropriate. I'm just not entirely confident about that. Anyone know?
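
To make the concern concrete, a small sketch with byte values written out by hand: the same character, 'é', is one byte in Latin-1 and two bytes in UTF-8, so its Base64 form differs depending on which encoding produced the bytes. Which encoding LSL actually hands over is not something this sketch settles.

```php
<?php
// The character 'é' as raw bytes in two different encodings.
$utf8   = "\xC3\xA9";   // UTF-8: two bytes
$latin1 = "\xE9";       // ISO-8859-1: one byte

// Base64 encodes bytes, not characters, so the outputs differ.
$utf8B64   = base64_encode($utf8);     // "w6k="
$latin1B64 = base64_encode($latin1);   // "6Q=="

// strlen() counts bytes, which is also what a byte-wise XOR sees.
printf("UTF-8:   %d byte(s), Base64 %s\n", strlen($utf8), $utf8B64);
printf("Latin-1: %d byte(s), Base64 %s\n", strlen($latin1), $latin1B64);
```

As long as both ends agree on one encoding before the Base64 step, the XOR and the transport work on identical bytes.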