Practical Web Page Steganography - RGB, ISO 8859-1, and 1337sP33K

by Glutton

Steganography is Greek for hidden writing.

The concept has been around for ages, with the idea that adding a "security by obscurity" layer to an encoded message would make it even harder to crack.  There are legends of Greeks covering hidden messages in wax or writing them in invisible ink.  In our day we tend to think in technological terms.

There was a rumor that the 9/11 hijackers used digital steganography to communicate, but this was discovered to be totally untrue.  Steganography even made it into a Hollywood movie, with Morgan Freeman using it in Along Came a Spider.

The idea as it is traditionally presented is this: A 24-bit image has 8 bits for each of its three colors.  If you swap out the least significant bit of a color value in each pixel, you can use that bit to hide data with a negligible loss of color.
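To make the dropped-bit idea concrete, here is a minimal sketch in Python.  It is not taken from any particular stego tool; the function names are made up for illustration, it works on a raw run of color bytes rather than a real image file, and it uses the low bit of every color byte rather than one bit per pixel.

```python
def embed_lsb(channels: bytes, message: bytes) -> bytes:
    """Hide the message bits in the lowest bit of each color byte."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(channels):
        raise ValueError("cover data too small for message")
    out = bytearray(channels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the low bit, then set it to ours
    return bytes(out)


def extract_lsb(channels: bytes, length: int) -> bytes:
    """Read `length` bytes back out of the low bits."""
    result = bytearray()
    for i in range(length):
        value = 0
        for bit_index in range(8):
            value = (value << 1) | (channels[i * 8 + bit_index] & 1)
        result.append(value)
    return bytes(result)


# 48 color bytes stand in for 16 hypothetical RGB pixels.
cover = bytes(range(200, 248))
stego = embed_lsb(cover, b"hi")
print(extract_lsb(stego, 2))
```

No color byte changes by more than one step out of 256, which is why the eye doesn't notice.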

All very interesting, but not practical because it requires specialized software.  Plus oftentimes web-based images go through some sort of resampling, resizing, or compression.

For example, if you upload an image to eBay, you don't see the original photo in your listing.  You see a copy of it.  Whether a given site's processing wipes out the hidden bits depends on what it does to the image, but lossy recompression or resizing generally will, and that adds to the worry.

Then there is the fact that the authorities have exhaustively researched steganography because of the supposed 9/11 connection.  They probably have image-snarfing bots snooping the net, searching for those telltale dropped bits.

RGB Stego

There is an easier way.

Computers display color using RED, GREEN, and BLUE, with each of the three colors represented as a value between 0 and 255.

As it happens, this is also the range of the standard ISO 8859-1 character set, which is supported by most TrueType and Type 1 fonts.

For example, 36 is the code for $.

Say I have a single pixel of color, with the value of:

Color   Decimal        ASCII
        Value          Character
Red     99             c
Green   97             a
Blue    116            t

Well, with that one pixel I spelled cat!
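The same trick, sketched in Python.  The function names are mine, and padding the last pixel with spaces is an assumption made only so the example always comes out to whole pixels.

```python
def text_to_pixels(text: str) -> list[tuple[int, int, int]]:
    """Pack ISO 8859-1 text into (red, green, blue) triples."""
    data = text.encode("latin-1")        # one byte per character
    data += b" " * (-len(data) % 3)      # assumed: pad last pixel with spaces
    return [tuple(data[i:i + 3]) for i in range(0, len(data), 3)]


def pixels_to_text(pixels) -> str:
    """Turn (red, green, blue) triples back into text."""
    return bytes(value for pixel in pixels for value in pixel).decode("latin-1")


print(text_to_pixels("cat"))             # [(99, 97, 116)]
print(pixels_to_text([(99, 97, 116)]))   # cat
```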

With three bytes per pixel, you can fit an incredible 15,552 characters into a typical one inch square graphic (assuming 72 pixels per inch, that's 72 x 72 pixels x 3 bytes)!
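The arithmetic behind that figure, as a quick sketch.  The 72 pixels per inch is an assumption on my part; it is the traditional screen resolution and it happens to reproduce the number exactly.

```python
def capacity_in_characters(width_px: int, height_px: int) -> int:
    """Three color bytes per pixel, one character per byte."""
    return width_px * height_px * 3


PIXELS_PER_INCH = 72  # assumed resolution, not stated in the text
side = 1 * PIXELS_PER_INCH
print(capacity_in_characters(side, side))  # 15552
```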

Before you get all excited, here are some difficulties.

First, unlike the dropped-bit steganography, that one-inch-square image won't look like anything except mush.

Second, without specialized software, it would take forever to encode a 15,000-letter note in Photoshop!  It would also be a drag to decode; you'd have to open the graphic in Photoshop and check the RGB values for every pixel.

And finally, once the bad guys figure out what you're doing, they can decode your message as easily as your intended audience can.

Before I get into possible solutions, here are a couple of other ideas for concealing messages on the web:

Metadata:  This is merely text appended to a file, visible or not depending on the processes used.

The technology was developed in association with a couple of newspaper groups in order to embed copyright data, cutlines, credits, and so on.  Digital cameras add a record of their model number and sometimes F-stop and ASA/ISO settings to metadata.

In Windows, you can edit this information for files saved in Photoshop, TIFF, JPEG, EPS, and PDF formats.  In Mac OS, you can add file information to files in any format.  The text is embedded in the file using a format called Extensible Metadata Platform (XMP).

Now how does this help us?

Well, there is room for comments among the fields, so short messages could be attached to JPEGs and placed on a web page.  For this to work you'd need to have a prearranged plan for which image to nab.  Maybe you have an album of innocuous vacation photos but one special one in which you have embedded the message.  Since anyone can look at metadata if they know how, you could even encrypt the data for added security.  Now, why not just email if you plan on using PGP?  Well, if the bad guys intercept an email containing an encrypted message, they'll know you're up to no good.  Sneaky is good.
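If you don't have Photoshop handy, here is one rough way to tuck a short note into a JPEG with nothing but Python.  Be warned that this is a swapped-in technique: it writes a plain JPEG comment (COM) segment, not the XMP fields described above.  The function names are made up, the "image" in the demo is a fake skeleton rather than a real photo, and dropping the comment right after the start-of-image marker is a simplification that strict JFIF readers may not like.

```python
import struct

SOI = b"\xff\xd8"  # JPEG start-of-image marker
COM = b"\xff\xfe"  # JPEG comment marker


def add_comment(jpeg: bytes, comment: bytes) -> bytes:
    """Insert a COM segment right after the start-of-image marker."""
    if not jpeg.startswith(SOI):
        raise ValueError("not a JPEG stream")
    if len(comment) > 65533:
        raise ValueError("comment too long for one segment")
    # The 2-byte length field counts itself plus the payload.
    segment = COM + struct.pack(">H", len(comment) + 2) + comment
    return SOI + segment + jpeg[2:]


def read_comments(jpeg: bytes) -> list[bytes]:
    """Walk the header segments and collect every COM payload."""
    comments = []
    pos = 2
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        marker = jpeg[pos + 1]
        if marker in (0xDA, 0xD9):  # start of scan or end of image: stop
            break
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if marker == 0xFE:
            comments.append(jpeg[pos + 4:pos + 2 + length])
        pos += 2 + length
    return comments


# A fake skeleton: SOI, one tiny APP0-style segment, end-of-image.
fake = SOI + b"\xff\xe0" + struct.pack(">H", 4) + b"JF" + b"\xff\xd9"
tagged = add_comment(fake, b"meet at noon")
print(read_comments(tagged))
```

The payload could just as easily be PGP output instead of plain text.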

HTML Stego:  Even easier than RGB steganography, HTML's color palette can be used to create ranges of 0 to 255.

In the good old days, there was a limited palette of colors that everyone could view on the web.  Colors were, and still are, represented by six hexadecimal digits - FFFFFF is white, for example.

The first two digits are RED, the second two are GREEN, and the final two digits represent BLUE.

Each pair of hexadecimal digits gives sixteen times sixteen, or 256, possible values, and there you have your character ranges.  All you have to do is create apparently decorative blocks of color using the <table> tag, but these are actually your hidden message.

Or you could color snippets of text with your code colors, requiring readers to "View Source" to see their values.

The advantage of HTML steganography is that you don't need anything but your wits and a text editor to encode or decode!
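You can do it by hand, but if you get tired of looking up hex codes, a few lines of Python will do the grunt work.  This is only a sketch: the function names are hypothetical, the bgcolor attribute is just one way to color a cell, and the last cell is padded with spaces when the message isn't a multiple of three characters.

```python
import re


def text_to_table(text: str) -> str:
    """Encode text as the background colors of a one-row table."""
    data = text.encode("latin-1")
    data += b" " * (-len(data) % 3)  # assumed: pad last cell with spaces
    cells = "".join(
        f'<td bgcolor="#{data[i:i + 3].hex().upper()}">&nbsp;</td>'
        for i in range(0, len(data), 3)
    )
    return f"<table><tr>{cells}</tr></table>"


def table_to_text(html: str) -> str:
    """Pull the six-digit colors back out and read them as characters."""
    codes = re.findall(r'bgcolor="#([0-9A-Fa-f]{6})"', html)
    return bytes.fromhex("".join(codes)).decode("latin-1")


print(text_to_table("cat"))
print(table_to_text(text_to_table("hello!")))
```

The first line prints a single cell colored #636174, which is 99, 97, 116 in hexadecimal.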

Solutions to Problems

Mush:  Your coded RGB message looks out of place on your web page.  Display it at one pixel by one pixel (scale it in the HTML rather than resampling the file, so the data survives) and it will be an innocuous dot in one obscure corner of your page.  Or float a butt-ugly logo over it using CSS layers.  Or make the coded portion of your message a strip a pixel wide at the bottom of your decoy image.

Time Consuming:  I mentioned the 15,552 characters to illustrate, but your message need not be War and Peace.  A simple message of 120 characters would need only 40 pixels.  If you were really ambitious, you could write a program that reads the color values of a graphic and outputs them as a string of numbers from 0 to 255.
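Such a program doesn't have to be fancy.  Here is a sketch that assumes the graphic has been saved as a binary PPM (P6) file, a dead-simple uncompressed format, with no comment lines in its header.  That choice of format and the function names are my assumptions; reading GIF, PNG, or JPEG would take an imaging library.

```python
import re


def write_ppm(pixels, width: int, height: int) -> bytes:
    """Build a binary PPM image from (red, green, blue) triples."""
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(value for pixel in pixels for value in pixel)


def read_ppm_values(ppm: bytes) -> list[int]:
    """Return every color value in the image as a flat list of 0-255 numbers."""
    match = re.match(rb"P6\s+(\d+)\s+(\d+)\s+(\d+)\s", ppm)
    if match is None:
        raise ValueError("not a binary PPM without header comments")
    width, height = int(match.group(1)), int(match.group(2))
    start = match.end()
    return list(ppm[start:start + width * height * 3])


strip = write_ppm([(99, 97, 116), (33, 32, 32)], width=2, height=1)
values = read_ppm_values(strip)
print(values)                          # [99, 97, 116, 33, 32, 32]
print(bytes(values).decode("latin-1"))
```

The two-pixel strip decodes to "cat!" followed by two padding spaces.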

Insecure:  Simply scramble the ISO 8859-1 character set and voilà!  You have a substitution cypher.  One of the weaknesses of a substitution cypher is its susceptibility to being cracked by guessing the letters based on their frequency.

However, those cyphers are based on a 26-letter alphabet.  We have 256 characters!  So how can we use this to our advantage?  Well, how about our native language of 1337sP33K?  Don't groan, there are numerous glyphs in the ISO 8859-1 character set that resemble other letters.  Take the most easily guessed letter, E.

We can substitute 3, É, é, Ë, ë, Ê, ê, and so on.  All perfectly readable once decoded, but to the codebreaker trying to crack a substitution cypher, it's a huge stumbling block.  Or of course you could encrypt the message with PGP and make it all but unbreakable.
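Both halves of that idea fit in a short Python sketch.  Everything here is illustrative: the function names are invented, the shared seed standing in for a prearranged key is an assumption, and Python's random module is not cryptographically secure, so treat this as a toy and not a replacement for PGP.

```python
import random

E_LOOKALIKES = "3ÉéËëÊê"  # stand-ins for E, all within ISO 8859-1


def make_key(seed: int) -> list[int]:
    """Scramble the 256 character codes; the seed is the shared secret."""
    table = list(range(256))
    random.Random(seed).shuffle(table)
    return table


def leetify(text: str, rng: random.Random) -> str:
    """Swap every E for a randomly chosen lookalike to flatten its frequency."""
    return "".join(rng.choice(E_LOOKALIKES) if ch in "eE" else ch for ch in text)


def encipher(text: str, key: list[int]) -> bytes:
    return bytes(key[code] for code in text.encode("latin-1"))


def decipher(data: bytes, key: list[int]) -> str:
    inverse = [0] * 256
    for plain, cipher in enumerate(key):
        inverse[cipher] = plain
    return bytes(inverse[code] for code in data).decode("latin-1")


key = make_key(2600)
message = leetify("meet me here", random.Random())
secret = encipher(message, key)
print(list(secret))            # numbers ready to become RGB values
print(decipher(secret, key))   # readable, with every E disguised
```

Because the most common letter is now spread across seven different codes, a simple frequency count no longer points straight at E.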

Conclusion

Sometimes the most difficult code to break is the one you can't see.

While not perfect solutions, the ideas presented here can help keep your communications private in a world in which someone, it seems, is always watching and listening.
