Recently, Freeside hosted a CryptoParty where I gave an introductory presentation on steganography. Like all my CryptoParty presentations, this wasn't very technical, but I did introduce some (very) basic techniques.
The first tool that everyone should know about is exiftool, which reads and writes the metadata section of a variety of image formats. I showed an excellent illustrated example of Exif metadata in the JPEG format, with some great diagrams of how a JPEG file's bytes are laid out. There's also C# .NET code included to extract and modify this data, if Perl's not your thing (Note: Perl should not be your thing).
There are many uses for Exif metadata. The most common use is by camera manufacturers. You may have heard that digital cameras can record data and store it in the photo itself. This is how and where it happens. It's not just a timestamp, either. Your camera, especially a smartphone camera, can store information like GPS coordinates, your phone firmware version, the OS it's running, model number, IMEI, and other information that can uniquely identify your camera as the source of the photo.
Facebook, Google, and other social media use this feature to conveniently place the location of where the photo was taken when you upload it to their service. This is great when you want to let your friends know that the picture of you standing in front of the Grand Canyon was taken at the location of the Grand Canyon (for those friends of yours that don't know what the Grand Canyon looks like). It's less awesome when you've called in sick to work on Thursday and post a picture of a cool looking bird on Saturday, especially if you work in Atlanta and that bird was on the outskirts of Panama City. Your employer can put two and two together.
Thankfully, there are tools to strip out metadata from images. Consider using some before posting to social media! There's always opt-out, too (you don't have to post everything to Facebook).
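To make the stripping idea a bit more concrete, here's a minimal Python sketch (standard library only) that walks the header segments of a JPEG and drops any APP1 segment carrying an Exif payload. The name strip_exif is my own invention, and the parser makes simplifying assumptions: it only looks at segments before the start-of-scan marker, and it ignores other metadata containers such as XMP or IPTC. For real photos, a mature tool like exiftool is the safer bet.

```python
import struct

SOI = b"\xff\xd8"              # start-of-image marker
SOS = 0xDA                     # start-of-scan: compressed image data follows
APP1 = 0xE1                    # segment type that carries Exif
EXIF_HEADER = b"Exif\x00\x00"  # first bytes of an Exif APP1 payload


def strip_exif(jpeg: bytes) -> bytes:
    """Return a copy of `jpeg` without its Exif APP1 segments (hypothetical helper)."""
    if not jpeg.startswith(SOI):
        raise ValueError("not a JPEG: missing start-of-image marker")

    out = bytearray(SOI)
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = jpeg[pos + 1]
        if marker == SOS:
            break
        # The 2-byte big-endian length counts itself but not the marker bytes.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        segment = jpeg[pos:pos + 2 + length]
        payload = segment[4:]
        if not (marker == APP1 and payload.startswith(EXIF_HEADER)):
            out += segment
        pos += 2 + length

    out += jpeg[pos:]  # scan data (and anything after it) is copied untouched
    return bytes(out)
```

Reading the file with open(path, "rb").read(), passing the bytes through strip_exif, and writing the result to a new file leaves the pixels alone and removes only the Exif block.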
You can use exiftool to extract the information from some of the images in this blog post. For example, with the "Snakes are Awesome" image, we can run the following command at the terminal:
$ exiftool -l snake.jpg
Note: "$2" was removed when I wrote the value to the image, because $2 is a positional parameter in Bash, so the shell tried to substitute a value for it (which was empty) before exiftool ever saw the command. Wrapping the value in single quotes prevents this.
In this way, you can "hide" a URL in a picture. It's not very well hidden: a person or software tuned to detect this sort of thing can fish it out. Still, it's a great way to communicate a "secret" with others that's not immediately obvious. There's also no reason the data you store in metadata can't be encrypted.
Text steganography is the next step up in hiding information in plain sight. For the presentation, I demo'd spammimic, an online tool that takes a string and hides it within spam, a fake PGP signature, or even characters that make it look Russian! Let's say I want to send the message, "The only limit is yourself" - spammimic can make this look like a spam email:
Dear Friend ; Thank-you for your interest in our publication . If you no longer wish to receive our publications simply reply with a Subject: of "REMOVE" and you will immediately be removed from our club ! This mail is being sent in compliance with Senate bill 1627 ; Title 6 , Section 303 ! This is NOT unsolicited bulk mail [...]
Generally, it works by mapping the characters of your message onto known snippets of spam. Note how the punctuation is always space-punctuation-space. If you know about spammimic, it's not difficult to write some software to detect and test for this sort of thing. Now, go through your spam folder and see which ones have hidden messages!
Computers are basically machines that process strings, so anything you do with text is probably well suited to reverse engineering and is therefore easily detected by three-letter government agencies.
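Here's a rough Python sketch of what such a detector might look like. It only measures the space-punctuation-space tell mentioned above; it does not decode anything, and it knows nothing about spammimic's actual grammar. The function names, the 0.8 threshold, and the minimum of 5 punctuation marks are all my own guesses, so expect false positives and negatives.

```python
import re

# Sentence punctuation with whitespace before it and whitespace (or the end
# of the text) after it, e.g. the ";" in "Friend ; Thank".
SPACED = re.compile(r"(?<=\s)[.,;:!?](?=\s|$)")
# Any sentence punctuation at all, for comparison.
ANY = re.compile(r"[.,;:!?]")


def spaced_punctuation_ratio(text: str) -> float:
    """Fraction of punctuation marks written as space-punctuation-space."""
    total = len(ANY.findall(text))
    if total == 0:
        return 0.0
    return len(SPACED.findall(text)) / total


def looks_like_spammimic(text: str, threshold: float = 0.8, min_marks: int = 5) -> bool:
    """Heuristic only: flags text whose punctuation is mostly space-padded."""
    enough_marks = len(ANY.findall(text)) >= min_marks
    return enough_marks and spaced_punctuation_ratio(text) >= threshold
```

Ordinary email hugs its punctuation ("Hi Bob, thanks."), so its ratio stays near zero, while the sample above pads almost every mark with spaces.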
What about images within images, man?
There's a very simple technique to hide a zip file within a JPEG or GIF file. The reason this works is that JPEG/GIF files are identified and interpreted starting from the header, whereas zip readers locate the archive's directory from the end of the file. So, in browsers and operating systems, the image will be rendered while the zip file remains obscure.
This technique is not without its drawbacks. For starters, depending on the data, you can really blow up the size of a JPEG or GIF. These are typically less than 500K in size (and that's being generous!), while a single PDF file could be 1-2MB. So, a naive software detector can simply scrape social media sites like Tumblr and Twitter and put aside images in excess of a certain size threshold. Still, you have to know to look for that. Most casual human observers will see a picture and think nothing of it.
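A detector doesn't have to stop at file size, either. As a sketch of a slightly less naive check (my own suggestion, not something from the presentation), the Python below flags data that starts with a JPEG or GIF signature and also parses as a zip archive. The names image_kind and hides_zip are hypothetical, and this only catches zip payloads, not other appended data.

```python
import io
import zipfile

# Leading bytes that identify the image formats discussed here.
IMAGE_MAGIC = {
    b"\xff\xd8\xff": "JPEG",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
}


def image_kind(data: bytes):
    """Return "JPEG" or "GIF" based on the header bytes, else None."""
    for magic, kind in IMAGE_MAGIC.items():
        if data.startswith(magic):
            return kind
    return None


def hides_zip(data: bytes) -> bool:
    """True if `data` looks like an image and also parses as a zip archive."""
    # is_zipfile searches for the zip end-of-directory record near the end
    # of the data, so leading image bytes do not get in its way.
    return image_kind(data) is not None and zipfile.is_zipfile(io.BytesIO(data))
```

Run over a folder of scraped images, hides_zip(open(path, "rb").read()) would set aside exactly the files worth a closer look, regardless of how big they are.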
Here's how to execute the technique:
$ cat taxiderpy_original.jpg > taxiderpy.jpg
$ zip secret.zip microsoft-spy.pdf
$ cat secret.zip >> taxiderpy.jpg
$ ls -sh1 taxiderpy*
This does nothing more than use the *nix command cat to append the zip file to the end of the image. In this case, we have appended a PDF file with Microsoft's menu of services to law enforcement to the back of an image of a taxiderpy polar bear. As you can see from the output of ls, the file size has increased from 40K to 1.6M.
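If you'd rather script it, the same append can be sketched in Python with only the standard library. The helper name append_zip and the in-memory approach are my own choices for illustration; the result should be equivalent to the cat commands above, but treat it as a sketch rather than a drop-in replacement.

```python
import io
import zipfile


def append_zip(image: bytes, files: dict[str, bytes]) -> bytes:
    """Return `image` followed by a zip archive holding `files` (name -> contents)."""
    buffer = io.BytesIO()
    # Build the archive in memory, compressing members like the zip command does.
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, contents in files.items():
            archive.writestr(name, contents)
    # The image bytes stay first, so viewers still see a normal picture.
    return image + buffer.getvalue()
```

For example, reading taxiderpy_original.jpg and microsoft-spy.pdf as bytes, calling append_zip(image, {"microsoft-spy.pdf": pdf}), and writing the result to taxiderpy.jpg mirrors the shell steps.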
Note: Blogger was able to detect that something was off about the taxiderpy image when attempting to upload it to this post. To fetch the actual file, download the original presentation.
Extraction is easy - you simply attempt to unzip the JPEG or GIF. Note that unzip warns about some extraneous data at the start of the file, which is the image, of course:
$ unzip taxiderpy.jpg
warning [taxiderpy.jpg]: 37425 extra bytes at beginning or within zipfile
(attempting to process anyway)
$ open microsoft-spy.pdf
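The extraction step can also be sketched in Python, since the standard zipfile module tolerates data in front of the archive much like unzip does. The function names below are hypothetical. prefix_length assumes the first archive member sits right after the image, in which case it reports the same "extra bytes at beginning" figure that unzip warns about.

```python
import io
import zipfile


def extract_hidden(data: bytes) -> dict[str, bytes]:
    """Read every member of a zip archive appended to `data`."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


def prefix_length(data: bytes) -> int:
    """Number of bytes (the image) sitting in front of the first zip member."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        # header_offset is an absolute position within `data`.
        return min(info.header_offset for info in archive.infolist())
```

Calling extract_hidden(open("taxiderpy.jpg", "rb").read()) would hand back a dictionary keyed by file name, which you can then write out to disk.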
There are some more advanced techniques that hold up better under close scrutiny. For example, the same technique that professional photographers use to include a watermark can be used to hide a URL or other piece of data in a photo. Video is another great medium to hide information. In a complex animation or sequence, you could flash some secret text to the screen in a subtle way. The "key" that the recipient needs to read the data is the exact frame number.
For more good times, come to the next CryptoParty! We also archive all the past presentations and information discussed at CryptoParty on our wiki. I'll be trying to get these into blog post format, to fill in the blanks between the slides, as it were.