image steganography

In my last post I introduced the field of image steganography, which is the practice of concealing secret messages in digital images. I looked at the history of steganography and presented some recently reported real-life cases (including one from the FBI) where digital steganography was used for malicious purposes.

In this post I would like to present to you the following two very simple ways messages can be hidden in digital images:

JPEG Concealing
Least Significant Bit Technique

These techniques, although trivial and easy to detect, will give you an idea of how simple (and therefore potentially dangerous) digital image steganography can be.

JPEG Concealing

Image files in general are composed of two sections: header data + image data. The header data section can contain metadata information pertaining to the image such as date of creation, author, image resolution, and compression algorithm used if the image is compressed. This is the standard for JPEGs, BMPs, TIFFs, GIFs, etc.

Knowing this, one can work around these file structures to conceal messages.

Let’s take JPEGs as an example. The file structure for this format is as follows:

Notice that every single JPEG file starts and ends with the SOI and EOI markers, respectively.

What this means is that any image interpreting application (e.g. Photoshop or GIMP, any internet browser, the standard photo viewing software that comes with your operating system, etc.) looks for these markers inside the file and knows that it should interpret and display whatever comes between them. Everything else is automatically ignored.

Hence, you can insert absolutely anything after the EOI marker like this:

And even though the hidden message will be part of the JPEG file and travel with this file wherever it goes, no standard application will see anything out of the ordinary. It will just read whatever comes before EOI.

Of course, if you put a lot of data after EOI, your file size will increase significantly and might, therefore, arouse suspicion – so you have to be wary of that. In this case, it might be an idea to use a high resolution JPEG file (that naturally has a large file size) to turn attention away from your hidden message.

If you would like to try this steganography technique out yourself, download a hex editor for your machine (if you use Windows, WinHex is a good program), search for FF D9 (which is the hex version of EOI), paste anything you want after this section marker, and save your changes. You will notice that the file is opened like any other JPEG file. The hidden message simply piggy backs on top of the image file. Quite neat!

(Note: hexadecimal is a number system made up of 16 symbols. Our decimal system uses 10 digits: 0-9. The hex system uses the 10 digits from the decimal system plus the first 6 letters of the alphabet. To cut a long story short, hexadecimal is a shorthand and therefore much easier way to read/write binary digits, i.e. 1s and 0s. Most file formats will not save data in human readable form and we therefore need help if we want to view the raw data of these files – this is why hex is used sometimes used)

The Least Significant Bit Technique

Although easy to detect (if you know what you’re looking for), the Least Significant Bit (LSB) technique is a very sly way of hiding data in images. The way that it works is by taking advantage of the fact that small changes in pixel colour are invisible to the naked eye.

Let’s say we’re encoding images in the RGB colour space – i.e. each pixel’s colour is represented by a combination of a certain amount of red (R), a certain amount of green (G), and a certain amount of blue (B). The amount of red, green, and blue is given in the the range of 0 to 255. So, pure red would be represented as (255, 0, 0) in this colour space – i.e. the maximum amount of red, no green, and no blue.

Now, in this scenario (and abstracting over a few things), a machine would represent each pixel in 3 bytes – one byte for each of red, green and blue. Since a byte is 8 bits (i.e. 8 ones and zeros) each colour would be stored as something like this:

That would be the colour red (11111111 in binary is 255 in our number system).

What about if we were to change the 255 into 254 – i.e. change 11111111 into 11111110? Would we notice the difference in the colour red? Absolutely not. How about changing 11111111 to 11111100 (255 to 252)? We still would not notice the difference – especially if this change is happening to single pixels!

The idea behind the LSB technique, then, is to use this fact that slightly changing the colour of each pixel would be imperceptible to the naked eye.

Since the last few digits in a byte are insignificant this is where LSB gets its name: the Least Significant Bit technique.

We know, then, that the last few bits in each byte can be manipulated. So, we can use this knowledge to set aside these bits of each pixel to store a hidden message.

Let’s look at an example. Suppose we want to hide a message like “SOS“. We choose to use the ASCII format to encode our letters. In this format each character has its own binary representation. The binary for our message would be:

What we do now is split each character into two-bit pairs (e.g. S has the following four pairs: 01, 01, 00, 11) and spread these pairs successively along multiple pixels. So, if our image had four pixels, our message would be encoded like this:

Notice that each letter is spread across two pixels: one pixel encodes the first 3 pairs and the next pixel takes the last pair. Very neat, isn’t it? You can choose to use more than 2 bits per pixel to store your message but remember that by using more bits you risk changes to each pixel becoming perceptible.

Also, the larger the image, the more you can encode. And, since images can be represented in binary, you can store an image inside an image using the exact same technique. In this respect, I would highly recommend that you take a look at this little website that will allow you to do just that using the method described here.

I would also recommend going back to the first post of this series and looking at the image-inside-image steganography examples there provided by the FBI. It shows brilliantly how sneaky image steganography can be.

Summary

In this post I looked at two simple techniques of image steganography. The first technique takes advantage of the fact that image files have an end-of-file (EOF) marker in their metadata. This means that any program opening these images will read everything up to and including this marker. If you were to put anything after the EOF, it would be hidden from view. The second technique takes advantage of the fact that slightly changing the colour of a pixel is imperceptible to the naked eye. In this sense, the least significant bits of each pixel can be used to spread a message (e.g. text or image) across the pixels in an image. A program would then be used to extract this message.

To be informed when new content like this is posted, subscribe to the mailing list (or subscribe to my YouTube channel!):

In this post (part 1 of 2 – part 2 can now be found here) I would like to introduce the topic of image steganography, which is the practice of concealing secret messages in digital images. I’ve always been fascinated by this subject so I have taken the excuse to research for this post as a way to delve into the topic. Turns out that image steganography is a fascinating field that should be garnering much more attention than it is.

I will divide the post into two sections:

Steganography: what it is and its early history
Digital image steganography and some recently reported real-life cases – including one from an FBI report on Russian spying in the US (like something out of the cold war)

In my next post I will detail some simple techniques of hiding messages in images, so stay tuned for that.

What is Steganography

Usually today if we want to send sensitive data (e.g. credit card information), we encrypt this data before sending it across the internet. Sending messages like this, however, can arouse suspicion: there is obviously sensitive/secret data in your encrypted message that you are trying to conceal. Attackers know exactly where to look to try to obtain this information.

But steganography works differently: you hide the message in plain sight in order for your message to not attract any attention at all.

The first recorded case of steganography goes back to 499 BC when the Greek tyrant Histiaeus shaved the head of his slave and “marked” (probably tattooed) a secret message onto it. The message was intended for Aristagoras and it was telling him to start a revolt against the Persians. Histiaeus waited for the slave’s hair to grow back before sending him on his way. When the slave reached Aristagoras, his head was shaved again to reveal the hidden message.

Who would have thought to stop the slave and look for a hidden message tattooed on his head? Ingenious, wasn’t it? (Well, maybe not for the slave who was probably left with that message permanently on his head…).

That’s the way steganography works: through deception.

It is an important topic because of how seemingly common it is becoming. A report from 2017 by the global computer security software company McAfee says that steganography is being used in more ways today than ever before. However, Simon Wiseman, the chief technology officer of the network security firm Deep Secure, argues that it’s not so much that steganography is becoming more popular, just that we are discovering it more often by learning how it is being done: “now that people are waking up to the fact that it’s out there, the discovery rate is going up.”

Either way, as McAfee claims: “Steganography will continue to become more popular.”

Digital Image Steganography

As mentioned earlier, digital image steganography is the hiding of secret messages inside images. Take a look at these two images distributed by the FBI:

You wouldn’t think that both of them contain the following map of an airport, would you?

Well, they do. The FBI doesn’t lie 🙂

It’s a scary thing when you consider the huge number of images being sent across the internet every day. You would really have to know precisely where to scan for this stuff and what to look for otherwise you’re searching for a needle in a haystack.

Now, the first recorded case of image steganography in a cyberattack dates back to 2011. It was called the Duqu malware attack and it worked by encrypting and embedding data into small JPEG image files. These files were then sent to servers to obtain sensitive information (rather than doing destructive work directly like deleting files). McAfee says that it was used to, for example, steal digital certificates from its victims. How Duqu worked exactly, however, remains unknown. Researchers are still trying to work this out (although all sources I could find on this are fairly outdated). Quite amazing.

I found earlier reported cases, however, of image steganography being used for malicious purposes, not necessarily in cyberattacks. My favourite one is from the FBI.

Here’s an official report from them from 2010 accusing the Russian foreign intelligence agency of embedding encrypted text messages inside image files for communications with agents stationed abroad. This reportedly all took place in the 90s in the US. Turns out that the 10 spies mentioned in the report later pleaded guilty to being Russian agents and were used as part of a spy swap between the U.S. and Russian governments. The FBI and the Russians… and spy swapping! Like something out of a movie. Shows you how serious the topic of digital image steganography is.

You can see how this way of embedding communication in images is a much more sophisticated version of the “tattooing a message on a shaved head” example from Ancient Greece described above.

Is image steganography being used like this by ISIS to communicate secretly amongst each other? Chances are it is.

Early this year a communication tool was discovered called MuslimCrypt (poor choice of name, in my opinion). As Wired reports, the tool was found in a private, pro-ISIS Telegram channel on January 20. It is dead simple to use (take a look at the video on the Wired page to see this for yourselves): you select an image, write a message in text form, select a password, and click one button to hide this message inside the image. This image can then be sent across the internet after which the recipient puts it into MuslimCrypt and with one click of a button retrieves the hidden message. Sneaky, dangerous stuff.

What would make detection even more difficult is a hidden message distributed over multiple images. Well, models for this already exist, as the academic paper “Distributed Steganography” (Liao et al., International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2011) presents.

Moreover, a patent for distributed steganography was filed by a certain Charles Easttom II William in 2010. This image from the patent summarises distributed steganography nicely:

distributed-steganography-example — *(Image adapted from the original patent)*

Fascinating stuff, isn’t it?

Stay tuned for my next post where I will look in detail at some simple examples of digital image steganography. (Update: this new post can now be found here)

To be informed when new content like this is posted, subscribe to the mailing list (or subscribe to my YouTube channel!):

Be an Optimist Prime in the world of Computer Vision and AI

Tag: image steganography

Image Steganography – Simple Examples

JPEG Concealing

The Least Significant Bit Technique

Summary

Image Steganography – An Introduction

What is Steganography

Digital Image Steganography