Hacking Two-Dimensional Barcodes

by glutton

Recently, news articles have been trumpeting the "new" technology that lets us scan certain barcodes with our cell phones.

These news bites have grudgingly admitted that "certain Asian nations" have the technology already.  The truth is that keitai girls in Japan have known about this trick for ages; the rest of us are finally catching up.

The codes I'm talking about are called 2D or "matrix" codes.

Instead of bars, the data is stored as squares (typically), and the codes are read both horizontally and vertically, allowing for considerably greater density of information.

For a long time, these codes had primarily industrial applications, such as tracking pallets and containers in warehouses and through shipping routes.

However, with the advent of matrix-reading camphone software, all of a sudden there are potentially hundreds of millions of readers out there, and this has a lot of people giddy.  Imagine scanning a coupon code off a Gap billboard to save 10% on a pair of pre-torn jeans.  The codes, if blown up big enough, can be scanned at a distance.

Know Your Codes

First off, "matrix" doesn't refer to a certain trilogy of movies.

A matrix is a grid, pure and simple.

At one point, databases were called matrices because the information stored in a database can be displayed in a grid.  William Gibson called the then-nascent Internet the Matrix in his short stories and novels, and the Wachowski Brothers played off of that.  But when someone talks about a matrix barcode, the name simply means that the data cells are laid out in a grid.

There are several types of matrix barcodes out there.  However, there are four that stand out.

Aztec:  Popular in Japan, it can store up to 3750 ASCII characters.  Identifiable by the square "target" in the center of the code.  No "quiet space" is required with this standard, so the code can be placed on a patterned surface without the pattern being mistaken for data by the scanner.

Semacode:  A variant of the ISO 16022 Data Matrix standard.  It can store a maximum of 3116 numeric digits, and fewer characters when the data includes letters.  You can recognize a Data Matrix code by the solid lines along its left and bottom edges and the regular pattern of squares and spaces along its top and right edges.  A very flexible standard, Data Matrix can be used to make codes ranging from 8x8 to 144x144 cells.  An error-correction algorithm is built in, facilitating scanning of damaged codes.  Data Matrix is a DoD standard, is used by the USPS, and is also used for parts tracking in the electronics industry.

QR Code:  A Japanese standard since 1994, used primarily for inventory management until software was written allowing camera phones to scan them.  You can recognize a QR code by the three square "targets" in the upper-left, upper-right, and lower-left corners of the matrix.  A robust standard, the QR code can store over 7000 numeric digits, 4296 alphanumeric characters, 2953 ASCII characters, or 1817 kanji characters.

MaxiCode:  Used by United Parcel Service, MaxiCode is, for various reasons, unlikely to be used for cellphone scanning anytime soon; I've just included it here because we see them every day.  The standard MaxiCode is 1" square and consists of up to 884 hexagons in 33 rows surrounding a circular bullseye.  It can be read even if 25% of the code is destroyed.  An older standard, it can only store up to 144 characters.  UPS MaxiCodes typically consist of two pieces of information.  The first has the addressee's postal code, country code and delivery class (e.g., second-day air).  The second piece has the street address.
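
The QR Code figures above differ so much because a QR code packs data in different modes depending on which characters it has to hold.  Here is a rough Python sketch of that idea.  The function names are my own invention, kanji mode is left out, and the limits apply only to the largest symbol at its lowest error-correction level (7089 is the exact number behind "over 7000").  Treat it as an illustration, not as an encoder.

```python
import re

# Upper limits for the largest QR symbol at the lowest error-correction level.
# 4296 and 2953 are the article's figures; 7089 is the exact "over 7000" value.
QR_LIMITS = {"numeric": 7089, "alphanumeric": 4296, "byte": 2953}

# QR alphanumeric mode covers digits, uppercase letters, space and $%*+-./:
ALPHANUMERIC = re.compile(r"[0-9A-Z $%*+\-./:]*")


def densest_mode(payload: str) -> str:
    """Pick the most compact mode that can represent every character."""
    if re.fullmatch(r"[0-9]*", payload):
        return "numeric"
    if ALPHANUMERIC.fullmatch(payload):
        return "alphanumeric"
    return "byte"


def fits_in_qr(payload: str) -> bool:
    """Check the payload length against the limit for its densest mode."""
    mode = densest_mode(payload)
    # Byte mode is counted in bytes; the other modes are counted in characters.
    length = len(payload.encode("utf-8")) if mode == "byte" else len(payload)
    return length <= QR_LIMITS[mode]


if __name__ == "__main__":
    for text in ("0123456789", "HTTP://EXAMPLE.COM/A", "http://example.com"):
        print(text, "->", densest_mode(text))
```

Note that an all-uppercase URL lands in alphanumeric mode while the same URL in lowercase falls back to byte mode, which is why some generators shout their URLs at you.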

Security by Obscurity

All barcodes are inherently insecure.

The fact that they are machine-readable only (with the exception of traditional UPC symbols, which display the number below the bars) makes them more insecure, not less, because we rely on machines to verify their authenticity.  What cashier ever looks at the barcode of a package unless there's a problem scanning it?  Furthermore, the data isn't encrypted, so no "password" is required by the scanning machine to access the data stored in a barcode.

Shopping mall hustlers have long known about peeling UPCs off of one product and placing them on a more expensive model.  Now, with the advent of web-based encoders and label printers, this is only going to get worse.  At least a cashier can verify the UPC numbers written below the bars.  A matrix code is too information-dense to permit that.  You pretty much have to scan and pray.
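
To see how little protection those printed UPC numbers really carry, here is a short Python sketch of the standard UPC-A check digit calculation.  The function names are mine.  The point is that the check digit only catches misreads and typos; anyone can compute a valid one, so it says nothing about whether the label belongs on the product.

```python
def upc_check_digit(first11: str) -> int:
    """Compute the 12th (check) digit for the first 11 digits of a UPC-A."""
    if len(first11) != 11 or not (first11.isascii() and first11.isdigit()):
        raise ValueError("expected exactly 11 digits")
    digits = [int(c) for c in first11]
    # Digits in odd positions (1st, 3rd, ...) are weighted by 3.
    total = 3 * sum(digits[0::2]) + sum(digits[1::2])
    return (10 - total % 10) % 10


def is_valid_upc(code: str) -> bool:
    """True if a 12-digit UPC-A string carries the right check digit."""
    if len(code) != 12 or not (code.isascii() and code.isdigit()):
        return False
    return upc_check_digit(code[:11]) == int(code[11])


if __name__ == "__main__":
    print(upc_check_digit("03600029145"))
    print(is_valid_upc("036000291452"))
```

A swapped sticker passes this test just as well as the genuine one, because both were printed with correct check digits in the first place.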

Basically, the promise of cellphone scanning is that complicated URLs can be entered into a cellphone's browser without a lot of thumb strain on the user's part.

However, this is also the technology's greatest vulnerability.  It's the equivalent of clicking on Internet links without looking at the URL before you do so.

While it is suggested that users will peer at the URL in their phone's display before hitting "Go," the reality is that most people are either too ignorant or too lazy to pay attention.
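
Since people won't look, the reader software could do the looking for them.  What follows is a minimal Python sketch of that idea, not a description of how any real scanning app works.  The function name, the trusted-host list, and the example hostnames are all hypothetical.

```python
from urllib.parse import urlsplit


def describe_scanned_url(url: str, trusted_hosts: set[str]) -> str:
    """Decide what a reader app might do with a URL decoded from a code."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Anything that is not a plain web link gets refused outright.
    if parts.scheme not in ("http", "https") or not host:
        return "refuse: not a web URL"
    # Accept a trusted host or one of its subdomains, nothing else.
    if host in trusted_hosts or any(host.endswith("." + t) for t in trusted_hosts):
        return f"open: {host}"
    return f"ask user: unfamiliar host {host}"


if __name__ == "__main__":
    trusted = {"gap.example"}  # hypothetical allowlist
    for url in (
        "http://www.gap.example/coupon",
        "http://gap.example.evil.test/coupon",
        "http://gap.example@evil.test/coupon",
        "javascript:alert(1)",
    ):
        print(url, "->", describe_scanned_url(url, trusted))
```

The second and third test URLs are the kind of thing a spoofed sticker would carry: they look like the real site at a glance but resolve to somebody else's host.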

And, let's face it, anyone can make a barcode online these days.  Software like Bar Code Pro has gone by the wayside in favor of web-based utilities that make codes for free.  Coupled with evildoers' ability to print their own matrix stickers, you're going to see a lot of scams where codes are spoofed by covering up the real code with a fake one.

Less-larcenous ploys could involve movie times and transit schedules getting replaced with false information, while protesters and competitors can send would-be shoppers to sites detailing the sweatshop sins of a clothier or to the competition's home page.

Code Cloning

More insidious than creating a blatantly false code is duplicating an existing one.

Recently there was an article in 2600 where some guy had an idea to swipe library books by stacking them on top of each other so both security devices would be deactivated by the automated checkout machine.  A far more clever plan would involve creating new barcode stickers.  In short, clone another book to fool the machine.

So, how does this relate to matrix codes?

In the most recent laptop battery scare, the company I bought my computer from had a program where they'd send you a battery and a mailer and you'd mail your bad battery back.  They sent me two batteries.

I was a little confused because I only had one bad battery.  So I looked at the DHL tracking number for the two packages.  It was the same tracking number for both boxes, and the number of boxes shipped was listed as 1-of-1.

As far as the database was concerned, there was only one box.  Needless to say I kept the 2nd battery, and no one has raised a fuss.

This made me think that there is a vulnerability in the tracking system.

Obviously, two packages shipped on different days with the same code can go through.

Why?

Because the database seems to automatically believe a plausible tracking number or code.  Maybe it only works when a company ships out thousands of products, but to an automated sorting machine it shouldn't matter whether the shipper sent a million packages or just one.  If the package was shipped it should have a record.
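
For contrast, here is a Python sketch of the kind of check the database apparently isn't making.  I have no idea how any carrier's system actually works; the class, its methods, and the tracking number are all made up.  The idea is simply to remember how many pieces were declared for a tracking number and complain when one checkpoint sees more than that.

```python
from collections import defaultdict


class ManifestChecker:
    """Hypothetical duplicate-label check for a package sorting system."""

    def __init__(self) -> None:
        self.declared: dict[str, int] = {}  # tracking number -> pieces declared
        self.seen: defaultdict = defaultdict(int)  # (tracking, checkpoint) -> scans

    def register(self, tracking: str, pieces: int) -> None:
        """Record a shipment when the shipper creates the label."""
        self.declared[tracking] = pieces

    def scan(self, tracking: str, checkpoint: str) -> str:
        """Classify one scan of a label at one checkpoint."""
        if tracking not in self.declared:
            return "unknown"
        self.seen[(tracking, checkpoint)] += 1
        # More scans at one checkpoint than declared pieces means a cloned label.
        if self.seen[(tracking, checkpoint)] > self.declared[tracking]:
            return "duplicate"
        return "ok"


if __name__ == "__main__":
    checker = ManifestChecker()
    checker.register("TRACK-0001", pieces=1)  # made-up tracking number
    print(checker.scan("TRACK-0001", "origin"))
    print(checker.scan("TRACK-0001", "origin"))
```

Counting per checkpoint matters because a legitimate box gets scanned several times on its way through the network, so a plain "seen it before" test would flag every package.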

It would be interesting to leave a test box at a UPS drop box with a cloned label and see if it arrives.  Just make sure to leave your real name off of the waybill!

The Future of Matrix Codes

Unfortunately, the state of technology today is such that tacky criminals will ruin a perfectly good opportunity to explore and play pranks.

Who ruined Blue Boxing?

The street hustlers who sold long-distance phone calls from payphones.  The early semi-legal explorations of the Internet were turned into botnets and script-running spammers.

And so, as with other frontiers, this one will be colonized by petty crooks.

More technically, in many ways, the matrix barcode is the predecessor of the RFID chip.  Think about it: an information repository which is not human readable but can be scanned by machines.

So play while you can, because the matrix barcodes will only be around for so long.
