Hacker Perspective: Martin Eberhard

"How long can the regime control what people are allowed to know, without the people caring enough to object?  On current evidence, for quite a while."

So concludes James Fallows' article about the Great Firewall of China in the March 2008 issue of The Atlantic.  (Search for [firewall china fallows]).

The Chinese firewall is a crude but effective system that looks at every single Internet connection in the country, and decides whether or not the user may proceed, based on policies set by the government.  If a Chinese citizen looks too hard for information about, say, Tibetan independence, the Tiananmen Square massacre, or Falun Gong, not only might her search be blocked, she is also inviting a visit from the police.  An outrageous invasion of privacy, isn't it?

Reading Fallows' article immediately made me think about how to get around the Chinese firewall, and made me wonder how many people there already have.  I guess it's the hacker instinct in me - I go straight from being outraged about the invasion of privacy to wondering how I might hack it if I had to.

I figured out how ordinary locks worked sometime in junior high school, and soon thereafter, I figured out how to pick these locks, how to make keys for them without fancy locksmith machines, and how to re-key locks my way.  Soon thereafter, I discovered computers, which definitely were not personal in those days.

I got kicked out of my 10th grade computer programming (FORTRAN) class for allegedly loading something into the school district's mainframe that brought the whole thing down.  (No comment.)

In those days, such security systems were challenges - picking the lock was an end in itself.  As I grew up, I channeled this energy into getting a decent engineering degree, then into becoming an entrepreneur.  I guess you could say that Tesla Motors was my first try at hacking the global energy system.

Meanwhile we are busily transforming the Land of the Free into a High-Tech Surveillance Society of our own.

In the name of preventing terrorism in this post-9/11 world, we have come to accept the USA PATRIOT Act, video cameras watching us along highways and intersections, more video cameras in other public places, invasive airport screening, scrutinized financial transactions, widespread wiretaps, surveillance of our online activities, efforts to create national identity cards, face recognition equipment at sporting events, and lots more.  (Search for [patriot act spying])

Alarmingly, we give up our privacy not just to protect ourselves from terrorists, but also for mundane convenience: "preference" information gathered by online retailers, credit card usage data, ubiquitous RFID tags embedded in consumer goods, "Club" discount cards at supermarkets, deep personal information posted at social networking sites and then sold to marketers, open wireless networks, etc.

In this article I focus on the ocean of data collected about us by search engine companies.

We know that search engine companies collect and save massive amounts of information about our searches, but then again, search engines are so useful and convenient.  (Search for [search engine data])

They ostensibly use this information to tune the advertising that we get to see.  We also know that many sites sell the data they collect to others.  Who knows to what other ends these data are put?  Some, such as Google, say as a matter of policy that they will not be evil.  (Search for [don't be evil])

Unfortunately, your privacy is not a right that is clearly or specifically called out in the U.S. Constitution.

Some specific aspects of your privacy are protected, such as the privacy of your beliefs (in the First Amendment), privacy of your home against demands that it be used to house soldiers (in the Third Amendment), privacy of you and your possessions against unreasonable searches (in the Fourth Amendment), and perhaps most importantly the Fifth Amendment's privilege against self-incrimination, which provides some protection for the privacy of your personal information.  (Search for [right to privacy])

Since about 1923, the U.S. Supreme Court has interpreted the "liberty" guarantee of the Fourteenth Amendment to guarantee an increasingly broad right to privacy, and this interpretation is the basis of most privacy protection outside those specifically listed.

But the future of this constitutional privacy protection remains an open question.  In our current Supreme Court, the so-called "originalists," like Justices Scalia and Thomas, are not inclined to protect your privacy beyond what is plainly and specifically guaranteed in the Bill of Rights.  (Search for [scalia thomas privacy])

(Supreme Court nominee Robert Bork has derided the right of privacy as "a loose cannon in the law."  Search for [bork "loose cannon in the law"].  Good thing he never made it onto the Court!)

Beyond constitutional protection, your privacy and your sensitive or personal information are protected somewhat by a patchwork of statutes on a per-industry basis.

The Privacy Act of 1974 prevents the unauthorized disclosure of your personal information that is held by the federal government.  The Fair Credit Reporting Act protects information about you that has been gathered by credit reporting agencies.  The Children's Online Privacy Protection Act restricts what information about your children (under age 13) can be collected by websites.  The Sarbanes-Oxley Act, HIPAA, and Gramm-Leach-Bliley Act each contain some protection for some of your personal or confidential information.  Some state laws also provide protection.

Since privacy is not specifically protected in the Constitution, there will continue to be a battle between those of us who want our privacy protected and those who want to invade it - often our own government, certainly also businesses that aggregate and sell our eyeballs and, worst of all, cooperation between the two.

Let's not forget most of the phone companies' gleeful cooperation with the U.S. government's widespread warrantless wiretap program.  (Search for [telecom wiretap us cooperation])

You can bet that every service provider company - search engine companies included - is paying close attention to the immunity that Congress is right now granting to these phone companies for their illegal participation in this wiretapping program.  (This is part of the latest Foreign Intelligence Surveillance Act, or FISA bill.  Search for [us telecom wiretap immunity])

What will happen when the government asks your favorite search engine company to divulge what you and I have searched for?  This has happened already.  So far, Google has resisted, but AOL and others did not.

The World Privacy Forum notes:

"In 2006, AOL released about 20 million search queries of over 500,000 of its users.  Those queries were put on the web.  Reporters for the New York Times were able to identify a user from the search queries; others have also been able to identify users.  In 2005, the U.S. Department of Justice subpoenaed Google, Yahoo, MSN, and AOL for tens of millions of users' search queries.  Google successfully fought the request, and was able to limit its disclosure, but it is unknown how much data other companies may have turned over."

Although Ask.com has subsequently announced that it will delete your searches after 18 months, Google has not.  (Search for [ask eraser])

To get an idea of how long Google is interested in your data, consider that a Google cookie on your machine expires in the year 2038!  (Search for [google cookie expires])

So the Google search you made three years ago for, say, "file sharing music" could come back to haunt you three years from now when some new, even more odious version of the Digital Millennium Copyright Act comes into law.

Can even Google forever be trusted not to be evil?  To what new ends will they put all that data about us?  Anyway, doesn't it creep you out knowing that they are saving and analyzing every search you have ever made?

And now, with Google's acquisition of DoubleClick, they will be able to correlate your searches with the rest of your web browsing - and maybe make it more painful to block cookies from DoubleClick and Google.

Strategies to Protect Your Privacy:

An anonymizer tool or a proxy site will mask your IP address and some of the info about your computer when you surf the web.

(To get an idea of what websites, including search engines, already know about you, check out this site: ipid.shat.net.  Spooky.)

I use an Ironkey when I can, and there are both free sites and pay sites that can make your surfing anonymous.  But some websites don't work well with these tools.

The World Privacy Forum suggests several strategies to help protect your privacy while using search engines:

  • Do not accept search engine cookies.  If you already have some on your computer, delete them.
  • Do not sign up for email at the same search engine where you regularly search.
  • Mix it up.  Use a variety of search engines.
  • Watch what you search for.
  • Read your news on one search engine, have your email on another, and use a handful of other separate search engines for web research.
  • Vary the physical location you search from.
  • If you surf using a cable modem, or a static (unchanging) Internet connection, ask your service provider to give you a new IP address.
  • Be aware that your online purchases can be correlated to your search activity at some search engines.

These search strategies are cumbersome and not especially effective.

We certainly cannot count on the government to respect or help to protect our privacy.  And I would rather not have to trust Google and Ask.com to protect my privacy.  What we need is a simple tool that requires little of our attention, and provides pretty good privacy - something as simple to use as a browser plug-in.

This is an opportunity for a little constructive hacking, and browsers that allow plug-ins provide the perfect opportunity.  What I am proposing is a simple plug-in for the Firefox browser (and any other browser that supports plug-ins) that will bury your searches in noise.  Let's call this plug-in "Haystack."  (Search for [firefox "how to write"])

Here is how it works: Haystack generates a relatively low-level background of random searches across a variety of search engines whenever your computer and your network connection are not too busy.  The goal is to generate hundreds to thousands of random (hay) searches for every real search you do, such that your searches are a small needle in the haystack of these automatically-generated searches.
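A rough way to check that needle-to-hay ratio is a back-of-the-envelope calculation.  The sketch below is only an illustration in Python; the function name and the input numbers (one hay search every 15 seconds on average, a machine that is on 10 hours a day, and a guessed count of real searches per day) are assumptions chosen for the example, not requirements of Haystack.

```python
def hay_per_needle(mean_gap_s: float, hours_on_per_day: float,
                   real_searches_per_day: int) -> float:
    """Return how many hay searches accompany each real search.

    mean_gap_s            -- average seconds between hay searches (assumed)
    hours_on_per_day      -- hours per day the machine is on (assumed)
    real_searches_per_day -- the user's own searches per day (assumed)
    """
    hay_per_day = hours_on_per_day * 3600 / mean_gap_s
    return hay_per_day / real_searches_per_day


if __name__ == "__main__":
    # Illustrative inputs only: a light searcher and a heavier one.
    for real in (2, 10, 50):
        print(real, "real searches/day ->",
              round(hay_per_needle(15, 10, real)), "hay searches per needle")
```

With these assumed inputs, someone who makes only a few real searches a day gets a ratio in the hundreds or more, while a heavy searcher would need a longer-running machine or a shorter interval to stay as well hidden.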

Search engines generally run analytic software that constantly looks for attacks - denial-of-service attacks, bogus click-throughs to pump up somebody's advertising costs, etc.  Since the goal of Haystack is to protect our privacy, not to bring any search engine down, it must be written in such a way that, from the search engine's point of view, it looks like you are just manually searching.

  • Search Engine Variety:  Through a setup option, you can select which search engines Haystack uses, matching the ones you normally use yourself.
  • Frequency:  I think one search every 15 seconds on average is about right, though the interval should be random, varying from, say, 5 seconds to about 5 minutes.  If your machine is on for 10 hours per day, this will generate 2,400 "hay" searches per day.  Remember, the goal is to look as much like a lot of human-generated searches as possible, not to jam up the search engine.
  • Search Terms:  This needs to be very broad, random, and always changing.  I suggest seeding the program with a search word list, and then pulling new search terms from the search results themselves, as well as occasionally from the text on the front pages of news sites like cnn.com.  The searches must include a spectrum of provocative terms, so that any such search that you might do will not stand out.
  • Search Complexity:  Like search terms, broad and random.  Search for single words, as well as several words at a time, and even with excluded words.
  • Computer Usage:  Ideally, Haystack should not initiate searches when either your computer is very busy or your network connection is very busy.  Since the actual search results are not valuable, Haystack should even abort an initiated search by closing the connection to the search engine if CPU usage suddenly increases.
  • User Controls:
    • On/off radio button.
    • Check boxes to enable one or more search engine sites.
    • Slider for search frequency (2 seconds to 10 minutes?)
    • Button to clear search engine cookies and private data.
    • Button to get latest version.

  • Output:  Haystack should not bother the user with an open tab; the search results should be silently loaded and discarded (after gleaning a new search term or two from the data).  A small icon on the toolbar indicating that Haystack is running should be good enough, perhaps also indicating the ratio of Haystack searches to your own searches.
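
The frequency, search-term, and search-complexity rules above can be sketched in a few lines.  What follows is a minimal illustration in Python, not plug-in code: a real browser plug-in would be written in the browser's own extension language, and every name and constant here (next_gap, build_query, glean_terms, the 20% chance of an excluded word, the 500-term cap, the four-letter minimum) is a hypothetical choice made for the example.  The gap is drawn as 5 seconds plus an exponentially distributed extra averaging 10 seconds, which is one assumed way to get a 15-second average while staying between 5 seconds and 5 minutes.

```python
import random
import re

MIN_GAP_S = 5.0     # shortest wait between hay searches
MAX_GAP_S = 300.0   # longest wait (5 minutes)
MEAN_GAP_S = 15.0   # target average wait


def next_gap(rng: random.Random) -> float:
    """Seconds to wait before the next hay search (random, bounded)."""
    extra = rng.expovariate(1.0 / (MEAN_GAP_S - MIN_GAP_S))
    return min(MIN_GAP_S + extra, MAX_GAP_S)


def build_query(pool: list[str], rng: random.Random) -> str:
    """Build a query of one to four pool words, sometimes with an excluded word."""
    n_words = min(rng.choice([1, 1, 2, 2, 3, 4]), len(pool))
    words = rng.sample(pool, n_words)
    query = " ".join(words)
    leftovers = [w for w in pool if w not in words]
    if leftovers and rng.random() < 0.2:   # assumed 20% chance
        query += " -" + rng.choice(leftovers)
    return query


def glean_terms(text: str, pool: list[str], rng: random.Random,
                how_many: int = 2, max_pool: int = 500) -> None:
    """Add a new term or two from result text to the pool, dropping the oldest."""
    found = {w.lower() for w in re.findall(r"[A-Za-z]{4,}", text)}
    candidates = sorted(found - set(pool))
    pool.extend(rng.sample(candidates, min(how_many, len(candidates))))
    if len(pool) > max_pool:
        del pool[:len(pool) - max_pool]


if __name__ == "__main__":
    rng = random.Random()
    pool = ["garden", "opera", "tibet", "engine", "recipe"]  # seed word list
    glean_terms("Breaking news: volcano erupts near harbor", pool, rng)
    print(build_query(pool, rng), "| next search in", round(next_gap(rng), 1), "s")
```

The sketch deliberately leaves out the network request, the CPU and network idle checks, and the user controls; those depend on the browser's plug-in interfaces and would need to be filled in by whoever writes the real thing.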

If you and I both run Haystack, then the "information" search engines collect from our searches is mostly noise.

Perfect.

But think what happens if millions of us run Haystack...  It does throw a monkey wrench into their lovely data collection machinery, doesn't it?  Such is the cost of asserting our right to privacy.

So why am I writing this?  Simple: I am a hardware hacker.

My software abilities are limited to some really tight assembly language code.  I am also spending most of my time planning my next big hack into the world of oil consumption, perhaps the subject of a future column here.

Although I care a lot about privacy and recognize its defense as a patriotic act, I am not the one to write Haystack.  Are you?

Martin Eberhard has founded three companies: Tesla Motors, NuvoMedia (makers of the Rocket eBook), and NCD Inc.  His interest in tech probably started when he disassembled his father's snazzy Omega Seamaster watch when he was six, though the experience of trying to get it back together again (and his father's wrath at his failure to do so) led him to go get an engineering degree or two, so that he actually knew what he was doing.
