Active Defenses for Industrial Espionage

by Anonymous

I was a hired gun for many large corporations, finding dirt on targets, doxing their family homes, and providing a written report as if it was an ethical, professional service rendered.

Oh, you too?

Yes, this is a profession, yes, you can get paid to dox people on the Internet, and I would bet that someone you know does the same thing.  And we suck.

I've also been targeted by corporations who didn't like some reverse engineering I was doing.  Their goons tried to track me down to send me a legal threat and I at least confused them for six months before they had to resort to using my hosting company to find me.

Almost every single large organization has an industrial espionage team that might fly under a different name like "competitive intelligence" or "business analysis division."  No one thinks Brenda from Business Analysis is a threat, but we should be afraid of her.

Their job is to find threats to their organization, be it a competing company that could affect their stock price or a kid in an IRC channel trying to build support for a protest which is just bad PR.  And those teams have teams of third-party vendors that do some of the dirty work for them.  Not always because they can't do it themselves, but because they want the deniability if something goes wrong.

I was one of these vendors and I want to share things that might help you if you're ever targeted by a goon like me.

Know Your Enemy

One of the things that motivates me is being told I'm not allowed to do something and then proving them wrong.

So when you block my access to your Facebook page or delete your Twitter account, I just work harder to find dirt on you.  I bet, in some ways, you're like this too.  So are people working in corporate intel.  We can use this information to coordinate a better defense.

We are focusing on one threat here: that of the salaried, 401(k) contributing, 9-5 corporate intelligence goon.  They are not nation state adversaries, they are not local law enforcement.  They have specific operating constraints that can be exploited for defensive purposes.

Here's what you should know:

1.)  They are resource constrained.  Unless you've done something particularly nefarious, you're not worth all their time.  And if you're a third party working for a corporation, you can't spend a month on a person and come back without actionable intel.  You have to determine whether it's worth it at the beginning of the project.  Let's see if we can't waste some of their important time.

2.)  They need to produce a result.  In enterprise environments, you don't get paid to start projects that don't go anywhere.  If they are targeting you, they are going to produce a result.  It's a simple boolean conclusion: threat, no threat.  And they must provide supporting evidence to justify this conclusion.

"She is a threat because she's building support for a protest in front of the building."

"He is not a threat because he's 13, lives with his mom, and posted to Stack Overflow 'How Do I hack?'"

If there is no supporting evidence for a report, how will they come to a conclusion?  So let's help them do their job, arrive at a conclusion, and move on quickly.

3.)  They are automated and fast.  During the initial phases, a lot of them are going to be fast and loose because they're looking for quantity of information, not quality.  They'll eventually whittle that down to something more actionable later.  This is when they are at their weakest.  They'll usually leverage shared hosting environments (Facebook, Twitter) and their APIs to collect the data at first before moving on to crawling your personal website.

The first thing that their bosses and lawyer-types want is screenshots of everything you've ever written on the Internet.  They'll crawl your blog, company website, Twitter, Reddit, you name it.  It's all about collection.  These requests are going to come from other people's IPs: Tor exit nodes, EC2 instances, or VPNs.

Crawling Defenses

Hosting your own website and having it crawled is a great way to figure out that you're being targeted.  Here are some tactics to consider:

Redirect Loops - Web servers like NGINX can be configured in all kinds of fun ways, such as returning arbitrary content for every path on your site.  I've seen crawling bots taken down by redirect loops.  In short, you redirect their crawlers from one path to another indefinitely, so they waste time, bandwidth, and storage collecting arbitrary content.
Here's an example of NGINX configuration:
location = /content/secret {
  return 302 /secretcontent/moresecrets;
}
location = /secretcontent/moresecrets {
  return 302 /content/secret;
}
Link Bait - Most crawler bots look at the HTML first and try to find <a href="..."> tags to follow.  Many of the crawlers will blindly follow the links and download anything.  Fill your personal blog or website with hidden links to crap content like so:
<div name="secret" style="height: 0px;width: 0px;overflow: hidden;">
<a href="/secret_path">Secret</a>
</div>
Don't forget to actually add content to these paths or, even better, randomly generate the content every time they visit.  You can be more sly about this than just hiding it with CSS, can't you?
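
Generating the content on every visit could look something like the sketch below.  It is only an illustration: the word list, the /notes/ path prefix, and the decoy_page name are all made up for this example, and you'd wire the function into whatever serves your bait paths.

```python
import html
import random
import string

# Hypothetical vocabulary; swap in words that look like your real posts.
WORDS = ["server", "meeting", "coffee", "deploy", "weekend", "update",
         "garden", "backup", "ticket", "release", "lunch", "review"]


def decoy_page(path, paragraphs=5, rng=None):
    """Return a throwaway HTML page for a bait path, different on every call."""
    if rng is None:
        rng = random.Random()
    parts = []
    for _ in range(paragraphs):
        words = [rng.choice(WORDS) for _ in range(rng.randint(20, 40))]
        parts.append("<p>" + " ".join(words).capitalize() + ".</p>")
    # Link to another made-up path so a blind crawler keeps going.
    slug = "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
    parts.append('<a href="/notes/' + slug + '">More</a>')
    title = html.escape(path)
    return ("<html><head><title>" + title + "</title></head><body>"
            + "\n".join(parts) + "</body></html>")
```

Passing a seeded random.Random makes the output repeatable, which is handy if you want the same bait page to look stable to a returning visitor.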

Random Content - It's a terrible feeling to finish scraping a site and find that there's way too much content to really go through.  Fill your sites with random content and pages that don't affect users, but love to get eaten up by bots.  The larger the better, especially pages that look like real content.

Don't make fake admin pages unless you're prepared for the consequences.  There's nothing that would motivate me more to look at your site than finding an admin page.

Social Defenses

O.K., you're using social networks.  I get it.  How can we enjoy society but also defend against people hunting us down?

Facebook FUD - OPSEC rules would say don't use Facebook, but you're going to.  Try to set up a fake account for yourself without angering the Facebook gods.  Use some of your real personal information like your name, and overshare all kinds of information about you like your home address, work location, etc. - making sure all of it is a real location, just not related to you.

Then you need content.  I think pictures of food seem like a legitimate use of Facebook.  You can use a service like buffer.com, which lets you schedule posts to your Facebook profile.  Load up buffer with fantastic images and queue it up to post on a regular basis.

If you have the time and effort to build a profile with relevant content, even better.  Come up with your own persona.  Maybe you want to post some personal information about breaking up with your significant other.

The attack here is trying to bore them into not looking for you.

Canary Tokens - Canary tokens are a simple service that alerts you when a token is accessed.  Consider throwing canary tokens all over some of your most obscure online locations, like email signatures in a mailing list, Facebook posts, PDFs on GitHub, everywhere.

You can run your own service, but Canary Tokens from Thinkst (canarytokens.org/generate) offer all kinds of useful tokens that can alert you when:

Your website is crawled.

Someone visits a custom DNS name.

Someone reads a Word document or PDF.

A special URL is visited.
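
If you do run your own, the core of a URL token is small: a random, unguessable path plus a note of where you planted it.  The sketch below shows only that bookkeeping.  It is not how Thinkst implements their service, and the TokenRegistry class and the example.com base URL are placeholders invented for this example.

```python
import secrets


class TokenRegistry:
    """Maps unguessable URL tokens to the place each one was planted."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")
        self.placements = {}

    def mint(self, placement):
        """Create a fresh token URL and remember where it will be planted."""
        token = secrets.token_urlsafe(16)
        self.placements[token] = placement
        return self.base_url + "/" + token

    def identify(self, request_path):
        """Given a requested path, return the placement it belongs to, or None."""
        path = request_path.split("?", 1)[0].rstrip("/")
        token = path.rsplit("/", 1)[-1]
        return self.placements.get(token)
```

Mint one token per location (one for the mailing list signature, another for the PDF) so that a hit tells you which breadcrumb they followed.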

Social Obfuscation - Is your name John Smith?  You're in pretty good shape when it comes to someone tracking you down.  Do you go by the hackername "xXx_StackSmasher_NYC_xXx"?  We will find you and it will be easy.

It may be too late to change your accounts at this point, but you can always obfuscate the situation with false information.

If you're interested in this subject, I'd recommend the book Obfuscation: A User's Guide for Privacy and Protest by Finn Brunton and Helen Nissenbaum.

Domains and Self Hosting

Hosting your own infrastructure gives you better insight into who is targeting you and when.  The reason I found out that I was being targeted is because I was alerted to my site being crawled heavily by a specific set of IPs in a specific city.
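
A rough sketch of that kind of alerting, assuming your web server writes the usual common or combined access log format with the client IP first.  The function names and the threshold are arbitrary choices for illustration, and the bait paths are whatever hidden links you planted on your own site.

```python
import re
from collections import Counter

# Start of a common/combined log line: client IP, two fields, [timestamp],
# then the quoted request line ("METHOD path ...").
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "\S+ (\S+)')


def heavy_hitters(lines, threshold=100):
    """Return (ip, request_count) pairs at or above threshold, busiest first."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m:
            counts[m.group(1)] += 1
    return [(ip, n) for ip, n in counts.most_common() if n >= threshold]


def bait_hits(lines, bait_paths):
    """Return the set of client IPs that requested any bait path."""
    hits = set()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group(2).split("?", 1)[0] in bait_paths:
            hits.add(m.group(1))
    return hits
```

No real visitor should ever request a hidden bait path, so a single hit there is a stronger signal than raw request volume.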

WHOIS You - You can always set up privacy guards to protect your WHOIS information for domains that you own, and it's illegal to falsify the information on a domain registration so I would never recommend that you do something illegal.  You would never want to change your WHOIS information for your domain to someone else's to fool someone trying to look up information on you.  Even if doing so is not regularly policed and has no major repercussions.

Domain Purchases - Did you know that most corporations have a feed into all the new purchases of a domain?  Every time you buy a domain that says ihateCOMPANYNAME.io, the company gets an alert.  That alone is enough to start a campaign against you.

And these same services will log what a domain registration has been historically.  If you don't set your WHOIS privacy at the time of purchase or you let it lapse for a month, that'll show up in the logs and they will find you... or whomever you put in as the registered owner.

Be smart about these purchases.  If you need to trigger one of these alerts, make sure you're prepared for at least a little follow-up.

Who Will Attack the Attackers?

It may fall into the category of "hack back," but we can specifically target the people that are targeting us.

Malicious Content - If they're going to look at your content, and you can identify which IPs they're coming from, why not add some interesting JavaScript to track them?  With a few lines of code, you can identify the real IP address of the users using WebRTC.

samy.pl/evercookie

diafygi.github.io/webrtc-ips

Tool Targeting - They use the same tools you would: Requests, Selenium, GNU Wget, HTTrack, Chromium, whatever.  Every single tool has a very specific fingerprint.  Every one.  Yes, you can figure it out through the User-Agent, but there are also very tiny details of each tool that make it different from the packet flow perspective.

If you can detect which tool they are using to hunt you, you can decide how you want to defend against them.
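
The User-Agent check is the easy first pass.  Here's a minimal sketch that matches the default strings these tools send, as far as I know them.  The substrings may change between versions, any operator who cares will override them, and the guess_tool name is made up for this example.

```python
# Substrings found in each tool's default User-Agent (assumed defaults;
# a careful operator can change any of them).
SIGNATURES = [
    ("python-requests/", "Python Requests"),
    ("wget/", "GNU Wget"),
    ("httrack", "HTTrack"),
    ("headlesschrome", "headless Chromium (often driven by Selenium)"),
]


def guess_tool(user_agent):
    """Return a best-guess tool name for a User-Agent header, or None."""
    ua = (user_agent or "").lower()
    for needle, tool in SIGNATURES:
        if needle in ua:
            return tool
    return None
```

A None result doesn't mean a human: it just means they bothered to set a browser-like User-Agent, which is when the packet-level details start to matter.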

For example, if you think that it's Python Requests, then you may cause some kind of memory exhaustion from a very large web page that you redirect to.  With Selenium, you can inject JavaScript or HTML5 that is CPU intensive.  Maybe you can put their CPU to good use to mine some crypto for you.

Try this, throw some HTML into a file called body.html and then run a command like this:
#!/bin/zsh
for i in {1..50000}; do cat body.html >> bigbody.html; done
python3 -m http.server 8000
If you wrote a Python script that used "Requests" to access the page and profiled its memory usage, the result would look like this:
Filename: get_bightml.py
Line #    Mem usage    Increment   Line Contents
================================================
     4     24.1 MiB     24.1 MiB   @profile
     5                             def get_html():
     6    618.2 MiB    594.1 MiB       r = requests.get("http://127.0.0.1:8000/bigbody.html")
     7    618.2 MiB      0.0 MiB       return r
By consuming the entire file and putting it into memory, if they haven't restricted the memory usage of their script, it will crash when memory runs out. Tie this in with the redirect loop above and you can start causing machines to reboot.

Conclusions

Look, all of the things I've listed above can be mitigated by the corporate goons who give fractions of a damn. But that's partially the point. Remember, they don't have time to mess around with edge cases like you (unless you're doing much nastier things, in which case you'll need even more OPSEC), they aren't using secret spy tools to find you, and all they really want to do is conclude whether you're a threat. So why not help them out and bring them to the conclusion that you want?

And if you are like I was, working as a shady corporate spy, do something better with your brain than helping corporations bully people.
