Waxy.org
Waxy.org is the sandbox of Andy Baio, a writer and tech entrepreneur in Portland, OR. I work with Expert Labs, helped build Kickstarter, founded Upcoming, made an album, and other stuff too.

Google Analytics A Potential Threat to Anonymous Bloggers

Posted Nov 16, 2011

Last month, an anonymous blogger popped up on WordPress and Twitter, aiming a giant flamethrower at Mac-friendly writers like John Gruber, Marco Arment and MG Siegler. As he unleashed wave after wave of spittle-flecked rage at "Apple puppets" and "Cupertino douchebags," I was reminded again of John Gabriel's theory about the effects of online anonymity.

Out of curiosity, I tried to see who the mystery blogger was.

He was taking all the ordinary precautions to hide his identity -- masking personal info in the domain record, using a different IP address from his other sites, and scrubbing any shared resources from his WordPress install.

Nonetheless, I found his other blog in under a minute -- a thoughtful site about technology and local politics, detailing his full name, employer, photo, and family information. He worked for the local government, and if exposed, his anonymous blog could have cost him his job.

I didn't identify him publicly, but let him quietly know that he wasn't as anonymous as he thought he was. He stopped blogging that evening, and deleted the blog a week later.

So, how did I do it? The unlucky blogger slipped up and was ratted out by an unlikely source: Google Analytics.

Reverse Lookups

Typically, Google will only reveal a user's identity under a court order, as they did with a Blogger user who harassed a Vogue model in 2009.

But anonymous bloggers are at serious risk of outing themselves, simply by sharing their Google Analytics ID across the sites they own.

If you're watching your pageviews, odds are you're using Google to do it. Launched in 2005, Analytics is the most popular web statistics service online, in use by half of Alexa's top million domains.

For the last few years, online SEO tools have publicly listed the Analytics and AdSense IDs of the domains they crawl, typically for competitive intelligence, such as ferreting out your competitor's other websites.

But in the last year, several free services such as eWhois and Statsie have started offering reverse lookup of Analytics IDs. (Most also allow searching on the Google AdSense ID, though I wasn't able to find an anonymous blogger sharing an AdSense ID across two sites.)
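To make the matching concrete, here's a minimal sketch in Python of what a reverse lookup boils down to. It assumes you already have the HTML source of each page, and that sites are matched on the account portion of a classic "UA-account-property" ID. The domains, IDs, and function names are made up for illustration, and this isn't a description of how eWhois or Statsie actually work.

```python
import re
from collections import defaultdict

# Classic Analytics IDs look like "UA-1234567-2": the middle number
# identifies the account, the trailing number the property within it.
UA_PATTERN = re.compile(r"\bUA-(\d{4,10})-(\d{1,4})\b")


def analytics_accounts(html):
    """Return the set of account prefixes (e.g. "UA-1234567") found in a page."""
    return {f"UA-{account}" for account, _prop in UA_PATTERN.findall(html)}


def shared_accounts(pages):
    """Map each account prefix to the domains embedding it,
    keeping only prefixes seen on more than one domain."""
    by_account = defaultdict(set)
    for domain, html in pages.items():
        for account in analytics_accounts(html):
            by_account[account].add(domain)
    return {
        account: sorted(domains)
        for account, domains in by_account.items()
        if len(domains) > 1
    }


# Hypothetical page sources; every domain and ID here is invented.
pages = {
    "angry-anon.example": "<script>_gaq.push(['_setAccount', 'UA-1234567-2']);</script>",
    "my-real-name.example": "<script>_gaq.push(['_setAccount', 'UA-1234567-1']);</script>",
    "unrelated.example": "<script>_gaq.push(['_setAccount', 'UA-7654321-1']);</script>",
}

print(shared_accounts(pages))
# {'UA-1234567': ['angry-anon.example', 'my-real-name.example']}
```

Running the same check against your own sites before launching an anonymous one is a quick way to see what a crawler would see.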

Finding anonymous bloggers through Analytics is less likely to work than other methods: someone is still more likely to slip up by leaving personal info in a domain record or sharing a server IP than by sharing a Google Analytics account. But it's also more accurate. Hundreds or thousands of people can share an IP address on a single server, and domain information can be faked, but a shared Google Analytics ID is solid evidence that both sites are run by the same person.

And unlike any other method, it can unmask people using hosted blogging services. Tumblr, Typepad and Blogger all have built-in support for Google Analytics, though reverse lookup services haven't comprehensively indexed them. (Note that WordPress.com doesn't support Analytics or custom JavaScript, so its users aren't affected.)

Just to be clear, this technique isn't new. The first Google Analytics reverse lookup services started in 2009, so the technique's been possible for at least two years. My concern is that it isn't nearly well-known enough. It's not mentioned in any guide to anonymous blogging I could find and several established bloggers, engineers, and entrepreneurs I spoke to were unaware of it.

Unmasking an anti-Mac blogger may not be life-changing, but if you're an anonymous blogger writing about Chinese censorship or Mexican drug cartels, the consequences could be dire.

I decided to see how pervasive this problem is. In a sample of 50 anonymous blogs pulled from discussion forums and Google News, only 14 were using Google Analytics, far fewer than average. Half of those, or 14% of the total, were sharing an Analytics ID with one or more other domains.

In about 30 minutes of searching, using only Google and eWhois, I was able to discover the identities of seven of the anonymous or pseudonymous bloggers, and in two cases, their employers. One blog about Anonymous' hacking operations could easily be tracked to the founder's consulting firm, while another tracking Mexican cartels was tied to a second domain with the name and address of a San Diego man.

I've contacted each to let them know their potential exposure.

Protecting Yourself

Some of the most vital voices online are anonymous, and it's important to understand how you're exposed. Overlooking any of the precautions below can lead to lawsuits, firings, or even death.

If you're aware of the problem, it's very easy to avoid getting discovered this way. Here are my recommendations for making sure you stay anonymous.

  1. Don't use Google Analytics or any other third-party embed system. If you have to, create a new account with an anonymous email. At the very least, create a separate Analytics account to track the new domain. (From the "My Analytics Accounts" dropdown, select "Create New Account.")
  2. Turn on domain privacy with your registrar. Better, use a hosted service to avoid domain payments entirely.
  3. If you're hosting your own blog, don't share IP addresses with any of your existing websites. Ideally, use a completely different host; it's easy to discover sites on neighboring IPs.
  4. Watch your history. Sites like Whois Source track your history of domain and nameserver changes permanently, and Archive.org may archive old versions of your site. Being the first person to follow your anonymous Twitter account or promote the link could also be a giveaway.
  5. Is your anonymity a life-or-death situation? Be aware that any service you use, including your own ISP, could be forced to reveal your IP address and account details under a court order. Use shared computers and an anonymous proxy or Tor when blogging to mask your IP address. Here's a good guide.
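As a rough self-check for the third recommendation, here's a small Python sketch that flags sites sitting in the same block of IP addresses. It assumes you've already looked up each site's IPv4 address yourself, and it treats "neighboring" as "in the same /24 block", which is my own simplification. The domains and addresses are placeholders from the documentation ranges, and the `neighbors` helper is hypothetical.

```python
import ipaddress
from itertools import combinations


def neighbors(site_ips, prefix=24):
    """Return pairs of sites whose IPv4 addresses share the same /prefix block."""
    pairs = []
    for (site_a, ip_a), (site_b, ip_b) in combinations(sorted(site_ips.items()), 2):
        # strict=False lets us build the enclosing network from a host address.
        block = ipaddress.ip_network(f"{ip_a}/{prefix}", strict=False)
        if ipaddress.ip_address(ip_b) in block:
            pairs.append((site_a, site_b))
    return pairs


# Placeholder data: documentation-range addresses, invented domains.
site_ips = {
    "anon-blog.example": "203.0.113.10",
    "personal-site.example": "203.0.113.77",
    "other-host.example": "198.51.100.5",
}

print(neighbors(site_ips))
# [('anon-blog.example', 'personal-site.example')]
```

An empty list doesn't prove you're safe, since the same host can own several unrelated address blocks, but any pair it does print is worth moving to a different provider.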

Stay safe.

28 Comments

Nov 16, 2011
5:39 PM  
Jessamyn West wrote:

I'm glad you explained all this in such detail. I remember when you told me this when it happened and I Could Not Figure it Out. This is a great post for American Censorship Day.


Nov 16, 2011
6:02 PM  
Jason wrote:

Awesome explanation; I, like Jessamyn, was amazed at your Internet Wizardry when you mentioned that you had figured the anti-Apple blogger's ID out. Nice work.


Nov 17, 2011
1:42 AM  
mgk wrote:

I am a little disappointed, I was just starting to enjoy daring no balls...


Nov 17, 2011
3:04 AM  
Gundars wrote:

it's easy to discover sites on neighboring IPs.
how?


Nov 17, 2011
6:32 AM  
Andy Baio wrote:

Statsie supports it, among others.


Nov 17, 2011
7:37 PM  
Guy wrote:

Way to take something negative, "wave after wave of spittle-flecked rage," and turn it into something positive by protecting "important and vital voices online." Nice work.


Nov 17, 2011
11:23 PM  
scottru wrote:

Andy, I found his identity using the Google Analytics tag as well - but you didn't even need to know about one of the reverse lookup services.

All you needed to do was Google the prefix - i.e. search for "UA-XXXXXXX" brought up a page which listed a range of sites (I've since forgotten which services and the ID has changed), and this one was linked to the others.

You can test the same with other prefixes, like yours above (though that one returns for other reasons).


Nov 18, 2011
3:45 AM  
manuel piñeiro wrote:

This type of Analytics API string-matching is offered as a premium service at sites like DomainTools.

I wanted to add a different privacy concern as this article deals with protecting the privacy of the webmaster. The Indymedia Network (which has dealt with numerous server seizures and court-ordered demands for IP address within web server logs) has long had a policy of using a FIFO to log IP addresses but scrubbing them in real-time as logs are saved to disk.

Although not officially codified, it logically follows that Indymedia web properties should not use Google Analytics or other dot-com stats offerings. The idea is that Indymedia tech will do everything in its power to prevent data retention of IP address data.

This policy has been tested to work; Indymedia has successfully brushed off subpoenas that normally would have cost thousands in legal fees. "We don't have them" was the only response needed.

Note that the US is far more progressive on this matter than Europe which has an on-going struggle between lawmakers and activists against "data retention legislation" requiring ISPs to monitor certain datapoints and keep them on file for 6-12 months.

An alternative is to use awstats (an open source, real-time stats program) and configure it however you wish.


Nov 18, 2011
12:30 PM  
Andy Baio wrote:

Very true. The reverse lookup sites make it even more brainless, you don't even need to view source to get the analytics ID.


Nov 18, 2011
4:09 PM  
Arne wrote:

Wrote a little follow-up on this:

http://konver.se/how-to-be-truly-more-anonymous-with-your-anonymous-blog/


Nov 18, 2011
4:20 PM  
Dawn wrote:

Wow, I didn't really know that you could track an anonymous blogger like this. Not saying that they cannot be trustworthy, and now I understand why some people remain anonymous, but I honestly did not always take them as seriously in the past. I never really thought until now that some people may need to remain anonymous.

I haven't had a problem with any anonymous blogger but this is great to know if I ever have a problem. I respect you for how you handled the situation, some people would have publicly exposed them. You handled it quite well!


Nov 18, 2011
5:40 PM  
Don't like it wrote:

I cannot stand anonymous commenting for the most part, I am sorry, I know it is very popular and wide-spread. I use it myself sometimes. But I have been an eye-witness to the devastating effect of online bullies on Topix who have the power to sit behind a keyboard and spew hate, slander, defamation, cruelty and lies, all day and night long.

It's total naivety to assume that words cannot have close to as devastating an effect on other human beings as can other violent weapons. And violent, harmful words are being put out by the truckload, words used as weapons.

The First Amendment was not written to allow people to harm each other. And yet harm is being done, at a volume that is unreported, but massive.


Nov 19, 2011
2:03 AM  
Daniel wrote:

This is scary. Thanks for your explanation.
You can imagine what evil governments can find out when observing even more traces. For example, inserting a blind gif somewhere on the anonymous blogger's board and then comparing IPs with comment publishing times.


Nov 19, 2011
4:58 AM  
Venessa wrote:

I can understand "Don't Like It" because I feel like commenters should stand behind their words, and bullies don't deserve protection. If you have convictions, have the courage to speak them...and sign your name. But I'm spoiled too.

I don't have to speak out about human rights violations and crime. I think this is important for those who live under the threat of these things to know how they can be exposed. Sadly, it does protect hate spewing bullies at the same time. It's a double edged sword, but worth it for those out there who could die for speaking out against criminals and murderers.


Nov 19, 2011
8:06 AM  
Tess Giles Marshall wrote:

This is really interesting. And what's very strange is I did a search on Statsie for my site and found not only the other site I expected which I also run but another, completely new to me, site (French) sharing the same IP address. I looked at it, but it doesn't seem to be a current site. No idea what it is though or how it's possible to share my IP address.


Nov 19, 2011
9:07 PM  
Ally wrote:

Wow. I am really impressed that you didn't call out the anonymous bloggers. You have a good heart.


Nov 20, 2011
2:31 PM  
Don't like it wrote:

Venessa: People who speak out against human rights violations and crime are also being attacked, and hard, by anonymous commentators.

The common defense of anonymous commenting is that people living in repressive societies are protected where putting their real name out would earn them persecution.

What is being totally overlooked by this defense is what is happening in our own country (and around the world) by anonymous bullies who have been given full protection from accountability. They are not only causing grave harm to very large numbers of innocent people but also causing harm to those trying to make a positive difference in the world.

The entire issue of anonymous commenting needs to be re-examined based on the facts of how it is actually being used.


Nov 20, 2011
10:06 PM  
Scott Bartell wrote:

I've been waiting for a search engine - with an index as extensive as Google's - that has the ability to search within source code.

I doubt Google will ever allow it but that ability would be a game changer.

By the way... in Google Analytics you can create new "accounts" which will assign an entirely new account number to whatever websites you create profiles for in the account. The only one who could tell that the account numbers are under the same user is Google.

I'm pretty sure Google is smart enough to figure out (if they wanted to or needed to) who created or uses a Google Analytics account... even if you used a dummy email address, e.g. through similar user activity, IP addresses and login times, geolocation, multi-account sign-in, search history, etc., etc.

Just sayin'.


Nov 21, 2011
2:12 AM  
dd wrote:

"What is being totally overlooked by this defense is what is happening in our own country (and around the world) by anonymous bullies who have been given full protection from accountability. They are not only causing grave harm to very large numbers of innocent people but also causing harm to those trying to make a positive difference in the world."

This is bullshit. Anonymous bullies never killed anyone with their words; you can simply ignore them or develop a thick skin. On the other hand, exposing anonymous posters can actually lead to their deaths. So it is better to put up with a few word bullies who you can ignore or flame back than it is to support unmasking anonymous posters.

Nice article by the way.


Nov 22, 2011
8:46 AM  
Aagya wrote:

I do feel safe and satisfied that I read this post on your site. I never knew about this issue. Great recommendations also.


Nov 22, 2011
2:46 PM  
Terri Ann wrote:

In a previous life we used this strategy with a bot to crawl sites and identify which ones were likely part of a network of sites or under a potential single owner. I can't go into detail why but the sales guys loved having that data and it was so easy to find and regex out! Heck you didn't even need a regex most of the time.

Dynamically populating the Analytics ID of your site through an external script will usually work too and also aids development when toggling between dev and production environments. Though I'm not sure how thorough the bots on the free services you mentioned are at getting past that trick.


Nov 28, 2011
1:05 PM  
theo wrote:

Very nice write up.
Regarding protecting yourself, point 2.
I wouldn't recommend that. Keep your domain name and your hosting provider separate.
Too many cases like that go south, with the domain owners being left high and dry.

Have a great day.



Dec 4, 2011
11:59 AM  
john wrote:

Even as a longtime waxy.org fan, I was struck by how much this is a great post.

I've been online since before the golden years of Usenet (remember when email was delivered in packets?), and my sympathy for anonymity has shifted back and forth. Much more forth, of late.

Andy, this is a beautiful articulation of the pros and cons—weighing the need to allow "spittle-flecked rage" if that also allows for communicating in a way that admits and fosters change.

Thanks.


Jan 25, 2012
12:09 PM  
David wrote:

It is great to see a criticism (that is valid) offer a solution; so many so-called experts at big news sites just make sensational claims but can't offer any remedy. Well done. (I do however note that you have not allowed me to remain anonymous) :-)


Mar 18, 2012
9:34 AM  
Mark wrote:

Another tip! MONEY is a BIG identity-revealer. if ANYTHING connected to what you're doing, even if 3-4 steps away from what it is (things can be traced in a line no problem), costs money and you have to pay for the service you're using (like a VPN, site hosting), USE AN ANONYMOUS PREPAID CREDIT CARD.

for usa, simon gift card, for uk, idt prime card, and there's others in other countries too.

if you're even more worried about being identified on the cctv cameras of the stores you buy the prepaid card from: just cover yourself accordingly there too, and also pick a store that doesn't give away your location. in the center of a large city is best.

Another tip: if you have your own domain and need protection from domain being taken down or info extracted from your domain account: off STRONG offshore domain registrar.

one of the very best is katzglobal.com. bit more money but you get these double legal protections, it's in a country that won't cooperate with most others and has local laws securing it.

It helps.......


Mar 24, 2012
1:32 AM  
tauseef wrote:

This is scary. For example, inserting a blind gif somewhere on the anonymous blogger's board and then comparing IPs with comment publishing times.


Apr 20, 2012
2:56 PM  
Mark wrote:

Wow! A great read, although a little scary as I now realise my analytics is a real Achilles heel. Have been thinking of changing stat collection, I may have to do that now.
Thanks, Mark.


May 15, 2012
3:01 PM  
Joe S wrote:

A bully is someone who threatens you with harm *and won't let you walk away.* Someone typing on a computer may harass, libel, or make you feel bad, but it's you who are choosing to stay there and read it.


 


Andy Baio lives here. Some rights reserved, for your pleasure.