Waxy.org
Waxy.org is the sandbox of Andy Baio, a writer and tech entrepreneur in Portland, OR. I work with Expert Labs, helped build Kickstarter, founded Upcoming, made an album, and other stuff too.


Google Analytics A Potential Threat to Anonymous Bloggers

Posted Nov 16, 2011

Last month, an anonymous blogger popped up on WordPress and Twitter, aiming a giant flamethrower at Mac-friendly writers like John Gruber, Marco Arment and MG Siegler. As he unleashed wave after wave of spittle-flecked rage at "Apple puppets" and "Cupertino douchebags," I was reminded again of John Gabriel's theory about the effects of online anonymity.

Out of curiosity, I tried to see who the mystery blogger was.

He was using all the ordinary precautions for hiding his identity -- hiding personal info in the domain record, using a different IP address from his other sites, and scrubbing any shared resources from his WordPress install.

Nonetheless, I found his other blog in under a minute -- a thoughtful site about technology and local politics, detailing his full name, employer, photo, and family information. He worked for the local government, and if exposed, his anonymous blog could have cost him his job.

I didn't identify him publicly, but let him quietly know that he wasn't as anonymous as he thought he was. He stopped blogging that evening, and deleted the blog a week later.

So, how did I do it? The unlucky blogger slipped up and was ratted out by an unlikely source: Google Analytics.

Reverse Lookups

Typically, Google will only reveal a user's identity with a federal court order, as they did with a Blogger user who harassed a Vogue model in 2009.

But anonymous bloggers are at serious risk of outing themselves, simply by sharing their Google Analytics ID across the sites they own.

If you're watching your pageviews, odds are you're using Google to do it. Launched in 2005, Analytics is the most popular web statistics service online, in use by half of Alexa's top million domains.

For the last few years, online SEO tools have published Analytics and AdSense IDs for the domains they crawl publicly, typically for competitive intelligence, such as ferreting out your competitor's other websites.

But in the last year, several free services such as eWhois and Statsie have started offering reverse lookup of Analytics IDs. (Most also allow searching on the Google AdSense ID, though I wasn't able to find an anonymous blogger sharing an AdSense ID across two sites.)

Unmasking anonymous bloggers through Analytics is less likely to work than other methods. It's still more common for someone to slip up and leave their personal info in their domain record or share a server IP than to share a Google Analytics account. But it's also more accurate. Hundreds or thousands of people can share an IP address on a single server, and domain information can be faked, but a shared Google Analytics ID is solid evidence that both sites are run by the same person.
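
To make the mechanics concrete, here is a minimal Python sketch of how a reverse lookup could work. It assumes the classic `UA-<account>-<property>` tracking ID format that Analytics snippets embed in a page's source. The domain names, sample markup, and function names are made up for illustration, and this is not a description of how eWhois or Statsie actually build their indexes.

```python
import re
from collections import defaultdict

# Classic Analytics tracking IDs look like UA-1234567-2: the first number is
# the account, the second is the property (site) within that account.
UA_PATTERN = re.compile(r"\bUA-(\d{4,10})-(\d{1,4})\b")


def extract_accounts(html):
    """Return the set of Analytics account prefixes (e.g. 'UA-1234567') in a page."""
    return {"UA-" + match.group(1) for match in UA_PATTERN.finditer(html)}


def build_reverse_index(pages):
    """Map each account prefix to the sorted list of domains that embed it."""
    index = defaultdict(set)
    for domain, html in pages.items():
        for account in extract_accounts(html):
            index[account].add(domain)
    return {account: sorted(domains) for account, domains in index.items()}


def shared_accounts(pages):
    """Keep only the accounts that show up on more than one domain."""
    return {a: d for a, d in build_reverse_index(pages).items() if len(d) > 1}


# Hypothetical crawl results: domain -> page source.
SAMPLE_PAGES = {
    "angry-anon-blog.example": "<script>_gaq.push(['_setAccount', 'UA-1234567-2']);</script>",
    "jane-doe-personal.example": "<script>_gaq.push(['_setAccount', 'UA-1234567-1']);</script>",
    "unrelated-site.example": "<script>_gaq.push(['_setAccount', 'UA-7654321-1']);</script>",
}

if __name__ == "__main__":
    for account, domains in shared_accounts(SAMPLE_PAGES).items():
        print(account, "is shared by", ", ".join(domains))
```

Note that the match is on the account prefix, not the full ID: two sites under the same account get different property suffixes, but the account number in the middle stays the same.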

And unlike any other method, it can unmask people using hosted blogging services. Tumblr, Typepad and Blogger all have built-in support for Google Analytics, though reverse lookup services haven't comprehensively indexed them. (Note that WordPress.com doesn't support Analytics or custom JavaScript, so their users aren't affected.)

Just to be clear, this technique isn't new. The first Google Analytics reverse lookup services started in 2009, so the technique's been possible for at least two years. My concern is that it isn't nearly well-known enough. It's not mentioned in any guide to anonymous blogging I could find and several established bloggers, engineers, and entrepreneurs I spoke to were unaware of it.

Unmasking an anti-Mac blogger may not be life-changing, but if you're an anonymous blogger writing about Chinese censorship or Mexican drug cartels, the consequences could be dire.

I decided to see how pervasive this problem is. In a sample of 50 anonymous blogs pulled from discussion forums and Google News, only 14 were using Google Analytics, much less than the average. Half of those, about 14% of the total, were sharing an Analytics ID with one or more other domains.

In about 30 minutes of searching, using only Google and eWhois, I was able to discover the identities of seven of the anonymous or pseudonymous bloggers, and in two cases, their employers. One blog about Anonymous' hacking operations could easily be tracked to the founder's consulting firm, while another tracking Mexican cartels was tied to a second domain with the name and address of a San Diego man.

I've contacted each to let them know their potential exposure.

Protecting Yourself

Some of the most vital voices online are anonymous, and it's important to understand how you're exposed. Forgetting any of the precautions below can lead to lawsuits, firings, or even death.

If you're aware of the problem, it's very easy to avoid getting discovered this way. Here are my recommendations for making sure you stay anonymous.

  1. Don't use Google Analytics or any other third-party embed system. If you have to, create a new account with an anonymous email. At the very least, create a separate Analytics account to track the new domain. (From the "My Analytics Accounts" dropdown, select "Create New Account.")
  2. Turn on domain privacy with your registrar. Better, use a hosted service to avoid domain payments entirely.
  3. If you're hosting your own blog, don't share IP addresses with any of your existing websites. Ideally, use a completely different host; it's easy to discover sites on neighboring IPs.
  4. Watch your history. Sites like Whois Source track your history of domain and nameserver changes permanently, and Archive.org may archive old versions of your site. Being the first person to follow your anonymous Twitter account or promote the link could also be a giveaway.
  5. Is your anonymity a life-or-death situation? Be aware that any service you use, including your own ISP, could be forced to reveal your IP address and account details under a court order. Use shared computers and an anonymous proxy or Tor when blogging to mask your IP address. Here's a good guide.
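
Parts of this checklist can be automated before you publish. The following is a rough Python sketch, not a complete audit: it compares the page source of an anonymous site against a site tied to your real name and reports any shared Analytics account or AdSense publisher ID, and it checks whether two server IPs sit in the same address block. The function names, the sample inputs, the assumed 16-digit AdSense `pub-` format, and the choice of a /24 block as "neighboring" are all illustrative assumptions.

```python
import ipaddress
import re

# Identifier formats assumed here: classic Analytics IDs (UA-<account>-<property>)
# and AdSense publisher IDs (pub-<16 digits>, usually written ca-pub-...).
TRACKER_PATTERNS = {
    "analytics": re.compile(r"\bUA-(\d{4,10})-\d{1,4}\b"),
    "adsense": re.compile(r"\bpub-(\d{16})\b"),
}


def find_identifiers(html):
    """Return a set of (kind, account) pairs found in a page's source."""
    found = set()
    for kind, pattern in TRACKER_PATTERNS.items():
        for match in pattern.finditer(html):
            found.add((kind, match.group(1)))
    return found


def shared_identifiers(anonymous_html, known_html):
    """Identifiers present on both pages; any hit links the two sites."""
    return find_identifiers(anonymous_html) & find_identifiers(known_html)


def same_neighborhood(ip_a, ip_b, prefix=24):
    """True if both IPv4 addresses fall inside the same /prefix network."""
    network = ipaddress.ip_network(f"{ip_a}/{prefix}", strict=False)
    return ipaddress.ip_address(ip_b) in network
```

An empty result from `shared_identifiers` only rules out these two kinds of leak. It says nothing about domain records, nameserver history, or anything else on the list.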

Stay safe.

24 Comments

Nov 16, 2011
5:39 PM  
Jessamyn West wrote:

I'm glad you explained all this in such detail. I remember when you told me this when it happened and I Could Not Figure it Out. This is a great post for American Censorship Day.


Nov 16, 2011
6:02 PM  
Jason wrote:

Awesome explanation; I, like Jessamyn, was amazed at your Internet Wizardry when you mentioned that you had figured the anti-Apple blogger's ID out. Nice work.


Nov 17, 2011
1:42 AM  
mgk wrote:

I am a little disappointed, I was just starting to enjoy daring no balls...


Nov 17, 2011
3:04 AM  
Gundars wrote:

it's easy to discover sites on neighboring IPs.
how?


Nov 17, 2011
6:32 AM  
Andy Baio wrote:

Statsie supports it, among others.


Nov 17, 2011
7:37 PM  
Guy wrote:

Way to take something negative, "wave after wave of spittle-flecked rage," and turn it into something positive by protecting "important and vital voices online." Nice work.


Nov 17, 2011
11:23 PM  
scottru wrote:

Andy, I found his identity using the Google Analytics tag as well - but you didn't even need to know about one of the reverse lookup services.

All you needed to do was Google the prefix - i.e. search for "UA-XXXXXXX" brought up a page which listed a range of sites (I've since forgotten which services and the ID has changed), and this one was linked to the others.

You can test the same with other prefixes, like yours above (though that one returns for other reasons).


Nov 18, 2011
3:45 AM  
manuel piñeiro wrote:

This type of Analytics API string-matching is offered as a premium service at sites like DomainTools.

I wanted to add a different privacy concern as this article deals with protecting the privacy of the webmaster. The Indymedia Network (which has dealt with numerous server seizures and court-ordered demands for IP address within web server logs) has long had a policy of using a FIFO to log IP addresses but scrubbing them in real-time as logs are saved to disk.

Although not officially codified, it logically follows that Indymedia web properties should not use Google Analytics or other dot-com stats offerings. The idea is that Indymedia tech will do everything in its power to prevent data retention of IP address data.

This policy has been tested to work; Indymedia has successfully brushed off subpoenas that normally would have cost thousands in legal fees. "We don't have them" was the only response needed.

Note that the US is far more progressive on this matter than Europe which has an on-going struggle between lawmakers and activists against "data retention legislation" requiring ISPs to monitor certain datapoints and keep them on file for 6-12 months.

An alternative is to use awstats (an open source, real-time stats program) and configure it however you wish.


Nov 18, 2011
12:30 PM  
Andy Baio wrote:

Very true. The reverse lookup sites make it even more brainless, you don't even need to view source to get the analytics ID.


Nov 18, 2011
4:09 PM  
Arne wrote:

Wrote a little follow-up on this:

http://konver.se/how-to-be-truly-more-anonymous-with-your-anonymous-blog/


Nov 18, 2011
4:20 PM  
Dawn wrote:

Wow, I didn't really know that you could track an anonymous blogger like this. Not saying that they cannot be trust worthy and now I understand why some people remain anonymous but I honestly did not always take them as seriously in the past. I never really thought until now that some people may need to remain anonymous.

I haven't had a problem with any anonymous blogger but this is great to know if I ever have a problem. I respect you for how you handled the situation, some people would have publicly exposed them. You handled it quite well!


Nov 18, 2011
5:40 PM  
Don't like it wrote:

I cannot stand anonymous commenting for the most part, I am sorry, I know it is very popular and wide-spread. I use it myself sometimes. But I have been an eye-witness to the devastating effect of online bullies on Topix who have the power to sit behind a keyboard and spew hate, slander, defamation, cruelty and lies, all day and night long.

It's total naivety to assume that words cannot have close to as devastating an effect on other human beings as can other violent weapons. And violent, harmful words are being put out by the truckload, words used as weapons.

The First Amendment was not written to allow people to harm each other. And yet harm is being done, at a volume that is unreported, but massive.


Nov 19, 2011
2:03 AM  
Daniel wrote:

This is scary. Thanks for your explanation.
You can imagine what evil governments can find out when observing even more traces. For example, inserting a blind gif somewhere on the anonymous blogger's board and then comparing IPs with comment publishing times.


Nov 19, 2011
4:58 AM  
Venessa wrote:

I can understand "Don't Like It" because I feel like commenters should stand behind their words, and bullies don't deserve protection. If you have convictions, have the courage to speak them...and sign your name. But I'm spoiled too.

I don't have to speak out about human rights violations and crime. I think this is important for those who live under the threat of these things to know how they can be exposed. Sadly, it does protect hate spewing bullies at the same time. It's a double edged sword, but worth it for those out there who could die for speaking out against criminals and murderers.


Nov 19, 2011
8:06 AM  
Tess Giles Marshall wrote:

This is really interesting. And what's very strange is I did a search on Statsie for my site and found not only the other site I expected which I also run but another, completely new to me, site (French) sharing the same IP address. I looked at it, but it doesn't seem to be a current site. No idea what it is though or how it's possible to share my IP address.


Nov 19, 2011
9:07 PM  
Ally wrote:

Wow. I am really impressed that you didn't call out the anonymous bloggers. You have a good heart.


Nov 20, 2011
2:31 PM  
Don't like it wrote:

Venessa: People who speak out against human rights violations and crime are also being attacked, and hard, by anonymous commentators.

The common defense of anonymous commenting is that it protects people living in repressive societies, where putting their real name out would earn them persecution.

What is being totally overlooked by this defense is what is happening in our own country (and around the world) by anonymous bullies who have been given full protection from accountability. They are not only causing grave harm to very large numbers of innocent people but also causing harm to those trying to make a positive difference in the world.

The entire issue of anonymous commenting needs to be re-examined based on the facts of how it is actually being used.


Nov 20, 2011
10:06 PM  
Scott Bartell wrote:

I've been waiting for a search engine - with an index as extensive as Google's - that has the ability to search within source code.

I doubt Google will ever allow it but that ability would be a game changer.

By the way... in Google Analytics you can create new "accounts" which will assign an entirely new account number to whatever websites you create profiles for in the account. The only one who could tell that the account numbers are under the same user is Google.

I'm pretty sure Google is smart enough to figure out (if they wanted to or needed to) who created or uses a Google Analytics account... even if you used a dummy email address, e.g. from similar user activity, IP addresses and login times, geolocation, multi-account sign-in, search history, etc.

Just sayin'.


Nov 21, 2011
2:12 AM  
dd wrote:

"What is being totally overlooked by this defense is what is happening in our own country (and around the world) by anonymous bullies who have been given full protection from accountability. They are not only causing grave harm to very large numbers of innocent people but also causing harm to those trying to make a positive difference in the world."

This is bullshit. Anonymous bullies never killed anyone with their words; you can simply ignore them or develop a thick skin. On the other hand, exposing anonymous posters can actually lead to their deaths. So it is better to put up with a few word bullies who you can ignore or flame back than it is to support unmasking anonymous posters.

Nice article by the way.


Nov 22, 2011
8:46 AM  
Aagya wrote:

I do feel safe and satisfied that I read this post on your site. I never knew about this issue. Great recommendations also.


Nov 22, 2011
2:46 PM  
Terri Ann wrote:

In a previous life we used this strategy with a bot to crawl sites and identify which ones were likely part of a network of sites or under a potential single owner. I can't go into detail why but the sales guys loved having that data and it was so easy to find and regex out! Heck you didn't even need a regex most of the time.

Dynamically populating the Analytics ID of your site through an external script will usually work too and also aids development when toggling between dev and production environments. Though I'm not sure how thorough the bots on the free services you mentioned are at getting past that trick.


Nov 28, 2011
1:05 PM  
theo wrote:

Very nice write up.
Regarding protecting yourself, point 2.
I wouldn't recommend that. Keep your domain name and your hosting provider separate.
Too many cases like that go south, with the domain owners being left high and dry.

Have a great day.



Dec 4, 2011
11:59 AM  
john wrote:

Even as a longtime waxy.org fan, I was struck by how much this is a great post.

I've been online since before the golden years of Usenet (remember when email was delivered in packets?), and my sympathy for anonymity has shifted back and forth. Much more forth, of late.

Andy, this is a beautiful articulation of the pros and cons—weighing the need to allow "spittle-flecked rage" if that also allows for communicating in a way that admits and fosters change.

Thanks.


Jan 25, 2012
12:09 PM  
David wrote:

It is great to see a criticism (that is valid) offer a solution. So many so-called experts at big news sites just make sensational claims but can't offer any remedy. Well done. (I do however note that you have not allowed me to remain anonymous) :-)


 


Andy Baio lives here. Some rights reserved, for your pleasure.