Google Analytics A Potential Threat to Anonymous Bloggers

Last month, an anonymous blogger popped up on WordPress and Twitter, aiming a giant flamethrower at Mac-friendly writers like John Gruber, Marco Arment and MG Siegler. As he unleashed wave after wave of spittle-flecked rage at “Apple puppets” and “Cupertino douchebags,” I was reminded again of John Gabriel’s theory about the effects of online anonymity.

Out of curiosity, I tried to see who the mystery blogger was.

He was using all the ordinary precautions for hiding his identity — hiding personal info in the domain record, using a different IP address from his other sites, and scrubbing any shared resources from his WordPress install.

Nonetheless, I found his other blog in under a minute — a thoughtful site about technology and local politics, detailing his full name, employer, photo, and family information. He worked for the local government, and if exposed, his anonymous blog could have cost him his job.

I didn’t identify him publicly, but let him quietly know that he wasn’t as anonymous as he thought he was. He stopped blogging that evening, and deleted the blog a week later.

So, how did I do it? The unlucky blogger slipped up and was ratted out by an unlikely source: Google Analytics.

Reverse Lookups

Typically, Google will only reveal a user’s identity with a federal court order, as they did with a Blogger user who harassed a Vogue model in 2009.

But anonymous bloggers are at serious risk of outing themselves, simply by sharing their Google Analytics ID across the sites they own.

If you’re watching your pageviews, odds are you’re using Google to do it. Launched in 2005, Analytics is the most popular web statistics service online, in use by half of Alexa’s top million domains.

For the last few years, online SEO tools have published Analytics and AdSense IDs for the domains they crawl publicly, typically for competitive intelligence, such as ferreting out your competitor’s other websites.

But in the last year, several free services such as eWhois and Statsie have started offering reverse lookup of Analytics IDs. (Most also allow searching on the Google AdSense ID, though I wasn’t able to find an anonymous blogger sharing an AdSense ID across two sites.)

Finding anonymous bloggers through Analytics is less likely to work than other methods: someone is still more likely to slip up by leaving personal info in their domain record or sharing a server IP than by sharing a Google Analytics account. But it’s also more accurate. Hundreds or thousands of people can share an IP address on a single server, and domain information can be faked, but a shared Google Analytics ID is solid evidence that both sites are run by the same person.
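To make the mechanics concrete, here is a minimal Python sketch of what a reverse lookup index does. It assumes the classic tracking ID format UA-<account>-<property>, where the middle number identifies the account and so stays the same for every site tracked under it. The domains, IDs, and function names are invented for illustration, and this is not a description of how eWhois or Statsie are actually built.

```python
import re
from collections import defaultdict

# Classic Analytics IDs look like UA-<account>-<property>.
# The digit ranges below are a loose assumption, not an official spec.
UA_PATTERN = re.compile(r"\bUA-(\d{4,10})-(\d{1,4})\b")


def analytics_accounts(html):
    """Return the account prefixes (e.g. 'UA-1234567') found in a page."""
    return {"UA-" + m.group(1) for m in UA_PATTERN.finditer(html)}


def reverse_index(pages):
    """Map each account prefix to the set of domains that embed it."""
    index = defaultdict(set)
    for domain, html in pages.items():
        for account in analytics_accounts(html):
            index[account].add(domain)
    return dict(index)


# Hypothetical crawl results: every domain and ID here is made up.
pages = {
    "angry-anon.example": "<script>_gaq.push(['_setAccount', 'UA-1234567-2']);</script>",
    "jane-doe.example": "<script>_gaq.push(['_setAccount', 'UA-1234567-1']);</script>",
    "unrelated.example": "<script>_gaq.push(['_setAccount', 'UA-7654321-1']);</script>",
}

# Keep only the accounts that appear on more than one domain.
shared = {
    account: sorted(domains)
    for account, domains in reverse_index(pages).items()
    if len(domains) > 1
}
print(shared)  # {'UA-1234567': ['angry-anon.example', 'jane-doe.example']}
```

The property suffix differs between the two sites, but the account prefix does not, and that prefix is what ties them together.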

And unlike any other method, it can unmask people using hosted blogging services. Tumblr, Typepad and Blogger all have built-in support for Google Analytics, though reverse lookup services haven’t comprehensively indexed them. (Note that WordPress.com doesn’t support Analytics or custom Javascript, so their users aren’t affected.)

Just to be clear, this technique isn’t new. The first Google Analytics reverse lookup services started in 2009, so the technique’s been possible for at least two years. My concern is that it isn’t nearly well-known enough. It’s not mentioned in any guide to anonymous blogging I could find and several established bloggers, engineers, and entrepreneurs I spoke to were unaware of it.

Unmasking an anti-Mac blogger may not be life-changing, but if you’re an anonymous blogger writing about Chinese censorship or Mexican drug cartels, the consequences could be dire.

I decided to see how pervasive this problem is. In a sample of 50 anonymous blogs pulled from discussion forums and Google News, only 14 were using Google Analytics, a much lower share than average. Half of those, 14% of the total, were sharing an Analytics ID with one or more other domains.

In about 30 minutes of searching, using only Google and eWhois, I was able to discover the identities of seven of the anonymous or pseudonymous bloggers, and in two cases, their employers. One blog about Anonymous’ hacking operations could easily be tracked to the founder’s consulting firm, while another tracking Mexican cartels was tied to a second domain with the name and address of a San Diego man.

I’ve contacted each to let them know their potential exposure.

Protecting Yourself

Some of the most important and vital voices online are anonymous, and it’s important to understand how you’re exposed. Overlooking any of the precautions below can lead to lawsuits, firings, or even death.

If you’re aware of the problem, it’s very easy to avoid getting discovered this way. Here are my recommendations for making sure you stay anonymous.

  1. Don’t use Google Analytics or any other third-party embed system. If you have to, create a new account with an anonymous email. At the very least, create a separate Analytics account to track the new domain. (From the “My Analytics Accounts” dropdown, select “Create New Account.”)
  2. Turn on domain privacy with your registrar. Better, use a hosted service to avoid domain payments entirely.
  3. If you’re hosting your own blog, don’t share IP addresses with any of your existing websites. Ideally, use a completely different host; it’s easy to discover sites on neighboring IPs.
  4. Watch your history. Sites like Whois Source track your history of domain and nameserver changes permanently, and Archive.org may archive old versions of your site. Being the first person to follow your anonymous Twitter account or promote the link could also be a giveaway.
  5. Is your anonymity a life-or-death situation? Be aware that any service you use, including your own ISP, could be forced to reveal your IP address and account details under a court order. Use shared computers and an anonymous proxy or Tor when blogging to mask your IP address. Here’s a good guide.
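Before launching an anonymous site, the first recommendation can be checked mechanically. The sketch below is one rough way to do it: it compares the HTML of the new site against pages from sites already tied to your name and reports any embedded identifiers they have in common. The patterns are assumptions (a classic UA-style Analytics ID and an AdSense publisher ID of the form pub- followed by a run of digits), and the function names and sample pages are hypothetical. It does not cover shared IPs, domain records, or history.

```python
import re

# Identifier patterns to look for. Both are loose assumptions about the
# formats, chosen for illustration rather than taken from a specification.
PATTERNS = {
    "analytics_account": re.compile(r"\bUA-\d{4,10}(?=-\d)"),
    "adsense_publisher": re.compile(r"\bpub-\d{10,20}\b"),
}


def fingerprints(html):
    """Return a set of (kind, identifier) pairs embedded in the page."""
    found = set()
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(html):
            found.add((kind, match.group(0)))
    return found


def audit(anonymous_html, known_sites):
    """Return {domain: shared identifiers} for each known site that overlaps."""
    anon = fingerprints(anonymous_html)
    leaks = {}
    for domain, html in known_sites.items():
        overlap = anon & fingerprints(html)
        if overlap:
            leaks[domain] = sorted(overlap)
    return leaks


# Hypothetical pages: the domains and IDs are made up.
anonymous_html = (
    "<script>_gaq.push(['_setAccount', 'UA-1234567-3']);</script>"
    "<script>google_ad_client = 'ca-pub-1111222233334444';</script>"
)
known_sites = {
    "personal.example": "<script>_gaq.push(['_setAccount', 'UA-1234567-1']);</script>",
    "hobby.example": "<script>google_ad_client = 'ca-pub-1111222233334444';</script>",
    "clean.example": "<script>_gaq.push(['_setAccount', 'UA-7654321-1']);</script>",
}

leaks = audit(anonymous_html, known_sites)
for domain, shared_ids in leaks.items():
    print(domain, shared_ids)
```

An empty result only means none of these particular identifiers are shared. It says nothing about the other ways of being found listed above.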

Stay safe.

Comments

    I’m glad you explained all this in such detail. I remember when you told me this when it happened and I Could Not Figure it Out. This is a great post for American Censorship Day.

    Awesome explanation; I, like Jessamyn, was amazed at your Internet Wizardry when you mentioned that you had figured the anti-Apple blogger’s ID out. Nice work.

    Way to take something negative, “wave after wave of spittle-flecked rage,” and turn it into something positive by protecting “important and vital voices online.” Nice work.

    Andy, I found his identity using the Google Analytics tag as well – but you didn’t even need to know about one of the reverse lookup services.

    All you needed to do was Google the prefix, i.e. a search for “UA-XXXXXXX” brought up a page which listed a range of sites (I’ve since forgotten which services, and the ID has changed), and this one was linked to the others.

    You can test the same with other prefixes, like yours above (though that one returns for other reasons).

    This type of Analytics API string-matching is offered as a premium service at sites like DomainTools.

    I wanted to add a different privacy concern as this article deals with protecting the privacy of the webmaster. The Indymedia Network (which has dealt with numerous server seizures and court-ordered demands for IP address within web server logs) has long had a policy of using a FIFO to log IP addresses but scrubbing them in real-time as logs are saved to disk.

    Although not officially codified, it logically follows that Indymedia web properties should not use Google Analytics or other dot-com stats offerings. The idea is that Indymedia tech will do everything in its power to prevent data retention of IP address data.

    This policy has been tested to work; Indymedia has successfully brushed off subpoenas that normally would have cost thousands in legal fees. “We don’t have them” was the only response needed.

    Note that the US is far more progressive on this matter than Europe, which has an ongoing struggle between lawmakers and activists over “data retention legislation” requiring ISPs to monitor certain datapoints and keep them on file for 6-12 months.

    An alternative is to use awstats (an open source, real-time stats program) and configure it however you wish.
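For readers wondering what the real-time scrubbing described in the comment above might look like, here is a minimal Python sketch. It is not Indymedia's actual tooling. It assumes the web server can hand each log line to a filter before anything is written to disk (a FIFO or a piped log would both work), and it only handles IPv4 addresses. The function names and the sample log line are invented.

```python
import io
import re

# Matches dotted-quad IPv4 addresses. IPv6 is not handled in this sketch.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def scrub(line):
    """Replace every IPv4 address in a log line with a placeholder."""
    return IPV4.sub("0.0.0.0", line)


def scrub_stream(src, dst):
    """Copy log lines from src to dst, scrubbing each one on the way."""
    for line in src:
        dst.write(scrub(line))


# Stand-ins for the FIFO (input) and the on-disk log file (output).
raw = io.StringIO(
    '203.0.113.7 - - [16/Nov/2011:10:00:00 +0000] "GET / HTTP/1.1" 200 512\n'
)
clean = io.StringIO()
scrub_stream(raw, clean)
print(clean.getvalue(), end="")
# 0.0.0.0 - - [16/Nov/2011:10:00:00 +0000] "GET / HTTP/1.1" 200 512
```

Because the address is replaced before the line reaches storage, there is nothing to hand over later, which is the point of the policy.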

    Wow, I didn’t really know that you could track an anonymous blogger like this. I’m not saying they can’t be trustworthy, and now I understand why some people remain anonymous, but honestly I did not always take them seriously in the past. I never really thought until now that some people may need to remain anonymous.

    I haven’t had a problem with any anonymous blogger but this is great to know if I ever have a problem. I respect you for how you handled the situation, some people would have publicly exposed them. You handled it quite well!

    I cannot stand anonymous commenting for the most part, I am sorry, I know it is very popular and wide-spread. I use it myself sometimes. But I have been an eye-witness to the devastating effect of online bullies on Topix who have the power to sit behind a keyboard and spew hate, slander, defamation, cruelty and lies, all day and night long.

    It’s total naivety to assume that words cannot have close to as devastating an effect on other human beings as violent weapons can. And violent, harmful words are being put out by the truckload, words used as weapons.

    The First Amendment was not written to allow people to harm each other. And yet harm is being done, at a volume that is unreported, but massive.

    This is scary. Thanks for your explanation.

    You can imagine what evil governments can find out when observing even more traces. For example, inserting a blind gif somewhere on the anonymous blogger’s board and then comparing IPs with comment publishing times.

    I can understand “Don’t Like It” because I feel like commenters should stand behind their words, and bullies don’t deserve protection. If you have convictions, have the courage to speak them…and sign your name. But I’m spoiled too.

    I don’t have to speak out about human rights violations and crime. I think this is important for those who live under the threat of these things to know how they can be exposed. Sadly, it does protect hate spewing bullies at the same time. It’s a double edged sword, but worth it for those out there who could die for speaking out against criminals and murderers.

    This is really interesting. And what’s very strange is I did a search on Statsie for my site and found not only the other site I expected which I also run but another, completely new to me, site (French) sharing the same IP address. I looked at it, but it doesn’t seem to be a current site. No idea what it is though or how it’s possible to share my IP address.

    Venessa: People who speak out against human rights violations and crime are also being attacked, and hard, by anonymous commenters.

    The common defense of anonymous commenting is that it protects people living in repressive societies, where putting their real name out would earn them persecution.

    What is being totally overlooked by this defense is what is happening in our own country (and around the world) by anonymous bullies who have been given full protection from accountability. They are not only causing grave harm to very large numbers of innocent people but also causing harm to those trying to make a positive difference in the world.

    The entire issue of anonymous commenting needs to be re-examined based on the facts of how it is actually being used.

    I’ve been waiting for a search engine – with an index as extensive as Google’s – that has the ability to search within source code.

    I doubt Google will ever allow it but that ability would be a game changer.

    By the way… in Google Analytics you can create new “accounts” which will assign an entirely new account number to whatever websites you create profiles for in the account. The only one who could tell that the account numbers are under the same user is Google.

    I’m pretty sure Google is smart enough to figure out (if they wanted to or needed to) who created or uses a Google Analytics account… even if you used a dummy email address, e.g. from similar user activity, IP addresses and login times, geolocation, multi-account sign-in, search history, etc.

    Just sayin’.

    “What is being totally overlooked by this defense is what is happening in our own country (and around the world) by anonymous bullies who have been given full protection from accountability. They are not only causing grave harm to very large numbers of innocent people but also causing harm to those trying to make a positive difference in the world.”

    This is bullshit. Anonymous bullies never killed anyone with their words; you can simply ignore them or develop a thick skin. On the other hand, exposing anonymous posters can actually get them killed. So it is better to put up with a few word bullies, whom you can ignore or flame back, than to support unmasking anonymous posters.

    Nice article by the way.

    I do feel safe and satisfied that I read this post on your site. I never knew about this issue. Great recommendations also.

    In a previous life we used this strategy with a bot to crawl sites and identify which ones were likely part of a network of sites or under a potential single owner. I can’t go into detail why but the sales guys loved having that data and it was so easy to find and regex out! Heck you didn’t even need a regex most of the time.

    Dynamically populating the Analytics ID of your site through an external script will usually work too, and it also aids development when toggling between dev and production environments. Though I’m not sure how thorough the bots on the free services you mentioned are at getting past that trick.

    Very nice write up.

    Regarding protecting yourself, point 2.

    I wouldn’t recommend that. Keep your domain name and your hosting provider separate.

    Too many cases like that go south, with the domain owners left high and dry.

    Have a great day.

    Even as a longtime waxy.org fan, I was struck by how great this post is.

    I’ve been online since before the golden years of Usenet (remember when email was delivered in packets?), and my sympathy for anonymity has shifted back and forth. Much more forth, of late.

    Andy, this is a beautiful articulation of the pros and cons—weighing the need to allow “spittle-flecked rage” if that also allows for communicating in a way that admits and fosters change.

    Thanks.

    It is great to see a criticism (that is valid) offer a solution. So many big news sites’ so-called experts just make sensational claims but can’t offer any remedy. Well done. (I do however note that you have not allowed me to remain anonymous) 🙂

    Another tip! MONEY is a BIG identity-revealer. If ANYTHING connected to what you’re doing, even if it’s 3-4 steps away from it (things can be traced in a line, no problem), costs money and you have to pay for the service you’re using (like a VPN or site hosting), USE AN ANONYMOUS PREPAID CREDIT CARD.

    for usa, simon gift card, for uk, idt prime card, and there’s others in other countries too.

    if you’re even more worried about being identified on the cctv cameras of the stores you buy the prepaid card from: just cover yourself accordingly there too, and also pick a store that doesn’t give away your location. in the center of a large city is best.

    Another tip: if you have your own domain and need protection from the domain being taken down or info being extracted from your domain account: use a STRONG offshore domain registrar.

    one of the very best is katzglobal.com. bit more money but you get these double legal protections, it’s in a country that won’t cooperate with most others and has local laws securing it.

    It helps…….

    Wow! A great read, although a little scary as I now realise my analytics is a real Achilles heel. Have been thinking of changing stat collection, I may have to do that now.

    Thanks, Mark.

    A bully is someone who threatens you with harm *and won’t let you walk away.* Someone typing on a computer may harass, libel, or make you feel bad, but it’s you who are choosing to stay there and read it.
