February 28, 2011
Tracking the biggest losers in Google's content farm algorithm changes — love the hand-wringing in the comments
The economics of indie writers in the Kindle Marketplace — also, how indie successes are shifting the culture of openness
Antimatière — experimental gameplay navigating a 3D world turned 2D
Gina Trapani runs the numbers on ThinkUp's first year — an amazing community working on an amazing project
Turtlecalls.com — “the copycats may be cheaper but they barely even sound like real turtles”
For the last three weeks, I’ve indexed The Daily. Now that my free trial’s up, I’ve had an intimate look at what they have to offer and, sad to say, I don’t plan on subscribing. As a result, I’m ending The Daily: Indexed, my unofficial table of contents for every article they published publicly.
I’m surprised and grateful that The Daily’s executive and legal teams never tried to shut it down. On the contrary, when asked directly about it, publisher Greg Clayman said, “If people like our content enough to put it together in a blog and share it with folks, that’s great! It drives people back to us.” They seem like a nice bunch of folks, and I hope they succeed with their big publishing experiment.
But now that I’m ending it, I can finally address the most common question — how did I do it?
The Daily: Indexed is just a list of article headlines, bylines, and links to each article on The Daily’s official website. Anyone can grab the links from The Daily’s iPad app by tapping each article’s “Share by Email” button, but that would’ve taken me far too long. So, how to automate the process?
When you first start The Daily application, it connects to their central server to check for a new edition, and then downloads a 1.5 MB JSON file with the complete metadata for that issue. It includes everything: the complete text of the issue, layout metadata, and the public URLs.
But how can you get access to that file? My first attempt was to proxy all of the iPad’s traffic through my laptop and use Wireshark to inspect it. As it turns out, The Daily encrypts all traffic between your iPad and their servers. I was able to see connections being made to various servers, but couldn’t see what was being sent.
Enter Charles, a brilliantly designed web debugging proxy for Mac, Windows, and Linux. By default, Charles listens to all of your HTTP network traffic and shows you simple but powerful views of all your web requests. But it can also act as an SSL proxy, sitting in the middle of previously secure transactions between a client (your browser or, in this case, the iPad app) and an SSL server, so the encrypted requests and responses become readable.
After grabbing the JSON, I was able to write a simple Python script to extract the metadata I needed and spit out the HTML for use on the Tumblr page. Here’s how to do it.
2. For Mac users, start Network Utility to get your desktop’s local IP address. Start your iPad, make sure it’s on the same wireless network as your desktop, and go into Settings > Network > Wi-Fi. Select the wireless network, and tap the right arrow next to it to configure advanced settings. Under “HTTP Proxy,” select “Manual.” Enter the IP address of your desktop for “Server” and “8888” for the port.
3. Now, start Charles on your desktop and, on the iPad, try loading any website. You should see assets from that website appear in Charles. If so, you’re ready to sniff The Daily’s iPad app.
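If Network Utility isn't handy (on Windows or Linux, say), a few lines of Python can report the desktop's local IP address for the proxy setting in step 2. This is only a rough sketch and not part of my original process: the `local_ip` helper is a made-up name, and the address it connects to is a reserved documentation address chosen only so the operating system picks an outgoing interface.

```python
import socket


def local_ip():
    """Return the IP address of the interface used for outbound traffic.

    Hypothetical helper for illustration. Connecting a UDP socket sends no
    packets; it only asks the OS which local address it would route from.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # 192.0.2.1 is a reserved documentation address (TEST-NET-1).
        s.connect(("192.0.2.1", 80))
        return s.getsockname()[0]
    except OSError:
        # No usable route (e.g. offline): fall back to loopback.
        return "127.0.0.1"
    finally:
        s.close()


print(local_ip())
```

Whatever address it prints is what goes in the iPad's “Server” field, provided both devices are on the same wireless network.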
Indexing the Daily
1. Start The Daily app on the iPad. Wait for it to download today’s issue. In Charles, drill down to https://app.thedaily.com/ipad/v1/issue/current, and select “JSON Text.”
2. Copy and paste the raw JSON into a text file.
3. This Python script takes the JSON file as input, and spits out a snippet of HTML suitable for blogging. I simply pasted the output from that script into Tumblr, made a thumbnail of the cover, and published.
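For anyone who wants a starting point for step 3, here is a minimal sketch of that kind of script. It is not the original script, and the JSON layout is an assumption: the `pages`, `headline`, `byline`, and `url` keys are hypothetical placeholders, so inspect the real file in Charles and adjust the names to match.

```python
import html
import json


def articles_to_html(issue):
    """Build an HTML list of headline links with bylines.

    Assumes a hypothetical layout:
    {"pages": [{"headline": ..., "byline": ..., "url": ...}, ...]}
    """
    items = []
    for page in issue.get("pages", []):
        headline = page.get("headline")
        url = page.get("url")
        if not headline or not url:
            continue  # skip entries with no headline or public link
        line = '<li><a href="{}">{}</a>'.format(
            html.escape(url, quote=True), html.escape(headline)
        )
        byline = page.get("byline")
        if byline:
            line += ", " + html.escape(byline)
        items.append(line + "</li>")
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


def convert_file(path):
    """Read the JSON text saved in step 2 and return the HTML snippet."""
    with open(path, encoding="utf-8") as f:
        return articles_to_html(json.load(f))


# Made-up sample data standing in for a real issue file.
sample = {
    "pages": [
        {
            "headline": "Example & Co.",
            "byline": "By A. Writer",
            "url": "http://example.com/a?x=1&y=2",
        },
        {"headline": "Cover"},  # no URL, so it is skipped
    ]
}
print(articles_to_html(sample))
```

With a real file, `print(convert_file("issue.json"))` would produce the snippet to paste into Tumblr, assuming the text from step 2 was saved as `issue.json`. Escaping the headline, byline, and URL keeps stray ampersands or quotes from breaking the markup.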
So, that’s it! Hope that was helpful. If any fan of The Daily out there wants to take over publishing duties, I’ll happily pass the Tumblr blog on to you.
How DJ Stolen blackmailed shoutouts from Lady Gaga, Ke$ha, and others — a young German DJ hacked hundreds of accounts before getting caught; you can still hear the shoutouts here (via)
Slate crunches the numbers on the Jeopardy archives — when in doubt, respond with “What is Australia?”
Pixelfari, Neven Mrgan's 8-bit Safari — like viewing the web on an Apple II
Ars Technica on HBGary's proposed plan to discredit Wikileaks and Glenn Greenwald — also: step-by-step details on how Anonymous hacked HBGary
Dissertation Haiku — distilling years of work into 17 syllables
An Atlantic writer competes for the Most Human award in the Loebner Prize’s Turing test — a look at different strategies used by chatbots and confederates to appear human
IBM's Watson sizes up Ken Jennings before their Jeopardy match — watch day 1 and day 2; here’s a statistical breakdown of the first match
Hulu Plus gets the complete Criterion Collection — 150 films available today, more added monthly
Kottke shares the first Gawker design — I totally forgot he designed the logo
Interim Apple Chief Under Fire After Unveiling Grotesque New MacBook — “you need to shave the USB ports every couple days”
Stuart McMillen's "St. Matthew Island" — “How big is our island?”
Unofficial SXSW Music 2011 torrent now available — free songs from 792 showcasing artists
Byrne Reese on why MovableType lost to WordPress — trivia: he says the Huffington Post never paid for MT, which they still use today
Threatened BBC websites crawled and shared as 1.88GB torrent — the torrent holds 172 websites set for closure
Video: DJ Filetype SWF — Joel Holmberg mixes the background music from different browser tabs
Clement Valla's Seed Drawings — Mechanical Turk workers copying each other’s drawings in a visual game of Telephone
2D Boy on World of Goo's iPad launch and sales figures — sold 125k copies in the first month, by far the fastest selling by both units and revenue
Washington Post's visualization of global weight gain since 1980 — more analysis in the article (via)
Google releases Translate for iPhone — the Babelfish is near, just need a version that translates constantly
OkCupid crunches the data on best first-date questions — as always, some incredibly entertaining correlations