November 19, 2008

Interactive augmented reality demo in Flash with Papervision — very quick with my iSight, I get much higher FPS than in the video (via) #

Very lucky kid on The Price is Right — still, nothing beats Daniel #

Space Invaders anime video produced for 30th anniversary — "All this time we've been... and all they wanted to do was... We're horrible people!" (via) #

Derek Powazek revives Kvetch! a decade later — appropriately backed by Twitter, a primary outlet for kvetching #

Musicians Get Meta in Guitar Hero and Rock Band

Posted November 19, 2008 by Andy Baio

There’s something satisfyingly self-referential about watching talented musicians try to play their own music in Rock Band and Guitar Hero. Especially when they’re worse than you.

Here’s a list of every video I could find. Let me know if I missed any.

Anthrax’s Scott Ian, “Madhouse” at Best Buy

“You suck. You’re going to have to write easier songs… 20 years ago.”

Continue reading “Musicians Get Meta in Guitar Hero and Rock Band” →

November 18, 2008

Bike Hero, biking a Guitar Hero level in the real world — most likely a commercial viral, and maybe even fake, but does it matter? beyond awesome #

Chuck Klosterman reviews Chinese Democracy — mostly posting this just to beat Rex to it #

The A.V. Club's 27 popular websites that became books — though they missed Belle de Jour, The Washingtonienne, Fucked Company, Fark, and ZUG #

Speed Guitar goes to the Los Angeles County Museum of Art — every hour, on the hour, for one solid minute of metal complete with gothic arch and smoke machine #

MGMT's "Kids" on the iPhone Ocarina — "the iPhone Ocarina officially replaces the recorder as the nerdiest instrument I can play" #

Mena Trott responds to Valleywag article about their Disneyland vacation — my favorite was Space Mountain Snob #

LIFE Magazine photo archive hosted by Google — millions of high-res photos, most never published #

Amazon launches CloudFront, their pay-as-you-go CDN — very complementary with S3 #

Deconstructing Google Mobile's Voice Search on the iPhone

Posted November 18, 2008 by Andy Baio

I’ve experimented with audio transcription lately, but always with big, clumsy humans. I’d happily use ~~cyborgs~~ speech recognition software, but even today, automatic conversion of voice-to-text is still flawed. Naturally, I was intrigued when Google announced they were adding voice searching to their Google Mobile iPhone app.

Google’s flirted with voice-to-text conversion in the past, with GOOG-411 and their Audio Indexing of political videos on YouTube. But this is the first time they’re offering a web-accessible interface for speech conversion, albeit completely undocumented, so I decided to poke around a bit to see what I could find.

Over the last few hours, I’ve been analyzing the traffic proxied through my network, trying to reverse-engineer it to get to something usable, but I’ve hit my limits. I’m posting this with the hopes that someone out there can run with it and find out more.

Behind the Scenes

Here’s what we know so far: When you first start speaking into the microphone, the app opens a connection to Google’s server and starts sending over chunks of audio, almost certainly encoded with the open-source Speex codec.

The waveform image is generated on the phone and displayed along with a “Working” indicator and the adorable “beep-boop” sounds. In the background, a tiny file is being sent as a POST request to http://www.google.com/m/appreq/gmiphone. Here’s what the headers look like:

POST /m/appreq/gmiphone HTTP/1.1

User-Agent: Google/0.3.142.951 CFNetwork/339.3 Darwin/9.4.1

Content-Type: application/binary

Content-Length: 271

Accept: */*

Accept-Language: en-us

Accept-Encoding: gzip, deflate

Pragma: no-cache

Connection: keep-alive

Connection: keep-alive

Host: www.google.com
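For anyone who wants to replay this, here is a rough Python sketch that rebuilds the captured POST without sending it. The endpoint and header values come from the capture above. The 271-byte body is a zero-filled placeholder, since the real payload is opaque binary, and the helper name build_appreq is made up for illustration.

```python
import urllib.request

# Endpoint and header values copied from the capture above.
ENDPOINT = "http://www.google.com/m/appreq/gmiphone"
CAPTURED_HEADERS = {
    "User-Agent": "Google/0.3.142.951 CFNetwork/339.3 Darwin/9.4.1",
    "Content-Type": "application/binary",
    "Accept": "*/*",
    "Accept-Language": "en-us",
    "Accept-Encoding": "gzip, deflate",
    "Pragma": "no-cache",
    "Connection": "keep-alive",
}


def build_appreq(payload: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST that mirrors the captured headers."""
    return urllib.request.Request(
        ENDPOINT, data=payload, headers=CAPTURED_HEADERS, method="POST"
    )


# Placeholder body: the real payload is binary and not reproduced here.
request = build_appreq(b"\x00" * 271)
print(request.get_method(), request.full_url, len(request.data))
```

Nothing goes over the wire here. Passing the object to urllib.request.urlopen would send it, though there is no reason to expect an undocumented endpoint to accept a made-up body.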

The response from Google is an even smaller attachment. These two files are the same for every query, so they don’t contain any meaningful information.

HTTP/1.1 200 OK

Content-Type: application/binary

Content-Disposition: attachment

Date: Tue, 18 Nov 2008 13:06:53 GMT

X-Content-Type-Options: nosniff

Expires: Tue, 18 Nov 2008 13:06:53 GMT

Cache-Control: private, max-age=0

Content-Length: 114

Server: GFE/1.3
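One quick way to check whether payloads captured from different queries really are byte-for-byte identical is to compare their hashes. This is a minimal sketch. The function names are hypothetical and the sample bytes are stand-ins, not the actual captured files.

```python
import hashlib
from typing import Iterable


def payload_digest(payload: bytes) -> str:
    """Return the SHA-256 hex digest of one captured payload."""
    return hashlib.sha256(payload).hexdigest()


def all_identical(payloads: Iterable[bytes]) -> bool:
    """True when every payload hashes to the same digest (or none are given)."""
    return len({payload_digest(p) for p in payloads}) <= 1


# Stand-in data: in practice these would be bodies saved from the proxy.
same = [b"\x01\x02\x03", b"\x01\x02\x03"]
different = [b"\x01\x02\x03", b"\x01\x02\x04"]
print(all_identical(same), all_identical(different))
```

If the digests match across unrelated spoken queries, the payload can't be carrying the audio or the recognized text.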

After the audio is sent to Google, the server returns an HTML page with the results. That triggers a second request, this time a GET to clients1.google.com carrying the converted voice-to-text string.

GET /complete/search?client=iphoneapp&hjson=t&types=t&spell=t&nav=2&hl=en&q=chicken%20soup HTTP/1.1

User-Agent: Google/0.3.142.951 CFNetwork/339.3 Darwin/9.4.1

Accept: */*

Accept-Language: en-us

Accept-Encoding: gzip, deflate

Pragma: no-cache

Connection: keep-alive

Connection: keep-alive

Host: clients1.google.com
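The request line above can be rebuilt for any query string. Here is a small Python sketch, with the parameter names and values taken from the capture. The http scheme and the helper name build_suggest_url are assumptions, since the capture only shows the host and path.

```python
from urllib.parse import quote, urlencode

HOST = "clients1.google.com"
PATH = "/complete/search"


def build_suggest_url(query: str) -> str:
    """Rebuild the captured autocomplete URL for an arbitrary query."""
    params = [
        ("client", "iphoneapp"),
        ("hjson", "t"),
        ("types", "t"),
        ("spell", "t"),
        ("nav", "2"),
        ("hl", "en"),
        ("q", query),
    ]
    # quote_via=quote encodes spaces as %20, as in the capture, rather than "+".
    return f"http://{HOST}{PATH}?{urlencode(params, quote_via=quote)}"


print(build_suggest_url("chicken soup"))
```

What the individual flags (hjson, types, spell, nav) actually control is not known from the capture. They are simply echoed as observed.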

The response is an array of search terms in JSON format, for use in search autocompletion.

["chicken soup",[["http://www.chickensoup.com/","Chicken Soup for the Soul",5,""],["http://www.chickensoupforthepetloverssoul.com/","Chicken Soup for the Pet Lover's Soul",5,""],["chicken soup recipe","489,000 results",0,"2"],["chicken soup for the soul","1,470,000 results",0,"3"],["chicken soup dog food","462,000 results",0,"4"],["chicken soup with rice","467,000 results",0,"5"],["chicken soup diet","453,000 results",0,"6"],["chicken soup from scratch","364,000 results",0,"7"],["chicken soup for the soul quotes","398,000 results",0,"8"],["chicken soup crock pot","604,000 results",0,"9"]]]
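The response is easy to pick apart with a JSON parser. The sketch below uses a shortened copy of the payload above. The field names (value, label, kind, rank) are guesses based on the sample, where rows with a third field of 5 look like direct links and rows with 0 look like query suggestions with result counts.

```python
import json
from typing import List, NamedTuple, Tuple


class Suggestion(NamedTuple):
    value: str  # URL or suggested query (guessed meaning)
    label: str  # page title or result-count text (guessed meaning)
    kind: int   # 5 for links, 0 for queries in the sample (guessed meaning)
    rank: str   # position string, empty for links in the sample


def parse_suggestions(body: str) -> Tuple[str, List[Suggestion]]:
    """Split the response into the echoed query and its suggestion rows."""
    query, rows = json.loads(body)
    return query, [Suggestion(*row) for row in rows]


# Shortened copy of the captured response.
SAMPLE = (
    '["chicken soup",[["http://www.chickensoup.com/",'
    '"Chicken Soup for the Soul",5,""],'
    '["chicken soup recipe","489,000 results",0,"2"]]]'
)

query, suggestions = parse_suggestions(SAMPLE)
links = [s for s in suggestions if s.kind == 5]
print(query, len(suggestions), links[0].value)
```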

Help!

Unfortunately, until we can isolate and decode the audio stream, playing with the voice recognition features is out of reach.

Any ideas on cracking this mystery would be hugely appreciated. Anonymity for Google insiders is guaranteed!

Updates

As several commenters figured out, and as Google confirmed to me, the audio is being sent to Google’s servers for voice recognition. The two binaries I posted above aren’t the actual transmission and are identical for every query, so they can be disregarded. Sorry about the red herring.

Gummi Hafsteinsson, product manager for Google’s Voice Search, says, “I can confirm that we split the audio down to a smaller byte stream, which is then sent to Google for recognition, but we can’t really provide any details beyond that.” Responding to my request for a public API, he added, “I appreciate the suggestion to provide voice recognition as a service. Right now we have nothing to announce, but we’ll take this feedback as we look at future product ideas.”

Also, Chris Messina discovered some secret settings in the application’s preferences file, including alternate color schemes and sound sets for “Monkey” and “Chicken.” Beep-boop!

Next step: As Paul discovered in the comments, the Legal Notices page says clearly that the app uses the open-source Speex codec for voice encoding. Can anyone capture and decode the audio being sent to Google?
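As a possible starting point for anyone with a packet capture, here is a small Python sketch that scans raw bytes for two well-known signatures: the "OggS" marker that starts every Ogg page, and the "Speex   " string (with three trailing spaces) that opens a Speex header packet. Whether the app wraps its audio in Ogg at all is unknown. It may well send bare Speex frames, in which case this finds nothing. The function name and sample bytes are made up for illustration.

```python
from typing import Dict, List

# Signatures from the Ogg and Speex formats; their presence in the app's
# traffic is an assumption, not something confirmed by the capture.
MARKERS = {
    "ogg_page": b"OggS",
    "speex_header": b"Speex   ",
}


def find_markers(capture: bytes) -> Dict[str, List[int]]:
    """Return the byte offsets at which each signature appears."""
    found: Dict[str, List[int]] = {}
    for name, magic in MARKERS.items():
        offsets = []
        start = capture.find(magic)
        while start != -1:
            offsets.append(start)
            start = capture.find(magic, start + 1)
        found[name] = offsets
    return found


# Synthetic example, not real captured traffic.
fake_capture = b"junk" + b"OggS" + b"\x00" * 24 + b"Speex   " + b"1.2"
print(find_markers(fake_capture))
```

A hit would suggest the stream could be cut out and handed to a standard Speex decoder. A miss only rules out the Ogg-wrapped case.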

November 19: I rewrote most of this entry to reflect the new information, since it was confusing new readers.

November 17, 2008

John Hodgman, Jonathan Coulton, and the Long Winters perform "Tonight You Belong to Me" — "Thank you, normal-sized man." #

Jerry Yang stepping down from Yahoo's CEO post — it never really fit him well, though I'll miss his e.e. cummings memos #

Woman asks Apple community about an unusual iPhone glitch — no, raunchy photos don't accidentally attach themselves to outbound email #

Greasemonkey script to pull WikiDashboard visualization into Wikipedia — I made a LazyWeb plea for this last week, and Paul Irish came through #

Lee Byron's Fireflies, anaglyph 3D game for Mac — part of Kokoromi's Gamma 3D showcase of anaglyph games #

Flickr Boundaries, tool to explore Flickr's shapefiles — read Tom Taylor's entry for more information #

Cooking Mama, the Unauthorized PETA Edition — a strangely obscure target for their attention, with a petition to write to the game's publisher (via) #

Boing Boing launches gaming blog, Offworld — good writing in a nice design from Brandon Boyer, former news editor of Gamasutra #

"Violet" wins the Interactive Fiction Comp 2008 — play it online; glancing at the charts, it looks like Buried in Shoes was the most divisive #

Trailer for J.J. Abrams' Star Trek prequel — looks surprisingly good, but I'm a sucker for origin stories; I even liked Enterprise #

What would Depression 2009 look like? — Tim sums up the thought-provoking Boston Globe article #

The Pirate Bay hits 25 million simultaneous peers — that's not unique people, but concurrent connections; Napster peaked at 26M users #

Peter Hirschberg releases Adventure as a free iPhone app — related: Chasing Ghosts will finally be released on ~~BitTorrent~~ Showtime in December (via) #

The Big Picture on the California wildfires — also: first-person coverage on Twitter and YouTube, like this freeway on fire and aftermath #

Tim-Tams available at Target until March, first time available in the U.S. — best chocolate cookies ever, the Tim Tam Slam is a chocolaty revelation (via) #

JS-909, a Javascript drum machine without Flash — through a hack, it even works in IE 6 #

November 14, 2008

Esquire's hosting Between, the new two-player networked game by Jason Rohrer — from the creator of Passage #

"What's that buzzing noise from my BBQ?" — he thought he was killing a few bees, but ends up annihilating an entire colony (via) #

November 13, 2008

Kottke explains how to embed high-quality YouTube videos — I knew how to save, link, and change the default, but the embedding hack was new to me #

Web 2.0 Origami — lazyweb, please build a converter that creates folding patterns from an uploaded image #

Pixar's Burn-E short on YouTube — here's an interview with the director #

Valleywag folded into Gawker, all but Owen Thomas laid off — I won't miss it; they hurt a lot of good people and interesting projects in the quest for pageviews (via) #

YouTube engineer adds "Actually Good" tab when viewing Onion video — here's a screenshot in case it goes away #

November 12, 2008

MSNBC's Rachel Maddow wears pajamas on-air in solidarity with bloggers — maybe Palin was too busy reading every newspaper to actually read a blog #

Jimi Hendrix drummer Mitch Mitchell, dead at 61 — I wish more newspapers would link to YouTube videos #

Brandon Hardesty reenacts Alec Baldwin's Glengarry Glen Ross monologue — I first linked to Brandon way back in March 2006 #

Videos of CNN's election-night countdown globally — the collective response was spontaneous and virtually identical around the world (via) #

Washington Post blogger shuts down company sending out two-thirds of all spam — in his investigative report, he turned over four months of data-gathering to the colo, who shut them down #

Michael Lewis revisits "Liar's Poker" and writes about the current Wall Street meltdown — a gripping look at who foresaw and acted on the mess (via) #

Japanese isometric PSA about the future of food — lovely design, this could easily be adapted to the US (via) #

QWOP Olympics, ragdoll physics running game — hard to believe, but with practice, it's possible to sprint (via) #

Sluggo ponders the fundamental question of existence #

LittleBigPlanet's TV commercials, built entirely with the in-game tools — the first one seems inspired by You Suck at Photoshop, no? (via) #

What We Own and Where It's Made — Dorothy's slowly been categorizing all her possessions #

Slate on aXXo, the most popular movie distributor on BitTorrent — this comment explains the drama between aXXo and other competing groups #

The Yes Men pranksters distribute fake New York Times issue across NYC — 100,000 copies with the headline "Iraq War Ends" post-dated July 4, 2009 #