January 31, 2014
Adrian Holovaty on Chicago and bootstrapping
— this could easily apply to Portland, or countless other nascent tech scenes #
Alexis Madrigal tracks down the teen creators of @HistoryInPics
— the only thing saving them from lawsuits, for now, is that it's non-commercial and poorly-indexed #
Candy Quest 3: Edge of Sweetness
— Michael Brough takes on Twine; made for The Candy Jam, a statement against trademark trolls #
PBS Idea Channel on the experience of being trolled
— Mike Rugnetta's epiphany makes for a very interesting episode (via) #
Boston Globe profiles Darius Kazemi
— the prolific bot creator and one of my favoritest people on the Internet right now (via) #
Steve Jobs' first public demo of the Mac in 1984
— unreleased until now, similar to the famous shareholders' meeting (via) #
Backer, crowdfunding for features
— originally built to gauge whether App.net should add Bitcoin, they extended it to everyone #
Dirty, Fast, and Free Audio Transcription with YouTube
Five years ago, I wrote about how I transcribe audio with Amazon’s Mechanical Turk, splitting interviews into small segments and distributing the work among dozens of anonymous people. It ended up as one of my most popular posts ever, continuing to draw traffic and comments every day.
Lately, I’ve been toying with a free, fast way to generate machine transcriptions: repurposing YouTube’s automatic captions feature.
How It Works
Every time you upload a video, YouTube tries to generate a caption file. If there’s audible speech, you can grab a subtitle file within a few minutes of uploading the video.
But how’s the quality? Pretty mediocre! It’s about as good as you’d expect from a free machine-generated transcript. The caption files have no punctuation between sentences, speakers aren’t broken out separately, and errors are very common.
But if you’re transcribing interviews, it’s often easier to edit a flawed transcript than to start from scratch. And YouTube provides a solid interface for editing your transcript against the audio and getting the results in plaintext.
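If you'd rather clean up the caption file yourself, here's a minimal sketch of the plaintext step. It assumes you saved the captions in SubRip (.srt) format; the function name `srt_to_text` and the sample captions are made up for illustration, and this isn't how YouTube's own editor works.

```python
import re

# Matches SubRip timing lines such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def srt_to_text(srt: str) -> str:
    """Strip cue numbers and timestamps, returning the captions as one paragraph."""
    pieces = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        # Drop the numeric cue index, if present.
        if lines and lines[0].isdigit():
            lines = lines[1:]
        # Drop the timing line, if present.
        if lines and TIMING.match(lines[0]):
            lines = lines[1:]
        pieces.extend(lines)
    return " ".join(pieces)


# Hypothetical sample, not real YouTube output.
sample = """1
00:00:00,000 --> 00:00:02,500
hello and welcome
to the show

2
00:00:02,500 --> 00:00:05,000
thanks for having me
"""

print(srt_to_text(sample))
```

The result is a single unpunctuated run of words, so you'd still need to add punctuation and speaker names by hand.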
I used TunesToTube, a free service for uploading MP3s to YouTube, to upload the first 15 minutes of our New Disruptors interview, with permission from Glenn Fleishman.
It took about 30 seconds for TunesToTube to generate the 15-minute-long video, three seconds to upload it, and about a minute for the video to be viewable on my account.
It takes a bit more time for YouTube to generate the audio transcriptions. Testing in the middle of a weekday, it took about six minutes to transcribe a two-minute video, and around 30 minutes for the 15-minute video. Fortunately, there’s nothing you need to do while it processes. Just upload and wait.
I ran a number of familiar film monologues through YouTube’s transcription engine, and the results vary from solid to laughably bad. I’ve posted the videos below with the automatic transcription and their actual text.
As you’d expect, it works best with clear enunciation and spoken word. Soft words over background music, like in the Breakfast Club clip, fall apart pretty quickly. But some, like Independence Day, aren’t terrible.
Continue reading “Dirty, Fast, and Free Audio Transcription with YouTube”
Greg Knauss on the reactions to Romantimatic
— the iOS app sends periodic reminders to express your love, an idea that's proven divisive #
This Is Not A Conspiracy Theory
— the first installment of Kirby Ferguson's followup to Everything Is A Remix #
Tracking 20 years of computer history using Law & Order
— a preview on Jeffrey Thompson's blog, including every URL ever mentioned on the show #
Moving the Race Conversation Forward
— Jay Smooth breaks down a new report on media coverage of race #
Mathematician hacks OKCupid to find the perfect mate
— scraping survey data with multiple accounts to cluster results himself #
The Verge on angry smartphone fanboys
— "But it isn't necessarily about loving the phone... It's about what the phone represents." #
What Grantland Got Wrong
— ESPN's Christina Kahrl on the disastrous article that resulted in the suicide of a trans woman; Bill Simmons' apology #
Coding Math
— Keith Peters' free video lessons teach the math useful in coding; support his work (via) #
New Republic on the evolving usage of the period
— ending sentences with a period can feel abrupt and harsh in text messages #
Everpix releases internal metrics, financials, VC feedback
— after its recent closure, the ultimate postmortem; fun to see the rejections #
Ellen DeGeneres' "Walter Mitty" Screener Leaks Online
It’s Oscar piracy season, that time of year when screeners of newly released critical darlings leak online, as DVD and Blu-ray copies are sent to Academy voters around the world and secretly loaned to their friends and relatives.
Yesterday was a busy day with screener copies of Frozen, Her, and The Wolf of Wall Street all appearing online.
Today, I got a tip about a very unusual watermark in the screener of The Secret Life of Walter Mitty that just leaked online. I dug it up, and sure enough, a very familiar name pops up in the first scene of the screener. I made a GIF of it:
Oh, Ellen! If the watermark’s accurate, this screener belonged to Ellen DeGeneres. But was it actually an Oscar screener? Probably not.
The watermark shows that the screener was created on November 26, 2013. According to Ken Rudolph’s Academy screener list, he received the Walter Mitty DVD screener via UPS on December 19.
That’s a pretty huge gap, indicating that Ellen’s screener wasn’t for Oscar consideration, but instead given to her for review in advance of Ben Stiller’s December 4 appearance on her show.
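For the record, here's the date math as a quick sketch, using only the three dates mentioned above. The variable names are mine.

```python
from datetime import date

watermark = date(2013, 11, 26)         # creation date shown in the watermark
stiller_on_ellen = date(2013, 12, 4)   # Ben Stiller's appearance on the show
academy_screener = date(2013, 12, 19)  # Ken Rudolph's UPS delivery

# Days between the watermark date and each later event.
days_before_show = (stiller_on_ellen - watermark).days
days_before_academy = (academy_screener - watermark).days

print(days_before_show, days_before_academy)
```

The watermark predates the show appearance by 8 days and the Academy mailing by 23.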
Of course, there’s a chance, albeit small, that this watermark was added by someone besides 20th Century Fox — by someone trying to hide the identity of the actual source, maybe.
More likely, the watermark is accurate and Ellen’s screener simply ended up in the wrong hands. A postal worker, one of her employees, a friend, a family member, or countless others in the production and distribution chain could be responsible for ripping the DVD and putting it online.
It’s very common for screeners to leak, but rare for a celebrity’s name to be identified as the source. In 2011, a screener copy of Super 8 leaked online with Howard Stern’s name clearly watermarked on it. On air, Stern vehemently denied leaking the film.
Curious to see if Ellen responds the same way.
As usual, I’ll update my spreadsheet of Oscar screener piracy statistics as soon as the nominees are announced on the morning of January 16.
The Year in Kickstarter 2013
— Oscars, helicopters, space, VR, and nearly 20,000 more funded projects in the world #
Steven Levy's How the NSA Almost Killed the Internet
— the story of the Snowden leaks from inside tech giants #
HitRecord first episode debuts online
— Joseph Gordon-Levitt takes his community media project to TV (via) #
The NYT editorial board argues for clemency for Edward Snowden
— "He may have committed a crime to do so, but he has done his country a great service." #
The Atlantic reverse-engineers Netflix's subgenre categories
— awesome data journalism by Alexis Madrigal with a genre generator by Ian Bogost #