Turning Photos into 2.5D Parallax Animations with Machine Learning

Posted November 20, 2019 (updated March 10, 2021) by Andy Baio

For years, filmmakers have used 2.5D parallax to make static photos feel more dynamic, as in The Kid Stays in the Picture, the 2002 documentary about film producer Robert Evans that popularized the technique.

Traditionally, a video editor would use Photoshop to isolate the photo elements on separate layers, fill in the removed objects to complete the background, and animate the layers in a tool like After Effects. Here’s a typical tutorial, showing the time-consuming and tedious process.

Last September, a team at Adobe Research released a paper and video demonstrating a new technique for animating a still image with a virtual camera and zoom, adding parallax to create what they call “the 3D Ken Burns effect.”

This new technique uses deep learning and computer vision to turn a static photo into a 2.5D parallax animation in seconds, using a neural network to estimate depth, move a virtual camera through the scene, and fill in the missing areas.
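
To get a feel for the idea, here's a toy sketch of layered 2.5D parallax in Python with NumPy: split the photo into near and far layers using a depth map, then pan the layers at different rates. This is not the paper's method; the real pipeline predicts per-pixel depth with a neural network, moves a virtual camera through 3D space, and inpaints the holes this naive version leaves behind.

```python
import numpy as np

def parallax_frames(image, depth, n_frames=30, max_shift=12):
    """Toy 2.5D parallax: pan a near layer faster than a far layer.
    image: HxWx3 uint8 array. depth: HxW array, smaller = closer."""
    near = depth < np.median(depth)           # crude foreground mask
    frames = []
    for t in np.linspace(-1.0, 1.0, n_frames):
        bg_dx = int(t * max_shift * 0.3)      # background pans slowly
        fg_dx = int(t * max_shift)            # foreground pans faster
        frame = np.roll(image, bg_dx, axis=1) # shifted background layer
        fg = np.roll(image, fg_dx, axis=1)    # shifted foreground layer
        mask = np.roll(near, fg_dx, axis=1)
        frame[mask] = fg[mask]                # composite near pixels on top
        frames.append(frame)
    return frames
```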

On Monday night, researcher Simon Niklaus finally got permission to release the code, posting it to GitHub with a CC-NC license, allowing anyone to experiment with it for themselves.

Sample Animations

It’s incredibly fun to play with. I ran some famous images through it, and then put a call out to Twitter for more ideas. Here are the results. Click anywhere on the video to play it.

John Rooney/Associated Press, Ali vs. Liston (1965)
Alfred Eisenstaedt, V-J Day in Times Square (1945)
Elvis Presley and Richard Nixon (1970)
Pete Souza, Situation Room
Matt McClain/Washington Post, Gordon Sondland testifies
Disaster Girl
The Unexplainable Picture

Surprisingly, it even works on illustrations and paintings.

Martin Handford, Where’s Waldo
Georges Seurat, A Sunday Afternoon on the Island of La Grande Jatte

Try It Yourself

Unlike the Spleeter library, the 3D Ken Burns library requires PyTorch and an Nvidia GPU with CUDA drivers. Sadly, Apple phased out CUDA support in Mojave, but there’s an even easier way to play around with it.
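
If you do want to try running it locally, here's a quick sanity check that PyTorch can see a CUDA-capable GPU:

```python
import torch

# The 3D Ken Burns code runs its networks on CUDA, so this
# needs to print True before the scripts will work locally.
print(torch.cuda.is_available())
```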

I created a Google Colab notebook here, which will let you process images on Google’s GPUs entirely in your browser.

If you’re unfamiliar with Colab, it can be a bit intimidating. Try this, and let me know if you get stuck.

  1. Open the Google Colab notebook.
  2. Click File->Open In Playground Mode to run it yourself.
  3. Click “Connect” to connect to a hosted runtime, a temporary Google server where the commands will run.
  4. From the “Runtime” menu, click “Run All.” A warning will pop up; click “Run Anyway.”
  5. On the left-hand side of the window, click the tab to open the Files sidebar.
  6. The final command processes the “doublestrike.jpg” sample image, and generates a new file in the /images directory called “autozoom.mp4” (the command is sketched just below these steps).
  7. Upload your own images by right-clicking the “images” folder and clicking Upload. Change the input/output filenames, and click the play button to run the final command again.
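
For reference, that final command looks something like this. This is a sketch based on the repo's autozoom.py script; the exact cell contents in the notebook may differ:

```python
# Run in a Colab cell (the "!" shells out to the cloned repo):
# turn a still image into a parallax video using the sample photo.
!python autozoom.py --in ./images/doublestrike.jpg --out ./images/autozoom.mp4
```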

Good luck!

Update: This Google Colab notebook by Manuel Romero is much faster and easier to use, with a handy widget to upload files, process images in bulk, and download all the finished animations.


The Deletion of Yahoo! Groups and Archive Team’s Rescue Effort

Posted November 5, 2019 (updated April 6, 2021) by Andy Baio
Yahoo! Groups circa July 2001

On December 14, everything ever posted to Yahoo! Groups in its 20-year history will be permanently deleted from the web. Groups will continue running as email-only mailing lists, but all public content and archives — messages, attachments, photos, and more — will be deleted.

You have until then to find your Yahoo login, sign into their Privacy Dashboard, and request an archive of your Yahoo! Groups.

For me, it took ten full days to get an email that my archive was ready to download — are they doing this by hand!? — but it appears complete: it contained a folder for every group I belonged to, each containing their own ZIP files for messages, files, and links.

The messages archive is a single plain-text file in Mbox email format with every message ever posted to the group. That’s enough for me, but if you wanted, you could import it into Thunderbird or any other mail app that supports Mbox.
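
If you'd rather poke at an archive programmatically, Python's standard-library mailbox module reads Mbox files directly. A minimal sketch, assuming the messages file has been unzipped to messages.mbox (the actual filename in your export may differ):

```python
import mailbox

# Open the exported Yahoo! Groups message archive (Mbox format)
box = mailbox.mbox('messages.mbox')

# Print a one-line summary of every message in the group
for msg in box:
    print(msg['date'], '|', msg['from'], '|', msg['subject'])
print(len(box), 'messages total')
```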

In the late ’90s and early 2000s, I belonged to several Yahoo! Groups (and its earlier incarnation, eGroups) for niche online communities, former jobs, small groups of friends, and weird internet side projects. Until the launch of Google Groups, it was the de facto free way to easily set up a hosted mailing list and discussion forum.

The Archive Team wiki charts the rise and fall of Yahoo! Groups, showing a peak in 2006 and a rapid fall after that.

[Chart: Yahoo! Groups by date created]

Many of these private groups are effectively dark web, accessible only to members of the group. If you don’t save a copy of the private groups you belong to, they may very well be lost for good.

Archive Team’s Rescue Effort

As you’d expect, the volunteer team of rogue archivists known as Archive Team is working hard to preserve as much of Yahoo! Groups as possible before its shutdown.

Their initial crawl discovered nearly 1.5 million groups with public message archives that can be saved, with an estimated 2.1 billion messages between them. As of October 28, they’ve archived an astounding 1.8 billion of those public messages.

Unfortunately, archiving the files, photos, attachments, and links in those groups is much harder: you have to be signed in as a member to view that content, which requires solving a reCAPTCHA. If you’d like to help solve reCAPTCHAs, they made a Chrome extension to assist with the coordination effort.

If you’d like to nominate a public Yahoo! Group to be saved by Archive Team, you can submit this form. If you’d like them to archive a private group, you can send a membership invite to this email address and it’ll be scheduled for archiving. More details are on the wiki.


Painting with Pure CSS

Posted November 4, 2019 (updated April 14, 2020) by Andy Baio
Detail of Diana Smith's "Lace"

Last year, I fell in love with Diana Smith’s stunning CSS paintings: Francine, Vignes, and Zigario. (I loved them so much, I asked her to speak at XOXO’s Art+Code event last year.)

This stunning illustration by @cyanharlow is pure HTML/CSS. Every element was typed by hand, drawing with only a text editor and Chrome dev tools. https://t.co/kzf7BhkLA5 pic.twitter.com/CmYG6LcnRh

— Andy Baio (@waxpancake) May 1, 2018

Incredibly, Diana types these out by hand, layering HTML elements and CSS properties with only a text editor and Chrome Developer Tools. In this post, she talks about the CSS properties she relies on most, with links to what her work would look like without each.

She just released her latest illustration, Lace, inspired by Flemish/baroque art and coded in two weekends, and it’s my favorite so far.

Did another CSS-only art.
Flemish/baroque inspired.
Two weekends. Made for Chrome. pic.twitter.com/d4Z9kkvu1R

— Diana Smith (@cyanharlow) November 4, 2019

Her illustrations are designed for Chrome, but don’t let that stop you from viewing them in other browsers, especially older ones. Each collapses and distorts in unexpected ways, revealing the subtle differences between browsers as they evolved over time.

It’s only designed for Chrome, but don’t let that stop you from trying it in other browsers: the older, the better! Here it is in Chrome 17, Firefox 3.6, Chrome 9, and (my favorite) Internet Explorer 5.1.7 for Mac. pic.twitter.com/dFNYKi8Myf

— Andy Baio (@waxpancake) May 1, 2018

Here’s what “Lace” looks like in some other browsers I tried.

Internet Explorer 8 for Windows 7
Chrome 45 for Windows
Safari 13
Safari 10.1
Firefox 70

Fast and Free Music Separation with Deezer’s Machine Learning Library

Posted November 4, 2019 (updated April 14, 2020) by Andy Baio

Cleanly isolating vocals from drums, bass, piano, and other musical accompaniment is the dream of every mashup artist, karaoke fan, and producer. Commercial solutions exist, but can be expensive and unreliable. Techniques like phase cancellation have very mixed results.

The engineering team behind streaming music service Deezer just open-sourced Spleeter, their audio separation library built on Python and TensorFlow that uses machine learning to quickly and freely separate music into stems. (Read more in today’s announcement.)

The team at @Deezer just released #Spleeter, a Python music source separation library with state-of-the-art pre-trained models! 🎶✨

Straight from command line, you can extract voice, piano, drums… from any music track! Uses @TensorFlow and #Keras.https://t.co/e4lyVtT2lR pic.twitter.com/tDsBMSYiJD

— 👩‍💻 Paige Bailey @ 127.0.0.1 🏡 #BLM (@DynamicWebPaige) November 2, 2019

You can train it yourself if you have the resources, but the models Deezer released already far surpass any free tool I know of, and rival commercial plugins and services. The library ships with three pre-trained models (a usage sketch follows the list):

  • Two stems – Vocals and Other Accompaniment
  • Four stems – Vocals, Drums, Bass, Other
  • Five stems – Vocals, Drums, Bass, Piano, Other
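
Here's a minimal sketch of a two-stem separation using Spleeter's Python API (it also ships with a spleeter command-line tool that does the same thing):

```python
from spleeter.separator import Separator

# Load the pre-trained two-stem model (vocals / accompaniment);
# the model weights are downloaded automatically on first use.
separator = Separator('spleeter:2stems')

# Writes vocals.wav and accompaniment.wav into output/song/
separator.separate_to_file('song.mp3', 'output/')
```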

It took a couple of minutes to install the library (including a Conda install), and processing audio was much faster than expected.

On my five-year-old MacBook Pro using the CPU only, Spleeter processed audio at a rate of about 5.5x faster than real-time for the simplest two-stem separation, or about one minute of processing time for every 5.5 minutes of audio. Five-stem separation took around three minutes for 5.5 minutes of audio.

When running on a GPU, the Deezer team reports speeds 100x faster than real-time for four stems, converting 3.5 hours of music in less than 90 seconds on a single GeForce GTX 1080.

Sample Results

But how are the results? I tried a handful of tracks across multiple genres, and all performed incredibly well. Vocals sometimes get a robotic autotuned feel, but the amount of bleed is shockingly low relative to other solutions.

I ran several songs through the two-stem model, which is the fastest and most useful of the three. The 30-second samples below are its separations, with links to the original studio tracks where available.

🎶 Lizzo – “Truth Hurts”

Lizzo (Vocals Only)
Lizzo (Music Only)

Compare the above to the isolated vocals generated by PhonicMind, a commercial service that uses machine learning to separate audio, starting at $3.99 per song. The piano is audible throughout PhonicMind’s track.

🎶 Led Zeppelin – “Whole Lotta Love”

Led Zeppelin (Vocals Only)
Led Zeppelin (Music Only)

The original isolated vocals from the master tapes for comparison. Spleeter gets a bit confused with the background vocals, with the secondary slide guitar bleeding into the vocal track.

🎶 Lil Nas X w/Billy Ray Cyrus – “Old Town Road (Remix)”

Lil Nas X (Vocals Only)
Lil Nas X (Music Only)

Part of the beat makes it into Lil Nas X’s vocal track. No studio stems are available, but a fan used the Diplo remix to create this vocals-only track for comparison.

🎶 Marvin Gaye – “I Heard It Through the Grapevine”

Marvin Gaye (Vocals Only)
Marvin Gaye (Music Only)

Some of the background vocals get included in both tracks here, which is probably great for karaoke, but may not be ideal for remixing. Compare this to 1:10 in the studio vocals.

🎶 Billie Eilish – “Bad Guy”

Billie Eilish (Vocals Only)
Billie Eilish (Music Only)

I thought this one would be a disaster—the vocals are heavily processed and lower in the mix with a dynamic bass dominating the song—but it worked surprisingly well, though some of the snaps bleed through.

🎶 Van Halen – “Runnin’ With The Devil”

Van Halen – “Runnin’ With The Devil” (Vocals Only)
Van Halen – “Runnin’ With The Devil” (Music Only)

Spleeter had a difficult time with this one, but the results still aren’t bad. You can compare the results generated by Spleeter to the famously viral isolated vocals by David Lee Roth, dry with no vocal effects applied.

Open-Unmix

The release of Spleeter comes shortly after the release of Open-Unmix, another open-source separation library for Python that similarly uses deep neural networks, built on PyTorch, for source separation.

In my testing, Open-Unmix separated audio at about 35% of the speed of Spleeter, didn’t support MP3 files, and generated noticeably worse results. Compare the output from Open-Unmix below for Lizzo’s isolated vocals, with drums clearly audible once they kick in at the 0:18 mark.

The quality issues can likely be attributed to the model released with Open-Unmix, which was trained on a relatively small set of 150 songs from the MUSDB18 dataset. The team behind Open-Unmix is also working on “UMX PRO,” a more extensive model trained on a larger dataset, but it’s not publicly available for testing.

What Now?

Years ago, I made a goofy experiment called Waxymash, taking four random isolated music tracks off YouTube and colliding them into the world’s worst mashup. But I was mostly limited to the small number of well-known songs that had their stems leaked online, or the few that could be separated cleanly with channel manipulation.

With processing speeds at 100 times faster than real-time playback on a single GPU, it’s now possible to turn virtually any recorded music into a mashup or karaoke track without access to the source audio. It may not be legal, but it’s definitely possible.

What would you build with it? I’d love to hear your ideas.

Thanks to Paige for the initial tip!

Updates

This thing is dangerously fun.

nobody should have this kind of power pic.twitter.com/4vbl2MGK4Z

— Andy Baio (@waxpancake) November 5, 2019

November 11: You can now play with Spleeter entirely in the browser with Moises.ai, a free service by Geraldo Ramos. After uploading an MP3, it will email you a link to download the stems.

Also, the Deezer team made Spleeter available as a Jupyter notebook within Google Colab. In my testing, larger audio files won’t play directly within Colab, and need to be downloaded before you can listen to them.
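
A minimal way to do that from a Colab cell is with Colab's files helper; the paths below assume an input file named song.mp3, so adjust for your own track:

```python
from google.colab import files

# Download the separated stems to your local machine
files.download('output/song/vocals.wav')
files.download('output/song/accompaniment.wav')
```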


Panic Announces Playdate

Posted May 22, 2019 (updated April 14, 2020) by Andy Baio

Today, Panic announced Playdate, their lovingly retro-modern handheld gaming system with a lineup of custom-made indie games from some of my favorite game designers in the world.

I’m fortunate enough to know the Panic folks, some of the kindest and most creative people I’ve ever met, and I’ve followed this project over the last few years—almost since its inception.

Designed and engineered in partnership with Teenage Engineering, Playdate features a unique crank in addition to standard directional arrow and button controls, which enables new kinds of experimental gameplay.

Introducing Playdate, a new handheld gaming system from Panic.

It fits in your pocket. It's got a black and white screen. It includes a season of brand-new games from amazing creators. Oh and… there's a crank???? https://t.co/WiIPUkpjSq

Yes. A thread… pic.twitter.com/47BwSOtiiP

— Playdate (@playdate) May 22, 2019

Best of all, the game lineup is a complete surprise and comes with the purchase of every Playdate: a dozen secret games released weekly in the first “season,” automatically downloaded to the device every Monday, from a team of all-star indie designers including Keita Takahashi, Bennett Foddy, Zach Gage, and Shaun Inman.

About those games. We reached out to some of our favorite people, like @KeitaTakahash, @bfod, @helvetica, @shauninman, and many more.

Here's a peek at one: Crankin's Time Travel Adventure, from Keita. It's fun and funny. pic.twitter.com/0Ibwqr5k3I

— Playdate (@playdate) May 22, 2019

It’s very special, and I’m so excited to see it finally announced after years of secret development. I can’t wait to buy one.
