A mysterious voice is haunting American Airlines’ in-flight announcements and nobody knows how

Posted September 23, 2022 (updated September 26, 2022) by Andy Baio

Here’s a little mystery for you: there are multiple reports of a mysterious voice grunting, moaning, and groaning on American Airlines’ in-flight announcement systems, sometimes lasting the duration of the flight — and nobody knows who’s responsible or how they did it.

Actor/producer Emerson Collins was the first to post video, from his September 6 flight from Santa Ana to Dallas-Fort Worth:

The weirdest flight ever.
These sounds started over the intercom before takeoff and continued throughout the flight.
They couldn’t stop it, and after landing still had no idea what it was. pic.twitter.com/F8lJlZHJ63

— Emerson Collins (@ActuallyEmerson) September 23, 2022

Here’s an MP3 of the audio with just the groans, moans, and grunts, with some of the background noise filtered out.

This is the only video evidence so far, but Emerson is one of several people who have experienced this on multiple different American Airlines flights. This thread from JonNYC collected several different reports from airline employees and insiders, on both Airbus A321 and Boeing 737-800 planes.

Other people have reported similar experiences, always on American Airlines, going as far back as July. Every known incident has gone through the greater Los Angeles area (including Santa Ana) or Dallas-Fort Worth. Here are all the incidents I’ve seen so far, in chronological order:

  • July – American Airlines, JFK to LAX. Bradley P. Allen wrote, “My wife and I experienced this during an AA flight in July. To be clear, it was just sounds like the moans and groans of someone in extreme pain. The crew said that it had happened before, and had no explanation. Occurred briefly 3 or 4 times early in the flight, then stopped.” (Additional flight details via the LA Times.)
  • August 5 – American Airlines 117. JFK to LAX. Wendy Wanderman wrote, “It happened on my flight August 5 from JFK to LAX and it was an older A321 that I was on. It was Flight 117. There was flight crew that was on the same plane a couple days earlier and the same thing happened. It was funny and unsettling.”
  • September 6 – American Airlines. Santa Ana, CA to Dallas-Fort Worth. Emerson Collins’ flight. “These sounds started over the intercom before takeoff and continued throughout the flight. They couldn’t stop it, and after landing still had no idea what it was… I filmed about fifteen minutes, then again during service. It was calmer for a while mid flight.”
  • Mid-September – American Airlines, Airbus A320. Orlando, FL to Dallas-Fort Worth. Doug Boehner wrote, “This happened to me last week. It wasn’t the whole flight, but periodically weird phrases and sounds. Then a huge ‘oh yeah’ when we landed. We thought the pilot left his mic open.”
  • September 18 – American Airlines 1631, Santa Ana, CA to Dallas-Fort Worth. Boeing 737-800. An anonymous report passed on by JonNYC, “Currently on AA1631 and someone keeps hacking into the PA and making moaning and screaming sounds 😨 the flight attendants are standing by their phones because it isn’t them and the captain just came on and told us they don’t think the flight systems are compromised so we will finish the flight to DFW. Sounded like a male voice and wouldn’t last more than 5-10 seconds before stopping. And has [intermittently] happened on and off all flight long.” (And here’s a second person on the same flight.)

Interestingly, JonNYC followed up with the person who reported the incident on September 18 and asked if it sounded like the same voice in the video. “Very very similar. Same voice! But ours was less aggressive. Although their volume might have been turned up more making it sound more aggressive. 100% positive same voice.”

Official Response

View from the Wing’s Gary Leff asked American Airlines about the issue, and their official response is that it’s a mechanical issue with the PA amplifier. The LA Times followed up on Saturday, with slightly more information:

“Our maintenance team thoroughly inspected the aircraft and the PA system and determined the sounds were caused by a mechanical issue with the PA amplifier, which raises the volume of the PA system when the engines are running,” said Sarah Jantz, a spokesperson for American.

Jantz said the P.A. systems are hardwired with no external access and no Wi-Fi component. The airline’s maintenance team is reviewing the additional reports. Jantz did not respond to questions about how many reports the airline has received and whether the reports are from different aircraft.

This explanation feels incomplete to me. How can an amplifier malfunction broadcast what sounds like a human voice without external access? On multiple flights and aircraft? They seem to be saying the source is artificial, but has anyone heard artificial noise that sounds this human?

Why This Is So Bizarre

By nature, passenger announcement systems on planes are hardwired, closed systems, making them incredibly difficult to hack. Professional reverse engineer/hardware hacker/security analyst Andrew Tierney (aka Cybergibbons) dug up the Airbus A321 documents in this thread.

So… We've had a good dig into this.

The A321 passenger announcement system looks to be physically discrete to the interphone and other systems.

We're struggling to see a path. https://t.co/qVdJR6cUm0

— Cybergibbons 🚲🚲🚲 (@cybergibbons) September 23, 2022

They don't tend to put things in planes that are not needed.

You can speak on the interphone (the cabin telephone system) from lots of places including the belly panel and engines. But it's wired.

— Cybergibbons 🚲🚲🚲 (@cybergibbons) September 23, 2022

“And on the A321 documents we have, the passenger announcement system and interphone even have their own handsets. Can’t see how IFE or WiFi would bridge,” Tierney wrote. “Also struggling to see how anyone could pull a prank like this.”

This report found by aviation watchdog JonNYC, posted by a flight attendant on an internal American Airlines message board, points to some sort of as-yet-undiscovered remote exploit.

pic.twitter.com/5Ol4zuSb9u

— 🇺🇦 JonNYC 🇺🇦 (@xJonNYC) September 20, 2022

We also know that, at least on Emerson Collins’ flight, there was no in-seat entertainment, eliminating that as a possible exploit vector.

They did not! This was a watch inflight entertainment on your phone flight

— Emerson Collins (@ActuallyEmerson) September 23, 2022

Theories

So, how could this happen? There are a handful of theories, but they’re very speculative.

Medical Intercom

The first theory to emerge, since debunked, came from “a former avionics guy” posting in r/aviation on Reddit:

The most likely culprit IMHO is the medical intercom. There are jacks mounted in the overhead bins at intervals down the full length of the airplane that have both receive, transmit and key controls. All somebody would need to do is plug a homemade dongle with a Bluetooth receiver into one of those, take a trip to the lav and start making noises into a paired mic.

The fact that the captain’s announcements are overriding (ducking) it but the flight attendants aren’t is also an indication it’s coming from that system.

If this was how it was done, there’s no reason the prankster would need to hide in the bathrooms: they could trigger a soundboard or prerecorded audio track from their seat.

However, this theory is likely a dead end. JonNYC reports that an anonymous insider confirmed the intercom jacks no longer exist on American Airlines planes. And even when they did, the medical intercoms didn’t patch into the announcement system: they only allowed flight crew to talk to medical staff on the ground.

someone says:
“These don’t exist on AA. We use an app on our iPhone to contact medical personnel on the ground. No such port exists, not since the super80 and they were inop’d.”

— 🇺🇦 JonNYC 🇺🇦 (@xJonNYC) September 24, 2022

Pre-Recorded Audio Message Bug

Another theory, also courtesy of JonNYC, is that there’s an issue with the pre-recorded announcement machine (“PRAM”), which was replaced in the last 60 days, within the timeframe of all these incidents. Perhaps an engineer who worked on it added some test audio to the end of a message, and it’s accidentally playing that extra audio?

.. as an intermittent inflight announcement? Maybe the IT version of blowing a slide with a beer in your hand. 😂"

⬆️THE "CRAZY THEORY" PART PLEASE ⬆️

— 🇺🇦 JonNYC 🇺🇦 (@xJonNYC) September 25, 2022

It's probably the PRAM… Pre-Recorded Announcement Machine.

These have solid state storage, techs just load files they get from *somewhere*, test procedure for audio is less than 20 minutes to check it out, and it can be interrupted by inflight announcements.

— Mɪᴄʜᴀᴇʟ Tᴏᴇᴄᴋᴇʀ (@mtoecker) September 23, 2022

Artificial Noise

Finally, some firmly believe that it’s not a human voice at all, but artificial noise or audio feedback filtered through the announcement system.

Nick Anderegg, an engineer with a background in linguistics and phonology, says it’s the result of “random signal passed through a system that extracts human voices.”

An amp malfunction that inputs the signal through algorithms meant to isolate the human voice. All the non-human aspects of the random signal will be stripped out, and the result will appear human. https://t.co/2bXYVCFs2l

— Nick Anderegg loudly supports human rights (@NickAnderegg) September 26, 2022

Anderegg points to a sound heard at the 1:20 mark in Emerson’s video, a “sweep across every frequency,” as evidence that American Airlines’ explanation is accurate.

The tone sweep is just a sign that it’s artificial. Random signals (i.e. interference), when passed through systems designed to isolate the human voice, will make them sound human. It’s attempting to extract a coherent signal where there is none, so it’s approximating one

— Nick Anderegg loudly supports human rights (@NickAnderegg) September 26, 2022

Personally, I struggle with this explanation. The utterances heard during Emerson’s three-hour flight are so wildly varied, from groans and grunts to moans and shouts, that it’s difficult to imagine them as anything but human. It’s far from impossible, but I’d love to see anyone try to recreate these sounds with random noise or feedback.
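
If anyone wants to try, here’s a minimal, purely illustrative Python sketch of the idea Anderegg describes: push pure random noise through bandpass filters centered on rough human vowel formants and add a slow amplitude wobble. The formant bands and the 3 Hz tremor are my own assumptions, and this has nothing to do with the actual PA hardware, but it shows how filtered noise can start to sound vaguely vocal.

    # Hypothetical illustration of the "voice-isolation" theory: filter
    # pure noise into rough vowel formants and see how voice-like it sounds.
    import numpy as np
    from scipy.io import wavfile
    from scipy.signal import butter, lfilter

    RATE = 16_000  # sample rate in Hz

    def bandpass(x, low_hz, high_hz, order=2):
        # Butterworth bandpass: the crude core of any voice-isolating chain.
        nyq = RATE / 2
        b, a = butter(order, [low_hz / nyq, high_hz / nyq], btype="band")
        return lfilter(b, a, x)

    noise = np.random.randn(RATE * 3)  # three seconds of raw "interference"

    # Rough first/second formant bands for an "ah" vowel (assumed values).
    vocal = bandpass(noise, 600, 900) + bandpass(noise, 1000, 1300)

    # A slow 3 Hz amplitude wobble to mimic the rise and fall of a moan.
    t = np.arange(vocal.size) / RATE
    vocal *= 0.5 * (1 + np.sin(2 * np.pi * 3 * t))

    # Normalize to 16-bit PCM and write a WAV file to listen to.
    pcm = (vocal / np.abs(vocal).max() * 32767).astype(np.int16)
    wavfile.write("filtered_noise.wav", RATE, pcm)

The result is breathy and vaguely organic, but to my ear nothing like the distinct groans in Emerson’s video.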

Any Other Ideas?

Any other theories about how this might be possible? I’d love to hear them, and I’ll keep this post updated. My favorite theory so far:

Flying hurts the clouds and their screams are picked by the PA system. Seems pretty obvious 🙄

— Horus First (@HorusFirst) September 23, 2022
Photoillustration of an American Airlines jet headed towards a scared cartoon cloud

Online Art Communities Begin Banning AI-Generated Images

Posted September 9, 2022 (updated September 13, 2022) by Andy Baio

As AI-generated art platforms like DALL-E 2, Midjourney, and Stable Diffusion explode in popularity, online communities devoted to sharing human-generated art are forced to make a decision: should AI art be allowed?

Collage of dozens of images made with Stable Diffusion, indexed by Lexica

On Sunday, popular furry art community Fur Affinity announced that AI-generated art was not allowed because it “lacked artistic merit.” (In July, one AI furry porn generator was uploading one image every 40 seconds before it was banned.) Their new guidelines are very clear:

Content created by artificial intelligence is not allowed on Fur Affinity.

AI and machine learning applications (DALL-E, Craiyon) sample other artists’ work to create content. That content generated can reference hundreds, even thousands of pieces of work from other artists to create derivative images.

Our goal is to support artists and their content. We don’t believe it’s in our community’s best interests to allow AI generated content on the site.

Last year, the 27-year-old art/animation portal Newgrounds banned images made with Artbreeder, a tool for “breeding” GAN-generated art. Late last month, Newgrounds rewrote their guidelines to explicitly disallow images generated by the new generation of AI art platforms:

AI-generated art is not allowed in the Art Portal. This includes using tools such as Midjourney, Dall-E, and Craiyon, in addition fractal generators and websites like ArtBreeder, where the user selects two images and they are combined into a new image via machine learning.

There are cases where some use of AI is ok, for example if you are primarily showcasing your character art but use an AI-generated background. In these cases, please note any elements where AI was used so that it is clear to users and moderators.

Tracing and coloring over AI-generated art is something best shared on your blog, as it is much like tracing over someone else’s art.

Bottom line: We want to keep the focus on art made by people and not have the Art Portal flooded with computer-generated art.

It’s not just long-running online communities: InkBlot is a budding art platform, funded on Kickstarter in 2021, that went into open beta just this week. They’ve already adopted a “no tolerance” policy against AI art and are updating their terms of service to exclude it.

Hi, we mentioned a few days ago that we have a no tolerance for AI art & working on updating our ToS in coming day for this which you can see in tweet here: https://t.co/5NCCKDYVWv

— 🦋InkBlot @ MEMBERSHIP DRIVE (@inkblot_art) September 9, 2022

Platforms that haven’t taken a stand are now facing public pressure to clarify their policies.

DeviantArt is one of the most popular online art communities, and increasingly, members are complaining that their feeds are getting flooded with AI-generated art. One of the most popular threads in their forums right now asks the staff to “combat AI art,” whether by limiting daily uploads, segregating it under a special category, or banning it entirely.

@DeviantArt You were kind of the last art site dedicated to art, but everyday I check the site now more and more its Ai. 10 out of 25 on your front page is Ai gen images. I guess this actually might be the end of a lot of art sites? I hope someone steps in and makes a new site pic.twitter.com/1Kez5FFQQF

— Zakuga Art (@ZakugaMignon) September 6, 2022

ArtStation has also been quiet as AI-generated images grow in popularity there. “Trending on ArtStation” is one of the most popular prompts for AI art because of the particular aesthetic and quality of work found there, nudging the AI to generate work resembling images scraped from the site and setting up a future ouroboros in which AI models are trained on AI-generated art.

Every time I go to DA or Artstation these days the front pages are flooded with unmodified AI generated slop. Its ugly and makes the sites feel lesser. I go to these places to be inspired, not demoralized.

— RJ Palmer (@arvalis) September 9, 2022

However you feel about the ethics of AI art, online art communities are facing a very real problem of scale: AI art can be created orders of magnitude faster than traditional human-made art. A powerful GPU can generate thousands of images an hour, even while you sleep.

Lexica, a search engine that indexes only images from Stable Diffusion’s beta tests in Discord, already holds over 10 million images: a corpus made by a relatively small group of beta testers in a few weeks, and more than anyone could explore in a lifetime.

Left unchecked, it’s not hard to imagine AI art crowding out illustrations that took days or weeks for someone to make.

To keep their communities active, community admins and moderators will have to decide what to do with AI art: allow it, segregate it, or ban it entirely.


Perfect Tides, a Coming-of-Age Point-and-Click Adventure, Kickstarts a Sequel

Posted August 31, 2022 (updated September 3, 2022) by Andy Baio

There’s no shortage of amazing games so far this year, but my personal favorite is an underdog: Perfect Tides, a ’90s-esque point-and-click adventure about growing up as a teen on a sleepy island resort town in the early 2000s, finding an escape from real-life feelings of loneliness and loss in discussion forums and late-night AIM chats.

Mara and her friend Lily on the beach… definitely not on drugs

The first game from Meredith Gran, creator of the decade-long comic series Octopus Pie, it approaches challenging subjects with the confidence of someone who created narrative comics every week for ten years. I can’t think of another comics artist who has dived into game design like this, but it pays off with uniquely charming pixel art and animation, colorful writing, and a story that genuinely moved me by the end. It navigates complex feelings about family, old friends, and new loves, while also being genuinely funny.

Let’s put it this way: I’ve been playing videogames for the last 35 years, but Perfect Tides is the first time I felt compelled to write a walkthrough (spoilers!) and actively participate in forums to help people finish it.

This is a long way of saying that you should play Perfect Tides on Steam or Itch, and then back the Kickstarter for its sequel, Perfect Tides: Station to Station, which has only six days to go and still needs another $20,000 to cross the finish line. (Update: It hit the goal!)

you should probably go back this project

But you don’t have to take my word for it! Kotaku said the original game was “one of the year’s best,” the “kind of game you don’t even see coming, yet turns out to be incredible” and “perfectly captures the intensity and struggle of adolescence.” AV Club called it a “harrowing, funny, beautiful, horrifying, and ultimately reassuring work of art.” Polygon summed it up as “devastatingly honest.” My favorite review was from Buried Treasure’s John Walker, who wrote, “It is the most extraordinary exploration of what it is to be a teenager, told with such heart, such truth.”

Spoilers Ahoy

If you’ve already played Perfect Tides, I want to mention two key moments that are so wonderful, and yet so easy to miss on a first playthrough, that they’re worth replaying the game for. THESE ARE SPOILERS!

First, if you didn’t manage to patch things up with Lily, you missed a long sequence with her in the final season of the game. (To get the full experience of that sequence, you’ll need to find a specific MP3 and put it into the game directory when prompted: a remarkable breaking-the-fourth-wall sidestep around copyright licensing that I’ve never seen in a game before.)

Second, there are two major endings. If it feels anticlimactic, you likely didn’t resolve your conflicts with Lily, Simon, and your family. There are 95 possible points, but you don’t need them all to get the best ending. Feel free to use my 100% completion guide for help getting there.

Perfect Tides isn’t perfect. Like any classic point-and-click adventure, there are some clunky bits here and there, and you’ll likely need the occasional hint or glance at a playthrough to finish. But it’s so worth it.

Mara talks to an online friend

Exploring 12 Million of the 2.3 Billion Images Used to Train Stable Diffusion’s Image Generator

Posted August 30, 2022 (updated September 5, 2022) by Andy Baio

One of the biggest frustrations of text-to-image generation AI models is that they feel like a black box. We know they were trained on images pulled from the web, but which ones? As an artist or photographer, an obvious question is whether your work was used to train the AI model, but this is surprisingly hard to answer.

Sometimes, the data isn’t available at all: OpenAI has said it’s trained DALL-E 2 on hundreds of millions of captioned images, but hasn’t released the proprietary data. By contrast, the team behind Stable Diffusion have been very transparent about how their model is trained. Since it was released publicly last week, Stable Diffusion has exploded in popularity, in large part because of its free and permissive licensing: it’s already incorporated into the new Midjourney beta, NightCafe, and Stability AI’s own DreamStudio app, and can run on your own computer.

But Stable Diffusion’s training datasets are impossible for most people to download, let alone search, with metadata for millions (or billions!) of images stored in obscure file formats in large multipart archives.

So, with the help of my friend Simon Willison, we grabbed the data for over 12 million images used to train Stable Diffusion, and used his Datasette project to make a data browser for you to explore and search it yourself. Note that this is only a small subset of the total training data: about 2% of the 600 million images used to train the most recent three checkpoints, and only 0.5% of the 2.3 billion images that it was first trained on.

Screenshot of the LAION-Aesthetic data browser, showing results from a search for Swedish artist Simon Stålenhag with thumbnail images
Screenshot of the LAION-Aesthetic data browser in Datasette

Go try it right now at laion-aesthetic.datasette.io!

Read on to learn about how this dataset was collected, the websites it most frequently pulled images from, and the artists, famous faces, and fictional characters most frequently found in the data.

Data Source

Stable Diffusion was trained on three massive datasets collected by LAION, a nonprofit whose compute time was largely funded by Stable Diffusion’s owner, Stability AI.

All of LAION’s image datasets are built on Common Crawl, a nonprofit that scrapes billions of webpages monthly and releases them as massive datasets. LAION collected all HTML image tags that had alt-text attributes, classified the resulting 5 billion image-text pairs by language, and then filtered them into separate datasets by resolution, predicted likelihood of having a watermark, and predicted “aesthetic” score (i.e. subjective visual quality).

Collage of some of the images with the highest “aesthetic” score, largely watercolor landscapes and portraits of women
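
Conceptually, the first stage of that pipeline is simple. Here’s a hedged Python sketch of the core extraction step, pulling (image URL, alt-text) pairs out of crawled HTML. The real pipeline runs over Common Crawl’s archives at vastly larger scale; this toy parser is my own illustration, not LAION’s code.

    # Illustrative only: collect (src, alt) pairs from <img> tags in HTML,
    # the raw material LAION filters into its image-text datasets.
    from html.parser import HTMLParser

    class AltTextCollector(HTMLParser):
        def __init__(self):
            super().__init__()
            self.pairs = []  # (image URL, alt-text) tuples

        def handle_starttag(self, tag, attrs):
            if tag != "img":
                return
            attrs = dict(attrs)
            src = attrs.get("src")
            alt = (attrs.get("alt") or "").strip()
            if src and alt:  # keep only images that actually have alt-text
                self.pairs.append((src, alt))

    collector = AltTextCollector()
    collector.feed('<p><img src="https://example.com/cat.jpg" alt="a watercolor cat"></p>')
    print(collector.pairs)  # [('https://example.com/cat.jpg', 'a watercolor cat')]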

Stable Diffusion’s initial training was on low-resolution 256×256 images from LAION-2B-EN, a set of 2.3 billion English-captioned images from LAION-5B‘s full collection of 5.85 billion image-text pairs, as well as LAION-High-Resolution, another subset of LAION-5B with 170 million images greater than 1024×1024 resolution (downsampled to 512×512).

Its last three checkpoints were trained on LAION-Aesthetics v2 5+, a 600 million image subset of LAION-2B-EN with a predicted aesthetics score of 5 or higher, with low-resolution and likely watermarked images filtered out.

For our data explorer, we originally wanted to show the full dataset, but it’s a challenge to host a 600 million record database in an affordable, performant way. So we decided to use the smaller LAION-Aesthetics v2 6+, which includes 12 million image-text pairs with a predicted aesthetic score of 6 or higher, instead of the 600 million rated 5 or higher used in Stable Diffusion’s training.

This should be a representative sample of the images used to train Stable Diffusion’s last three checkpoints, though it skews toward more aesthetically attractive images. Note that LAION provides a useful frontend to search the CLIP embeddings computed from their 400M and 5 billion image datasets, but it doesn’t allow you to search the original captions.

Source Domains

We know the captioned images used for Stable Diffusion were scraped from the web, but from where? We indexed the 12 million images in our sample by domain to find out.
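
(Once the metadata is in SQLite, the tally itself is short. Here’s a sketch assuming a local import with an images table and a url column; the file and column names are illustrative, not the exact schema of our Datasette instance.)

    # Hypothetical sketch of the per-domain tally over a local import.
    import sqlite3
    from collections import Counter
    from urllib.parse import urlparse

    conn = sqlite3.connect("laion-aesthetic.db")  # assumed local file
    counts = Counter(
        urlparse(url).netloc  # e.g. "i.pinimg.com"
        for (url,) in conn.execute("SELECT url FROM images")
    )
    for domain, n in counts.most_common(10):
        print(f"{n:>9,}  {domain}")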

Nearly half of the images, about 47%, were sourced from only 100 domains, with the largest number of images coming from Pinterest. Over a million images, or 8.5% of the total dataset, are scraped from Pinterest’s pinimg.com CDN.

User-generated content platforms were a huge source for the image data. WordPress-hosted blogs on wp.com and wordpress.com together represented 819k images, or 6.8% of all images. Other photo, art, and blogging sites included 232k images from SmugMug, 146k from Blogspot, 121k from Flickr, 67k from DeviantArt, 74k from Wikimedia, 48k from 500px, and 28k from Tumblr.

Shopping sites were well-represented. The second-biggest domain was Fine Art America, which sells art prints and posters, with 698k images (5.8%) in the dataset. 244k images came from Shopify, 189k each from Wix and Squarespace, 90k from Redbubble, and just over 47k from Etsy.

Unsurprisingly, a large number came from stock image sites. 123RF was the biggest with 497k, 171k images came from Adobe Stock’s CDN at ftcdn.net, 117k from PhotoShelter, 35k images from Dreamstime, 23k from iStockPhoto, 22k from Depositphotos, 22k from Unsplash, 15k from Getty Images, 10k from VectorStock, and 10k from Shutterstock, among many others.

It’s worth noting, however, that domains alone may not represent the actual sources of these images. For instance, there are only 6,292 images sourced from Artstation.com’s domain, but another 2,740 images with “artstation” in the caption text hosted by sites like Pinterest.

Artists

We wanted to understand how artists were represented in the dataset, so we used the list of over 1,800 artists in MisterRuffian’s Latent Artist & Modifier Encyclopedia to search the dataset and count the number of images that reference each artist’s name. You can browse and search those artist counts here, or try searching for any artist in the images table. (Searching with quoted strings is recommended.)
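
(For the curious, the counting step is straightforward. A hedged sketch, assuming the captions are indexed with SQLite full-text search, which is Datasette’s usual setup; the images_fts table name is illustrative. Quoting the name forces a phrase match, which is why quoted searches are recommended above.)

    # Illustrative per-artist counts against a hypothetical FTS index.
    import sqlite3

    conn = sqlite3.connect("laion-aesthetic.db")

    def count_mentions(name: str) -> int:
        # Double quotes make multi-word names match as a phrase in SQLite FTS.
        row = conn.execute(
            "SELECT count(*) FROM images_fts WHERE images_fts MATCH ?",
            (f'"{name}"',),
        ).fetchone()
        return row[0]

    for artist in ("Thomas Kinkade", "Greg Rutkowski", "James Gurney"):
        print(artist, count_mentions(artist))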

Of the top 25 artists in the dataset, only three are still living: Phil Koch, Erin Hanson, and Steve Henderson. The most frequent artist in the dataset? The Painter of Light™ himself, Thomas Kinkade, with 9,268 images.

From a list of 1,800 popular artists, the top 10 found most frequently in the captioned images

Using the “type” field in the database, you can see the most frequently-found artists in each category: for example, looking only at comic book artists, Stan Lee’s name is found most often in the image captions. (As one commenter pointed out, Stan Lee was a comic book writer, not an artist, but people are using his name to generate images in the style of comic book art he was associated with.)

Some of the most-cited recommended artists used in AI image prompting aren’t as pervasive in the dataset as you’d expect. There are only 15 images that mention fantasy artist Greg Rutkowski, whose name is frequently used as a prompt modifier, and only 73 from James Gurney.

(It’s worth saying again that these images are just a subset of one of three datasets used to train the AI, so an artist’s work may have been used elsewhere in the data even if they’re not found in these 12M images.)

Famous People

Unlike DALL-E 2, Stable Diffusion doesn’t have any limitations on generating images of people named in the dataset. To get a sense of how well-represented well-known people are in the dataset, we took two lists of celebrities and other famous names and merged them into a single list of nearly 2,000 names. You can see the results of those celebrity counts here, or search for any name in the images table. (Obviously, some of the top searches like “Pink” and “Prince” include results that don’t refer to that person.)

Donald Trump is one of the most cited names in the image dataset, with nearly 11,000 photos referencing his name. Charlize Theron is a close runner-up with 9,576 images.

Collage of generated portraits of Donald Trump and Charlize Theron from Stable Diffusion

A full gender breakdown would take more time, but at a glance, it seems like many of the most popular names in the dataset are women.

Strangely, enormously popular internet personalities like David Dobrik, Addison Rae, Charli D’Amelio, Dixie D’Amelio, and MrBeast don’t appear in the captions from the dataset at all. My hunch was that the Common Crawl data was too old to include these more recent celebrities, but based on the URLs, there are tens of thousands of images from last year in the data. (If you can solve this mystery, get in touch or leave a comment!)

Fictional Characters

Finally, we took a look at how popular fictional characters are represented in the dataset, since this is subject matter that’s enormously popular using Stable Diffusion and Craiyon, but often impossible with DALL-E 2, as you can see in this Mickey Mouse example from my previous post.

“realistic 3d rendering of mickey mouse working on a vintage computer doing his taxes” on DALL·E 2 (left) vs. Stable Diffusion (right)

For this set of searches, we used this list of 600 fictional characters from pop culture to search the image dataset. You can browse the results here, or search for any other character in the images table. (Again, be aware that one-word character names like “Link,” “Data,” and “Mario” are likely to have many more results unrelated to that character.)

Characters from the MCU like Captain Marvel (4,993 images), Black Panther (4,395), and Captain America (3,155) are some of the best represented in the dataset. Batman (2,950) and Superman (2,739) are neck and neck. Luke Skywalker (2,240) has more images than Darth Vader (1,717) and Han Solo (1,013). Mickey Mouse barely breaks the top 100 with 520 images.

NSFW Content

Let’s also take a brief look at the representation of adult material, another huge difference between Stable Diffusion and any other model. OpenAI rigorously removed sexual/violent content from its training data and blocked potentially NSFW keywords from prompts.

The Stable Diffusion team built a predictor for adult material and assigned every image a NSFW probability score, which you can see in the “punsafe” field in the images table, ranging from 0 to 1. (Warning: Obviously, sorting by that field will show the most NSFW images in the dataset.)
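
(As a sketch, the equivalent query against the same hypothetical local import; sorting descending on punsafe surfaces the most confidently unsafe images first.)

    # Illustrative only: the rows with the highest predicted-NSFW scores.
    import sqlite3

    conn = sqlite3.connect("laion-aesthetic.db")
    for url, punsafe in conn.execute(
        "SELECT url, punsafe FROM images ORDER BY punsafe DESC LIMIT 5"
    ):
        print(f"{punsafe:.4f}  {url}")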

In their announcement of the full LAION-5B dataset, LAION team member Romain Beaumont estimated that about 2.9% of the English-language images were “unsafe,” but in browsing this dataset, it’s not clear how their predictors defined that.

There’s definitely NSFW material in the image dataset, but surprisingly little of it. Only 222 images got a “1” unsafe probability score, indicating 100% confidence that it’s unsafe, about 0.002% of the total images — and those are definitely porn. But nudity seems to be unusual outside of that confidence level: even images with a 0.9999 punsafe score (99.99% confidence) rarely have nudity in them.

It’s plausible that filtering on aesthetic ratings is removing huge amounts of NSFW content from the image dataset, and the full dataset contains much more. Or maybe their definitions of what is “unsafe” are very broad.

More Info

Again, huge thanks to Simon Willison for working with me on this: he did all the heavy lifting of hosting the data. He wrote a detailed post about making the search engine if you want more technical detail. His Datasette project is open-source, extremely flexible, and worth checking out. If you’re interested in playing with this data yourself, you can use the scripts in his GitHub repo to download and import it into a SQLite database.
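
If you’d rather see the shape of that import step, here’s a minimal sketch using Simon’s sqlite-utils library; the record fields are illustrative stand-ins for the real metadata, and his repo’s scripts handle the actual multipart source files.

    # Hypothetical miniature of the import step with sqlite-utils
    # (pip install sqlite-utils).
    import sqlite_utils

    db = sqlite_utils.Database("laion-aesthetic.db")
    db["images"].insert_all(
        [
            {
                "url": "https://example.com/cat.jpg",  # made-up record
                "text": "a watercolor cat",
                "punsafe": 0.01,
            }
        ],
        pk="url",
        replace=True,
    )
    db["images"].enable_fts(["text"])  # full-text search over captions
    print(db["images"].count)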

If you find anything interesting in the data, or have any questions, feel free to drop them in the comments.


Opening the Pandora’s Box of AI Art

Posted August 26, 2022 (updated January 12, 2023) by Andy Baio

Last month, I finally got access to OpenAI’s DALL·E 2 and immediately started exploring the text-to-image AI’s potential for creative shitposting, generating horror after horror: the Eames Lounge Toilet, the Combination Pizza Hut and Frank Lloyd Wright’s Fallingwater, toddler barflies, Albert Einstein inventing jorts, and the can’t-unsee “close up photo of brushing teeth with toothbrush covered with nacho cheese.”

DALL-E 2 rendered illustration of man with glasses working on a computer
“plasticine nerd working on a 1980s computer”
Detailed closeup realistic rendering of a human finger with a tiny gummy worm crawling on it with warm colors
“macro photo of beautiful living gummy candy worm on a human hand”
Bizarre rendering of a building with elements of both Pizza Hut and Frank Lloyd Wright's Fallingwater, in a forest setting
“Combination Pizza Hut and Frank Lloyd Wright’s Fallingwater”

DALL·E 2 diligently hallucinated each image out of noise from the compressed latent space, multi-dimensional patterns discovered in hundreds of millions of captioned images scraped from the internet.

DALL-E 2 generated rendering of a macro photo of two slugs in the grass, one wearing a headdress of honeycomb and the other with stacked cottonballs on their head
Rendering of a macro photo of two slugs draped with golden cloths adorned with floral decorative elements
“two slugs in wedding attire getting married, stunning editorial photo for bridal magazine shot at golden hour”

The prompt that finally melted my brain was the one above, with images of slugs getting married at golden hour. I originally specified a “tuxedo and wedding dress” with predictable results, but changing it to “wedding attire” gave the AI the flexibility to depict variations of what slugs might marry in, like headdresses made of cotton balls and honeycomb.


I’ve never felt so conflicted using an emerging technology as I have with DALL·E 2, which feels like borderline magic in what it’s capable of conjuring, but raises so many ethical questions it’s hard to keep track of them all.

There are the many known issues that OpenAI has acknowledged and worked to mitigate, like racial and gender biases in its image training set, and the lengths they’ve gone to avoid generating sexual/violent content or recognizable celebrities and trademarked characters.

But it opens profound questions about the ethics of laundering human creativity:

  • Is it ethical to train an AI on a huge corpus of copyrighted creative work, without permission or attribution?
  • Is it ethical to allow people to generate new work in the styles of the photographers, illustrators, and designers whose work trained it, without compensating them?
  • Is it ethical to charge money for that service, built on the work of others?

There are fundamental questions about whether it’s even legal: these are largely untested waters in copyright law, and it seems destined to end up in court. Training deep learning models on copyrighted material may be fair use, but only a judge can decide that. (The fact that OpenAI removed some images from its training set, like celebrity faces and Disney/Marvel characters, suggests they’re well aware of the risk of angering the biggest litigants.)

A 3D rendered grey mouse wearing a t-shirt and working on a computer/typewriter hybrid
a 3D rendering of Mickey Mouse sitting at a retro-like computer monitor and keyboard
“realistic 3d rendering of mickey mouse working on a vintage computer doing his taxes” on DALL·E 2 (left) vs. Stable Diffusion (right)

As these models improve, they seem likely to reduce demand for some paid creative services, from stock photography to commissioned illustrations. I empathize with the concerns of artists whose work was silently used to train commercial products in their style, without their consent and with no way to opt out.


The world was just starting to grapple with the implications of this technology when, on Monday, a company called Stability AI released its Stable Diffusion text-to-image AI publicly.

Stable Diffusion is free, open-source, runs on your own computer, and ships without any of the guardrails and content filters of its predecessors. It comes with a Safety Classifier enabled by default that tries to determine if a generated image is NSFW, but it’s easily disabled.

Realistic rendering of Barack Obama kissing a somber Donald Trump's head
“Obama comforting Trump”
Series of photorealistic black-and-white rendered studio portraits of Scarlett Johansson
“photo of Scarlett Johansson by Diane Arbus”
One of a series of rendered photorealistic images of Kanye West wearing a turban and tunic shirt
“Kanye West in the Taliban”
Samples of celebrity images generated by Stable Diffusion users

Unlike existing AI platforms like DALL·E 2 and Midjourney, Stable Diffusion can generate recognizable celebrities, nudity, trademarked characters, or any combination of those. (Try searching Lexica, the newly-launched Stable Diffusion search engine, for example output.)

Releasing an uncensored dream machine into the wild had some predictable results. Two days after its release, Reddit banned three subreddits devoted to NSFW imagery made with Stable Diffusion, presumably because of the rapid influx of AI-generated fake nudes of Emma Watson, Selena Gomez, and many others.

Screenshot of message explaining the "Stable Diffusion NSFW" subreddit was banned for violating Reddit's rules against non-consensual intimate media

The permissive license on Stable Diffusion allows commercial services to implement its AI model, such as NightCafe, which encourages paying customers to generate art in the styles of living artists like Pendleton Ward, Greg Rutkowski, Amanda Sage, Rebecca Sugar, and Simon Stålenhag, who has spoken out against the practice.

Screenshot from NightCafe with a list of artist names recommended as modifiers for art prompts
List of artist modifiers in NightCafe

On top of that, Stable Diffusion’s terms state that every image generated with their DreamStudio is effectively public domain, under the CC0 1.0 Public Domain license. They make no claim over the copyright of images generated with the self-hosted Stable Diffusion model. (OpenAI’s terms say that images created with DALL·E 2 are their property, with customers granted a license to use them commercially.)


A common argument I’ve seen is that training AI models is like an artist learning to paint and finding inspiration by looking at other artwork, which feels completely absurd to me. AI models are memorizing the features found in hundreds of millions of images, and producing images on demand at a scale unimaginable for any human—thousands every minute.

The results can be surprising and funny and beautiful, but only because of the vast trove of human creativity it was trained on. Stable Diffusion was trained on LAION-Aesthetic, a 120-million image subset of a 5 billion image crawl of image-text pairs from the web, winnowed down to the most aesthetically attractive images. (OpenAI has been more cagey about its sources.)

There’s no question it takes incredible engineering skill to develop systems to analyze that corpus and generate new images from it, but if any of these systems required permission from artists to use their images, they likely wouldn’t exist.


Stability AI founder Emad Mostaque believes the good of new technology will outweigh the harm. “Humanity is horrible and they use technology in horrible ways, and good ways as well,” Mostaque said in an interview two weeks ago. “I think the benefits far outweigh any negativity and the reality is that people need to get used to these models, because they’re coming one way or another.” He thinks that OpenAI’s attempts to minimize bias and mitigate harm are “paternalistic,” and a sign of distrust of their userbase.

Today we all made the World a more creative, happier and communicative place.

More to come in the next few days but I for one can’t wait to see what you all create.

Let’s activate humanity’s potential.

— Emad (@EMostaque) August 22, 2022

In that interview, Mostaque says that Stability AI and LAION were largely self-funded from his career as a hedge fund manager, and that with additional resources, they’ve built a 4,000-GPU A100 cluster with the support of Amazon that “ranks above JUWELS Booster as potentially the tenth fastest supercomputer.”

On Monday, Mostaque wrote that they plan to use those compute resources to expand to other AI-generated media: audio next month, and then 3D and video. I’d expect Stability AI to approach these new models in the same way, with little concern over their potential for misuse by bad actors, and with even less attention spent addressing the concerns of the artists and creators whose work makes them possible.


Like I said, I’m conflicted. I love playing with new technology, and I’m excited about the creative potential of these new tools. I want to feel good about the tools I use.

I don’t trust OpenAI for a bunch of reasons, but at least they seemed to try to do the right thing with their various efforts to reduce bias and potential harm, even if it’s sometimes clumsy.

Stable Diffusion’s approach feels irresponsible by comparison, another example of techno-utopianism unmoored from the reality of the internet’s last 20 years: how an unwavering commitment to ideals of free speech and anti-censorship can be deployed as a convenient excuse not to prevent abuse.

For now, generative AI platforms are some of the most resource-intensive projects in the world, leading to a vanishingly small number of participants with access to vast compute resources. It would be nice if those few companies would act responsibly by, at the very least, providing an opt-out for those who don’t want their work in future training data, finding new ways to help artists that do choose to participate, and following the lead of OpenAI in trying to minimize the potential for harm.

I don’t pretend to know where these things will go: the risks may be overblown and we may be at the start of a massive democratization in the creation of art, or these platforms may make the already-precarious lives of artists harder, while opening up new avenues for deepfakes, misinformation, and online harassment and exploitation. I’d really like to see more of the former, but it won’t happen on its own.
