Language generation algorithms are known to embed racist and sexist ideas. They are trained on the language of the internet, including the dark corners of Reddit and Twitter, which can include hate speech and disinformation. Whatever harmful ideas are present in these forums are normalized as part of their learning.
Researchers have now demonstrated that the same can be true for image generation algorithms. Feed one a photo of a man cropped just below his neck, and 43% of the time it will autocomplete him wearing a suit. Feed the same algorithm a cropped photo of a woman, even a famous woman like US Rep. Alexandria Ocasio-Cortez, and 53% of the time it will autocomplete her wearing a low-cut top or bikini. This has implications not only for image generation but for all computer vision applications, including video-based candidate assessment algorithms, facial recognition, and surveillance.
Ryan Steed, a doctoral student at Carnegie Mellon University, and Aylin Caliskan, an assistant professor at George Washington University, looked at two algorithms: OpenAI's iGPT (a version of GPT-2 that is trained on pixels instead of words) and Google's SimCLR. Although each algorithm approaches learning from images differently, they share one important characteristic: both rely entirely on unsupervised learning, which means they don't need humans to label the images.
This is a relatively new innovation as of 2020. Previous computer vision algorithms mainly used supervised learning, which involves feeding them manually labeled images: cat photos with the tag “cat” and baby photos with the tag “baby.” But in 2019, researcher Kate Crawford and artist Trevor Paglen found that these human-created labels in ImageNet, the most foundational image dataset for training computer vision models, sometimes contain disturbing language, like “bitch” for women and racial slurs for minorities.
The latest paper demonstrates an even deeper source of toxicity. Even without these human labels, the images themselves encode unwanted patterns. The problem parallels what the natural language processing (NLP) community has already discovered. The enormous datasets compiled to feed these data-hungry algorithms capture everything on the internet. And the internet has an overrepresentation of scantily clad women and other often harmful stereotypes.
To conduct their study, Steed and Caliskan cleverly adapted a technique Caliskan previously used to examine bias in unsupervised NLP models. These models learn to manipulate and generate language using word embeddings, a mathematical representation of language that clusters words commonly used together and separates words commonly found apart. In a 2017 paper published in Science, Caliskan measured the distances between the different word pairings that psychologists use to measure human biases in the Implicit Association Test (IAT). She found that these distances recreated the IAT's results almost perfectly. Stereotypical word pairings like man and career or woman and family were close together, while opposite pairings like man and family or woman and career were far apart.
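To make the idea of measuring distances between word pairings concrete, here is a minimal sketch of an association score built on cosine similarity. It is an illustration in the spirit of the measurement described above, not the researchers' implementation: the three-dimensional vectors in `TOY_EMBEDDINGS` are hand-made for this example (real embeddings are learned and have hundreds of dimensions), and the function names `cosine` and `association` are assumptions of this sketch.

```python
import math


def cosine(u, v):
    """Cosine similarity: close to 1 for vectors pointing the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(word, attrs_a, attrs_b, emb):
    """Mean similarity of `word` to set A minus its mean similarity to set B.

    A positive score means the word sits closer to set A in embedding space.
    """
    sim_a = sum(cosine(emb[word], emb[a]) for a in attrs_a) / len(attrs_a)
    sim_b = sum(cosine(emb[word], emb[b]) for b in attrs_b) / len(attrs_b)
    return sim_a - sim_b


# Hypothetical toy vectors, invented for illustration only.
TOY_EMBEDDINGS = {
    "man": [0.9, 0.1, 0.3],
    "woman": [0.1, 0.9, 0.3],
    "career": [0.8, 0.2, 0.1],
    "salary": [0.7, 0.1, 0.2],
    "family": [0.2, 0.8, 0.1],
    "home": [0.1, 0.7, 0.2],
}

CAREER = ["career", "salary"]
FAMILY = ["family", "home"]

man_score = association("man", CAREER, FAMILY, TOY_EMBEDDINGS)
woman_score = association("woman", CAREER, FAMILY, TOY_EMBEDDINGS)

print(f"man   -> career vs. family: {man_score:+.3f}")
print(f"woman -> career vs. family: {woman_score:+.3f}")
```

With these toy vectors the score for "man" is positive (closer to the career words) and the score for "woman" is negative (closer to the family words). The toy data was built to show that pattern; with a real model, the embeddings would come from training and the scores would be what is under test.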
iGPT is also based on embeddings: it clusters or separates pixels according to how often they co-occur in its training images. These pixel embeddings can then be used to compare how close or far two images are in mathematical space.
In their study, Steed and Caliskan once again found that these distances reflect the results of the IAT. Photos of men, ties, and suits appear close together, while photos of women appear more distant. The researchers obtained the same results with SimCLR, although it uses a different method to derive embeddings from images.
These results have worrying implications for image generation. Other image generation algorithms, such as generative adversarial networks, have led to an explosion of deepfake pornography that almost exclusively targets women. iGPT in particular adds yet another way for people to generate sexualized photos of women.
But the potential downstream effects are much greater. In the field of NLP, unsupervised models have become the backbone of all kinds of applications. Researchers start with an existing unsupervised model like BERT or GPT-2 and use tailored datasets to “fine-tune” it for a specific purpose. This semi-supervised approach, a combination of unsupervised and supervised learning, has become a de facto standard.
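The two-stage workflow described above can be sketched in a few lines. This is a deliberately simplified variant, assuming the pretrained part stays frozen and only a small new classifier is trained on top of it (often called a linear probe); full fine-tuning would also update the pretrained weights. The `frozen_encoder` function is a hypothetical stand-in for a real pretrained model, and the `LABELED` examples are made up.

```python
import math


def frozen_encoder(x):
    """Hypothetical stand-in for a pretrained model; never updated here."""
    return [x[0] + x[1], x[0] - x[1]]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(examples, labels, epochs=200, lr=0.5):
    """Train a logistic-regression head on top of the frozen features."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            feats = frozen_encoder(x)
            pred = sigmoid(sum(w * f for w, f in zip(weights, feats)) + bias)
            err = pred - y  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * err * f for w, f in zip(weights, feats)]
            bias -= lr * err
    return weights, bias


def predict(x, weights, bias):
    """Return True when the head assigns the input to class 1."""
    feats = frozen_encoder(x)
    return sigmoid(sum(w * f for w, f in zip(weights, feats)) + bias) >= 0.5


# Small made-up labeled dataset for the downstream task.
LABELED = [
    ([1.0, 0.2], 1),
    ([0.9, 0.1], 1),
    ([0.2, 1.0], 0),
    ([0.1, 0.8], 0),
]

inputs = [x for x, _ in LABELED]
targets = [y for _, y in LABELED]
weights, bias = train_head(inputs, targets)

for x, y in LABELED:
    print(x, "label:", y, "predicted:", int(predict(x, weights, bias)))
```

The point of the sketch is structural: everything the head learns passes through the encoder's representation, so any pattern the pretrained model absorbed from its training data is inherited by the downstream application.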
Likewise, the computer vision field is starting to see the same trend. Steed and Caliskan are concerned about what these entrenched biases might mean when the algorithms are used for sensitive applications such as law enforcement or hiring, where models are already analyzing candidates' video recordings to decide whether they are a good fit for the job. “These are very dangerous applications that make big decisions,” Caliskan says.
Deborah Raji, a Mozilla fellow who coauthored an influential study revealing the biases of facial recognition, says the study should serve as a wake-up call for the computer vision field. “For a long time, a lot of the criticism of bias was about how we label our images,” she says. Now this paper is saying that “the actual composition of the dataset causes these biases. We are accountable for how we maintain these datasets and collect this information.”
Steed and Caliskan are calling for greater transparency from the companies that develop these models, urging them to open the models up and let the academic community continue its research. They also encourage fellow researchers to do more testing before deploying a vision model, for example by using the methods they developed for this paper. Finally, they hope the field will develop more responsible ways to compile and document what is included in training datasets.
Caliskan says the goal ultimately is to gain greater awareness and control when applying computer vision. “We have to be very careful how we use them,” she says, “but at the same time, now that we have these methods, we can try to use them for social good.”