When Google’s photo organising service tagged Blacks as ‘gorillas’ it walked AI into Hall of Shame

When a person dies in a car crash in the US, data on the incident is typically reported to the National Highway Traffic Safety Administration. Federal law requires that civilian airplane pilots notify the National Transportation Safety Board of in-flight fires and some other incidents.

The grim registries are intended to give authorities and manufacturers better insights on ways to improve safety. They helped inspire a crowd-sourced repository of artificial intelligence incidents aimed at improving safety in much less regulated areas, such as autonomous vehicles and robotics.

The AI Incident Database launched late in 2020 and now contains 100 incidents, including #68, the security robot that flopped into a fountain, and #16, in which Google’s photo organising service tagged Black people as “gorillas.” Think of it as the AI Hall of Shame.

The AI Incident Database is hosted by Partnership on AI, a non-profit founded by large tech companies to research the downsides of the technology. The roll of dishonour was started by Sean McGregor, who works as a machine learning engineer at voice processor start-up Syntiant.

He says it’s needed because AI allows machines to intervene more directly in people’s lives, but the culture of software engineering does not encourage safety.

“Often I’ll speak with my fellow engineers and they’ll have an idea that is quite smart, but you need to say ‘Have you thought about how you’re making a dystopia?’” McGregor says. He hopes the incident database can work as both carrot and stick for tech companies, providing a form of public accountability that encourages companies to stay off the list while helping engineering teams craft AI deployments less likely to go wrong.

“My fellow engineers will have an idea that is quite smart, but you need to say ‘Have you thought about how you’re making a dystopia?’”

The database uses a broad definition of an AI incident as a “situation in which AI systems caused, or nearly caused, real-world harm.” The first entry in the database collects accusations that YouTube Kids displayed adult content, including sexually explicit language.

The most recent, #100, concerns a glitch in a French welfare system that can incorrectly determine people owe the state money. In between are autonomous vehicle crashes, such as Uber’s fatal 2018 incident, and wrongful arrests caused by failures of automatic translation or facial recognition.

Anyone can submit an item to the catalogue of AI calamity. McGregor approves additions for now and has a sizeable backlog to process but hopes eventually the database will become self-sustaining and an open-source project with its own community and curation process.

One of his favourite incidents is an AI blooper by a face-recognition-powered jaywalking-detection system in Ningbo, China, which incorrectly accused a woman whose face appeared in an ad on the side of a bus.

The 100 incidents logged so far include 16 involving Google, more than any other company. Amazon has seven, and Microsoft two.  “We are aware of the database and fully support the partnership’s mission and aims in publishing the database,” Amazon said in a statement. “Earning and maintaining the trust of our customers is our highest priority, and we have designed rigorous processes to continuously improve our services and customers’ experiences.” Google and Microsoft did not respond to requests for comment.

Georgetown’s Center for Security and Emerging Technology is trying to make the database more powerful. Entries are currently based on media reports, such as incident 79, which cites Wired reporting on an algorithm for estimating kidney function that by design rates Black patients’ disease as less severe. Georgetown students are working to create a companion database that includes details of an incident, such as whether the harm was intentional or not, and whether the problem algorithm acted autonomously or with human input.
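The article does not publish the companion database’s schema, but the two annotation dimensions it mentions (whether harm was intentional, and whether the algorithm acted autonomously or with human input) could be modeled roughly like this. All names here are hypothetical illustrations, not CSET’s actual format:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Autonomy(Enum):
    AUTONOMOUS = "autonomous"        # algorithm acted without human input
    HUMAN_IN_LOOP = "human_in_loop"  # a person mediated the decision

@dataclass
class IncidentAnnotation:
    """Hypothetical companion record for one AI Incident Database entry."""
    incident_id: int                  # e.g. 16 for the Google Photos case
    harm_intentional: Optional[bool]  # None when intent cannot be determined
    autonomy: Autonomy

# Annotating incident #16 from the article: the mislabeling was
# unintentional and produced by the system without human input.
record = IncidentAnnotation(
    incident_id=16,
    harm_intentional=False,
    autonomy=Autonomy.AUTONOMOUS,
)
```

Structured fields like these are what would let researchers query across incidents (for example, counting how many harms were unintentional and fully autonomous) rather than rereading media reports one by one.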

Helen Toner, director of strategy at CSET, says that exercise is informing research on the potential risks of AI accidents. She also believes the database shows how it might be a good idea for lawmakers or regulators eyeing AI rules to consider mandating some form of incident reporting, similar to that for aviation.

EU and US officials have shown growing interest in regulating AI, but the technology is so varied and broadly applied that crafting clear rules that won’t be quickly outdated is a daunting task. Recent draft proposals from the EU were accused variously of overreach, techno-illiteracy and being full of loopholes. Toner says requiring reporting of AI accidents could help ground policy discussions.

“I think it would be wise for those to be accompanied by feedback from the real world on what we are trying to prevent and what kinds of things are going wrong,” she says.

Missy Cummings, director of Duke’s Humans and Autonomy lab, says the database is a good idea but to make a difference will need wider support and recognition from companies building AI systems and institutions with an interest in safety.

Some aviation incident databases achieve high quality and broad coverage in part thanks to legal mandates on pilots and other crew. Others, like the Aviation Safety Reporting System operated by NASA, rely on a culture of confidential reporting from maintenance staff and others.

For the AI Incident Database to earn similar clout will take time and buy-in from industries driving high-stakes AI projects, such as autonomous vehicles, and perhaps from regulators, too.

Cummings, previously a Navy fighter pilot, says federal agencies so far appear to be too short on resources and AI talent to properly engage with the dangers of more autonomous systems on roads and in the air.

A federal investigation into two fatal Boeing 737 Max crashes advised the Federal Aviation Administration to pay more attention to how autonomous software can confuse and overwhelm pilots when it suddenly throws control back to humans.

Cummings argues that companies working to automate road and air transport should also find it in their own interest to collaborate on voluntary standards and protocols for safety.

“None of those industries are fully prepared to wrestle with this alligator,” she says.

  • A Wired report