After defeating the all-time human champions of the game show Jeopardy! in 2011, IBM's Watson supercomputer was scheduled for an upgrade. Hoping to give it access to a broader vocabulary, its programmers introduced the AI to Urban Dictionary, a web-based collection of cultural phrases and slang. Unfortunately, Watson was soon swearing profusely as a result.
Something similar happened in 2016, when Microsoft introduced its new Twitter chatbot, Tay, to the world. Built around the persona of a teenage girl, the program devolved into a racist neo-Nazi after less than 24 hours on the internet and had to be taken offline.
Why do natural language AIs so frequently degenerate in this way, cursing and spouting biased, racist and otherwise toxic messages? I recently explored these issues for GeekWire, speaking with a team of researchers from the Allen Institute for Artificial Intelligence and the University of Washington who are tackling this problem. While it is a very challenging one, their work may help address an issue that grows more urgent as we increasingly rely on AIs to communicate with us using natural language.