The topic of AI is one of the hottest and most controversial around, changing our understanding of what is possible faster than we can keep up. This is a trend that has been building for years, and like any visible and popular topic of conversation, there is a lot of heat and smoke, but not yet much genuine illumination.
One of the issues that we are seeing around AI is actually a more general issue in technology circles: “When you really want to use a new shiny hammer, anything can be made to look like a nail.” There are many examples of someone doing something “in the cloud” or “on the blockchain” for no apparent reason, and the same can happen with AI. Normally this would not be an issue, but because of the power of AI, some of these misapplications of technology come with the potential to cause significant real-world impact.
It is important to consider the wider consequences and impacts of what you are building. This requirement is not specific to AI; automation and IT have always affected people’s lives. However, these new techniques enable very different applications: augmenting human capabilities, or attempting to replace humans, who are seen as expensive and unreliable.
In addition to the ethical argument, there is also a purely utilitarian argument as to why humans cannot be replaced wholesale by AI. Automating low-level routine tasks performed by humans will not eliminate any but the lowest-level, most circumscribed jobs. The hard, measurable benefit is that removing that friction frees up those expensive humans to do what they do best: the sort of creative, imaginative, non-linear work that AI is, and will remain, deeply unsuited to.
Instead of artificial intelligence, we should talk about augmented intelligence: human skills, empowered by digital assistants, performing orders of magnitude above what either could achieve alone.
Artificial Intelligence, Real World Application
We saw one of these examples play out with the Google Duplex demo at Google I/O 2018.
In case you missed it, Duplex was probably the most noteworthy item to come out of Google I/O this year. It’s a voice assistant, but it does not work like the assistants we know, the ones that live in our phones – or, these days, in cylinders and hockey pucks in our kitchens and living rooms. Instead, it makes phone calls on our behalf – doing things like booking appointments or checking opening hours at our favorite shops. To do this, Duplex interacts with human operators on the other end of the phone call, and, presumably to make the interaction smoother, Google has programmed it to simulate human patterns of speech, including hesitations and other filler sounds.
This is all incredibly impressive from a technical point of view, but there are important points to consider, which are not just about Duplex specifically but apply more generally to the whole emerging field of AI.
Firstly, all we have seen so far is a recorded video, not a live demo. I do not mean to imply anything about the state of the product; very often, a recorded or click-through demo is absolutely the way to go, especially in a big keynote situation like Google I/O. However, what the video does not show is how Duplex handles error conditions. When everything goes as planned, Duplex functions smoothly and produces a good imitation of a human. But what happens if the conversation goes off the rails of the expected responses? What if the person who picks up the phone has an accent, or there is significant background noise? What if the person doesn’t understand what Duplex is saying?
The handling of errors and corner cases is always important with any degree of automation, but it is doubly so with AI, because there is generally feedback and training involved. That is, these techniques are rarely used to build products that are frozen in amber; rather, they “learn” from each interaction, gathering feedback on what is a good or valid response, and attempting to generalize that understanding and produce more responses like the successful ones.
If you are looking at implementing any sort of AI or machine learning, make sure that you have mechanisms to give that feedback — and also to review it, to make sure that your users are not reinforcing negative behaviors in your model.
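To make that concrete, here is a minimal sketch of one way such a review gate might look. All the names (`FeedbackItem`, `FeedbackQueue`, `review`) are hypothetical, invented for illustration; the point is simply that user feedback is collected into a pending queue and only flows into the training set after a human reviewer approves it.

```python
from dataclasses import dataclass

@dataclass
class FeedbackItem:
    user_input: str
    model_response: str
    user_rating: int          # e.g. +1 (helpful) or -1 (unhelpful)
    approved: bool = False    # set only after human review

class FeedbackQueue:
    """Collect user feedback, but gate it behind human review
    before any of it is used to retrain the model."""

    def __init__(self):
        self._pending: list[FeedbackItem] = []
        self._approved: list[FeedbackItem] = []

    def record(self, item: FeedbackItem) -> None:
        # Nothing reaches the training set directly from users.
        self._pending.append(item)

    def review(self, reviewer_decision) -> None:
        # reviewer_decision is a callable (human-backed in practice)
        # deciding whether an item is safe to learn from, e.g. not
        # abusive and not an attempt to game the model.
        for item in self._pending:
            if reviewer_decision(item):
                item.approved = True
                self._approved.append(item)
            # Rejected items are dropped here; a real system might
            # instead escalate them for closer inspection.
        self._pending = []

    def training_batch(self) -> list[FeedbackItem]:
        # Only human-approved feedback flows back into training.
        return list(self._approved)
```

The design choice worth noting is that approval is the default-off path: forgetting to call `review` means no feedback is learned from at all, which fails safe rather than silently reinforcing whatever users happened to upvote.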
Secondly, Duplex has been widely criticized for deceiving humans and for shifting the burden of work onto someone who is neither the customer nor a willing user of the service. There is certainly an argument for making both the information and the functionality that Google showed in its video demo more easily available, and having just spent four days attempting, and ultimately failing, to make a restaurant booking over the phone, I can sympathize. However, Duplex crosses a line, in the same way that many adtech products do when they apply similar techniques to the processing of personal data. Context matters, and users will accept AI support when they are in control of the interaction and understand it: an assistant on the phone that I invoke myself, or a loyalty card for a particular shop that I visit regularly.
The same technology becomes something very different when it is applied in other contexts, such as an AI that pretends to be human in order to extract information that is not otherwise available. Many people are already very frustrated with a “surveillance economy” dedicated to following them around the web, solely to pester them with incessant ads for the products that they just bought. Duplex, in its current form, crosses the line between “useful” and “creepy.”
AI is Already Everywhere
Finally, and related to the previous point, Google has chosen an incredibly difficult domain for Duplex to operate in. Google of course has any number of very successful AI products in its portfolio, most famously its core Search functionality, but also in less visible areas such as the operation of its own back-end infrastructure. All of these applications share one characteristic: easy access to huge volumes of pre-classified data. While Duplex presumably shares its voice recognition engine with the general Assistant functionality that has been built into Android for years, the domains and expectations are very different.
The selection of a use case is more important than the specific technology. We can all think of projects where the priority was placed on the technology — infrastructure, programming language, or algorithm — rather than on the end results. Just because you can, or because someone else has, doesn’t mean you should. By carefully considering how a technology will be used, as well as how it works internally, it becomes easier to avoid both technological and ethical failures.
The good news is that capabilities enabled or democratized by these new technologies are driving new attention to the ethics of their applications, whether in academic circles, with ethics added to the computer science curriculum at MIT, Harvard, and Stanford, or in commercial ones, with Google’s new AI ethics board. Some are even calling for a new Hippocratic oath for data scientists, along the lines of the age-old one that medical professionals take.
As these sentiments become more widespread in the mainstream conversation about technology, the human dimension can be included right from the design phase, ensuring products that develop human potential to new levels, instead of trying to mimic or control it. Now there’s something worthy of positive excitement.